22. Models of Cooperation

No one has ever become poor by giving. —Anne Frank

Experts asked to name the most important scientific questions produce a limited set of responses: How did the universe form? How does consciousness emerge? Can we find a cure for cancer? One question experts identify spans the social and biological sciences: How does cooperation arise?1 Cooperation entails taking an action that is not in one's self-interest, which suggests that we should not expect to see much of it. And yet we see cooperation in myriad domains and at multiple scales. Cells cooperate through adhesion, where one cell produces extracellular material to which others can attach. We see cooperation among ants, bees, humans, organizations of humans, and even nations, which cooperate in the creation of treaties and international laws.

In this chapter, we use models to take up the questions of how cooperation emerges, how it is maintained, and how we might create more of it. These models cannot explain in full the variety of cooperation that exists in the world—why ravens share their discoveries of carrion, why naked mole rats collectively defend against predators, why climbing vines lay down fewer roots when planted adjacent to kin, why termites and bees build elaborate nests, and why ants lock appendages to form bridges for the carrying of food—but they will produce insights.2

Although we see many examples of cooperation within and across species, we also see failures. The extent of cooperation depends on the circumstances. Federations gain and then lose members; Britain participated in the creation of the European Union and then exited from it. The same people who volunteer labor for a school fundraiser may cut in line at the supermarket or cheat on their taxes. A lion who hunts water buffalo in a pack may secrete away a warthog kill. And not every species cooperates. The roots of black walnut trees release juglone, an herbicide, into the soil, to inhibit the growth of nearby plants.

The diversity of behaviors of cooperating entities—cells, roots, ravens, people, business firms, and nations—obliges a many-model approach. We might best model cells and plants as following fixed rules; ravens, ants, and lions as using rules that condition on the environment or on past outcomes; and people, business firms, and nations as looking ahead and performing cost-benefit calculations.

The first key takeaway from this chapter will be that cooperation can emerge and be maintained through a variety of mechanisms. We highlight four mechanisms that enable cooperation: repetition, reputation, local clustering, and group selection. These mechanisms all enable cooperation without external intervention or management. They can apply to cooperating mole rats, bees, and humans alike. Humans also have other more formal ways to induce and maintain cooperation. In the discussion at the end of the chapter, we describe institutional solutions such as paying people to cooperate, punishing them if they do not, and legally mandating cooperative behavior.

The second takeaway will be that the efficacy of any one of these mechanisms depends on the behavioral repertoires of those cooperating. Some mechanisms, notably repetition, work for almost any behavior. Reputations and norms require forward-looking behavior and information sharing. They will be most effective for more sophisticated actors. The effect of clustering, on the other hand, depends on the model.
Cooperation among actors who are selected for or against by evolutionary forces emerges most often on sparse networks. Cooperation through norms requires dense networks. The efficacy of group selection depends in a nuanced way on the ability of the actors to be forward-looking and on their speed of adaptation. Making actors more forward-looking enhances the power of group selection. Allowing them to adapt more quickly can hinder it.

To explore these questions and to unpack the interplay between behavioral assumptions and cooperation, we rely on the familiar Prisoners' Dilemma game as well as a cooperative action model. The second model allows us to capture actions that benefit multiple players as well as to model cooperation on networks.

The remainder of the chapter takes the following form. We begin with a description of the Prisoners' Dilemma and show how cooperation can be maintained among rational actors. We then show how repetition also can induce cooperation between rule-based actors and why evolving cooperation is more difficult than maintaining it. We then consider less sophisticated biological actors and show how kin selection and local clustering can promote cooperation. The last two sections cover group selection and the question of how we use these models to produce more cooperation.

The Prisoners' Dilemma

The name Prisoners' Dilemma derives from a story of two people accused of jointly committing a crime. The authorities have circumstantial evidence so they offer each person a chance to confess. The accused confront a dilemma. If neither confesses, each receives a minor sentence based on the evidence. If only one confesses, then that person receives no punishment while the other is punished severely. If both confess, both receive harsh punishment, though not as extreme as in the case where only one confesses.

Figure 22.1 represents this story as a two-player game. Each player can either cooperate (C) or defect (D). The gray numbers represent the payoff to the column player and the black numbers the payoff to the row player. Each player has a dominant strategy to defect: whatever the action of the other player, defecting produces a higher payoff. However, if both players defect, each receives a lower payoff than if both cooperated. Thus, self-interest leads to actions that are collectively worse.

Figure 22.1: An Example of a Prisoners' Dilemma Game

The Prisoners' Dilemma captures the core incentives of many real-world contexts. It can model the arms race between the United States and the former Soviet Union, where defecting corresponds to spending money on weapons and cooperating to economic development. It can model political campaigning and whether to go negative (defect) or to run positive campaign ads (cooperate). It can even explain why male peacocks have such long tails: each peacock has an incentive to appear stronger and more robust than the others. Some instances of the Prisoners' Dilemma can only be recognized after the fact. The first adopters of many technologies, such as banks that moved early into ATMs, saw their profits increase. When others followed, profits fell from increased competition. Choosing to put in ATMs proved to be an analog of the choice to defect.3

Figure 22.2: The Prisoners' Dilemma

The general form of the Prisoners' Dilemma, shown in figure 22.2, assumes a baseline payoff of zero if both players defect.
The game can then be expressed with three variables: a reward, R, from cooperating, a temptation, T, to defect, and a sucker's payoff, S, from being exploited (see box). The inequalities shown in the box ensure that choosing defect is a dominant strategy and cooperating produces the efficient outcome.

Cooperation Through Repetition and Reputation

We first show how repetition of the game and the building of reputations can maintain cooperation among rational actors. The fact that cooperation can be maintained does not guarantee that it will be realized; it says only that if cooperation happens to emerge, rational players can sustain it.

To prove that repetition maintains cooperation, we construct a repeated game model in which after each play of the game, with probability P, the game will be played again. In theory, play could go on forever. The players apply repeated game strategies, which give an action based on the history of past play. Here we consider a repeated game strategy known as Grim Trigger, which cooperates in the first play of the game and cooperates in any future play of the game so long as the other player has never defected. If the other player ever defects, Grim Trigger defects forever. It is unforgiving. If both players use the Grim Trigger strategy, both cooperate forever.

To prove Grim Trigger maintains cooperation in the repeated game, we need only show that if one player chooses Grim Trigger, then the other player receives the highest possible payoff by also playing Grim Trigger. Given that a deviation by the second player produces endless defection by the first player, the second player need only compare the expected payoff from always cooperating to the expected payoff from the one-time benefit of defecting plus the payoff when both players defect thereafter.4 Whether or not Grim Trigger produces the higher payoff depends on the extent of temptation, the reward from cooperating, and the probability that the game repeats.

Repetition Maintains Cooperation

In the repeated Prisoners' Dilemma, Grim Trigger maintains cooperation if the probability of continued play, P, exceeds the ratio of the difference between the temptation payoff, T, and the reward payoff, R, to the temptation payoff:5

P > (T − R)/T

The result tells us that if the temptation payoff exceeds triple the reward payoff, T > 3R, the game must be repeated with a probability in excess of two-thirds. The inequality also tells us that cooperation becomes easier to maintain if the reward increases, the probability of continued play increases, or the temptation to defect decreases. Each of these implications reveals an intuitive route to more cooperation: increase the reward, make continued interaction more probable, and reduce the temptation to defect. Though these are quite straightforward inferences, they might not have been top of mind prior to writing the model. In pondering the necessary condition for cooperation, we can also infer less straightforward insights. The expression implies that if players thought that the probability of continuation would fall below the threshold in the future, then rational players would stop cooperating before the probability change occurs, not when the change occurs.6
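A short Python sketch makes the boxed condition concrete by comparing the expected payoff from cooperating forever against Grim Trigger with the payoff from a one-time defection, assuming the general form in which mutual defection pays zero. The values of T, R, and the continuation probabilities below are illustrative choices, not values taken from the chapter.

```python
# A minimal sketch (illustrative values): check the Grim Trigger condition
# P > (T - R)/T by comparing expected payoffs in the repeated game, where
# mutual defection pays 0, as in the general form of figure 22.2.

def grim_trigger_threshold(T, R):
    """Minimum continuation probability needed for Grim Trigger to sustain cooperation."""
    return (T - R) / T

def expected_payoffs(T, R, P):
    """Expected payoff from cooperating forever vs. defecting once against Grim Trigger."""
    cooperate_forever = R / (1 - P)  # R + P*R + P^2*R + ... = R/(1 - P)
    defect_once = T                  # T today, then mutual defection (payoff 0) forever
    return cooperate_forever, defect_once

T, R = 4.0, 3.0                      # illustrative temptation and reward payoffs
print(grim_trigger_threshold(T, R))  # 0.25
for P in (0.2, 0.3):
    coop, defect = expected_payoffs(T, R, P)
    print(P, coop > defect)          # False below the threshold, True above it
```

With T = 4 and R = 3 the threshold equals 1/4, so in this sketch cooperation survives a continuation probability of 0.3 but not one of 0.2.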
The logic that repetition supports cooperation among rational players hinges on a particular feature of the model: a probability of continuing play. If, instead, we had assumed a fixed number of repetitions—say, that the game was to be played three times—rational players would not cooperate, which we can prove by backward induction.

Suppose that the game is only played three times and that the first player announces that she will play Grim Trigger. Assume that T = 3, R = 2, and S = 1. Given these payoffs, if the second player cooperates in all three rounds, she earns a total payoff of 6. We need to check that no other strategy generates a higher payoff. Defecting in the first round produces a payoff of only 3, because after her defection the first player will defect in the last two rounds. Defecting in the second round produces a payoff of 5. Neither would be rational. Defecting in the third round, though, produces a payoff of 7: 2 in each of the first two periods, and 3 in the last period. Therefore, a rational player defects in the last round. The first player, who played Grim Trigger, should recognize that the defection will occur in the third round and also defect. It then follows that the other player would realize that both players are going to defect in the third round and so would defect in the second round of the game. By the same logic, the first player would also defect. This unraveling would continue to the first round.

The same reasoning applies if we repeat the game any finite number of times. In the last round played, rational players defect. As a result, both have an incentive to defect in the second-to-last round, and so on and so on. The only rational strategy is to always defect.
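The unraveling argument can be verified with a few lines of arithmetic. The sketch below, an illustration rather than part of the original analysis, totals the second player's payoff in the three-round game against Grim Trigger for each possible round of first defection, using T = 3, R = 2, and a mutual-defection payoff of 0; the sucker's payoff never enters because after any defection both players defect.

```python
# Sketch: the second player's total payoff in a three-round game against
# Grim Trigger, by the round in which she first defects (T = 3, R = 2,
# mutual defection = 0). After a defection, Grim Trigger defects forever,
# and we assume the deviator defects thereafter as well.

T, R, DD = 3, 2, 0
ROUNDS = 3

def total_payoff(first_defection):
    """first_defection: the round (1-3) of the first defection, or None to always cooperate."""
    total = 0
    for t in range(1, ROUNDS + 1):
        if first_defection is None or t < first_defection:
            total += R   # both players cooperate
        elif t == first_defection:
            total += T   # she defects while Grim Trigger still cooperates
        else:
            total += DD  # mutual defection for the remaining rounds
    return total

for plan in (None, 1, 2, 3):
    print(plan, total_payoff(plan))  # None: 6, round 1: 3, round 2: 5, round 3: 7
```

Only the last-round defection (a payoff of 7) beats cooperating throughout (a payoff of 6), and that is exactly what starts the unraveling.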
Our analysis so far considers two players in isolation. It does not take into account how a person's defection might influence how others treat that person in future interactions. In effect, we drew a boundary around the two people playing the game. We can extend the model to include a community of people who monitor the behavior of one another and punish people who deviate. To do this formally, we assume that each day people randomly form pairs and play Prisoners' Dilemma games. The members of the community believe that these games will go on forever, so the probability of future play equals 1. Under these assumptions, an individual will not be likely to play against the same person the next day, so the incentive to defect will be higher. However, we allow for the possibility that a defection can be recognized by the community. If so, the person earns a bad reputation and, by agreement, no one in the community will cooperate with that individual in the future. If we let PD denote the probability that a person gets caught defecting, earns a reputation as a defector, and is punished in all future games, then the condition for cooperation to be maintained through reputations, PD > (T − R)/T, is identical to the condition for repetition to maintain cooperation, except that PD, the probability that a person has been caught defecting, replaces P, the probability of repeated play.

In the reputation model, the community enforces cooperation. Someone who has defected and has been caught will be defected against by all future players. Here again, individuals calculate the benefits and costs of defecting. They must also believe that others will adhere to the punishment, which in this case means that all others will defect. For that to be true, individuals must either know one another or have some method of identifying or tagging past defectors. It follows that, all else equal, people in small communities should be better able to enforce cooperation through reputation.

In small northern towns, people leave their cars running in store parking lots during the winter. They do not fear the car being stolen (a defection) because they know everyone in the town. Anyone who stole a car, even as a prank, would incur a reputation loss. Physical tags can make reputations public information in order to maintain cooperation. In Nathaniel Hawthorne's novel The Scarlet Letter, Hester Prynne is forced to wear a scarlet A for committing adultery. Some cultures amputate the hands of convicted thieves, a rather costly tag. Tagging of defectors even occurs in other species. The cleaner fish, Labroides dimidiatus, can clean parasites from other fish (cooperate) or consume tastier alternatives (defect). If a fish cooperates, its neighbors will be free of parasites. The lack of parasites is observable to other fish. The cleanliness of neighboring fish becomes a tag, a visual reputation.7

Connectedness and Reputation

Supporting cooperation through a reputation mechanism requires that an individual's neighbors know of a deviation. To assess the likelihood of word of a deviation spreading, we can apply three insights we learned when adding networks to the contagion model. First, the greater the degree of the network, the more likely it is that word of deviation would spread. Second, variation in the distribution of degrees, in particular the existence of superspreaders, would amplify the likelihood. Third, if an individual defects against someone who is not connected to any of the individual's other neighbors, then the neighbors will not be likely to hear of the defection. Therefore, for reputations to spread, the network must have a high clustering coefficient. The clustering coefficient is a proxy for social capital.

Cooperation Among Rule-Playing Behaviors

We now relax the assumption of rationality and assume that players apply rule-based strategies such as Grim Trigger. We will use our model to understand whether and how cooperation can emerge. Our model assumes a population of individuals who play repeated rounds of a Prisoners' Dilemma game against one another. We assume that each interaction continues with some probability as above. That construction could induce rational players to cooperate if the probability of continuation is sufficiently high. Unlike above, here we assume that players apply behavioral rules. Some may play Grim Trigger. Others may always cooperate, and others may always defect. Variants of these strategies may be played by other species. Warbler males adopt a "dear enemy" strategy in which they do not engage in loud singing or fighting to extend their property at the expense of their neighbors. We can think of this as a cooperative action.8

For ease of explanation, we assume that each individual plays with every other individual. After every individual has played all her games, each announces a performance equal to her average payoff per play of the game. We use average per-game payoff rather than total payoff because some players may, by chance, play many more games than others given probabilistic continuation. In this model setup, a strategy's performance depends on the distribution of strategies. It follows that the winning strategy can then also depend on the initial distribution. If cooperative strategies perform best initially, cooperation will likely grow in the population.
For our analysis, we randomly assign to each player one of five behavioral rule strategies: always cooperate (All C), always defect (All D), Grim Trigger (GRIM), Tit for Tat (TFT), or TROLL. GRIM cooperates initially and continues cooperating until the opposing player defects, after which it defects forever. All C and All D do what their names imply: they blindly cooperate or defect regardless of the other player's action. TFT cooperates in the first period and thereafter copies the action of the other player from the previous period; two players who both use TFT will always cooperate. TROLL seeks to exploit players who always cooperate. It defects in the first two periods, and if the other player does not defect in either of those periods, TROLL defects forever. If the other player does defect, TROLL switches to cooperate for two periods and thereafter plays GRIM.

We first calculate the payoffs for each behavioral rule strategy playing against every other strategy using the payoffs from the Prisoners' Dilemma in figure 22.1. We start with the strategy All D. If it plays against All C, it receives a payoff of 4 in every play of the game. All C, on the other hand, receives an average payoff of only 1 in those interactions. If All D plays against either TFT or GRIM, it receives a payoff of 4 in the first play and 2 thereafter. If we assume the game is repeated many times, this will average out to a little more than 2, so we write it as 2+. When All D plays TROLL, both defect in the first two periods, and TROLL cooperates in periods three and four but defects thereafter. All D again earns an average payoff of 2+. TROLL earns an average payoff of a little less than 2, which we write as 2−. We can perform similar exercises and compute the expected payoffs for every pair of strategies.9 Table 22.1 shows the payoff for each strategy against each of the other strategies.

Table 22.1: Average Payoffs for Row Strategies Against Column Strategies

The table reveals a mix of mutual cooperation, mutual defection, and strategies taking advantage of flaws in other strategies. A careful examination of the table reveals that four of the five strategies cooperate with themselves. We will refer to these as the potentially cooperative strategies. Only TFT cooperates with all four of these potentially cooperative strategies. So if any combination of those four accounted for the bulk of a population, TFT would perform well, if not best.10
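The entries in the table can be approximated by simulation. The sketch below is a rough illustration rather than the chapter's own computation: it plays long pairwise matches using the figure 22.1 payoffs described above (mutual cooperation 3, mutual defection 2, temptation 4, sucker's payoff 1). The fixed 200-round horizon stands in for probabilistic continuation, and the handling of TROLL's trailing GRIM phase, which here reacts only to defections after that phase begins, is one reasonable reading of the strategy.

```python
# Sketch: average per-round payoffs for pairs of the rule-based strategies,
# using the figure 22.1 payoffs described in the text. The 200-round horizon
# and the handling of TROLL's trailing GRIM phase are assumptions.

PAYOFF = {('C', 'C'): (3, 3), ('C', 'D'): (1, 4),
          ('D', 'C'): (4, 1), ('D', 'D'): (2, 2)}

def all_c(mine, theirs): return 'C'
def all_d(mine, theirs): return 'D'

def grim(mine, theirs):
    return 'D' if 'D' in theirs else 'C'

def tft(mine, theirs):
    return theirs[-1] if theirs else 'C'

def troll(mine, theirs):
    t = len(mine)
    if t < 2:
        return 'D'                               # defect in the first two periods
    if 'D' not in theirs[:2]:
        return 'D'                               # exploit unconditional cooperators forever
    if t < 4:
        return 'C'                               # otherwise cooperate for two periods...
    return 'D' if 'D' in theirs[4:] else 'C'     # ...then play GRIM from period five on (assumed)

def average_payoffs(strat1, strat2, rounds=200):
    h1, h2, total1, total2 = [], [], 0, 0
    for _ in range(rounds):
        a1, a2 = strat1(h1, h2), strat2(h2, h1)
        p1, p2 = PAYOFF[(a1, a2)]
        total1, total2 = total1 + p1, total2 + p2
        h1.append(a1)
        h2.append(a2)
    return total1 / rounds, total2 / rounds

print(average_payoffs(all_d, all_c))   # (4.0, 1.0)
print(average_payoffs(all_d, grim))    # (2.01, 1.995): a little above 2 and a little below 2
print(average_payoffs(all_d, troll))   # (2.03, 1.985): the 2+ and 2- entries
print(average_payoffs(tft, tft))       # (3.0, 3.0): mutual cooperation
```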
The thousands of human experiments run on the Prisoners' Dilemma reveal tremendous heterogeneity in the strategies people choose. We will therefore use the payoffs in the table to think through the outcomes given different distributions. Based on the diversity of payoffs for the different combinations of strategies, the best strategy will depend on the composition of the population. In a population that consists mostly of All C, the strategy All D performs best. If individuals choose to adopt the best strategy, or if selection operates quickly, then the population might never manage to cooperate. If learning or selection happens at a moderate rate, players should move away from All C. Once the population contains few All C, All D will perform less well than GRIM, TROLL, and TFT. One of these strategies should take hold. This pattern of defectors performing well initially and then cooperation taking hold can be found in many experiments with human subjects as well as in simulations with computer-based artificial agents. We might describe what happens in those cases as the emergence or evolution of cooperation.

One can imagine any distribution across these five strategies or any other ensemble of strategies, compute average payoffs, and then think through what might occur through learning or selection. In a later chapter, we construct formal models of learning and selection. We rely here on informal arguments, as we only wish to make the point that whether cooperation emerges depends on the initial strategies in the population and how people learn or evolve new strategies. A necessary condition for cooperation to emerge or evolve is that the payoff from cooperating exceeds the payoff from defecting given the population. Otherwise, both selection and learning would lead the population toward defection. To simplify the analysis, we can imagine a population that consists of cooperative strategies, such as GRIM, All C, and TFT, and defecting strategies, such as All D. We can then calculate what would have to be true for the cooperative strategies to perform better on average. That calculation reveals that evolving cooperation is more difficult than maintaining cooperation, and that cooperation cannot bootstrap itself—a small population of cooperators cannot cause cooperation to emerge.11

This distinction between maintaining cooperation, emerging or evolving cooperation, and bootstrapping cooperation merits revisiting. Cooperation can be maintained if, when all players cooperate, cooperation performs best. Maintenance corresponds to cooperation through GRIM being a Nash equilibrium of the repeated game. Cooperation can emerge or evolve if the strategies that cooperate when paired in a population outperform, on average, those that do not. As just argued, the conditions for emergence of cooperation are harder to satisfy than the conditions for maintenance of cooperation. In fact, the mathematics shows us that bootstrapping is all but impossible. If the proportion of cooperators is near zero, then cooperators earn lower payoffs than defectors. The takeaway should not be that bootstrapping cooperation can never occur, only that it cannot happen in this model. To obtain cooperation, we need a proportion of people to cooperate initially. That could happen with people who reflect on the game, but it seems less likely for bees and tree roots. To understand how bootstrapping could occur, we need more elaborate models that allow for local learning, evolution, and group selection. We turn to those now.

Cooperative Action Model

To study how cooperation can emerge, we introduce a cooperative action model in which individuals can either take a cooperative action or refrain from doing so.12 The cooperative action imposes a cost on the individual and produces a benefit to others. Refraining from action imposes no cost and produces no benefit. There are several differences between the cooperative action model and the repeated Prisoners' Dilemma. First, the individuals in the cooperative action model are not playing a repeated pairwise game in which they apply strategies and earn payoffs. Instead, individuals are either cooperators or non-cooperators. Second, the model does not assume rational actors or individuals who apply more sophisticated rules. Third, the individuals belong to an interaction network. Their cooperative actions impact only those with whom they are connected, their neighbors. Last, because the individuals have fixed types, they take the same action with all of their neighbors.
A cooperating individual with five neighbors pays the cost of cooperating five times and produces a benefit to five others.

Cooperative Action Model

A population of N individuals consists of cooperators and defectors connected in a network. Cooperation incurs a cost C and produces a benefit B to the other player for each interaction. Defecting produces no cost or benefit. The ratio of cooperative advantage, B/C, captures the potential gains from cooperation.

The network plays a key role in allowing cooperation to emerge and even bootstrap itself. A small cluster or group of cooperators who interact mostly with one another perform well and then spread in the population. In an ecosystem, offspring often locate adjacent to parents. If the offspring of cooperators are more likely to be cooperators, then bootstrapping cooperation becomes even easier.

To show that clustering can bootstrap cooperation, we start with a partially filled network. Each node on that network is a potential location for an individual. In the biological context, the locations would be feasible habitats. We then populate a portion of the network with individuals who are either cooperators or defectors. We might, for example, first draw a random network with an average degree of 10, then roll a die at each node. If the die comes up six, we place an individual at that node. If not, we leave the node empty. If we do place an individual on a node, we roll the die again. If we roll a five, we place a cooperator on the node. Otherwise, we place a defector. This procedure populates one-sixth of the nodes on our network, and one-sixth of occupied nodes contain cooperators. Given this construction, individuals will differ in their number of neighbors. Some will have no neighbors. Some will have four or five neighbors.

To enable the growth or demise of cooperation, we populate the remainder of the network by iteratively filling in the nodes adjacent to occupied nodes. We assume that an empty node takes the type (i.e., is either a cooperator or a defector) of the highest-performing type among its neighbors. Figure 22.3 shows two segments of linear networks. Cooperators are represented by dark lines, defectors by gray lines, and empty nodes by dashed dark lines. Each segment contains an empty node in the center with two neighbors, one defector and one cooperator. In this figure, cooperating creates a benefit of 2 and imposes a cost of 1.

Figure 22.3: Payoffs to Neighbors of an Empty Node in Two Linear Networks

In the top segment of figure 22.3, the defector to the right of the open node has a cooperating neighbor, so it earns a payoff of 2. The cooperating neighbor to the left of the open node has a defector as a neighbor, so it earns a payoff of -1. Given our rules for node filling, because the defecting neighbor has a higher payoff, the empty node will become a defector. In the bottom segment, the defecting neighbor of the empty cell has a defecting neighbor, while the cooperating neighbor of the empty cell is connected to another cooperator. In the lower segment, therefore, we get the opposite outcome. Here, the cooperating neighbor has the higher payoff, so the empty node will become a cooperator. In this example, a lone cooperator cannot spawn an additional cooperator, but two adjacent cooperators can. It follows that a small cluster of cooperative nodes surrounded by empty cells could expand into open nodes. Therefore, regions of cooperation can emerge from a handful of cooperators.
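The payoffs in figure 22.3 can be computed directly from the rule. The sketch below is an illustration under stated assumptions: benefit 2, cost 1, a cooperator pays the cost only for occupied neighbors, and an empty node copies the type of its highest-performing occupied neighbor.

```python
# Sketch of the node-filling rule on a line of nodes, as in figure 22.3.
# 'C' = cooperator, 'D' = defector, None = empty. Benefit B = 2, cost = 1.
# Assumption: a cooperator pays the cost only for its occupied neighbors.

B, COST = 2, 1

def payoff(line, i):
    """Payoff of the occupied node at position i from its occupied neighbors."""
    neighbors = [line[j] for j in (i - 1, i + 1)
                 if 0 <= j < len(line) and line[j] is not None]
    gain = B * neighbors.count('C')
    cost = COST * len(neighbors) if line[i] == 'C' else 0
    return gain - cost

def fill_empty(line, i):
    """The empty node at i takes the type of its highest-performing occupied neighbor."""
    occupied = [j for j in (i - 1, i + 1) if line[j] is not None]
    return line[max(occupied, key=lambda j: payoff(line, j))]

top = ['D', 'C', None, 'D', 'C']     # a lone cooperator and a defector flank the empty node
bottom = ['C', 'C', None, 'D', 'D']  # two adjacent cooperators face two adjacent defectors
print(payoff(top, 1), payoff(top, 3), fill_empty(top, 2))           # -1 2 D
print(payoff(bottom, 1), payoff(bottom, 3), fill_empty(bottom, 2))  # 1 0 C
```

A lone cooperator loses to the neighboring defector, while the pair of adjacent cooperators wins, which is the clustering effect just described.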
We can write more general conditions about whether an empty cell becomes a cooperator or defector based on the proportions of neighboring cooperators and defectors and the ratio of cooperative advantage. It follows that cooperation becomes easier to bootstrap in networks with lower degree. This finding is the opposite of what we found when analyzing how reputation maintains cooperation, where a more connected network increases the likelihood of a defection ruining someone's reputation, so more connections help to maintain cooperation. This provides another example of many-model thinking producing conditional knowledge. The question of whether connected networks produce more or less cooperation has no single answer. If cooperation is maintained by sophisticated actors using reputation, more connected networks will be more cooperative. If cooperation is bootstrapped or evolved among unsophisticated actors, like trees or ants, less connected networks should promote more cooperation.

Clustering Bootstraps Cooperation

If the neighbors of an open node include a cooperator of degree D with K cooperating neighbors, and all non-cooperating neighbors of the open node have no cooperating neighbors, then the open node becomes a cooperator if and only if the ratio of cooperative advantage exceeds the ratio of the cooperator's degree to its number of cooperating neighbors:13

B/C > D/K

Group Selection

Our final mechanism for bootstrapping, evolving, and maintaining cooperation, group selection, relies on competition or selection among groups.14 To model group selection, we divide the population into subgroups. Within each subgroup, individuals engage in a version of the cooperative action model where each individual either cooperates or defects. As before, we can assign a performance to each individual. We also assign a performance to each group equal to the average performance of its members. The model assumes selection among groups in which copies of the highest-performing groups will replace lower-performing groups. This advantages groups of cooperators, which will perform better.

The intuition that cooperative groups should take over given group selection has a catch: within any group, the defectors outperform the cooperators. Consider two groups of size ten. The first group contains two cooperators and eight defectors. The second group contains two defectors and eight cooperators. Assume benefits equal 2 and costs equal 1, as above. In the first group, each defector's performance equals 4, as it receives a benefit of 2 from each cooperator. Each cooperator incurs a cost of 9 and receives a benefit of only 2, so its performance equals -7. The average performance of a group member equals 1.8. In the second group, each defector receives 2 from each of eight cooperators, so its performance equals 16. Each cooperator's performance equals 5: it receives 14 from the other seven cooperators but pays out 9 in costs. The average performance in the second group equals 7.2.

These calculations reveal a paradoxical insight: within each group, defectors perform better than cooperators, yet the higher-performing group contains more cooperators. A tension should be apparent: individual selection favors defection but group selection favors cooperation.
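The two-group arithmetic is easy to reproduce. The short sketch below, again an illustration, computes each member's performance and the group average for a group of ten in which every member interacts with every other member, with a benefit of 2 and a cost of 1 per interaction.

```python
# Sketch: within-group performances in the cooperative action model for a
# group of 10 where every member interacts with every other member.
# Benefit B = 2, cost C = 1, as in the example in the text.

B, C = 2, 1

def group_performance(n_cooperators, size=10):
    """Per-member performance of cooperators and defectors, and the group average."""
    n_defectors = size - n_cooperators
    cooperator = B * (n_cooperators - 1) - C * (size - 1)  # benefits from other cooperators, cost per interaction
    defector = B * n_cooperators                           # benefits from cooperators, no cost
    average = (n_cooperators * cooperator + n_defectors * defector) / size
    return cooperator, defector, average

print(group_performance(2))  # (-7, 4, 1.8): two cooperators, eight defectors
print(group_performance(8))  # (5, 16, 7.2): eight cooperators, two defectors
```

Within either group the defectors come out ahead, yet the group with more cooperators has the far higher average, which is exactly the tension just described.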
This tension operates across a variety of ecological, social, political, and economic contexts. Trees whose roots cooperate with other trees fare worse individually, but they contribute to stronger ecosystems better able to spread into open spaces. Cooperative individuals within communities may reap fewer benefits than defectors, but cooperative communities will grow in size. Politicians who support their party may be less likely to be reelected than those who focus on themselves, but cohesive parties will be more likely to grow. And an individual working at a firm may fare worse by building talents useful only to her current employer, yet if she does, her firm will be able to outcompete others. The cooperative action model helped us to identify and quantify this tension.

To see whether group selection can bootstrap, evolve, or maintain cooperation, we need to add more detail to our model. Traulsen and Nowak propose an elegant model in which populations grow and new members resemble the high performers. This construction builds in both individual and group selection. Selection occurs at the level of individuals, but higher performers are more likely to come from cooperative groups. When a group becomes sufficiently large, it divides in two, creating a new group. To prevent the population from becoming too large, the formation of a new group causes the eradication of a randomly chosen existing group. This last feature builds in a weak form of group selection.15

These models show that group selection increases cooperation provided that the benefit from cooperative action is relatively large and the maximal group size is small relative to the number of groups. The finding that the efficacy of group selection depends in part on the ratio of maximal group size to the number of groups reveals the necessity of competition. Having more groups implies a greater likelihood of a group of all cooperators. It also implicitly assumes more competition. The more unexpected result is that smaller maximal group size enables more cooperation. A smaller maximal group size prevents groups of cooperators from becoming dominated by defectors; it limits the effects of individual selection. Think back to our group of eight cooperators and two defectors. The defectors perform better. If the group were allowed to grow to size eighty, it would contain a much larger proportion of defectors before the split occurs. If the group splits once it has twelve members, in the worst-case scenario the group consists of two-thirds cooperators when splitting.
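A bare-bones simulation in the spirit of that model makes the mechanism concrete. Everything in the sketch below is an illustrative assumption rather than Traulsen and Nowak's exact specification: payoffs come from the cooperative action model within each group, an individual reproduces with probability increasing in its payoff, the offspring joins and copies the parent, and a group that grows past a maximum size splits in two while a randomly chosen group is removed.

```python
# Sketch (illustrative, not the exact Traulsen-Nowak model): reproduction
# weighted by payoff, group splitting at a maximum size, and random
# elimination of a group to keep the number of groups fixed.

import random

B, C = 2, 1        # benefit and cost of the cooperative action
MAX_SIZE = 10      # a group splits when it grows beyond this size
N_GROUPS = 20
STEPS = 20000

def payoff(member, group):
    """Within-group payoff under the cooperative action model ('C' or 'D' members)."""
    other_cooperators = group.count('C') - (1 if member == 'C' else 0)
    cost = C * (len(group) - 1) if member == 'C' else 0
    return B * other_cooperators - cost

def step(groups):
    # choose one individual to reproduce, with weight increasing in payoff
    flat = [(g, m) for g in groups for m in g]
    weights = [payoff(m, g) + C * MAX_SIZE for g, m in flat]  # shifted so all weights are positive
    group, parent = random.choices(flat, weights=weights)[0]
    group.append(parent)                       # the offspring copies the parent's type
    if len(group) > MAX_SIZE:                  # the group splits; a random group is removed
        random.shuffle(group)
        new_group = [group.pop() for _ in range(len(group) // 2)]
        del groups[random.randrange(len(groups))]
        groups.append(new_group)

def fraction_cooperators(groups):
    members = [m for g in groups for m in g]
    return members.count('C') / len(members)

groups = [['C'] * 3 + ['D'] * 3 for _ in range(N_GROUPS)]  # every group starts mixed
for _ in range(STEPS):
    step(groups)
print(fraction_cooperators(groups))  # share of cooperators after selection has run
```

Varying B, MAX_SIZE, and N_GROUPS offers a quick way to probe the claim that cooperation spreads when the benefit is relatively large and groups stay small relative to the number of groups.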
The potential for group selection to increase cooperation can be applied within organizations. Most organizations assign compensation based primarily on individual performance. Splitting employees into teams that compete against one another and allocating bonuses and opportunities based on team performance creates the possibility for inducing cooperative behavior. If resources go to teams, individuals have incentives to work well within those teams, to cooperate.16 These incentives should increase cooperation within teams if the benefits from cooperation are high and if the size of teams is small relative to the number of teams.

When evaluating the potential for group selection, we must think carefully about the sophistication of the individual actors. Trees adapt slowly, so group selection will not have to operate very fast. People adapt quickly, so if the individual incentives to defect are high, group selection will have to operate at a correspondingly fast rate. People, though, may also recognize the group selection effect. They may take into account the competition among groups and see their self-interest in creating a strong group. This makes cooperation more likely. All this is to suggest that we should be careful not to place too much confidence in a specific condition that shows cooperation will increase in a particular model. Instead, we should apply our judgment across many models and ask if the qualitative insights hold.

Summary

The puzzle of how cooperation takes hold, grows, and is maintained has been studied by thousands of scholars across a range of disciplines. That inquiry has been aided by models, most prominent among them the Prisoners' Dilemma. If we assume rational actors in a repeated game, the puzzle goes away. Cooperation can be maintained through threat of punishment. Punishment can occur directly through repeated play or indirectly because of reputation. Those mechanisms may explain how cooperation arises in high-stakes settings with sophisticated people, but they will not explain why ants, bees, trees, and naked mole rats are so cooperative.

When we took up cooperation among rule players, we found that evolving cooperation is not as easy. Rational actors can maintain cooperation in environments in which rule-playing actors cannot evolve it. We also found that unsophisticated rules, such as Tit for Tat, though not optimal, can cooperate with one another and not be exploited. Subsequent work has shown that Tit for Tat performs less well if we assume random mistakes in play. If a mistake occurs and a player defects, two players each using Tit for Tat will produce a cycle of defecting and cooperating actions. If both players accidentally defect, Tit for Tat will result in mutual defection until another mistake occurs.

In real Prisoners' Dilemma games, mistakes happen. On September 1, 1983, Korean Airlines flight 007 drifted into Soviet airspace en route to Seoul, Korea, from Anchorage, Alaska. A Soviet SU-15 shot down the plane, killing all 269 people on board. The United States saw this as a defection by the Soviet Union. The Soviet Union, thinking the plane was on a spy mission, thought this was a defection by the United States.

To avoid endless punishments following a mistake, other strategies—such as Win Stay, Lose Shift—are more forgiving. Under this strategy, the mutual cooperation payoff (R) and the temptation payoff (T) are coded as wins. The other two payoffs are coded as losses. Win Stay, Lose Shift starts by cooperating; thereafter, if it won, it sticks with whatever it did last period. If it lost, it switches to the other action. By working through some examples, you can see how Win Stay, Lose Shift returns to cooperative behavior.17
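One such example is sketched below; the ten-round horizon and the round of the mistake are assumed details for illustration. Both players use the same strategy, and player 1 defects by accident in round 3. Tit for Tat falls into the alternating cycle described above, while Win Stay, Lose Shift suffers one round of mutual defection and then returns to mutual cooperation.

```python
# Sketch: recovery from a single accidental defection under Tit for Tat vs.
# Win Stay, Lose Shift. R and T count as wins, the other two payoffs as
# losses, as described in the text. Horizon and mistake round are assumed.

def tft(mine, theirs):
    return theirs[-1] if theirs else 'C'

def wsls(mine, theirs):
    if not mine:
        return 'C'
    my_last, their_last = mine[-1], theirs[-1]
    won = (my_last == 'C' and their_last == 'C') or \
          (my_last == 'D' and their_last == 'C')   # R or T counts as a win
    return my_last if won else ('D' if my_last == 'C' else 'C')

def play(strategy, rounds=10, mistake_round=3):
    """Both players use the same strategy; player 1 accidentally defects once."""
    h1, h2 = [], []
    for t in range(1, rounds + 1):
        a1, a2 = strategy(h1, h2), strategy(h2, h1)
        if t == mistake_round:
            a1 = 'D'                               # the accidental defection
        h1.append(a1)
        h2.append(a2)
    return list(zip(h1, h2))

print(play(tft))    # after the mistake, the players alternate cooperation and defection
print(play(wsls))   # one round of mutual defection, then back to mutual cooperation
```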
We also described two other mechanisms. Clustering can enable cooperation to bootstrap itself. This mechanism relies on cooperators playing one another and having cooperation spread through selection. Group selection operates through a similar logic. Groups of cooperators perform well and replace groups of defectors. If we construct models, we find that cooperation that arises through clustering and group selection requires more stringent conditions than does cooperation through repetition or reputation. We also learned how the success of the various mechanisms depends on how we model the individuals. We should not expect these mechanisms to operate identically for people, ants, and trees. More sophisticated actors may be better able to sustain cooperation by being forward-looking, yet they might also be more likely to see the benefits of defecting when surrounded by cooperators.

Most of our discussion has framed cooperation as beneficial. Entities can also cooperate in order to exploit others. Firms form cartels to keep prices artificially high, and countries form coalitions to restrict the supply of a resource, such as oil, for their own benefit, not for the benefit of humanity writ large. Cancer cells cooperate to fight off our immune systems.18 So as we study cooperation, we should keep in mind that it need not be for the common good. The water buffalo do not benefit from the lions' cooperative actions.