22. Models of Cooperation

No one has ever become poor by giving. —Anne Frank

Experts asked to name the most important scientific questions produce a limited set of responses: How did the universe form? How does consciousness emerge? Can we find a cure for cancer? One question experts identify spans the social and biological sciences: How does cooperation arise?1 Cooperation entails taking an action that is not in one's self-interest, which suggests that we should not expect to see much of it. And yet we see cooperation in myriad domains and at multiple scales. Cells cooperate through adhesion, where one cell produces extracellular material to which others can attach. We see cooperation among ants, bees, humans, organizations of humans, and even nations, which cooperate in the creation of treaties and international laws.

In this chapter, we use models to take up the questions of how cooperation emerges, how it is maintained, and how we might create more of it. These models cannot explain in full the variety of cooperation that exists in the world—why ravens share their discoveries of carrion, why naked mole rats collectively defend against predators, why climbing vines lay down fewer roots when planted adjacent to kin, why termites and bees build elaborate nests, and why ants lock appendages to form bridges for the carrying of food—but they will produce insights.2

Although we see many examples of cooperation within and across species, we also see failures. The extent of cooperation depends on the circumstances. Federations gain and then lose members; Britain participated in the creation of the European Union and then exited from it. The same people who volunteer labor for a school fundraiser may cut in line at the supermarket or cheat on their taxes. A lion who hunts water buffalo in a pack may secrete away a warthog kill. And not every species cooperates. The roots of black walnut trees release juglone, an herbicide, into the soil, to inhibit the growth of nearby plants.

The diversity of behaviors of cooperating entities—cells, roots, ravens, people, business firms, and nations—obliges a many-model approach. We might best model cells and plants as following fixed rules; ravens, ants, and lions as using rules that condition on the environment or on past outcomes; and people, business firms, and nations as looking ahead and performing cost-benefit calculations.

The first key takeaway from this chapter will be that cooperation can emerge and be maintained through a variety of mechanisms. We highlight four mechanisms that enable cooperation: repetition, reputation, local clustering, and group selection. These mechanisms all enable cooperation without external intervention or management. They can apply to cooperating mole rats, bees, and humans alike. Humans also have other more formal ways to induce and maintain cooperation. In the discussion at the end of the chapter, we describe institutional solutions such as paying people to cooperate, punishing them if they do not, and legally mandating cooperative behavior.

The second takeaway will be that the efficacy of any one of these mechanisms depends on the behavioral repertoires of those cooperating. Some mechanisms, notably repetition, work for almost any behavior. Reputations and norms require forward-looking behavior and information sharing. They will be most effective for more sophisticated actors. The effect of clustering, on the other hand, depends on the model.
Cooperation among actors who are selected for or against by evolutionary forces emerges most often on sparse networks. Cooperation through norms requires dense networks. The efficacy of group selection depends in a nuanced way on the ability of the actors to be forward-looking and on their speed of adaptation. Making actors more forward-looking enhances the power of group selection. Allowing them to adapt more quickly can hinder it.

To explore these questions and to unpack the interplay between behavioral assumptions and cooperation, we rely on the familiar Prisoners' Dilemma game as well as a cooperative action model. The second model allows us to capture actions that benefit multiple players as well as to model cooperation on networks.

The remainder of the chapter takes the following form. We begin with a description of the Prisoners' Dilemma and show how cooperation can be maintained among rational actors. We then show how repetition also can induce cooperation between rule-based actors and why evolving cooperation is more difficult than maintaining it. We then consider less sophisticated biological actors and show how kin selection and local clustering can promote cooperation. The last two sections cover group selection and the question of how we use these models to produce more cooperation.

The Prisoners' Dilemma

The name Prisoners' Dilemma derives from a story of two people accused of jointly committing a crime. The authorities have circumstantial evidence so they offer each person a chance to confess. The accused confront a dilemma. If neither confesses, each receives a minor sentence based on the evidence. If only one confesses, then that person receives no punishment while the other is punished severely. If both confess, both receive harsh punishment, though not as extreme as in the case where only one confesses.

Figure 22.1 represents this story as a two-player game. Each player can either cooperate (C) or defect (D). The gray numbers represent the payoff to the column player and the black numbers the payoff to the row player. Each player has a dominant strategy to defect: whatever the action of the other player, defecting produces a higher payoff. However, if both players defect, each receives a lower payoff than if both cooperated. Thus, self-interest leads to actions that are collectively worse.

Figure 22.1: An Example of a Prisoners' Dilemma Game

The Prisoners' Dilemma captures the core incentives of many real-world contexts. It can model the arms race between the United States and the former Soviet Union, where defecting corresponds to spending money on weapons and cooperating to economic development. It can model political campaigning and whether to go negative (defect) or to run positive campaign ads (cooperate). It can even explain why male peacocks have such long tails: each peacock has an incentive to appear stronger and more robust than the others. Some instances of the Prisoners' Dilemma can only be recognized after the fact. The first adopters of many technologies, such as banks that moved early into ATMs, saw their profits increase. When others followed, profits fell from increased competition. Choosing to put in ATMs proved to be an analog of the choice to defect.3

Figure 22.2: The Prisoners' Dilemma

The general form of the Prisoners' Dilemma, shown in figure 22.2, assumes a baseline payoff of zero if both players defect.
The game can then be expressed with three variables: a reward, R, from cooperating, a temptation, T, to defect, and a sucker's payoff, S, from being exploited (see box). The inequalities shown in the box ensure that choosing defect is a dominant strategy and cooperating produces the efficient outcome.

Cooperation Through Repetition and Reputation

We first show how repetition of the game and the building of reputations can maintain cooperation among rational actors. The fact that cooperation can be maintained does not guarantee that it will be realized; it says only that if cooperation happens to emerge, rational players can sustain it.

To prove that repetition maintains cooperation, we construct a repeated game model in which after each play of the game, with probability P, the game will be played again. In theory, play could go on forever. The players apply repeated game strategies, which give an action based on the history of past play. Here we consider a repeated game strategy known as Grim Trigger, which cooperates in the first play of the game and cooperates in any future play of the game so long as the other player has never defected. If the other player ever defects, Grim Trigger defects forever. It is unforgiving. If both players use the Grim Trigger strategy, both cooperate forever.

To prove Grim Trigger maintains cooperation in the repeated game, we need only show that if one player chooses Grim Trigger, then the other player receives the highest possible payoff by also playing Grim Trigger. Given that a deviation by the second player produces endless defection by the first player, the second player need only compare the expected payoff from always cooperating to the expected payoff from the one-time benefit of defecting plus the payoff when both players defect thereafter.4 Whether or not Grim Trigger produces the higher payoff depends on the extent of temptation, the reward from cooperating, and the probability that the game repeats.

Repetition Maintains Cooperation

In the repeated Prisoners' Dilemma, Grim Trigger maintains cooperation if the probability of continued play, P, exceeds the ratio of the difference between the temptation payoff, T, and the reward payoff, R, to the temptation payoff:5

P > (T − R)/T

The result tells us that if the temptation payoff exceeds triple the reward payoff, T > 3R, the game must be repeated with a probability in excess of two-thirds. The inequality also tells us that cooperation becomes easier to maintain if the reward increases, the probability of continued play increases, or the temptation to defect decreases. Each of these implications reveals an intuitive route to more cooperation: increase the reward, make continued interaction more probable, and reduce the temptation to defect. Though these are quite straightforward inferences, they might not have been top of mind prior to writing the model. In pondering the necessary condition for cooperation, we can also infer less straightforward insights. The expression implies that if players thought that the probability of continuation would fall below the threshold in the future, then rational players would stop cooperating before the probability change occurs, not when the change occurs.6
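A short Python sketch makes the boxed condition concrete by comparing the expected payoff from cooperating forever against Grim Trigger with the payoff from a one-time defection, assuming the general form in which mutual defection pays zero. The values of T, R, and the continuation probabilities below are illustrative choices, not values taken from the chapter.

```python
# A minimal sketch (illustrative values): check the Grim Trigger condition
# P > (T - R)/T by comparing expected payoffs in the repeated game, where
# mutual defection pays 0, as in the general form of figure 22.2.

def grim_trigger_threshold(T, R):
    """Minimum continuation probability needed for Grim Trigger to sustain cooperation."""
    return (T - R) / T

def expected_payoffs(T, R, P):
    """Expected payoff from cooperating forever vs. defecting once against Grim Trigger."""
    cooperate_forever = R / (1 - P)  # R + P*R + P^2*R + ... = R/(1 - P)
    defect_once = T                  # T today, then mutual defection (payoff 0) forever
    return cooperate_forever, defect_once

T, R = 4.0, 3.0                      # illustrative temptation and reward payoffs
print(grim_trigger_threshold(T, R))  # 0.25
for P in (0.2, 0.3):
    coop, defect = expected_payoffs(T, R, P)
    print(P, coop > defect)          # False below the threshold, True above it
```

With T = 4 and R = 3 the threshold equals 1/4, so in this sketch cooperation survives a continuation probability of 0.3 but not one of 0.2.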
The logic that repetition supports cooperation among rational players hinges on a particular feature of the model: a probability of continuing play. If, instead, we had assumed a fixed number of repetitions—say, that the game was to be played three times—rational players would not cooperate, which we can prove by backward induction.

Suppose that the game is only played three times and that the first player announces that she will play Grim Trigger. Assume that T = 3, R = 2, and S = 1. Given these payoffs, if the second player cooperates in all three rounds, she earns a total payoff of 6. We need to check that no other strategy generates a higher payoff. Defecting in the first round produces a payoff of only 3, because after her defection the first player will defect in the last two rounds. Defecting in the second round produces a payoff of 5. Neither would be rational. Defecting in the third round, though, produces a payoff of 7: 2 in each of the first two periods, and 3 in the last period. Therefore, a rational player defects in the last round. The first player, who played Grim Trigger, should recognize that the defection will occur in the third round and also defect. It then follows that the other player would realize that both players are going to defect in the third round and so would defect in the second round of the game. By the same logic, the first player would also defect. This unraveling would continue to the first round.

The same reasoning applies if we repeat the game any finite number of times. In the last round played, rational players defect. As a result, both have an incentive to defect in the second-to-last round, and so on and so on. The only rational strategy is to always defect.
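The unraveling argument can be verified with a few lines of arithmetic. The sketch below, an illustration rather than part of the original analysis, totals the second player's payoff in the three-round game against Grim Trigger for each possible round of first defection, using T = 3, R = 2, and a mutual-defection payoff of 0; the sucker's payoff never enters because after any defection both players defect.

```python
# Sketch: the second player's total payoff in a three-round game against
# Grim Trigger, by the round in which she first defects (T = 3, R = 2,
# mutual defection = 0). After a defection, Grim Trigger defects forever,
# and we assume the deviator defects thereafter as well.

T, R, DD = 3, 2, 0
ROUNDS = 3

def total_payoff(first_defection):
    """first_defection: the round (1-3) of the first defection, or None to always cooperate."""
    total = 0
    for t in range(1, ROUNDS + 1):
        if first_defection is None or t < first_defection:
            total += R   # both players cooperate
        elif t == first_defection:
            total += T   # she defects while Grim Trigger still cooperates
        else:
            total += DD  # mutual defection for the remaining rounds
    return total

for plan in (None, 1, 2, 3):
    print(plan, total_payoff(plan))  # None: 6, round 1: 3, round 2: 5, round 3: 7
```

Only the last-round defection (a payoff of 7) beats cooperating throughout (a payoff of 6), and that is exactly what starts the unraveling.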
Our analysis so far considers two players in isolation. It does not take into account how a person's defection might influence how others treat that person in future interactions. In effect, we drew a boundary around the two people playing the game. We can extend the model to include a community of people who monitor the behavior of one another and punish people who deviate. To do this formally, we assume that each day people randomly form pairs and play Prisoners' Dilemma games. The members of the community believe that these games will go on forever, so the probability of future play equals 1. Under these assumptions, an individual will not be likely to play against the same person the next day, so the incentive to defect will be higher. However, we allow for the possibility that a defection can be recognized by the community. If so, the person earns a bad reputation and, by agreement, no one in the community will cooperate with that individual in the future. If we let PD denote the probability that a person gets caught defecting, earns a reputation as a defector, and is punished in all future games, then the condition for cooperation to be maintained through reputations, PD > (T − R)/T, is identical to the condition for repetition to maintain cooperation, except that PD, the probability that a person has been caught defecting, replaces P, the probability of repeated play.

In the reputation model, the community enforces cooperation. Someone who has defected and has been caught will be defected against by all future players. Here again, individuals calculate the benefits and costs of defecting. They must also believe that others will adhere to the punishment, which in this case means that all others will defect. For that to be true, individuals must either know one another or have some method of identifying or tagging past defectors. It follows that, all else equal, people in small communities should be better able to enforce cooperation through reputation.

In small northern towns, people leave their cars running in store parking lots during the winter. They do not fear the car being stolen (a defection) because they know everyone in the town. Anyone who stole a car, even as a prank, would incur a reputation loss. Physical tags can make reputations public information in order to maintain cooperation. In Nathaniel Hawthorne's novel The Scarlet Letter, Hester Prynne is forced to wear a scarlet A for committing adultery. Some cultures amputate the hands of convicted thieves, a rather costly tag. Tagging of defectors even occurs in other species. The cleaner fish, Labroides dimidiatus, can clean parasites from other fish (cooperate) or consume tastier alternatives (defect). If a fish cooperates, its neighbors will be free of parasites. The lack of parasites is observable to other fish. The cleanliness of neighboring fish becomes a tag, a visual reputation.7

Connectedness and Reputation

Supporting cooperation through a reputation mechanism requires that an individual's neighbors know of a deviation. To assess the likelihood of word of a deviation spreading, we can apply three insights we learned when adding networks to the contagion model. First, the greater the degree of the network, the more likely it is that word of deviation would spread. Second, variation in the distribution of degrees, in particular the existence of superspreaders, would amplify the likelihood. Third, if an individual defects against someone who is not connected to any of the individual's other neighbors, then the neighbors will not be likely to hear of the defection. Therefore, for reputations to spread, the network must have a high clustering coefficient. The clustering coefficient is a proxy for social capital.

Cooperation Among Rule-Playing Behaviors

We now relax the assumption of rationality and assume that players apply rule-based strategies such as Grim Trigger. We will use our model to understand whether and how cooperation can emerge. Our model assumes a population of individuals who play repeated rounds of a Prisoners' Dilemma game against one another. We assume that each interaction continues with some probability as above. That construction could induce rational players to cooperate if the probability of continuation is sufficiently high. Unlike above, here we assume that players apply behavioral rules. Some may play Grim Trigger. Others may always cooperate, and others may always defect. Variants of these strategies may be played by other species. Warbler males adopt a "dear enemy" strategy in which they do not engage in loud singing or fighting to extend their property at the expense of their neighbors. We can think of this as a cooperative action.8

For ease of explanation, we assume that each individual plays with every other individual. After every individual has played all her games, each announces a performance equal to her average payoff per play of the game. We use average per-game payoff rather than total payoff because some players may, by chance, play many more games than others given probabilistic continuation. In this model setup, a strategy's performance depends on the distribution of strategies. It follows that the winning strategy can then also depend on the initial distribution. If cooperative strategies perform best initially, cooperation will likely grow in the population.
For our analysis, we randomly assign to each player one of five behavioral rule strategies: always cooperate (All C), always defect (All D), Grim Trigger (GRIM), Tit for Tat (TFT), or TROLL. GRIM cooperates initially and continues cooperating until the opposing player defects, after which it defects forever. All C and All D do what their names imply: they blindly cooperate or defect regardless of the other player's action. TFT cooperates in the first period and thereafter copies the action of the other player from the previous period; two players who both use TFT will always cooperate. TROLL seeks to exploit players who always cooperate. It defects in the first two periods, and if the other player does not defect in either of those periods, TROLL defects forever. If the other player does defect, TROLL switches to cooperate for two periods and thereafter plays GRIM.

We first calculate the payoffs for each behavioral rule strategy playing against every other strategy using the payoffs from the Prisoners' Dilemma in figure 22.1. We start with the strategy All D. If it plays against All C, it receives a payoff of 4 in every play of the game. All C, on the other hand, receives an average payoff of only 1 in those interactions. If All D plays against either TFT or GRIM, it receives a payoff of 4 in the first play and 2 thereafter. If we assume the game is repeated many times, this will average out to a little more than 2, so we write it as 2+. When All D plays TROLL, both defect in the first two periods, and TROLL cooperates in periods three and four but defects thereafter. All D again earns an average payoff of 2+. TROLL earns an average payoff of a little less than 2, which we write as 2−. We can perform similar exercises and compute the expected payoffs for every pair of strategies.9 Table 22.1 shows the payoff for each strategy against each of the other strategies.

Table 22.1: Average Payoffs for Row Strategies Against Column Strategies

The table reveals a mix of mutual cooperation, mutual defection, and strategies taking advantage of flaws in other strategies. A careful examination of the table reveals that four of the five strategies cooperate with themselves. We will refer to these as the potentially cooperative strategies. Only TFT cooperates with all four of these potentially cooperative strategies. So if any combination of those four accounted for the bulk of a population, TFT would perform well, if not best.10
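The entries in the table can be approximated by simulation. The sketch below is a rough illustration rather than the chapter's own computation: it plays long pairwise matches using the figure 22.1 payoffs described above (mutual cooperation 3, mutual defection 2, temptation 4, sucker's payoff 1). The fixed 200-round horizon stands in for probabilistic continuation, and the handling of TROLL's trailing GRIM phase, which here reacts only to defections after that phase begins, is one reasonable reading of the strategy.

```python
# Sketch: average per-round payoffs for pairs of the rule-based strategies,
# using the figure 22.1 payoffs described in the text. The 200-round horizon
# and the handling of TROLL's trailing GRIM phase are assumptions.

PAYOFF = {('C', 'C'): (3, 3), ('C', 'D'): (1, 4),
          ('D', 'C'): (4, 1), ('D', 'D'): (2, 2)}

def all_c(mine, theirs): return 'C'
def all_d(mine, theirs): return 'D'

def grim(mine, theirs):
    return 'D' if 'D' in theirs else 'C'

def tft(mine, theirs):
    return theirs[-1] if theirs else 'C'

def troll(mine, theirs):
    t = len(mine)
    if t < 2:
        return 'D'                               # defect in the first two periods
    if 'D' not in theirs[:2]:
        return 'D'                               # exploit unconditional cooperators forever
    if t < 4:
        return 'C'                               # otherwise cooperate for two periods...
    return 'D' if 'D' in theirs[4:] else 'C'     # ...then play GRIM from period five on (assumed)

def average_payoffs(strat1, strat2, rounds=200):
    h1, h2, total1, total2 = [], [], 0, 0
    for _ in range(rounds):
        a1, a2 = strat1(h1, h2), strat2(h2, h1)
        p1, p2 = PAYOFF[(a1, a2)]
        total1, total2 = total1 + p1, total2 + p2
        h1.append(a1)
        h2.append(a2)
    return total1 / rounds, total2 / rounds

print(average_payoffs(all_d, all_c))   # (4.0, 1.0)
print(average_payoffs(all_d, grim))    # (2.01, 1.995): a little above 2 and a little below 2
print(average_payoffs(all_d, troll))   # (2.03, 1.985): the 2+ and 2- entries
print(average_payoffs(tft, tft))       # (3.0, 3.0): mutual cooperation
```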
The thousands of human experiments run on the Prisoners' Dilemma reveal tremendous heterogeneity in the strategies people choose. We will therefore use the payoffs in the table to think through the outcomes given different distributions. Based on the diversity of payoffs for the different combinations of strategies, the best strategy will depend on the composition of the population. In a population that consists mostly of All C, the strategy All D performs best. If individuals choose to adopt the best strategy, or if selection operates quickly, then the population might never manage to cooperate. If learning or selection happens at a moderate rate, players should move away from All C. Once the population contains few All C, All D will perform less well than GRIM, TROLL, and TFT. One of these strategies should take hold. This pattern of defectors performing well initially and then cooperation taking hold can be found in many experiments with human subjects as well as in simulations with computer-based artificial agents. We might describe what happens in those cases as the emergence or evolution of cooperation.

One can imagine any distribution across these five strategies or any other ensemble of strategies, compute average payoffs, and then think through what might occur through learning or selection. In a later chapter, we construct formal models of learning and selection. We rely here on informal arguments, as we only wish to make the point that whether cooperation emerges depends on the initial strategies in the population and how people learn or evolve new strategies. A necessary condition for cooperation to emerge or evolve is that the payoff from cooperating exceeds the payoff from defecting given the population. Otherwise, both selection and learning would lead the population toward defection. To simplify the analysis, we can imagine a population that consists of cooperative strategies, such as GRIM, All C, and TFT, and defecting strategies, such as All D. We can then calculate what would have to be true for the cooperative strategies to perform better on average. That calculation reveals that evolving cooperation is more difficult than maintaining cooperation, and that cooperation cannot bootstrap itself—a small population of cooperators cannot cause cooperation to emerge.11

This distinction between maintaining cooperation, emerging or evolving cooperation, and bootstrapping cooperation merits revisiting. Cooperation can be maintained if, when all players cooperate, cooperation performs best. Maintenance corresponds to cooperation through GRIM being a Nash equilibrium of the repeated game. Cooperation can emerge or evolve if the strategies that cooperate when paired in a population outperform, on average, those that do not. As just argued, the conditions for emergence of cooperation are harder to satisfy than the conditions for maintenance of cooperation. In fact, the mathematics shows us that bootstrapping is all but impossible. If the proportion of cooperators is near zero, then cooperators earn lower payoffs than defectors. The takeaway should not be that bootstrapping cooperation can never occur, only that it cannot happen in this model. To obtain cooperation, we need a proportion of people to cooperate initially. That could happen with people who reflect on the game, but it seems less likely for bees and tree roots. To understand how bootstrapping could occur, we need more elaborate models that allow for local learning, evolution, and group selection. We turn to those now.

Cooperative Action Model

To study how cooperation can emerge, we introduce a cooperative action model in which individuals can either take a cooperative action or refrain from doing so.12 The cooperative action imposes a cost on the individual and produces a benefit to others. Refraining from action imposes no cost and produces no benefit. There are several differences between the cooperative action model and the repeated Prisoners' Dilemma. First, the individuals in the cooperative action model are not playing a repeated pairwise game in which they apply strategies and earn payoffs. Instead, individuals are either cooperators or non-cooperators. Second, the model does not assume rational actors or individuals who apply more sophisticated rules. Third, the individuals belong to an interaction network. Their cooperative actions impact only those with whom they are connected, their neighbors. Last, because the individuals have fixed types, they take the same action with all of their neighbors.
A cooperating individual with five neighbors pays the cost of cooperating five times and produces a benefit to five others.

Cooperative Action Model

A population of N individuals consists of cooperators and defectors connected in a network. Cooperation incurs a cost C and produces a benefit B to the other player for each interaction. Defecting produces no cost or benefit. The ratio of cooperative advantage, B/C, captures the potential gains from cooperation.

The network plays a key role in allowing cooperation to emerge and even bootstrap itself. A small cluster or group of cooperators who interact mostly with one another perform well and then spread in the population. In an ecosystem, offspring often locate adjacent to parents. If the offspring of cooperators are more likely to be cooperators, then bootstrapping cooperation becomes even easier.

To show that clustering can bootstrap cooperation, we start with a partially filled network. Each node on that network is a potential location for an individual. In the biological context, the locations would be feasible habitats. We then populate a portion of the network with individuals who are either cooperators or defectors. We might, for example, first draw a random network with an average degree of 10, then roll a die at each node. If the die comes up six, we place an individual at that node. If not, we leave the node empty. If we do place an individual on a node, we roll the die again. If we roll a five, we place a cooperator on the node. Otherwise, we place a defector. This procedure populates one-sixth of the nodes on our network, and one-sixth of occupied nodes contain cooperators. Given this construction, individuals will differ in their number of neighbors. Some will have no neighbors. Some will have four or five neighbors.

To enable the growth or demise of cooperation, we populate the remainder of the network by iteratively filling in the nodes adjacent to occupied nodes. We assume that an empty node takes the type (i.e., is either a cooperator or a defector) of the highest-performing type among its neighbors. Figure 22.3 shows two segments of linear networks. Cooperators are represented by dark lines, defectors by gray lines, and empty nodes by dashed dark lines. Each segment contains an empty node in the center with two neighbors, one defector and one cooperator. In this figure, cooperating creates a benefit of 2 and imposes a cost of 1.

Figure 22.3: Payoffs to Neighbors of an Empty Node in Two Linear Networks

In the top segment of figure 22.3, the defector to the right of the open node has a cooperating neighbor, so it earns a payoff of 2. The cooperating neighbor to the left of the open node has a defector as a neighbor, so it earns a payoff of -1. Given our rules for node filling, because the defecting neighbor has a higher payoff, the empty node will become a defector. In the bottom segment, the defecting neighbor of the empty cell has a defecting neighbor, while the cooperating neighbor of the empty cell is connected to another cooperator. In the lower segment, therefore, we get the opposite outcome. Here, the cooperating neighbor has the higher payoff, so the empty node will become a cooperator. In this example, a lone cooperator cannot spawn an additional cooperator, but two adjacent cooperators can. It follows that a small cluster of cooperative nodes surrounded by empty cells could expand into open nodes. Therefore, regions of cooperation can emerge from a handful of cooperators.
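The payoffs in figure 22.3 can be computed directly from the rule. The sketch below is an illustration under stated assumptions: benefit 2, cost 1, a cooperator pays the cost only for occupied neighbors, and an empty node copies the type of its highest-performing occupied neighbor.

```python
# Sketch of the node-filling rule on a line of nodes, as in figure 22.3.
# 'C' = cooperator, 'D' = defector, None = empty. Benefit B = 2, cost = 1.
# Assumption: a cooperator pays the cost only for its occupied neighbors.

B, COST = 2, 1

def payoff(line, i):
    """Payoff of the occupied node at position i from its occupied neighbors."""
    neighbors = [line[j] for j in (i - 1, i + 1)
                 if 0 <= j < len(line) and line[j] is not None]
    gain = B * neighbors.count('C')
    cost = COST * len(neighbors) if line[i] == 'C' else 0
    return gain - cost

def fill_empty(line, i):
    """The empty node at i takes the type of its highest-performing occupied neighbor."""
    occupied = [j for j in (i - 1, i + 1) if line[j] is not None]
    return line[max(occupied, key=lambda j: payoff(line, j))]

top = ['D', 'C', None, 'D', 'C']     # a lone cooperator and a defector flank the empty node
bottom = ['C', 'C', None, 'D', 'D']  # two adjacent cooperators face two adjacent defectors
print(payoff(top, 1), payoff(top, 3), fill_empty(top, 2))           # -1 2 D
print(payoff(bottom, 1), payoff(bottom, 3), fill_empty(bottom, 2))  # 1 0 C
```

A lone cooperator loses to the neighboring defector, while the pair of adjacent cooperators wins, which is the clustering effect just described.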
We can write more general conditions about whether an empty cell becomes a cooperator or defector based on the proportions of neighboring cooperators and defectors and the ratio of cooperative advantage. It follows that cooperation becomes easier to bootstrap in networks with lower degree. This finding is the opposite of what we found when analyzing how reputation maintains cooperation, where a more connected network increases the likelihood of a defection ruining someone's reputation, so more connections help to maintain cooperation. This provides another example of many-model thinking producing conditional knowledge. The question of whether connected networks produce more or less cooperation has no single answer. If cooperation is maintained by sophisticated actors using reputation, more connected networks will be more cooperative. If cooperation is bootstrapped or evolved among unsophisticated actors, like trees or ants, less connected networks should promote more cooperation.

Clustering Bootstraps Cooperation

If the neighbors of an open node include a cooperator of degree D with K cooperating neighbors, and all non-cooperating neighbors of the open node have no cooperating neighbors, then the open node becomes a cooperator if and only if the ratio of cooperative advantage exceeds the ratio of the cooperator's degree to its number of cooperating neighbors:13

B/C > D/K

Group Selection

Our final mechanism for bootstrapping, evolving, and maintaining cooperation, group selection, relies on competition or selection among groups.14 To model group selection, we divide the population into subgroups. Within each subgroup, individuals engage in a version of the cooperative action model where each individual either cooperates or defects. As before, we can assign a performance to each individual. We also assign a performance to each group equal to the average performance of its members. The model assumes selection among groups in which copies of the highest-performing groups will replace lower-performing groups. This advantages groups of cooperators, which will perform better.

The intuition that cooperative groups should take over given group selection has a catch: within any group, the defectors outperform the cooperators. Consider two groups of size ten. The first group contains two cooperators and eight defectors. The second group contains two defectors and eight cooperators. Assume benefits equal 2 and costs equal 1, as above. In the first group, each defector's performance equals 4, as it receives a benefit of 2 from each cooperator. Each cooperator incurs a cost of 9 and receives a benefit of only 2, so its performance equals -7. The average performance of a group member equals 1.8. In the second group, each defector receives 2 from each of eight cooperators, so its performance equals 16. Each cooperator's performance equals 5: it receives 14 from the other seven cooperators but pays out 9 in costs. The average performance in the second group equals 7.2.

These calculations reveal a paradoxical insight: within each group, defectors perform better than cooperators, yet the higher-performing group contains more cooperators. A tension should be apparent: individual selection favors defection but group selection favors cooperation.
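The two-group arithmetic is easy to reproduce. The short sketch below, again an illustration, computes each member's performance and the group average for a group of ten in which every member interacts with every other member, with a benefit of 2 and a cost of 1 per interaction.

```python
# Sketch: within-group performances in the cooperative action model for a
# group of 10 where every member interacts with every other member.
# Benefit B = 2, cost C = 1, as in the example in the text.

B, C = 2, 1

def group_performance(n_cooperators, size=10):
    """Per-member performance of cooperators and defectors, and the group average."""
    n_defectors = size - n_cooperators
    cooperator = B * (n_cooperators - 1) - C * (size - 1)  # benefits from other cooperators, cost per interaction
    defector = B * n_cooperators                           # benefits from cooperators, no cost
    average = (n_cooperators * cooperator + n_defectors * defector) / size
    return cooperator, defector, average

print(group_performance(2))  # (-7, 4, 1.8): two cooperators, eight defectors
print(group_performance(8))  # (5, 16, 7.2): eight cooperators, two defectors
```

Within either group the defectors come out ahead, yet the group with more cooperators has the far higher average, which is exactly the tension just described.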
This tension operates across a variety of ecological, social, political, and economic contexts. Trees whose roots cooperate with other trees fare worse individually, but they contribute to stronger ecosystems better able to spread into open spaces. Cooperative individuals within communities may reap fewer benefits than defectors, but cooperative communities will grow in size. Politicians who support their party may be less likely to be reelected than those who focus on themselves, but cohesive parties will be more likely to grow. And an individual working at a firm may fare worse by building talents useful only to her current employer, yet if she does, her firm will be able to outcompete others. The cooperative action model helped us to identify and quantify this tension.

To see whether group selection can bootstrap, evolve, or maintain cooperation, we need to add more detail to our model. Traulsen and Nowak propose an elegant model in which populations grow and new members resemble the high performers. This construction builds in both individual and group selection. Selection occurs at the level of individuals, but higher performers are more likely to come from cooperative groups. When a group becomes sufficiently large, it divides in two, creating a new group. To prevent the population from becoming too large, the formation of a new group causes the eradication of a randomly chosen existing group. This last feature builds in a weak form of group selection.15

These models show that group selection increases cooperation provided that the benefit from cooperative action is relatively large and the maximal group size is small relative to the number of groups. The finding that the efficacy of group selection depends in part on the ratio of maximal group size to the number of groups reveals the necessity of competition. Having more groups implies a greater likelihood of a group of all cooperators. It also implicitly assumes more competition. The more unexpected result is that smaller maximal group size enables more cooperation. A smaller maximal group size prevents groups of cooperators from becoming dominated by defectors; it limits the effects of individual selection. Think back to our group of eight cooperators and two defectors. The defectors perform better. If the group were allowed to grow to size eighty, it would contain a much larger proportion of defectors before the split occurs. If the group splits once it has twelve members, in the worst-case scenario the group consists of two-thirds cooperators when splitting.
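A bare-bones simulation in the spirit of that model makes the mechanism concrete. Everything in the sketch below is an illustrative assumption rather than Traulsen and Nowak's exact specification: payoffs come from the cooperative action model within each group, an individual reproduces with probability increasing in its payoff, the offspring joins and copies the parent, and a group that grows past a maximum size splits in two while a randomly chosen group is removed.

```python
# Sketch (illustrative, not the exact Traulsen-Nowak model): reproduction
# weighted by payoff, group splitting at a maximum size, and random
# elimination of a group to keep the number of groups fixed.

import random

B, C = 2, 1        # benefit and cost of the cooperative action
MAX_SIZE = 10      # a group splits when it grows beyond this size
N_GROUPS = 20
STEPS = 20000

def payoff(member, group):
    """Within-group payoff under the cooperative action model ('C' or 'D' members)."""
    other_cooperators = group.count('C') - (1 if member == 'C' else 0)
    cost = C * (len(group) - 1) if member == 'C' else 0
    return B * other_cooperators - cost

def step(groups):
    # choose one individual to reproduce, with weight increasing in payoff
    flat = [(g, m) for g in groups for m in g]
    weights = [payoff(m, g) + C * MAX_SIZE for g, m in flat]  # shifted so all weights are positive
    group, parent = random.choices(flat, weights=weights)[0]
    group.append(parent)                       # the offspring copies the parent's type
    if len(group) > MAX_SIZE:                  # the group splits; a random group is removed
        random.shuffle(group)
        new_group = [group.pop() for _ in range(len(group) // 2)]
        del groups[random.randrange(len(groups))]
        groups.append(new_group)

def fraction_cooperators(groups):
    members = [m for g in groups for m in g]
    return members.count('C') / len(members)

groups = [['C'] * 3 + ['D'] * 3 for _ in range(N_GROUPS)]  # every group starts mixed
for _ in range(STEPS):
    step(groups)
print(fraction_cooperators(groups))  # share of cooperators after selection has run
```

Varying B, MAX_SIZE, and N_GROUPS offers a quick way to probe the claim that cooperation spreads when the benefit is relatively large and groups stay small relative to the number of groups.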
The potential for group selection to increase cooperation can be applied within organizations. Most organizations assign compensation based primarily on individual performance. Splitting employees into teams that compete against one another and allocating bonuses and opportunities based on team performance creates the possibility for inducing cooperative behavior. If resources go to teams, individuals have incentives to work well within those teams, to cooperate.16 These incentives should increase cooperation within teams if the benefits from cooperation are high and if the size of teams is small relative to the number of teams.

When evaluating the potential for group selection, we must think carefully about the sophistication of the individual actors. Trees adapt slowly, so group selection will not have to operate very fast. People adapt quickly, so if the individual incentives to defect are high, group selection will have to operate at a correspondingly fast rate. People, though, may also recognize the group selection effect. They may take into account the competition among groups and see their self-interest in creating a strong group. This makes cooperation more likely. All this is to suggest that we should be careful not to place too much confidence in a specific condition that shows cooperation will increase in a particular model. Instead, we should apply our judgment across many models and ask if the qualitative insights hold.

Summary

The puzzle of how cooperation takes hold, grows, and is maintained has been studied by thousands of scholars across a range of disciplines. That inquiry has been aided by models, most prominent among them the Prisoners' Dilemma. If we assume rational actors in a repeated game, the puzzle goes away. Cooperation can be maintained through threat of punishment. Punishment can occur directly through repeated play or indirectly because of reputation. Those mechanisms may explain how cooperation arises in high-stakes settings with sophisticated people, but they will not explain why ants, bees, trees, and naked mole rats are so cooperative.

When we took up cooperation among rule players, we found that evolving cooperation is not as easy. Rational actors can maintain cooperation in environments in which rule-playing actors cannot evolve it. We also found that unsophisticated rules, such as Tit for Tat, though not optimal, can cooperate with one another and not be exploited. Subsequent work has shown that Tit for Tat performs less well if we assume random mistakes in play. If a mistake occurs and a player defects, two players each using Tit for Tat will produce a cycle of defecting and cooperating actions. If both players accidentally defect, Tit for Tat will result in mutual defection until another mistake occurs.

In real Prisoners' Dilemma games, mistakes happen. On September 1, 1983, Korean Airlines flight 007 drifted into Soviet airspace en route to Seoul, Korea, from Anchorage, Alaska. A Soviet SU-15 shot down the plane, killing all 269 people on board. The United States saw this as a defection by the Soviet Union. The Soviet Union, thinking the plane was on a spy mission, thought this was a defection by the United States.

To avoid endless punishments following a mistake, other strategies—such as Win Stay, Lose Shift—are more forgiving. Under this strategy, the mutual cooperation payoff (R) and the temptation payoff (T) are coded as wins. The other two payoffs are coded as losses. Win Stay, Lose Shift starts by cooperating; thereafter, if it won, it sticks with whatever it did last period. If it lost, it switches to the other action. By working through some examples, you can see how Win Stay, Lose Shift returns to cooperative behavior.17
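One such example is sketched below; the ten-round horizon and the round of the mistake are assumed details for illustration. Both players use the same strategy, and player 1 defects by accident in round 3. Tit for Tat falls into the alternating cycle described above, while Win Stay, Lose Shift suffers one round of mutual defection and then returns to mutual cooperation.

```python
# Sketch: recovery from a single accidental defection under Tit for Tat vs.
# Win Stay, Lose Shift. R and T count as wins, the other two payoffs as
# losses, as described in the text. Horizon and mistake round are assumed.

def tft(mine, theirs):
    return theirs[-1] if theirs else 'C'

def wsls(mine, theirs):
    if not mine:
        return 'C'
    my_last, their_last = mine[-1], theirs[-1]
    won = (my_last == 'C' and their_last == 'C') or \
          (my_last == 'D' and their_last == 'C')   # R or T counts as a win
    return my_last if won else ('D' if my_last == 'C' else 'C')

def play(strategy, rounds=10, mistake_round=3):
    """Both players use the same strategy; player 1 accidentally defects once."""
    h1, h2 = [], []
    for t in range(1, rounds + 1):
        a1, a2 = strategy(h1, h2), strategy(h2, h1)
        if t == mistake_round:
            a1 = 'D'                               # the accidental defection
        h1.append(a1)
        h2.append(a2)
    return list(zip(h1, h2))

print(play(tft))    # after the mistake, the players alternate cooperation and defection
print(play(wsls))   # one round of mutual defection, then back to mutual cooperation
```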
We also described two other mechanisms. Clustering can enable cooperation to bootstrap itself. This mechanism relies on cooperators playing one another and having cooperation spread through selection. Group selection operates through a similar logic. Groups of cooperators perform well and replace groups of defectors. If we construct models, we find that cooperation that arises through clustering and group selection requires more stringent conditions than does cooperation through repetition or reputation. We also learned how the success of the various mechanisms depends on how we model the individuals. We should not expect these mechanisms to operate identically for people, ants, and trees. More sophisticated actors may be better able to sustain cooperation by being forward-looking, yet they might also be more likely to see the benefits of defecting when surrounded by cooperators.

Most of our discussion has framed cooperation as beneficial. Entities can also cooperate in order to exploit others. Firms form cartels to keep prices artificially high, and countries form coalitions to restrict the supply of a resource, such as oil, for their own benefit, not for the benefit of humanity writ large. Cancer cells cooperate to fight off our immune systems.18 So as we study cooperation, we should keep in mind that it need not be for the common good. The water buffalo do not benefit from the lions' cooperative actions.