Introduction

The traits of an individual often affect both its own performance and that of others. The optimal value of a trait for the individual often differs from the optimal value for its group or the population. For example, if an individual bears all the costs of an action that benefits both itself and others, the net balance can be negative for itself and positive for the group. In the simple case in which the group is a random sample of the population, natural selection favours the value of the trait that maximizes individual performance, not group performance. This is the essence of social dilemmas. For this study I will define social dilemmas quantitatively. The magnitude of a social dilemma is the difference in fitness between two sets of populations that differ only in how costs and benefits are apportioned: a set in which all costs and benefits of an action accrue to the actor, and a set in which individual actions affect others. To “alleviate” a social dilemma is to decrease this difference by better aligning individual interests with the common good.

Using computer simulations, I explore (1) under what conditions selection favours partner choice, direct reciprocity, and indirect reciprocity, and (2) how these cooperation mechanisms contribute, alone and in concert, to alleviating social dilemmas. In partner choice, an individual observes how its potential partners behave towards others and tries to partner with those that give more help1,2,3,4,5,6,7,8,9. If enough individuals choose partners in this way, more generous individuals tend to partner among themselves, and they outperform less generous ones1,2. In reciprocity, each individual adjusts how much it helps according to the help given by its partners either to it (direct reciprocity10,11,12,13,14) or to others (indirect reciprocity9,15,16). Again, this generates a positive correlation between help given and help received. As individuals that give more help receive more help from others, natural selection favours giving1,17,18.

Individuals can engage in direct reciprocity when they interact repeatedly with the same partner. Reciprocators risk becoming trapped in long-lasting, mutually harmful relationships with non-cooperators, or even between themselves if they make perceptual or other errors. To avoid this, they may tolerate some anti-social behaviour in partners19 or end harmful relationships in the hope of getting better partners20,21. However, these alternatives allow non-cooperators to exploit cooperative partners before being reciprocated or rejected. In simulations, direct reciprocity cannot invade a population in which individuals can opt out of relationships (without partner choice)22,23,24. Here I study simple, unforgiving tit-for-tat reciprocity that is constrained only by perceptual acuity. I do not allow an individual to leave its partner unless it has first chosen another partner—all individuals are always paired.

Indirect reciprocators do not need to repeat interactions with the same partner, and harm non-cooperators more consistently than do direct reciprocators. However, when an indirect reciprocator interacts with a non-cooperator, it loses reputation and thereafter receives less help from other indirect reciprocators16. As in direct reciprocity, indirect reciprocators can alleviate this problem by tolerating some degree of defection in a partner25 or by rejecting a partner26. With enough information and cognitive ability, indirect reciprocators can also consider whether someone withholding help from a partner is justified in doing so27,28. There are many justification criteria29,30. I do not explore any of these alternatives and stick with plain tit-for-tat indirect reciprocity.

Direct and indirect reciprocators can avoid non-cooperators by choosing reputable partners. In simulations that allowed the evolution of both indirect reciprocity and partner choice, Roberts31 found that selection favoured both traits and that this enhanced cooperativeness in the population.

Theory and simulations also indicate that, all else being equal, a clumped distribution of kin, for example due to limited dispersal, facilitates the evolution of direct reciprocity32 and indirect reciprocity33. Like partner choice, limited dispersal would lead to a positive assortment of partners according to their generosity. The question then becomes whether natural selection favours sufficiently limited dispersal in the face of kin competition and inbreeding depression34 and whether all else, including the accumulation of slightly deleterious mutations35,36, is equal. These questions are outside the scope of the present study.

When comparing different mechanisms of cooperation, we face the problem that the parameters we choose may determine which mechanism is more successful. In this study, as in previous ones, helping varies continuously and has marginally increasing costs and decreasing benefits18,37,38. However, in my case marginal changes are explicitly tied to ecological factors. Individuals produce two goods or services (hereafter “goods”, anything that enhances fitness), A and B, and share only B. Individuals that produce more B help their partners more and pay a cost by producing less A. The function that translates the consumption of A and B into fitness has two parameters—the fitness value of B relative to A and how easily one good (A or B) can replace the other in terms of contributing to fitness. I explore the whole range of values of these two parameters (not including antagonistic goods). Few previous theoretical studies of helping have explicitly tied fitness functions to ecological factors39,40,41 or explicitly examined the effect of such factors on the evolution of partner choice, direct reciprocity, and indirect reciprocity42,43.

Partner choice, direct reciprocity and indirect reciprocity all rely on the ability to perceive how others behave and to behave accordingly (hereafter “sensitivity”). In my model, for each cooperation mechanism this ability varies continuously and independently and has marginally increasing costs (hereafter “information costs”). To summarize, I want to answer questions (1) and (2) above by simulating the evolution of several continuous, independent traits (the production of B and the sensitivities to the production of B by partners and potential partners; Table 1) that are related to fitness in an ecologically explicit way.

Table 1 Traits that are heritable and can vary between individuals.

Model

To answer (1) under what conditions selection favours partner choice, direct reciprocity, and indirect reciprocity, I vary the two parameters of the fitness function (the relative value of B and the mutual substitutability of A and B; Table 2) plus several ecological and behavioural factors (Table 3). The fitness function determines how many offspring each individual produces. Individuals are asexual. Selection involves the differential production of offspring by genotypes.

Table 2 Other variables mentioned in the text.
Table 3 Factors that vary between simulations.

Individuals are always paired. Social dilemmas stem from the fact that each individual must share B with its partner. To answer (2) how partner choice, direct reciprocity, and indirect reciprocity contribute, alone and in concert, to alleviating social dilemmas I allow different combinations of these cooperation mechanisms in different simulations (Table 4).

Table 4 Types of simulations according to what traits are allowed to mutate.

In this section I explain the above in more detail. In the next section (“Model implementation”) I describe the simulation flow (Box 1) and I justify the constants that are common to all simulations (Table 5).

Box 1 Simulation flow
Table 5 Constants across all simulations.

How many offspring an individual produces depends on the quantities it enjoys of goods A and B. To model this dependence, I use the approach of Tilman44, which has become standard in ecology45. In this approach, how two goods determine the reproductive rate of an organism depends on two parameters: the mutual substitutability and the relative weight or influence of the two goods. For example, an organism may require both A and B to produce any offspring, in which case A and B are essential (non-substitutable), but one unit of A is as important for fecundity as two units of B, in which case A is more influential than B.

Tilman44 used several disparate equations for different degrees of substitutability. However, Arrow et al.46 devised a unified framework that applies to any substitutability. They did so to model economic relationships (between resources and production, or between consumption and well-being) that are conceptually equivalent to those in ecology. With w being fitness, and cA and cB the quantities enjoyed of goods A and B (Table 2),

$$w = \left( (1 - \alpha) c_{A}^{\rho} + \alpha c_{B}^{\rho} \right)^{1/\rho} \tag{1}$$

If ρ = 0, Eq. (1) is indeterminate but, as proven by Arrow et al.46, can be replaced by:

$$w = c_{A}^{1 - \alpha} c_{B}^{\alpha} \tag{2}$$

ρ determines to what degree A and B are mutually substitutable. ρ can range from −∞, meaning that each of A and B is entirely essential, to 1, meaning that they can perfectly substitute for each other (w = (1 − α)cA + αcB). A value of 0 corresponds to the midpoint between these extremes. I use values ranging from −31 to 0.96875. Results for ρ < −31 are indistinguishable from those for ρ = −31, and results for ρ > 0.96875 are indistinguishable from those for ρ = 0.96875.

α determines how much A and B influence w. If α = 0.5, A and B are equally influential. If α < 0.5, B is less influential than A. If α > 0.5, B is more influential than A. I modelled values between 0.1 and 0.9. Results for α < 0.1 are indistinguishable from those for α = 0.1, and results for α > 0.9 are indistinguishable from those for α = 0.9.

Equation (1) has several desirable properties. First, given ρ, goods A and B have the same substitutability for any amount of A and B enjoyed46. Second, varying ρ and α yields a wide range of non-antagonistic effects of A and B on fitness44,45.
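As an illustration, Eqs. (1) and (2) can be written as a single function. This is a minimal sketch, not the code used for the simulations; the name `fitness` and the handling of zero quantities when ρ < 0 (returning the limiting value w = 0) are assumptions made for the example.

```python
def fitness(c_a: float, c_b: float, alpha: float, rho: float) -> float:
    """Fitness from quantities c_a and c_b of goods A and B (Eqs. 1 and 2)."""
    if rho == 0:
        # Eq. (2): the limit of Eq. (1) as rho approaches 0.
        return c_a ** (1 - alpha) * c_b ** alpha
    if rho < 0 and (c_a == 0 or c_b == 0):
        # Assumption of this sketch: a zero quantity raised to a negative
        # power is undefined, so return the limiting fitness of 0.
        return 0.0
    # Eq. (1).
    return ((1 - alpha) * c_a ** rho + alpha * c_b ** rho) ** (1 / rho)


# Perfect substitutes (rho = 1): fitness is a weighted sum of A and B.
w_substitutes = fitness(0.4, 0.6, alpha=0.5, rho=1.0)
# Nearly essential goods (rho = -31): fitness is close to the scarcer good.
w_essential = fitness(0.2, 0.8, alpha=0.5, rho=-31.0)
```

With ρ = −31 the scarcer good dominates, so fitness stays close to 0.2 in the second call even though B is abundant.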

Each individual produces qB (between 0 and 1) units of B, and qA = 1 − qB units of A. A and B are equally easy to produce. Each individual i must give a proportion g of qBi to its partner j, while it keeps all qAi to itself. So, cAi = qAi and cBi = (1 − g)qBi + gqBj. In each simulation g is a constant, not subject to mutation or phenotypic plasticity. All individuals always have the same g. However, individuals differ in how much B they give to their partners if they differ in qB. An individual that produces more B helps its partner more and pays a larger cost because it enjoys less A.

Let me illustrate what g means with some examples. First, I will give two examples in which g = 1. Suppose that i can remove certain ectoparasites from j but not from itself and does not eat the ectoparasites or get any other direct benefit from grooming j. qBi is the quantity of parasites i removes from j in a given period. qAi is the amount of other good (such as food) that i gets in the same period. The more parasites i removes from j the less time or energy i has left to procure food, and so the smaller is qAi. In the extreme case, i devotes all its time or energy to removing parasites from its partner, so qBi = 1 and qAi = 0. Now suppose instead that B is some kind of food. The amount of food i procures is qBi, and all of it goes to j. This happens, for example, when a dolphin flushes fish out or herds them towards a partner. The larger qBi is, the less time or energy i has to get another good A. In both examples i gets no B for itself—by grooming j, i removes none of its own parasites; each time i procures fish to j, i eats no fish. Thus, g = 1—all qBi goes to j.

In other cases, i may act in a way that directly benefits both itself and its partner, so that g lies between 0 and 1. For example, i finds or captures prey B on which both i and j then feed. In my model i spends time or energy looking for or capturing B, and so it has less time or energy to get good A (for example, a mating opportunity), while feeding on B once i finds it involves negligible time or energy. g is the proportion of all B found by i that is eaten by j, and 1 − g is the proportion eaten by i. Pairs of male lions provide a real world example of this—when one of them hunts a large prey, both feed on it, with nearly all the effort going into the hunting and not the eating.

In all these examples, individuals can change qB (how much they groom a partner, how much they herd fish towards a partner, or how much prey they catch) and qA (as it is inversely related to qB), but have little or no control over g. I here show results of simulations for g = 0, g = 0.5 and g = 1.

If we incorporate the effect of g into Eq. (1),

$$w_{i} = \left( (1 - \alpha) q_{Ai}^{\rho} + \alpha \left( (1 - g) q_{Bi} + g q_{Bj} \right)^{\rho} \right)^{1/\rho} \tag{3}$$

If we incorporate the effect of g into Eq. (2),

$$w_{i} = q_{Ai}^{1 - \alpha} \left( (1 - g) q_{Bi} + g q_{Bj} \right)^{\alpha} \tag{4}$$
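The following sketch evaluates Eqs. (3) and (4) for a focal individual i paired with j, and shows the dilemma numerically. The function name `pair_fitness` and the parameter values are illustrative assumptions, not taken from the simulations.

```python
def pair_fitness(q_bi: float, q_bj: float, g: float,
                 alpha: float, rho: float) -> float:
    """Fitness of i (Eqs. 3 and 4) given its own and its partner's q_B."""
    c_a = 1 - q_bi                      # i keeps all the A it produces
    c_b = (1 - g) * q_bi + g * q_bj     # B kept by i plus B received from j
    if rho == 0:
        return c_a ** (1 - alpha) * c_b ** alpha
    if rho < 0 and (c_a == 0 or c_b == 0):
        return 0.0                      # limiting value, assumed for the sketch
    return ((1 - alpha) * c_a ** rho + alpha * c_b ** rho) ** (1 / rho)


# With g = 1, a low producer paired with a high producer does better
# than its partner, which is the source of the social dilemma.
w_low = pair_fitness(0.1, 0.5, g=1.0, alpha=0.5, rho=0.5)
w_high = pair_fitness(0.5, 0.1, g=1.0, alpha=0.5, rho=0.5)
```

With g = 0 the partner's production drops out of c_b, so the same function also covers the case without a dilemma.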

If ρ < 1, the qBi = qBj that maximizes wi, given qA = 1 − qB, is

$$q_{B}^{*} = \frac{1}{1 + \left( \frac{(1 - g)\alpha}{1 - \alpha} \right)^{1/(\rho - 1)}} \tag{5}$$
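Equation (5) can be recovered from a first-order condition. The steps below are a reconstruction, not a derivation given in the text. They assume that i maximizes wi in Eq. (3) over qBi while treating qBj as fixed, with qAi = 1 − qBi and ρ ≠ 0, and that the symmetric solution qBi = qBj is imposed afterwards. Writing cBi = (1 − g)qBi + gqBj,

$$\frac{\partial w_{i}}{\partial q_{Bi}} = w_{i}^{1 - \rho} \left[ \alpha (1 - g) c_{Bi}^{\rho - 1} - (1 - \alpha)(1 - q_{Bi})^{\rho - 1} \right] = 0$$

Setting qBi = qBj = qB* gives cBi = qB*, so

$$\left( \frac{1 - q_{B}^{*}}{q_{B}^{*}} \right)^{\rho - 1} = \frac{(1 - g)\alpha}{1 - \alpha} \quad \Rightarrow \quad q_{B}^{*} = \frac{1}{1 + \left( \frac{(1 - g)\alpha}{1 - \alpha} \right)^{1/(\rho - 1)}}$$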

If ρ = 0, we can get qB* directly from Eq. (5) or deduce it from Eq. (4) given qA = 1 − qB:

$$q_{B}^{*} = \frac{\alpha (1 - g)}{1 - \alpha g} \tag{6}$$

If g = 1 and ρ = 0, qB* = 0. If g = 1 and ρ ≠ 0, Eq. (5) is indeterminate, but qB* → 0 as g → 1, so I use qB* = 0.
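A short numerical check of Eqs. (5) and (6), including the convention qB* = 0 at g = 1. The function names are hypothetical and the sketch is illustrative only.

```python
def q_b_star(alpha: float, rho: float, g: float) -> float:
    """Individually optimal q_B from Eq. (5); requires rho < 1."""
    if rho >= 1:
        raise ValueError("Eq. (5) applies only if rho < 1")
    if g == 1:
        return 0.0  # convention stated in the text for g = 1
    ratio = (1 - g) * alpha / (1 - alpha)
    return 1 / (1 + ratio ** (1 / (rho - 1)))


def q_b_star_rho0(alpha: float, g: float) -> float:
    """Closed form for rho = 0, Eq. (6)."""
    return alpha * (1 - g) / (1 - alpha * g)


# Eq. (5) evaluated at rho = 0 agrees with Eq. (6).
difference = q_b_star(0.5, 0.0, 0.5) - q_b_star_rho0(0.5, 0.5)
```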

Changing the relative effort needed to produce each unit of A versus B has the same effect on qB* as changing α. Therefore, qB* depends on α, ρ and g, and we do not lose generality by stipulating that A and B are equally easy to produce. In infinite populations and in the absence of cooperation mechanisms such as partner choice or reciprocity, selection sets the equilibrium at qB*. Figs. S1-S3 and Tables S1-S3 of the Supplementary Information illustrate the joint effect of ρ, α and g on the fitness function and on qB*. Varying ρ, α, and g generates a wide range of unimodal fitness landscapes (Fig. S3 of the Supplementary Information).

We calculate the qB that maximizes the joint fitness of each pair of individuals and the population (qBs*) by setting g = 0. If ρ < 1,

$$q_{Bs}^{*} = \frac{1}{1 + \left( \frac{\alpha}{1 - \alpha} \right)^{1/(\rho - 1)}} \tag{7}$$

If ρ = 0, this becomes qBs* = α. If α = 0.5, qBs* = 0.5 regardless of ρ. As ρ → −∞, qBs* → 0.5 regardless of α (Fig. S4 of the Supplementary Information).

By definition, if g = 0, qB* = qBs* and there is no social dilemma (no difference between the values that maximize individual and group performance). If g > 0, qB* < qBs*. Therefore, the need to share B generates a social dilemma. If g = 1, qB* = 0. Doebeli et al.47 defined the cases of g = 1 and 0 < g < 1 as the continuous analogues of the prisoner’s dilemma and the snowdrift game, respectively. If g = 1 and B is essential (ρ ≤ 0), individuals are expected to produce no offspring. However, mutation and drift keep qB above zero in the finite populations I simulated (Fig. S4 of the Supplementary Information). If B is substitutable (ρ > 0) and less influential than A (α < 0.5), the social dilemma is mild (qB* ≈ qBs*) regardless of g (Figs. S4-S6 of the Supplementary Information).
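The gap between the socially and individually optimal production can be computed directly from Eqs. (5) and (7). The sketch below is illustrative; `optimal_q_b` and `dilemma_gap` are hypothetical names, and the parameter values are chosen only to show the qualitative pattern described above.

```python
def optimal_q_b(alpha: float, rho: float, g: float = 0.0) -> float:
    """q_B* from Eq. (5); with g = 0 this is q_Bs* from Eq. (7)."""
    if rho >= 1:
        raise ValueError("the closed form applies only if rho < 1")
    if g == 1:
        return 0.0  # convention stated in the text
    ratio = (1 - g) * alpha / (1 - alpha)
    return 1 / (1 + ratio ** (1 / (rho - 1)))


def dilemma_gap(alpha: float, rho: float, g: float) -> float:
    """Socially optimal minus individually optimal q_B (q_Bs* - q_B*)."""
    return optimal_q_b(alpha, rho, 0.0) - optimal_q_b(alpha, rho, g)


# Essential, influential B: sharing opens a wide gap.
gap_essential = dilemma_gap(alpha=0.5, rho=-1.0, g=1.0)
# Substitutable, weakly influential B: the gap is negligible.
gap_mild = dilemma_gap(alpha=0.2, rho=0.9, g=0.5)
```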

Mutations affect the default qB an individual is born with (qBd). In all variants of reciprocity, individual i changes qBi with a precision or rounding error that correlates with its ability to measure the difference between its partner’s qBj and its own qBi. In tit-for-tat direct reciprocity, individual i uses its default qB the first time it interacts with a new partner; afterwards, it can change qBi to bring it closer to qBj in the previous time step. In indirect reciprocity, i not only observes how its current partner behaves, but also how other potential partners behave, and then acts accordingly towards them once they become partners. For example, i watches how j behaves towards j’s partner k. Once i and j become partners, i uses this information to try to behave towards j as j behaved towards k. So, unlike in direct reciprocity, i can mimic j also the first time it interacts with it, provided that both i and j were already alive in the previous time step.

I have modelled two versions of indirect reciprocity. Both rely on “image scoring”—each individual has a reputation score based on how much it has helped others in the past16. In “short memory” indirect reciprocity, each individual knows the most recent qB of its potential partners16,31,48. This version is the most like tit-for-tat direct reciprocity. In “long memory” indirect reciprocity, each individual knows the lifelong average qB of its potential partners.

In partner choice, two individuals choose to become partners if both detect that the other produces more B than do their current partners, according to their ability to measure the difference between the qB of potential partners. As in indirect reciprocity, I have also modelled short-memory and long-memory partner choice, based on the information available to individuals.

In both partner choice and reciprocity, the higher the sensitivity to the behaviour of others, the higher the information cost and the closer the correlation between qBi and qBj. All sensitivities (for direct reciprocity, short-memory indirect reciprocity, long-memory indirect reciprocity, short-memory partner choice and long-memory partner choice) and qB mutate independently of each other.

Partner choice and indirect reciprocity take place within subsets of the population, as each individual knowing how all other individuals in the population have behaved is unrealistic20,31,49. Thus, I simulate a population that is divided into permanent groups, and individuals only know the members of their own group. I call these groups “markets”50. Offspring disperse randomly to any market in the population. Thus, there is no kin selection due to limited dispersal. In simulations of “shuffling” markets, every time step each individual gets a new partner selected at random from its market. Immediately after this shuffling, if there is partner choice, individuals can agree to switch to a preferred partner. In simulations of “non shuffling” markets, each pair keeps together until one of its members dies or chooses a new partner.

For each combination of long memory allowed and not allowed, shuffling and no shuffling, and large and small markets (with 128 and 4 individuals, respectively), I conducted simulations allowing the evolution of different combinations of traits (Table 4). I did not directly simulate indirect reciprocity in the absence of direct reciprocity, but this is approximated by the case of shuffling in large markets, as individuals rarely repeat partner in successive time steps.

Model implementation

Start


In all simulations the population size is 4096 (Table 5), a multiple of 128 so that markets of 128 individuals can be formed. At the start of each simulation all individuals have the traits and values of Table 1, are paired, and are permanently assigned to markets (Box 1).

Production of offspring


wi (Table 2) is computed for each individual i.

Shuffling


In simulations with “shuffling”, all pairs dissolve and the members of each market are randomly paired.
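One way to implement this step is to shuffle the members of a market and pair neighbours in the shuffled order. This is a sketch under the assumption that market size is even (as with 128 or 4 individuals); `shuffle_market` is a hypothetical name.

```python
import random


def shuffle_market(members: list, rng: random.Random) -> list:
    """Dissolve all pairs in a market and pair its members at random."""
    order = list(members)
    rng.shuffle(order)
    # Consecutive individuals in the shuffled order become partners.
    return [(order[k], order[k + 1]) for k in range(0, len(order), 2)]


pairs = shuffle_market([0, 1, 2, 3], random.Random(0))
```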

Partner choice


In simulations with partner choice, two individuals of the same market can mutually agree to leave their current partners and become partners. An individual i chooses a partner using traits s1i or s2i (Table 1). The smaller s1i or s2i are, the choosier i is—it can detect smaller differences in qB between potential partners and thus find a better partner. i uses s1 to compare the qB of potential partners in the previous time step. i uses s2 to compare the average lifetime qB of potential partners. To solve the problem of individuals competing for the same potential partner, individuals partner on the basis of first come first served plus mutual agreement. In the end, individuals do not match perfectly according to their qB and s1 or s2, but they do so better than expected by chance.

Partner choice proceeds as follows. In each time step t and in each market of size n, an individual i is chosen at random. i approaches its n − 2 potential partners (the n members of its market except its current partner and itself) in a haphazard order. i and k become partners if i prefers k to its current partner j, and k prefers i to its current partner l. i prefers k to j if qBk − qBj > s, with s being the smaller of s1i and s2i. Once paired, i and k no longer look for or accept new partners in t, and j and l provisionally become partners. This process is repeated for the remaining market members.
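The procedure can be sketched as follows for a single market and time step. This is one reading of the description, not the simulation code: the data layout (dictionaries keyed by individual), the name `partner_choice`, and the assumption that an individual that fails to find a new partner can still accept a later approach are all choices made for the example. `q_b` holds whichever value the chooser compares (most recent or lifetime average), and `s` holds the smaller of s1 and s2 for each individual.

```python
import random


def partner_choice(market: list, partner: dict, q_b: dict, s: dict,
                   rng: random.Random) -> dict:
    """One round of mutual-agreement partner choice within a market."""
    available = set(market)        # not yet re-paired by choice in this step
    order = list(market)
    rng.shuffle(order)             # individuals are drawn in random order
    for i in order:
        if i not in available:
            continue
        j = partner[i]
        candidates = [k for k in market if k != i and k != j]
        rng.shuffle(candidates)    # haphazard order of approach
        for k in candidates:
            if k not in available:
                continue
            l = partner[k]
            # Both must prefer each other to their current partners.
            if q_b[k] - q_b[j] > s[i] and q_b[i] - q_b[l] > s[k]:
                partner[i], partner[k] = k, i
                partner[j], partner[l] = l, j   # provisional new pair
                available.discard(i)
                available.discard(k)
                break
    return partner


pairs = partner_choice(
    market=[0, 1, 2, 3],
    partner={0: 1, 1: 0, 2: 3, 3: 2},
    q_b={0: 0.9, 1: 0.1, 2: 0.8, 3: 0.1},
    s={0: 0.05, 1: 0.05, 2: 0.05, 3: 0.05},
    rng=random.Random(0),
)
```

In this example the two high producers (0 and 2) end up together and the two low producers are left with each other, whatever the order in which individuals are drawn.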

Adult death


Each individual dies with probability d = 2⁻⁷. I chose a d small enough to allow partners to interact many times and thus engage in direct reciprocity, but not so small that time to equilibrium is very long. With this d, two individuals remain partners for an average of 64.25 (= 1/(1 − (1 − 2⁻⁷)²)) time steps in non-shuffled populations without partner choice.
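The expected duration of a partnership follows from the probability that both partners survive a time step. A quick check of the arithmetic (variable names are illustrative):

```python
d = 2 ** -7                          # per-step death probability
p_pair_survives = (1 - d) ** 2       # both partners survive the step
# A pair ends with probability 1 - p_pair_survives per step, so its
# expected duration is the mean of a geometric distribution.
expected_steps = 1 / (1 - p_pair_survives)
```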

Establishment of new individuals


Randomly selected offspring from the whole population replace the dead adults. The probability that an offspring of i replaces a dead individual is proportional to wi. A new individual may be born from any parent in the population, and not necessarily from a parent in the same market. The new individual is paired with the partner of the dead one.
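Fitness-proportional replacement can be sketched with weighted sampling. The example assumes that all wi are non-negative and that parents are drawn with replacement from the whole population; `choose_parents` is a hypothetical name.

```python
import random


def choose_parents(fitness: dict, n_dead: int, rng: random.Random) -> list:
    """Draw one parent per dead adult with probability proportional to w."""
    ids = list(fitness)
    weights = [fitness[i] for i in ids]   # assumed non-negative
    return rng.choices(ids, weights=weights, k=n_dead)


parents = choose_parents({"a": 0.0, "b": 2.0}, n_dead=5, rng=random.Random(1))
```

Because the draw is made over the whole population, a parent's market plays no role, which matches the absence of limited dispersal in the model.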

A new individual inherits the values of the traits in Table 1 from its parent. According to the type of simulation, the traits marked in Table 4 mutate. I assume that each trait marked in Table 4 is highly polygenic51, so every new individual carries at least one mutation that affects its value. Values in offspring follow a normal distribution centred at the parental value and truncated at 0 and at 1 to obey the ranges in Table 1. Larger standard deviations of this distribution (mutation sizes) generate more drift, adding more noise to results. Smaller standard deviations result in longer times to equilibrium. I use a mutation size of 2⁻⁶, which ensures that offspring resemble their parents, and balances noise and time to equilibrium. I generate the mutations with the code in ref. 52.
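A truncated normal of this kind can be sampled by rejection, redrawing until the value falls within the allowed range. This sketch is an assumption about one possible implementation and is not the code of ref. 52; `mutate` and `MUTATION_SIZE` are hypothetical names.

```python
import random

MUTATION_SIZE = 2 ** -6   # standard deviation of the mutation distribution


def mutate(parent_value: float, rng: random.Random,
           sd: float = MUTATION_SIZE) -> float:
    """Offspring trait value: normal around the parent, truncated to [0, 1]."""
    while True:
        value = rng.gauss(parent_value, sd)
        if 0.0 <= value <= 1.0:
            return value          # redraw values outside the range


rng = random.Random(42)
offspring = [mutate(0.5, rng) for _ in range(2000)]
```

Rejection sampling keeps the shape of the normal distribution inside the range, whereas clipping out-of-range values to 0 or 1 would pile probability mass on the limits.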

Offspring death


Offspring that do not replace dead adults die.

Reciprocity


In simulations with direct or indirect reciprocity, each individual decides how much to help its partner. Each individual i is born with a default qBdi, and thus with a default qAdi = 1 − qBdi. i uses qBdi in its first time step. In simulations without reciprocity, it keeps using qBdi for the remainder of its life. In simulations with reciprocity, i can change qBi in response to the previous behaviour of its partner j.

In the case of direct reciprocity, trait s3i determines how individual i adjusts qBi to mimic qBj. i mimics j imperfectly, with errors that increase with s3i. i measures |qBj − qBdi| in steps of size s3i. The first step starts at qBdi and the last encloses qBj. If the last step also encloses 0 or 1, which are the hard limits to qB, the step shrinks so that it ends in 0 or 1. i changes qBi if |qBj − qBdi| > 0 steps; that is, if |qBj − qBdi| > s3i. In such a case, it then sets qBi at the midpoint of the step that encloses qBj. If s3i = 1, i sees every |qBj − qBdi| as 0 steps, and thus it never reciprocates. If s3i = 0, i would perfectly copy j.
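The stepwise mimicry rule can be sketched as below. It is one reading of the description: steps of size s are laid out from qBd towards qBj, the step that encloses qBj is truncated at 0 or 1 if necessary, and i adopts its midpoint. The function name `reciprocate`, and the treatment of the boundary case in which the difference is an exact multiple of s, are assumptions of the example.

```python
import math


def reciprocate(q_bd: float, q_bj: float, s: float) -> float:
    """q_B that i adopts when mimicking q_bj with sensitivity s."""
    if s == 0:
        return q_bj                       # perfect copy
    diff = q_bj - q_bd
    if abs(diff) <= s:
        return q_bd                       # difference perceived as 0 steps
    direction = 1.0 if diff > 0 else -1.0
    full_steps = math.floor(abs(diff) / s)
    # The enclosing step runs from `start` to `end`, moving away from q_bd.
    start = q_bd + direction * full_steps * s
    end = q_bd + direction * (full_steps + 1) * s
    end = min(1.0, max(0.0, end))         # shrink the step at the hard limits
    return (start + end) / 2


coarse = reciprocate(0.2, 0.75, 0.25)     # enclosing step spans 0.7 to 0.95
clipped = reciprocate(0.2, 0.9, 0.3)      # enclosing step shrinks to 0.8 to 1
```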

Traits s4 and s5 work like s3. However, they differ in what they perceive as qBj. For s3, qBj is the qB of j in time step t − 1, provided j was i’s partner also in t − 1. Thus, i uses s3 for simple, tit-for-tat, direct reciprocity.

For s4, qBj is the qB of j in t − 1, regardless of whether j was i’s partner in t − 1 or not. This means that i knows the value of qBj from i’s direct experience with j, from observing j’s behaviour towards another individual, or from gossip. Thus, i uses s4 for indirect reciprocity with memory one.

For s5, the relevant trait of j is not qBj in t − 1, but the average of all the values of qBj since j was born. Thus, i uses s5 for indirect reciprocity with lifetime image scoring. The first time step with j, i uses the smaller of s4 and s5; afterwards, it uses the smaller of s3 and s5.

Repeats


A new time step starts with the computation of wi. I run the simulations until there is no directional change in the population averages of the traits in Table 1. I present the means of 30 runs of each simulation.

Information costs


Lower values of traits s1 to s5 are more costly. Comparing two values of qB with smaller s results in smaller rounding errors but requires better senses or cognition. Constant f modulates the costs:

$$w_{i} = \left( (1 - \alpha) q_{Ai}^{\rho} + \alpha \left( (1 - g) q_{Bi} + g q_{Bj} \right)^{\rho} \right)^{1/\rho} + f \sum_{v = 1}^{5} \ln(s_{vi})$$

except if ρ = 0,

$$w_{i} = q_{Ai}^{1 - \alpha} \left( (1 - g) q_{Bi} + g q_{Bj} \right)^{\alpha} + f \sum_{v = 1}^{5} \ln(s_{vi}).$$

Notice that the cost term f Σ ln(svi), summed over v = 1 to 5, is ≤ 0 because every svi lies between 0 and 1; it equals 0 only when all svi = 1. I use f = 2⁻¹⁵, which is small but still prevents traits s1 to s5 from being subject only to drift. Given the initial trait values (Table 1), at the start of a simulation individuals cannot choose or mimic partners and pay no information costs.
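The cost term can be evaluated on its own to see its magnitude. A minimal sketch, assuming all sensitivities are strictly positive (the logarithm is undefined at 0); `information_cost` is a hypothetical name.

```python
import math

F = 2 ** -15   # constant f modulating information costs


def information_cost(sensitivities: list, f: float = F) -> float:
    """Additive fitness term f * sum(ln(s_v)); zero or negative for s_v in (0, 1]."""
    return f * sum(math.log(s) for s in sensitivities)


no_cost = information_cost([1.0] * 5)      # insensitive individual
some_cost = information_cost([0.5] * 5)    # moderately sensitive individual
```

Because ln(s) falls ever more steeply as s approaches 0, each further gain in sensitivity costs more than the previous one, which gives the marginally increasing information costs described in the Introduction.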

Controls


For each factor level (Table 3), I run control simulations with dummy traits s1C to s5C instead of s1 to s5. These are heritable, mutate and incur the same information costs as s1 to s5, but do not cause individuals to switch or mimic partners. They evolve according only to drift and negative selection due to information costs.

Results and discussion

I show a summary of the results of simulations with cooperation mechanisms in Figs. S7-S39 of the Supplementary Information (see also Table S4 of the Supplementary Information). Here I show a subset.

Partner choice in the absence of direct and indirect reciprocity evolutionarily succeeds in more cases (Fig. 1) and alleviates social dilemmas in more cases and more thoroughly (Fig. 2) than do direct reciprocity (Figs. 3 and 4) and indirect reciprocity (Figs. 4 and 5) in the absence of partner choice. This is particularly true when markets are large (see also refs. 21, 31, 42) and when B is essential. If markets are large, short-memory partner choice alone alleviates social dilemmas almost as much as do all the cooperation mechanisms studied here (short-memory partner choice, long-memory partner choice, direct reciprocity, short-memory indirect reciprocity and long-memory indirect reciprocity) evolving together (Fig. 6). Partner choice enhances the evolution of indirect reciprocity in large, shuffled markets when g = 1 (Fig. 7).

Fig. 1

Short-memory partner choice. Sensitivities for short-memory partner choice (s1) when this is the only cooperation mechanism allowed, as deviations from expected values. I show differences between s1 and s1C, a dummy variable with the same mutation rate and information costs as s1 but no effect on behaviour in control simulations with no cooperation mechanisms (s1C − s1, so that a redder colour indicates a higher sensitivity). I show here the results for g = 1. See Fig. S7 of the Supplementary Information for g = 0.5. (a, c) Partners keep together unless one of them decides to switch partner or dies. (b, d) Partners are randomly shuffled within markets every time step before individuals choose a partner. (a, b) Markets have 128 individuals. (c, d) Markets have 4 individuals.

Fig. 2

Increase in fitness under short-memory partner choice. Difference between fitness (w) in simulations with short-memory partner choice and w in control simulations without cooperation mechanisms, for g = 1. The remaining properties of the figure are the same as those of Fig. 1. Partner choice effectively eliminates social dilemmas in populations with large markets and no shuffling (a) (see also Figs. S8-S11 of the Supplementary Information).

Fig. 3

Direct reciprocity. Sensitivities for direct reciprocity (s3) when this is the only cooperation mechanism allowed, as deviations from expected values. I show differences between s3 and s3C, a dummy variable with the same mutation rate and information costs as s3 but no effect on behaviour in control simulations with no cooperation mechanisms (s3C − s3, so that a redder colour indicates a higher sensitivity). I show here the results for g = 1. See Fig. S12 of the Supplementary Information for g = 0.5. (a) Partners keep together unless one of them dies, and market size is irrelevant. (b, c) Partners are randomly shuffled within markets every time step. (b) Markets have 128 individuals and there is little opportunity for direct reciprocation. (c) Markets have 4 individuals.

Fig. 4

Increase in fitness under direct reciprocity. Difference between fitness (w) in simulations with direct reciprocity and w in control simulations without cooperation mechanisms. The remaining properties of the figure are the same as those of Fig. 3. I show here the results for g = 1. See also Figs. S12-S18 of the Supplementary Information.

Fig. 5

Direct reciprocity and short-memory indirect reciprocity. Sensitivities for direct reciprocity (s3) (a, d) and short-memory indirect reciprocity (s4) (b, e) as deviations from expected values, as in Figs. 3 and 4. In these simulations I allow the simultaneous evolution of direct reciprocity and short-memory indirect reciprocity. Partners are randomly shuffled within markets every time step. I show here the results for g = 1. See also Figs. S12-S18 of the Supplementary Information. (c, f) Difference between fitness (w) in simulations with direct reciprocity and short-memory indirect reciprocity and w in control simulations without cooperation mechanisms. (a, b, c) Markets have 128 individuals and there is little opportunity for direct reciprocation. (d, e, f) Markets have 4 individuals. (d) shows that short-memory indirect reciprocity enhances the evolutionary success of direct reciprocity (compare with Fig. 3). (f) shows that direct reciprocity and short-memory indirect reciprocity together alleviate social dilemmas more effectively than does direct reciprocity alone (compare with Fig. 4).

Fig. 6

Fitness deficit under short-memory partner choice with and without other cooperation mechanisms. Difference between w in simulations with mechanisms of cooperation and g = 1, and w in simulations without cooperation mechanisms and g = 0 (and thus no social dilemma). I allow short-memory partner choice in all the conditions shown in this figure. See Figs. S35 and S36 of the Supplementary Information for more data. (a, b, e, f) Each individual keeps its partner unless one of them switches partner or dies. (c, d, g, h) Partners are randomly shuffled within markets every time step before individuals choose a partner. (b, d, f, h) I also allow long-memory partner choice, direct reciprocity, short-memory indirect reciprocity, and long-memory indirect reciprocity. (a, b, c, d) Markets have 128 individuals. (e, f, g, h) Markets have 4 individuals.

Fig. 7

Long-memory partner choice and long-memory indirect reciprocity, and their effect on B production and fitness. Sensitivities for long-memory partner choice (s2) (a, e) and long-memory indirect reciprocity (s5) (b, f) as deviations from expected values, as in Figs. 3-5. In these simulations I allow the simultaneous evolution of long-memory partner choice and long-memory indirect reciprocity. I show here the results for g = 1. See also Figs. S19-S34 of the Supplementary Information. Partners are randomly shuffled within markets every time step. (c, g) Difference between the production of B (qB) in simulations with long-memory partner choice and long-memory indirect reciprocity and g = 1, and qB in control simulations without cooperation mechanisms and g = 0 (and thus no social dilemma). (c) shows that some populations produce on average more B than is socially optimal. (d, h) Difference between fitness (w) in simulations with long-memory partner choice and long-memory indirect reciprocity and g = 1, and w in control simulations without cooperation mechanisms and g = 0 (and thus no social dilemma). (a, b, c, d) Markets have 128 individuals. (e, f, g, h) Markets have 4 individuals.

If markets are small and B is substitutable and not very influential, partner choice fails to evolve (Fig. 1) and to alleviate social dilemmas (Fig. 2, Fig. S11 of the Supplementary Information). If B is substitutable and not very influential, good cooperators (individuals that produce much B) partnered with bad cooperators (individuals that produce little B) produce many fewer offspring than do pairs of bad cooperators, and pairs of good cooperators do not produce enough extra offspring to compensate for this disadvantage (see Fig. S40 of the Supplementary Information). In contrast, if B is essential, pairs of good cooperators produce many more offspring than do pairs of bad cooperators. This explains why partner choice succeeds more readily if B is essential. If markets are large, partner choice ensures that good cooperators rarely partner with bad cooperators, and partner choice succeeds even when B is substitutable.

In the absence of direct and indirect reciprocity, selection does not act against partner choice per se—when partner choice is not favoured, its dynamics are nearly neutral (driven by mutation, drift, and negative selection due to information costs). In the absence of direct and indirect reciprocity, long-memory partner choice equates to short-memory partner choice because the average lifelong qB of an individual equates to its most recent qB.

When selection favours partner choice, qB approaches the socially optimal value (Figs. S8 and S9 of the Supplementary Information). Thus, partner choice alleviates social dilemmas (Fig. 2). In some borderline cases qB is bimodal (see Figs. S37 and S38 of the Supplementary Information). In contrast to some previous studies5,8, partner choice in the absence of direct and indirect reciprocity brings qB close to its socially optimal level and not beyond. In my simulations overly generous individuals cannot benefit from attracting many partners because every individual always has a partner, with which it interacts just once per time step. This may contribute to the absence of “runaway cooperation”.

In the absence of partner choice and indirect reciprocity, direct reciprocity differs from partner choice in three ways (Fig. 3, Fig. S12 of the Supplementary Information). First, it is evolutionarily successful only when B is substitutable and not very influential, provided there is an initial social dilemma. Second, when selection does not favour mimicking a partner, it usually depresses the response to partners below that expected from nearly neutral dynamics. And third, reciprocators can foster population-wide cooperation even if they are kept rare by selection (Fig. 4, Figs. S15–S18 of the Supplementary Information) (see also ref. 53). These three differences all have the same underlying cause. By mimicking others, reciprocators hurt bad cooperators but also often themselves.

If g = 0, individual i mimicking its partner j does not affect j but is often bad for i: i behaves optimally only if both i and j behave optimally in the first round of their partnership (qBdi = qBdj = qB*). This by itself would not be a problem in an infinite population (apart from information costs), but it is a problem in a finite population, in which mutation and drift ensure that some partners behave suboptimally. Therefore, selection acts against mimicking (Fig. S41 of the Supplementary Information).

If g > 0, an individual with qB* < qBdi < qBs* partnered with a better cooperator (qBdi < qBdj) that does not reciprocate is better off exploiting it than mimicking it. If g = 1, when partnered with a worse cooperator (qB* < qBdj < qBdi) that does not reciprocate, i is better off mimicking it: if i mimics j, i gets as much B and more A than otherwise. However, if A and B are mutually substitutable but B is much more influential than A, the gain is small because A contributes little to fitness (Fig. S41 of the Supplementary Information). If A and B are highly essential and j produces little or no B, i gains little or nothing by mimicking j because the limiting factor is B, not A. If 0 < g < 1, the situation is even worse because i gets less B by mimicking j than otherwise. On the other hand, in all these cases, reciprocators harm bad cooperators by producing less B and thus giving less B to them. As with partner choice, in some borderline cases qB is bimodal (see Fig. S39 of the Supplementary Information).
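The payoff comparisons above can be illustrated with a minimal numerical sketch. It rests on assumptions that are not taken from the simulations themselves: each individual has a budget of 1 that it splits between A and B, it keeps a fraction 1 − g of its own B and receives a fraction g of its partner's B, and fitness is a CES function of A and B with weight alpha on B and substitutability rho (rho = 1 for perfect substitutes, very negative rho for essential goods). The function names are hypothetical.

```python
def ces_fitness(a, b, alpha, rho):
    """Hypothetical CES fitness: (alpha*b**rho + (1-alpha)*a**rho)**(1/rho)."""
    if rho == 0:
        raise ValueError("rho = 0 (Cobb-Douglas limit) is not handled in this sketch")
    if rho < 0 and (a <= 0 or b <= 0):
        # With essential goods, lacking either good gives zero fitness.
        return 0.0
    return (alpha * b**rho + (1 - alpha) * a**rho) ** (1 / rho)


def b_received(q_i, q_j, g):
    """B available to i: the share it keeps plus the share j gives (assumed rule)."""
    return (1.0 - g) * q_i + g * q_j


def fitness_of_i(q_i, q_j, g, alpha, rho):
    """Fitness of i when it produces q_i of B (and 1 - q_i of A) and j produces q_j."""
    a = 1.0 - q_i  # assumed linear trade-off between A and B
    return ces_fitness(a, b_received(q_i, q_j, g), alpha, rho)


def gain_from_mimicking(q_i, q_j, g, alpha, rho):
    """Fitness change for i if it lowers its B production to match j."""
    mimic = fitness_of_i(q_j, q_j, g, alpha, rho)
    keep = fitness_of_i(q_i, q_j, g, alpha, rho)
    return mimic - keep


if __name__ == "__main__":
    # g = 1, substitutable goods, equal influence: mimicking pays.
    print(gain_from_mimicking(0.8, 0.2, g=1.0, alpha=0.5, rho=1.0))
    # Same, but B far more influential than A: the gain shrinks.
    print(gain_from_mimicking(0.8, 0.2, g=1.0, alpha=0.95, rho=1.0))
    # Essential goods and a partner that gives no B: nothing to gain.
    print(gain_from_mimicking(0.8, 0.0, g=1.0, alpha=0.5, rho=-10.0))
```

Under these assumptions the gain from mimicking a worse cooperator at g = 1 is about 0.3 when A and B are equally influential substitutes, about 0.03 when alpha = 0.95, and zero when the goods are essential and the partner gives no B. With 0 < g < 1 the same rule gives i less B after mimicking, because i keeps part of its own production.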

In markets of 128 individuals that are shuffled, the probability of getting the same partner in two successive time steps is less than 1/127, taking mortality into account. As a result, there is little opportunity for mimicking, and sensitivity for direct reciprocity is nearly neutral.

In markets of 4 individuals that are shuffled, the probability of getting the same partner in two successive time steps is slightly less than 1/3 and partnerships last an average of 1.8 time steps. The first rounds of partnerships, in which direct reciprocators behave like non-reciprocators, make up a large fraction of all rounds of interactions. Shuffling affects direct reciprocity like high mortality does11,49. As a result, direct reciprocity is positively selected under few combinations of ρ and α.
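The bounds of 1/127 and 1/3 follow from uniform random pairing: with n individuals, a focal individual's previous partner is one of n − 1 equally likely new partners. The sketch below checks this by brute force. It ignores mortality and partner choice, so it gives only the upper bound, and the function names are hypothetical.

```python
import random
from fractions import Fraction


def same_partner_probability(market_size):
    """Exact chance of drawing the same partner twice under uniform random pairing."""
    if market_size < 2 or market_size % 2:
        raise ValueError("market_size must be an even number >= 2")
    return Fraction(1, market_size - 1)


def random_pairing(market_size, rng):
    """Shuffle the market and pair neighbours; returns a partner lookup table."""
    order = list(range(market_size))
    rng.shuffle(order)
    partner = {}
    for a, b in zip(order[::2], order[1::2]):
        partner[a] = b
        partner[b] = a
    return partner


def estimate_same_partner(market_size, trials=20000, seed=1):
    """Monte Carlo estimate for individual 0 over two successive shuffles."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        first = random_pairing(market_size, rng)
        second = random_pairing(market_size, rng)
        hits += first[0] == second[0]
    return hits / trials


if __name__ == "__main__":
    for n in (4, 128):
        print(n, same_partner_probability(n), estimate_same_partner(n))
```

Mortality can only lower these values, because a dead partner cannot be drawn again, which is consistent with the "less than" qualifiers in the text.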

For short-memory indirect reciprocity to take place, individuals must receive new partners via shuffling or partner choice. In the absence of partner choice, short-memory indirect reciprocity has less evolutionary success and contributes less to alleviating social dilemmas (Fig. 5) than short-memory partner choice and direct reciprocity. If markets have 4 individuals and are shuffled, and g = 1, short-memory indirect reciprocity enhances the evolutionary success of direct reciprocity (compare Fig. 3 with Fig. 5) and they together alleviate social dilemmas more effectively than does direct reciprocity alone (compare Fig. 4 with Fig. 5; see also Fig. S15 of the Supplementary Information). If g = 0.5, these effects are minimal (Figs. S12, S13 and S17 of the Supplementary Information).

In the absence of partner choice, long-memory indirect reciprocity has little evolutionary success, with selection depressing it well below the values expected from nearly neutral dynamics under many combinations of ρ and α, when g = 1 and markets are shuffled (Fig. S14). It contributes little to the effect of direct reciprocity on fitness and even counters it when g = 1 and small markets are shuffled (Fig. S17 of the Supplementary Information).

I will now turn to the results of simulations that allow individuals to both choose partners and reciprocate. Keep in mind that this study considers small and equal information costs for all the cooperation mechanisms.

If there is short-memory partner choice, direct reciprocity in large markets (Fig. S20 of the Supplementary Information) and short-memory indirect reciprocity in both large and small markets (Fig. S21 of the Supplementary Information) fail to succeed evolutionarily beyond the level expected from nearly neutral dynamics. In populations with small markets and no shuffling, direct reciprocity evolves in a restricted set of combinations of ρ and α and has two contrasting effects on the evolution of partner choice: it harms it when A and B are neither essential nor easily substitutable, and enables it when they are easily substitutable and B is slightly more influential than A (Fig. S19 of the Supplementary Information). In the latter case, the presence of even a few reciprocators that punish bad cooperators erases some of the advantage that bad cooperators have over partner choosers. Short-memory indirect reciprocators have the same effect in small markets that are shuffled (Fig. S19 of the Supplementary Information). In populations with small, non-shuffled markets, partner choice and direct reciprocity are most effective at solving social dilemmas under different combinations of ρ and α, and when they evolve together no population retains a significant social dilemma (Fig. S25 of the Supplementary Information).

In populations with large markets, no other mechanism adds much to what short-memory partner choice already achieves alone in alleviating social dilemmas (Fig. 6, Figs. S35 and S36 of the Supplementary Information). However, there is a strong interaction of partner choice and long-memory indirect reciprocity in populations with large, shuffled markets (Fig. 7). In this case, allowing partner choice enhances long-memory indirect reciprocity (as also found by ref. 31), most notably when B is highly essential. At the same time, partner choice is harmed by the interaction with long-memory indirect reciprocity. When B is highly essential, selection depresses both short-memory and long-memory partner choice below the level expected from nearly neutral dynamics. Short-memory partner choice is also depressed when B is substitutable and as influential as A. These effects are less marked in small markets and do not exist in non-shuffled markets (see Figs. S26–S30 of the Supplementary Information).

We are now faced with the paradox that in most cases selection only favours indirect reciprocity if we allow the evolution of partner choice, but then, in the presence of indirect reciprocity, selection acts against partner choice. Thanks to partner choice, indirect reciprocators avoid partnering with individuals that produce little B: they tend to partner with good cooperators, and so they become good cooperators and attractive partners. However, in my simulations partner choice consists of trying to partner with individuals that produce more B than one's current partner. If there are individuals that produce suboptimally large amounts of B, partner choice forces one to try to partner with them. An indirect reciprocator i that partners with such an individual mimics it and produces a suboptimally high amount of B. By doing so, i becomes a more attractive partner for partner chooser k and, if i and k become partners and k is an indirect reciprocator, k will mimic the suboptimally high qBi. As a result, some populations produce on average more B than is socially optimal (Fig. 7). This is most obvious when B is substitutable and as influential as A, in which case producing too much B has little effect on fitness (Fig. 7). In contrast, when A and B are essential, producing very large amounts of B and little A would have a large effect on fitness, and in these cases selection favours indirect reciprocity and disfavours fine-grained partner choice.
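The ratchet described above can be reproduced in a deliberately stripped-down toy, which is not the simulation model used in this study. It assumes no selection, no costs and no mortality. One individual unconditionally produces a high amount of B. Every other individual compares one random candidate per round with its current partner, switches if the candidate produces more B, and then copies its partner's production. Partnerships are treated as one-directional for simplicity, and all names and parameter values are hypothetical.

```python
import random


def simulate_ratchet(n=50, rounds=300, baseline=0.5, outlier=1.0, seed=0):
    """Return the mean B production per round in a toy choose-then-mimic population."""
    rng = random.Random(seed)
    q = [baseline] * n
    q[0] = outlier  # individual 0 is overly generous and never adjusts
    # partner[i] is the individual that i currently observes and mimics.
    partner = [rng.choice([j for j in range(n) if j != i]) for i in range(n)]
    history = [sum(q) / n]
    for _ in range(rounds):
        for i in range(1, n):
            candidate = rng.choice([j for j in range(n) if j != i])
            # Partner choice: switch only to a strictly more generous candidate.
            if q[candidate] > q[partner[i]]:
                partner[i] = candidate
            # Indirect reciprocity: copy what the chosen partner produces.
            q[i] = q[partner[i]]
        history.append(sum(q) / n)
    return history


if __name__ == "__main__":
    means = simulate_ratchet()
    print(means[0], means[-1])
```

In this toy, an individual that has copied the outlier's level never switches away from it, so the high production level spreads from partner to partner until it dominates the population. In the full simulations, selection on fitness opposes this spread when producing too much B is costly, which is the case when A and B are essential.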

Conclusions

Many organisms can choose partners and reciprocate54. The results above help us to predict when to expect each of these mechanisms. Neither partner choice nor reciprocity should evolve when shared goods are substitutable and contribute little to fitness. Partner choice should evolve more easily when organisms share essential or very influential goods, or when markets of potential partners are large. Direct reciprocity should evolve more easily in intermediate cases. Indirect reciprocity should evolve more easily when individuals can choose partners. We should expect non-reciprocators to cooperate unconditionally if there is some risk of partnering with reciprocators.

Humans systematically choose partners that are not kin43 and have several traits, particularly language, that help them to do so in large markets. We not only watch how others behave but use second-hand information about potential partners55. Partner choice could have allowed people to live in environments in which some goods are essential and obtainable only from others. Examples of these goods include clothing, fire, hunting tools, housing, and knowledge. Although a human can get any one of these goods without help, it cannot get them all. The only way to do so is by distributing tasks and trading with chosen partners.

In my simulations individuals are born with fixed sensitivities and fixed responses to the cooperativeness of others. It is likely that humans differ in their ability and propensity to use partner choice and reciprocity. However, in humans and probably other species, both perception (how much attention we pay to the behaviour of others) and responses (whether we switch partners, reciprocate, do nothing, and so on) are flexible. The results of this study predict that people should behave differently towards partners and potential partners depending on how essential or influential the goods at stake are.