Rationalizing Irrational Beliefs

In this paper we propose a “behavioral equilibrium” definition for a class of dynamic games of perfect information. We document various experimental studies of the Centipede Game in the literature that demonstrate that players rarely follow the subgame perfect equilibrium strategies. Although some theoretical modifications have been proposed to explain the outcomes of the experiments, we offer another: players can choose whether or not to believe that their opponents use subgame perfect equilibrium strategies. We define a “behavioral equilibrium” for this game; using this equilibrium concept, we can reproduce the outcomes of those experiments.


Introduction
In dynamic games of perfect information, the concept of subgame perfect equilibrium is most commonly used in the prediction of players' behavior. In a generic game with finitely many moves, a unique subgame perfect equilibrium always exists. While the equilibrium concept is easily understood and the equilibrium characterization is usually straightforward, challenges to its ability to predict players' behavior have grown in the literature, on both the theoretical and the experimental fronts.
Rosenthal [1] constructed a game (later dubbed the "Centipede Game") that consisted of a sequence of one hundred moves. In this game, the players move in alternating periods, each choosing either to pass (to the next period) or to end the game right away. Passing the game to the next period yields a larger total pile of money, but it strictly reduces the payoff a player receives if the opponent ends the game in her subsequent turn. The unique subgame perfect equilibrium (SPE) is that the first player ends the game at the first node and each player gets a small sum. Rosenthal argued that it is highly unlikely that, in practice, players will actually choose the SPE strategies when they play that game.
Various centipede game experiments have been conducted to test the predictive power of the concept of SPE. McKelvey and Palfrey [2] reported that only 15% of the players end the game at the first node (the outcome predicted by SPE) in a high-payoff version, and that number falls to as little as 0.7% in other versions of the centipede game. In a much simplified two-move extensive form game, Goeree and Holt [3] documented that players usually did not trust their opponents to be rational. In contrast, Palacios-Huerta and Volij [4] conduct experiments involving expert chess players, who are known for their high degree of rationality and ability to find optimal strategies using backward induction reasoning. The outcome of their experiments is very close to the SPE prediction. Overall, these experimental studies suggest that common knowledge of rationality of all players is the key requirement of SPE, and so it is not surprising that players do not follow SPE strategies if they do not believe their opponents are rational.
In an attempt to reconcile the differences between the theory and the experimental outcomes, various modifications to the assumptions of the games used in the experiments have been proposed. McKelvey and Palfrey [2], for example, propose that a player believes that the opponent is an altruist with some positive probability. They find that even a very small such probability can induce players to adopt mixed strategies in the early rounds of the game, mimicking the observed behaviors in their experiment. A few years later, McKelvey and Palfrey [6] use a quantal choice model to re-examine the same experimental results. They show that if one assumes that the probability of implementing a particular strategy is increasing in the equilibrium payoff of the strategy, then the observed behavior more or less coincides with the predicted behavior. Zauner [7] proposes an alternative explanation of McKelvey and Palfrey's experimental results by assuming a random perturbation of each player's payoffs. He considers different types of perturbations and selects two best-fit models.
In the theoretical literature, game theorists have proposed alternatives to some key assumptions that lead to SPE, including the common knowledge of rationality and backward induction. Aumann [8] formalizes the idea of higher order mutual knowledge. Caplan [10] treats irrationality as a standard good, and players need to pay to get closer to some (irrational) "bliss belief." Basu [11] argues that each history of moves reveals certain characteristics of players to one another, and therefore the outcomes of a game depend on these revealed characteristics (instead of depending on rationality alone). Halpern and Pass [12] propose "iterated regret minimization" as a solution concept for strategic games. They apply it to the centipede games and find that, with linear payoffs, players will cooperate for a number of rounds. With exponential payoffs, they will cooperate all the way up to the end of the game. In an indirect evolutionary model in which a centipede game is played in each stage, Gamba [5] shows that altruism can evolve even if preferences are unobservable. Meanwhile, Rand and Nowak [13] model the stochastic evolution of strategies in the centipede game and find that the players' cooperative behavior may in fact be the favored outcome of natural selection.
Advances in psychology also help explain why players in experiments may behave differently than SPE predicts. Epstein et al. [14] conduct studies that test the cognitive-experiential self-theory. They confirm that two conceptual systems, an experiential system and a rational system, operate by their own rules of inference inside the same individual. To some extent, an individual may switch from one system to another. Tirole [15] builds on similar psychological findings and proposes a model of rational irrationality that can explain why people rehearse good news and selectively forget bad news, a universal behavior.
In this paper, we argue along the lines of the above psychological findings and propose another theoretical explanation of the failure of the SPE as a predictor of behavior.
We emphasize the observation that even if all players fully understand the concept of subgame perfect equilibrium and even if no players believe that other players are altruists, they still do not follow the SPE strategies when playing the centipede game. We assume that a player can choose to play SPE, i.e. be "rational", or else may choose to be "behavioral". If being "behavioral" yields a better expected outcome than being "rational", then a player would choose to be "behavioral" (or, in terms of standard game theory terminology, "irrational"). Our intuition is as follows. SPE strategies are optimal for a player only when other players follow them. If players do not believe that other players will follow SPE strategies, then their own SPE strategies are not, in general, optimal. In the model, we specify an alternative belief for each player regarding the behavior of other players. Each player then selects his belief (either the SPE belief or the alternative one) at the beginning of the game and then optimizes given the selected belief. A "behavioral equilibrium" is formed if each player is better off in the actual outcomes by selecting the alternative belief. These outcomes of the game are determined by the strategies the players actually use in the game.
The basic idea behind the "behavioral equilibrium" concept is that players can choose to believe that their counterparts are either fully rational (such that SPE strategies are the best response) or somewhat irrational (so that SPE strategies are no longer the best response). Given any belief, the players still optimize by choosing the best strategy. This is the same as in a subgame perfect equilibrium. However, the difference between a behavioral equilibrium and a subgame perfect equilibrium is that the alternative beliefs in a behavioral equilibrium do not usually coincide with the players' actual strategies. If the two are the same, a subgame perfect equilibrium is formed. Therefore, these alternative beliefs are somewhat irrational. Still, these irrational beliefs generate better payoffs than the SPE beliefs. Thus, players will choose these irrational beliefs rationally.
The origin of irrational beliefs is an interesting and open question. Epstein et al. [14] find that there are an experiential and a rational system in each individual and that an individual can switch from one system to another. We conjecture that irrational beliefs may come from the experiential system, while rational beliefs may come from the rational system. As we observed in the above-mentioned experiments, players are better off using the irrational beliefs than the rational beliefs. These irrational beliefs may not translate into the players' "maximum" payoffs. But the payoffs are usually very good, and are much better than the payoffs implied by SPE strategies. Therefore, players may reinforce these irrational beliefs and move away from their rational beliefs. In some sense, these irrational beliefs are the "rules of thumb" for the players.
One real-life example related to the centipede games that we examine in this paper is the rotating savings and credit association (Rosca), commonly found in many developing countries. (See Besley et al. [16] and Anderson and Baland [17], for example.) In these associations, a predetermined group of individuals get together and contribute a predetermined amount into a "pool", which is then given to one member (the winner). These gatherings repeat themselves, with previous winners excluded from receiving the "pool" while still being obliged to contribute. The gathering may stop after each member has received the "pool", but often the same group continues the Rosca with a new "pool". These Roscas run the risk of earlier winners defaulting on later contributions, a strategy resembling "stopping early" in the centipede game. Still, defaults are very infrequent. Our model of "irrational beliefs" or "rules of thumb" may shed some light on these phenomena.
The rest of this paper is organized as follows. In Section 2, we analyze a few centipede games using the concept of "behavioral equilibria". In Section 3, we analyze some of the experiments in centipede games in the literature. In Section 4, we conclude.

Centipede Games and Behavioral Equilibria
We begin with a general description of the centipede games.
There are two players, 1 and 2, playing the centipede game of n moves in Figure 1.
To simplify notation, we assume that $n$ is even. Each player $i$ selects a belief from the set $\{SPE_i, B_i\}$. Here, $SPE_i$ represents player $i$'s subgame perfect equilibrium belief about his opponent $j$'s behavior; i.e., player $j$ will play T whenever it is his move. On the other hand, $B_i$ denotes player $i$'s alternative belief. Let $B_1 = (p_2, p_4, \ldots, p_n)$ be player 1's belief, where $p_{2k}$ is the probability that player 2 will play T at node $2k$ conditional on node $2k$ being reached. For the SPE belief, $SPE_1 = (1, 1, \ldots, 1)$. Similarly, we define $B_2$ as player 2's belief about player 1's behavior at the nodes where player 1 moves. The subgame perfect equilibrium belief $SPE_i$ is the only belief that satisfies the properties of common knowledge of rationality and backward induction in the centipede game. Therefore, any other belief $B_i$ would violate at least one of these properties. This alternative belief may be derived from a player's past game-play experience against other players and/or from "rules of thumb" guesses that the player has formed.
Since players in general do not always behave rationally, these "rules of thumb" guesses do not always coincide with the other players' SPE strategies.
In summary, the game we are examining is as follows. Both players simultaneously select their beliefs before the start of the game. Once a belief is selected, it remains the same throughout the game. Given these beliefs regarding an opponent's behavior, players play the above centipede game. Each player's goal is to maximize his expected payoff given his chosen belief.
To simplify our analysis, we assume that the beliefs are not updated during the game.
(Even if we allow for belief updating, we will not get back the SPE beliefs as long as the initial belief is somewhat incorrect.) To analyze the modified centipede game, first note the following. If $B_1$ is such that playing T at node 1 is the optimal action for player 1, then the game is over at node 1 no matter what belief player 1 has selected. The more interesting case is when playing T at node 1 is not the optimal action.
If player 1 chooses belief $SPE_1$ and thus plays T at the first node, the game ends at the first node, with payoffs $(a_1, b_1)$. If player 1 chooses belief $B_1$, player 1 maximizes his expected payoff by choosing the node $i$ at which he plans to play T; we refer to this maximization problem as (1). Let $n_1^*$ denote an $i$ that maximizes (1). (Note that there could be many such $i$'s.) Consider player 2 at node 2. The optimal action with the belief $SPE_2$ is to end the game right away. In this case, the payoffs are $(a_2, b_2)$.
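The choice of a planned stopping node under a given belief can be illustrated numerically. The following is a minimal sketch, not the paper's own computation: the payoff numbers, the belief values, and the function names are illustrative assumptions. It treats "never play T" as stopping at the terminal node $n + 1$, and it takes player 1's belief to be the conditional probabilities that player 2 plays T at each even node.

```python
def expected_payoff_player1(i, a, p):
    """Expected payoff to player 1 from planning to play T at odd node i.

    a: dict mapping node -> player 1's payoff if the game ends at that node
       (node n + 1 stands for "nobody ever plays T").
    p: dict mapping even node -> believed probability that player 2 plays T
       there, conditional on the node being reached.
    """
    reach = 1.0   # believed probability that the game reaches the current node
    total = 0.0
    for k in range(2, i, 2):          # player 2's nodes that come before node i
        total += reach * p[k] * a[k]  # player 2 ends the game at node k
        reach *= 1.0 - p[k]           # player 2 passes at node k
    return total + reach * a[i]       # player 1 ends the game at node i


def best_plan(a, p):
    """An odd node maximizing player 1's expected payoff (first one on ties)."""
    odd_nodes = [i for i in sorted(a) if i % 2 == 1]
    return max(odd_nodes, key=lambda i: expected_payoff_player1(i, a, p))


# Illustrative four-move game: player 1's payoff at nodes 1..4 and at the end (5).
a = {1: 0.4, 2: 0.2, 3: 1.6, 4: 0.8, 5: 6.4}

spe_belief = {2: 1.0, 4: 1.0}   # player 2 always plays T
alt_belief = {2: 0.1, 4: 0.5}   # hypothetical alternative belief

plan_under_spe = best_plan(a, spe_belief)
plan_under_alt = best_plan(a, alt_belief)
```

Under the SPE belief the best plan in this sketch is to stop at node 1, while under the hypothetical alternative belief player 1 prefers to keep passing. This is the sense in which the selected belief, rather than the payoffs alone, drives the planned stopping node.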

If belief $B_2$ is chosen, player 2 maximizes his expected payoff by choosing the node $j$ at which he plans to play T; we refer to this maximization problem as (2). Let $n_2^*$ denote a $j$ that maximizes (2). (Again, there could be many such $j$'s.) The proposed pure strategy for player 1 is to select $B_1$ and plan to play T at node $n_1^*$. Player 2's proposed pure strategy is analogous, with $B_2$ and $n_2^*$ in place of $B_1$ and $n_1^*$. These beliefs and strategies form a pure-strategy "behavioral equilibrium" if each player is better off in the actual outcome than under the SPE belief. In this behavioral equilibrium, players are better off selecting these non-SPE beliefs than selecting the SPE beliefs. These beliefs are reinforced if the players play these games again later.
Now consider mixed-strategy "behavioral equilibria". If more than one $j$ maximizes (2), or more than one $i$ maximizes (1), the players could use mixed strategies. Let $s_1 = (q_{i_1^*}, q_{i_2^*}, \ldots, q_{i_k^*})$ denote any of player 1's optimal mixed strategies, where $i_1^*, i_2^*, \ldots, i_k^*$ are all of the numbers that maximize (1). Similarly, let $s_2 = (q_{j_1^*}, q_{j_2^*}, \ldots, q_{j_k^*})$ denote any of player 2's optimal mixed strategies, where $j_1^*, j_2^*, \ldots, j_k^*$ are all of the numbers that maximize (2). Then the outcomes of the game are determined by $s_1$ and $s_2$.
The beliefs and strategies $(B_i, s_i^*)$, $i = 1, 2$, form a mixed-strategy "behavioral equilibrium" if each player $i$'s actual payoff is higher by selecting $B_i$ than by selecting $SPE_i$. Note that in the above definition, a player's belief may not be correct; that is, $B_i$ is not necessarily the same as the opponent's actual strategy. However, the optimal responses to these "incorrect" beliefs generate higher payoffs to each player than the subgame perfect equilibrium payoffs. Therefore, these "incorrect" beliefs are reinforced.
Note also that the subgame perfect equilibrium strategy profile $\sigma$, together with the corresponding correct beliefs $B_i = SPE_i$, always forms a behavioral equilibrium. In fact, according to the definition, there could be many behavioral equilibria in a game. However, in games with dominant strategies, such as the Prisoner's Dilemma, the profile in which players use their dominant strategies is the unique behavioral equilibrium, since these strategies are optimal independent of players' beliefs.
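The remark about dominant strategies can be checked with a small numerical sketch. The payoff numbers below are a hypothetical Prisoner's Dilemma, not taken from the paper, and the function name is illustrative. The point is only that the best response does not change with the belief about the opponent.

```python
# Hypothetical Prisoner's Dilemma payoffs for one player (assumed values).
# Keys are (own action, opponent action); "C" = cooperate, "D" = defect.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


def best_response(p_cooperate):
    """Best action given the belief that the opponent cooperates with
    probability p_cooperate."""
    expected = {
        own: p_cooperate * PAYOFF[(own, "C")]
        + (1 - p_cooperate) * PAYOFF[(own, "D")]
        for own in ("C", "D")
    }
    return max(expected, key=expected.get)


# Sweep over a grid of beliefs, from "opponent never cooperates" to "always".
beliefs = [k / 10 for k in range(11)]
responses = {p: best_response(p) for p in beliefs}
```

With these assumed payoffs, defecting is the best response at every belief on the grid, so selecting a different belief cannot change the actual outcome.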
Below, we focus on centipede games to illustrate our equilibrium concept.
Example 1 Consider the eight-move centipede game in Figure 2.
Suppose that the alternative beliefs $B_1$ and $B_2$ are given by […]. Then it is straightforward to obtain $n_1^* = 7$ and $n_2^* = 6$. That is, player 1 playing T at node 7 is optimal given $B_1$, while player 2 playing T at node 6 is optimal given $B_2$. The minimum of $n_1^*$ and $n_2^*$, denoted $n^*$, is 6; that is, the game ends at node 6, with payoffs $(2, 5)$.
For the mixed-strategy case of the six-move game, it is easy to see that […]. Given these beliefs, denote player 1's expected payoff of planning to play T at node $i$ by […]. We have […]. To construct a behavioral equilibrium, player 1's mixed strategy $(0, q_3, 1 - q_3)$ must satisfy the following two conditions regarding each player's actual payoffs. First, for player 1, $3 q_3 + 0 \cdot (1 - q_3)$ must be at least 1, which is player 1's payoff from following the SPE strategy and playing T at node 1. This gives us $q_3 \geq 1/3$. Second, for player 2, $0 \cdot q_3 + 4 (1 - q_3)$ must be at least 2, which is player 2's payoff from following the SPE strategy and playing T at node 2. This gives us $q_3 \leq 1/2$. Therefore, any $q_3 \in [1/3, 1/2]$ would satisfy these two conditions.
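The two conditions on $q_3$ can be verified directly. The sketch below assumes, as the expressions in the text suggest, that the outcome is (3, 0) when player 1 plays T at node 3 and (0, 4) when player 1 waits for node 5 and player 2 ends the game at node 4. The function names are illustrative, and exact fractions are used to avoid rounding at the interval endpoints.

```python
from fractions import Fraction


def actual_payoffs(q3):
    """Actual expected payoffs when player 1 plays T at node 3 with
    probability q3 and otherwise plans to play T at node 5."""
    u1 = 3 * q3 + 0 * (1 - q3)   # player 1: 3 at node 3, 0 if player 2 stops at node 4
    u2 = 0 * q3 + 4 * (1 - q3)   # player 2: 0 at node 3, 4 if he stops at node 4
    return u1, u2


def satisfies_conditions(q3):
    """True if both players do at least as well as under SPE play
    (1 for player 1 at node 1, 2 for player 2 at node 2)."""
    u1, u2 = actual_payoffs(q3)
    return u1 >= 1 and u2 >= 2


# Check a grid of candidate mixing probabilities 0, 1/12, ..., 1.
grid = [Fraction(k, 12) for k in range(13)]
admissible = [q for q in grid if satisfies_conditions(q)]
```

On this grid the admissible values are exactly those between 1/3 and 1/2 inclusive, in line with the interval derived above.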

Analyzing Previous Centipede Game Experiments
McKelvey and Palfrey [2] report the results of seven different centipede game experiments. Sessions 1 to 3 are four-move centipede games with the following payoffs: $(a_1, b_1) = (0.4, 0.1)$, $(a_2, b_2) = (0.2, 0.8)$, $(a_3, b_3) = (1.6, 0.4)$, and $(a_4, b_4) = (0.8, 3.2)$. Table IIA in McKelvey and Palfrey [2] reports the proportion of observations at each terminal node. In that table, $f_i$ denotes the proportion of games that end at node $i$. From these $f_i$'s, we can calculate a player's strategy as follows. For the four-move game, let $q_1$ and $q_3$ be the proportions of player 1 who plan to choose TAKE at node 1 and at node 3, respectively. (Therefore, the proportion of player 1 choosing PASS at node 3 is equal to $1 - q_1 - q_3$.) Similarly, let $q_2$ and $q_4$ be the proportions of player 2 who plan to choose TAKE at node 2 and at node 4, respectively; thus the proportion of player 2 choosing PASS at node 4 is equal to $1 - q_2 - q_4$. Then $q_1 = f_1$, $(1 - q_1) q_2 = f_2$, $(1 - q_2) q_3 = f_3$, and $(1 - q_1 - q_3) q_4 = f_4$. We define $q_i$ similarly in the six-move game. Then we have $(1 - q_1 - q_3) q_4 = f_4$, $(1 - q_2 - q_4) q_5 = f_5$, and $(1 - q_1 - q_3 - q_5) q_6 = f_6$. The results are reported in Table 1.
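The step from terminal-node frequencies to planned strategies can be sketched as a short recursion. The sketch assumes that the two player populations are matched independently, so that a game ends at node $i$ exactly when the mover plans TAKE at $i$ and the opponent did not plan TAKE at any earlier node. The function name and the frequencies are hypothetical; they are not the values in Table IIA.

```python
def plan_proportions(f):
    """Recover q_1, ..., q_n from the proportions f_1, ..., f_n of games
    that end at each TAKE node.

    Node i is reached with probability 1 minus the share of opponents who
    planned TAKE at nodes i-1, i-3, ..., so q_i = f_i / (that probability).
    """
    q = []
    for i, f_i in enumerate(f, start=1):
        # Opponent's planned TAKE shares at the nodes before node i.
        opponent_stopped = sum(q[k - 1] for k in range(i - 1, 0, -2))
        q.append(f_i / (1 - opponent_stopped))
    return q


# Hypothetical frequencies for a four-move game (illustrative only).
f = [0.10, 0.18, 0.24, 0.24]
q1, q2, q3, q4 = plan_proportions(f)
```

With these made-up frequencies the recursion returns approximately (0.1, 0.2, 0.3, 0.4), which can be confirmed by substituting back: for instance, $(1 - q_1 - q_3) q_4 = 0.6 \times 0.4 = 0.24$. The same function applies unchanged to a six-move game when given six frequencies.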
We cannot infer a player's belief in playing these games from the data since many different beliefs could lead to the same observed strategy. Therefore, in each session, we assume that a player's belief corresponds exactly to his rival's revealed strategy and calculate the player's optimal action according to that belief. In the calculations, we assign the players a utility function with a constant degree of absolute risk aversion of 0.5 so that the players are modestly risk averse. That is, $u_i(x) = -e^{-0.5x}$ (up to a positive affine transformation) for player $i$, where $x$ is the amount of money earned in one game. The results are reported in Table 1 as well. The percentage after each optimal action is the percentage of players actually choosing the implied optimal action in that session. As we can see from the table, the majority of the players chose the implied optimal action in all but session 3.
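A utility function with constant absolute risk aversion of 0.5 can be written down and applied to a simple gamble as follows. This is a minimal sketch: the normalization $u(x) = -e^{-rx}$ is an assumption (any positive affine transformation represents the same preferences), and the lottery and function names are hypothetical.

```python
import math

RISK_AVERSION = 0.5  # coefficient of absolute risk aversion used in the text


def cara_utility(x, r=RISK_AVERSION):
    """CARA utility, normalized here as u(x) = -exp(-r * x) (assumed form)."""
    return -math.exp(-r * x)


def expected_utility(lottery, r=RISK_AVERSION):
    """Expected utility of a list of (probability, money) pairs."""
    return sum(prob * cara_utility(x, r) for prob, x in lottery)


def certainty_equivalent(lottery, r=RISK_AVERSION):
    """Sure amount of money that gives the same utility as the lottery."""
    return -math.log(-expected_utility(lottery, r)) / r


# Hypothetical gamble: 0 or 2 units of money with equal probability.
gamble = [(0.5, 0.0), (0.5, 2.0)]
ce = certainty_equivalent(gamble)
```

The certainty equivalent of this gamble is below its expected value of 1, which is what modest risk aversion means here. A risk-averse player therefore values a risky continuation in the centipede game somewhat less than its expected monetary payoff.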
We interpret these findings cautiously, since our assumption that a player's belief corresponds exactly to his rival's revealed strategy is only one possible specification of beliefs consistent with the behavioral equilibrium. Nevertheless, and in contrast with the predictions of SPE, the behavior of the majority of the players can be explained by our theory.

Conclusion
In this paper, we propose a concept of behavioral equilibrium to explain the observed behavior of players in centipede games. Experimental evidence suggests that players' behavior is inconsistent with game theoretic predictions. We allow players to abandon the "logic" of subgame perfect equilibrium and to choose an alternative belief about opponents' expected behavior, formed from previous experience in similar situations. We show that, under certain conditions, players are better off abandoning the "logic" of subgame perfect equilibrium and choosing the alternative belief instead. We argue that this reinforces the players' subjective belief that subgame perfect equilibrium may not work well in these games and, by extension, that the alternative belief becomes the belief of choice. We support our theory by re-examining the results of centipede game experiments conducted by other researchers.

Example 2 Consider the six-move centipede game in Figure […]. In this game, we can construct pure-strategy behavioral equilibria similarly to the last example, such that the game ends at node 3. This constitutes a behavioral equilibrium as the final outcome is (3,0), which is weakly better for both players than the SPE outcome of (1,0). Now consider a mixed-strategy behavioral equilibrium. Suppose that player 1 mixes between playing T at node 3 and playing T at node 5; for this, we should set […].

Session 4 is a high-payoff four-move centipede game where the payoffs are quadrupled. Sessions 5 to 7 are six-move centipede games. In the four-move game, the payoffs $(a, b) = (6.4, 1.6)$ are obtained if player 2 chooses to pass at move 4.
The proposed pure strategy for player 2 is to select $B_2$, play P at node 2 (if player 1 played P at node 1), and plan to play T at node $n_2^*$.

Table 1. Players' strategies and optimal actions.