A Regret-Based Algorithm for Computing All Pure Nash Equilibria for Noncooperative Games in Normal Form ()
1. Introduction
In a Nash equilibrium (NE) for an n-person game, every player has a strategy that maximizes his payoff for the other n− 1 players’ strategies. In other words, no player can unilaterally change his strategy in an NE (i.e., with no change of strategy by the other players) and improve his payoff. This stability is the reason an NE is termed an equilibrium. The NE thus models selfish behavior in which each player wants to maximize his payoff for any given set of strategies for the other players. Nash’s fundamental result for a game with a finite number of actions for each player was that every n-person noncooperative game has an equilibrium in mixed strategies, though perhaps not in pure strategies. It should be recalled that a mixed strategy for a player is an assignment of probability to each of the player’s possible actions, with a pure strategy being a special case where one action has a probability of 1 and the rest have probabilities of zero. The joint mixed strategies for the players then determine expected payoffs for each player.
The NE has found application in fields as diverse as economics (Myerson, 1999), computing (Chen, 2015), evolutionary biology (Kastampolidou & Andronikos, 2020), medicine (Zhang, Shan, Gao, & Jia, 2019), psychology (DeDreu, Giacomantonio, Giffin, & Vecchiato, 2019), and artificial intelligence (Fragkos, Tsiropoulou, & Papavassiliou, 2020). The efficient computation of an NE has thus been considerably studied. Most work, however, has focused on the computational complexity for finding mixed NEs, whose existence is guaranteed by Nash’s existence theorem (Nash, 1951). Despite this fact, it remains unknown whether a mixed NE can be computed in polynomial time. However, Papadimitriou (1994) has defined the complexity class PPAD to include this problem, and Daskalakis, Christos, and Papadimitriou (2009) have shown that the computation of a mixed NE is PPAD-complete.
In addition to the computational difficulties associated with mixed strategies, their interpretation is controversial. For example, both von Neumann & Morgenstern (1944) and later Nash (1953) considered a randomizing process an essential part of a mixed strategy, a requirement that Aumann (1985) and Rubinstein (1991) considered problematic since the reasons and methods for the players randomizing their decisions were not specified. Aumann and Brandenburger (1995) later viewed a Nash equilibrium as an equilibrium in beliefs, rather than actions. More recently, Nahhas and Corley (2018) interpreted mixed strategies as the allocation of resources.
However, the principal advantage of mixed strategies seems to be theoretical. They provide NEs when none exist in pure strategies and allow for the development of a rich mathematical theory. Nonetheless, practitioners are frequently ambivalent towards mixed strategies due to the lack of a consistent interpretation and to the current difficulty (if not impossibility) of computing them except for relatively small games.
On the other hand, pure NEs are not guaranteed to exist. Though conceptually simpler than mixed NEs, the associated computations appear even more difficult (Gottlob, Greco, & Scarcello, 2005). In general, determining whether a finite game has a pure Nash is known to be tractable in normal form but NP-complete in other representations such as graphical form (Gottlob, Greco, & Scarcello, 2005). Recent algorithms for determining whether an n-person game in normal form has a pure NE and, if so, for determining all NEs, have been developed in (Buttler & Akchurina, 2013), (Barrios, Luna, & Balcazar, 2016), and (Zaman, Elsayed, Ray, & Sarkerr, 2018), for example. In contrast to these previous approaches, the algorithm presented here uses the concept of regret, which is essentially an opportunity cost, and a regret matrix.
The use of a regret matrix in single-agent decision making is surveyed in (Yager, 2004), (Taha, 2017) , and (Mishra & Tsionas, 2020). For games, regret has been used in (Deligkas, Fearnley, Savani, & Spirakis, 2017), (Zhang, Chen, & Chang, 2020), (Farina, Kroer, & Sandholm, 2019), and the references therein, but apparently not as applied here. Utilizing regret, we present here an efficient algorithm for computing a pure NE for a normal form game. Normal form is the most fundamental representation of a game (Daskalakis & Leyton-Brown, 2009) since all other representations of finite games, such as extensive form, can be encoded in it.
Consider now the game
in normal form, where
is the set of players,
is the finite set of
actions (i.e., pure strategies) for player i, and
is the utility of player i for an action profile
. The NE is defined as follows, where an incomplete strategy profile
denotes a member of
. With this notation and terminology, an NE for
is now defined.
Definition 1. The action profile
is an NE of
if and only if
for all
. (1)
In Section 2 we define the regret matrix (RM) corresponding to the payoff matrix (PM) for
and prove that a joint action
is a NE if and only if
is the corresponding entry of the RM. In Section 3 the approach of Section 2 is formalized as an algorithm to obtain all NEs for
if any exist. In Section 4 a simple computational example is presented for n = 3.
2. The Regret Matrix
The regret function
is defined as a transformation of player i’s payoffs to losses for a given action profile
.
Definition 2. For the game
, the regret incurred by any player i for
is
for all
. (2)
The regret incurred by player i for an action profile
is thus the difference between the best payoff that player i can obtain for fixed
and the payoff that player i will obtain for
. The following result is needed for the algorithm.
Result 1. The pure strategy profile
is a NE for the game
if and only if the
for all
.
Proof. The strategy profile
is an NE for
if and only if
satisfies (1) of Definition 1. But (1) holds if and only if
in (2) for all
.
We next define the regret matrix (RM) of
.
Definition 3. The regret (RM) of
is the matrix obtained from its PM by replacing
with
for all
.
The associated game
with payoff matrix RM is not equivalent to
in the sense of Myerson (1991) but has the same NEs. The regret matrix provides an efficient approach for determining if a pure NE exists and, if so, obtaining them.
3. The Algorithm
Consider a game
as above, and denote the jth strategy of player i by
. The following pseudocode describes the algorithm PURECOMP for obtaining the set of pure NEs for
from its RM.
While
do
While
do
From (2), compute
for all
. (3)
for all
End While
End While
.
The set of pure NEs of
is P, which may be empty. From (3) the computational complexity of PURECOMP is
, where
is the size of the input to the game. It is the number of individual utilities
of the n players for all possible action profiles
. In particular, for
, then
. PURECOMP is thus linear in the input size. It should be noted, however, that the input size itself is exponential in the number of players. On the other hand, for example, Daskalakis and Leyton-Brown (2009) propose an enumeration using Definition 1 with complexity
, which is at least
because
for all
. But
for
since
, and so PURECOM is considerably faster.
4. Example
For n = 3 we use the notation of Section 3 with
and consider the game
with the PM of Table 1 below. The RM for
is then obtained from the PM via PURECOMP and shown in Table 2. The unique pure NE for
is
with associated payoffs
.
5. Conclusion
An algorithm PURECOM was presented here to determine whether an n-person game in normal form has a pure NE and, if so, to obtain all NEs. In PURECOM the payoff matrix is transformed into a regret matrix. The RM has the property that an action profile of the PM is a pure NE if and only if
is the corresponding element of the RM. The computational complexity of the algorithm is
in the number of elements N of the PM, which is
times faster than a complete enumeration.
PURECOM thus makes two contributions. First, it determines a pure NE by implementing the intuition that an equilibrium should not cause regret for any player. This approach undoubtedly has pedagogical applications. Second, it gives a computational procedure for determining all pure NEs, if any exist, for games with normal form. PURECOM is unlikely to be improved except for payoff matrices with special structure such as certain symmetries.