A Forward-Looking Nash Game and Its Application to Achieving Pareto-Efficient Optimization

Recognizing the fact that a player’s cognition plays a defining role in the resulting equilibrium of a game of competition, this paper provides the foundation for a Nash game with forward-looking players by presenting a formal definition of the Nash game with consideration of the players’ belief. We use a simple two-firm model to demonstrate its fundamental difference from the standard Nash and Stackelberg games. Then we show that the players’ belief functions can be regarded as the optimization parameters for directing the game towards a much more desirable equilibrium.


Introduction
Game theory is well recognized as a branch of applied mathematics suited to analyzing selfish competition, which arises in numerous real-life applications ranging from economics to social sciences, and even to engineering problems. Many optimization problems can be viewed as competition problems.
If there are shared resources, then inherently there will be competition in the allocation of such resources. An individual who participates in the resource allocation always wishes to maximize its payoff. Nevertheless, the player's return depends not only on its own strategy but also on competitors' responses. As a result, ideally, a player should choose or optimize its strategy based on not only its immediate return but also how others might respond to its strategy. A player's cognition (i.e., its belief about how the environment as a whole would react to any of its actions) is therefore pivotal in optimizing its strategy to maximize its payoff and in defining the equilibrium of the game.
In the literature, there are a number of game-theoretic models, and they differ in their assumptions on the players' cognition. Due to this fundamental difference, these models represent different competition environments and result in very different outcomes, which at the steady state are referred to as equilibria. When the competition process reaches an equilibrium, by definition, no player has an incentive to change its strategy further to deviate from the equilibrium, which is therefore usually considered a satisfactory outcome for all players.
The most popular model is the Nash game [1], in which a player believes that other players' strategies are fixed and will not change regardless of what it does. A Nash game is based on this assumption on the players' cognition. Note that the game is still well defined even though the player's belief is totally inaccurate.
On the other hand, in a Stackelberg game [2], there is a super player, commonly known as the leader, who knows all the information about its competitors, known as followers. In this case, the leader is considered to have perfect cognition and can therefore obtain the most rewarding strategy.
In most practical competition situations, the efficiency of the Nash equilibrium is often poor (as the two-firm example later in this paper will reveal). The main reason is that the assumption underlying the Nash game is oversimplified and fails to account for the interaction between the players. In contrast, the assumption for the Stackelberg model is too stringent, and in most practical scenarios it is too difficult for a player to qualify as a leader. Nevertheless, the Stackelberg equilibrium is highly beneficial to the leader. The limitation is that if there were two or more leaders, all of which possess perfect cognition and wish to be the biggest winner, this would lead to a tragedy [3].
In this paper, we recognize the importance of players' cognition in a game and focus on how the assumption on the players' cognition affects the equilibrium. Based on the analysis of the Nash and Stackelberg equilibria, we provide a new definition for the Nash game with consideration of the players' belief. As a useful byproduct, the new definition facilitates the interpretation of the players' beliefs as optimization parameters, which, if optimized properly, can direct the game towards a Pareto-efficient equilibrium.

The Two-Firm Model
For illustrative purposes, we use a simple two-firm model from economics as an example. In this model, there are two firms, Firm A and Firm B. They manufacture an identical product, and its unit cost is the same for both firms. In addition, the profit per unit product diminishes as the number of products available in the market increases. In particular, let $x_a$ denote the number of products manufactured by Firm A, and $x_b$ be that manufactured by Firm B. The unit cost is a constant common to both firms, and the price per product, which is set the same by both Firm A and Firm B, falls as the total output $x_a + x_b$ rises. The resulting profit functions for Firm A and Firm B are given in (1).
In various game-theoretic models, the objective would be for Firm A and Firm B to iteratively optimize their respective strategies, $x_a$ and $x_b$, to maximize their profits (1) in a competitive fashion. Before we examine different equilibria of this two-firm game, we find it useful to first present the general notation of a game.
For a game with $K > 1$ players, each player $k$ has its own strategy space, and the strategy space for all the players is their combination. Moreover, we use $x = (x_k, x_{-k})$ to denote a specific choice of strategy from all the players, where $x_k$ is the strategy adopted by player $k$, and $x_{-k}$ denotes the strategies adopted by all the players except player $k$. Likewise, the reward function for player $k$ is denoted by $f_k$.

Nash Competition
Given the competition of $K$ players, if at some point no one can gain any further by deviating from its present strategy, then the strategies of all the players are said to have reached an equilibrium [4]. Mathematically, for every player $k$ and every strategy $x_k$ available to that player, we have
$$f_k(x_k^*, x_{-k}^*) \ge f_k(x_k, x_{-k}^*).$$
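We now derive the Nash equilibrium for the two-firm example. According to (4), Firm A maximizes its profit in (1) over $x_a$ while treating $x_b$ as fixed, and Firm B does likewise over $x_b$. The resulting pair of best-response conditions is (7), and the game governed by (7) reaches the Nash equilibrium $(x_a^*, x_b^*)$ and the corresponding profits. The profit function that Firm A optimizes here, however, does not represent the actual profit function, because of the interaction from player B.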
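The free competition that leads to the Nash equilibrium can be sketched numerically. The example below is only an illustration under stated assumptions: the linear price `BETA - (x_a + x_b)`, the unit cost `COST`, and all function names are hypothetical choices, since the original symbols and equations are not reproduced here.

```python
# Assumed model (not taken from the text): price = BETA - (x_a + x_b), unit cost = COST.
BETA = 10.0
COST = 1.0
MARGIN = BETA - COST  # price intercept minus unit cost


def profit(x_own: float, x_other: float) -> float:
    """Profit of one firm: quantity times (price minus unit cost)."""
    return x_own * (MARGIN - x_own - x_other)


def best_response(x_other: float) -> float:
    """Quantity that maximizes profit when the rival's quantity is treated as fixed.

    Setting d(profit)/d(x_own) = MARGIN - 2*x_own - x_other to zero gives the formula.
    """
    return max(0.0, (MARGIN - x_other) / 2.0)


def nash_by_free_competition(x_a: float = 0.0, x_b: float = 0.0, rounds: int = 200):
    """Both firms repeatedly best-respond to each other's latest quantity."""
    for _ in range(rounds):
        # Simultaneous update: the right-hand side uses the old x_a and x_b.
        x_a, x_b = best_response(x_b), best_response(x_a)
    return x_a, x_b


x_a_star, x_b_star = nash_by_free_competition()
print(x_a_star, x_b_star)            # both approach MARGIN / 3 in this assumed model
print(profit(x_a_star, x_b_star))    # approaches MARGIN**2 / 9
```

No ordering between the firms is needed: each simply reacts to the other's latest output, which is the sense in which the Nash equilibrium is reached by free competition.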

Stackelberg Competition
A player can benefit more from the rest of the players in a game if this player has perfect cognition about how others would react to its strategy. This is studied formally by the Stackelberg equilibrium. In the general setting with $K$ players, the leader should know perfectly the response function of the other players and is therefore able to obtain the most effective strategy, namely the one that maximizes its reward once those responses are taken into account. The other players $k$ do not have any information about any of the other players' strategies, so they can only assume that other players' strategies are fixed and will act like a Nash player, see (4).
In a Stackelberg game, there is a very strict order in which the players play the game [1,5]. In particular, the leader first needs to announce its strategy and let the other players compete to reach a Nash equilibrium against this strategy, before it revises its strategy for another round of competition among the rest of the players. Achieving the Stackelberg equilibrium thus requires a two-level game.
The merit of the Stackelberg equilibrium is that the leader has an absolute advantage over the other players, but the drawback is that knowing the response function would be too difficult, if not impossible, to achieve in practice. A standard approach would require the leader to try exhaustively all possible strategies to identify the best one.
Recalling the two-firm example, if we let Firm A be the leader and Firm B be the follower, then according to (11), player B's cognition is that player A's strategy is optimal and fixed, and player B therefore aims to maximize its own profit for the given $x_a$. The solution, (13), is the exact response function of player B with respect to any action $x_a$. With (13), player A finds its best strategy analytically, which gives the strategies and profits at the Stackelberg equilibrium.
Note that in achieving the Nash equilibrium earlier, there is no specific order of obtaining $x_a^*$ and $x_b^*$ in solving the simultaneous Equations (7), and the equilibrium can be achieved by free competition. This is however not true for the Stackelberg equilibrium, where the leader's strategy, $x_a^*$, must be obtained first and the follower(s) then respond. As the number of players increases, the complexity of the Stackelberg game will increase considerably and the simplicity of the Nash game will prevail. In addition, a game with all leaders degenerates to a Nash game.
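The two-level structure of the Stackelberg game can be sketched as follows. This is a hedged illustration, not the paper's derivation: it assumes a linear price `BETA - (x_a + x_b)` and unit cost `COST` (hypothetical values), and it replaces the analytical solution with the exhaustive search over the leader's strategies mentioned above.

```python
# Assumed model (not taken from the text): price = BETA - (x_a + x_b), unit cost = COST.
BETA = 10.0
COST = 1.0
MARGIN = BETA - COST


def profit(x_own: float, x_other: float) -> float:
    """Profit of one firm: quantity times (price minus unit cost)."""
    return x_own * (MARGIN - x_own - x_other)


def follower_response(x_leader: float) -> float:
    """Follower (Firm B) acts as a Nash player: it takes the leader's quantity as fixed."""
    return max(0.0, (MARGIN - x_leader) / 2.0)


def stackelberg_by_search(steps: int = 1000):
    """Leader (Firm A) tries every candidate quantity and keeps the most profitable one,
    evaluating each candidate against the follower's exact response."""
    candidates = [i * MARGIN / steps for i in range(steps + 1)]
    x_leader = max(candidates, key=lambda x: profit(x, follower_response(x)))
    return x_leader, follower_response(x_leader)


x_leader, x_follower = stackelberg_by_search()
print(x_leader, x_follower)                # MARGIN / 2 and MARGIN / 4 in this assumed model
print(profit(x_leader, x_follower))        # leader's profit
print(profit(x_follower, x_leader))        # follower's profit
```

The leader's move has to be fixed before the follower's response is evaluated, so the order of play matters here in a way it does not for the Nash game.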
In the two-firm example, because Firm A knows precisely the strategy adopted by Firm B (13), it is able to achieve a higher profit than what is achieved by the Nash equilibrium. However, the key questions are:
• If (13) is not known by Firm A, or Firm A only knows partial information about (13), e.g., $\mathrm{d}x_b/\mathrm{d}x_a$, would it still help Firm A to obtain a better or even the Stackelberg strategy? If the answer is yes, this would mean that the level of cognition required of a leader could be significantly reduced.
• Also, what if two players have partial information about each other's strategies? What will happen?
This has motivated us to investigate the Nash game with simultaneous forward-looking players, in which each player optimizes its strategy based on its own belief.

Forward-Looking Competition
After the discussion of the Nash and Stackelberg equilibria above, it is clear that players' cognition is key to defining the resulting equilibrium of a game. At the same time, we understand the beauty of the Nash game, where players can compete freely (without following a specific order) to reach the equilibrium. Based on these two points, we present the Nash game with forward-looking players.
To facilitate our analysis, we find it useful to have the following definitions.
• Environmental function: In a competition process, the reward for player $k$ depends not only on its own strategy $x_k$ but also on the others' strategies $x_{-k}$, at any given time instant $t$. We use the environmental function to quantify the influence of other players' strategies (at time instant $t$) on player $k$'s reward. Other players' strategies can always be treated as some form of response to a given player's strategy at time instant $t$.
• Belief function: A player's understanding of its environmental function reflects its cognition about the competition in the game. We assume that player $k$ possesses a belief function that predicts the other players' strategies from its own strategy. With $x_{-k}(t)$ denoting the strategies of all the players except player $k$ at time instant $t$, the belief function evaluated at the present strategy $x_k(t)$ clearly returns $x_{-k}(t)$. This relationship further suggests formulating the belief response using some form of Taylor series expansion around $x_k(t)$.
In the two-firm example, if player B has as low cognition as a Nash player, then its belief function simply takes player A's present strategy as fixed. Player A, in turn, optimizes its strategy under its own belief (20), which suggests an updating process. This process ends up with the same strategy as in the Stackelberg equilibrium but via a completely different process. The striking result here is that the Stackelberg equilibrium can now be achieved by a free competition between the players without following a specific order of how the game should be played, and that the so-called leader, i.e., player A here, does not need perfect cognition of the environmental function (13), but only a good belief (20).
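A small numerical sketch can make this result concrete. Everything specific here is an assumption for illustration: the linear price `BETA - (x_a + x_b)`, the unit cost `COST`, the belief slope of -0.5 for Firm A, and in particular the update rule, which solves the steady-state stationarity condition of the predicted profit and is not claimed to be the paper's own updating process.

```python
# Assumed model (not taken from the text): price = BETA - (x_a + x_b), unit cost = COST.
BETA = 10.0
COST = 1.0
MARGIN = BETA - COST


def belief_directed_update(x_other: float, belief_slope: float) -> float:
    """One firm's update given the rival's latest quantity.

    The firm believes the rival's quantity changes by `belief_slope` per unit
    change of its own quantity (a first-order Taylor belief). At a steady state,
    the predicted profit is stationary when
        MARGIN - (2 + belief_slope) * x_own - x_other = 0,
    and the update solves this condition for x_own (an assumed update rule).
    """
    return max(0.0, (MARGIN - x_other) / (2.0 + belief_slope))


def free_competition(slope_a: float, slope_b: float, rounds: int = 200):
    """Both firms update simultaneously, with no leader-follower ordering."""
    x_a = x_b = 0.0
    for _ in range(rounds):
        x_a, x_b = (belief_directed_update(x_b, slope_a),
                    belief_directed_update(x_a, slope_b))
    return x_a, x_b


# Firm A is forward-looking with slope -0.5; Firm B holds the Nash belief (slope 0).
x_a, x_b = free_competition(slope_a=-0.5, slope_b=0.0)
print(x_a, x_b)  # approaches MARGIN / 2 and MARGIN / 4, the Stackelberg quantities here
```

In this assumed model, the slope -0.5 coincides with the slope of Firm B's true best response, which is why the free competition settles at the Stackelberg quantities even though Firm A never uses the full response function.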

Definition of the New Equilibrium
Motivated by the potential of being forward-looking, we here present the Nash equilibrium with forward-looking players (i.e., players with some cognition in the form of belief functions). Mathematically, it is written as (29), in which each player maximizes the reward it predicts under its belief function; see (18). In this model, the belief function can be chosen arbitrarily, and it only serves to indicate player $k$'s understanding about the competition environment. In fact, (29) embraces the conventional Nash equilibrium (4), in which players have a belief function that effectively treats the environment player $k$ observes at any present time instant $t$ as fixed and constant, and ignores the subsequent changes in other players' strategies provoked by player $k$'s new strategy.
We refer to the equilibrium of a Nash game with belief functions as a belief-directed Nash equilibrium (BNE). To examine this, we consider in the two-firm example the belief functions (31), whose slope parameters are regarded as the interference derivatives, which can be interpreted as the rate of change of the environmental function with respect to one's strategy. Also, (31) can be viewed as a first-order Taylor series approximation for the response. With (31), we can express the predicted reward functions for the players. For convenience, we refer to this two-firm example as a BNE game, denoted by BNE($\lambda$). In this game, each player aims to maximize its predicted reward, and the resulting strategies lead to the corresponding profits (or rewards). Very interestingly, we can observe that the equilibrium varies according to the belief parameter $\lambda$, which offers an opportunity to optimize the equilibrium.
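How the equilibrium moves with the belief parameters can be sketched in closed form. The example assumes a linear price `BETA - (x_a + x_b)`, a unit cost `COST`, and linear beliefs with slopes `lam_a` and `lam_b`; these names and the resulting formulas are illustrative assumptions and are not quoted from the paper's equations.

```python
# Assumed model (not taken from the text): price = BETA - (x_a + x_b), unit cost = COST.
BETA = 10.0
COST = 1.0
MARGIN = BETA - COST


def profit(x_own: float, x_other: float) -> float:
    """Actual profit of one firm: quantity times (price minus unit cost)."""
    return x_own * (MARGIN - x_own - x_other)


def bne_equilibrium(lam_a: float, lam_b: float):
    """Solve the two steady-state stationarity conditions of the predicted profits:

        (2 + lam_a) * x_a + x_b = MARGIN
        x_a + (2 + lam_b) * x_b = MARGIN

    by Cramer's rule. lam_a and lam_b are the assumed interference derivatives.
    """
    det = (2.0 + lam_a) * (2.0 + lam_b) - 1.0
    if abs(det) < 1e-12:
        raise ValueError("beliefs make the two conditions degenerate")
    x_a = MARGIN * (1.0 + lam_b) / det
    x_b = MARGIN * (1.0 + lam_a) / det
    return x_a, x_b


for lam_a, lam_b in [(0.0, 0.0), (-0.5, 0.0), (1.0, 1.0)]:
    x_a, x_b = bne_equilibrium(lam_a, lam_b)
    print(lam_a, lam_b, x_a, x_b, profit(x_a, x_b), profit(x_b, x_a))
```

In this sketch, zero slopes give back the Nash quantities, a slope of -0.5 for Firm A alone gives the Stackelberg quantities, and a common slope of 1 gives both firms a higher profit than at the Nash equilibrium.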
In order to illustrate this, we let the belief parameter $\lambda$ be chosen so that the equilibrium is the best outcome for the players. However, intriguingly, the player's cognition at this equilibrium is very different from how the other player actually responds. This reveals that the belief function a player has is not required to reflect the true reality but can still achieve the best outcome from the competition of the players. This has changed our understanding of how a player's cognition influences the equilibrium of a game. A beautiful world may be an outcome of people's mistakes.
On the contrary, for another choice of $\lambda$, it can be shown that the belief functions are perfect in terms of representing the reality. Unfortunately, the profits for both players at the equilibrium will be 0, showing that if both players are perfectly smart, the outcome could be a tragedy.
In Table 1, we also included a case that results in the same profits as the Nash equilibrium but with half the production. Table 1 uses the following marks:
† It has the same profits as the Nash equilibrium but half the production.
◊ It achieves the Stackelberg equilibrium but is not Pareto-optimal.
# This is the Pareto-optimal equilibrium.
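Treating the belief as an optimization parameter can be illustrated with a simple sweep. The sketch assumes a linear price `BETA - (x_a + x_b)`, a unit cost `COST`, and a common belief slope `lam` shared by both firms; the closed form used for the symmetric equilibrium follows from these assumptions and is not quoted from the paper.

```python
# Assumed model (not taken from the text): price = BETA - (x_a + x_b), unit cost = COST.
BETA = 10.0
COST = 1.0
MARGIN = BETA - COST


def symmetric_bne(lam: float):
    """Per-firm quantity and profit when both firms hold the same belief slope `lam`.

    The steady-state stationarity condition MARGIN - (2 + lam) * x - x = 0
    gives x = MARGIN / (3 + lam) for each firm in this assumed model.
    """
    x = MARGIN / (3.0 + lam)
    return x, x * (MARGIN - 2.0 * x)


# Sweep the common belief slope on a grid and keep the most profitable value.
grid = [i / 100.0 for i in range(0, 501)]
best_lam = max(grid, key=lambda lam: symmetric_bne(lam)[1])

nash_x, nash_profit = symmetric_bne(0.0)
best_x, best_profit = symmetric_bne(best_lam)
print(best_lam, best_x, best_profit)   # the profit-maximizing common belief on the grid
print(nash_x, nash_profit)             # the Nash case (lam = 0) for comparison
print(symmetric_bne(3.0))              # same profit as Nash with half the output, in this model
```

Because the two firms are symmetric here, maximizing the per-firm profit also maximizes the sum-profit, so the sweep picks out a Pareto-efficient point among the symmetric equilibria of this assumed model.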

Existence
Since the birth of game theory, the quest for the existence of an equilibrium has been a focal point in this area. In [6], Nash proved the existence of the widely known Nash equilibrium. Later, numerous researchers presented further theorems and proofs about the existence of game-theoretic equilibria [7,8].
Proof. The BNE game (with forward-looking players) presented in Section 3.2 follows the same spirit as a typical Nash game, where any player acts towards an equilibrium assuming that other players' strategies are fixed. The only difference lies in the players' understanding of the payoff functions, due to their different levels of cognition of the environment. Consequently, the same result regarding the existence of the Nash equilibrium [7] can be directly applied to BNE by simply replacing the payoff function by the predicted reward function, which completes the proof.

Conclusion
This paper showed the importance of players' cognition to the equilibrium of the game and presented the Nash equilibrium of forward-looking players. Using a two-firm example, we demonstrated how a Nash game with belief (referred to as BNE) can be made to achieve the Nash and Stackelberg equilibria. In addition, this paper has illustrated the potential of using the belief function as an optimization parameter, which makes it possible for the game to converge to a Pareto-optimal equilibrium.

If the game converges to an equilibrium, then we will use the superscript $*$ to highlight that the corresponding parameters are at the equilibrium. For instance, we have $x^*$ and $x_k^*$ at the equilibrium.

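It is worth pointing out here that the Nash player's belief that the other players' strategies are fixed is obviously inaccurate. At the Nash equilibrium in particular, it follows from (7) that the competitor's strategy does vary with the player's own, so the profit a Nash player optimizes is in fact only the profit in the ideal situation in which the competitor stays at its equilibrium strategy.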

In the proposed model of the Nash game with forward-looking players, every player is free to choose its own belief function. Altogether, the combination of the players' belief functions defines the resulting equilibrium of the players' competition and has numerous possibilities.


Consider that $\lambda$ is an optimization parameter of the game. Then $\lambda$ can be optimized to maximize the players' profits at the equilibrium. This optimization maximizes both the sum-profit and the individual profits of the players. Figure 1 shows how the belief (or the interference derivative $\lambda$) affects the equilibrium.

Based on what is known in the literature, we provide the sufficient condition for the existence of the general BNE game.
Corollary 1. Consider a $K$-player game, with the strategy spaces of players $k = 1, 2, \ldots, K$ being nonempty, compact and convex subsets of a Euclidean space. If the predicted reward function satisfies the conditions required of the payoff function in the existence result for the Nash equilibrium [7], then there exists an equilibrium.

Figure 1. The strategy and reward at various equilibria against the belief $\lambda$. The solid line refers to the results for the strategy, while the dashed line shows the rewards.

We summarize our results, including further scenarios, in Table 1. Note that in this example, the Stackelberg equilibrium is not Pareto-optimal. It can be seen by comparing it with BNE(1) that if the leader is less aggressive, its profit is not reduced but the profit for the follower is increased.

Table 1. Strategies and profits for various equilibria.