Statistical Learning in Game Theory

Abstract

In economics, buyers and sellers are usually the two main sides of a market. Game theory can model the decisions behind each “player” and calculate an outcome that benefits both sides. However, the use of game theory is not limited to economics. In this paper, I will introduce the mathematical model of general-sum games, the solutions and theorems surrounding it, and its real-life applications in several different scenarios.

Share and Cite:

Shi, L. (2023) Statistical Learning in Game Theory. Journal of Applied Mathematics and Physics, 11, 663-669. doi: 10.4236/jamp.2023.113043.

1. Introduction

The use of game theory lies mainly within economics. The concept of a zero-sum game, a special type of general-sum game in which what one player gains equals what the other player loses, aligns with the economic notion of scarcity. This concept can be traced as far back as the beginning of American history. When the European conquistadors landed on American shores, they adopted an economic model known as mercantilism. This model is derived from theories suggesting that natural resources are limited for countries that need to import them. In the modern world, some politicians suggest that cutting taxes will facilitate economic growth, while some economists argue that the reduction in government spending caused by a tax cut will decrease the number of job opportunities. This argument aligns with the zero-sum game model, in that “this spending cut is likely to reduce demand in the state just as much as the reduction in taxes may stimulate demand” [1]. The typical approach to such economic problems is to set up a zero-sum game model and calculate the Nash equilibrium. In this paper, we demonstrate different types of solutions to general-sum games using models from economics, politics, and sports.

2. General Sum Game

We first introduce a formal mathematical model to characterize general sum games. Assume there are two players: Player I and II. Player I has m possible actions. Player II has n possible actions.

The payoff matrix for player I is $A=\left[a_{ij}\right]_{i\in[m],\,j\in[n]}$, where $a_{ij}$ denotes the payoff to player I when player I plays action i and player II plays action j, that is, when players I and II choose the action pair (i, j). The payoff matrix for player II is $B=\left[b_{ij}\right]_{i\in[m],\,j\in[n]}$, where $b_{ij}$ denotes the payoff to player II for the same action pair (i, j). The zero-sum game is the special case of the general-sum game in which A + B = 0.

Let $x=\left[x_1,x_2,\cdots,x_m\right]^{\top}$ and $y=\left[y_1,y_2,\cdots,y_n\right]^{\top}$ be the strategies of players I and II, respectively. Here $x_i\in[0,1]$ for $i\in[m]$ is the probability that player I chooses action i, and $y_j\in[0,1]$ for $j\in[n]$ is the probability that player II chooses action j, so that $\sum_{i=1}^{m}x_i=1$ and $\sum_{j=1}^{n}y_j=1$. We write $\Delta_m$ and $\Delta_n$ for the sets of all such strategies x and y. The expected payoff of player I given the strategy pair (x, y) is

${x}^{\top }Ay$ (1)

The expected payoff of player II given strategy pair (x, y) is

${x}^{\top }By$ (2)

Equation (1) is compact notation for the sum of player I’s expected gains over all possible action pairs:

$\sum_{i=1}^{m}\sum_{j=1}^{n}x_i a_{ij} y_j$ (3)

Likewise, Equation (2) is compact notation for

$\sum_{i=1}^{m}\sum_{j=1}^{n}x_i b_{ij} y_j$ (4)

Both sums are obtained by going cell by cell through the payoff matrix over all possible outcomes and applying the expected-value formula (gain × probability). For player I, with $x_i$ the probability that player I plays action i and $y_j$ the probability that player II plays action j, this gives

$\left(x_1 y_1 a_{11}\right)+\left(x_2 y_1 a_{21}\right)+\cdots+\left(x_m y_n a_{mn}\right)$ (5)
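As a minimal sketch of the cell-by-cell calculation, the following Python snippet evaluates the double sum for both players. The function name `expected_payoff` and the 2 × 2 matrices and strategies are hypothetical, chosen only for illustration.

```python
def expected_payoff(x, M, y):
    """Return x^T M y as the cell-by-cell sum of x_i * m_ij * y_j."""
    return sum(
        x[i] * M[i][j] * y[j]
        for i in range(len(x))
        for j in range(len(y))
    )

# Hypothetical 2 x 2 general-sum game (not taken from the paper).
A = [[3, 0],
     [1, 2]]          # payoffs to player I
B = [[1, 2],
     [4, 0]]          # payoffs to player II
x = [0.5, 0.5]        # player I mixes evenly over its two actions
y = [0.25, 0.75]      # player II plays action 1 a quarter of the time

payoff_I = expected_payoff(x, A, y)
payoff_II = expected_payoff(x, B, y)
print(payoff_I, payoff_II)   # 1.25 1.375
```

The same function serves both players because only the payoff matrix changes between Equations (3) and (4).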

2.1. Solving Zero-Sum Games

2.1.1. Nash Equilibrium

1) Pure optimal strategy: A saddle point, or Nash equilibrium, is reached when both players play their optimal strategies and neither can benefit from unilaterally deviating from their decision. In other words, no player can be better off by changing only their own play. We can predict the result of the game by finding the Nash equilibrium, because the players have no incentive to deviate from their equilibrium strategies. This can be seen as the solution to a general-sum game [2]. A Nash equilibrium can also be stated mathematically: the strategy pair (x*, y*) is a Nash equilibrium when $\left(x^{\ast}\right)^{\top}Ay^{\ast}\ge x^{\top}Ay^{\ast}$ for all $x\in\Delta_m$ and $\left(x^{\ast}\right)^{\top}By^{\ast}\ge\left(x^{\ast}\right)^{\top}By$ for all $y\in\Delta_n$.
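The two inequalities can be checked directly for pure strategies. The sketch below, with the hypothetical helper `pure_nash_equilibria` and an illustrative coordination game that is not from the paper, lists every action pair from which neither player gains by deviating alone. Actions are indexed from 0 in the code.

```python
def pure_nash_equilibria(A, B):
    """Return all action pairs (i, j) from which neither player gains by deviating alone."""
    m, n = len(A), len(A[0])
    equilibria = []
    for i in range(m):
        for j in range(n):
            # Player I cannot do better by switching rows while column j is fixed.
            best_for_I = all(A[i][j] >= A[k][j] for k in range(m))
            # Player II cannot do better by switching columns while row i is fixed.
            best_for_II = all(B[i][j] >= B[i][l] for l in range(n))
            if best_for_I and best_for_II:
                equilibria.append((i, j))
    return equilibria

# Hypothetical coordination game: both players prefer to match actions.
A = [[2, 0],
     [0, 1]]
B = [[1, 0],
     [0, 2]]
print(pure_nash_equilibria(A, B))   # [(0, 0), (1, 1)]
```

Because the expected payoff is linear in each player's own strategy, it is enough to test pure deviations when checking a pure action pair.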

2) Equalization principle: This solution is used in a mixed-strategy game where there is no pure Nash equilibrium. Suppose the optimal mixed strategies are given by (x*, y*). In a game with two actions per player, we can set the probability that player II plays action 1 to y1 and the probability that player II plays action 2 to y2 = 1 − y1. We can then calculate the expected gain of each action i of player I, ${\sum }_{j=1}^{n}{a}_{ij}{y}_{j}^{*}$. Given y*, player I must obtain the same payoff from each action $i\in \left[m\right]$ that it plays with positive probability. Finally, we set the expected gains of these actions equal to each other to solve for y1 and y2. The equality holds because, if player I obtained a greater payoff from action i than from action 1, player I would do better by never playing action 1. With two actions, this means playing a pure strategy, which contradicts the assumption that the game has no pure Nash equilibrium.
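For a 2 × 2 zero-sum game, the equalization conditions reduce to two linear equations with a closed-form solution. The following is a sketch under that assumption. The function name `equalize_2x2` and the example matrix are hypothetical, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def equalize_2x2(A):
    """Solve a 2 x 2 zero-sum game without a saddle point by equalization.

    A[i][j] is the payoff to player I. Returns (p, q, value), where p is the
    probability that player I plays action 1 and q is the probability that
    player II plays action 1.
    """
    (a, b), (c, d) = [[Fraction(v) for v in row] for row in A]
    denom = a - b - c + d
    if denom == 0:
        raise ValueError("equalization has no unique solution for this matrix")
    # q makes player I indifferent: q*a + (1-q)*b == q*c + (1-q)*d
    q = (d - b) / denom
    # p makes player II indifferent: p*a + (1-p)*c == p*b + (1-p)*d
    p = (d - c) / denom
    if not (0 < p < 1 and 0 < q < 1):
        raise ValueError("no fully mixed solution; look for a saddle point instead")
    value = q * a + (1 - q) * b
    return p, q, value

# Hypothetical payoff matrix for player I (not taken from the paper).
p, q, value = equalize_2x2([[2, -1], [-1, 1]])
print(p, q, value)   # 2/5 2/5 1/5
```

The range check matters: when either probability falls outside (0, 1), the indifference equations have no valid mixed solution and the game should be examined for a saddle point instead.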

2.1.2. Dominant Equilibrium

The technique of domination: When one strategy of a player dominates another, the dominated strategy can be ignored. We can look at this through an example.

In this payoff matrix (Table 1), Player II is always better off playing strategy 2, no matter what Player I plays. Therefore, strategy 1 is dominated by strategy 2 and can be removed from the matrix. The reduced matrix is shown in Table 2.

Now that player II has only one remaining strategy, let’s look at player I. Player I will pick strategy A to maximize his gain in this 2 × 1 payoff matrix. The solution, or Nash equilibrium, of this game is therefore that player I plays A and player II plays 2. A dominant equilibrium is found when both players play their dominant strategies. Whenever a dominant equilibrium exists, it is also a Nash equilibrium of the game.
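The elimination procedure can be sketched in a few lines of Python. The entries of Tables 1 and 2 are not reproduced in the text, so the matrices below are hypothetical values chosen to mimic the example: player II's strategy 2 strictly dominates strategy 1, after which player I prefers strategy A. The function name `eliminate_dominated` is likewise an assumption.

```python
def eliminate_dominated(A, B):
    """Iteratively delete strictly dominated pure strategies.

    A[i][j] and B[i][j] are the payoffs to player I and player II.
    Returns the indices of the surviving rows and columns.
    """
    rows = list(range(len(A)))
    cols = list(range(len(A[0])))
    changed = True
    while changed:
        changed = False
        # A row is dominated if another row pays player I more in every surviving column.
        for r in rows:
            if any(all(A[k][j] > A[r][j] for j in cols) for k in rows if k != r):
                rows.remove(r)
                changed = True
                break
        # A column is dominated if another column pays player II more in every surviving row.
        for c in cols:
            if any(all(B[i][l] > B[i][c] for i in rows) for l in cols if l != c):
                cols.remove(c)
                changed = True
                break
    return rows, cols

# Hypothetical payoffs: rows are player I's strategies A and B,
# columns are player II's strategies 1 and 2.
A = [[3, 4],
     [5, 2]]
B = [[1, 2],
     [0, 3]]
rows, cols = eliminate_dominated(A, B)
print(rows, cols)   # [0] [1], i.e. player I plays A and player II plays 2
```

Note that player I has no dominated strategy at the start; strategy B only becomes dominated once player II's strategy 1 has been removed, which is why the loop repeats until nothing changes.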

2.1.3. Safety Strategy Equilibrium

Table 1. Dominant equilibrium example.

Table 2. Dominant equilibrium example.

Minimax Theorem: The Minimax Theorem, proposed by John von Neumann, states that for any two-player zero-sum game with m × n payoff matrix A, the maximum value of the minimum expected gain for one player is equal to the minimum value of the maximum expected loss for the other. This common value is called the value of the game, V, where

$\max_{x\in\Delta_m}\min_{y\in\Delta_n}x^{\top}Ay=V=\min_{y\in\Delta_n}\max_{x\in\Delta_m}x^{\top}Ay$ (6)

This equation follows from both players making rational decisions and playing their safety strategies. In a zero-sum game, a player maximizes his payoff by minimizing his opponent’s payoff. If player I plays mixed strategy x, player II will choose y to minimize player I’s payoff. Anticipating this, player I maximizes his worst-case payoff over x. Therefore the payoff that player I can guarantee is

$\max_{x\in\Delta_m}\min_{y\in\Delta_n}x^{\top}Ay$ (7)

This value is denoted V1, the maximin value of player I. Similarly, if player II plays strategy y, his worst-case loss is the maximum of $x^{\top}Ay$ over x, and he chooses y to minimize it:

$\min_{y\in\Delta_n}\max_{x\in\Delta_m}x^{\top}Ay$ (8)

This value is denoted V2, the minimax value of player II.

In conclusion, player I can guarantee a gain of at least V1, and player II can hold player I’s gain to at most V2. In general $V_1\le V_2$, while by the Minimax Theorem a zero-sum game played with mixed strategies has V1 equal to V2.
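The gap between the two values, and its disappearance under mixed strategies, can be seen numerically. The sketch below uses a hypothetical 2 × 2 matrix and a simple grid search over mixing probabilities. It relies on the fact that, for a fixed strategy of one player, the opponent's best reply is attained at a pure action because the payoff is linear in the opponent's mix. All function names are assumptions made for this illustration.

```python
def maximin_pure(A):
    """Best payoff player I can guarantee using a single pure action."""
    return max(min(row) for row in A)

def minimax_pure(A):
    """Smallest cap player II can enforce using a single pure action."""
    return min(max(A[i][j] for i in range(len(A))) for j in range(len(A[0])))

def maximin_mixed_2x2(A, steps=1000):
    """Approximate V1 = max over x of min over y of x^T A y on a grid of mixes."""
    best = float("-inf")
    for k in range(steps + 1):
        p = k / steps                      # probability of player I's action 1
        worst = min(p * A[0][j] + (1 - p) * A[1][j] for j in range(2))
        best = max(best, worst)
    return best

def minimax_mixed_2x2(A, steps=1000):
    """Approximate V2 = min over y of max over x of x^T A y on a grid of mixes."""
    best = float("inf")
    for k in range(steps + 1):
        q = k / steps                      # probability of player II's action 1
        worst = max(q * A[i][0] + (1 - q) * A[i][1] for i in range(2))
        best = min(best, worst)
    return best

# Hypothetical zero-sum payoff matrix for player I (not taken from the paper).
A = [[2, -1],
     [-1, 1]]
print(maximin_pure(A), minimax_pure(A))            # -1 1
print(maximin_mixed_2x2(A), minimax_mixed_2x2(A))  # both approximately 0.2
```

With pure strategies only, the two values differ, so there is no saddle point. Once mixing is allowed, the two values meet, as Equation (6) states.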

We will use three specific examples, respectively in economics, politics, and sports, to illustrate these general sum game models.

2.2. Company Decisions

The concept of a general-sum game has significant implications in economics. Suppose a company must decide whether to enter a monopolized market. If it enters, it pays 10 million to the monopoly company in exchange for half of the market. The monopoly company also has a decision to make regarding pricing: it can sell its product at a price of 1000, in which case 25,000 units will be sold, or at a price of 600, in which case 30,000 units will be sold.

The payoff matrix (Table 3) lists the possible results of these decisions. If the monopoly company sets the price at 1000 and the entrance company enters, the expected gain of the entrance company is 1000 × 25,000/2 (half the market share) − 10,000,000 (entrance payment) = 2,500,000. The monopoly company keeps the other half of the market plus the 10 million it received from the entrance company, for a total payoff of 1000 × 25,000/2 + 10,000,000 = 22,500,000. If the entrance company chooses not to enter, it gains nothing while the monopoly company keeps the whole market, 1000 × 25,000 = 25,000,000. On the other hand, if the monopoly company sets the price at 600, the gain of the entrance company from entering is 600 × 30,000/2 (half the market) − 10,000,000 (entrance payment) = −1,000,000. This means that the entrance company will lose 1 million dollars if it enters the market when the price is 600. The monopoly company again keeps half the market plus the entrance payment and gains 600 × 30,000/2 + 10,000,000 = 19,000,000. If the entrance company stays out at this price, it gains nothing and the monopoly company keeps the whole market, 600 × 30,000 = 18,000,000. Of the four possible results, the stable outcome is the one in which the entrance company enters and the monopoly company prices at 1000 for 25,000 units: by staying out, the entrance company would drop from 2,500,000 to 0, and by cutting its price, the monopoly company would drop from 22,500,000 to 19,000,000. This is the Nash equilibrium, where no player can increase their gain by unilaterally deviating (changing only their own decision).

Table 3. Company decisions payoff matrix.
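The four cells of the matrix can be recomputed from the stated prices and quantities. This is a sketch under two assumptions: the low price sells 30,000 units, and after entry the monopoly company's payoff is half of the revenue plus the entrance payment, following the verbal description. The names `ENTRY_FEE`, `PRICING`, `payoffs`, and `is_nash` are hypothetical.

```python
ENTRY_FEE = 10_000_000
PRICING = {"high": (1000, 25_000), "low": (600, 30_000)}   # (price, units sold)

def payoffs(enters, pricing):
    """Return (entrance company payoff, monopoly company payoff) for one cell."""
    price, units = PRICING[pricing]
    revenue = price * units
    if not enters:
        return 0, revenue                  # monopoly keeps the whole market
    half = revenue // 2
    return half - ENTRY_FEE, half + ENTRY_FEE

# Build all four cells of the payoff matrix.
table = {(e, p): payoffs(e, p) for e in (True, False) for p in PRICING}

def is_nash(enters, pricing):
    """True if neither company gains by changing only its own decision."""
    entrant, monopoly = table[(enters, pricing)]
    other_pricing = "low" if pricing == "high" else "high"
    return (entrant >= table[(not enters, pricing)][0]
            and monopoly >= table[(enters, other_pricing)][1])

equilibria = [cell for cell in table if is_nash(*cell)]
for cell, result in table.items():
    print(cell, result)
print("Nash equilibria:", equilibria)   # [(True, 'high')]
```

Under these assumptions the only cell that survives the unilateral-deviation check is entry combined with the high price.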

2.3. Cuban Missile Crisis

Another perspective on general-sum game models comes from politics. Steven J. Brams’s book “Game Theory and Politics” connects game theory to real-world politics [3]. During the Cold War, the US experienced a 13-day nuclear standoff with the Soviet Union, known as the Cuban Missile Crisis. Brams created the payoff matrix below (Table 4) to model the situation.

The Soviet Union had deployed missiles in Cuba, which threatened US homeland security. The US had a choice between launching a direct airstrike on the missile sites, which could quickly escalate into a full-scale nuclear war, and enacting a naval blockade to prevent further nuclear missiles from being delivered. There are two Nash equilibria in this situation: the Soviet Union maintains its missiles and the US blockades, or the Soviet Union withdraws and the US launches an airstrike. However, both countries chose to avoid war instead of playing either Nash equilibrium. This is an example of both countries playing a strategy that is not individually optimal for either country.

Table 4. Cuban missile crisis payoff matrix.

2.4. Football

Finally, sports offer another interesting way of looking at general-sum games. Football, the most popular sport in America, is a good representation of a special type of general-sum game called a zero-sum game, because when one team gains 10 yards of field, the other team loses the same amount. An article by Chase Stuart published in 2019, analyzing the average yardage gained on different plays in the 2019 NFL season, suggests that on average a team gains 6.4 yards on a passing play and 4.3 yards on a running play. The defense also has two moves to play: it can tackle the player with the ball, or intercept a pass from the quarterback. If the defense plays intercept and the offense decides to run, the defense loses 4.3 yards and the offense gains 4.3 yards; likewise, if the defense plays tackle and the offense passes, the offense gains 6.4 yards. When the defense correctly predicts the offense, that is, the defense plays intercept while the offense passes, or the defense plays tackle while the offense runs, the offense moves on to the next down. According to the rules of football, the offense has 4 downs to gain 10 yards. So in the scenarios where the defense correctly predicts the offense, the defense gains 2.5 (10/4) yards and the offense loses 2.5 yards. Although this model of football has no pure Nash equilibrium, we can use the equalization principle to determine the optimal passing and running percentages for the offense and the optimal defensive strategy for the defense.

We can see in the matrix (Table 5) that in a zero-sum game model, the payoffs of the two players have the same magnitude but opposite signs. Since there is no pure Nash equilibrium, we can use the equalization principle by setting the probability of running = Y1 and the probability of passing = 1 − Y1.

Analysis of the game. If the defense chooses to intercept, the offense’s expected value for that down is 4.3 × Y1 + (1 − Y1)(−2.5) = 4.3Y1 + 2.5Y1 − 2.5 = 6.8Y1 − 2.5.

If the defense plays tackle, the offense’s expected value for that down is −2.5 × Y1 + (1 − Y1)(6.4) = −2.5Y1 − 6.4Y1 + 6.4 = −8.9Y1 + 6.4.

We can set the two expressions equal to each other to find Y1: −8.9Y1 + 6.4 = 6.8Y1 − 2.5, so 15.7Y1 = 8.9 and Y1 ≈ 0.57.

Therefore the optimal probability of running is 0.57 and the optimal probability of passing is 0.43.

Although the passing percentage in the 2019 NFL regular season was about 59, this solution is an approximation of the optimal strategy based on an estimated model. In a real football game, both teams have many more choices and rules, and the players adjust their choices according to many more factors.

Table 5. Football payoff matrix.

We can do the same calculation to find that the optimal mixed strategy for the defense is $p_{\text{tackle}}$ = 0.43 and $p_{\text{intercept}}$ = 0.57.

The expected payoff for the offense is ${x}^{\top }Ay={\sum }_{i=1}^{m}{\sum }_{j=1}^{n}{x}_{i}{a}_{ij}{y}_{j}$ = (0.57 × 4.3 × 0.57) + (0.57 × (−2.5) × 0.43) + (0.43 × (−2.5) × 0.57) + (0.43 × 6.4 × 0.43) ≈ 1.35.

Since this is a zero-sum game, the expected payoff for the defense is −1.35 per down.
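The football calculation can be reproduced with a short sketch. It takes the four yardage figures described in the text as the offense's payoff matrix, solves both indifference conditions exactly, and then evaluates the double sum $x^{\top}Ay$ at the resulting strategies. The variable names are assumptions made for this illustration.

```python
from fractions import Fraction

# Payoffs to the offense in yards, as described in the text.
# Rows: offense runs, offense passes. Columns: defense intercepts, defense tackles.
A = [[Fraction("4.3"), Fraction("-2.5")],
     [Fraction("-2.5"), Fraction("6.4")]]

denom = A[0][0] - A[0][1] - A[1][0] + A[1][1]   # 4.3 + 2.5 + 2.5 + 6.4 = 15.7
run = (A[1][1] - A[1][0]) / denom               # Y1: makes the defense indifferent
intercept = (A[1][1] - A[0][1]) / denom         # makes the offense indifferent

x = [run, 1 - run]                   # offense: (run, pass)
y = [intercept, 1 - intercept]       # defense: (intercept, tackle)
value = sum(x[i] * A[i][j] * y[j] for i in range(2) for j in range(2))

print(f"run {float(run):.2f}, pass {float(1 - run):.2f}")
print(f"intercept {float(intercept):.2f}, tackle {float(1 - intercept):.2f}")
print(f"expected yards per down for the offense: {float(value):.2f}")
```

Working with exact fractions avoids the small discrepancies that appear when the probabilities are rounded to two decimals before being multiplied.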

As an interesting aside, in the 2019 NFL season the team with the best win-loss record was the Baltimore Ravens, with 14 wins and 2 losses. Surprisingly, or maybe unsurprisingly, their passing and running percentages were almost identical to what we have just calculated: they passed on 42.5% of plays and ran on 57.5%.

3. Conclusions

General-sum games are an effective way to model real-life economic scenarios and to help individuals or firms make decisions.

So far, we have considered settings where the payoff matrices are completely known, which are usually referred to as games with complete information.

The literature has also considered the setting where the payoff matrices are not completely known, which is called games with incomplete information.

With the advance of information technology, we now face a third setting where the payoff matrices are not known, but it is possible to estimate them from data collected by playing the games repeatedly [4].

The concept of game theory, combined with the rise of cryptocurrency, will have significant implications in the future.

Conflicts of Interest

The author declares no conflicts of interest regarding the publication of this paper.

[1] Tannenwald, R. and Lav, J.I., et al. (2010) The Zero-Sum Game: States Cannot Stimulate Their Economies by Cutting Taxes. Center on Budget and Policy Priorities, 2 March. https://www.cbpp.org/research/the-zero-sum-game-states-cannot-stimulate-their-economies-by-cutting-taxes

[2] Cong, W.L., Li, Y. and Wang, N. (2022) Token-Based Platform Finance. Journal of Financial Economics, 144, 972-991. https://www0.gsb.columbia.edu/faculty/nwang/papers/Cong_Li_Wang_JFE_2022.pdf https://doi.org/10.1016/j.jfineco.2021.10.002

[3] Brams, S.J. (2011) Game Theory and Politics. Dover Publications, New York.

[4] Crandall, J.W. and Goodrich, M.A. (2005) Learning to Compete, Compromise, and Cooperate in Repeated General-Sum Games. Proceedings of the 22nd International Conference on Machine Learning, New York, August 2005, 161-168. https://doi.org/10.1145/1102351.1102372