Markov models for the tipsy cop and robber game on graphs

In this paper we analyze and model three open problems posed by Harris, Insko, Prieto-Langarica, Stoisavljevic, and Sullivan in 2020 concerning the tipsy cop and robber game on graphs. The three scenarios we model correspond to different biological situations: in the first, the cop and robber maintain a constant tipsiness level throughout the game; in the second, the cop and robber sober up as a function of time; in the third, the cop and robber sober up as a function of the distance between them. Using Markov chains to model each scenario, we calculate the probability of a game persisting through $M$ rounds and the expected game length given different starting positions and tipsiness levels for the cop and robber.


Introduction
The game of cops and robbers on graphs was introduced independently by Quilliot [1] and Nowakowski and Winkler [6]. In this game, a cop and a robber alternate turns moving from vertex to adjacent vertex on a connected graph G with the cop trying to catch the robber and the robber trying to evade the cop. In 2014, Komarov and Winkler studied a variation of the cop and robber game in which the robber is too inebriated to employ an escape strategy, and at each step he moves to a neighboring vertex chosen uniformly at random [4].
In 2020, Harris, Insko, Prieto-Langarica, Stoisavljevic, and Sullivan introduced another variant of the cop and robber game that they call the tipsy cop and drunken robber game on graphs [5]. Each round of this game consists of independent moves: the robber begins by moving uniformly at random to a vertex adjacent to his current position, and the cop then moves to a vertex adjacent to hers. Since the cop is only tipsy, some percentage of her moves are random and the rest are intentionally directed toward the robber.
In this paper, we generalize the work of Harris et al. to analyze the tipsy cop and tipsy robber game. One inspiration for this study is the biological scenario illustrated in the YouTube video [2] https://www.youtube.com/watch?v=Z_mXDvZQ6dU, in which a neutrophil chases a bacterium moving in random directions. While the bacterium's movement seems mostly random, the neutrophil's movement appears slightly more purposeful but a little slower.
Harris et al. modeled the game by having the players alternate turns. In this paper we use a slightly different model of the game that was suggested to us by Dr. Florian Lehner. We allow the cop and robber to each be tipsy to any degree, and rather than assume that the players alternate turns and flip a coin to determine whether a move is random, we use a spinner wheel to determine the probability that the next move will be a sober cop move, a sober robber move, or a tipsy move by either player. We model this scenario on vertex-transitive graphs (both finite and infinite examples) and on non-vertex-transitive graphs (friendship graphs) using the theory of Markov chains. Given a set of initial conditions on each player's tipsiness and their initial distance, the questions we consider are:
• What is the probability $P(i, j, M)$ that the game, beginning in state $i$, will be in state $j$ after exactly $M$ rounds?
• What is the probability $G_M(d)$ that the game lasts at least $M$ rounds if the players start at distance $d$ from each other?
• What is the expected number $E(d)$ of rounds the game should last if the players start at distance $d$ from each other?
For some of these graphs we are able to identify when the expected capture time is finite and when the robber is expected to escape.
In Section 2 we introduce our model of the tipsy cop and robber game on both vertex-transitive and non-vertex-transitive graphs. In Section 3 we analyze the game on regular trees and compare it to the famous gambler's ruin problem [3]. In Section 4 we present our general method for analyzing the game using Markov chains, and we then apply these methods to analyze the game on specific families of graphs in Sections 5-8. In Section 5 we analyze the game on cycle graphs. In Section 6 we analyze the game on Petersen graphs. In Section 7 we analyze the game on friendship graphs. In Section 8 we analyze the game on toroidal grids. In Section 9 we present a model for the game where players start off drunk and sober up as the game progresses, answering Question 6.1 from Harris et al. [5]. Finally, in Section 10 we present a model for the game where the players' tipsiness increases as a function of the distance between them, thus providing an answer to Question 6.6 from Harris et al. [5]. While we present our methods with specific examples from each family of graphs we analyze, we also include our CoCalc (Sage) code in the Appendix so that any reader may adapt our methods to their own models.

Background and introduction of our model
In this paper we model the tipsy cop and tipsy robber game on a graph G by first placing the cop on one vertex and the robber on another vertex of G. Rather than requiring the players to alternate turns as in previous models of the cops and robbers game, we allow four possible outcomes in each round of the game: a sober robber move $r$, a sober cop move $c$, a tipsy robber move $t_r$, or a tipsy cop move $t_c$, where $c + r + t_c + t_r = 1$. The outcome of each round is assigned at random, perhaps by a probability spinner as depicted in Figure 1.
Figure 1. Probability of each move based on spinner.
• $r$ is the probability of a sober robber move (in yellow)
• $c$ is the probability of a sober cop move (in green)
• $t_r$ is the probability of a tipsy move by the robber (in red)
• $t_c$ is the probability of a tipsy move by the cop (in blue)
If the game is played on a non-vertex-transitive graph, the probability of transitioning from one state to another depends on the starting locations of the players. One such graph is a friendship graph, in which $n$ copies of the cycle graph $C_3$ are joined at a common vertex (see Figure 2 for an example of a friendship graph with 3 copies of $C_3$). A vertex-transitive graph is a graph in which every vertex has the same local environment, so that no vertex can be distinguished from any other by the vertices and edges surrounding it. On a non-vertex-transitive graph, a tipsy move by the cop is not necessarily equivalent to a tipsy move by the robber, as it may increase, preserve, or decrease the distance between the players depending on each player's starting state. For example, the first part of Figure 2 shows the possible movements of the cop (in blue) if she is at the center of a friendship graph with 3 triangles. If the cop makes a tipsy move to a green vertex (which accounts for 4/6 of $t_c$ moves), the distance between her and the robber (in red) increases to 2. If she moves tipsily to the orange vertex the distance does not change (1/6 of $t_c$ moves), and if she moves to the red vertex (either 1/6 of $t_c$ moves or the only sober move) she catches the robber (the distance becomes 0). On the other hand, if the cop's starting position is an outer vertex (second part of Figure 2) there are only two possible outcomes: she moves closer to the robber (in black, either 1/2 of $t_c$ moves or the only sober move) or keeps the same distance (in orange, 1/2 of $t_c$ moves). Similarly, the starting state of the robber determines the probability of moving to the next state.
The first part of Figure 3 depicts the possible movements of the robber (in red) if he is at the center of a friendship graph with 3 triangles. If the robber moves, sober or tipsy (the latter accounting for 4/6 of $t_r$ moves), to a green vertex, the distance between him and the cop (in blue) increases to 2. If he moves tipsily to the orange vertex (1/6 of $t_r$ moves) the distance does not change, and if he moves to the blue vertex (another 1/6 of $t_r$ moves) he is caught (the distance becomes 0). On the other hand, if the robber's starting position is an outer vertex (second part of Figure 3) there are only two possible outcomes: he moves closer to the cop (in black, 1/2 of $t_r$ moves) or keeps the same distance (in orange, 1/2 of $t_r$ moves or a sober move). We will analyze the friendship graph game in more detail in Section 7.
The game is simpler to model on vertex-transitive graphs. A sober cop move always decreases the distance between the two players (as the cop is chasing the robber). A sober robber move always increases the distance between the two, except when the players are at maximum distance on a finite graph, in which case he may choose to stay in place. Finally, a tipsy move by either player is a random move and may therefore increase or decrease the distance between the two. When modeling the game on a vertex-transitive graph, a tipsy move by the cop is equivalent to a tipsy move by the robber, so we group all tipsy moves together. Every move in the game is guaranteed to fall into one of these three categories; hence $c + r + t = 1$.
Figure 4. Probability of each move based on spinner where tipsy moves are grouped together ($r = 30\%$, $c = 30\%$, $t = 40\%$).
• $r$ is the probability of a sober robber move (in yellow)
• $c$ is the probability of a sober cop move (in green)
• $t$ is the probability of a tipsy move by either player (in red)
For example, if the game is played on an infinite path (Figure 5), the probability that the distance between the cop and robber increases by one in a round equals the probability of a sober robber move (since a sober robber always runs from the cop) plus one half of the probability of a tipsy move by either player (since half of all random moves increase the distance between them). We will employ Markov chains and transition matrices to model how the players move from one state to another on these graphs. On an infinite regular tree of degree $\Delta$, when either player makes a tipsy move the distance between the two players decreases with probability $\frac{1}{\Delta}$ and increases with probability $\frac{\Delta-1}{\Delta}$. When the cop makes a sober move, the distance always decreases, and when the robber makes a sober move the distance always increases. Hence, if we assume that the cop calls off the hunt when the distance between the players reaches a specified distance $d = n$, then the Markov chain in Figure 6 models the game, where the probability of the distance increasing is $p = t\frac{\Delta-1}{\Delta} + r$ and the probability of the distance decreasing is $1 - p = c + \frac{t}{\Delta}$. Those familiar with the gambler's ruin problem will immediately realize that this Markov chain is the same as the gambler's ruin chain with the probability of the gambler winning a round given by $p = t\frac{\Delta-1}{\Delta} + r$ and of losing a round given by $1 - p = c + \frac{t}{\Delta}$. It is well known that the expected game length of the gambler's ruin problem is given by the equation
$$E(d) = \frac{d}{1-2p} - \frac{n}{1-2p}\cdot\frac{1-\left(\frac{1-p}{p}\right)^{d}}{1-\left(\frac{1-p}{p}\right)^{n}}.$$
After finding a common denominator and factoring out $\frac{1}{1-2p}$ we can rewrite $E(d)$ using our notation as
$$E(d) = \frac{1}{1-2p}\left(d - n\cdot\frac{1-\left(\frac{1-p}{p}\right)^{d}}{1-\left(\frac{1-p}{p}\right)^{n}}\right). \tag{1}$$
When $p < \frac{1}{2}$ (favoring cop success), the parenthesized factor on the right-hand side of Equation (1) has a limit of $d$ as $n \to \infty$.
So if the cop never gives up, we get the formula
$$E(d) = \frac{d}{1-2p}.$$
If $R(d)$ and $C(d)$ represent the probability of the robber or cop winning, respectively, when the game starts at distance $d$, then of course $R(d) + C(d) = 1$. Formulas for $R(d)$ and $C(d)$ can easily be derived from well-known formulas modeling the gambler's ruin problem. For instance, when $p = \frac{1}{2}$, the probability of the robber escaping is the same as the probability of the gambler's success in the fair gambler's ruin problem, $R(d) = \frac{d}{n}$, and when $p \neq \frac{1}{2}$ we have
$$R(d) = \frac{1-\left(\frac{1-p}{p}\right)^{d}}{1-\left(\frac{1-p}{p}\right)^{n}},$$
the probability of the gambler's success in the unfair gambler's ruin problem [3, Equation 1.2].
Substituting $p = t\frac{\Delta-1}{\Delta} + r$ (the transition probability for the distance to increase) and $1 - p = c + \frac{t}{\Delta}$ (the transition probability for the distance to decrease), we find $R(d)$ to be
$$R(d) = \frac{1-\left(\frac{c+\frac{t}{\Delta}}{t\frac{\Delta-1}{\Delta}+r}\right)^{d}}{1-\left(\frac{c+\frac{t}{\Delta}}{t\frac{\Delta-1}{\Delta}+r}\right)^{n}}.$$
We observe that if this game begins at $d = 9$, then it has the lowest expected game time of any possible starting distance. This makes sense: 40% of the time the robber escapes on the first move, and with probability 22.5% either player moves drunkenly in a way that lets the robber escape on the first move; hence the robber has a $40\% + 22.5\% = 62.5\%$ chance of escaping on the first move if the game begins at $d = 9$.
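These gambler's-ruin formulas are easy to check numerically. The following sketch is ours rather than the authors' appendix code (all function names are assumptions); it evaluates $R(d)$ and the $n \to \infty$ expected game length with exact rational arithmetic:

```python
from fractions import Fraction

def escape_probability(d, n, c, r, t, Delta):
    """R(d): probability the robber reaches distance n before capture,
    for the game on an infinite Delta-regular tree."""
    p = t * Fraction(Delta - 1, Delta) + r   # distance increases
    q = c + t * Fraction(1, Delta)           # distance decreases
    if p == q:                               # fair case: R(d) = d/n
        return Fraction(d, n)
    ratio = q / p
    return (1 - ratio**d) / (1 - ratio**n)

def expected_length_no_quit(d, c, r, t, Delta):
    """Limit of E(d) as n -> infinity: d / (1 - 2p), valid when p < 1/2."""
    p = t * Fraction(Delta - 1, Delta) + r
    assert p < Fraction(1, 2), "when p >= 1/2 the expected game length is infinite"
    return d / (1 - 2 * p)
```

For instance, with $c = t = \frac{1}{2}$ and $r = 0$ on the infinite path ($\Delta = 2$) we get $p = \frac{1}{4}$, so the expected capture time from distance $d$ is $2d$.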

General Matrix Method for Analyzing Markov Processes
In this section we describe how to use various matrix equations to find the critical data points of the game. These include the probability of transitioning from one state to another in exactly $M$ rounds, the probability that the game lasts at least $M$ rounds, and the expected game time. To calculate these values for a finite Markov chain we will use its probability matrix $P$, where $P_{i,j}$ is the probability of transitioning from state $i$ to state $j$ in one round of the Markov process. We will use these matrix methods repeatedly to analyze various families of graphs in the subsequent sections of this paper.

4.1.
Calculating the probability of moving from state $i$ to state $j$ in $M$ rounds. The probability of starting in state $i$ and ending in state $j$ in exactly $M$ moves is
$$P(i,j,M) = e_i P^M e_j,$$
where $e_i$ and $e_j$ are the standard basis row and column vectors associated with state $i$ and state $j$, respectively. For example, for the Markov chain in Figure 7 one computes the probability matrix $P$ of the chain and reads off these probabilities from its powers; several values of $P(i,j,M)$ are displayed in the accompanying table. Note that in this example, state 0 and state 3 are both absorbing states, and when calculating $P(i, a, M)$ for an absorbing state $a$ it is possible that the robber reaches the absorbing state $a$ in $m < M$ moves and then sits there for the remaining $M - m$ moves.
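This matrix-power formula is a one-liner in NumPy (a sketch, not the paper's CoCalc/Sage appendix code; the 4-state chain below is our stand-in for the chain of Figure 7):

```python
import numpy as np

def prob_i_to_j(P, i, j, M):
    """P(i, j, M) = e_i P^M e_j: the (i, j) entry of the M-th matrix power."""
    return np.linalg.matrix_power(P, M)[i, j]

# A 4-state chain with absorbing states 0 and 3 and fair steps in between.
P = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.5, 0.0, 0.5, 0.0],
              [0.0, 0.5, 0.0, 0.5],
              [0.0, 0.0, 0.0, 1.0]])

print(prob_i_to_j(P, 1, 3, 2))  # 0.25: the only length-2 path is 1 -> 2 -> 3
```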

4.2.
Calculating the probability $G_M(d)$ that the game will last at least $M$ rounds. To calculate the probability that the cop is still actively chasing the robber through $M$ rounds of this game with a starting distance $d$ between the players, we restrict attention to the matrix $T = P_{\text{transient}}$ modeling only the non-absorbing states of the Markov chain. Note that in this case, $T$ is the matrix obtained from $P$ by removing the rows and columns associated with any absorbing states in $P$. The probability that the game lasts at least $M$ rounds when the players start at distance $d$ from each other is then given by the following product of matrices:
$$G_M(d) = e_d T^M \mathbf{1},$$
where $e_d$ denotes the standard basis row vector with 1 in column $d$ and zeros elsewhere, and $\mathbf{1}$ is the column vector with 1 in each entry.
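In NumPy this is again one matrix power followed by a row sum (a sketch under our own naming; the transient matrix below is that of a fair two-barrier chain on states 1 and 2):

```python
import numpy as np

def survival_probability(T, d, M):
    """G_M(d) = e_d T^M 1: probability the game is still running after M
    rounds, where row d - 1 of T corresponds to starting distance d."""
    return np.linalg.matrix_power(T, M)[d - 1].sum()

# Transient part of a fair chain with absorbing barriers (states 1 and 2).
T = np.array([[0.0, 0.5],
              [0.5, 0.0]])
print(survival_probability(T, 1, 2))  # 0.25
```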

4.3. Calculating the expectation E(d).
As the number of rounds goes to infinity, the probability that the game is still in progress goes to zero. Summing the probabilities $G_M(d)$ over all $M$ gives the expected number of rounds the game should last. That is,
$$E(d) = \sum_{M=0}^{\infty} e_d T^M \mathbf{1} = e_d (I - T)^{-1} \mathbf{1},$$
where $I$ is the identity matrix.
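The geometric series of matrix powers collapses to the fundamental matrix $(I - T)^{-1}$, which is how we compute $E(d)$ in practice (sketch with our own names, not the appendix code):

```python
import numpy as np

def expected_rounds(T, d):
    """E(d) = e_d (I - T)^{-1} 1 for starting distance d (row d - 1 of T)."""
    fundamental = np.linalg.inv(np.eye(T.shape[0]) - T)
    return fundamental[d - 1].sum()

# Fair gambler's-ruin chain with barriers at 0 and 3: E(d) = d * (3 - d).
T = np.array([[0.0, 0.5],
              [0.5, 0.0]])
print(expected_rounds(T, 1))  # 2.0
```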

Cycle Graphs
The study of cycle graphs breaks down into two families of cases based on whether the number of vertices $n$ is even or odd. We start by analyzing the case when $n$ is even, and then explain how the odd case differs slightly.
For a cycle graph $C_n$ with $n$ even, the states record that the robber and the cop are 0 (the cop catches the robber, or the tipsy robber stumbles into the cop), 1, 2, 3, \ldots, or $n/2$ moves away from each other.
We assume that the cop always moves toward the robber until they are at distance 0. Also, if the players are at distance $n/2$ and the robber is sober, he stays where he is, since he is as far from the cop as possible. The Markov chain for a cycle graph with $n$ nodes has $n/2 + 1$ states, as shown in Figure 8. From this Markov chain we find the probability matrix $P$ to be a tridiagonal $\left(\frac{n}{2}+1\right) \times \left(\frac{n}{2}+1\right)$ matrix of the following form: $P_{1,1} = 1$, $P_{\frac{n}{2}+1,\frac{n}{2}} = c + t$, $P_{\frac{n}{2}+1,\frac{n}{2}+1} = r$, the lower and upper diagonal entries are $P_{k+1,k} = c + \frac{t}{2}$ and $P_{k+1,k+2} = r + \frac{t}{2}$ for $1 \le k \le \frac{n}{2} - 1$, and $P_{i,j} = 0$ otherwise. The transient matrix $T$ is derived by removing the first row and column of $P$, corresponding to the absorbing state; hence $T$ is the $\frac{n}{2} \times \frac{n}{2}$ matrix with the same remaining entries, and $T_{i,j} = 0$ otherwise. When $n$ is odd, the only difference in the Markov chain is that the maximum distance is $\lfloor n/2 \rfloor$, the probability of transitioning from state $\lfloor n/2 \rfloor$ to itself is $r + \frac{t}{2}$, and the probability of transitioning from $\lfloor n/2 \rfloor$ to $\lfloor n/2 \rfloor - 1$ is $c + \frac{t}{2}$, since both the robber and the cop always have a move that changes the distance. We may now use our formulas involving $T$ to find the probability that the game lasts at least $M$ rounds as in Subsection 4.2 and the expected game time as in Subsection 4.3. It is also important to note that, because our model does not guarantee that turns alternate, and this graph is finite with the cop never calling off the chase, $C(d) = 1$ for all values of $d, c, r, t$, and $n$.
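The tridiagonal description above translates directly into code. The following NumPy sketch (our naming, not the appendix code) builds $P$ for an even cycle and strips off the absorbing state to obtain $T$:

```python
import numpy as np

def cycle_transition_matrix(n, c, r, t):
    """Probability matrix P for the game on the even cycle C_n,
    with states 0, 1, ..., n/2 recording the players' distance."""
    assert n % 2 == 0 and abs(c + r + t - 1) < 1e-12
    m = n // 2
    P = np.zeros((m + 1, m + 1))
    P[0, 0] = 1.0                    # capture is absorbing
    for k in range(1, m):            # interior states 1 .. m-1
        P[k, k - 1] = c + t / 2      # distance decreases
        P[k, k + 1] = r + t / 2      # distance increases
    P[m, m - 1] = c + t              # at max distance every tipsy move closes in
    P[m, m] = r                      # a sober robber stays put
    return P

P = cycle_transition_matrix(6, 0.25, 0.25, 0.5)
T = P[1:, 1:]                        # transient matrix: drop absorbing state 0
```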

Numerical Examples.
For a cycle graph with $n = 6$ nodes, the states record that the robber and the cop are 0 (the cop catches the robber, or the tipsy robber stumbles into the cop), 1, 2, or 3 moves away from each other (Figure 9). We assume that the cop always moves toward the robber until they are at distance 0. Also, if the players are at distance 3 and the robber is sober, he stays where he is, since he is as far from the cop as possible. We can then create the Markov chain in Figure 10 for a cycle graph with 6 nodes.
We use a stochastic matrix $P$ (below) to represent the transition probabilities of this system; the rows and columns in this matrix are indexed by the possible states listed above, with the pre-transition state as the row and the post-transition state as the column:
$$P = \begin{pmatrix} 1 & 0 & 0 & 0 \\ c + \frac{t}{2} & 0 & r + \frac{t}{2} & 0 \\ 0 & c + \frac{t}{2} & 0 & r + \frac{t}{2} \\ 0 & 0 & c + t & r \end{pmatrix}.$$
This transition probability matrix $P$ can be restricted to the transient matrix $T$ with absorbing state 0 removed:
$$T = \begin{pmatrix} 0 & r + \frac{t}{2} & 0 \\ c + \frac{t}{2} & 0 & r + \frac{t}{2} \\ 0 & c + t & r \end{pmatrix}.$$
Using our general matrix method as in Section 4 we can solve the following numerical example.

We assume that tipsy moves by either player account for half of all moves, $t = 0.5$, so $c + r = 1 - t = 0.5$. The following table gives the probability that the cop is still chasing the robber after $M = 7$ rounds, provided the cop and the robber start at distance $d = 1$, 2, or 3, as the percentage of sober moves allocated to the cop and the robber varies. The table shows that the game lasts longest, and the chase has the largest probability of still continuing after 7 rounds, when the starting distance is 3 and the robber takes the maximum percentage of sober moves possible. The accompanying CoCalc code for these computations is included in Appendix A.1, so the interested reader can adapt these calculations to model the game for any values of $M, r, c, t$ they choose.

Petersen Graph
The Petersen graph is a vertex-transitive graph in which the cop and robber can only be 0, 1, or 2 moves away from each other (Figure 11).
Figure 11. Distance between cop (blue double circle) and robber (red single circle) on a Petersen graph.
We assume that the cop keeps moving until she eventually captures the robber. Also, if the robber is at distance 2 from the cop and is sober, he stays where he is, since he is as far from the cop as possible. Hence the Markov chain for the Petersen graph is as depicted in Figure 12.
Figure 12. Markov chain for Petersen graph.
We use a stochastic matrix P to represent the transition probabilities of this system. The rows and columns in this matrix are indexed by the possible states listed above, with the pre-transition state as the row and the post-transition state as the column.
Since the Petersen graph is 3-regular with girth 5, a tipsy move by either player from distance 1 lands the players on the same vertex with probability $\frac{1}{3}$ and at distance 2 with probability $\frac{2}{3}$, while a tipsy move from distance 2 brings the players to distance 1 with probability $\frac{1}{3}$ and leaves them at distance 2 with probability $\frac{2}{3}$. This gives
$$P = \begin{pmatrix} 1 & 0 & 0 \\ c + \frac{t}{3} & 0 & r + \frac{2t}{3} \\ 0 & c + \frac{t}{3} & r + \frac{2t}{3} \end{pmatrix}.$$
The absorbing state 0 does not contribute to the survival probabilities, so it can be removed, reducing $P$ to the transient matrix
$$T = \begin{pmatrix} 0 & r + \frac{2t}{3} \\ c + \frac{t}{3} & r + \frac{2t}{3} \end{pmatrix}.$$
We may now use our formulas involving $T$ to find the probability that the game will last at least $M$ rounds as in Subsection 4.2 and the expected number of rounds as in Subsection 4.3.

Numerical Examples.
In the table below, we assume that tipsy moves by either player account for half of all moves, $t = 0.5$, so $c + r = 1 - t = 0.5$. We then calculate the probability that the cop is still chasing the robber after $M = 7$ rounds based on starting distances $d = 1$ or 2 and on the percentage of sober moves allocated to the cop and the robber.
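This computation is a few lines of NumPy (a sketch with our own names, assuming the transition probabilities $c + t/3$ toward capture and $r + 2t/3$ away from it, which follow from the Petersen graph being 3-regular with girth 5):

```python
import numpy as np

def petersen_T(c, r, t):
    """Transient matrix for the Petersen-graph game (states 1 and 2).
    A tipsy move hits the unique 'closing' neighbor with probability 1/3."""
    return np.array([[0.0,       r + 2 * t / 3],
                     [c + t / 3, r + 2 * t / 3]])

T = petersen_T(0.25, 0.25, 0.5)
G7 = np.linalg.matrix_power(T, 7)       # seventh power of T
print(G7[1].sum())                      # chance the chase survives 7 rounds from d = 2
E = np.linalg.inv(np.eye(2) - T).sum(axis=1)  # expected game times E(1), E(2)
```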

We have published the accompanying CoCalc code for these computations in Appendix A.1, so the interested reader can adapt these calculations to model the game for any values of $M, r, c, t$ they choose.

Friendship Graphs
As friendship graphs are not vertex-transitive, we cannot simply model the game on them with a Markov chain whose states are determined by the distance between the cop and robber; we must treat tipsy cop moves and tipsy robber moves separately, since the transition probabilities from state to state depend on which player is moving. Hence, in this section the spinner we use has four distinct possible outcomes satisfying $r + c + t_r + t_c = 1$.
The following notation will be used to model the tipsy cop and tipsy robber game on friendship graphs of $n$ triangles. The states 1e, 1cc, and 1rc all refer to the cop and robber being at distance 1 from each other: 1e means both players are on the same outer edge, 1cc means the cop is at the center, and 1rc means the robber is at the center, as depicted in Figure 13.

Figure 13. All possible states (State 1e, State 2, State 1rc, State 1cc) of friendship graphs, illustrated for n = 2 triangles.
Friendship graphs with n = 2, 3, and 4 triangles are shown in Figure 14. Since the cop never calls off the chase on this finite graph, the cop will eventually win, $C(d) = 1$, even if the robber is completely sober throughout.

Toroidal Grids
A toroidal grid is the Cartesian product of two cycle graphs $C_m \square C_n$ and represents a grid that can be embedded on the surface of a torus, as shown in Figure 16. Since our model does not guarantee that turns alternate, and the cop never calls off the chase, the cop will eventually win, $C(d) = 1$, on any toroidal grid.

Modeling the game where tipsiness is a function of time

Harris et al. asked how the game changes when the players sober up as the game progresses [5, Question 6.1]. In this section, we model the game where both players begin completely drunk and sober up as time passes. We define the probability of a tipsy move by either player as a function of the number $m$ of rounds that have passed, $t = f(m)$. Additionally, we set the proportion of sober cop moves to be $c = \frac{a}{b}(1-t)$ and the proportion of sober robber moves to be $r = \frac{b-a}{b}(1-t)$, where $a$ and $b$ are any desired integers that determine the proportion of sober moves assigned to the cop and to the robber. As we assume the players sober up as time passes, we choose a function $f$ that is decreasing in $m$, with $f(0) = 1$ and $\lim_{m \to \infty} f(m) = 0$. With these assumptions, and writing $T_m$ for the transient matrix of round $m$, the probability of surviving $M$ rounds, given an initial state $d$, is given by
$$G_M(d) = e_d \left( \prod_{m=1}^{M} T_m \right) \mathbf{1},$$
and the expected game time is given by the series
$$E(d) = \sum_{M=0}^{\infty} e_d \left( \prod_{m=1}^{M} T_m \right) \mathbf{1}.$$
We conclude this section with some sample calculations in tables. The code for the calculations is attached in Appendix A.3, so the interested reader can adapt these calculations to model the game for any values of $a$, $b$, $t = f(m)$, or $N$ they choose.

9.1. Numerical Examples. Assume the game is played on the cycle graph of six nodes in Figure 9, with accompanying Markov chain as in Figure 10. Expected game times (sums calculated to $N = 500$, or $N = 1000$ where marked): 4.093, 4.513, 5.120, 6.062, 7.659, 10.70, 17.46, 36.36, 114.5, 566.5, $\infty$. The expected game time for 80% was also calculated at $N = 3000$ and found to agree in at least the first five digits; it seems that only at 90% and above is $N = 1000$ not sufficiently large. The expected game time for 70% was also calculated at $N = 1000$ and found to agree in at least five digits.
However, we are certain that $N = 1000$ is not sufficiently large for the expected game time at 90%, yet attempting to calculate the sum with any larger $N$ causes the CoCalc server to time out.
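The time-dependent survival probability is just a product of round-indexed transient matrices. The following NumPy sketch is ours (names and the sobering function $f(m) = 1/(m+1)$ are assumptions, not the appendix code), using the six-node cycle chain from Section 5:

```python
import numpy as np

def cycle6_T(c, r, t):
    """Transient matrix for the 6-node cycle (states 1, 2, 3)."""
    return np.array([[0.0,       r + t / 2, 0.0      ],
                     [c + t / 2, 0.0,       r + t / 2],
                     [0.0,       c + t,     r        ]])

def survival_prob(d, M, a, b, f):
    """G_M(d) = e_d (T_1 T_2 ... T_M) 1, with tipsiness t = f(m) in round m."""
    prod = np.eye(3)
    for m in range(1, M + 1):
        t = f(m)
        c = (a / b) * (1 - t)          # cop's share of the sober moves
        r = ((b - a) / b) * (1 - t)    # robber's share of the sober moves
        prod = prod @ cycle6_T(c, r, t)
    return prod[d - 1].sum()

f = lambda m: 1 / (m + 1)              # assumed sobering function with f(0) = 1
print(survival_prob(3, 7, 1, 2, f))    # survival through 7 rounds from d = 3
```

Truncating the series for $E(d)$ at a large $N$ then amounts to summing these products for $M = 0, \dots, N$.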

Modeling the game where tipsiness is a function of distance
Noting that it may be more biologically realistic, Harris et al. also asked how to model a tipsy cop and drunken robber game where the players' tipsiness is determined by the distance between them [5, Question 6.6]. In this section, we model the tipsy cop and robber game where the players sober up as they get closer to each other. This models the scenario where the cop's ability to track the robber improves the closer they get, and the robber, sensing that the cop is on his trail, moves more deliberately. The transient matrix $T_\delta$ represents the stochastic matrix in which both the cop and robber sober up as they get closer to each other and get more tipsy as the distance between the two increases, so the tipsiness at distance $d$ is $t_d = \delta(d)$. If $n$ is the maximum distance the two can be apart on a finite graph (or the distance at which the cop calls off the hunt), then we assume that the function $\delta(d)$ is non-decreasing in $d$, with $\delta(1) = 0$ so that the players are completely sober when adjacent, and $\delta(d) \le 1$ for all $d \le n$. Additionally, we set the proportion of sober cop moves to be $c = \frac{a}{b}(1-t)$ and the proportion of sober robber moves to be $r = \frac{b-a}{b}(1-t)$, where $a$ and $b$ are any desired integers that determine what fraction of all sober moves is assigned to the cop and to the robber.
10.1. Linear Increase of Tipsiness. In this scenario, we choose to use the function $\delta(d) = \frac{d-1}{n}$, where $d$ is the distance between the cop and the robber and $n$ is the maximum distance they can be apart; this value $n$ is specified to be either the radius of the graph on finite graphs, or the maximum specified distance before the cop calls off the chase on an infinite graph. The probability of the game lasting at least $M$ rounds when starting in state $d$ is
$$G_M(d) = e_d T_\delta^M \mathbf{1}.$$
10.2. Exponential Increase of Tipsiness. Now, we choose a function $\delta(d)$ that grows exponentially in the distance $d$ between the cop and the robber. The probability of the game lasting through $M$ rounds is again given by
$$G_M(d) = e_d T_\delta^M \mathbf{1},$$
and the expected game time by
$$E(d) = e_d (I - T_\delta)^{-1} \mathbf{1}.$$
We conclude this section with some sample calculations in tables. We have published the accompanying CoCalc code for these computations in Appendix A.5, so the interested reader can adapt these calculations to model the game for any values of $n, m, d, r$ they choose.
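Because $t_d$ depends only on the current distance, $T_\delta$ is a fixed matrix and the Section 4 formulas apply unchanged. A NumPy sketch for an even cycle (our naming, with the linear $\delta(d) = (d-1)/n$ above; not the appendix code):

```python
import numpy as np

def T_delta(n_max, a, b, delta):
    """Transient matrix for an even cycle whose tipsiness t_d = delta(d)
    depends on the current distance d (states 1 .. n_max)."""
    T = np.zeros((n_max, n_max))
    for d in range(1, n_max + 1):
        t = delta(d)
        c = (a / b) * (1 - t)           # cop's share of the sober moves
        r = ((b - a) / b) * (1 - t)     # robber's share of the sober moves
        if d == n_max:
            T[d - 1, d - 2] = c + t     # every tipsy move closes the gap
            T[d - 1, d - 1] = r         # a sober robber stays put
        else:
            if d > 1:                   # the closing move from d = 1 is capture
                T[d - 1, d - 2] = c + t / 2
            T[d - 1, d] = r + t / 2
    return T

linear = lambda d: (d - 1) / 5          # linear delta for max distance 5
T = T_delta(5, 1, 2, linear)
E = np.linalg.inv(np.eye(5) - T).sum(axis=1)  # expected game times E(1)..E(5)
```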
10.3. Numerical Example on Cycle Graph. Given a cycle graph of $n = 10$ nodes, the game can start at distances 0 to 5 away, with maximum distance $n/2 = 5$. The accompanying Markov chain is the one from Section 5 for even cycles (again we assume that the cop always moves and the sober robber chooses not to move when the players are farthest apart). From the Markov chain we obtain the distance-dependent transient matrix $T_\delta$. Based on the results in the tables, the chase is longer under exponentially increasing tipsiness when the percentage of all sober moves $(1 - t_d)$ given to the robber is less than 50%, and longer under linearly increasing tipsiness when that percentage is greater than 50%. Assuming the robber is given the choice, he should pick a strategy based on the percentage of sober moves assigned to him: if he receives more than 50% of the sober moves, his tipsiness should increase linearly; otherwise (50% or less of all sober moves) he has a better chance of surviving if the tipsiness increases exponentially.
10.4. Infinite Regular Trees. If we play the game on an infinite regular tree, and the cop calls off the hunt if the robber reaches a distance of $n$ nodes away from the cop, then the matrix $P$ is an $(n+1) \times (n+1)$ tridiagonal matrix with $P_{1,1} = 1$, $P_{n+1,n+1} = 1$, lower and upper diagonal entries $P_{k+1,k} = c_{k+1} + \frac{t_{k+1}}{\Delta}$ and $P_{k+1,k+2} = t_{k+1}\frac{\Delta-1}{\Delta} + r_{k+1}$ for $1 \le k \le n-1$, and $P_{i,j} = 0$ otherwise. To obtain the matrix $T$ we remove from $P$ the first and last rows and columns, corresponding to the two absorbing states.
Figure 18. Markov chain for regular tree with maximum distance of n = 5.
For example, on an infinite regular tree of degree $\Delta$ with maximum specified distance $n = 5$, the accompanying Markov chain is shown in Figure 18, and from it we obtain the corresponding transient matrix. Note that in this case we remove states 0 and 5 because they are absorbing. As in the numerical example on cycle graphs, we specify what percentage of all sober moves $(1 - t_d)$ is given to the robber.
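The same construction for the tree can be sketched as follows (our naming; the entries follow the tridiagonal description above, with an assumed linear $\delta$):

```python
import numpy as np

def tree_T(n, Delta, a, b, delta):
    """Transient matrix (states 1 .. n-1) for the Delta-regular tree game
    with distance-dependent tipsiness and absorbing states 0 and n."""
    T = np.zeros((n - 1, n - 1))
    for k in range(1, n):
        t = delta(k)
        c = (a / b) * (1 - t)                  # cop's share of the sober moves
        r = ((b - a) / b) * (1 - t)            # robber's share of the sober moves
        down = c + t / Delta                   # distance decreases
        up = t * (Delta - 1) / Delta + r       # distance increases
        if k > 1:
            T[k - 1, k - 2] = down             # the down-move from k = 1 is capture
        if k < n - 1:
            T[k - 1, k] = up                   # the up-move from k = n-1 is escape
    return T

T = tree_T(5, 3, 1, 2, lambda d: (d - 1) / 5)
E = np.linalg.inv(np.eye(4) - T).sum(axis=1)   # expected game lengths E(1)..E(4)
```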

Future Directions and Open Questions
We end this paper by pointing out a few possible areas of future study and posing a few open questions. Modeling the game on infinite grids is considerably more difficult than on infinite regular trees, because there are multiple states for a given distance, and it is not immediately clear what strategy the cop should use to decrease the distance or the robber should use to increase it.
(1) On an infinite grid, what are the optimal strategies for the cop and robber?