### Paper Menu >>

### Journal Menu >>

Journal of Information Security, 2010, 1, 41-44 doi:10.4236/jis.2010.11005 Published Online July 2010 (http://www.SciRP.org/journal/jis) Copyright © 2010 SciRes. JIS Game Theory Based Network Security Yi Luo1, Ferenc Szidarovszky1, Youssif Al-Nashif2, Salim Hariri2 1Department of Systems and Industrial Engineering, University of Arizona, Tucson, USA 2Department of Electrical and Computer Engineering, University of Arizona, Tucson, USA E-mail: Luo1@email.arizona.edu, szidar@sie.arizona.edu, {alnashif, hariri}@ece.arizona.edu Received March 9, 2010; revised July 9, 2010; accepted July 20, 2010 Abstract The interactions between attackers and network administrator are modeled as a non-cooperative non-zero-sum dynamic game with incomplete information, which considers the uncertainty and the special properties of multi-stage attacks. The model is a Fictitious Play approach along a special game tree when the attacker is the leader and the administrator is the follower. Multi-objective optimization methodology is used to predict the attacker’s best actions at each decision node. The administrator also keeps tracking the attacker’s actions and updates his knowledge on the attacker’s behavior and objectives after each detected attack, and uses it to update the prediction of the attacker’s future actions. Instead of searching the entire game tree, appropriate time horizons are dynamically determined to reduce the size of the game tree, leading to a new, fast, adaptive learning algorithm. Numerical experiments show that our algorithm has a significant reduction in the damage of the network and it is also more efficient than other existing algorithms. Keywords: Multi-Stage Attack, Dynamic Game, Multi-Objective Optimization, Adaptive Learning 1. Introduction The increased dependence on networked applications and services makes network security an important re- search problem. Detection of intrusions and the protec- tion of the networks against attacks is the central issue. Game theory is an appropriate methodology to model the interactions between attackers and network administra- tor and to determine the best countermeasure strategy against attacks. There are however some difficulties in directly applying classical game theory, since the attack- ers’ strategies are uncertain, their steps are not instanta- neous, the rules of the games might change in time, and so on. Therefore any game theory based methodology has to take these difficulties into account. There are many types of intrusions. Multi-stage at- tacks are the most destructive and most difficult kinds for any defense system. They use intelligence to strategically compromise the targets in a planned sequence of actions, so the usual methodology designed to protect against single-stage attacks cannot be used. Network intrusion response mechanisms have been intensively developed and studied in recent years. Sev- eral authors used Markov Games (MG) as a model and methodology. Lye and Wing [1] viewed the interactions between the attacker and the administrator as a two-player Markov game and modeled it by an intrusion graph. The recovery effort was considered as the cost of the re- sponse. The payoff for the attacker was defined by the amount of effort the administrator needed to spend in order to bring the network back to normal state. The equilibrium was obtained by using nonlinear program- ming and dynamic programming for infinite and finite horizon games, respectively. The main disadvantage of this approach is the huge size of the state space which makes extremely difficult to compute the equilibria. Shen et al. [2] used a piecewise linearized Markov game model with estimated beliefs of the possible cyber attack patterns obtained by data fusion and adaptive control. They also recognized that larger time-step horizons result in increased computation complexity. Another approach is based on Partially Observable Markov Decision Proc- esses (POMDP) which results in more complex compu- tation problems. Carin et al. [3] introduced the protection map and used reverse-engineering methodology to build an attack graph. Zhang and Ho [4] presented a model to characterize multi-stage collusive attacks in terms of key spatio-temporal properties. The attacker’s behavior was modeled as a reward-directed partially observable Markov The research presented in this paper is supported in part by National Science Foundation via grants numbers CNS-0615323 and IIP-0758579, and by Electronic Warfare Associates, Inc. via the grant number ewagsi-07-LS-0002, and it is conducted as part of the NSF Center for Autonomic Computing at University of Arizona. Y. LUO ET AL. Copyright © 2010 SciRes. JIS 42 decision process and the administrator was assisted by identifying the potential causal relationships between the different system vulnerabilities. This approach also suf- fered from serious computation difficulties because of the very large state space. Liu et al. [5] introduced a dy- namic game approach based on modeling the Attacker Intent, Objectives and Strategies (AIOS) which resulted in a much smaller state space, so this approach is more efficient than the application of Markovian games. A similar concept was applied in Siever et al. [6] for the security of electric power transmission grids, when the goals of the attacker were used in formulating the at- tacker’s game, which optimizes the difference of its re- ward and the amount of power delivered. The objective function of the defender is the sum of the amount of de- livered power and a special reward function. The ap- proach introduced by Luo et al. [7] was based on POMDP, however a reduced special game tree structure was used and a new stochastic multi-stage defense algo- rithm was developed. All models and algorithms are based on assessing all damages and costs of the cyber attacks. The uncertainty in the knowledge of the network, its vulnerabilities, pos- sible actions and counteractions, damages and costs, etc. make the mathematical modeling more complicated. In the economic literature this issue has been known and treated by the deterministic equivalent, which is a linear combination of the expected value ( ) and variance (2 ) of a random outcome: 2 , where shows the level of willingness of the decision maker to take risk. Almost all models of multi-stage attacks are based on special game trees. It is well-known from the game the- ory literature (see for example, Forgo et al., [8]) that such games with full information always have at least one Nash equilibrium, which can be computed by using backward induction. This general result however cannot be used in computer network security, since the game tree and the possible strategies of the players are not completely known by all participants. The administrator and the attacker might believe in different game trees with different possible actions. 2. Consequence Modeling We have adopted the approach given in Richardson and Chavez [9]. The consequence of any attack and any ac- tion during a multi-stage attack is based on the following six steps: 1) Define the categories of impact; 2) State the importance of each category relative to the others; 3) Define the measures of impact for each category; 4) Define the relationships between physical effects and impact measures; 5) Define the system and its users; 6) Define the events in terms of scales and network system impact. Impact categories include and not restricted to eco- nomic, image, safety, security, intelligence, and privacy concerns. Their relative importance factors can be as- sessed by any one of the well known procedures from multi-criteria decision making (see for example, Szi- darovszky et al., [10]). A common approach of obtaining the weights is based on pair-wise comparisons, when all participants in the decision making process are asked to give relative importance factors for all pairs of categories. Then the results are summarized into a final set of im- portance weights either by averaging them or by using the Analytic Hierarchy Process (AHP). The measures of the impact in different categories are usually given in different units, and they can be combined by using multi-attribute utility theory or weighting method with normalized evaluations. Performance measures can be defined for the impact categories, and each performance measure can be divided into a set of constructed scales representing the amount of impact the physical conse- quences have on the network and its users including lost revenue, repair and/or replacement cost, damage by lost or stolen information, etc. Any actual attack has impacts on different categories with different levels. Using the consequence modeling tool, the overall consequence of the different types and scales of events on the system and its users can be assessed into one combined value. This value has to be computed at all states of the multistage attack and will be used in the game tree analysis. 3. Game Tree and Decision Nodes Multi-stage attacks are represented by special game trees. Figure 1 shows the first two interactions on a game tree. The attacker is the leader, the administrator is the fol- lower. The root of the tree is the initial decision node of the attacker, and the possible initial moves of the attacker are represented by the arcs originating at the root. These actions might include attacking the server with different intensity levels, sending a virus to a group of customers, etc. At the end point of each arc the administrator has to Figure 1. Special game tree. Y. LUO ET AL. Copyright © 2010 SciRes. JIS 43 respond, so they are its decision nodes. After the admin- istrator’s response the attacker makes the next move, and so on. This tree continuous until the intruder gives up the attack or reaches its goals. This tree can become very large and the payoff values at the decision nodes are un- certain, therefore the classic method, known as backward induction, cannot be used in this case. 4. Determining Optimal Responses The algorithm to be described in this section is an on-line procedure, it provides the best response of the adminis- trator at each of its decision nodes, when a multi-stage attack reaches that particular node of the game tree. So during an attack the algorithm can be used at each stage to find the best next move of the administrator starting just after the first action of the attacker and continuing until the end of the game. Consider now a particular decision node of the admin- istrator and the sub-game tree having this node as its root. The time horizon for this sub-game tree is obtained as follows. We have to check all end points of this sub-game tree where the attacker reaches its goals during smallest number of steps, so we select the shortest path with smallest number of arcs from the root to such end points. The length of the shortest path is the time horizon, and then all paths starting at the root will be considered only until this time horizon. The utility function of the at- tacker is then assessed at all endpoints of this truncated sub-game tree. The utility function is the linear combina- tion of the expected payoff of the attacker and its vari- ance as it was explained earlier. The risk taking coeffi- cient of the attacker can be updated after each attack, since the administrator has estimates of the expectations and the variances of its utility values for all possible moves and also observes the actual move. The adminis- trator then has to assess the probability distributions of the attacker’s actions at all of its decision nodes. The probability values are computed based on the assessed utility values of the intruder as well as previous interac- tions with the attacker. First the probability values are computed proportionally to the utility values at the end- points of the different arcs representing the next moves of the attacker, and if previous interactions provide rela- tive frequencies, they are averaged with the computed probabilities. Using these probability distributions the expectation and the variance of the cumulative impact up to the time horizon for the administrator can be com- puted for each of its possible responses, and the corre- sponding utility values are obtained by combining ex- pectations and variances with the risk acceptance coeffi- cient of the administrator. The best response of the ad- ministrator is the arc which has its highest utility value. The attacker makes the first move. At the end point of the corresponding arc the administrator has to respond. Using the above procedure the administrator finds its response. Then the attacker makes the next step, and the best next response of the administrator is obtained again by using the same algorithm with updated data based on the information obtained from the previous actions of the attacker. Then the attacker has the next move, the ad- ministrator responds by using the same algorithm, and so on until the game ends, which occurs when the attacker stops attacking by reaching its goals or giving up. 5. Numerical Example Figure 2 shows a network structure. It is assumed that the HTTP server, Database 2, the FTP1 server and the information in the CEO are the vulnerable components in the network system, and access to the information in the CEO is the attacker’s objective. It is also assumed that the CEO needs services provided by the HTTP server, Database 2 and the FTP1 server to do its jobs. The at- tacker can launch multi-stage attacks to obtain the in- formation from the CEO in many different ways. Then the administrator can respond to it by selecting from a set of options, and so on, which leads to the game tree. Next we assume that in addition to the sensitive data in the CEO the data in the Accounting is another vulner- ability of the system. The attacker has now two objec- tives: the information in CEO and the data in Accounting. The Accounting also needs services provided by Data- base 2 and the HTTP server, etc. Our computer study assumed that the attacker always selects the action lead- ing to maximal impact, and the administrator always selects its best action at its decision nodes by using one of the three tested algorithms. We applied three methods to find the best responses of the administrator: One is a greedy algorithm (GA) in which the administrator completely blocks the traffic of Figure 2. Network structure. Y. LUO ET AL. Copyright © 2010 SciRes. JIS 44 Table 1. The performances of the three algorithms. Administrator The total losses of the system occurred during the life-cycle of the multi-stage attacks GA Algorithm SO Algorithm Our Algorithm Single Objective 4,694 3,608 2,781 Risk Seeking Two Objective 9,597 6,741 4,689 Single Objective 4,694 3,176 2,252 Attacker Risk Neutral Two Objective 9,597 6,325 4,021 corresponding services on routers, firewall, or disconnect the machines using managed switches, etc. regardless of what kind of attack occurs or what is the intensity levels of the attack. Another algorithm is also myopic, sin- gle-interaction optimization algorithm (SO) in which the administrator tries to minimize the loss from the most current attack at each interaction without considering future interactions with the attacker. The third algorithm is the one we developed. The results are shown Table 1. Two types of attacks were assumed. The risk seeking attacker worried about only the expectation of the impact (0 ), while the risk neutral intruder selected a rela- tively high risk taking coefficient (1 ). The two sce- narios refer to the cases of one or two objectives of the attacker. The last three columns indicate the three meth- ods which were used for comparison. The numbers in the last three columns of the table show the total losses of the system with using different methods. Clearly our method resulted in the smallest overall losses in all cases, where the loss reduction was 41%, 51%, 52% and 58% in comparison to the Greedy Algorithm, and 23%, 30%, 29% and 36% in comparison to single-interaction opti- mization. In assessing the numerical values of the im- pacts in the consequence analysis, we used only eco- nomic impact. A more complex consequence analysis would not alter the main steps of the algorithms. 6. Conclusions This paper introduced a multi-stage intrusion defense system, where the interactions between the attacker and the administrator are modeled as a two-player non-co- operative non-zero-sum dynamic game with incomplete information. The two players conduct Fictitious Play along the game tree, which can help the administrator to find quickly the best strategies to defend against attacks launched by different types of attackers. Our algorithm is an online procedure, which gives the most appropriate response of the administrator at any stage of the game. So it has to be repeated at all actual decision nodes of the administrator. Our algorithm is different than the usual methods based on decision trees, since at each step only a finite horizon is considered, instead of expected out- comes certain equivalents are used and the probabilities of the different arcs are continuously updated based on new information. In our numerical example our approach was compared to two other algorithms and the total net- work losses were compared. The loss reduction by using our approach varied between 23% and 58%. The per- formance of our algorithm is much better than that of other algorithms based on the results of our numerical experiments. 7. References [1] K. Lye and J. Wing, “Game Strategies in Network Secu- rity,” International Journal of Information Security, Vol. 4, No. 1-2, 2005, pp. 71-86. [2] D. Shen, G. Chen, E. Blasch and G. Tadda, “Adaptive Markov Game Theoretic Data Fusion Approach for Cy- ber Network Defense,” IEEE Military Communications Conference (MILCOM 2007), Orlando, 2007. [3] L. Carin, G. Cybenko and J. Hughes, “Cybersecurity Strategies: The QuERIES Methoddology,” Computer, Vol. 41, No. 8, 2008, pp. 20-26. [4] Z. Zhang and P. Ho, “Janus: A Dual-Purpose Analytical Model for Understanding, Characterizing and Counter- mining Multi-Stage Collusive Attacks in Enterprise Net- works,” Journal of Network and Computer Applications, Vol. 32, No. 3, 2009, pp. 710-720. [5] P. Liu and W. Zang, “Incentive-Based Modeling and Inference of Attack Intent, Objectives, and Strategies,” CCS’03, Washington, DC., 2003. [6] W. M. Siever, A. Miller and D. R. Tauritz, “Blueprint for Iteratively Hardening Power Grids Employing Unified Power Flow Controllers,” SoSE’07, IEEE International Conference on System of Systems Engineering, Tampa, 2007. [7] Y. Luo, F. Szidarovszky, Y. Al-Nashif and S. Hariri, “Game Tree Based Partially Observable Stochastic Game Model for Intrusion Defense Systems (IDS),” IIE Annual Conference and EXPO (IERC 2009), Miami, 2009. [8] F. Forgo, J. Szep and F. Szidarovszky, “Introduction to the Theory of Games,” Kluwer Academic Publishers, Dordrecht, 1999. [9] B. T. Richardson and L. Chavez, “National SCADA Test Bed Consequence Modeling Tool,” Sandia National La- boratory Report, SAND2008-6098, Albuquerque, 2008. [10] F. Szidarovszky, M. Gershon and L. Duckstein, “Tech- niques for Multiobjective Decision Making in Systems Management,” Elsevier, Amsterdam, 1986. |