Multi-Agent Strategic Confrontation Game via Alternating Markov Decision Process Based Double Deep Q-Learning
ABSTRACT
To provide quantitative analysis of strategic confrontation games such as cross-border trade disputes (e.g., tariff conflicts) and competitive scenarios (e.g., auction bidding), we propose an alternating Markov decision process (AMDP) based approach for modeling sequential decision-making behaviors in competitive multi-agent confrontation systems. Unlike the traditional Markov decision process, which is typically applied to single-agent systems, the proposed AMDP effectively captures the sequential and interdependent decision-making dynamics of complex multi-agent confrontation environments. To address the high-dimensional uncertainty arising from the continuous decision space, we integrate double deep Q-network (DDQN) learning into the AMDP framework, yielding the proposed AMDP-DDQN approach. This integration enables agents to learn their respective optimal strategies in an unsupervised manner, approximately solving the optimal-policy problem and thereby improving decision quality in strategic confrontation tasks. The proposed AMDP-DDQN method not only accurately predicts confrontation outcomes under sequential decision-making, but also provides dynamic, data-driven decision support that allows agents to adjust their strategies in response to evolving adversarial conditions. Experimental results on a strategic confrontation scenario between two countries with differing security, economic, technological, and administrative conditions demonstrate the effectiveness of the proposed approach.
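To make the DDQN ingredient of the abstract concrete, the sketch below shows the double-Q target computation: the online value function selects the greedy next action while a separate target value function evaluates it, which reduces the overestimation bias of vanilla Q-learning. This is a minimal, generic illustration, not the paper's implementation; tabular value arrays stand in for the deep networks, and all names (`q_online`, `q_target`, `gamma`, the state/action sizes) are illustrative assumptions. In the alternating (AMDP) setting, this target would be computed for whichever agent holds the current turn.

```python
import numpy as np

# Illustrative double-Q target for the agent currently moving in an
# alternating two-agent game. Tabular Q-values stand in for the DDQN
# networks; all sizes and names below are assumptions for the sketch.
rng = np.random.default_rng(0)
n_states, n_actions = 4, 3
gamma = 0.9  # discount factor (assumed)

q_online = rng.normal(size=(n_states, n_actions))  # selects actions
q_target = rng.normal(size=(n_states, n_actions))  # evaluates them

def double_q_target(reward, next_state, done):
    """Double-Q target: the online table picks the greedy next action,
    the target table scores it (selection and evaluation decoupled)."""
    if done:
        return reward
    a_star = int(np.argmax(q_online[next_state]))        # selection
    return reward + gamma * q_target[next_state, a_star]  # evaluation

# Example transition: reward 1.0 into state 2, episode continues.
y = double_q_target(1.0, next_state=2, done=False)
```

In a deep variant, `q_online` and `q_target` would be two copies of the same network, with the target copy periodically synchronized to the online weights.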
Share and Cite:
Feng, S. and Wei, C. (2025) Multi-Agent Strategic Confrontation Game via Alternating Markov Decision Process Based Double Deep Q-Learning. Journal of Computer and Communications, 13, 67-93. doi: 10.4236/jcc.2025.137004.