Multi-Agent Strategic Confrontation Game via Alternating Markov Decision Process Based Double Deep Q-Learning
ABSTRACT
To provide quantitative analysis of strategic confrontation games such as cross-border trade disputes (e.g., tariff conflicts) and competitive scenarios (e.g., auction bidding), we propose an alternating Markov decision process (AMDP) based approach for modeling sequential decision-making behaviors in competitive multi-agent confrontation systems. Unlike the traditional Markov decision process, which is typically applied to single-agent systems, the proposed AMDP effectively captures the sequential and interdependent decision-making dynamics of complex multi-agent confrontation environments. To address the high-dimensional uncertainty arising from the continuous decision space, we integrate double deep Q-network (DDQN) learning into the AMDP framework, yielding the proposed AMDP-DDQN approach. This integration enables agents to learn their respective optimal strategies in an unsupervised manner, approximately solving the optimal-policy problem and thereby improving decision quality in strategic confrontation tasks. The proposed AMDP-DDQN method not only accurately predicts confrontation outcomes under sequential decision-making, but also provides dynamic, data-driven decision support that allows agents to adjust their strategies in response to evolving adversarial conditions. Experimental results on a strategic confrontation scenario between two countries with differing security, economic, technological, and administrative conditions demonstrate the effectiveness of the proposed approach.
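To make the DDQN ingredient of the abstract concrete, the sketch below shows the double-Q target computation: the online value function selects the greedy next action while a separate target value function evaluates it, which reduces the overestimation bias of vanilla Q-learning. This is a minimal, generic illustration, not the paper's implementation; tabular value arrays stand in for the deep networks, and all names (`q_online`, `q_target`, `gamma`, the state/action sizes) are illustrative assumptions. In the alternating (AMDP) setting, this target would be computed for whichever agent holds the current turn.

```python
import numpy as np

# Illustrative double-Q target for the agent currently moving in an
# alternating two-agent game. Tabular Q-values stand in for the DDQN
# networks; all sizes and names below are assumptions for the sketch.
rng = np.random.default_rng(0)
n_states, n_actions = 4, 3
gamma = 0.9  # discount factor (assumed)

q_online = rng.normal(size=(n_states, n_actions))  # selects actions
q_target = rng.normal(size=(n_states, n_actions))  # evaluates them

def double_q_target(reward, next_state, done):
    """Double-Q target: the online table picks the greedy next action,
    the target table scores it (selection and evaluation decoupled)."""
    if done:
        return reward
    a_star = int(np.argmax(q_online[next_state]))        # selection
    return reward + gamma * q_target[next_state, a_star]  # evaluation

# Example transition: reward 1.0 into state 2, episode continues.
y = double_q_target(1.0, next_state=2, done=False)
```

In a deep variant, `q_online` and `q_target` would be two copies of the same network, with the target copy periodically synchronized to the online weights.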
Share and Cite:
Feng, S. and Wei, C. (2025) Multi-Agent Strategic Confrontation Game via Alternating Markov Decision Process Based Double Deep Q-Learning. Journal of Computer and Communications, 13, 67-93. doi: 10.4236/jcc.2025.137004.