Prediction Distortion in Monte Carlo Tree Search and an Improved Algorithm - Journal of Intelligent Learning Systems and Applications

Journal of Intelligent Learning Systems and Applications

Volume 10, Issue 2 (May 2018)

ISSN Print: 2150-8402 ISSN Online: 2150-8410

Google-based Impact Factor: 1.5

Prediction Distortion in Monte Carlo Tree Search and an Improved Algorithm

PP. 46-79

DOI: 10.4236/jilsa.2018.102004

Author(s)

William Li

Affiliation(s)

Delbarton School, Morristown, NJ, USA.

ABSTRACT

Teaching computer programs to play games through machine learning has been an important way to achieve better artificial intelligence (AI) in a variety of real-world applications. Monte Carlo Tree Search (MCTS) is one of the key AI techniques developed recently that enabled AlphaGo to defeat a legendary professional Go player. What makes MCTS particularly attractive is that it requires only the basic rules of the game and does not rely on expert-level knowledge. Researchers thus expect that MCTS can be applied to other complex AI problems where domain-specific expert-level knowledge is not yet available. So far, there are very few analytic studies of MCTS in the literature. In this paper, our goal is to develop analytic studies of MCTS to build a more fundamental understanding of the algorithm and its applicability to complex AI problems. We start with a simple version of MCTS, called random playout search (RPS), to play Tic-Tac-Toe, and find that RPS may fail to discover the correct moves even in a very simple Tic-Tac-Toe position. Both probability analysis and simulation confirm this finding. We continue our study with the full version of MCTS to play Gomoku and find that, while MCTS has shown great success in more sophisticated games like Go, it is not effective at addressing the problem of sudden death/win. The main reason MCTS often fails to detect sudden death/win lies in its random playout search nature, which leads to prediction distortion. Although MCTS in theory converges to the optimal minimax search, under real-world computational resource constraints it has to rely on RPS as an important step in its search process, and therefore suffers from the same fundamental prediction distortion problem as RPS. By examining the detailed statistics of the scores in MCTS, we investigate a variety of scenarios where MCTS fails to detect sudden death/win.
Finally, we propose an improved MCTS algorithm that incorporates minimax search to overcome prediction distortion. Our simulation confirms the effectiveness of the proposed algorithm. We provide an estimate of the additional computational cost this new algorithm incurs to detect sudden death/win and discuss heuristic strategies to further reduce the search complexity.
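
To make the two ideas in the abstract concrete, the following is a minimal Python sketch, not the paper's implementation. It assumes Tic-Tac-Toe on a 9-cell list, a +1/0/-1 score for win/draw/loss, and a shallow two-ply minimax check standing in for the minimax component; the paper's actual algorithm, scoring, and search depth may differ. All function names (`rps_scores`, `guarded_choice`, and so on) are hypothetical.

```python
import random

# Winning lines on a 3x3 board whose cells are indexed 0..8, row by row.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return 'X' or 'O' if some line is complete, else None."""
    for a, b, c in LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None

def legal_moves(board):
    return [i for i, cell in enumerate(board) if cell == ' ']

def other(player):
    return 'O' if player == 'X' else 'X'

def place(board, move, player):
    """Return a copy of the board with `player` placed at `move`."""
    child = list(board)
    child[move] = player
    return child

def random_playout(board, to_move, rng):
    """Play uniformly random moves to the end; return the winner or None."""
    board = list(board)
    while winner(board) is None and legal_moves(board):
        board[rng.choice(legal_moves(board))] = to_move
        to_move = other(to_move)
    return winner(board)

def rps_scores(board, player, playouts, rng):
    """Random playout search: mean playout score for each legal move."""
    scores = {}
    for move in legal_moves(board):
        child = place(board, move, player)
        total = 0
        for _ in range(playouts):
            result = random_playout(child, other(player), rng)
            if result is not None:
                total += 1 if result == player else -1
        scores[move] = total / playouts
    return scores

def immediate_wins(board, player):
    """Moves that complete a line for `player` right now."""
    return [m for m in legal_moves(board)
            if winner(place(board, m, player)) == player]

def guarded_choice(board, player, playouts, rng):
    """Assumed two-ply guard: take a sudden win, avoid sudden death, else RPS."""
    wins = immediate_wins(board, player)
    if wins:
        return wins[0]
    scores = rps_scores(board, player, playouts, rng)
    # A move is unsafe if the opponent can win immediately after it.
    safe = [m for m in scores
            if not immediate_wins(place(board, m, player), other(player))]
    candidates = safe or list(scores)
    return max(candidates, key=scores.get)
```

Calling `rps_scores(board, 'X', 1000, random.Random(0))` on a position of interest shows the averaged playout scores that RPS would rank moves by, while `guarded_choice` overrides that ranking only when the shallow check finds a forced win or loss. The guard costs one extra pass over the opponent's replies per candidate move, which is the kind of added search cost the abstract refers to.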

KEYWORDS

Monte Carlo Tree Search, Minimax Search, Board Games, Artificial

Share and Cite:

Li, W. (2018) Prediction Distortion in Monte Carlo Tree Search and an Improved Algorithm. Journal of Intelligent Learning Systems and Applications, 10, 46-79. doi: 10.4236/jilsa.2018.102004.
