Topological Order Value Iteration Algorithm for Solving Probabilistic Planning

Xiaofei Liu; Mingjie Li; Qingxin Nie

doi:10.4236/cn.2013.51B020

Communications and Network > Vol.5 No.1B, February 2013

Department of Computer Science, DunHua Vocational-technical School, Dunhua, China.
Polytechnic School, SanYa University, Sanya, China.
School of Foundation Courses, SanYa University, Sanya, China.

Abstract

AI researchers typically formulate probabilistic planning problems under uncertainty as Markov Decision Processes (MDPs). Value Iteration is an inefficient algorithm for MDPs because it puts the majority of its effort into backing up the entire state space, which turns out to be unnecessary in many cases. Many approaches have been proposed to overcome this problem; among them, LAO*, LRTDP and HDP are the state of the art. All of these use reachability analysis and heuristics to avoid some unnecessary backups. However, none of them fully exploits the graphical features of the MDP or uses these features to derive the best backup sequence over the state space. We introduce an improved algorithm, Topological Order Value Iteration (TOVI), which circumvents the problem of unnecessary backups by detecting the structure of the MDP and backing up states in topological order. Experimental results demonstrate the effectiveness and excellent performance of our algorithm.
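
The idea of backing up states in topological order can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how the structure is detected, so the sketch assumes one plausible reading, namely splitting the state graph into strongly connected components with Tarjan's algorithm and solving each component to convergence before moving to the components that depend on it. The MDP encoding, the cost-minimizing objective, and the names `strongly_connected_components`, `topological_value_iteration` and `example_mdp` are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding (an assumption, not from the paper):
# mdp[state][action] = [(probability, next_state, cost), ...]
# A state with no actions is treated as a terminal goal with value 0.
MDP = Dict[str, Dict[str, List[Tuple[float, str, float]]]]


def strongly_connected_components(mdp: MDP) -> List[List[str]]:
    """Tarjan's algorithm. Components are emitted successors-first,
    i.e. in reverse topological order of the component graph."""
    index: Dict[str, int] = {}
    low: Dict[str, int] = {}
    stack: List[str] = []
    on_stack = set()
    components: List[List[str]] = []

    def successors(s: str):
        return {t for outcomes in mdp[s].values() for _, t, _ in outcomes}

    def visit(s: str) -> None:
        index[s] = low[s] = len(index)
        stack.append(s)
        on_stack.add(s)
        for t in successors(s):
            if t not in index:
                visit(t)
                low[s] = min(low[s], low[t])
            elif t in on_stack:
                low[s] = min(low[s], index[t])
        if low[s] == index[s]:  # s is the root of a component
            comp = []
            while True:
                t = stack.pop()
                on_stack.discard(t)
                comp.append(t)
                if t == s:
                    break
            components.append(comp)

    for s in mdp:
        if s not in index:
            visit(s)
    return components


def topological_value_iteration(mdp: MDP, epsilon: float = 1e-6):
    """Run value iteration one component at a time, successors first.
    Returns the value function and the number of backups performed."""
    values = {s: 0.0 for s in mdp}
    backups = 0
    for comp in strongly_connected_components(mdp):
        while True:
            residual = 0.0
            for s in comp:
                if not mdp[s]:  # terminal state keeps value 0
                    continue
                # Bellman backup: minimize expected cost over actions.
                new_value = min(
                    sum(p * (c + values[t]) for p, t, c in outcomes)
                    for outcomes in mdp[s].values()
                )
                residual = max(residual, abs(new_value - values[s]))
                values[s] = new_value
                backups += 1
            if residual < epsilon:
                break
    return values, backups


# Tiny illustrative MDP: s0 -> s1, and s1 loops on itself until it reaches goal.
example_mdp: MDP = {
    "s0": {"go": [(1.0, "s1", 1.0)]},
    "s1": {"try": [(0.5, "goal", 1.0), (0.5, "s1", 1.0)]},
    "goal": {},
}
values, backups = topological_value_iteration(example_mdp)
```

In this toy problem the self-loop at `s1` is the only component that needs repeated sweeps; `s0` is backed up only after `s1` has converged, so its value (about 3, with `s1` at about 2) is correct after a single effective backup. Plain value iteration would instead back up `s0` on every sweep.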

Keywords

Probabilistic Planning; Markov Decision Processes; Dynamic Programming; Value Iteration

Share and Cite:

Liu, X., Li, M. and Nie, Q. (2013) Topological Order Value Iteration Algorithm for Solving Probabilistic Planning. Communications and Network, 5, 86-89. doi:10.4236/cn.2013.51B020.

Conflicts of Interest

The authors declare no conflicts of interest.

