A Policy-Improving System for Adaptability to Dynamic Environments Using Mixture Probability and Clustering Distribution - Journal of Computer and Communications

Volume 2, Issue 4 (March 2014)

ISSN Print: 2327-5219 ISSN Online: 2327-5227

PP. 210-219

DOI: 10.4236/jcc.2014.24028

Author(s)

Uthai Phommasak, Daisuke Kitakoshi, Jun Mao, Hiroyuki Shioya

Affiliation(s)

Department of Information Engineering, Tokyo National College of Technology, Tokyo, Japan.
Division of Information and Electronic Engineering, Graduate School of Engineering, Muroran Institute of Technology, Hokkaido, Japan.

ABSTRACT

As the need for rescue robots in disasters such as earthquakes and tsunamis grows, there is an urgent need for robotics software that can learn and adapt to any environment. A reinforcement learning (RL) system that improves agents’ policies for dynamic environments by using a mixture model of Bayesian networks has been proposed and is effective in adapting quickly to a changing environment. However, its increased computational complexity requires a high-performance computer for simulated experiments, and when calculation resources are limited, the computational complexity must be controlled. In this study, we used the profit-sharing RL method for the agent to learn its policy, and introduced a mixture probability into the RL system to recognize changes in the environment and improve the agent’s policy accordingly. We also introduced a clustering distribution that enables the selection of a smaller, suitable set of mixture probability elements while maintaining their variety, in order to reduce the computational complexity while maintaining the system’s performance. Using our proposed system, the agent successfully learned the policy and efficiently adjusted to the changing environment. Finally, the proposed system effectively controlled the computational complexity and limited the decline in the effectiveness of the policy improvement.
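
As a rough illustration of the ideas named in the abstract, the following Python sketch shows a generic profit-sharing update and a plain weighted mixture of action distributions. It is not the authors' implementation: the geometric `decay` credit function, the `initial` weight, the proportional action selection, and all function names are assumptions made for this example, and the mixture here is a simple weighted sum rather than the Bayesian-network-based formulation or the clustering step described in the paper.

```python
from collections import defaultdict


def profit_sharing_update(weights, episode, reward, decay=0.5):
    """Distribute `reward` backward over the (state, action) pairs of one episode.

    The rule fired last receives the full reward; each earlier rule receives
    a geometrically smaller share (assumed credit function, ratio `decay`).
    """
    credit = reward
    for state, action in reversed(episode):
        weights[(state, action)] += credit
        credit *= decay
    return weights


def policy_probs(weights, state, actions, initial=1.0):
    """Turn rule weights into action probabilities by proportional selection.

    `initial` is an assumed base weight so unvisited actions keep a nonzero chance.
    """
    raw = [initial + weights[(state, a)] for a in actions]
    total = sum(raw)
    return [w / total for w in raw]


def mix_policies(prob_lists, mixing):
    """Weighted sum of several action distributions for the same state.

    `mixing` holds one non-negative coefficient per distribution, summing to 1.
    """
    n_actions = len(prob_lists[0])
    return [sum(m * p[i] for m, p in zip(mixing, prob_lists))
            for i in range(n_actions)]


if __name__ == "__main__":
    weights = defaultdict(float)
    episode = [("s0", "right"), ("s1", "right"), ("s2", "up")]
    profit_sharing_update(weights, episode, reward=8.0, decay=0.5)

    current = policy_probs(weights, "s0", ["right", "up"])
    past = [0.25, 0.75]  # hypothetical distribution learned in an earlier environment
    print(current)
    print(mix_policies([current, past], [0.5, 0.5]))
```

In this toy run the last rule in the episode is credited 8.0, the one before it 4.0, and the first 2.0, so the action taken in `s0` becomes more likely than the untried one; mixing with a distribution from an earlier environment then pulls the policy back toward previously learned behavior.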

KEYWORDS

Reinforcement Learning; Profit-Sharing Method; Mixture Probability; Clustering

Share and Cite:

Phommasak, U., Kitakoshi, D., Mao, J. and Shioya, H. (2014) A Policy-Improving System for Adaptability to Dynamic Environments Using Mixture Probability and Clustering Distribution. Journal of Computer and Communications, 2, 210-219. doi: 10.4236/jcc.2014.24028.
