[1] Sutton, R.S. and Barto, A.G. (1998) Reinforcement Learning: An Introduction. MIT Press, Cambridge.
[2] Croonenborghs, T., Ramon, J., Blockeel, H. and Bruynooghe, M. (2006) Model-Assisted Approaches for Relational Reinforcement Learning: Some Challenges for the SRL Community. Proceedings of the ICML-2006 Workshop on Open Problems in Statistical Relational Learning, Pittsburgh.
[3] Fernandez, F. and Veloso, M. (2006) Probabilistic Policy Reuse in a Reinforcement Learning Agent. Proceedings of the Fifth International Joint Conference on Autonomous Agents and Multi-Agent Systems, New York, May 2006, 720-727. http://dx.doi.org/10.1145/1160633.1160762
[4] Kober, J., Bagnell, J.A. and Peters, J. (2013) Reinforcement Learning in Robotics: A Survey. International Journal of Robotics Research, 32, 1238-1274. http://dx.doi.org/10.1177/0278364913495721
[5] Kitakoshi, D., Shioya, H. and Nakano, R. (2004) Adaptation of the Online Policy-Improving System by Using a Mixture Model of Bayesian Networks to Dynamic Environments. Electronics, Information and Communication Engineers, 104, 15-20.
[6] Kitakoshi, D., Shioya, H. and Nakano, R. (2010) Empirical Analysis of an On-Line Adaptive System Using a Mixture of Bayesian Networks. Information Sciences, 180, 2856-2874. http://dx.doi.org/10.1016/j.ins.2010.04.001
[7] Phommasak, U., Kitakoshi, D. and Shioya, H. (2012) An Adaptation System in Unknown Environments Using a Mixture Probability Model and Clustering Distributions. Journal of Advanced Computational Intelligence and Intelligent Informatics, 16, 733-740.
[8] Phommasak, U., Kitakoshi, D., Mao, J. and Shioya, H. (2014) A Policy-Improving System for Adaptability to Dynamic Environments Using Mixture Probability and Clustering Distribution. Journal of Computer and Communications, 2, 210-219. http://dx.doi.org/10.4236/jcc.2014.24028
[9] Tanaka, F. and Yamamura, M. (1997) An Approach to Lifelong Reinforcement Learning through Multiple Environments. Proceedings of the Sixth European Workshop on Learning Robots, Brighton, 1-2 August 1997, 93-99.
[10] Minato, T. and Asada, M. (1998) Environmental Change Adaptation for Mobile Robot Navigation. 1998 IEEE/RSJ International Conference on Intelligent Robots and Systems, 3, 1859-1864.
[11] Ghavamzadeh, M. and Mahadevan, S. (2007) Hierarchical Average Reward Reinforcement Learning. Journal of Machine Learning Research, 8, 2629-2669.
[12] Kato, S. and Matsuo, H. (2000) A Theory of Profit Sharing in Dynamic Environment. Proceedings of the 6th Pacific Rim International Conference on Artificial Intelligence, Melbourne, 28 August-1 September 2000, 115-124.
[13] Nakano, H., Takada, S., Arai, S. and Miyauchi, A. (2005) An Efficient Reinforcement Learning Method for Dynamic Environments Using Short Term Adjustment. International Symposium on Nonlinear Theory and Its Applications, Bruges, 18-21 October 2005, 250-253.
[14] Hellinger, E. (1909) Neue Begründung der Theorie quadratischer Formen von unendlichvielen Veränderlichen. Journal für die reine und angewandte Mathematik, 136, 210-271.
[15] Pearl, J. (1988) Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann Publishers Inc., San Francisco.