Motor Learning Based on the Cooperation of Cerebellum and Basal Ganglia for a Self-Balancing Two-Wheeled Robot
Xiaogang Ruan, Jing Chen, Lizhen Dai
DOI: 10.4236/ica.2011.23026

Abstract

A novel motor learning method for agent behavior learning is presented, based on the cooperation of the cerebellum and the basal ganglia. The method derives from the principles of the central nervous system (CNS) and the operant learning mechanism, and it depends on the interactions between the basal ganglia and the cerebellum. The learning system is composed of an evaluation mechanism, an action selection mechanism, and a tropism mechanism. In the beginning, the learning signals come not only from the Inferior Olive but also from the Substantia Nigra. With the cerebellum acting as a supervisor, learning speeds up and the failure time is reduced. Convergence can be guaranteed in the sense of entropy. Using the proposed method, a motor learning system for a self-balancing two-wheeled robot was built, with RBF neural networks serving as the actor and as the evaluation function approximator. Simulation experiments showed that the proposed motor learning system achieved a better learning effect, indicating that motor learning based on the coordination of the cerebellum and basal ganglia is effective.
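
As a rough illustration of the idea summarized above, the following minimal sketch combines an RBF actor and an RBF evaluation function (critic) with a supervisor whose influence fades as learning proceeds. It is not the authors' implementation: the scalar toy plant, the proportional `supervisor`, the linear blending schedule `k`, the reward, and all learning rates are assumptions chosen only to keep the example short and runnable. The temporal-difference error stands in for the Substantia Nigra signal, and the supervised error stands in for the Inferior Olive signal.

```python
import math
import random


def rbf_features(x, centers, width):
    """Gaussian radial basis activations for a scalar state x."""
    return [math.exp(-((x - c) ** 2) / (2.0 * width ** 2)) for c in centers]


def dot(weights, features):
    """Linear readout of an RBF layer."""
    return sum(w * f for w, f in zip(weights, features))


def supervisor(x):
    """Hypothetical teacher: a fixed proportional controller (assumption)."""
    return -1.5 * x


def train(episodes=200, steps=30, gamma=0.9, lr_critic=0.1, lr_actor=0.1, seed=0):
    rng = random.Random(seed)
    centers = [-1.0, -0.5, 0.0, 0.5, 1.0]
    width = 0.5
    critic = [0.0] * len(centers)  # weights of the evaluation function
    actor = [0.0] * len(centers)   # weights of the action network
    for ep in range(episodes):
        # Share of control given to the actor; rises linearly from 0 to 1,
        # so the supervisor dominates only in the beginning (assumed schedule).
        k = ep / (episodes - 1)
        x = rng.uniform(-1.0, 1.0)
        for _ in range(steps):
            phi = rbf_features(x, centers, width)
            a_actor = dot(actor, phi)
            noise = rng.gauss(0.0, 0.1)  # exploration noise
            a_sup = supervisor(x)
            action = k * (a_actor + noise) + (1.0 - k) * a_sup
            # Toy plant (assumption): the state should be driven to zero.
            x_next = max(-1.0, min(1.0, x + 0.5 * action))
            reward = -(x_next ** 2)
            phi_next = rbf_features(x_next, centers, width)
            # Reinforcement signal: temporal-difference error of the critic.
            td_error = reward + gamma * dot(critic, phi_next) - dot(critic, phi)
            # Actor step mixes the reinforcement term and the supervised error.
            step = k * td_error * noise + (1.0 - k) * (a_sup - a_actor)
            critic = [c + lr_critic * td_error * f for c, f in zip(critic, phi)]
            actor = [a + lr_actor * step * f for a, f in zip(actor, phi)]
            x = x_next
    return actor, centers, width


if __name__ == "__main__":
    actor, centers, width = train()
    for x in (-0.5, 0.0, 0.5):
        print(x, dot(actor, rbf_features(x, centers, width)))
```

In this sketch the supervised term shapes the actor early, while the reinforcement term takes over as `k` approaches 1; other blending schedules are equally plausible.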

Share and Cite:

X. Ruan, J. Chen and L. Dai, "Motor Learning Based on the Cooperation of Cerebellum and Basal Ganglia for a Self-Balancing Two-Wheeled Robot," Intelligent Control and Automation, Vol. 2 No. 3, 2011, pp. 214-225. doi: 10.4236/ica.2011.23026.

Conflicts of Interest

The authors declare no conflicts of interest.

Copyright © 2024 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.