SCIRP Mobile Website
Paper Submission

Why Us? >>

  • - Open Access
  • - Peer-reviewed
  • - Rapid publication
  • - Lifetime hosting
  • - Free indexing service
  • - Free promotion service
  • - More citations
  • - Search engine friendly

Free SCIRP Newsletters>>

Add your e-mail address to receive free newsletters from SCIRP.

 

Contact Us >>

WhatsApp  +86 18163351462(WhatsApp)
   
Paper Publishing WeChat
Book Publishing WeChat
(or Email:book@scirp.org)

Article citations

More>>

Duryea, E., Ganger, M. and Hu, W. (2016) Exploring Deep Reinforcement Learning with Multi Q-Learning. Intelligent Control and Automation, 7, 129. https://doi.org/10.4236/ica.2016.74012

has been cited by the following article:

  • TITLE: Distributional Reinforcement Learning with Quantum Neural Networks

    AUTHORS: Wei Hu, James Hu

    KEYWORDS: Continuous-Variable Quantum Computers, Quantum Reinforcement Learning, Distributional Reinforcement Learning, Quantile Regression, Distributional Q Learning, Grid World Environment, MDP Chain Environment

    JOURNAL NAME: Intelligent Control and Automation, Vol.10 No.2, April 10, 2019

    ABSTRACT: Traditional reinforcement learning (RL) uses the return, also known as the expected value of cumulative random rewards, for training an agent to learn an optimal policy. However, recent research indicates that learning the distribution over returns has distinct advantages over learning their expected value as seen in different RL tasks. The shift from using the expectation of returns in traditional RL to the distribution over returns in distributional RL has provided new insights into the dynamics of RL. This paper builds on our recent work investigating the quantum approach towards RL. Our work implements the quantile regression (QR) distributional Q learning with a quantum neural network. This quantum network is evaluated in a grid world environment with a different number of quantiles, illustrating its detailed influence on the learning of the algorithm. It is also compared to the standard quantum Q learning in a Markov Decision Process (MDP) chain, which demonstrates that the quantum QR distributional Q learning can explore the environment more efficiently than the standard quantum Q learning. Efficient exploration and balancing of exploitation and exploration are major challenges in RL. Previous work has shown that more informative actions can be taken with a distributional perspective. Our findings suggest another cause for its success: the enhanced performance of distributional RL can be partially attributed to its superior ability to efficiently explore the environment.