
Robust Q-Learning

open access: yes, Journal of the American Statistical Association, 2020
Q-learning is a regression-based approach that is widely used to formalize the development of an optimal dynamic treatment strategy. Finite dimensional working models are typically used to estimate certain nuisance parameters, and misspecification of ...
Robert L. Strawderman   +3 more
core   +5 more sources

Self-correcting Q-learning

open access: yes, Proceedings of the AAAI Conference on Artificial Intelligence, 2021
The Q-learning algorithm is known to be affected by the maximization bias, i.e. the systematic overestimation of action values, an important issue that has recently received renewed attention. Double Q-learning has been proposed as an efficient algorithm
Zhu, Rong, Rigotti, Mattia
core   +3 more sources

Constrained Deep Q-Learning Gradually Approaching Ordinary Q-Learning [PDF]

open access: yes, Frontiers in Neurorobotics, 2019
A deep Q network (DQN) (Mnih et al., 2013) is an extension of Q learning, which is a typical deep reinforcement learning method. In DQN, a Q function expresses all action values under all states, and it is approximated using a convolutional neural ...
Shota Ohnishi   +6 more
doaj   +3 more sources

Speedy Q-learning

open access: yes, 2011
We introduce a new convergent variant of Q-learning, called speedy Q-learning, to address the problem of slow convergence in the standard form of the Q-learning algorithm.
Kappen, Hilbert   +3 more
core   +5 more sources

Periodic Q-learning

open access: yes, CoRR, 2020
The use of target networks is a common practice in deep reinforcement learning for stabilizing the training; however, theoretical understanding of this technique is still limited.
Lee, Donghwan, He, Niao
core   +4 more sources

Double Q-learning [PDF]

open access: yes, 2010
In some stochastic environments the well-known reinforcement learning algorithm Q-learning performs very poorly. This poor performance is caused by large overestimations of action values, which result from a positive bias that is introduced because Q ...
Hasselt, H. P. (Hado) van
core   +4 more sources

An Improved Q-Learning Algorithm and Its Application in Path Planning

open access: yes, Taiyuan Ligong Daxue xuebao, 2021
The traditional Q-Learning algorithm suffers from too many random searches and slow convergence. Therefore, in this paper an improved ε-Q-Learning algorithm based on the traditional Q-Learning algorithm was proposed and applied to path planning. The
Guojun MAO, Shimin GU
doaj   +1 more source

Logistic Q-Learning

open access: yes, CoRR, 2020
ISSN:2640 ...
Bas-Serrano, Joan   +3 more
openaire   +4 more sources

Multi-Source Multi-Destination Hybrid Infrastructure-Aided Traffic Aware Routing in V2V/I Networks

open access: yes, IEEE Access, 2022
The concept of the “connected car” offers the potential for safer, more enjoyable and more efficient driving and eventually autonomous driving.
Teodor Ivanescu   +3 more
doaj   +1 more source

Generalized Speedy Q-Learning [PDF]

open access: yes, IEEE Control Systems Letters, 2020
In this paper, we derive a generalization of the Speedy Q-learning (SQL) algorithm that was proposed in the Reinforcement Learning (RL) literature to handle slow convergence of Watkins' Q-learning. In most RL algorithms such as Q-learning, the Bellman equation and the Bellman operator play an important role.
Indu John   +2 more
openaire   +2 more sources
