Q-learning - Open Access .click

Journal of the American Statistical Association, 2020
Q-learning is a regression-based approach that is widely used to formalize the development of an optimal dynamic treatment strategy. Finite dimensional working models are typically used to estimate certain nuisance parameters, and misspecification of ...
Robert L. Strawderman +3 more
core +5 more sources

Self-correcting Q-learning

Proceedings of the AAAI Conference on Artificial Intelligence, 2021
The Q-learning algorithm is known to be affected by the maximization bias, i.e., the systematic overestimation of action values, an important issue that has recently received renewed attention. Double Q-learning has been proposed as an efficient algorithm ...
Zhu, Rong, Rigotti, Mattia
core +3 more sources

Constrained Deep Q-Learning Gradually Approaching Ordinary Q-Learning [PDF]

Frontiers in Neurorobotics, 2019
A deep Q-network (DQN) (Mnih et al., 2013), a typical deep reinforcement learning method, is an extension of Q-learning. In DQN, a Q function expresses all action values under all states, and it is approximated using a convolutional neural ...
Shota Ohnishi +6 more
doaj +3 more sources

Speedy Q-learning

2011
We introduce a new convergent variant of Q-learning, called speedy Q-learning, to address the problem of slow convergence in the standard form of the Q-learning algorithm.
Kappen, Hilbert +3 more
core +5 more sources

Periodic Q-learning

CoRR, 2020
The use of target networks is a common practice in deep reinforcement learning for stabilizing the training; however, theoretical understanding of this technique is still limited.
Lee, Donghwan, He, Niao
core +4 more sources

Double Q-learning [PDF]

2010
In some stochastic environments the well-known reinforcement learning algorithm Q-learning performs very poorly. This poor performance is caused by large overestimations of action values, which result from a positive bias that is introduced because Q ...
Hasselt, H. P. (Hado) van
core +4 more sources
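
The positive bias described in the abstract above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the paper's algorithm or experiments: every action has a true value of 0, rewards are unit-variance Gaussian noise, and the helper names (single_estimate, double_estimate, run) are hypothetical. It compares the maximum of sample means (the estimator implicit in the standard Q-learning target) with a double estimator that selects the action on one half of the samples and evaluates it on the other half.

```python
import random
import statistics


def single_estimate(samples):
    # Max over actions of the sample mean: biased upward when values are noisy.
    return max(statistics.fmean(s) for s in samples)


def double_estimate(samples):
    # Split each action's samples into two disjoint halves (assumed setup).
    half = len(samples[0]) // 2
    means_a = [statistics.fmean(s[:half]) for s in samples]
    means_b = [statistics.fmean(s[half:]) for s in samples]
    # Select the action with half A, evaluate it with half B.
    best = max(range(len(samples)), key=means_a.__getitem__)
    return means_b[best]


def run(trials=2000, n_actions=10, n_samples=10, seed=0):
    # All true action values are 0, so any nonzero average is estimator bias.
    rng = random.Random(seed)
    single, double = [], []
    for _ in range(trials):
        samples = [
            [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
            for _ in range(n_actions)
        ]
        single.append(single_estimate(samples))
        double.append(double_estimate(samples))
    return statistics.fmean(single), statistics.fmean(double)


if __name__ == "__main__":
    single_bias, double_bias = run()
    print(f"single estimator average: {single_bias:.3f}")
    print(f"double estimator average: {double_bias:.3f}")
```

Under these assumptions the single estimator averages clearly above the true maximum of 0, while the double estimator averages close to 0. Double Q-learning applies the same idea to the bootstrapped target by keeping two value tables.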

An Improved Q-Learning Algorithm and Its Application in Path Planning

Taiyuan Ligong Daxue xuebao, 2021
The traditional Q-Learning algorithm suffers from too many random searches and slow convergence. Therefore, in this paper an improved ε-Q-Learning algorithm based on the traditional Q-Learning algorithm is proposed and applied to path planning. The ...
Guojun MAO, Shimin GU
doaj +1 more source

Logistic Q-Learning

CoRR, 2020
Bas-Serrano, Joan +3 more
openaire +4 more sources

Multi-Source Multi-Destination Hybrid Infrastructure-Aided Traffic Aware Routing in V2V/I Networks

IEEE Access, 2022
The concept of the “connected car” offers the potential for safer, more enjoyable and more efficient driving and eventually autonomous driving.
Teodor Ivanescu +3 more
doaj +1 more source

Generalized Speedy Q-Learning [PDF]

IEEE Control Systems Letters, 2020
In this paper, we derive a generalization of the Speedy Q-learning (SQL) algorithm that was proposed in the Reinforcement Learning (RL) literature to handle slow convergence of Watkins' Q-learning. In most RL algorithms such as Q-learning, the Bellman equation and the Bellman operator play an important role.
Indu John +2 more
openaire +2 more sources
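
For reference, the Bellman optimality operator and the Watkins Q-learning update mentioned in the abstract above are usually written as follows. This is the standard textbook form, not the generalized operator derived in the paper; the symbols (reward r, transition kernel p, discount factor \gamma, step size \alpha_t) are notation assumed here.

```latex
% Bellman optimality operator acting on an action-value function Q
(TQ)(s,a) = r(s,a) + \gamma \sum_{s'} p(s' \mid s,a) \max_{a'} Q(s',a')

% Watkins' Q-learning: a sampled, incremental approximation of Q \leftarrow TQ
Q_{t+1}(s_t,a_t) = Q_t(s_t,a_t)
  + \alpha_t \left[ r_t + \gamma \max_{a'} Q_t(s_{t+1},a') - Q_t(s_t,a_t) \right]
```

The optimal action-value function Q^* is the fixed point of T, that is, TQ^* = Q^*.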

reinforcement learning
deep reinforcement learning
artificial intelligence

machine learning
path planning