Deep Q-learning From Demonstrations
Deep reinforcement learning (RL) has achieved several high-profile successes in difficult decision-making problems. However, these algorithms typically require a huge amount of data before they reach reasonable performance. In fact, their performance during learning can be extremely poor.
Hester, Todd +13 more
semanticscholar +4 more sources
Q-learning (Watkins, 1989) is a simple way for agents to learn how to act optimally in controlled Markovian domains. It amounts to an incremental method for dynamic programming which imposes limited computational demands. It works by successively improving its evaluations of the quality of particular actions at particular states.
Watkins, C., Dayan, P.
openaire +3 more sources
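The incremental improvement described in the Q-learning abstract above can be illustrated with the standard tabular update rule. This is a minimal sketch, not code from the paper: the function name `q_learning_update`, the dictionary-based table, and the default values of `alpha` (step size) and `gamma` (discount factor) are assumptions chosen for illustration.

```python
def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value.

    q is a dict mapping (state, action) pairs to estimated values;
    pairs that have not been visited yet default to 0.0 (an assumption).
    """
    # Value of the best action currently known in the next state.
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    # Move the old estimate a fraction alpha toward the bootstrapped target.
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]


if __name__ == "__main__":
    table = {}
    q_learning_update(table, "s0", "left", 1.0, "s1", ["left", "right"])
    print(table)
```

Repeating this update over many observed transitions is what "successively improving its evaluations" amounts to in the tabular case.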
Constrained Deep Q-Learning Gradually Approaching Ordinary Q-Learning [PDF]
A deep Q network (DQN) (Mnih et al., 2013), a typical deep reinforcement learning method, is an extension of Q-learning. In DQN, a Q function expresses all action values under all states, and it is approximated using a convolutional neural ...
Shota Ohnishi +6 more
doaj +3 more sources
q-Learning in Continuous Time [PDF]
70 pages, 4 figures, appended with an ...
Jia, Yanwei, Zhou, Xun Yu
openaire +4 more sources
In reinforcement learning, the Q-learning algorithm provably converges to the optimal solution. However, as others have demonstrated, Q-learning can also overestimate the values and thereby spend too long exploring unhelpful states. Double Q-learning is a provably convergent alternative that mitigates some of the overestimation issues, though sometimes ...
David G. Barber
openalex +3 more sources
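The Double Q-learning idea mentioned in the abstract above can be sketched in tabular form: two value tables are kept, and the table that selects the greedy next action is not the one that evaluates it, which is what reduces the overestimation. This is an illustrative sketch of the commonly described tabular rule, not the method proposed in the listed paper; the function name `double_q_update`, the `rng` parameter, and the defaults are assumptions.

```python
import random


def double_q_update(qa, qb, state, action, reward, next_state, actions,
                    alpha=0.1, gamma=0.9, rng=random):
    """Apply one tabular Double Q-learning update in place.

    qa and qb are dicts mapping (state, action) to values; unseen pairs
    default to 0.0 (an assumption). Returns True if qa was the table updated.
    """
    # Choose which table to update with equal probability.
    if rng.random() < 0.5:
        update, evaluate = qa, qb
    else:
        update, evaluate = qb, qa
    # Select the greedy next action using the table being updated...
    best = max(actions, key=lambda a: update.get((next_state, a), 0.0))
    # ...but take that action's value from the other table.
    target = reward + gamma * evaluate.get((next_state, best), 0.0)
    old = update.get((state, action), 0.0)
    update[(state, action)] = old + alpha * (target - old)
    return update is qa
```

Passing a custom `rng` object with a `random()` method makes the table choice deterministic, which is convenient for testing.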
Using reinforcement learning in genome assembly: in-depth analysis of a Q-learning assembler [PDF]
Genome assembly remains an unsolved problem, and de novo strategies (i.e., those run without a reference) are relevant but computationally complex tasks in genomics.
Kleber Padovani +7 more
doaj +2 more sources
Modified interactive Q-learning for attenuating the impact of model misspecification with treatment effect heterogeneity. [PDF]
Zhang Y, Vock DM, Patrick ME, Murray TA.
europepmc +2 more sources
Reducing the Prevalence of Coronavirus (COVID-19) in Airlines Based on and the Reinforcement Artificial Intelligence [PDF]
This paper proposes a method based on the artificial intelligence reinforcement Q-learning algorithm and paired comparison technique to solve the problem of health monitoring devices shortage in airlines.
Iman Shafieenejad +3 more
doaj +1 more source
IDQL: Implicit Q-Learning as an Actor-Critic Method with Diffusion Policies [PDF]
Effective offline RL methods require properly handling out-of-distribution actions. Implicit Q-learning (IQL) addresses this by training a Q-function using only dataset actions through a modified Bellman backup.
Philippe Hansen-Estruch +4 more
semanticscholar +1 more source
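The abstract above does not spell out the modified Bellman backup. As commonly described, IQL fits a state-value function to Q-values of dataset actions with an asymmetric (expectile) squared loss, so that no out-of-distribution action is ever queried. The sketch below shows only that per-sample loss; the function name `expectile_loss` and the default `tau=0.7` are assumptions for illustration, not values taken from the listed paper.

```python
def expectile_loss(diff, tau=0.7):
    """Asymmetric squared loss on diff = Q(s, a) - V(s) for a dataset action a.

    With tau > 0.5, cases where Q exceeds V are penalized more heavily,
    which pushes V toward an upper expectile of Q over dataset actions.
    """
    # Weight tau for non-negative differences, 1 - tau for negative ones.
    weight = tau if diff >= 0 else 1.0 - tau
    return weight * diff * diff


if __name__ == "__main__":
    # Same magnitude, different sign: the positive difference costs more.
    print(expectile_loss(2.0), expectile_loss(-2.0))
```

Setting `tau=0.5` recovers an ordinary (scaled) squared error, while values closer to 1 approximate a maximum over the actions present in the dataset.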

