
Deep Reinforcement Learning with Double Q-Learning [PDF]

open access: yes
AAAI Conference on Artificial Intelligence, 2015
The popular Q-learning algorithm is known to overestimate action values under certain conditions. It was not previously known whether, in practice, such overestimations are common, whether they harm performance, and whether they can generally be ...
H. V. Hasselt, A. Guez, David Silver
semanticscholar   +3 more sources

Q-learning with censored data [PDF]

open access: yes
The Annals of Statistics, 2012
We develop a methodology for a multistage decision problem with a flexible number of stages in which the rewards are survival times that are subject to censoring. We present a novel Q-learning algorithm that is adjusted for censored data and allows a flexible number of stages.
Yair Goldberg, Michael R. Kosorok
openaire   +6 more sources

Extreme Q-Learning: MaxEnt RL without Entropy [PDF]

open access: yes
International Conference on Learning Representations, 2023
Modern Deep Reinforcement Learning (RL) algorithms require estimates of the maximal Q-value, which are difficult to compute in continuous domains with an infinite number of possible actions.
Divyansh Garg   +3 more
semanticscholar   +1 more source

Mildly Conservative Q-Learning for Offline Reinforcement Learning [PDF]

open access: yes
Neural Information Processing Systems, 2022
Offline reinforcement learning (RL) defines the task of learning from a static logged dataset without continually interacting with the environment. The distribution shift between the learned policy and the behavior policy makes it necessary for the value ...
Jiafei Lyu   +3 more
semanticscholar   +1 more source

Offline RL for Natural Language Generation with Implicit Language Q Learning [PDF]

open access: yes
International Conference on Learning Representations, 2022
Large language models distill broad knowledge from text corpora. However, they can be inconsistent when it comes to completing user-specified tasks. This issue can be addressed by finetuning such models via supervised learning on curated datasets, or via ...
Charles Burton Snell   +4 more
semanticscholar   +1 more source

Quantum agents in the Gym: a variational quantum algorithm for deep Q-learning [PDF]

open access: yes
Quantum, 2021
Quantum machine learning (QML) has been identified as one of the key fields that could reap advantages from near-term quantum devices, next to optimization and quantum chemistry. Research in this area has focused primarily on variational quantum algorithms ...
Andrea Skolik, S. Jerbi, V. Dunjko
semanticscholar   +1 more source

The Blessing of Heterogeneity in Federated Q-learning: Linear Speedup and Beyond [PDF]

open access: yes
International Conference on Machine Learning, 2023
When the data used for reinforcement learning (RL) are collected by multiple agents in a distributed manner, federated versions of RL algorithms allow collaborative learning without the need for agents to share their local data.
Jiin Woo, Gauri Joshi, Yuejie Chi
semanticscholar   +1 more source

Q-learning Decision Transformer: Leveraging Dynamic Programming for Conditional Sequence Modelling in Offline RL [PDF]

open access: yes
International Conference on Machine Learning, 2022
Recent works have shown that tackling offline reinforcement learning (RL) with a conditional policy produces promising results. The Decision Transformer (DT) combines the conditional policy approach and a transformer architecture, showing competitive ...
Taku Yamagata   +2 more
semanticscholar   +1 more source

Deep Q-Learning based Reinforcement Learning Approach for Network Intrusion Detection [PDF]

open access: yes
De Computis, 2021
The rise of the new generation of cyber threats demands more sophisticated and intelligent cyber defense solutions equipped with autonomous agents capable of learning to make decisions without the knowledge of human experts.
Hooman Alavizadeh   +2 more
semanticscholar   +1 more source

Generalized Speedy Q-Learning [PDF]

open access: yes
IEEE Control Systems Letters, 2020
In this paper, we derive a generalization of the Speedy Q-learning (SQL) algorithm that was proposed in the Reinforcement Learning (RL) literature to handle slow convergence of Watkins' Q-learning. In most RL algorithms such as Q-learning, the Bellman equation and the Bellman operator play an important role.
Indu John   +2 more
openaire   +2 more sources
