
Constrained Deep Q-Learning Gradually Approaching Ordinary Q-Learning

open access: yes | Frontiers in Neurorobotics, 2019
A deep Q network (DQN) (Mnih et al., 2013), a typical deep reinforcement learning method, is an extension of Q learning. In DQN, a Q function expresses all action values under all states, and it is approximated using a convolutional neural ...
Shota Ohnishi   +6 more

Deep Q-learning From Demonstrations

open access: yes | Proceedings of the AAAI Conference on Artificial Intelligence, 2018
Deep reinforcement learning (RL) has achieved several high profile successes in difficult decision-making problems. However, these algorithms typically require a huge amount of data before they reach reasonable performance. In fact, their performance during learning can be extremely poor.
Hester, Todd   +13 more

Using reinforcement learning in genome assembly: in-depth analysis of a Q-learning assembler

open access: yes | Frontiers in Bioinformatics
Genome assembly remains an unsolved problem, and de novo strategies (i.e., those run without a reference) are relevant but computationally complex tasks in genomics.
Kleber Padovani   +7 more

Q-learning with censored data

open access: yes | The Annals of Statistics, 2012
We develop methodology for a multistage-decision problem with flexible number of stages in which the rewards are survival times that are subject to censoring. We present a novel Q-learning algorithm that is adjusted for censored data and allows a flexible number of stages.
Goldberg, Yair; Kosorok, Michael R.

Reducing the Prevalence of Coronavirus (COVID-19) in Airlines Based on and the Reinforcement Artificial Intelligence

open access: yes | فناوری در مهندسی هوافضا, 2022
This paper proposes a method based on the Q-learning reinforcement learning algorithm and a paired comparison technique to address the shortage of health monitoring devices in airlines.
Iman Shafieenejad   +3 more

Generalized Speedy Q-Learning

open access: yes | IEEE Control Systems Letters, 2020
In this paper, we derive a generalization of the Speedy Q-learning (SQL) algorithm that was proposed in the Reinforcement Learning (RL) literature to handle slow convergence of Watkins' Q-learning. In most RL algorithms such as Q-learning, the Bellman equation and the Bellman operator play an important role.
Indu John   +2 more
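
For readers unfamiliar with the baseline this entry refers to, the following is a minimal sketch of tabular Watkins' Q-learning, in which each update moves Q(s, a) toward a sampled Bellman optimality target r + gamma * max_a' Q(s', a'). It is not the Speedy Q-learning algorithm or its generalization from the paper. The environment (`chain_step`, a 4-state chain), the function name `q_learning`, and all hyperparameter values are assumptions chosen purely for illustration.

```python
import random


def q_learning(n_states, n_actions, step, episodes=500,
               alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        s, done = 0, False  # every episode starts in state 0
        while not done:
            if rng.random() < epsilon:
                a = rng.randrange(n_actions)  # explore
            else:
                best = max(q[s])
                # exploit, breaking ties between equal values at random
                a = rng.choice([i for i, v in enumerate(q[s]) if v == best])
            s2, r, done = step(s, a)
            # sampled Bellman optimality target; no bootstrap at terminal states
            target = r if done else r + gamma * max(q[s2])
            q[s][a] += alpha * (target - q[s][a])
            s = s2
    return q


def chain_step(s, a):
    """Hypothetical 4-state chain: action 1 moves right, action 0 moves left.

    Entering state 3 yields reward 1 and ends the episode.
    """
    s2 = min(s + 1, 3) if a == 1 else max(s - 1, 0)
    done = s2 == 3
    return s2, (1.0 if done else 0.0), done


q_table = q_learning(4, 2, chain_step)
```

On this toy chain the learned values for moving right should approach 1, 0.9, and 0.81 from states 2, 1, and 0 respectively, which is the fixed point of the Bellman optimality operator for gamma = 0.9.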

Reinforcement learning using Deep Q networks and Q learning accurately localizes brain tumors on MRI with very small training sets

open access: yes | BMC Medical Imaging, 2022
Background: Supervised deep learning in radiology suffers from notorious inherent limitations: (1) it requires large, hand-annotated data sets; (2) it is non-generalizable; and (3) it lacks explainability and intuition.
J. N. Stember, H. Shalu

Reinforcement Learning-Based Routing Protocols in Vehicular and Flying Ad Hoc Networks – A Literature Survey

open access: yes | Promet (Zagreb), 2022
Vehicular and flying ad hoc networks (VANETs and FANETs) are becoming increasingly important with the development of smart cities and intelligent transportation systems (ITSs).
Pavle Bugarčić   +2 more

QSPCA: A two-stage efficient power control approach in D2D communication for 5G networks

open access: yes | Intelligent and Converged Networks, 2021
The existing literature on device-to-device (D2D) architecture suffers from a dearth of analysis under imperfect channel conditions. There is a need for rigorous analyses on the policy improvement and evaluation of network performance. Accordingly, a two-
Saurabh Chandra   +4 more

Cooperative Output Regulation By Q-learning For Discrete Multi-agent Systems In Finite-time

open access: yes | Journal of Applied Science and Engineering, 2022
This article studies the output regulation of discrete-time multi-agent systems with an unknown model by a finite-time optimal control algorithm based on Q-learning that uses the method of the linear quadratic regulator (LQR).
Wenjun Wei, Jingyuan Tang
