Successive Over-Relaxation Q-Learning [PDF]

open access: yes, IEEE Control Systems Letters, 2020
In a discounted reward Markov Decision Process (MDP), the objective is to find the optimal value function, i.e., the value function corresponding to an optimal policy. This problem reduces to solving a functional equation known as the Bellman equation and a fixed point iteration scheme known as the value iteration is utilized to obtain the solution. In
Chandramouli Kamanchi   +2 more
openaire   +2 more sources

Uncertainty-aware Path Planning using Reinforcement Learning and Deep Learning Methods [PDF]

open access: yes, Computer and Knowledge Engineering, 2020
This paper proposes new algorithms to improve Reinforcement Learning (RL) and Deep Q-Network (DQN) methods for path planning considering uncertainty in the perception of environment.
Nematollah Ab azar   +2 more
doaj   +1 more source

Contextual Q-Learning

open access: yes, 2020
This work has received funding from the EU Horizon 2020 research and innovation program under project DOMINOES (grant agreement No 771066) and from FEDER Funds through COMPETE program and from National Funds through FCT under projects CEECIND/01811/2017 and UIDB/00760 ...
Vale, Zita, Pinto, Tiago
openaire   +2 more sources

Switch-based Active Deep Dyna-Q: Efficient Adaptive Planning for Task-Completion Dialogue Policy Learning

open access: yes, 2018
Training task-completion dialogue agents with reinforcement learning usually requires a large number of real user experiences. The Dyna-Q algorithm extends Q-learning by integrating a world model, and thus can effectively boost training efficiency using ...
Gao, Jianfeng   +4 more
core   +1 more source

Deep functional measurements of Fragile X syndrome human neurons reveal multiparametric electrophysiological disease phenotype

open access: yes, Communications Biology
Fragile X syndrome (FXS) is a neurodevelopmental disorder caused by hypermethylation of expanded CGG repeats (>200) in the FMR1 gene leading to gene silencing and loss of Fragile X Messenger Ribonucleoprotein (FMRP) expression. FMRP plays important roles
James J. Fink   +20 more
doaj   +1 more source

Ramp Metering Control Based on the Q-Learning Algorithm

open access: yes, Cybernetics and Information Technologies, 2015
Modern urban highways are under the influence of increased traffic demand and cannot fulfill the desired level of service anymore. In most of the cases there is no space available for any infrastructure building.
Ivanjko Edouard   +5 more
doaj   +1 more source

Transfer Q-learning

open access: yes, 2022
Time-inhomogeneous finite-horizon Markov decision processes (MDP) are frequently employed to model decision-making in dynamic treatment regimes and other statistical reinforcement learning (RL) scenarios. These fields, especially healthcare and business, often face challenges such as high-dimensional state spaces and time-inhomogeneity of the MDP ...
Chen, Elynn, Li, Sai, Jordan, Michael I.
openaire   +2 more sources

Mapping the evolution of mitochondrial complex I through structural variation

open access: yes, FEBS Letters, EarlyView.
Respiratory complex I (CI) is crucial for bioenergetic metabolism in many prokaryotes and eukaryotes. It is composed of a conserved set of core subunits and additional accessory subunits that vary depending on the organism. Here, we categorize CI subunits from available structures to map the evolution of CI across eukaryotes.
Dong‐Woo Shin   +2 more
wiley   +1 more source

Complexification through gradual involvement and reward providing in deep reinforcement learning

open access: yes, Системный анализ и прикладная информатика
Training a relatively large neural network with enough capacity for complex tasks is challenging in deep reinforcement learning. In real life, solving a task requires a system of knowledge, where more complex skills are ...
E. V. Rulko
doaj   +1 more source

Periodic Q-Learning

open access: yes, 2020
The use of target networks is a common practice in deep reinforcement learning for stabilizing the training; however, theoretical understanding of this technique is still limited. In this paper, we study the so-called periodic Q-learning algorithm (PQ-learning for short), which resembles the technique used in deep Q-learning for solving infinite ...
Lee, Donghwan, He, Niao
openaire   +2 more sources