Q-learning [PDF]

open access: bronze, Machine Learning, 1992
Q-learning (Watkins, 1989) is a simple way for agents to learn how to act optimally in controlled Markovian domains. It amounts to an incremental method for dynamic programming which imposes limited computational demands. It works by successively improving its evaluations of the quality of particular actions at particular states.
Christopher J. Watkins, Peter Dayan
openalex   +5 more sources
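
The incremental update described in the abstract above can be illustrated with a minimal tabular sketch. This is not code from the paper: the 1-D chain environment, the function name `q_learning`, and the parameter values (`alpha`, `gamma`, `epsilon`, `episodes`) are assumptions chosen only for illustration. It applies the standard rule Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a)).

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
               epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D chain (illustrative only).

    Actions: 0 = move left, 1 = move right. Entering the rightmost state
    gives reward 1 and ends the episode; every other step gives reward 0.
    """
    rng = random.Random(seed)
    # One row per state, one column per action; all estimates start at 0.
    q = [[0.0, 0.0] for _ in range(n_states)]
    terminal = n_states - 1

    for _ in range(episodes):
        s = 0
        while s != terminal:
            # Epsilon-greedy choice; ties are broken at random.
            if rng.random() < epsilon or q[s][0] == q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1

            # Deterministic transition; the left wall keeps the agent at 0.
            s_next = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s_next == terminal else 0.0

            # Move the estimate toward the one-step bootstrapped target.
            # The terminal row is never updated, so its max stays 0.
            target = r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
    return q


if __name__ == "__main__":
    for state, (left, right) in enumerate(q_learning()):
        print(state, round(left, 3), round(right, 3))
```

In this toy setting the learned table should prefer "right" in every non-terminal state, since that action leads to the only reward. The environment is deterministic, so a large step size such as 0.5 is tolerable here; noisy domains generally need a smaller or decaying step size.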

Constrained Deep Q-Learning Gradually Approaching Ordinary Q-Learning [PDF]

open access: yes, Frontiers in Neurorobotics, 2019
A deep Q network (DQN) (Mnih et al., 2013) is an extension of Q learning, which is a typical deep reinforcement learning method. In DQN, a Q function expresses all action values under all states, and it is approximated using a convolutional neural ...
Shota Ohnishi   +6 more
doaj   +4 more sources

Q-learning for Robots [PDF]

open access: green, 2003
Robot learning is a challenging – and somewhat unique – research domain. If a robot behavior is defined as a mapping between situations that occurred in the real world and actions to be accomplished, then the supervised learning of a robot behavior requires a set of representative examples (situation, desired action). In order to be able to gather such
Claude Touzet
  +7 more sources

Q-learning with censored data

open access: green, The Annals of Statistics, 2012
We develop methodology for a multistage-decision problem with flexible number of stages in which the rewards are survival times that are subject to censoring. We present a novel Q-learning algorithm that is adjusted for censored data and allows a flexible number of stages.
Yair Goldberg, Michael R. Kosorok
openalex   +7 more sources

Applying Deep Reinforcement Learning to Cable Driven Parallel Robots for Balancing Unstable Loads: A Ball Case Study

open access: yes, Frontiers in Robotics and AI, 2021
The current pandemic has highlighted the need for rapid construction of structures to treat patients and ensure manufacturing of health care products such as vaccines.
Alex Grimshaw, John Oyekan
doaj   +1 more source

Methods and software for solar power plant cluster management

open access: yes, Adaptivni Sistemi Avtomatičnogo Upravlinnâ, 2022
The object of study is solar power plant management software. Solar panel production technologies are developing rapidly and investment in solar energy is growing, so users are interested in increasing energy production for a faster return on investment.
А. Мокрий, І. Баклан
doaj   +1 more source

Cooperative Output Regulation By Q-learning For Discrete Multi-agent Systems In Finite-time

open access: yes, Journal of Applied Science and Engineering, 2022
This article studies the output regulation of discrete-time multi-agent systems with an unknown model by a finite-time optimal control algorithm based on Q-learning that uses the method of the linear quadratic regulator (LQR).
Wenjun Wei, Jingyuan Tang
doaj   +1 more source

Multi-Source Multi-Destination Hybrid Infrastructure-Aided Traffic Aware Routing in V2V/I Networks

open access: yes, IEEE Access, 2022
The concept of the “connected car” offers the potential for safer, more enjoyable and more efficient driving and eventually autonomous driving.
Teodor Ivanescu   +3 more
doaj   +1 more source

QIBMRMN: Design of a Q-Learning based Iterative sleep-scheduling & hybrid Bioinspired Multipath Routing model for Multimedia Networks [PDF]

open access: yes, International Journal of Electronics and Telecommunications, 2023
Multimedia networks utilize low-power scalar nodes to modify wakeup cycles of high-performance multimedia nodes, which assists in optimizing the power-to-performance ratios.
Minaxi Doorwar, P Malathi
doaj   +1 more source

Reinforcement learning using Deep Q networks and Q learning accurately localizes brain tumors on MRI with very small training sets

open access: yes, BMC Medical Imaging, 2022
Background: Supervised deep learning in radiology suffers from notorious inherent limitations: (1) it requires large, hand-annotated data sets; (2) it is non-generalizable; and (3) it lacks explainability and intuition.
J. N. Stember, H. Shalu
doaj   +1 more source
