Q-learning (Watkins, 1989) is a simple way for agents to learn how to act optimally in controlled Markovian domains. It amounts to an incremental method for dynamic programming which imposes limited computational demands. It works by successively improving its evaluations of the quality of particular actions at particular states.
Christopher J. Watkins, Peter Dayan
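The incremental update described in the abstract above can be illustrated with a minimal tabular sketch. Everything specific here is an assumption made for illustration and does not come from the paper: the five-state chain environment, the `step` and `train` helpers, the reward of 1.0 at the last state, and the values of `ALPHA` and `GAMMA`. Only the update rule itself, which moves Q(s, a) toward r + gamma * max Q(s', a'), is the standard Q-learning step.

```python
import random

N_STATES = 5          # hypothetical toy chain: states 0..4, state 4 is terminal
ACTIONS = (0, 1)      # 0 = move left, 1 = move right
GAMMA = 0.9           # discount factor (assumed value)
ALPHA = 0.5           # learning rate (assumed value)


def step(state, action):
    """Deterministic toy dynamics: reward 1.0 only on reaching the last state."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def train(episodes=500, seed=0):
    """Tabular Q-learning with uniformly random exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = rng.choice(ACTIONS)
            nxt, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus the discounted
            # best estimate at the next state (no bootstrap at terminal).
            target = reward if done else reward + GAMMA * max(q[nxt])
            # Incremental improvement of the evaluation of (state, action).
            q[state][action] += ALPHA * (target - q[state][action])
            state = nxt
    return q


q_table = train()
# Greedy action in each non-terminal state, read off the learned table.
greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a])
                 for s in range(N_STATES - 1)]
```

In this sketch the behaviour policy is purely random while the table still learns values for the greedy policy, which is one way to see the off-policy character of the update. A real task would likely need a decaying learning rate and a less wasteful exploration scheme.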
Constrained Deep Q-Learning Gradually Approaching Ordinary Q-Learning
A deep Q network (DQN) (Mnih et al., 2013), a typical deep reinforcement learning method, is an extension of Q-learning. In DQN, a Q function expresses all action values under all states, and it is approximated using a convolutional neural ...
Shota Ohnishi et al.
Robot learning is a challenging – and somewhat unique – research domain. If a robot behavior is defined as a mapping between situations that occurred in the real world and actions to be accomplished, then the supervised learning of a robot behavior requires a set of representative examples (situation, desired action). In order to be able to gather such
Claude Touzet
We develop methodology for a multistage-decision problem with a flexible number of stages in which the rewards are survival times that are subject to censoring. We present a novel Q-learning algorithm that is adjusted for censored data and allows a flexible number of stages.
Yair Goldberg, Michael R. Kosorok
The current pandemic has highlighted the need for rapid construction of structures to treat patients and ensure manufacturing of health care products such as vaccines.
Alex Grimshaw, John Oyekan
Methods and software for solar power plant cluster management
The object of study is solar power plant management software. Solar panel production technologies are developing rapidly and investment in solar energy is growing, so users are interested in increasing energy production for a faster return on investment.
А. Мокрий, І. Баклан
Cooperative Output Regulation By Q-learning For Discrete Multi-agent Systems In Finite-time
This article studies the output regulation of discrete-time multi-agent systems with an unknown model by a finite-time optimal control algorithm based on Q-learning that uses the method of the linear quadratic regulator (LQR).
Wenjun Wei, Jingyuan Tang
Multi-Source Multi-Destination Hybrid Infrastructure-Aided Traffic Aware Routing in V2V/I Networks
The concept of the “connected car” offers the potential for safer, more enjoyable and more efficient driving and eventually autonomous driving.
Teodor Ivanescu et al.
QIBMRMN: Design of a Q-Learning based Iterative sleep-scheduling & hybrid Bioinspired Multipath Routing model for Multimedia Networks
Multimedia networks utilize low-power scalar nodes to modify wakeup cycles of high-performance multimedia nodes, which assists in optimizing the power-to-performance ratios.
Minaxi Doorwar, P Malathi
Background: Supervised deep learning in radiology suffers from notorious inherent limitations: (1) it requires large, hand-annotated data sets; (2) it is non-generalizable; and (3) it lacks explainability and intuition.
J. N. Stember, H. Shalu