Results 21 to 30 of about 6,813,126

Reinforcement learning using Deep Q networks and Q learning accurately localizes brain tumors on MRI with very small training sets

open access: yes - BMC Medical Imaging, 2022
Background: Supervised deep learning in radiology suffers from notorious inherent limitations: (1) it requires large, hand-annotated data sets; (2) it is non-generalizable; and (3) it lacks explainability and intuition.
J. N. Stember, H. Shalu
doaj   +1 more source

Reinforcement Learning-Based Routing Protocols in Vehicular and Flying Ad Hoc Networks – A Literature Survey

open access: yes - Promet (Zagreb), 2022
Vehicular and flying ad hoc networks (VANETs and FANETs) are becoming increasingly important with the development of smart cities and intelligent transportation systems (ITSs).
Pavle Bugarčić   +2 more
doaj   +1 more source

Constraints Penalized Q-Learning for Safe Offline Reinforcement Learning [PDF]

open access: yes - AAAI Conference on Artificial Intelligence, 2021
We study the problem of safe offline reinforcement learning (RL), where the goal is to learn a policy that maximizes long-term reward while satisfying safety constraints given only offline data, without further interaction with the environment. This problem is ...
Haoran Xu, Xianyuan Zhan, Xiangyu Zhu
semanticscholar   +1 more source

QSPCA: A two-stage efficient power control approach in D2D communication for 5G networks

open access: yes - Intelligent and Converged Networks, 2021
The existing literature on device-to-device (D2D) architecture suffers from a dearth of analysis under imperfect channel conditions. There is a need for rigorous analyses of policy improvement and the evaluation of network performance. Accordingly, a two-stage ...
Saurabh Chandra   +4 more
doaj   +1 more source

QIBMRMN: Design of a Q-Learning based Iterative sleep-scheduling & hybrid Bioinspired Multipath Routing model for Multimedia Networks [PDF]

open access: yes - International Journal of Electronics and Telecommunications, 2023
Multimedia networks utilize low-power scalar nodes to modify the wakeup cycles of high-performance multimedia nodes, which assists in optimizing power-to-performance ratios.
Minaxi Doorwar, P Malathi
doaj   +1 more source

Is Q-Learning Minimax Optimal? A Tight Sample Complexity Analysis [PDF]

open access: yes - Operations Research, 2021
This paper investigates a model-free algorithm of broad interest in reinforcement learning, namely, Q-learning. Although substantial progress has been made toward understanding the sample efficiency of Q-learning in recent years, it has remained largely ...
Gen Li, Changxiao Cai, Yuting Wei
semanticscholar   +1 more source

Cooperative Output Regulation by Q-learning for Discrete Multi-agent Systems in Finite-time

open access: yes - Journal of Applied Science and Engineering, 2022
This article studies the output regulation of discrete-time multi-agent systems with an unknown model, using a finite-time optimal control algorithm based on Q-learning that employs the linear quadratic regulator (LQR) method.
Wenjun Wei, Jingyuan Tang
doaj   +1 more source

Offline Q-Learning on Diverse Multi-Task Data Both Scales And Generalizes [PDF]

open access: yes - International Conference on Learning Representations, 2022
The potential of offline reinforcement learning (RL) is that high-capacity models trained on large, heterogeneous datasets can lead to agents that generalize broadly, analogous to similar advances in vision and NLP.
Aviral Kumar   +4 more
semanticscholar   +1 more source

Methods and software for solar power plant cluster management

open access: yes - Adaptivni Sistemi Avtomatičnogo Upravlinnâ, 2022
The object of this work is solar power plant management software. Solar panel production technologies are developing rapidly and investments in solar energy are growing, so users are interested in increasing energy production for a faster return on investment.
А. Мокрий, І. Баклан
doaj   +1 more source

Robust Q-Learning

open access: yes - Journal of the American Statistical Association, 2020
Q-learning is a regression-based approach that is widely used to formalize the development of an optimal dynamic treatment strategy. Finite dimensional working models are typically used to estimate certain nuisance parameters, and misspecification of these working models can result in residual confounding and/or efficiency loss.
Ashkan Ertefaie   +3 more
openaire   +4 more sources
