Results 11 to 20 of about 171,350
Maxmin Q-learning: Controlling the Estimation Bias of Q-learning [PDF]
ICLR ...
Lan, Qingfeng+3 more
openaire +3 more sources
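Only the title survives in this snippet, but the core mechanism of Maxmin Q-learning is known from the paper: keep N independent Q-estimates and bootstrap from their element-wise minimum to curb overestimation. A minimal tabular sketch; the environment sizes and hyperparameters below are illustrative assumptions, not the paper's settings:

```python
import numpy as np

N_ESTIMATORS, N_STATES, N_ACTIONS = 4, 10, 3   # illustrative sizes
GAMMA, ALPHA = 0.99, 0.1
rng = np.random.default_rng(0)

q_tables = np.zeros((N_ESTIMATORS, N_STATES, N_ACTIONS))

def maxmin_update(s, a, r, s_next):
    # Form the target from the element-wise minimum over all N estimators,
    # which counteracts the max-operator overestimation bias.
    q_min = q_tables.min(axis=0)              # shape: (N_STATES, N_ACTIONS)
    target = r + GAMMA * q_min[s_next].max()
    k = rng.integers(N_ESTIMATORS)            # update one randomly chosen table
    q_tables[k, s, a] += ALPHA * (target - q_tables[k, s, a])
```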
Smooth Q-learning: Accelerate Convergence of Q-learning Using Similarity [PDF]
An improvement of Q-learning is proposed in this paper. It differs from classic Q-learning in that the similarity between different states and actions is taken into account. During training, a new updating mechanism is used, in which the Q values of similar state-action pairs are updated synchronously. The proposed method can ...
Liao, Wei, Wei, Xiaohui, Lai, Jizhou
openaire +3 more sources
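Based only on the snippet above, here is a rough sketch of the synchronous update it describes: one TD target propagates to every state-action pair, weighted by similarity to the visited pair. The similarity kernel is a hypothetical stand-in, since the paper's actual measure is not shown:

```python
import numpy as np

N_STATES, N_ACTIONS = 10, 3
GAMMA, ALPHA = 0.99, 0.1
Q = np.zeros((N_STATES, N_ACTIONS))

def similarity(s1, a1, s2, a2):
    # Hypothetical kernel: full weight for the pair itself, decaying with
    # state distance, zero across different actions.
    return np.exp(-abs(s1 - s2)) if a1 == a2 else 0.0

def smooth_update(s, a, r, s_next):
    target = r + GAMMA * Q[s_next].max()
    # Update all state-action pairs at once, weighted by similarity to (s, a).
    for s2 in range(N_STATES):
        for a2 in range(N_ACTIONS):
            w = similarity(s, a, s2, a2)
            Q[s2, a2] += ALPHA * w * (target - Q[s2, a2])
```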
In Reinforcement Learning, the Q-learning algorithm provably converges to the optimal solution. However, as others have demonstrated, Q-learning can also overestimate action values and thereby spend too long exploring unhelpful states. Double Q-learning is a provably convergent alternative that mitigates some of the overestimation issues, though sometimes ...
openaire +3 more sources
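For reference, this is the standard Double Q-learning update the snippet alludes to (van Hasselt, 2010), as a minimal tabular sketch with illustrative sizes: one table selects the greedy action, the other evaluates it, which removes the single-estimator overestimation bias.

```python
import numpy as np

N_STATES, N_ACTIONS = 10, 3
GAMMA, ALPHA = 0.99, 0.1
rng = np.random.default_rng(0)

Q_A = np.zeros((N_STATES, N_ACTIONS))
Q_B = np.zeros((N_STATES, N_ACTIONS))

def double_q_update(s, a, r, s_next):
    # Decouple action selection from evaluation: the updated table picks the
    # argmax, the other table supplies its value.
    if rng.random() < 0.5:
        a_star = Q_A[s_next].argmax()
        target = r + GAMMA * Q_B[s_next, a_star]
        Q_A[s, a] += ALPHA * (target - Q_A[s, a])
    else:
        a_star = Q_B[s_next].argmax()
        target = r + GAMMA * Q_A[s_next, a_star]
        Q_B[s, a] += ALPHA * (target - Q_B[s, a])
```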
QSPCA: A two-stage efficient power control approach in D2D communication for 5G networks
The existing literature on device-to-device (D2D) architecture suffers from a dearth of analysis under imperfect channel conditions, and rigorous analysis of policy improvement and network-performance evaluation is needed. Accordingly, a two-
Saurabh Chandra+4 more
doaj +1 more source
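The snippet is cut off before the method itself, so the following is only a generic sketch of Q-learning-based power control in a D2D setting, not the paper's two-stage QSPCA procedure; the power grid, SINR binning, and reward are all assumptions:

```python
import numpy as np

POWER_LEVELS = np.array([1.0, 5.0, 10.0, 20.0])   # hypothetical transmit-power grid
N_SINR_BINS, GAMMA, ALPHA, EPS = 8, 0.9, 0.1, 0.1
rng = np.random.default_rng(0)
Q = np.zeros((N_SINR_BINS, len(POWER_LEVELS)))

def select_power(sinr_bin):
    # Epsilon-greedy choice over discrete transmit power levels.
    if rng.random() < EPS:
        return int(rng.integers(len(POWER_LEVELS)))
    return int(Q[sinr_bin].argmax())

def update(sinr_bin, p_idx, reward, next_sinr_bin):
    # The reward would trade off D2D throughput against interference caused
    # to cellular users (details depend on the system model).
    target = reward + GAMMA * Q[next_sinr_bin].max()
    Q[sinr_bin, p_idx] += ALPHA * (target - Q[sinr_bin, p_idx])
```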
Image quality improvement in low‐dose chest CT with deep learning image reconstruction
Abstract. Objectives: To investigate the clinical utility of deep learning image reconstruction (DLIR) for improving image quality in low-dose chest CT, in comparison with the 40% adaptive statistical iterative reconstruction-Veo (ASiR-V40%) algorithm. Methods: This retrospective study included 86 patients who underwent low-dose CT for lung cancer screening ...
Qian Tian+7 more
wiley +1 more source
Generalized Speedy Q-Learning [PDF]
In this paper, we derive a generalization of the Speedy Q-learning (SQL) algorithm that was proposed in the Reinforcement Learning (RL) literature to handle slow convergence of Watkins' Q-learning. In most RL algorithms such as Q-learning, the Bellman equation and the Bellman operator play an important role.
Indu John+2 more
openaire +3 more sources
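For context, here is a sketch of the original Speedy Q-learning update that the listed paper generalizes (Azar et al., 2011): SQL applies the empirical Bellman operator to both the current and previous iterates and adds an aggressive correction term. Table sizes and the step-size schedule below are illustrative:

```python
import numpy as np

N_STATES, N_ACTIONS, GAMMA = 10, 3, 0.99
Q_prev = np.zeros((N_STATES, N_ACTIONS))   # Q_{k-1}
Q_curr = np.zeros((N_STATES, N_ACTIONS))   # Q_k

def empirical_bellman(Q, r, s_next):
    # One-sample estimate of the Bellman operator T on the observed transition.
    return r + GAMMA * Q[s_next].max()

def speedy_update(k, s, a, r, s_next):
    global Q_prev
    alpha = 1.0 / (k + 1)                  # SQL's decaying step size
    t_prev = empirical_bellman(Q_prev, r, s_next)
    t_curr = empirical_bellman(Q_curr, r, s_next)
    new = (Q_curr[s, a]
           + alpha * (t_prev - Q_curr[s, a])
           + (1.0 - alpha) * (t_curr - t_prev))
    Q_prev = Q_curr.copy()                 # current iterate becomes the previous one
    Q_curr[s, a] = new
```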
Abstract. Purpose: To evaluate the impact of various noise reduction algorithms and template matching parameters on the accuracy of markerless tumor tracking (MTT) using dual-energy (DE) imaging. Methods: A Varian TrueBeam linear accelerator was used to acquire a series of alternating 60 and 120 kVp images (over a 180° arc) using fast kV switching, on ...
Mandeep Kaur+9 more
wiley +1 more source
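As a rough illustration of the template-matching step such MTT pipelines rely on (not the study's actual implementation): locate the tumor template in a kV frame via normalized cross-correlation, with optional pre-filtering of the kind the study compares. File names and filter settings are placeholders:

```python
import cv2
import numpy as np

# Placeholder inputs; the study's images and templates are not available here.
frame = cv2.imread("kv_frame.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)
template = cv2.imread("tumor_template.png", cv2.IMREAD_GRAYSCALE).astype(np.float32)

# Optional noise reduction before matching (one of the factors the study evaluates).
frame = cv2.GaussianBlur(frame, (5, 5), 0)

# Normalized cross-correlation; the peak gives the best template position.
scores = cv2.matchTemplate(frame, template, cv2.TM_CCOEFF_NORMED)
_, max_val, _, max_loc = cv2.minMaxLoc(scores)
print(f"best match at {max_loc} with NCC score {max_val:.3f}")
```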
Frame Size Optimization for Dynamic Framed Slotted ALOHA in RFID Systems
In recent years, the State Grid has actively promoted the construction of the ubiquitous power Internet of Things, so as to realize the interconnection and optimized management of devices in the power system. Specifically, radio frequency identification (RFID)
HE Jindong, BU Yanling, SHI Congcong, XIE Lei
doaj +1 more source
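For background on the technique in the title: dynamic framed slotted ALOHA re-sizes each frame from the previous frame's slot statistics, since framed-ALOHA throughput peaks when the frame length matches the unread tag backlog. A sketch using Schoute's classic backlog estimate, which is one common choice and not necessarily this paper's:

```python
def next_frame_size(n_collision: int, min_size: int = 16, max_size: int = 256) -> int:
    # Schoute's estimate: a collision slot hides about 2.39 tags on average,
    # so the unread backlog is roughly 2.39 * (number of collision slots).
    backlog = 2.39 * n_collision
    # Throughput is maximized when frame size ~= backlog; clamp to reader limits.
    return max(min_size, min(max_size, round(backlog)))

# Example: 40 collision slots suggest ~96 unread tags, so the reader
# would issue the next frame with about 96 slots.
print(next_frame_size(40))   # -> 96
```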
Expert Q-learning: Deep Reinforcement Learning with Coarse State Values from Offline Expert Examples [PDF]
In this article, we propose a novel algorithm for deep reinforcement learning named Expert Q-learning. Expert Q-learning is inspired by Dueling Q-learning and aims to incorporate semi-supervised learning into reinforcement learning by splitting Q-values into state values and action advantages. We require that an offline expert assess the value ...
arxiv +1 more source
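A minimal sketch of the dueling-style split the snippet describes, with the offline expert's coarse state values used as an auxiliary target for V(s). The network sizes, the expert signal, and the loss weighting are illustrative assumptions, not the paper's configuration:

```python
import torch
import torch.nn as nn

class DuelingQNet(nn.Module):
    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 64):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.value_head = nn.Linear(hidden, 1)          # V(s)
        self.adv_head = nn.Linear(hidden, n_actions)    # A(s, a)

    def forward(self, obs):
        h = self.trunk(obs)
        v = self.value_head(h)
        a = self.adv_head(h)
        # Q(s, a) = V(s) + A(s, a), with the mean advantage subtracted
        # for identifiability, as in Dueling Q-learning.
        return v, v + a - a.mean(dim=-1, keepdim=True)

net = DuelingQNet(obs_dim=8, n_actions=4)
obs = torch.randn(32, 8)
expert_v = torch.randn(32, 1)    # coarse state values from the offline expert (assumed)
v, q = net(obs)
td_loss = torch.zeros(())        # placeholder for the usual TD loss on q
loss = td_loss + 0.5 * nn.functional.mse_loss(v, expert_v)
loss.backward()
```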
Vehicular and flying ad hoc networks (VANETs and FANETs) are becoming increasingly important with the development of smart cities and intelligent transportation systems (ITSs).
Pavle Bugarčić+2 more
doaj +1 more source