Q-learning - Open Access .click

Results 111 to 120 of about 171,350 (313)

Stratum Corneum‐Inspired Zwitterionic Hydrogels with Intrinsic Water Retention and Anti‐Freezing Properties for Intelligent Flexible Sensors

Advanced Functional Materials, EarlyView.
A novel stratum corneum‐inspired zwitterionic hydrogel is developed for intelligent, flexible sensors, featuring intrinsic water retention and anti‐freezing properties. The quasi‐gel, composed of hygroscopic polymers and bound water, maintains its softness across a wide range of humidity.
Meng Wu +8 more
wiley +1 more source

Stochastic Primal-Dual Q-Learning [PDF]

arXiv, 2018
In this work, we present a new model-free and off-policy reinforcement learning (RL) algorithm that is capable of finding a near-optimal policy with state-action observations from arbitrary behavior policies. Our algorithm, called the stochastic primal-dual Q-learning (SPD Q-learning), hinges upon a new linear programming formulation and a dual ...
arxiv

Role of Pressure and Expansion on the Degradation in Solid‐State Silicon Batteries: Implementing Electrochemistry in Particle Dynamics

Advanced Functional Materials, EarlyView.
A simulation technique for assessing both the fabrication and operation of a solid‐state Si battery is demonstrated by integrating particle dynamics with mass/charge transport. Although the fabrication pressure (Pfab) increased the inter‐particle contacts and reduced the concentration (ηconc) and Li‐ion (ηLi+) overpotentials during discharging, it ...
Magnus So +4 more
wiley +1 more source

DQL: A New Updating Strategy for Reinforcement Learning Based on Q-Learning [PDF]

2001
Carlos E. Mariano, Eduardo F. Morales
openalex +1 more source

Faster Q-Learning Algorithms for Restless Bandits [PDF]

arXiv
We study the Whittle index learning algorithm for restless multi-armed bandits (RMAB). We first present the Q-learning algorithm and its variants: speedy Q-learning (SQL), generalized speedy Q-learning (GSQL), and phase Q-learning (PhaseQL). We also discuss two exploration policies: $\epsilon$-greedy and upper confidence bound (UCB).
arxiv

Reinforcement Learning for Traffic Control with Adaptive Horizon [PDF]

arXiv, 2019
This paper proposes a reinforcement learning approach for traffic control with an adaptive horizon. To build the controller for the traffic network, a Q-learning-based strategy is applied that controls the green light passing time at the network intersections.
arxiv

Temperature‐Resilient Polymeric Memristors for Effective Deblurring in Static and Dynamic Imaging

Advanced Functional Materials, EarlyView.
A thermally stable organic memristor based on a thiadiazolobenzotriazole (TBZ) and 2,5‐Dioctyl‐3,6‐di(thiophen‐2‐yl)pyrrolo[3,4‐c]pyrrole‐1,4(2H,5H)‐dione (DPP)‐based conjugated polymer is presented, demonstrating reliable, gradual resistance switching across a wide temperature range (153–573 K).
Ziyu Lv +15 more
wiley +1 more source

A Unified Switching System Perspective and O.D.E. Analysis of Q-Learning Algorithms [PDF]

NeurIPS 2020, 2019
In this paper, we introduce a unified framework for analyzing a large family of Q-learning algorithms, based on switching system perspectives and ODE-based stochastic approximation. We show that the nonlinear ODE models associated with these Q-learning algorithms can be formulated as switched linear systems, and analyze their asymptotic stability by ...
arxiv

Quantum Emitters in Hexagonal Boron Nitride: Principles, Engineering and Applications

Advanced Functional Materials, EarlyView.
Quantum emitters in hexagonal boron nitride have emerged as a promising candidate for quantum information science. This review examines the fundamentals of these quantum emitters, including their level structures, defect engineering, and their possible chemical structures.
Thi Ngoc Anh Mai +8 more
wiley +1 more source

Exploration design for Q-learning-based adaptive linear quadratic optimal regulators under stochastic disturbances

SICE Journal of Control, Measurement, and System Integration
This study considers a discrete-time, linear state feedback control strategy rooted in Q-learning, one of the Reinforcement Learning (RL) approaches, to address an adaptive Linear Quadratic (LQ) problem under stochastic disturbances. Q-learning optimizes
Vina Putri Virgiani, Shiro Masuda
doaj +1 more source

reinforcement learning
computer science
artificial intelligence

mathematics
fos: computer and information sciences
machine learning

robot
computer science - machine learning
machine learning cs.lg