Time-inhomogeneous finite-horizon Markov decision processes (MDPs) are frequently employed to model decision-making in dynamic treatment regimes and other statistical reinforcement learning (RL) scenarios. These fields, especially healthcare and business, often face challenges such as high-dimensional state spaces and time-inhomogeneity of the MDP ...
Chen, Elynn, Li, Sai, Jordan, Michael I.
The use of target networks is a common practice in deep reinforcement learning for stabilizing the training; however, theoretical understanding of this technique is still limited. In this paper, we study the so-called periodic Q-learning algorithm (PQ-learning for short), which resembles the technique used in deep Q-learning for solving infinite ...
Lee, Donghwan, He, Niao
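As a rough illustration of the target-network idea this abstract refers to, the following is a minimal tabular sketch in which the bootstrap values come from a frozen copy of the Q table that is refreshed every `period` steps. It is not the authors' PQ-learning algorithm: the function name `periodic_q_learning`, the toy chain environment `chain_step`, and all hyperparameters are assumptions made for this example.

```python
import random

def chain_step(s, a):
    """Hypothetical 4-state chain: action 1 moves right, action 0 moves left.
    Reaching state 3 yields reward 1 and ends the episode."""
    s_next = min(s + 1, 3) if a == 1 else max(s - 1, 0)
    done = s_next == 3
    return s_next, (1.0 if done else 0.0), done

def periodic_q_learning(step, n_states, n_actions, episodes=300, period=20,
                        alpha=0.1, gamma=0.9, epsilon=0.3, seed=0):
    rng = random.Random(seed)
    Q = [[0.0] * n_actions for _ in range(n_states)]
    target = [row[:] for row in Q]  # frozen copy used for bootstrapping
    t = 0
    for _ in range(episodes):
        s, done = 0, False
        while not done:
            # Epsilon-greedy action selection on the online table.
            if rng.random() < epsilon:
                a = rng.randrange(n_actions)
            else:
                a = max(range(n_actions), key=lambda i: Q[s][i])
            s_next, r, done = step(s, a)
            # Bootstrap from the frozen target table, not from Q itself.
            bootstrap = 0.0 if done else gamma * max(target[s_next])
            Q[s][a] += alpha * (r + bootstrap - Q[s][a])
            t += 1
            if t % period == 0:
                target = [row[:] for row in Q]  # periodic synchronization
            s = s_next
    return Q

Q = periodic_q_learning(chain_step, n_states=4, n_actions=2)
```

Between synchronizations the regression target stays fixed, which is the stabilizing effect usually attributed to target networks; a shorter `period` brings the sketch closer to ordinary Q-learning.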
Remanufacturing is regarded as a sustainable manufacturing paradigm of energy conservation and environment protection. To improve the efficiency of the remanufacturing process, this work investigates an integrated scheduling problem for disassembly and ...
Fuquan Wang +4 more
Cyberattack Correlation and Mitigation for Distribution Systems via Machine Learning
Cyber-physical system security for electric distribution systems is critical. In direct switching attacks, often coordinated, attackers seek to toggle remote-controlled switches in the distribution network.
Jennifer Appiah-Kubi, Chen-Ching Liu
Distributional reinforcement learning (distributional RL) has seen empirical success in complex Markov Decision Processes (MDPs) in the setting of nonlinear function approximation. However, there are many different ways in which one can leverage the distributional approach to reinforcement learning.
Doan, Thang +2 more
To address voltage violations in distribution networks caused by the intermittency and uncertainty of distributed photovoltaic (PV) generation, this paper proposes an optimal cluster dispatch strategy integrating Q-learning and pinning consensus ...
GONG Diyang +5 more
Expertness based cooperative Q-learning
By using other agents' experiences and knowledge, a learning agent may learn faster, make fewer mistakes, and create some rules for unseen situations. These benefits would be gained if the learning agent can extract proper rules from the other agents' knowledge for its own requirements.
Ahmadabadi, M. N., Asadpour, M.
Smooth Q-learning: Accelerate Convergence of Q-learning Using Similarity
This paper proposes an improvement to Q-learning. Unlike classic Q-learning, the proposed method takes into account the similarity between different states and actions. During training, a new updating mechanism is used in which the Q values of similar state-action pairs are updated synchronously. The proposed method can ...
Liao, Wei, Wei, Xiaohui, Lai, Jizhou
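The synchronous update of similar state-action pairs described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's method: the `similarity` kernel (exponential decay over integer state distance, same action only), the `threshold` cutoff, and the function name `smooth_q_update` are all hypothetical choices for this example.

```python
import math

def similarity(s, a, s2, a2, bandwidth=1.0):
    """Hypothetical similarity in [0, 1]: nearby integer states
    that share the same action count as similar."""
    if a != a2:
        return 0.0
    return math.exp(-abs(s - s2) / bandwidth)

def smooth_q_update(Q, s, a, r, s_next, alpha=0.5, gamma=0.9, threshold=0.1):
    """Compute one TD error for (s, a), then apply it to every
    sufficiently similar pair, scaled by its similarity weight."""
    n_actions = len(Q[0])
    # The TD error is computed once, before any entry is modified.
    td_error = r + gamma * max(Q[s_next]) - Q[s][a]
    for s2 in range(len(Q)):
        for a2 in range(n_actions):
            w = similarity(s, a, s2, a2)
            if w >= threshold:
                Q[s2][a2] += alpha * w * td_error
    return td_error

# Five states, two actions, all Q values start at zero.
Q = [[0.0, 0.0] for _ in range(5)]
smooth_q_update(Q, s=2, a=1, r=1.0, s_next=3)
```

In this sketch the visited pair receives the full step, while neighboring states with the same action receive a smaller, similarity-weighted share of the same TD error, so one transition informs several table entries at once.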
Serverless computing has evolved as a prominent paradigm within cloud computing, providing on-demand resource provisioning and capabilities crucial to Science and Technology for Energy Transition (STET) applications.
Kaur Jasmine, Chana Inderveer, Bala Anju

