HAO-AVP: An Entropy-Gini Reinforcement Learning Assisted Hierarchical Void Repair Protocol for Underwater Wireless Sensor Networks. [PDF]
Hao L, Ma C, Ao J.
europepmc +1 more source
Task Offloading and Resource Allocation Strategy in Non-Terrestrial Networks for Continuous Distributed Task Scenarios. [PDF]
Qi Y, Du Y, Guo Y, Hao J.
europepmc +1 more source
Classified modeling and day-ahead optimal scheduling of multi-type adjustable industrial loads in industrial microgrid using improved approximate dynamic programming. [PDF]
Sun T +5 more
europepmc +1 more source
Multi-Agent Reinforcement Learning in Games: Research and Applications. [PDF]
Li H +5 more
europepmc +1 more source
BeamCraft: Deep Reinforcement Learning-Driven Multi-Objective Beamforming for ISAC
Dao DN, Miao Y.
europepmc +1 more source
Continuous Time Discounted Jump Markov Decision Processes: A Discrete-Event Approach
Mathematics of Operations Research, 2004
This paper introduces and develops a new approach to the theory of continuous time jump Markov decision processes (CTJMDP). This approach reduces discounted CTJMDPs to discounted semi-Markov decision processes (SMDPs) and eventually to discrete-time Markov decision processes (MDPs).
openaire +4 more sources
The Transformation Method for Continuous-Time Markov Decision Processes
Journal of Optimization Theory and Applications, 2012
Piunovskiy, Alexey, Zhang, Yi
openaire +2 more sources
Preferred Rules in Continuous Time Markov Decision Processes
Management Science, 1974
Motivated by a planning horizon result for continuous time Markov decision chains, we study decision rules, called preferred, which may be used in the initially stationary part of nearly optimal policies. We characterize these rules and then, under conditions involving state recurrence and accessibility, consider finding such rules.
openaire +2 more sources

