Continuous-time Markov decision processes

HAO-AVP: An Entropy-Gini Reinforcement Learning Assisted Hierarchical Void Repair Protocol for Underwater Wireless Sensor Networks.

Sensors (Basel)
Hao L, Ma C, Ao J.

Task Offloading and Resource Allocation Strategy in Non-Terrestrial Networks for Continuous Distributed Task Scenarios.

Sensors (Basel)
Qi Y, Du Y, Guo Y, Hao J.

Classified modeling and day-ahead optimal scheduling of multi-type adjustable industrial loads in industrial microgrid using improved approximate dynamic programming.

Sci Rep
Sun T, Yang P, Sun Y, Luo X, Lu J, Duan K. +5 more

Multi-Agent Reinforcement Learning in Games: Research and Applications.

Biomimetics (Basel)
Li H, Yang P, Liu W, Yan S, Zhang X, Zhu D. +5 more

BeamCraft: Deep Reinforcement Learning-Driven Multi-Objective Beamforming for ISAC

Dao DN, Miao Y.

Some of the following articles may not be open access.

Related searches:

reinforcement learning
mathematics
markov decision processes

Continuous Time Discounted Jump Markov Decision Processes: A Discrete-Event Approach

Mathematics of Operations Research, 2004
This paper introduces and develops a new approach to the theory of continuous time jump Markov decision processes (CTJMDP). This approach reduces discounted CTJMDPs to discounted semi-Markov decision processes (SMDPs) and eventually to discrete-time Markov decision processes (MDPs).

The Transformation Method for Continuous-Time Markov Decision Processes

Journal of Optimization Theory and Applications, 2012
Piunovskiy A, Zhang Y.

Preferred Rules in Continuous Time Markov Decision Processes

Management Science, 1974
Motivated by a planning horizon result for continuous time Markov decision chains, we study decision rules, called preferred, which may be used in the initially stationary part of nearly optimal policies. We characterize these rules and then, under conditions involving state recurrence and accessibility, consider finding such rules.
