Markov and semi-Markov decision processes


Some of the following articles may not be open access.

Reinforcement learning for semi-Markov decision processes with applications

2023
This thesis focuses on semi-Markov decision processes and their connection with Reinforcement Learning via the Q-learning technique. We start by discussing some general ideas around Machine Learning, Reinforcement Learning and Hierarchical Reinforcement Learning.

Average Reward Reinforcement Learning for Semi-Markov Decision Processes

2017
In this paper, we study new reinforcement learning (RL) algorithms for Semi-Markov decision processes (SMDPs) with an average reward criterion. Based on the discrete-time type Bellman optimality equation, we use incremental value iteration (IVI), stochastic shortest path (SSP) value iteration and bisection algorithms to derive novel RL algorithms in a ...
Jiayuan Yang, Yanjie Li, Haoyao Chen, Jiangang Li +3 more

Constrained Discounted Semi-Markov Decision Processes

2002
This paper reduces problems on the existence and the finding of optimal policies for multiple criterion discounted SMDPs to similar problems for MDPs. We prove this reduction and illustrate it by extending to SMDPs several results for constrained discounted MDPs.

Semi-Markov decision processes with a reachable state-subset

Optimization, 1989
We consider the problem of minimizing the long-run average expected cost per unit time in a semi-Markov decision process with arbitrary state and action spaces. Assuming the existence of a Borel subset of the state space called a reachable state-subset, we derive the optimality equation for unbounded costs.

On the Optimality Conditions for Semi-Markov Decision Processes

1977
The paper presents a recurrence formula for the difference between expected rewards and sojourn times generated by N transitions of a semi-Markov decision process with finite state space. Using the recurrence formula, convergence of the policy iteration method can be easily verified, and also necessary and sufficient optimality conditions for average optimal ...

Deterministic policy gradient algorithms for semi‐Markov decision processes

International Journal of Intelligent Systems, 2021
Ashkan Haji Hosseinloo, Munther A. Dahleh +1 more

Learning Automaton for Finite Semi-Markov Decision Processes

1983
A finite semi-Markov decision process is studied to maximize the expected average reward. The semi-Markov kernel of the process depends on an unknown parameter taking values in a subset [a, b] of ℝ^S. A controller modelled as a learning automaton updates sequentially the probabilities of generating decisions based on the observed decisions, states, and ...

Reinforcement learning with options in semi Markov decision processes

2021
Master's thesis for: Master in Intelligent Interactive ...

Semi-Markov Decision-Making Processes with Vector Gains

Theory of Probability & Its Applications, 1984
Vinogradskaya, T. M., Geninson, B. A., Rubchinskij, A. A. +2 more

Semi-Markov Decision Processes with Vector Pay-Offs

2022