Some of the following articles may not be open access.
Reinforcement learning for semi-Markov decision processes with applications
2023
This thesis focuses on semi-Markov decision processes and their connection with Reinforcement Learning via the Q-learning technique. We start by discussing some general ideas around Machine Learning, Reinforcement Learning and Hierarchical Reinforcement Learning.
Average Reward Reinforcement Learning for Semi-Markov Decision Processes
2017
In this paper, we study new reinforcement learning (RL) algorithms for semi-Markov decision processes (SMDPs) with an average reward criterion. Based on the discrete-time type Bellman optimality equation, we use incremental value iteration (IVI), stochastic shortest path (SSP) value iteration and bisection algorithms to derive novel RL algorithms in a ...
Jiayuan Yang +3 more
Constrained Discounted Semi-Markov Decision Processes
2002
This paper reduces problems on the existence and the finding of optimal policies for multiple criterion discounted SMDPs to similar problems for MDPs. We prove this reduction and illustrate it by extending to SMDPs several results for constrained discounted MDPs.
Semi-Markov decision processes with a reachable state-subset
Optimization, 1989
We consider the problem of minimizing the long-run average expected cost per unit time in a semi-Markov decision process with arbitrary state and action space. Assuming the existence of a Borel subset of state space called a reachable state-subset, we derive the optimality equation for the unbounded costs.
On the Optimality Conditions for Semi-Markov Decision Processes
1977
The paper presents a recurrence formula for the difference between expected rewards and sojourn times generated by N transitions of a semi-Markov decision process with finite state space. Using the recurrence formula, convergence of the policy iteration method can be easily verified and also necessary and sufficient optimality conditions for average optimal ...
Deterministic policy gradient algorithms for semi‐Markov decision processes
International Journal of Intelligent Systems, 2021
Ashkan Haji Hosseinloo +1 more
Learning Automaton for Finite Semi-Markov Decision Processes
1983
A finite semi-Markov decision process is studied to maximize the expected average reward. The semi-Markov kernel of the process depends on an unknown parameter taking values in a subset [a, b] of ℝS. A controller modelled as a learning automaton updates sequentially the probabilities of generating decisions based on the observed decisions, states, and ...
Reinforcement learning with options in semi Markov decision processes
2021
Master's thesis for: Master in Intelligent Interactive ...
Semi-Markov Decision-Making Processes with Vector Gains
Theory of Probability & Its Applications, 1984
Vinogradskaya, T. M. +2 more

