Results 211 to 220 of about 68,246 (249)
Some of the following articles may not be open access.

Constrained Semi-Markov decision processes with average rewards

ZOR Zeitschrift für Operations Research / Mathematical Methods of Operations Research, 1994
This paper deals with constrained average-reward semi-Markov decision processes with finite state and action sets. Two average reward criteria are considered, namely time average and ratio average. The author proves the existence of optimal mixed stationary policies and shows, under the unichain condition, the existence of randomized stationary ...

Semi-Markov decision processes with polynomial reward

Journal of Applied Probability, 1982
A semi-Markov decision process with a denumerable multidimensional state space is considered. At any given state only a finite number of actions can be taken to control the process. The immediate reward earned in one transition period is merely assumed to be bounded by a polynomial, and a bound is imposed on a weighted moment of the next state reached.

Discrete-time equivalence for constrained semi-Markov decision processes

1985 24th IEEE Conference on Decision and Control, 1985
A continuous-time average reward Markov decision process problem is most easily solved in terms of an equivalent discrete-time Markov decision process (DMDP); customary hypotheses include that the process is a Markov jump process with denumerable state space and bounded transition rates, that actions are chosen at the jump points of the process, and ...
Frederick Beutler, Keith Ross

Constrained Discounted Semi-Markov Decision Processes

2002
This paper reduces problems on the existence and the finding of optimal policies for multiple criterion discounted SMDPs to similar problems for MDPs. We prove this reduction and illustrate it by extending to SMDPs several results for constrained discounted MDPs.

Uniformization for semi-Markov decision processes under stationary policies

Journal of Applied Probability, 1987
Uniformization permits the replacement of a semi-Markov decision process (SMDP) by a Markov chain exhibiting the same average rewards for simple (non-randomized) policies. It is shown that various anomalies may occur, especially for stationary (randomized) policies; uniformization introduces virtual jumps with concomitant action changes not present in ...
Beutler, Frederick J., Ross, Keith W.
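For the simplest case, a continuous-time Markov chain rather than a full SMDP, the uniformization construction can be sketched as follows (the generator and rates here are a hypothetical toy example; the anomalies the paper studies arise only in the richer randomized-policy setting):

```python
import numpy as np

# Generator matrix of a toy 2-state continuous-time Markov chain
# (illustrative only; the paper treats the harder SMDP case).
Q = np.array([[-2.0,  2.0],
              [ 1.0, -1.0]])

# Uniformization constant: any rate at least the largest exit rate.
lam = max(-np.diag(Q))          # here 2.0

# Transition matrix of the uniformized discrete-time chain.
P = np.eye(2) + Q / lam

# The two chains share the same stationary distribution, hence the
# same long-run average reward under a fixed simple policy.
pi = np.array([1.0, 2.0]) / 3.0  # solves pi @ Q = 0
```

For simple (non-randomized) policies this preserves average rewards exactly; the abstract's point is that the equivalence can break down for stationary randomized policies, where uniformization introduces virtual jumps at which the action may change.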

Reinforcement learning for semi-Markov decision processes with applications

2023
This thesis focuses on semi-Markov decision processes and their connection with Reinforcement Learning via Q-learning technique. We start by discussing some general ideas around Machine Learning, Reinforcement Learning and Hierarchical Reinforcement Learning.
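A common form of the Q-learning update for discounted SMDPs discounts by the random sojourn time of each transition; a minimal sketch, assuming exponential discounting at rate `beta` (the function name, constants, and toy transition are illustrative, not taken from the thesis):

```python
import numpy as np

def smdp_q_update(Q, s, a, r, tau, s_next, alpha=0.1, beta=0.5):
    """One Q-learning step for a discounted SMDP.

    The sojourn time tau enters through the discount factor
    exp(-beta * tau); r is the lump-sum reward of the transition.
    Names and constants here are illustrative only.
    """
    target = r + np.exp(-beta * tau) * np.max(Q[s_next])
    Q[s, a] += alpha * (target - Q[s, a])
    return Q

# Tiny example: 2 states, 2 actions, one observed transition
# (state 0, action 1, reward 1.0, sojourn time 2.0, next state 1).
Q = np.zeros((2, 2))
Q = smdp_q_update(Q, s=0, a=1, r=1.0, tau=2.0, s_next=1)
```

The only difference from ordinary Q-learning is that the discount factor varies per transition with the observed sojourn time instead of being a fixed constant.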

Performance Optimization of Semi-Markov Decision Processes with Discounted-cost Criteria

European Journal of Control, 2008
Yin, Baoqun   +3 more

Finite horizon semi-Markov decision processes with multiple constraints

Proceeding of the 11th World Congress on Intelligent Control and Automation, 2014
This paper focuses on solving a finite horizon semi-Markov decision process with multiple constraints. We convert the problem to a constrained absorbing discrete-time Markov decision process and then to an equivalent linear program over a class of occupancy measures.
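The occupancy-measure linear program such reductions lead to can be sketched on a small discounted MDP (a hypothetical 2-state model in the standard infinite-horizon discounted formulation, not the paper's finite-horizon absorbing construction):

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical 2-state, 2-action discounted MDP (not the paper's model).
nS, nA, gamma = 2, 2, 0.9
P = np.zeros((nS, nA, nS))          # P[s, a, s']
P[0, 0] = [0.8, 0.2]; P[0, 1] = [0.2, 0.8]
P[1, 0] = [0.5, 0.5]; P[1, 1] = [0.9, 0.1]
r = np.array([[1.0, 0.0],
              [0.0, 2.0]])          # r[s, a]
mu = np.array([0.5, 0.5])           # initial state distribution

# Variables: occupancy measure rho(s, a), flattened row-major.
# Flow constraint per state s:
#   sum_a rho(s, a) - gamma * sum_{s', a} P[s', a, s] rho(s', a) = mu(s)
A_eq = np.zeros((nS, nS * nA))
for s in range(nS):
    for s2 in range(nS):
        for a in range(nA):
            A_eq[s, s2 * nA + a] = (s == s2) - gamma * P[s2, a, s]

# linprog minimizes, so negate the reward objective.
res = linprog(c=-r.flatten(), A_eq=A_eq, b_eq=mu,
              bounds=[(0, None)] * (nS * nA))
rho = res.x.reshape(nS, nA)

# An optimal (in general randomized) stationary policy falls out of rho.
policy = rho / rho.sum(axis=1, keepdims=True)
```

Additional cost constraints would simply add inequality rows (`A_ub`, `b_ub`) in the same occupancy-measure variables, which is what makes this formulation convenient for the constrained case.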

Learning Automaton for Finite Semi-Markov Decision Processes

1983
A finite semi-Markov decision process is studied to maximize the expected average reward. The semi-Markov kernel of the process depends on an unknown parameter taking values in a subset [a, b] of ℝ^S. A controller modelled as a learning automaton updates sequentially the probabilities of generating decisions based on the observed decisions, states, and ...

Semi-Markov decision processes with a reachable state-subset

Optimization, 1989
We consider the problem of minimizing the long-run average expected cost per unit time in a semi-Markov decision process with arbitrary state and action spaces. Assuming the existence of a Borel subset of the state space, called a reachable state-subset, we derive the optimality equation for unbounded costs.
