Plenary Abstracts Session & Oral Presentations
HemaSphere, Volume 10, Issue S1, June 2026.
wiley +1 more source
Some of the following articles may not be open access.
On Average Reward Semi-Markov Decision Processes with a General Multichain Structure
Mathematics of Operations Research, 2004
In this paper we investigate average reward semi-Markov decision processes with a general multichain structure using a data-transformation method. By solving the transformed discrete-time average Markov decision processes, we can obtain significant and interesting information on the original average semi-Markov decision processes. If the original semi-
Jianyong Liu, Xiaobo Zhao
openaire +4 more sources
SEMI-MARKOV DECISION PROCESSES
Probability in the Engineering and Informational Sciences, 2007
Considered are semi-Markov decision processes (SMDPs) with finite state and action spaces. We study two criteria: the expected average reward per unit time subject to a sample path constraint on the average cost per unit time and the expected time-average variability.
M. Baykal-Gürsoy, K. Gürsoy
openaire +2 more sources
Risk-aware semi-Markov decision processes
2017 IEEE 56th Annual Conference on Decision and Control (CDC), 2017
In this work we construct a basic theory of risk-aware continuous-time Markov decision processes, and even more broadly, that of semi-Markov decision processes. Methods that account for the preferences of risk-aware agents have been introduced and studied in the context of discrete time problems, however, there has been virtually no such development ...
Jukka Isohätälä, William B. Haskell
openaire +1 more source
Policy Gradient Semi-Markov Decision Process
2008 20th IEEE International Conference on Tools with Artificial Intelligence, 2008
This paper proposes a simulation-based algorithm for optimizing the average reward in a parameterized continuous-time, finite-state semi-Markov decision process (SMDP). Our contributions are twofold: First, we compute the approximate gradient of the average reward with respect to the parameters in SMDP controlled by parameterized stochastic policies ...
Ngo, Vien, Chung, TaeChoong
openaire +1 more source

