Results 1 to 10 of about 184,437

Quantile Markov Decision Processes. [PDF]

open access: yes · Oper Res, 2022
The goal of a traditional Markov decision process (MDP) is to maximize the expectation of cumulative reward over a finite or infinite horizon. In many applications, however, a decision maker may be interested in optimizing a specific quantile of the cumulative reward. For example, a physician may want ...
Li X, Zhong H, Brandeau ML.
europepmc   +7 more sources
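The snippet above contrasts optimizing the expectation of cumulative reward with optimizing a quantile of it. As a minimal illustration (not the paper's method), one can estimate a quantile of the cumulative reward of a fixed policy by Monte Carlo rollouts on a made-up two-state chain:

```python
import random

# Monte Carlo estimate of a quantile of cumulative reward under a fixed
# policy, on an invented 2-state chain (purely illustrative).
random.seed(0)

def episode(horizon=10):
    """Roll out one episode and return its cumulative reward."""
    s, total = 0, 0.0
    for _ in range(horizon):
        if s == 0:
            total += 1.0                       # state 0 pays reward 1
            s = 1 if random.random() < 0.3 else 0
        # state 1 is absorbing and pays reward 0
    return total

returns = sorted(episode() for _ in range(10_000))
q10 = returns[int(0.10 * len(returns))]        # empirical 10th percentile
median = returns[len(returns) // 2]            # empirical median
print(q10, median)
```

A risk-averse decision maker would compare policies by `q10` rather than by the mean of `returns`.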

An immediate-return reinforcement learning for the atypical Markov decision processes [PDF]

open access: yes · Frontiers in Neurorobotics, 2022
Atypical Markov decision processes (MDPs) are decision-making problems that maximize the immediate return over a single state transition. Many complex dynamic problems can be regarded as atypical MDPs, e.g., football trajectory control, approximations of ...
Zebang Pan   +6 more
doaj   +2 more sources

Learning Markov Decision Processes for Model Checking [PDF]

open access: yes · Electronic Proceedings in Theoretical Computer Science, 2012
Constructing an accurate system model for formal model verification can be both resource demanding and time-consuming. To alleviate this shortcoming, algorithms have been proposed for automatically learning system models based on observed system ...
Hua Mao   +5 more
doaj   +4 more sources

Entropic Regularization of Markov Decision Processes [PDF]

open access: yes · Entropy, 2019
An optimal feedback controller for a given Markov decision process (MDP) can in principle be synthesized by value or policy iteration. However, if the system dynamics and the reward function are unknown, a learning agent must discover an optimal ...
Boris Belousov, Jan Peters
doaj   +2 more sources
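The snippet above notes that an optimal controller for a known MDP can be synthesized by value or policy iteration. A minimal value-iteration sketch on an invented 2-state, 2-action MDP (the numbers are for demonstration only, not from the paper):

```python
import numpy as np

# Value iteration on a tiny illustrative MDP (transitions and rewards
# are made up; gamma is the discount factor).
n_states, n_actions, gamma = 2, 2, 0.9
P = np.array([[[0.8, 0.2], [0.1, 0.9]],    # P[s, a, s'] transition probs
              [[0.5, 0.5], [0.0, 1.0]]])
R = np.array([[1.0, 0.0],                  # R[s, a] expected reward
              [0.0, 2.0]])

V = np.zeros(n_states)
for _ in range(1000):
    # Bellman optimality backup: Q(s,a) = R(s,a) + gamma * sum_s' P(s,a,s') V(s')
    Q = R + gamma * P @ V
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-8:
        break
    V = V_new

policy = Q.argmax(axis=1)                  # greedy policy w.r.t. converged Q
print(V, policy)
```

When the dynamics `P` and rewards `R` are unknown, as the abstract discusses, the agent must estimate such backups from experience instead.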

Risk-Sensitive Markov Decision Processes of USV Trajectory Planning with Time-Limited Budget [PDF]

open access: yes · Sensors, 2023
Trajectory planning plays a crucial role in ensuring the safe navigation of ships, as it involves complex decision making influenced by various factors. This paper presents a heuristic algorithm, named the Markov decision process Heuristic Algorithm (MHA) ...
Yi Ding, Hongyang Zhu
doaj   +2 more sources

A Faster-Than Relation for Semi-Markov Decision Processes [PDF]

open access: yes · Electronic Proceedings in Theoretical Computer Science, 2020
When modeling concurrent or cyber-physical systems, non-functional requirements such as time are important to consider. In order to improve the timing aspects of a model, it is necessary to have some notion of what it means for a process to be faster ...
Mathias Ruggaard Pedersen   +2 more
doaj   +6 more sources

Composition of Web Services Using Markov Decision Processes and Dynamic Programming [PDF]

open access: yes · The Scientific World Journal, 2015
We propose a Markov decision process model for solving the Web service composition (WSC) problem. Iterative policy evaluation, value iteration, and policy iteration algorithms are used to experimentally validate our approach, with artificial and real ...
Víctor Uc-Cetina   +2 more
doaj   +2 more sources
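The entry above mentions iterative policy evaluation among the dynamic-programming algorithms used. A minimal sketch of policy evaluation for a fixed policy, on an invented 2-state MDP (numbers are illustrative, not from the paper):

```python
import numpy as np

# Iterative policy evaluation: solve V = R_pi + gamma * P_pi V by
# repeated application of the Bellman expectation backup.
gamma = 0.95
P_pi = np.array([[0.9, 0.1],     # P_pi[s, s'] under the fixed policy
                 [0.4, 0.6]])
R_pi = np.array([1.0, -1.0])     # R_pi[s] expected reward under the policy

V = np.zeros(2)
for _ in range(10_000):
    V_new = R_pi + gamma * P_pi @ V
    if np.max(np.abs(V_new - V)) < 1e-10:
        break
    V = V_new

# Closed-form check: V = (I - gamma * P_pi)^{-1} R_pi
V_exact = np.linalg.solve(np.eye(2) - gamma * P_pi, R_pi)
print(V, V_exact)
```

Policy iteration alternates this evaluation step with greedy policy improvement until the policy stops changing.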

Individual differences in tail risk sensitive exploration using Bayes-adaptive Markov decision processes [PDF]

open access: yes · eLife
Novelty is a double-edged sword for agents and animals alike: they might benefit from untapped resources or face unexpected costs or dangers such as predation. The conventional exploration/exploitation tradeoff is thus colored by risk sensitivity.
Tingke Shen, Peter Dayan
doaj   +2 more sources

Trace Refinement in Labelled Markov Decision Processes [PDF]

open access: yes · Logical Methods in Computer Science, 2020
Given two labelled Markov decision processes (MDPs), the trace-refinement problem asks whether for all strategies of the first MDP there exists a strategy of the second MDP such that the induced labelled Markov chains are trace-equivalent.
Nathanaël Fijalkow   +2 more
doaj   +1 more source

Synchronizing Objectives for Markov Decision Processes [PDF]

open access: yes · Electronic Proceedings in Theoretical Computer Science, 2011
We introduce synchronizing objectives for Markov decision processes (MDP). Intuitively, a synchronizing objective requires that eventually, at every step there is a state which concentrates almost all the probability mass.
Mahsa Shirmohammadi   +2 more
doaj   +1 more source
