Markov decision process - Open Access .click

Some of the following articles may not be open access.

On constrained Markov decision processes

Operations Research Letters, 1996

Evidential Markov Decision Processes

2011
This paper proposes a new model, the EMDP (Evidential Markov Decision Process). It is an MDP (Markov Decision Process) for belief functions in which rewards are defined for each state transition, as in a classical MDP, whereas the transitions are modeled as in an EMC (Evidential Markov Chain), i.e.
Hélène Soubaras, Christophe Labreuche, Pierre Savéant +2 more

Competing Markov decision processes

Annals of Operations Research, 1991

Easy Affine Markov Decision Processes

Operations Research, 2019
Individuals, firms, and governments often face the challenge of making optimal decisions in a dynamic setting amidst a changing and uncertain environment. Although Markov decision processes (MDPs) provide a powerful modeling framework for such problems, solving an MDP is generally difficult.
Jie Ning, Matthew J. Sobel

Variability sensitive Markov decision processes

Proceedings of the 28th IEEE Conference on Decision and Control, 1992
Considered are time-average Markov Decision Processes (MDPs) with finite state and action spaces. Two definitions of variability are introduced, namely, the expected time-average variability and time-average expected variability. The two criteria are in general different, although they can both be employed to penalize for variance in the stream of ...
Melike Baykal-Gürsoy, Keith W. Ross

Coevolutive planning in Markov decision processes

Proceedings of the first international joint conference on Autonomous agents and multiagent systems part 2 - AAMAS '02, 2002
We investigate the idea of having groups of agents coevolve in order to iteratively refine multi-agent plans. This idea, which we call coevolution, is formalized and analyzed in general terms and applied to stochastic control frameworks that use an explicit model of the world: coevolution can be directly adapted to the frameworks of Multi-Agent ...
Bruno Scherrer, François Charpillet

On the significance of Markov decision processes

1997
Formulating the problem facing an intelligent agent as a Markov decision process (MDP) is increasingly common in artificial intelligence, reinforcement learning, artificial life, and artificial neural networks. In this short paper we examine some of the reasons for the appeal of this framework.

Policy Bounds for Markov Decision Processes

Operations Research, 1986
This paper demonstrates how a Markov decision process (MDP) can be approximated to generate a policy bound, i.e., a function that bounds the optimal policy from below or from above for all states. We present sufficient conditions for several computationally attractive approximations to generate rigorous policy bounds.

Accretive Operators and Markov Decision Processes

Mathematics of Operations Research, 1980
The dynamic programming functional equation for an abstract, continuous parameter, Markov decision process is shown to involve an operator which is m-accretive, thus giving rise to a nonlinear semigroup, called the Bellman semigroup. A class of controls is specified for which the maximum expected reward over a finite planning horizon is given by this ...

Markov Decision Process Measurement Model

Psychometrika, 2018
Within-task actions can provide additional information on student competencies but are challenging to model. This paper explores the potential of using a cognitive model for decision making, the Markov decision process, to provide a mapping between within-task actions and latent traits of interest. Psychometric properties of the model are explored, and
