Results 251 to 260 of about 48,213 (289)
Some of the following articles may not be open access.
2007
In Chapter 2, we introduced the basic principles of PA and derived the performance derivative formulas for queueing networks and Markov and semi-Markov systems with these principles. In Chapter 3, we developed sample-path-based (on-line learning) algorithms for estimating the performance derivatives and sample-path-based optimization schemes.
2021
As discussed in Chapter 1, reinforcement learning involves sequential decision-making. In this chapter, we will formalize the notion of using stochastic processes under the branch of probability that models sequential decision-making behavior. While most of the problems we study in reinforcement learning are modeled as Markov decision processes (MDP ...
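As a point of reference for the formalization described in the entry above, an MDP is commonly written as a tuple with its Bellman optimality equation; this is the standard textbook definition, not a quotation from the chapter itself:

\[
\mathcal{M} = (\mathcal{S}, \mathcal{A}, P, R, \gamma), \qquad
P(s' \mid s, a) = \Pr(S_{t+1} = s' \mid S_t = s,\, A_t = a),
\]
\[
V^{*}(s) = \max_{a \in \mathcal{A}} \Big[\, R(s,a) + \gamma \sum_{s' \in \mathcal{S}} P(s' \mid s,a)\, V^{*}(s') \,\Big].
\]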
2009
Markov chains provide a useful modeling tool for determining expected profits or costs associated with certain types of systems. The key characteristic that allows for a Markov model is a probability law in which the future behavior of the system is independent of the past behavior given the present condition of the system. When this Markov property is
Richard M. Feldman +1 more
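The expected-cost calculation alluded to in the entry above can be illustrated with a minimal sketch: under the Markov property, the expected total discounted cost from each state satisfies v = c + βPv. The transition matrix, cost vector, and discount factor below are invented for illustration and are not taken from the cited chapter.

```python
import numpy as np

# Hypothetical 3-state chain: P[i, j] = probability of moving from state i to j.
P = np.array([[0.9, 0.1, 0.0],
              [0.2, 0.7, 0.1],
              [0.0, 0.3, 0.7]])
c = np.array([1.0, 4.0, 10.0])   # hypothetical per-state one-step costs
beta = 0.95                      # discount factor

# Markov property => v = c + beta * P v, a linear system solved directly.
v = np.linalg.solve(np.eye(3) - beta * P, c)
print(v)  # expected total discounted cost starting from each state
```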
2013
We provide a formal description of the discounted reward MDP framework in Chap. 1, including both the finite- and the infinite-horizon settings and summarizing the associated optimality equations. We then present the well-known exact solution algorithms, value iteration and policy iteration, and outline a framework of rolling-horizon control (also ...
Hyeong Soo Chang +3 more
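Since the entry above centers on value iteration and policy iteration for the discounted-reward MDP, a compact value-iteration sketch may be useful; the array layout and names here are assumptions for illustration, and the book's own presentation may differ in notation.

```python
import numpy as np

def value_iteration(P, R, gamma=0.95, tol=1e-8):
    """Value iteration for a finite discounted MDP.

    P: shape (A, S, S), P[a, s, s'] = transition probability under action a
    R: shape (A, S),    R[a, s]     = expected one-step reward
    Returns the optimal value function and a greedy policy.
    """
    n_actions, n_states, _ = P.shape
    V = np.zeros(n_states)
    while True:
        # Q[a, s] = R[a, s] + gamma * sum_{s'} P[a, s, s'] * V[s']
        Q = R + gamma * P @ V
        V_new = Q.max(axis=0)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, Q.argmax(axis=0)
        V = V_new
```

A policy-iteration variant would instead alternate policy evaluation (solving the linear system for the current policy's value) with greedy policy improvement.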
European Journal of Operational Research, 1989
The paper is an introduction to Markov decision processes mainly addressed to possible applicants. Therefore it presents a finite model only, but a broad variety of objectives, algorithms (e.g. aggregation), and extensions (e.g. semi-Markov, partially observed, adaptive multiobjective, and constrained models).
White, Chelsea C. III, White, Douglas J.
Variance-Penalized Markov Decision Processes
Mathematics of Operations Research, 1989
We consider a Markov decision process with both the expected limiting average, and the discounted total return criteria, appropriately modified to include a penalty for the variability in the stream of rewards. In both cases we formulate appropriate nonlinear programs in the space of state-action frequencies (averaged, or discounted) whose optimal ...
Filar, Jerzy A. +2 more
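In the average-reward case, the nonlinear program over state-action frequencies mentioned in the abstract is often written along the following lines; this is a generic reconstruction of a variance-penalized objective, not the paper's exact statement:

\[
\max_{x \ge 0}\;\; \sum_{s,a} x_{sa}\, r(s,a) \;-\; \lambda \Big( \sum_{s,a} x_{sa}\, r(s,a)^2 - \big( \textstyle\sum_{s,a} x_{sa}\, r(s,a) \big)^2 \Big)
\]
subject to the usual long-run frequency constraints
\[
\sum_{a} x_{ja} = \sum_{s,a} x_{sa}\, p(j \mid s,a) \quad \forall j, \qquad \sum_{s,a} x_{sa} = 1,
\]
where \(x_{sa}\) is the long-run frequency of visiting state \(s\) and taking action \(a\), and \(\lambda \ge 0\) weights the variability penalty.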
2015
This chapter introduces sequential decision problems, in particular Markov decision processes (MDPs). A formal definition of an MDP is given, and the two most common solution techniques are described: value iteration and policy iteration. Then, factored MDPs are described, which provide a representation based on graphical models to solve very large ...
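The "representation based on graphical models" referred to above typically means that the next state decomposes into variables whose transition probabilities depend only on a small set of parent variables; a generic form (not the chapter's exact notation) is

\[
P(s' \mid s, a) \;=\; \prod_{i=1}^{n} P\big(s'_i \mid \mathrm{pa}(s'_i),\, a\big),
\]
where the state \(s = (s_1, \dots, s_n)\) is a vector of variables and \(\mathrm{pa}(s'_i)\) denotes the parents of \(s'_i\) in the dynamic Bayesian network associated with action \(a\).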
In situ learning using intrinsic memristor variability via Markov chain Monte Carlo sampling
Nature Electronics, 2021
Thomas Dalgaty +2 more

