Results 31 to 40 of about 184,437 (258)

A tutorial introduction to reinforcement learning

open access: yesSICE Journal of Control, Measurement, and System Integration, 2023
In this paper, we present a brief survey of reinforcement learning, with particular emphasis on stochastic approximation (SA) as a unifying theme. The scope of the paper includes Markov reward processes, Markov decision processes, SA algorithms, and ...
Mathukumalli Vidyasagar
doaj   +1 more source
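The survey above uses stochastic approximation (SA) as its unifying theme. As a generic illustration (not taken from the paper itself), the classic Robbins-Monro iterate x_{n+1} = x_n + a_n (y_n - x_n) with steps a_n = 1/(n+1) reduces to a running average, which is the simplest SA scheme; the function name and data below are made up for the sketch:

```python
import random

def robbins_monro_mean(samples):
    """Estimate the mean of a sample stream via the SA iterate
    x_{n+1} = x_n + a_n * (y_n - x_n), with step size a_n = 1/(n+1).

    With this particular step schedule the iterate equals the
    running sample average, so it converges to the true mean.
    """
    x = 0.0
    for n, y in enumerate(samples):
        a = 1.0 / (n + 1)
        x += a * (y - x)
    return x

# Noisy observations of an unknown mean (3.0); the estimate converges.
random.seed(0)
data = [random.gauss(3.0, 1.0) for _ in range(10000)]
estimate = robbins_monro_mean(data)
```

Temporal-difference and Q-learning updates in reinforcement learning have this same "current value plus step size times error" shape, which is why SA theory covers their convergence.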

Smart Sampling for Lightweight Verification of Markov Decision Processes [PDF]

open access: yes, 2015
Markov decision processes (MDP) are useful to model optimisation problems in concurrent systems. To verify MDPs with efficient Monte Carlo techniques requires that their nondeterminism be resolved by a scheduler.
D'Argenio, Pedro   +3 more
core   +5 more sources

Markov Decision Processes

open access: yesJahresbericht der Deutschen Mathematiker-Vereinigung, 2010
The theory of Markov Decision Processes is the theory of controlled Markov chains. Its origins can be traced back to R. Bellman and L. Shapley in the 1950s. Over the decades since, this theory has grown dramatically. It has found applications in various areas, e.g.
Bäuerle, N., Rieder, U.
openaire   +3 more sources
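For a controlled Markov chain as described in this entry, the standard solution method is value iteration on the Bellman optimality equation V*(s) = max_a [R(s,a) + gamma * sum_s' P(s'|s,a) V*(s')]. The sketch below is a generic textbook illustration, not code from any listed paper; the toy two-state MDP and all names are invented for the example:

```python
def value_iteration(states, actions, P, R, gamma=0.9, tol=1e-8):
    """Iterate the Bellman optimality operator until the sup-norm
    change falls below tol, returning approximate optimal values.

    P[(s, a)] maps successor states to probabilities;
    R[(s, a)] is the immediate reward.
    """
    V = {s: 0.0 for s in states}
    while True:
        V_new = {
            s: max(
                R[(s, a)] + gamma * sum(p * V[s2] for s2, p in P[(s, a)].items())
                for a in actions
            )
            for s in states
        }
        if max(abs(V_new[s] - V[s]) for s in states) < tol:
            return V_new
        V = V_new

# Hypothetical two-state MDP: reward 1 for staying in s1, 0 elsewhere.
states = ["s0", "s1"]
actions = ["stay", "move"]
P = {
    ("s0", "stay"): {"s0": 1.0},
    ("s0", "move"): {"s1": 1.0},
    ("s1", "stay"): {"s1": 1.0},
    ("s1", "move"): {"s0": 1.0},
}
R = {("s0", "stay"): 0.0, ("s0", "move"): 0.0,
     ("s1", "stay"): 1.0, ("s1", "move"): 0.0}
V = value_iteration(states, actions, P, R)
```

With gamma = 0.9, staying in s1 forever yields V*(s1) = 1/(1 - 0.9) = 10, and moving there from s0 yields V*(s0) = 0.9 * 10 = 9, which the iteration recovers.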

Unifying Two Views on Multiple Mean-Payoff Objectives in Markov Decision Processes [PDF]

open access: yesLogical Methods in Computer Science, 2017
We consider Markov decision processes (MDPs) with multiple limit-average (or mean-payoff) objectives. There exist two different views: (i) the expectation semantics, where the goal is to optimize the expected mean-payoff objective, and (ii) the ...
Krishnendu Chatterjee   +2 more
doaj   +1 more source

Configurable Markov Decision Processes

open access: yes, 2018
In many real-world problems, there is the possibility to configure, to a limited extent, some environmental parameters to improve the performance of a learning agent. In this paper, we propose a novel framework, Configurable Markov Decision Processes (Conf-MDPs), to model this new type of interaction with the environment.
Metelli, Alberto Maria   +2 more
openaire   +4 more sources

Markov decision processes in minimization of expected costs

open access: yesCroatian Operational Research Review, 2014
Basics of Markov decision processes will be introduced in order to obtain the optimization goal function for minimizing the long-run expected cost. We focus on minimization of such cost of the farmer's policy consisting of different decisions in specific
Marija Rukav   +3 more
doaj   +1 more source

Probabilistic Opacity for Markov Decision Processes

open access: yes, 2014
Opacity is a generic security property that has been defined on (non-probabilistic) transition systems and later on Markov chains with labels. For a secret predicate, given as a subset of runs, and a function describing the view of an external observer,
Bérard, Béatrice   +2 more
core   +3 more sources

Symblicit algorithms for optimal strategy synthesis in monotonic Markov decision processes [PDF]

open access: yesElectronic Proceedings in Theoretical Computer Science, 2014
When treating Markov decision processes (MDPs) with large state spaces, using explicit representations quickly becomes infeasible. Lately, Wimmer et al. have proposed a so-called symblicit algorithm for the synthesis of optimal strategies in MDPs, in the
Aaron Bohy   +2 more
doaj   +1 more source

Life is Random, Time is Not: Markov Decision Processes with Window Objectives [PDF]

open access: yesLogical Methods in Computer Science, 2020
The window mechanism was introduced by Chatterjee et al. to strengthen classical game objectives with time bounds. It makes it possible to synthesize system controllers that exhibit acceptable behaviors within a configurable time frame, all along their infinite ...
Thomas Brihaye   +3 more
doaj   +1 more source

Partially Observable Risk-Sensitive Markov Decision Processes

open access: yes, 2016
We consider the problem of minimizing a certainty equivalent of the total or discounted cost over a finite and an infinite time horizon which is generated by a Partially Observable Markov Decision Process (POMDP).
Bäuerle, Nicole, Rieder, Ulrich
core   +1 more source