Markov reward approach - Open Access .click

Research on Wargame Decision-Making Method Based on Multi-Agent Deep Deterministic Policy Gradient

Applied Sciences, 2023
Wargames are essential simulators for various war scenarios. However, the increasing pace of warfare has rendered traditional wargame decision-making methods inadequate.
Sheng Yu, Wei Zhu, Yong Wang
doaj +1 more source

Peer-to-peer trading in smart grid with demand response and grid outage using deep reinforcement learning

Ain Shams Engineering Journal, 2023
With the price of green energy now more reasonable, users can produce enough electricity to meet their needs and make a profit by selling the surplus on the underground P2P energy market.
Mohammed Alsolami +3 more
doaj +1 more source

Dynamic Mechanism Design for Repeated Markov Games with Hidden Actions: Computational Approach

Mathematical and Computational Applications
This paper introduces a dynamic mechanism design tailored for uncertain environments where incentive schemes are challenged by the inability to observe players’ actions, known as moral hazard.
Julio B. Clempner
doaj +1 more source

Bounds on the bias terms for the Markov reward approach

2019
An important step in the Markov reward approach to error bounds on stationary performance measures of Markov chains is to bound the bias terms. Affine functions have been successfully used for these bounds for various models, but there are also models for which it has not been possible to establish such bounds.
Bai, Xinwei, Goseling, Jasper
openaire +2 more sources

Object Affordance Driven Inverse Reinforcement Learning Through Conceptual Abstraction and Advice

Paladyn, 2018
Within human Intent Recognition (IR), a popular approach to learning from demonstration is Inverse Reinforcement Learning (IRL). IRL extracts an unknown reward function from samples of observed behaviour. Traditional IRL systems require large datasets to
Bhattacharyya Rupam, Hazarika Shyamanta M. +1 more
doaj +1 more source

Optimizing Subway Train Operation With Hierarchical Adaptive Control Approach

IEEE Access, 2023
The proportional integral derivative (PID) method is widely used in industrial control applications. However, when applied to complex and dynamic train operation control systems, real-time parameter adjustment becomes a formidable challenge.
Gaoyun Cheng +6 more
doaj +1 more source

A State Aggregation Approach To Singularly Perturbed Markov Reward Processes

2008
In this paper, we propose a single-sample-path-based algorithm with state aggregation to optimize the average rewards of singularly perturbed Markov reward processes (SPMRPs) with large-scale state spaces. It is assumed that such a reward process depends on a set of parameters.
Zhang, Dali, Baoqun Yin, Hongsheng Xi
openaire +1 more source

A Learning Based Approach to Control Synthesis of Markov Decision Processes for Linear Temporal Logic Specifications [PDF]

2014
We propose to synthesize a control policy for a Markov decision process (MDP) such that the resulting traces of the MDP satisfy a linear temporal logic (LTL) property. We construct a product MDP that incorporates a deterministic Rabin automaton generated
Coogan, Samuel +4 more
core +2 more sources

Asymptotic Optimality and Rates of Convergence of Quantized Stationary Policies in Continuous-Time Markov Decision Processes

Discrete Dynamics in Nature and Society, 2022
This paper is concerned with the asymptotic optimality of quantized stationary policies for continuous-time Markov decision processes (CTMDPs) in Polish spaces with state-dependent discount factors, where the transition rates and reward rates are allowed
Xiao Wu, Yanqiu Tang
doaj +1 more source

Discrete time Non-homogeneous Semi-Markov Processes applied to Models for Disability Insurance [PDF]

2012
In this paper, we present a stochastic model for disability insurance contracts. The model is based on a discrete time non-homogeneous semi-Markov process (DTNHSMP) to which the backward recurrence time process is introduced.
D'Amico, Guglielmo +3 more
core +1 more source

reinforcement learning
markov decision process
mathematics

decision-making
inverse reinforcement learning
biology general

availability
policy improvement
probability math.pr