Research on Wargame Decision-Making Method Based on Multi-Agent Deep Deterministic Policy Gradient
Wargames are essential simulators for various war scenarios. However, the increasing pace of warfare has rendered traditional wargame decision-making methods inadequate.
Sheng Yu, Wei Zhu, Yong Wang
doaj +1 more source
With the price of green energy now more reasonable, users can produce enough electricity to meet their needs and make a profit by selling the surplus on the underground P2P energy market.
Mohammed Alsolami +3 more
doaj +1 more source
Dynamic Mechanism Design for Repeated Markov Games with Hidden Actions: Computational Approach
This paper introduces a dynamic mechanism design tailored for uncertain environments where incentive schemes are challenged by the inability to observe players’ actions, known as moral hazard.
Julio B. Clempner
doaj +1 more source
Bounds on the bias terms for the Markov reward approach
An important step in the Markov reward approach to error bounds on stationary performance measures of Markov chains is to bound the bias terms. Affine functions have been successfully used for these bounds for various models, but there are also models for which it has not been possible to establish such bounds.
Bai, Xinwei, Goseling, Jasper
openaire +2 more sources
Object Affordance Driven Inverse Reinforcement Learning Through Conceptual Abstraction and Advice
Within human Intent Recognition (IR), a popular approach to learning from demonstration is Inverse Reinforcement Learning (IRL). IRL extracts an unknown reward function from samples of observed behaviour. Traditional IRL systems require large datasets to
Bhattacharyya Rupam +1 more
doaj +1 more source
Optimizing Subway Train Operation With Hierarchical Adaptive Control Approach
The proportional integral derivative (PID) method is widely used in industrial control applications. However, when applied to complex and dynamic train operation control systems, real-time parameter adjustment becomes a formidable challenge.
Gaoyun Cheng +6 more
doaj +1 more source
A State Aggregation Approach To Singularly Perturbed Markov Reward Processes
In this paper, we propose a single-sample-path-based algorithm with state aggregation to optimize the average rewards of singularly perturbed Markov reward processes (SPMRPs) with large-scale state spaces. It is assumed that such a reward process depends on a set of parameters.
Zhang, Dali, Baoqun Yin, Hongsheng Xi
openaire +1 more source
A Learning Based Approach to Control Synthesis of Markov Decision Processes for Linear Temporal Logic Specifications [PDF]
We propose to synthesize a control policy for a Markov decision process (MDP) such that the resulting traces of the MDP satisfy a linear temporal logic (LTL) property. We construct a product MDP that incorporates a deterministic Rabin automaton generated
Coogan, Samuel +4 more
core +2 more sources
This paper is concerned with the asymptotic optimality of quantized stationary policies for continuous-time Markov decision processes (CTMDPs) in Polish spaces with state-dependent discount factors, where the transition rates and reward rates are allowed
Xiao Wu, Yanqiu Tang
doaj +1 more source
Discrete time Non-homogeneous Semi-Markov Processes applied to Models for Disability Insurance [PDF]
In this paper, we present a stochastic model for disability insurance contracts. The model is based on a discrete time non-homogeneous semi-Markov process (DTNHSMP) to which the backward recurrence time process is introduced.
D'Amico, Guglielmo +3 more
core +1 more source

