Markov decision process - Open Access .click


Optimal policies for discrete time risk processes with a Markov chain investment model [PDF]

2006
We consider a discrete risk process modelled by a Markov decision process. The surplus can be invested in stock market assets. We adopt a realistic point of view and let the investment return process be statistically dependent over time.
Romera, Rosario, Diasparra, Maikol
core

Computing the Cramer-Rao bound of Markov random field parameters: Application to the Ising and the Potts models [PDF]

2013
This letter considers the problem of computing the Cramer–Rao bound for the parameters of a Markov random field. Computation of the exact bound is not feasible for most fields of interest because their likelihoods are intractable and have intractable ...
Batatia, Hadj +8 more
core +1 more source

Advances in Thermal Modeling and Simulation of Lithium‐Ion Batteries with Machine Learning Approaches

Advanced Intelligent Discovery, EarlyView.
Heat generation in lithium‐ion batteries affects performance, aging, and safety, requiring accurate thermal modeling. Traditional methods face efficiency and adaptability challenges. This article reviews machine learning‐based and hybrid modeling approaches, integrating data and physics to improve parameter estimation and temperature prediction ...
Qi Lin +4 more
wiley +1 more source

State Clustering in Markov Decision Processes with an Application in Information Sharing

2005
This research examines state clustering in Markov decision processes, specifically addressing the problem referred to as the Markov decision process with restricted observations. The general problem is a special case of a Partially Observable Markov Decision
Berrings Davis, Lauren Marie
core

Approximating ergodic average reward continuous-time controlled Markov chains

2010
We study the approximation of an ergodic average reward continuous-time denumerable state Markov decision process (MDP) by means of a sequence of MDPs. Our results include the convergence of the corresponding optimal policies and the optimal gains. For a
Lorenzo Magán, José María
core +1 more source

Overcoming the Nyquist Limit in Molecular Hyperspectral Imaging by Reinforcement Learning

Advanced Intelligent Discovery, EarlyView.
An explorative spectral acquisition guide automatically selects informative spectral bands to optimize downstream tasks, outperforming full‐spectrum acquisition. The selected hyperspectral data are used for tasks such as unmixing and segmentation. BandOptiNet encodes selection states and outputs optimal bands to guide spectral acquisition. Recent advances
Xiaobin Tang +4 more
wiley +1 more source

A Markov decision process model for capacity expansion and allocation

1999
We present a finite-horizon Markov decision process (MDP) model for providing decision support in semiconductor manufacturing on such critical operational issues as when to add additional capacity and when to convert from one type of production to ...
Marcus, S.I. +4 more
core +1 more source

Robust Reinforcement Learning Control Framework for a Quadrotor Unmanned Aerial Vehicle Using Critic Neural Network

Advanced Intelligent Systems, Volume 7, Issue 3, March 2025.
Quadrotor unmanned aerial vehicle control is critical to maintain flight safety and efficiency, especially when facing external disturbances and model uncertainties. This article presents a robust reinforcement learning control scheme to deal with these challenges.
Yu Cai, Yefeng Yang, Tao Huang, Boyang Li +3 more
wiley +1 more source

On the Practical Art of State Definitions for Markov Decision Process Construction

IEEE Access, 2018
Many problems faced by decision makers today involve the management of large-scale, complex systems that can be modeled as state-based control problems, specifically discrete Markov decision processes (MDPs). Typical examples include transportation systems,
William T. Scherer, Stephen Adams, Peter A. Beling +2 more
doaj +1 more source

Dynamic and Structural Features of Intifada Violence: A Markov Process Approach [PDF]

This paper analyzes the daily incidence of violence during the Second Intifada. We compare several alternative statistical models with different dynamic and structural stability characteristics while keeping modelling complexity to a minimum by only ...
Dale J. Poirier, Ivan Jeliazkov
core
