Continuous-time Markov decision process


Logarithmic Regret Bounds for Continuous-Time Average-Reward Markov Decision Processes

SIAM Journal on Control and Optimization
We consider reinforcement learning for continuous-time Markov decision processes (MDPs) in the infinite-horizon, average-reward setting. In contrast to discrete-time MDPs, a continuous-time process moves to a state and stays there for a random holding time after an action is taken.
Gao, Xuefeng, Zhou, Xun Yu
openaire +3 more sources

Orchestrating Ecosystem Resources for Sustainability: Coopetition, Digital Transformation, and Disruptive Sustainable Innovation

Business Strategy and the Environment, EarlyView.
ABSTRACT As sustainability transitions accelerate, firms increasingly engage in innovation ecosystems to pursue disruptive sustainable innovation (DSI). Nevertheless, empirical understanding regarding how innovation ecosystem coopetition—simultaneous cooperation and competition among interdependent actors—translates into sustainability‐oriented ...
Jin‐Sup Jung, Min‐Jae Lee
wiley +1 more source

Robustness analysis of a Continuous-Time Markov Jump Process model for herd behaviour under stochastic shocks

Scientific African
Herd behaviour in socio-economic systems is characterised by asynchronous, stochastic decision-making that traditional models capture imperfectly. Discrete-time DTMCs impose fixed-tick synchrony and require binning, which distorts event timing and smears
Samuel Kipsang Kaptum +3 more
doaj +1 more source

A hidden Markov model and reinforcement learning‐based strategy for fault‐tolerant control

The Canadian Journal of Chemical Engineering, EarlyView.
Abstract This study introduces a data‐driven control strategy integrating hidden Markov models (HMM) and reinforcement learning (RL) to achieve resilient, fault‐tolerant operation against persistent disturbances in nonlinear chemical processes. Called hidden Markov model and reinforcement learning (HMMRL), this strategy is evaluated in two case studies
Tamera Leitao, Debaprasad Dutta, Simant R. Upreti +2 more
wiley +1 more source

Restricted Tweedie stochastic block models

Canadian Journal of Statistics, EarlyView.
Abstract The stochastic block model (SBM) is a widely used framework for community detection in networks, where the network structure is typically represented by an adjacency matrix. However, conventional SBMs are not directly applicable to an adjacency matrix that consists of nonnegative zero‐inflated continuous edge weights.
Jie Jian, Mu Zhu, Peijun Sang
wiley +1 more source

Hidden Markov graphical models with state‐dependent generalized hyperbolic distributions

Canadian Journal of Statistics, EarlyView.
Abstract In this article, we develop a novel hidden Markov graphical model to investigate time‐varying interconnectedness between different financial markets. To identify conditional correlation structures under varying market conditions and accommodate shape features embedded in financial time series, we rely upon the generalized hyperbolic family of ...
Beatrice Foroni, Luca Merlo, Lea Petrella +2 more
wiley +1 more source

On some continuous time discounted Markov decision process.

1998
openaire +2 more sources

A goodness‐of‐fit test for regression models with discrete outcomes

Canadian Journal of Statistics, EarlyView.
Abstract Regression models are often used to analyze discrete outcomes, but classical goodness‐of‐fit tests such as those based on the deviance or Pearson's statistic can be misleading or have little power in this context. To address this issue, we propose a new test, inspired by the work of Czado et al.
Lu Yang +2 more
wiley +1 more source

Nonstationary Continuous Time Markov Decision Processes in a Semi-Markov Environment with Discounted Criterion

Journal of Mathematical Analysis and Applications, 1995
openaire +2 more sources

Invariant Measure and Universality of the 2D Yang–Mills Langevin Dynamic

Communications on Pure and Applied Mathematics, EarlyView.
ABSTRACT We prove that the Yang–Mills (YM) measure for the trivial principal bundle over the two‐dimensional torus, with any connected, compact structure group, is invariant for the associated renormalised Langevin dynamic. Our argument relies on a combination of regularity structures, lattice gauge‐fixing and Bourgain's method for invariant measures ...
Ilya Chevyrev, Hao Shen
wiley +1 more source

Markov and semi-Markov decision processes
deep reinforcement learning
continuous-time Markov decision processes

fos: mathematics
optimization and control (math.OC)
mathematics - optimization and control

optimal stochastic control
90C40
dynamic programming