Deep Reinforcement Learning Under Signal Temporal Logic Constraints Using Lagrangian Relaxation
Deep reinforcement learning (DRL) has attracted much attention as an approach to solving optimal control problems without mathematical models of systems. In general, however, constraints may be imposed on optimal control problems.
Junya Ikemoto, Toshimitsu Ushio
Policy learning in continuous-time Markov decision processes using Gaussian Processes
Continuous-time Markov decision processes provide a very powerful mathematical framework to solve policy-making problems in a wide range of applications, ranging from the control of populations to cyber-physical systems. The key problem to solve for these models is to efficiently compute an optimal policy to control the system in order to maximise the ...
Bartocci, E. +4 more
Electric Vehicle Charging Management Based on Deep Reinforcement Learning
A time-variable time-of-use electricity price can be used to reduce the charging costs for electric vehicle (EV) owners. Considering the uncertainty of price fluctuations and the randomness of EV owners' commuting behavior, we propose a deep ...
Sichen Li +6 more
Realizable Strategies in Continuous-Time Markov Decision Processes
Continuous-Time Markov Decision Processes with Controlled Observations
In this paper, we study a continuous-time discounted jump Markov decision process with both controlled actions and observations. The observation is only available for a discrete set of time instances. At each time of observation, one has to select an optimal timing for the next observation and a control trajectory for the time interval between two ...
Huang, Yunhan +2 more
Impact of Mode Decision Delay on Estimation Error in Continuous-Time Controlled System
For the interception of highly maneuvering targets in terminal guidance, the maximal admissible mode decision delay (MAMDD) and the least required mode sojourn time (LRMST) are calculated for a continuous-time controlled system.
Shengwen Xiang, Hongqi Fan, Qiang Fu
An approximation approach for the deviation matrix of continuous-time Markov processes with application to Markov decision theory
We present an update formula that allows the expression of the deviation matrix of a continuous-time Markov process with denumerable state space having generator matrix Q* through a continuous-time Markov process with generator matrix Q.
Arie Hordijk +13 more
Optimal strategy selection approach of moving target defense based on Markov time game
Because existing game models struggle to capture the dynamic, continuous characteristics of network attack and defense confrontation, a method based on a Markov time game was proposed to select the optimal strategy for moving ...
Jinglei TAN +4 more
Finite horizon optimal stopping of time-discontinuous functionals with applications to impulse control with delay
We study finite horizon optimal stopping problems for continuous-time Feller–Markov processes. The functional depends on time, state, and external parameters and may exhibit discontinuities with respect to the time variable.
El Karoui N. +4 more
Constrained total undiscounted continuous-time Markov decision processes
The present paper considers the constrained optimal control problem with total undiscounted criteria for a continuous-time Markov decision process (CTMDP) in Borel state and action spaces. Under the standard compactness and continuity conditions, we show the existence of an optimal stationary policy out of the class of general nonstationary ones.
Guo, Xianping, Zhang, Yi

