Approximate Policy Iteration Schemes: A Comparison [PDF]
We consider the infinite-horizon discounted optimal control problem formalized by Markov Decision Processes. We focus on several approximate variations of the Policy Iteration algorithm: Approximate Policy Iteration, Conservative Policy Iteration (CPI ...
Scherrer, Bruno
Approximate policy iteration: A survey and some new methods [PDF]
We consider the classical policy iteration method of dynamic programming (DP), where approximations and simulation are used to deal with the curse of dimensionality.
A. G. Barto +82 more
Rollout sampling approximate policy iteration [PDF]
Several researchers have recently investigated the connection between reinforcement learning and classification. We are motivated by proposals of approximate policy iteration schemes without value functions which focus on policy representation using classifiers and address policy learning as a supervised learning problem.
Dimitrakakis, Christos +2 more
Adaptive Approximate Policy Iteration [PDF]
Model-free reinforcement learning algorithms combined with value function approximation have recently achieved impressive performance in a variety of application domains. However, the theoretical understanding of such algorithms is limited, and existing results are largely focused on episodic or discounted Markov decision processes (MDPs). In this work,
Hao, Botao +4 more
Approximate policy iteration using regularised Bellman residuals minimisation [PDF]
Reinforcement Learning (RL) provides a general methodology for solving complex, uncertain decision problems, which are very challenging in many real-world applications. The RL problem is modeled as a Markov Decision Process (MDP), which has been studied in depth in the literature. We consider Policy Iteration (PI) algorithms for RL which iteratively evaluate and improve control
Esposito, Gennaro, Martín Muñoz, Mario
Sarsa(λ)-Based Logistics Planning Approximated by Value Function with Policy Iteration [PDF]
The logistics planning problem has been investigated extensively for a long time. However, as the number of stochastic events occurring on roads increases, a growing number of stochastic factors must be taken into consideration. A dynamic approach is
Yu Tang
Algorithms and Bounds for Rollout Sampling Approximate Policy Iteration [PDF]
14 pages, presented at EWRL ...
Dimitrakakis, C., Lagoudakis, M.G.
Approximate Midpoint Policy Iteration for Linear Quadratic Control [PDF]
We present a midpoint policy iteration algorithm to solve linear quadratic optimal control problems in both model-based and model-free settings. The algorithm is a variation of Newton's method, and we show that in the model-based setting it achieves cubic convergence, which is superior to standard policy iteration and policy gradient algorithms that ...
Gravell, Benjamin +2 more
Solving Common-Payoff Games with Approximate Policy Iteration [PDF]
For artificially intelligent learning systems to have widespread applicability in real-world settings, it is important that they be able to operate in a decentralized manner. Unfortunately, decentralized control is difficult: computing even an epsilon-optimal joint policy is a NEXP-complete problem.
Sokota, Samuel +8 more
An approximate policy iteration viewpoint of actor–critic algorithms [PDF]
In this work, we consider policy-based methods for solving the reinforcement learning problem and establish sample complexity guarantees. A policy-based algorithm typically consists of an actor and a critic. We consider using various policy update rules for the actor, including the celebrated natural policy gradient.
Chen, Zaiwei, Maguluri, Siva Theja

