Approximate policy iteration - Open Access .click


Least-squares methods for policy iteration

2011
Approximate reinforcement learning deals with the essential problem of applying reinforcement learning in large and continuous state-action spaces, by using function approximators to represent the solution.
Babuska, Robert +5 more

Tuning approximate dynamic programming policies for ambulance redeployment via direct search

Stochastic Systems, 2014
In this paper we consider approximate dynamic programming methods for ambulance redeployment. We first demonstrate through simple examples how typical value function fitting techniques, such as approximate policy iteration and linear programming, may not
Matthew S. Maxwell, Shane G. Henderson, Huseyin Topaloglu +2 more

Adaptive Optimal Robust Control for Uncertain Nonlinear Systems Using Neural Network Approximation in Policy Iteration

Applied Sciences, 2021
In this study, an optimal adaptive control approach based on policy iteration (PI) in reinforcement learning (RL) is established to solve robust control problems for nonlinear systems with internal and input uncertainties.
Dengguo Xu, Qinglin Wang, Yuan Li

Observer-Based Adaptive Control of Uncertain Nonlinear Systems Via Neural Networks

IEEE Access, 2018
In this paper, a novel observer-based control strategy is proposed for a class of uncertain continuous-time nonlinear systems based on the Hamilton-Jacobi-Bellman (HJB) equation.
Chaoxu Mu, Yong Zhang, Ke Wang

Optimal adaptive leader-follower consensus of linear multi-agent systems: Known and unknown dynamics

Journal of Artificial Intelligence and Data Mining, 2015
In this paper, the optimal adaptive leader-follower consensus of linear continuous-time multi-agent systems is considered. The error dynamics of each player depend on its neighbors’ information.
F. Tatari, M. B. Naghibi-Sistani

Application of machine learning to assess the value of information in polymer flooding

Petroleum Research, 2021
In this work, we provide a more consistent alternative for performing value of information (VOI) analyses to address sequential decision problems in reservoir management and generate insights on the process of reservoir decision-making.
Amine Tadjer +3 more

Newton’s method for reinforcement learning and model predictive control

Results in Control and Optimization, 2022
The purpose of this paper is to propose and develop a new conceptual framework for approximate Dynamic Programming (DP) and Reinforcement Learning (RL). This framework centers around two algorithms, which are designed largely independently of each other ...
Dimitri Bertsekas

Near-Optimal Tracking Control of a Nonholonomic Mobile Robot with Uncertainties

International Journal of Advanced Robotic Systems, 2012
A combined kinematic/torque control law is developed by using a backstepping design approach for a nonholonomic mobile robot with two driving wheels mounted on the same axis to track a reference trajectory.
Kai Wang

Value-iteration based fitted policy iteration: learning with a single trajectory

2007
ADPRL 2007. Honolulu, Hawaii, Apr 1-5, 2007. We consider batch reinforcement learning problems in continuous space, expected total discounted-reward Markovian Decision Problems when the training data is composed of the trajectory of some fixed behaviour
Antos, András, Munos, Rémi, Szepesvári, Csaba +2 more

Dynamic Virtual Resource Allocation for 5G and Beyond Network Slicing

IEEE Open Journal of Vehicular Technology, 2020
Fifth-generation and beyond wireless communication will support vastly heterogeneous services and user demands, such as massive connectivity, low latency, and high transmission rates.
Fei Song +5 more

computer science
mathematics
mathematical optimization

reinforcement learning
artificial intelligence
statistics

machine learning
markov decision process
applied mathematics