Approximate policy iteration - Open Access .click

A Q‐Learning Algorithm to Solve the Two‐Player Zero‐Sum Game Problem for Nonlinear Systems

International Journal of Adaptive Control and Signal Processing, Volume 39, Issue 3, Page 566-581, March 2025.
This paper deals with the two‐player zero‐sum game problem, which is a bounded $L_2$‐gain robust control problem. Finding an analytical solution to the complex Hamilton‐Jacobi‐Isaacs (HJI) equation is a challenging task.
Afreen Islam, Anthony Siming Chen, Guido Herrmann +2 more
wiley +1 more source

On the Performance Bounds of some Policy Search Dynamic Programming Algorithms [PDF]

2013
We consider the infinite-horizon discounted optimal control problem formalized by Markov Decision Processes. We focus on Policy Search algorithms, that compute an approximately optimal policy by following the standard Policy Iteration (PI) scheme via an -
Scherrer, Bruno
core +3 more sources

A Robust Adaptive One‐Sample‐Ahead Preview Super‐Twisting Sliding Mode Controller

International Journal of Adaptive Control and Signal Processing, EarlyView.
This article introduces a discrete‐time robust adaptive one‐sample‐ahead preview super‐twisting sliding mode controller. A stability analysis of the controller by Lyapunov criteria is developed to demonstrate its robustness in handling both ...
Guilherme Vieira Hollweg +5 more
wiley +1 more source

Conservative and Greedy Approaches to Classification-based Policy Iteration [PDF]

2012
The existing classification-based policy iteration (CBPI) algorithms can be divided into two categories: direct policy iteration (DPI) methods that directly assign the output of the classifier (the approximate greedy policy w ...
Ghavamzadeh, Mohammad, Lazaric, Alessandro +1 more
core +1 more source

ONLINE AND LIGHTWEIGHT KERNEL-BASED APPROXIMATE POLICY ITERATION FOR DYNAMIC P-NORM LINEAR ADAPTIVE FILTERING [PDF]

2022
Yuki AKIYAMA, Minh T. Vu, Konstantinos Slavakis +2 more
openalex +1 more source

Synthetic Nanobiology Actuated Lipometabolic Cell Factory for Autologous Tumor Immunotherapy

Advanced Functional Materials, EarlyView.
FA plays a crucial role in the interaction between tumor cells and the tumor microenvironment, especially for the immune response. A biocatalytic immunoenhancement strategy is developed to boost antitumor immunity by FA metabolic orientation to ceramide. Through the design of this delicate catalytic immunoenhancement strategy, the synthetic nanobiology
Shoujie Zhao +8 more
wiley +1 more source

Approximate dynamic programming for two-player zero-sum Markov games [PDF]

2015
This paper provides an analysis of error propagation in Approximate Dynamic Programming applied to zero-sum two-player Stochastic Games.
Perolat, Julien +3 more
core +1 more source

Selective Separation of the Rare Earth Elements Dysprosium and Neodymium via Tailoring Nanocellulose Chemical Structure

Advanced Functional Materials, EarlyView.
Dicarboxylate‐modified anionic hairy cellulose nanocrystals exhibit a high selectivity for dysprosium(III) over neodymium(III). This selectivity arises from disordered dicarboxylate cellulose “hairs” that enable cooperative ionic coordination, hydrogen bonding, and strain‐induced conformational shrinkage.
Roya Koshani +6 more
wiley +1 more source

Model-Free Gradient-Based Adaptive Learning Controller for an Unmanned Flexible Wing Aircraft

Robotics, 2018
Classical gradient-based approximate dynamic programming approaches provide reliable and fast solution platforms for various optimal control problems. However, their dependence on accurate modeling approaches poses a major concern, where the efficiency ...
Mohammed Abouheaf, Wail Gueaieb, Frank Lewis +2 more
doaj +1 more source

Thermo‐Mechanically Recyclable Smart Textiles from Circularly Knitted Liquid Crystal Elastomer Fibers

Advanced Functional Materials, EarlyView.
Reprogrammable multi‐material smart textiles knitted from liquid crystal elastomer fibers undergo 2D and 3D deformation under thermal and photo stimuli. Circularly knitted tubular structures reversibly contract in radial and axial directions, enabling autonomous climbing, liquid release, and micro pumping.
Xue Wan +8 more
wiley +1 more source
