Compatible natural gradient policy search [PDF]

open access: yes
Machine Learning, 2019
Trust-region methods have yielded state-of-the-art results in policy search. A common approach is to use KL-divergence to bound the region of trust resulting in a natural gradient policy update. We show that the natural gradient and trust region optimization are equivalent if we use the natural parameterization of a standard exponential policy ...
Joni Pajarinen   +4 more
openaire   +9 more sources

Gradient-Aware Model-Based Policy Search [PDF]

open access: yes
Proceedings of the AAAI Conference on Artificial Intelligence, 2020
Traditional model-based reinforcement learning approaches learn a model of the environment dynamics without explicitly considering how it will be used by the agent. In the presence of misspecified model classes, this can lead to poor estimates, as some relevant available information is ignored.
Pierluca D'Oro   +4 more
openaire   +5 more sources

Augmented Bayesian Policy Search

open access: yes
International Conference on Learning Representations
Accepted to the International Conference on Learning Representations (ICLR ...
Mahdi Kallel   +3 more
openaire   +4 more sources

Efficient thrust generation in robotic fish caudal fins using policy search

open access: yes
IET Cyber-systems and Robotics, 2019
Thrust generation is a crucial aspect of fish locomotion that depends on a variety of morphological and kinematic parameters. In this work, the kinematics of caudal fin motion of a robotic fish are optimised experimentally.
Yixi Shan   +3 more
doaj   +2 more sources

DiffCPS: Diffusion Model based Constrained Policy Search for Offline Reinforcement Learning [PDF]

open access: yes
arXiv.org, 2023
Constrained policy search (CPS) is a fundamental problem in offline reinforcement learning, which is generally solved by advantage weighted regression (AWR).
Longxiang He   +3 more
semanticscholar   +1 more source

Policy Search for Model Predictive Control With Application to Agile Drone Flight [PDF]

open access: yes
IEEE Transactions on Robotics, 2021
Policy search and model predictive control (MPC) are two different paradigms for robot control: policy search has the strength of automatically learning complex policies using experienced data, and MPC can offer optimal control performance using models ...
Yunlong Song, D. Scaramuzza
semanticscholar   +1 more source

Combining Evolution and Deep Reinforcement Learning for Policy Search: A Survey [PDF]

open access: yes
ACM Transactions on Evolutionary Learning and Optimization, 2022
Deep neuroevolution and deep Reinforcement Learning have received a lot of attention over the past few years. Some works have compared them, highlighting their pros and cons, but an emerging trend combines them so as to benefit from the best of both ...
Olivier Sigaud
semanticscholar   +1 more source

Global Convergence of Direct Policy Search for State-Feedback H∞ Robust Control: A Revisit of Nonsmooth Synthesis with Goldstein Subdifferential [PDF]

open access: yes
Neural Information Processing Systems, 2022
Direct policy search has been widely applied in modern reinforcement learning and continuous control. However, the theoretical properties of direct policy search on nonsmooth robust control synthesis have not been fully understood.
Xing-ming Guo, B. Hu
semanticscholar   +1 more source

Two-Stage Reinforcement Learning Policy Search for Grid-Interactive Building Control

open access: yes
IEEE Transactions on Smart Grid, 2022
This paper develops an intelligent grid-interactive building controller, which optimizes building operation during both normal hours and demand response (DR) events.
X. Zhang   +6 more
semanticscholar   +1 more source

Evaluating Guided Policy Search for Human-Robot Handovers

open access: yes
IEEE Robotics and Automation Letters, 2021
We evaluate the potential of Guided Policy Search (GPS), a model-based reinforcement learning (RL) method, to train a robot controller for human-robot object handovers.
Alap Kshirsagar, Guy Hoffman, A. Biess
semanticscholar   +1 more source
