Some of the following articles may not be open access.

Enhancing Deep Reinforcement Learning Approaches for Multi-Robot Navigation via Single-Robot Evolutionary Policy Search

IEEE International Conference on Robotics and Automation, 2022
Recent Multi-Agent Deep Reinforcement Learning approaches factorize a global action-value to address non-stationarity and favor cooperation. These methods, however, hinder exploration by introducing constraints (e.g., additive value-decomposition) to ...
Enrico Marchesini, A. Farinelli
semanticscholar   +1 more source

Neuro-Evolutionary Direct Policy Search for Multiobjective Optimal Control

IEEE Transactions on Neural Networks and Learning Systems, 2021
Direct policy search (DPS) is emerging as one of the most effective and widely applied reinforcement learning (RL) methods to design optimal control policies for multiobjective Markov decision processes (MOMDPs).
M. Zaniolo, M. Giuliani, A. Castelletti
semanticscholar   +1 more source

Bootstrap Aggregation and Cross‐Validation Methods to Reduce Overfitting in Reservoir Control Policy Search

Water Resources Research, 2020
Policy search methods provide a heuristic mapping between observations and decisions and have been widely used in reservoir control studies. However, recent studies have observed a tendency for policy search methods to overfit to the hydrologic data used
Z. Brodeur, J. Herman, S. Steinschneider
semanticscholar   +1 more source

Fitted policy search

2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL), 2011
In this paper we address the combination of batch reinforcement-learning (BRL) techniques with direct policy search (DPS) algorithms in the context of robot learning. Batch value-based algorithms (such as fitted Q-iteration) have been proved to outperform online ones in many complex applications, but they share the same difficulties in solving problems
Migliavacca, Martino   +4 more
openaire   +1 more source

Covariant Policy Search

2003
We investigate the problem of non-covariant behavior of policy gradient reinforcement learning algorithms. The policy gradient approach is amenable to analysis by information geometric methods. This leads us to propose a natural metric on controller parameterization that results from considering the manifold of probability distributions over paths ...
J. Andrew Bagnell, Jeff Schneider
openaire   +1 more source

Nonconvex Policy Search Using Variational Inequalities

Neural Computation, 2017
Policy search is a class of reinforcement learning algorithms for finding optimal policies in control problems with limited feedback. These methods have been shown to be successful in high-dimensional problems such as robotics control. Though successful, current methods can lead to unsafe policy parameters that potentially could damage hardware units.
Zhan, Yusen   +2 more
openaire   +3 more sources

Policy Search by Dynamic Programming

2003
We consider the policy search approach to reinforcement learning. We show that if a “baseline distribution” is given (indicating roughly how often we expect a good policy to visit each state), then we can derive a policy search algorithm that terminates in a finite number of steps, and for which we can provide non-trivial performance guarantees.
J. Andrew Bagnell   +3 more
openaire   +1 more source

Numerical Quadrature for Probabilistic Policy Search

IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020
Learning control policies has become an appealing alternative to the derivation of control laws based on classic control theory. Model-based approaches have proven an outstanding data efficiency, especially when combined with probabilistic models to eliminate model bias. However, a major difficulty for these methods is that multi-step-ahead predictions
Julia Vinogradska   +4 more
openaire   +3 more sources

Relative Entropy Policy Search

Proceedings of the AAAI Conference on Artificial Intelligence, 2010
Policy search is a successful approach to reinforcement learning. However, policy improvements often result in the loss of information. Hence, it has been marred by premature convergence and implausible solutions. As first suggested in the context of covariant policy gradients, many of these problems may be addressed by constraining the
Peters, J., Mülling, K., Altun, Y.
openaire   +2 more sources

Distributed Fusion-Based Policy Search for Fast Robot Locomotion Learning

IEEE Computational Intelligence Magazine, 2019
Deep reinforcement learning methods are developed to deal with challenging locomotion control problems in a robotics domain and can achieve significant performance improvement over conventional control methods.
Zhengcai Cao, Qing Xiao, Mengchu Zhou
semanticscholar   +1 more source
