Policy search with rare significant events: Choosing the right partner to cooperate with [PDF]
This paper focuses on a class of reinforcement learning problems where significant events are rare and limited to a single positive reward per episode.
Paul Ecoffet +3 more
doaj +3 more sources
Multi-Task Policy Search for Robotics [PDF]
© 2014 IEEE. Learning policies that generalize across multiple tasks is an important and challenging research topic in reinforcement learning and robotics.
Deisenroth, MP +3 more
core +4 more sources
Learning policies that generalize across multiple tasks is an important and challenging research topic in reinforcement learning and robotics. Training individual policies for every single potential task is often impractical, especially for continuous ...
Deisenroth, MP +3 more
core +6 more sources
Geometric Reinforcement Learning for Robotic Manipulation
Reinforcement learning (RL) is a popular technique that allows an agent to learn by trial and error while interacting with a dynamic environment.
Naseem Alhousani +5 more
doaj +1 more source
Designing Lookahead Policies for Sequential Decision Problems in Transportation and Logistics
There is a wide range of sequential decision problems in transportation and logistics that require dealing with uncertainty. There are four classes of policies that we can draw on for different types of decisions, but many problems in transportation and ...
Warren B. Powell
doaj +1 more source
Learning Replanning Policies With Direct Policy Search [PDF]
Direct policy search has been successful in learning challenging real-world robotic motor skills by learning open-loop movement primitives with high sample efficiency. These primitives can be generalized to different contexts with varying initial configurations and goals.
Florian Brandherm +3 more
openaire +3 more sources
Accelerating Robot Trajectory Learning for Stochastic Tasks
Learning from demonstration provides ways to transfer knowledge and skills from humans to robots. Models based solely on learning from demonstration often have very good generalization capabilities but are not completely accurate when adapting to new ...
Josip Vidakovic +4 more
doaj +1 more source
Path integral guided policy search [PDF]
We present a policy search method for learning complex feedback control policies that map from high-dimensional sensory inputs to motor torques, for manipulation tasks with discontinuous contact dynamics. We build on a prior technique called guided policy search (GPS), which iteratively optimizes a set of local policies for specific instances of a task,
Chebotar, Y. +5 more
openaire +3 more sources
Generalized exploration in policy search [PDF]
To learn control policies in unknown environments, learning agents need to explore by trying actions deemed suboptimal. In prior work, such exploration is performed by either perturbing the actions at each time-step independently, or by perturbing policy parameters over an entire episode.
Herke van Hoof +2 more
openaire +2 more sources
Proximal Policy Optimization for Radiation Source Search
Rapid search and localization for nuclear sources can be an important aspect in preventing human harm from illicit material in dirty bombs or from contamination. In the case of a single mobile radiation detector, there are numerous challenges to overcome
Philippe Proctor +3 more
doaj +1 more source

