Results 131 to 140 of about 88,537

Pharmacological and pupillary evidence for the noradrenergic contribution to reinforcement learning in Parkinson's disease. [PDF]

open access: yes - Commun Biol
O'Callaghan C   +13 more
europepmc   +1 more source

Hierarchical Bayesian Inverse Reinforcement Learning

IEEE Transactions on Cybernetics, 2015
Inverse reinforcement learning (IRL) is the problem of inferring the underlying reward function from the expert's behavior data. The difficulty in IRL mainly arises in choosing the best reward function since there are typically an infinite number of reward functions that yield the given behavior data as optimal.
Choi, Jae-Deug   +1 more
openaire   +3 more sources

Hierarchical Adversarial Inverse Reinforcement Learning

IEEE Transactions on Neural Networks and Learning Systems
Imitation learning (IL) has been proposed to recover the expert policy from demonstrations. However, it would be difficult to learn a single monolithic policy for highly complex long-horizon tasks of which the expert policy usually contains subtask hierarchies.
Jiayu Chen, Tian Lan, Vaneet Aggarwal
openaire   +2 more sources

Apprenticeship learning via inverse reinforcement learning

Twenty-first international conference on Machine learning - ICML '04, 2004
We consider learning in a Markov decision process where we are not explicitly given a reward function, but where instead we can observe an expert demonstrating the task that we want to learn to perform. This setting is useful in applications (such as the task of driving) where it may be difficult to write down an explicit reward function specifying ...
Pieter Abbeel, Andrew Y. Ng
openaire   +1 more source

Inverse Reinforcement Learning and Imitation Learning

2020
This chapter provides an overview of the most popular methods of inverse reinforcement learning (IRL) and imitation learning (IL). These methods solve the problem of optimal control in a data-driven way, similarly to reinforcement learning, however with the critical difference that now rewards are not observed. The problem is rather to learn the reward ...
Matthew F. Dixon   +2 more
openaire   +1 more source
