Results 11 to 20 of about 14,737,909

Natural Actor-Critic for Robust Reinforcement Learning with Function Approximation [PDF]

open access: yes (Neural Information Processing Systems, 2023)
We study robust reinforcement learning (RL) with the goal of determining a well-performing policy that is robust against model mismatch between the training simulator and the testing environment. Previous policy-based robust RL algorithms mainly focus on
Ruida Zhou   +5 more

Breaking the Curse of Multiagency: Provably Efficient Decentralized Multi-Agent RL with Function Approximation [PDF]

open access: yes (Annual Conference on Computational Learning Theory, 2023)
A unique challenge in Multi-Agent Reinforcement Learning (MARL) is the curse of multiagency, where the description length of the game as well as the complexity of many existing learning algorithms scale exponentially with the number of agents.
Yuanhao Wang   +3 more

Which Explanation Should I Choose? A Function Approximation Perspective to Characterizing Post hoc Explanations [PDF]

open access: yes (Neural Information Processing Systems, 2022)
A critical problem in the field of post hoc explainability is the lack of a common foundational goal among methods. For example, some methods are motivated by function approximation, some by game theoretic notions, and some by obtaining clean ...
Tessa Han   +2 more

Human-in-the-loop: Provably Efficient Preference-based Reinforcement Learning with General Function Approximation [PDF]

open access: yes (International Conference on Machine Learning, 2022)
We study human-in-the-loop reinforcement learning (RL) with trajectory preferences, where instead of receiving a numeric reward at each step, the agent only receives preferences over trajectory pairs from a human overseer.
Xiaoyu Chen   +4 more

Nearly Minimax Optimal Offline Reinforcement Learning with Linear Function Approximation: Single-Agent MDP and Markov Game [PDF]

open access: yes (International Conference on Learning Representations, 2022)
Offline reinforcement learning (RL) aims at learning an optimal strategy using a pre-collected dataset without further interactions with the environment.
Wei Xiong   +5 more

VOQL: Towards Optimal Regret in Model-free RL with Nonlinear Function Approximation [PDF]

open access: yes (arXiv.org, 2022)
We study time-inhomogeneous episodic reinforcement learning (RL) under general function approximation and sparse rewards. We design a new algorithm, Variance-weighted Optimistic Q-Learning (VOQL), based on Q-learning and bound its regret assuming ...
Alekh Agarwal, Yujia Jin, Tong Zhang

Sigmoid-Weighted Linear Units for Neural Network Function Approximation in Reinforcement Learning [PDF]

open access: yes (Neural Networks, 2017)
In recent years, neural networks have enjoyed a renaissance as function approximators in reinforcement learning. Two decades after Tesauro's TD-Gammon achieved near top-level human performance in backgammon, the deep reinforcement learning algorithm DQN ...
Stefan Elfwing, E. Uchibe, K. Doya
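The sigmoid-weighted linear unit (SiLU) named in this title has a one-line definition, silu(x) = x * sigmoid(x). A minimal sketch in plain NumPy (framework-agnostic; the array values below are just sample inputs):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def silu(x):
    # SiLU: the input scaled by its own sigmoid; close to ReLU for large
    # positive x, smoothly approaching zero for large negative x.
    return x * sigmoid(x)

print(silu(np.array([-2.0, 0.0, 2.0])))
```

Unlike ReLU, SiLU is non-monotonic, dipping slightly below zero for negative inputs before flattening out.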

Provably Efficient Reinforcement Learning with Linear Function Approximation [PDF]

open access: yes (Annual Conference on Computational Learning Theory, 2019)
Modern reinforcement learning (RL) is commonly applied to practical problems with an enormous number of states, where function approximation must be deployed to approximate either the value function or the policy.
Chi Jin   +3 more

A composite neural network that learns from multi-fidelity data: Application to function approximation and inverse PDE problems [PDF]

open access: yes (Journal of Computational Physics, 2019)
Currently the training of neural networks relies on data of comparable accuracy but in real applications only a very small set of high-fidelity data is available while inexpensive lower fidelity data may be plentiful.
Xuhui Meng, G. Karniadakis

A Finite Time Analysis of Temporal Difference Learning With Linear Function Approximation [PDF]

open access: yes (Annual Conference on Computational Learning Theory, 2018)
Temporal difference learning (TD) is a simple iterative algorithm widely used for policy evaluation in Markov reward processes. Bhandari et al. prove finite time convergence rates for TD learning with linear function approximation.
Jalaj Bhandari   +2 more
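The semi-gradient TD(0) update with linear function approximation that this line of work analyzes can be sketched as follows; the 3-state Markov reward process, the one-hot feature map (which makes TD(0) reduce to tabular policy evaluation), and the step-size schedule are all illustrative assumptions, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 3-state Markov reward process (transitions and rewards made up
# for illustration).
P = np.array([[0.5, 0.5, 0.0],
              [0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5]])
r = np.array([1.0, 0.0, -1.0])
gamma = 0.9

# Linear features: one-hot, so each weight is one state's value estimate.
phi = np.eye(3)

w = np.zeros(3)
visits = np.zeros(3)
s = 0
for _ in range(30000):
    s_next = rng.choice(3, p=P[s])
    visits[s] += 1
    alpha = 1.0 / (10.0 + visits[s])           # decaying step size
    td_error = r[s] + gamma * phi[s_next] @ w - phi[s] @ w
    w += alpha * td_error * phi[s]             # semi-gradient TD(0) update
    s = s_next

# Ground truth from the Bellman equation: v = (I - gamma * P)^{-1} r
v_true = np.linalg.solve(np.eye(3) - gamma * P, r)
print("TD estimate:", w)
print("true values:", v_true)
```

The finite-time analyses cited above bound how fast estimates like `w` approach the fixed point as a function of the step-size schedule and the mixing of the underlying chain.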
