Reinforcement Learning: A Survey [PDF]
This paper surveys the field of reinforcement learning from a computer-science perspective. It is written to be accessible to researchers familiar with machine learning.
Kaelbling, L. P.+2 more
core +14 more sources
Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback [PDF]
We apply preference modeling and reinforcement learning from human feedback (RLHF) to finetune language models to act as helpful and harmless assistants.
Yuntao Bai+30 more
semanticscholar +1 more source
Training Diffusion Models with Reinforcement Learning [PDF]
Diffusion models are a class of flexible generative models trained with an approximation to the log-likelihood objective. However, most use cases of diffusion models are not concerned with likelihoods, but instead with downstream objectives such as human-
Kevin Black+4 more
semanticscholar +1 more source
Efficient Online Reinforcement Learning with Offline Data [PDF]
Sample efficiency and exploration remain major challenges in online reinforcement learning (RL). A powerful approach that can be applied to address these issues is the inclusion of offline data, such as prior trajectories from a human expert or a sub ...
Philip J. Ball+3 more
semanticscholar +1 more source
A well-designed demand response (DR) program is essential in a smart home to optimize energy usage according to user preferences. In this study, we proposed a multiobjective reinforcement learning (MORL) algorithm to design a DR program.
Song-Jen Chen, Wei-Yu Chiu, Wei-Jen Liu
doaj +1 more source
Learning an Accurate State Transition Dynamics Model by Fitting Both a Function and its Derivative
Learning an accurate state transition dynamics model in a sample-efficient way is important to predict the future states from the current states and actions of a system both accurately and efficiently in model-based reinforcement learning for many robotic ...
Youngho Kim, Hoosang Lee, Jeha Ryu
doaj +1 more source
Guiding Pretraining in Reinforcement Learning with Large Language Models [PDF]
Reinforcement learning algorithms typically struggle in the absence of a dense, well-shaped reward function. Intrinsically motivated exploration methods address this limitation by rewarding agents for visiting novel states or transitions, but these ...
Yuqing Du+7 more
semanticscholar +1 more source
Snake-like modular robots (MRs) are highly flexible, but, to traverse a challenging terrain or explore a region of interest, MR needs to attain efficient locomotion depending on a tradeoff between objectives like forward velocity and power consumption of
Akash Singh+3 more
doaj +1 more source
Diffusion Policies as an Expressive Policy Class for Offline Reinforcement Learning [PDF]
Offline reinforcement learning (RL), which aims to learn an optimal policy using a previously collected static dataset, is an important paradigm of RL.
Zhendong Wang+2 more
semanticscholar +1 more source
Introduction: Many American employers seek to alleviate employee mental health symptoms through resources like employee assistance programs (EAPs), yet these programs are often underutilized.
Ashley B. West+4 more
doaj +1 more source