Some of the following articles may not be open access.

RewardBench: Evaluating Reward Models for Language Modeling

arXiv.org
Reward models (RMs) are at the crux of successfully using RLHF to align pretrained models to human preferences, yet there has been relatively little study that focuses on evaluation of those models.
Nathan Lambert   +11 more
semanticscholar   +1 more source

Rewarding excellence

Nursing Standard, 1991
Professional awards and scholarships are often regarded as accolades that mere mortals within the world of nursing, midwifery and health visiting stand little chance of getting.
openaire   +2 more sources

Interpretable Preferences via Multi-Objective Reward Modeling and Mixture-of-Experts

Conference on Empirical Methods in Natural Language Processing
Reinforcement learning from human feedback (RLHF) has emerged as the primary method for aligning large language models (LLMs) with human preferences.
Haoxiang Wang   +4 more
semanticscholar   +1 more source

Skywork-Reward: Bag of Tricks for Reward Modeling in LLMs

arXiv.org
In this report, we introduce a collection of methods to enhance reward modeling for LLMs, focusing specifically on data-centric techniques. We propose effective data selection and filtering strategies for curating high-quality open-source preference ...
Chris Liu   +8 more
semanticscholar   +1 more source

Rewarding efficiency

Nursing Management, 2006
Since the new year, and for the first time in the history of the NHS, all eligible patients across England have the right to exercise choice over where and when they receive hospital treatment. They can now choose services that meet their individual needs and preferences.
openaire   +2 more sources

Generative Verifiers: Reward Modeling as Next-Token Prediction

International Conference on Learning Representations
Verifiers or reward models are often used to enhance the reasoning performance of large language models (LLMs). A common approach is the Best-of-N method, where N candidate solutions generated by the LLM are ranked by a verifier, and the best one is ...
Lunjun Zhang   +5 more
semanticscholar   +1 more source

Just rewards

Nursing Standard, 1987
Nurses are not just angry at the proposed abolition of special duty payments, they are livid. It is a matter of basic bread and butter, not the cream on the cake. Perhaps it would not matter if nurses were adequately rewarded in the first place. The real issue is that nurses are grossly underpaid in the first place.
openaire   +2 more sources

Reward Dependence and Reward Deficiency

2016
Homo sapiens are biologically predisposed to drink, eat, reproduce, and desire pleasurable experiences. Underlying the reward value and affective properties of these behaviors and the stimuli that elicit them is an extended cortical–subcortical network in which dopamine (DA) acts as the major neurotransmitter for reward and reinforcement.
Marlene Oscar-Berman, Kenneth Blum
openaire   +1 more source

[Natural rewarding and drug rewarding].

Sheng li ke xue jin zhan [Progress in physiology], 2006
In the brain of animals and humans there is a rewarding mechanism to encourage the behavior that is beneficial for the living of the individual and for the prolongation of the generation. However, when this system is being abused by drugs of addiction, chronic adaptive changes may occur that would cause serious damage to the organism.
Cai-Lian Cui, Ji-Sheng Han
openaire   +1 more source

Adverse health effects of high-effort/low-reward conditions.

Journal of Occupational Health Psychology, 1996
J. Siegrist
semanticscholar   +1 more source
