Some of the following articles may not be open access.
Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference
International Conference on Machine Learning
Large Language Models (LLMs) have unlocked new capabilities and applications; however, evaluating the alignment with human preferences still poses significant challenges.
Wei-Lin Chiang +10 more
semanticscholar +1 more source
SimPO: Simple Preference Optimization with a Reference-Free Reward
Neural Information Processing Systems
Direct Preference Optimization (DPO) is a widely used offline preference optimization algorithm that reparameterizes reward functions in reinforcement learning from human feedback (RLHF) to enhance simplicity and training stability.
Yu Meng, Mengzhou Xia, Danqi Chen
semanticscholar +1 more source
ORPO: Monolithic Preference Optimization without Reference Model
Conference on Empirical Methods in Natural Language Processing
While recent preference alignment algorithms for language models have demonstrated promising results, supervised fine-tuning (SFT) remains imperative for achieving successful convergence.
Jiwoo Hong, Noah Lee, James Thorne
semanticscholar +1 more source
MeLU: Meta-Learned User Preference Estimator for Cold-Start Recommendation
Knowledge Discovery and Data Mining, 2019
This paper proposes a recommender system to alleviate the cold-start problem that can estimate user preferences based on only a small number of items.
Hoyeop Lee +4 more
semanticscholar +1 more source
Metrizable preferences over preferences
Social Choice and Welfare, 2020
Laffond, Gilbert +2 more
openaire +1 more source
Iterative Reasoning Preference Optimization
Neural Information Processing Systems
Iterative preference optimization methods have recently been shown to perform well for general instruction tuning tasks, but typically make little improvement on reasoning tasks (Yuan et al., 2024, Chen et al., 2024). In this work we develop an iterative
Richard Yuanzhe Pang +5 more
semanticscholar +1 more source
HelpSteer2-Preference: Complementing Ratings with Preferences
International Conference on Learning Representations
Reward models are critical for aligning models to follow instructions, and are typically trained following one of two popular paradigms: Bradley-Terry style or Regression style. However, there is a lack of evidence that either approach is better than the
Zhilin Wang +7 more
semanticscholar +1 more source
Aesthetic preference and lateral preferences
Neuropsychologia, 1986
Subjects expressed preference for original or mirror-reversed versions of paintings. Hand preference predicted a significant proportion of the choice variance, but eye, foot and ear preference did not, nor did family sinistrality.
openaire +2 more sources
2012
This article explores our current understanding of why we like and choose to listen to the music that we do. It begins by defining terms and considering methods, moving on to discuss the biological influences of arousal and other personality traits on music preference, questions of style discrimination, and finally the cultural influences of experience
Alinka Greasley, Alexandra Lamont
openaire +1 more source
2012
Personal experience, learned eating behaviors, hormones, neurotransmitters, and genetic variations affect food consumption. The decision of what to eat is modulated by taste, olfaction, and oral textural perception. Taste, in particular, has an important input into food preference, permitting individuals to differentiate nutritive and harmful ...
María Mercedes Galindo +4 more
openaire +2 more sources

