Results 321 to 330 of about 2,703,795
Some of the following articles may not be open access.


Institutional Systems for Equitable Wealth Creation: Replication and an Update of Judge et al. (2014)

Management and Organization Review, 2020
This replication study was invited by the Editor in Chief of Management and Organization Review, Arie Y. Lewin. The original study by Judge, Fainshmidt, and Brown (2014) spanned the global financial crisis (2005–2010), and as such, this anomalous time ...
William Q. Judge   +2 more
exaly   +2 more sources

A Survey on LLM-as-a-Judge

arXiv.org
Accurate and consistent evaluation is crucial for decision-making across numerous fields, yet it remains a challenging task due to inherent subjectivity, variability, and scale. Large Language Models (LLMs) have achieved remarkable success across diverse ...
Jiawei Gu   +11 more
semanticscholar   +1 more source

Preference Leakage: A Contamination Problem in LLM-as-a-judge

arXiv.org
Large Language Models (LLMs) as judges and LLM-based data synthesis have emerged as two fundamental LLM-driven data annotation methods in model development.
Dawei Li   +8 more
semanticscholar   +1 more source

Judging Judge Fixed Effects

American Economic Review, 2023
We propose a nonparametric test for the exclusion and monotonicity assumptions invoked in instrumental variable (IV) designs based on the random assignment of cases to judges. We show its asymptotic validity and demonstrate its finite-sample performance in simulations. We apply our test in an empirical setting from the literature examining the effects ...
Brigham Frandsen   +2 more
openaire   +1 more source

Can LLMs Replace Human Evaluators? An Empirical Study of LLM-as-a-Judge in Software Engineering

Proc. ACM Softw. Eng.
Recently, large language models (LLMs) have been deployed to tackle various software engineering (SE) tasks like code generation, significantly advancing the automation of SE tasks.
Ruiqi Wang   +5 more
semanticscholar   +1 more source

JudgeLRM: Large Reasoning Models as a Judge

arXiv.org
Large Language Models (LLMs) are increasingly adopted as evaluators, offering a scalable alternative to human annotation. However, existing supervised fine-tuning (SFT) approaches often fall short in domains that demand complex reasoning.
Nuo Chen   +6 more
semanticscholar   +1 more source

Learning to Plan & Reason for Evaluation with Thinking-LLM-as-a-Judge

International Conference on Machine Learning
LLM-as-a-Judge models generate chain-of-thought (CoT) sequences intended to capture the step-by-step reasoning process that underlies the final evaluation of a response.
Swarnadeep Saha   +4 more
semanticscholar   +1 more source

From Generation to Judgment: Opportunities and Challenges of LLM-as-a-judge

Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Assessment and evaluation have long been critical challenges in artificial intelligence (AI) and natural language processing (NLP). Traditional methods, usually matching-based or small model-based, often fall short in open-ended and dynamic scenarios ...
Dawei Li   +12 more
semanticscholar   +1 more source
