Some of the following articles may not be open access.
ACM Transactions on Software Engineering and Methodology
Evaluating the alignment of large language models (LLMs) with user-defined coding preferences is a challenging endeavor that requires a deep assessment of LLMs’ outputs.
M. Weyssow +3 more
semanticscholar +1 more source
Annual Meeting of the Association for Computational Linguistics
Recently, there has been a growing trend of utilizing Large Language Model (LLM) to evaluate the quality of other LLMs. Many studies have fine-tuned judge models based on open-source LLMs for evaluation.
Hui Huang +6 more
semanticscholar +1 more source
CompassJudger-1: All-in-one Judge Model Helps Model Evaluation and Evolution
arXiv.org
Efficient and accurate evaluation is crucial for the continuous improvement of large language models (LLMs). Among various assessment methods, subjective evaluation has garnered significant attention due to its superior alignment with real-world usage ...
Maosong Cao +5 more
semanticscholar +1 more source
Human-Centered Design Recommendations for LLM-as-a-judge
HUCLLM
Traditional reference-based metrics, such as BLEU and ROUGE, are less effective for assessing outputs from Large Language Models (LLMs) that produce highly creative or superior-quality text, or in situations where reference outputs are unavailable. While
Qian Pan +7 more
semanticscholar +1 more source
Judge response theory? A call to upgrade our psychometrical account of creativity judgments.
Psychology of Aesthetics, Creativity, and the Arts, 2019
The Consensual Assessment Technique (CAT)—more generally, using product creativity judgments—is a central and actively debated method to assess product and individual creativity.
Nils Myszkowski, M. Storme
semanticscholar +1 more source
2006
Abstract Several of the preceding chapters have examined the evolution of the Court’s case law over time in particular fields. In this final chapter, that process of evolution takes centre stage and an attempt is made to set it in its wider legal and political context.
openaire +1 more source
2012
“The nation will judge both the offender and judges for themselves.” Jefferson to William B. Giles, April 20, 1807 “…His Honor did not for two days understand either the questions or himself…” Burr on Marshall, September 20, 1807 “Our Treason Laws may be defective, but I believe Marshall’s Conduct strictly and correctly legal as the Laws now ...
openaire +1 more source
ANZ Journal of Surgery, 2007
In three distinct situations, judges may be obliged to pronounce on doctors’ opinions or conduct. The first of these is where they are deciding actions involving claims for personal injuries in respect of which doctors have given opinions to the court. The second situation in which the judge may be obliged to pronounce on a doctor’s opinion or conduct ...
openaire +2 more sources
Soviet Law and Government, 1988
It happened on the eve of the election. A reader called the editorial office to ask: "Is it true that your correspondent Borin wrote an article defending his relative?" Generally, such "sensations" are nothing new to newspapermen; no sooner do we return from an assignment than the mud is already flying at our backs, faster than speeding bullets.
openaire +1 more source
Abstract There are already some small-scale automated decision-making processes that have been introduced in the judicial arena. In addition, there are AI systems that can ‘nudge’, ‘prompt’, or ‘correct’ judges when making decisions, as well as generative forms of AI that could support judicial decision-making.
Tania Sourdin, Ella Sourdin Brown
openaire +1 more source

