A Survey on Evaluation of Large Language Models [PDF]
Large language models (LLMs) are gaining increasing popularity in both academia and industry, owing to their unprecedented performance in various applications. As LLMs continue to play a vital role in both research and daily use, their evaluation becomes
Yu-Chu Chang+15 more
semanticscholar +1 more source
CLIPScore: A Reference-free Evaluation Metric for Image Captioning [PDF]
Image captioning has conventionally relied on reference-based automatic evaluations, where machine captions are compared against captions written by humans. This is in contrast to the reference-free manner in which humans assess caption quality.
Jack Hessel+4 more
semanticscholar +1 more source
A research agenda for malaria eradication: monitoring, evaluation, and surveillance. [PDF]
Monitoring, evaluation, and surveillance measure how well public health programs operate over time and achieve their goals. As countries approach malaria elimination, these activities will need to shift from measuring reductions in morbidity and ...
malERA Consultative Group on Monitoring, Evaluation, and Surveillance
doaj +1 more source
Holistic Evaluation of Language Models
Language models (LMs) like GPT-3, PaLM, and ChatGPT are the foundation for almost all major language technologies, but their capabilities, limitations, and risks are not well understood. We present Holistic Evaluation of Language Models (HELM) to improve
Percy Liang+49 more
semanticscholar +1 more source
Risk of herpes zoster following mRNA COVID-19 vaccine administration
Background: Adverse events following mRNA COVID-19 vaccines, including herpes zoster (HZ), have been reported. We conducted a cohort study to evaluate the association between mRNA COVID-19 vaccination and subsequent HZ at Kaiser Permanente Southern ...
Ana Florea+7 more
doaj +1 more source
Methacholine challenges: comparison of different tidal breathing challenge methods
Tidal-breathing methacholine challenges are now recommended by guidelines, to avoid the bronchoprotective effects of deep inhalation. This study compared different tidal breathing methacholine challenge methods; assessed the agreement between tidal ...
James Dean+4 more
doaj +1 more source
CIDEr: Consensus-based image description evaluation [PDF]
Automatically describing an image with a sentence is a long-standing challenge in computer vision and natural language processing. Due to recent progress in object detection, attribute classification, action recognition, etc., there is renewed interest ...
Ramakrishna Vedantam+2 more
semanticscholar +1 more source
Background: Many countries have made substantial progress in scaling up and sustaining malaria intervention coverage, leading to more focalized and heterogeneous transmission in many settings.
Ruth A. Ashton+4 more
doaj +1 more source
Introduction: SB12 is a biosimilar to eculizumab reference product [Soliris™ (Soliris is a trademark of Alexion Pharmaceuticals, Inc.)] that acts as a C5 complement protein inhibitor. The infusion stability of in-use (diluted) SB12 outside the conditions
Minji Tak+6 more
doaj +1 more source
Background: The effects of triple therapy on gas trapping in COPD are not fully understood. We evaluated the effects of the long-acting bronchodilator components of the extrafine single inhaler triple therapy beclometasone dipropionate/formoterol ...
James Dean+3 more
doaj +1 more source