LLMs Judging LLMs: A Simplex Perspective
Given the challenge of automatically evaluating free-form outputs from large language models (LLMs), an increasingly common solution is to use LLMs themselves as the judging mechanism, without any gold-standard scores. Implicitly, this practice accounts for only sampling variability (aleatoric uncertainty) and ignores uncertainty about judge quality ...
Vossler, Patrick +4 more
openaire +2 more sources
Particle-hole Asymmetry of Fractional Quantum Hall States in the Second Landau Level of a Two-dimensional Hole System [PDF]
We report the first unambiguous observation of a fractional quantum Hall state in the second Landau level of a two-dimensional hole sample at the filling factor $\nu=8/3$.
Csathy, G. A. +5 more
core +3 more sources
Evaluating the Utilities of Foundation Models in Single‐Cell Data Analysis
This study delivers the first systematic, task‐level evaluation of single‐cell foundation models across eight core analytical tasks. By benchmarking 10 leading models with the scEval framework, it reveals where foundation models truly add value, where task‐specific methods still dominate, and provides concrete, reproducible guidelines to steer the next
Tianyu Liu +4 more
wiley +1 more source
This study explores the effectiveness of prompt optimization techniques for legal case outcome extraction using Large Language Models (LLMs). Two state-of-the-art LLMs, LLaMA3 70b and Mixtral 8x7b, are used in a zero-shot data extraction task on a ...
Guillaume Zambrano
doaj +1 more source
LLM-SQL-Solver: Can LLMs Determine SQL Equivalence?
Judging the equivalence between two SQL queries is a fundamental problem with many practical applications in data management and SQL generation (i.e., evaluating the quality of generated SQL queries in text-to-SQL task). While the research community has reasoned about SQL equivalence for decades, it poses considerable difficulties and no complete ...
Zhao, Fuheng +5 more
openaire +2 more sources
This study generates high‐fidelity synthetic longitudinal records for a million‐patient diabetes cohort, successfully replicating clinical predictive performance. However, deeper analysis reveals algorithmic biases and trajectory inconsistencies that escape standard quality metrics. These findings challenge current validation norms, demonstrating why a
Francisco Ortuño +5 more
wiley +1 more source
LLM-Supported Manufacturing Mapping Generation [PDF]
In large manufacturing companies, such as Bosch, that operate thousands of production lines, each comprising up to dozens of production machines and other equipment, even simple inventory questions such as the location and quantities of a particular ...
Schmidt, Wilma Johanna +3 more
doaj +1 more source
LLM-AutoDiff: Auto-Differentiate Any LLM Workflow
Large Language Models (LLMs) have reshaped natural language processing, powering applications from multi-hop retrieval and question answering to autonomous agent workflows. Yet, prompt engineering -- the task of crafting textual inputs to effectively direct LLMs -- remains difficult and labor-intensive, particularly for complex pipelines that combine ...
Yin, Li; Wang, Zhangyang
openaire +2 more sources
Causal Prediction of TP53 Variant Pathogenicity Using a Perturbation‐Informed Protein Language Model
A TP53‐specific predictor, CaVepP53, is developed by fine‐tuning ESMC on experimentally validated variants, quantifying pathogenicity via Euclidean distances. It outperforms general‐purpose models and extends to five cancer genes, enabling interpretable variant classification for precision medicine.
Huiying Chen +15 more
wiley +1 more source
Large Language Models: A Socio-Philosophical Essay
Neural networks have filled the information space. On the one hand, this indicates the scientific and technological movement of contemporary society (perhaps, AGI is already waiting for us outside the door). On the other hand, in everyday discourse there
Regina V. Penner
doaj +1 more source

