
The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation. [PDF]

open access: gold; BMC Genomics, 2020
Abstract. Background: To evaluate binary classifications and their confusion matrices, scientific researchers can employ several statistical rates, according to the goal of the experiment they are investigating. Despite being a crucial issue in machine learning, no widespread consensus has been reached on a unified elective chosen measure yet.
Chicco D, Jurman G.
europepmc   +10 more sources

90% F1 Score in Relation Triple Extraction: Is it Real? [PDF]

open access: hybrid; Proceedings of the 1st GenBench Workshop on (Benchmarking) Generalisation in NLP, 2023
Accepted in GenBench workshop @ EMNLP ...
Pratik Saini   +3 more
semanticscholar   +5 more sources

Machine-learning classification of astronomical sources: estimating F1-score in the absence of ground truth [PDF]

open access: green; Monthly Notices of the Royal Astronomical Society: Letters, 2022
Abstract: Machine-learning based classifiers have become indispensable in the field of astrophysics, allowing separation of astronomical sources into various classes, with computational efficiency suitable for application to the enormous data volumes that wide-area surveys now typically produce.
A. Humphrey   +7 more
semanticscholar   +7 more sources

Anomaly Detection: How to Artificially Increase your F1-Score with a Biased Evaluation Protocol [PDF]

open access: green; ECML/PKDD, 2021
Anomaly detection is a widely explored domain in machine learning. Many models are proposed in the literature, and compared through different metrics measured on various datasets. The most popular metrics used to compare performances are F1-score, AUC and AVPR. In this paper, we show that F1-score and AVPR are highly sensitive to the contamination rate.
Damien Fourure   +3 more
semanticscholar   +7 more sources

Maximum F1-score training for end-to-end mispronunciation detection and diagnosis of L2 English speech [PDF]

open access: green; IEEE International Conference on Multimedia and Expo, 2021
End-to-end (E2E) neural models are increasingly attracting attention as a promising modeling approach for mispronunciation detection and diagnosis (MDD). Typically, these models are trained by optimizing a cross-entropy criterion, which corresponds to improving the log-likelihood of the training data.
Bicheng Yan   +3 more
semanticscholar   +5 more sources

Probabilistic Extension of Precision, Recall, and F1 Score for More Thorough Evaluation of Classification Models [PDF]

open access: hybrid; Proceedings of the First Workshop on Evaluation and Comparison of NLP Systems, 2020
In pursuit of the perfect supervised NLP classifier, razor thin margins and low-resource test sets can make modeling decisions difficult. Popular metrics such as Accuracy, Precision, and Recall are often insufficient as they fail to give a complete picture of the model’s behavior. We present a probabilistic extension of Precision, Recall, and F1 score,
Reda Yacouby, Dustin Axman
semanticscholar   +4 more sources

An intruder from another world: F1-score.

open access: diamond; Revista Electrónica AnestesiaR
The F1-score, also called the F-score or F-measure, is an estimator of a test's classification ability that is frequently used in data science and in artificial intelligence algorithms, and that can be useful for assessing diagnostic tests.
Manuel Molina
semanticscholar   +6 more sources

About Evaluation of F1 Score for RECENT Relation Extraction System [PDF]

open access: green; arXiv.org, 2023
This document contains a discussion of the F1 score evaluation used in the article 'Relation Classification with Entity Type Restriction' by Shengfei Lyu, Huanhuan Chen published on Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021.
Michał Olek
semanticscholar   +5 more sources

Confidence Intervals for the F1 Score: A Comparison of Four Methods

open access: green, 2023
In Natural Language Processing (NLP), binary classification algorithms are often evaluated using the F1 score. Because the sample F1 score is an estimate of the population F1 score, it is not sufficient to report the sample F1 score without an indication
Kevin Fu Yuan Lam
core   +5 more sources

Keeping Pathologists in the Loop and an Adaptive F1-Score Threshold Method for Mitosis Detection in Canine Perivascular Wall Tumours. [PDF]

open access: gold; Cancers (Basel)
Performing a mitosis count (MC) is the diagnostic task of histologically grading canine Soft Tissue Sarcoma (cSTS). However, mitosis count is subject to inter- and intra-observer variability. Deep learning models can offer a standardisation in the process of MC used to histologically grade canine Soft Tissue Sarcomas.
Rai T   +8 more
europepmc   +6 more sources
