Unsupervised Named Entity Disambiguation for Low Resource Domains [PDF]
In the ever-evolving landscape of natural language processing and information retrieval, the need for robust and domain-specific entity linking algorithms has become increasingly apparent. It is crucial in a considerable number of fields such as humanities, technical writing and biomedical sciences to enrich texts with semantics and discover more ...
D. V. Datta, Soumajit Pramanik
arxiv +6 more sources
A State of the Art of Word Sense Induction: A Way Towards Word Sense Disambiguation for Under-Resourced Languages [PDF]
Word Sense Disambiguation (WSD), the process of automatically identifying the meaning of a polysemous word in a sentence, is a fundamental task in Natural Language Processing (NLP). Progress in this approach to WSD opens up many promising developments in the field of NLP and its applications.
Mohammad Nasiruddin
arxiv +5 more sources
In this paper, a contrastive learning approach for morphological disambiguation (MD) using large language models (LLMs) is presented. A contrastive loss function is introduced for training the approach, which reduces the distance between the correct ...
Gulmira Tolegen+2 more
doaj +3 more sources
In natural language processing, word sense disambiguation (WSD) continues to be a major difficulty, especially for low-resource languages where linguistic variation and a lack of data make model training and evaluation more difficult.
Hlaudi Daniel Masethe+4 more
doaj +3 more sources
Minimalist Entity Disambiguation for Mid-Resource Languages [PDF]
For many languages and applications, even though enough data is available for training Named Entity Disambiguation (NED) systems, few off-the-shelf models are available for use in practice. This is due to both the large size of state-of-the-art models, and to the computational requirements for recreating them from scratch.
Benno Kruit
openalex +3 more sources
Exploiting a lexical resource for discourse connective disambiguation in German [PDF]
In this paper we focus on connective identification and sense classification for explicit discourse relations in German, as two individual sub-tasks of the overarching Shallow Discourse Parsing task. We successively augment a purely empirical approach based on contextualised embeddings with linguistic knowledge encoded in a connective lexicon.
Peter Bourgonje, Manfred Stede
openalex +3 more sources
Data sets for author name disambiguation: an empirical analysis and a new resource [PDF]
Data sets of publication metadata with manually disambiguated author names play an important role in current author name disambiguation (AND) research. We review the most important data sets used so far, and compare their respective advantages and shortcomings. From the results of this review, we derive a set of general requirements for future AND data
Mark-Christoph Müller+2 more
openalex +5 more sources
An Unsupervised Word Sense Disambiguation System for Under-Resourced Languages
In this paper, we present Watasense, an unsupervised system for word sense disambiguation. Given a sentence, the system chooses the most relevant sense of each input word with respect to the semantic similarity between the given sentence and the synset constituting the sense of the target word. Watasense has two modes of operation. The sparse mode uses
Dmitry Ustalov+5 more
+8 more sources
GeneToList: A Web Application to Assist with Gene Identifiers for the Non-Bioinformatics-Savvy Scientist [PDF]
The increasing incorporation of omics technologies into biomedical research and translational medicine presents challenges to end users of the large and complex datasets that are generated by these methods.
Joshua D. Breidenbach+3 more
doaj +2 more sources
A comprehensive dataset for Arabic word sense disambiguation [PDF]
This data paper introduces a comprehensive dataset tailored for word sense disambiguation tasks, explicitly focusing on a hundred polysemous words frequently employed in Modern Standard Arabic.
Sanaa Kaddoura, Reem Nassar
doaj +2 more sources