In this paper, a contrastive learning approach for morphological disambiguation (MD) using large language models (LLMs) is presented. A contrastive loss function is introduced for training the approach, which reduces the distance between the correct ...
Gulmira Tolegen+2 more
doaj +3 more sources
Hybrid Transformer-Based Large Language Models for Word Sense Disambiguation in the Low-Resource Sesotho sa Leboa Language [PDF]
This study addresses a lexical ambiguity issue in Sesotho sa Leboa that arises from terms with various meanings, often known as homonyms or polysemous words.
Hlaudi Daniel Masethe+4 more
doaj +3 more sources
In natural language processing, word sense disambiguation (WSD) continues to be a major difficulty, especially for low-resource languages where linguistic variation and a lack of data make model training and evaluation more difficult.
Hlaudi Daniel Masethe+4 more
doaj +3 more sources
Minimalist Entity Disambiguation for Mid-Resource Languages [PDF]
For many languages and applications, even though enough data is available for training Named Entity Disambiguation (NED) systems, few off-the-shelf models are available for use in practice. This is due to both the large size of state-of-the-art models, and to the computational requirements for recreating them from scratch.
Benno Kruit
openalex +3 more sources
Exploiting a lexical resource for discourse connective disambiguation in German [PDF]
In this paper we focus on connective identification and sense classification for explicit discourse relations in German, as two individual sub-tasks of the overarching Shallow Discourse Parsing task. We successively augment a purely-empirical approach based on contextualised embeddings with linguistic knowledge encoded in a connective lexicon.
Peter Bourgonje, Manfred Stede
openalex +3 more sources
Unsupervised Named Entity Disambiguation for Low Resource Domains [PDF]
In the ever-evolving landscape of natural language processing and information retrieval, the need for robust and domain-specific entity linking algorithms has become increasingly apparent. It is crucial in a considerable number of fields such as humanities, technical writing and biomedical sciences to enrich texts with semantics and discover more ...
D. V. Datta, Soumajit Pramanik
Data sets for author name disambiguation: an empirical analysis and a new resource [PDF]
Data sets of publication metadata with manually disambiguated author names play an important role in current author name disambiguation (AND) research. We review the most important data sets used so far, and compare their respective advantages and shortcomings. From the results of this review, we derive a set of general requirements to future AND data
Mark-Christoph Müller+2 more
openalex +5 more sources
An Unsupervised Word Sense Disambiguation System for Under-Resourced Languages
In this paper, we present Watasense, an unsupervised system for word sense disambiguation. Given a sentence, the system chooses the most relevant sense of each input word with respect to the semantic similarity between the given sentence and the synset constituting the sense of the target word. Watasense has two modes of operation. The sparse mode uses
Dmitry Ustalov+5 more
A comprehensive dataset for Arabic word sense disambiguation [PDF]
This data paper introduces a comprehensive dataset tailored for word sense disambiguation tasks, explicitly focusing on a hundred polysemous words frequently employed in Modern Standard Arabic.
Sanaa Kaddoura, Reem Nassar
doaj +2 more sources
Word Sense Disambiguation in Native Spanish: A Comprehensive Lexical Evaluation Resource [PDF]
Human language, while aimed at conveying meaning, inherently carries ambiguity. It poses challenges for speech and language processing, but also serves crucial communicative functions. Efficiently resolving ambiguity is both a desired and a necessary characteristic.
Pablo Ortega+4 more
openalex +3 more sources