
Kurdish Kurmanji Lemmatization and Spell-checker with Spell-correction

Open access: yes. UHD Journal of Science and Technology, 2023
Many studies address lemmatization and spell-checking with spell-correction for English, Arabic, and Persian, but only a few exist for low-resource languages such as Kurdish, and more specifically for ...
Hanar Hoshyar Mustafa, Rebwar M. Nabi

Hybrid lemmatization in HuSpaCy [PDF]

Open access: yes. CoRR, 2023
Lemmatization is still not a trivial task for morphologically rich languages. Previous studies showed that hybrid architectures usually work better for these languages and can yield strong results. This paper presents a hybrid lemmatizer that combines a neural model, dictionaries, and hand-crafted rules.
Péter Berkecz   +4 more

Towards an Optimal Solution to Lemmatization in Arabic

Open access: yes. Procedia Computer Science, 2018
Lemmatization, the computation of the canonical forms of words in running text, is an important component in any NLP system and a key preprocessing step for most applications that rely on natural language understanding. In the case of Arabic, lemmatization is a complex task because of the rich morphology, agglutinative aspects, and lexical ambiguity ...
Abed Alhakim Freihat   +2 more

The Lemmatization of Copulatives in Northern Sotho

Open access: yes. Lexikos, 2011
For learners of Northern Sotho as a second or even foreign language, the copulative system is probably the most complicated grammatical system to master. The encoding needs of such learners, i.e.
D.J. Prinsloo

Korpus DIA1900: jeho koncepce a vytváření (The DIA1900 Corpus: Its Design and Construction) [PDF]

Open access: yes. Časopis pro Moderní Filologii, 2023
The objective of the paper is to describe the principles for building the one-million-word DIA1900 Corpus, which consists of Czech texts published between 1851 and 1900 and is designed to be both balanced and representative.
Lucie Benešová   +4 more

Universal Lemmatizer: A sequence-to-sequence model for lemmatizing Universal Dependencies treebanks [PDF]

Open access: yes. Natural Language Engineering, 2020
In this paper, we present a novel lemmatization method based on a sequence-to-sequence neural network architecture and morphosyntactic context representation. In the proposed method, our context-sensitive lemmatizer generates the lemma one character at a time based on the surface form characters and its morphosyntactic features obtained from a ...
Jenna Kanerva   +2 more

Advances in the automatic lemmatization of Old English: class V strong verbs (L-Y)

Open access: yes. Revista de Lingüística y Lenguas Aplicadas, 2022
The grammatical description of Old English lacks complete and systematic lemmatization, which hinders Natural Language Processing studies in this language, as they strongly rely on the existence of large, annotated corpora.
Roberto Torre Alonso

BaNeL: an encoder-decoder based Bangla neural lemmatizer

Open access: yes. SN Applied Sciences, 2022
This study presents an efficient framework for deriving the lemma of an inflected Bangla word, using its part of speech as context. Bangla is a morphologically rich Indo-Aryan language where around 70% of words are inflected, and some words have around ...
Md. Ashraful Islam   +4 more

An alternative proposal for eliciting key words [PDF]

Open access: yes. English Studies at NBU, 2015
The article reports research on the concept of key words as statistically significant items in a text or corpus. It reviews approaches to eliciting key words used in various software products for language analysis and the rationale for adopting them ...
Elena Tarasheva

Enhancing Accuracy of Semantic Relatedness Measurement by Word Single-Meaning Embeddings

Open access: yes. IEEE Access, 2021
We propose a lightweight algorithm for learning word single-meaning embeddings (WSME) that draws on WordNet synsets and Doc2vec document embeddings to improve the accuracy of semantic relatedness measurement.
Xiaotao Li, Shujuan You, Wai Chen
