Lemmatization of Inflected Nouns

2021
In this chapter, we describe a process for lemmatizing inflected nouns in Bengali as part of lexical processing. Inflected nouns occur with very high frequency in Bengali texts. We first collect a large number of inflected nouns from a Bengali corpus and compile a noun database.
Niladri Sekhar Dash

Graph-based lemmatization of Turkish words by using morphological similarity

open access: yes, 2016
Enis Arslan, Umut Orhan

A hybrid approach for Arabic lemmatization

International Journal of Speech Technology, 2018
In this article, we present an Arabic lemmatizer that assigns a single lemma to each word of an Arabic sentence, taking the word's context into account. The proposed system comprises two modules. The first performs an out-of-context analysis based on the morphosyntactic analyzer Alkhalil Morpho Sys 2.
Mohamed Boudchiche, Azzeddine Mazroui

Hybrid Lemmatizer for Estonian

2014
In this paper, we present a lemmatizer for the Estonian language that employs a hybrid approach to handle both in-vocabulary and out-of-vocabulary words. Our method uses only publicly available data and does not require any external tools such as a POS tagger. In our experiments, we achieved an accuracy of 91%.
Alexander Tkachenko   +2 more

Automatic lemmatization of Persian words

Journal of Quantitative Linguistics, 2006
This study presents a rather novel method for stripping suffixes and prefixes from Persian words. The method is language-independent and relies mostly on a specially arranged, manually compiled corpus consisting of a list of roots, word-forms, prefixes, and suffixes.

A Lemmatizer Tool for Assamese Language

2019
Word Sense Disambiguation (WSD) requires sense-tagged corpora. Words in a corpus appear in inflected or morphed forms, but sense tagging can only be done on words in their root or lemmatized forms. Similarly, part-of-speech (POS) tagging requires the words in a corpus to be available in their root forms.
Arindam Roy   +2 more

Lemmatization and Headword Structure

1997
We now come to the actual structure and presentation of Palsgrave’s word list, and, as we can see from the following examples, his lexicographical method of presentation differs greatly from modern practice in bilingual English-French dictionaries.

Amazlem: The First Amazigh Lemmatizer

2023
Rkia Bani   +3 more
