
Development of a Hindi Lemmatizer [PDF]

open access: yes, 2013
We live in a translingual society; in order to communicate with people from different parts of the world, we need expertise in their respective languages.
Joshi, Nisheeth   +2 more
core   +3 more sources

Simple data-driven context-sensitive lemmatization [PDF]

open access: yes, 2006
Lemmatization for languages with rich inflectional morphology is one of the basic, indispensable steps in a language processing pipeline. In this paper we present a simple data-driven context-sensitive approach to lemmatizing word forms in running text.
Chrupała, Grzegorz
core   +4 more sources

Universal Lemmatizer: A sequence-to-sequence model for lemmatizing Universal Dependencies treebanks [PDF]

open access: yes, Natural Language Engineering, 2020
In this paper, we present a novel lemmatization method based on a sequence-to-sequence neural network architecture and morphosyntactic context representation. In the proposed method, our context-sensitive lemmatizer generates the lemma one character at a time based on the surface form characters and its morphosyntactic features obtained from a ...
Jenna Kanerva   +2 more
openaire   +2 more sources

Lemmatization of Polish person names [PDF]

open access: yes, Proceedings of the Workshop on Balto-Slavonic Natural Language Processing Information Extraction and Enabling Technologies - ACL '07, 2007
The paper presents two techniques for lemmatization of Polish person names. First, we apply a rule-based approach which relies on linguistic information and heuristics. Then, we investigate an alternative knowledge-poor method which employs string distance measures. We provide an evaluation of the adopted techniques using a set of newspaper texts.
Piskorski, Jakub   +2 more
openaire   +2 more sources

Translating Speech to Indian Sign Language Using Natural Language Processing

open access: yes, Future Internet, 2022
Language plays a vital role in the communication of ideas, thoughts, and information to others. Hearing-impaired people also understand our thoughts using a language known as sign language.
Purushottam Sharma   +4 more
doaj   +1 more source

Hybrid lemmatization in HuSpaCy

open access: yes, 2023
Lemmatization is still not a trivial task for morphologically rich languages. Previous studies showed that hybrid architectures usually work better for these languages and can yield great results. This paper presents a hybrid lemmatizer utilizing a neural model, dictionaries, and hand-crafted rules.
Berkecz, Péter   +4 more
openaire   +2 more sources

Morphological Tagging and Lemmatization in the Albanian Language

open access: yes, SEEU Review, 2021
An important element of Natural Language Processing is part-of-speech tagging. With fine-grained word-class annotations, the word forms in a text can be enhanced and can also be used in downstream processes, such as dependency parsing.
Mati Diellza Nagavci   +2 more
doaj   +1 more source

Developing Core Technologies for Resource-Scarce Nguni Languages

open access: yes, Information, 2021
The creation of linguistic resources is crucial to the continued growth of research and development efforts in the field of natural language processing, especially for resource-scarce languages.
Jakobus S. du Toit, Martin J. Puttkammer
doaj   +1 more source

Revolutionizing Bantu lexicography: a Zulu case study [PDF]

open access: yes, 2010
Zulu uses a conjunctive writing system, that is, a system whereby relatively short linguistic words are joined together to form long orthographic words with complex morphological structures.
de Schryver, Gilles-Maurice
core   +4 more sources

Automatic Lemmatizer Construction with Focus on OOV Words Lemmatization [PDF]

open access: yes, 2005
This paper deals with the automatic construction of a lemmatizer from a Full Form – Lemma (FFL) training dictionary and with the lemmatization of new words not seen in the FFL dictionary, i.e. out-of-vocabulary (OOV) words. Three methods of lemmatization of three kinds of OOV words (missing full forms, unknown words, and compound words) are introduced.
Jakub Kanis, Luděk Müller
openaire   +1 more source
