Lemmatization of Polish person names [PDF]
The paper presents two techniques for lemmatization of Polish person names. First, we apply a rule-based approach which relies on linguistic information and heuristics. Then, we investigate an alternative knowledge-poor method which employs string distance measures. We provide an evaluation of the adopted techniques using a set of newspaper texts.
Piskorski, Jakub +2 more
core +5 more sources
Joint Lemmatization and Morphological Tagging with Lemming [PDF]
We present LEMMING, a modular log-linear model that jointly models lemmatization and tagging and supports the integration of arbitrary global features. It is trainable on corpora annotated with gold standard tags and lemmata and does not rely on morphological dictionaries or analyzers.
Thomas Müller +3 more
openaire +3 more sources
Contextual Urdu Lemmatization Using Recurrent Neural Network Models [PDF]
In the field of natural language processing, machine translation is a rapidly developing research area that helps humans communicate more effectively by bridging the linguistic gap.
Rabab Hafeez +7 more
doaj +2 more sources
Lemmatization with reversed dictionary and fuzzy sets
This paper deals with the problem of lemmatizing unknown words in Russian and German. For this purpose, an improved analogy method is used. The analogy method, built around a reverse dictionary, is very efficient and simple to implement.
Gashkov Alexander, Eltsova Mariia
doaj +3 more sources
Italian Lemmatization by Rules with Getaruns [PDF]
We present an approach to lemmatization based on exhaustive morphological analysis and the use of external knowledge sources to help disambiguation, which is the most relevant issue to cope with. Our system GETARUNS was not concerned with lemmatization directly and used morphological analysis only as a backoff solution in case the word was not retrieved in ...
Rodolfo Delmonte +2 more
openaire +2 more sources
International System of Knowledge Exchange for Young Scientists
The paper proposes a system that provides electronic data storage (for qualification works of students from different countries) and the capability to identify and connect young scientists conducting research in a related problem area. The purpose of
Olesia Barkovska +3 more
doaj +1 more source
Automatic Lemmatizer Construction with Focus on OOV Words Lemmatization [PDF]
This paper deals with the automatic construction of a lemmatizer from a Full Form – Lemma (FFL) training dictionary and with the lemmatization of new words not seen in the FFL dictionary, i.e. out-of-vocabulary (OOV) words. Three methods of lemmatization are introduced for three kinds of OOV words (missing full forms, unknown words, and compound words).
Jakub Kanis, Ludek Müller
openaire +2 more sources
A complex network as an abstraction of a language system has attracted much attention during the last decade. Linguistic typological research using quantitative measures is a current research topic based on the complex network approach.
Aldo Ramirez-Arellano
doaj +1 more source
Neural Lemmatization of Multiword Expressions [PDF]
This article focuses on the lemmatization of multiword expressions (MWEs). We propose a deep encoder-decoder architecture generating for every MWE word its corresponding part in the lemma, based on the internal context of the MWE. The encoder relies on recurrent networks based on (1) the character sequence of the individual words to capture their ...
Marine Schmitt, Mathieu Constant
openaire +2 more sources
nikopartanen/old-literary-finnish-lemmatization: Old Literary Finnish Lemmatization Dataset
This is a dataset that contains randomly selected and manually lemmatized sentences from the corpus of Old Literary Finnish. Please cite and consult the original corpus as well: Institute for the Languages of Finland (2013).
Niko Partanen +3 more
core +1 more source

