POS Tagging and Lemmatization of Historical Varieties of Languages. The Challenge of Old Italian
The paper discusses the challenges of POS tagging and lemmatization of historical varieties of Italian, and reports for both tasks the results of experiments carried out in a classical supervised domain adaptation scenario using the diachronic and ...
Manuel Favaro +2 more
The lemmatization of copulatives in Northern Sotho
For learners of Northern Sotho as a second or even foreign language, the copulative system is probably the most complicated grammatical system to master. The encoding needs of such learners, i.e. to find enough information in dictionaries in order to actively use copulatives in speech and writing, are poorly served in currently available dictionaries ...
Specyfika staropolszczyzny a anotacja gramatyczna. O lematyzacji tekstu staropolskiego
Specific Features of the Old Polish Language and Grammatical Annotation: Lemmatization of Old Polish Texts. The article is dedicated to the grammatical annotation of Old Polish texts. It discusses the problem of lemmatizing a medieval text.
Magdalena Wismont +3 more
Method of lemmatizer selections in multiplexing lemmatization
O A Sychev, N A Penskoy
Analyzing Multilingual French and Russian Text using NLTK, spaCy, and Stanza
This lesson covers tokenization, part-of-speech tagging, and lemmatization, as well as automatic language detection, for non-English and multilingual text.
Ian Goodale
Improving Statistical MT through Morphological Analysis
Sharon Goldwater, David McClosky
BabyLemmatizer: A Lemmatizer and POS-tagger for Akkadian
We present a hybrid lemmatizer and POS-tagger for Akkadian, the language of the ancient Assyrians and Babylonians, documented from 2350 BCE to 100 CE. In our approach, the text is first POS-tagged and lemmatized with TurkuNLP trained on human-verified labels, and then post-corrected with dictionary-based methods to improve the lemmatization quality ...
Sahala Aleksi +3 more
BanglaVerb: A sentence-level dataset for transitivity classification in Bangla NLP.
Koli ZM, Alam MJ, Sultana Z, Khan AA.
ConversationAlign: Open-source software for analyzing patterns of lexical use and alignment in conversation transcripts.
Sacks B +7 more
AI driven web crawling for semantic extraction of news content from newspapers.
S S, A K AA.

