The LiLa Lemma Bank: A Knowledge Base of Latin Canonical Forms

open access: yes, Journal of Open Humanities Data, 2023
The dataset contains a list of 215,102 Latin dictionary forms (known as canonical forms or lemmas). The dataset is a set of 1,699,687 Resource Description Framework (RDF) triples that describe, using a series of Web Ontology Language (OWL) ontologies for
Francesco Mambrini   +1 more
doaj   +1 more source

The regularization of Old English weak verbs

open access: yes, Revista de Lingüística y Lenguas Aplicadas, 2015
This article deals with the regularization of non-standard spellings of the verbal forms extracted from a corpus. It addresses the question of what the limits of regularization are when lemmatizing Old English weak verbs.
Marta Tío Sáenz
doaj   +1 more source

Annif Analyzer Shootout: Comparing text lemmatization methods for automated subject indexing

open access: yes, Code4Lib Journal, 2022
Automated text classification is an important function for many AI systems relevant to libraries, including automated subject indexing and classification.
Osma Suominen, Ilkka Koskenniemi
doaj  

BanglaLem: A Transformer-based Bangla Lemmatizer with an Enhanced Dataset

open access: yes, Systems and Soft Computing
Lemmatization plays a crucial role in various natural language processing (NLP) tasks, such as information retrieval, sentiment analysis, text summarization, and text classification.
Md Fuadul Islam   +4 more
doaj   +1 more source

Text Stemming and Lemmatization of Regional Languages in Indonesia: A Systematic Literature Review

open access: yes, Journal of Information Systems Engineering and Business Intelligence
Background: Stemming is significantly essential in natural language processing (NLP) due to the ability to minimize word variations to fundamental forms. This procedure facilitates the analysis of textual data and enhances the precision of classification
Zaenal Abidin, Akmal Junaidi, Wamiliana
doaj   +1 more source

Editing Middle English Medical Manuscripts: The Case of Glasgow University Library MS Hunter 509

open access: yes, Journal of English Studies, 2011
It has been pointed out that the editing of a scientific treatise should be “an extended and challenging exercise in judgment, requiring an earnest commitment to scholarship” (Keiser 1998: 110).
María Laura Esteban-Segura
doaj   +1 more source

Transformer-based part-of-speech tagging and lemmatization for Latin [PDF]

open access: yes, 2022
The paper presents a submission to the EvaLatin 2022 shared task. Our system places first for lemmatization, part-of-speech and morphological tagging in both closed and open modalities.
Wróbel, Krzysztof, Nowak, Krzysztof
core  

The lemmatization of Old English Verbs from the second weak class on a lexical database

open access: yes, Journal of English Studies, 2015
This article compiles a list of lemmas of the second class weak verbs of Old English by using the latest version of the lexical database Nerthus, which incorporates the texts of the Dictionary of Old English Corpus.
Marta Tío Sáenz
doaj   +1 more source

Processing Tools for Greek and Other Languages of the Christian Middle East [PDF]

open access: yes, Journal of Data Mining and Digital Humanities, 2018
This paper presents some computer tools and linguistic resources of the GREgORI project. These developments allow automated processing of texts written in the main languages of the Christian Middle East, such as Greek, Arabic, Syriac, Armenian and ...
Bastien Kindt
doaj   +1 more source
