Occupational information occurs in many historical sources. For a large number of research areas, not only standardization, but above all classification of these is a central prerequisite for ...
Jan Michael Goldberg, Katrin Moeller
doaj +1 more source
Text Stemming and Lemmatization of Regional Languages in Indonesia: A Systematic Literature Review
Background: Stemming is essential in natural language processing (NLP) because it reduces word variants to their base forms. This procedure facilitates the analysis of textual data and enhances the precision of classification
Zaenal Abidin, Akmal Junaidi, Wamiliana
doaj +1 more source
An Extensible Multilingual Open Source Lemmatizer
We present GATE DictLemmatizer, a multilingual open source lemmatizer for the GATE NLP framework that currently supports English, German, Italian, French, Dutch, and Spanish, and is easily extensible to other languages. The software is freely available under the LGPL license. The lemmatization is based on the Helsinki Finite-State Transducer Technology
Aker, Ahmet +2 more
openaire +2 more sources
Although recent literature on the circular economy (CE) has highlighted the important role of ecosystems, there is still limited understanding of the main themes that characterize circular ecosystems. This study addresses this gap by combining a comprehensive topic modeling analysis employing latent Dirichlet allocation (LDA) with a systematic
Aline Gabriela Ferrari +4 more
wiley +1 more source
Modeling Topics in DFA-Based Lemmatized Gujarati Text
Topic modeling is a machine learning algorithm based on statistics that follows unsupervised machine learning techniques for mapping a high-dimensional corpus to a low-dimensional topical subspace, but it could be better.
Uttam Chauhan +9 more
doaj +1 more source
Beyond Stemming and Lemmatization: Ultra-stemming to Improve Automatic Text Summarization [PDF]
In Automatic Text Summarization, preprocessing is an important phase to reduce the space of textual representation. Classically, stemming and lemmatization have been widely used for normalizing words. However, even using normalization on large texts, the
Torres-Moreno, Juan-Manuel
core
Towards a flexible open-source software library for multi-layered scholarly textual studies: An Arabic case study dealing with semi-automatic language processing [PDF]
This paper presents both the general model and a case study of the Computational and Collaborative Philology Library (CoPhiLib), an ongoing initiative underway at the Institute for Computational Linguistics (ILC) of the National Research Council (CNR ...
Del Grosso, Angelo Mario; Nahli, Ouafae
core +1 more source
Sustainability reports (SRs) are widely criticized for vague disclosures and selective emphasis on positive outcomes, yet systematic research on two core SR challenges remains limited: materiality (whether disclosed content is relevant) and balance (whether both achievements and challenges are reported).
Mahsa Mohammadrezaei +1 more
wiley +1 more source
Learning morphology with Morfette [PDF]
Morfette is a modular, data-driven, probabilistic system which learns to perform joint morphological tagging and lemmatization from morphologically annotated corpora.
Chrupała, Grzegorz +2 more
core +2 more sources

