Cross-Lingual Link Discovery for Under-Resourced Languages [PDF]
This article is based upon work from COST Action NexusLinguarum – "European network for Web-centred linguistic data science" (CA18209), supported by COST (European Cooperation in Science and Technology) www.cost.eu. This work is also partially supported by the I+D+i project PID2020-113903RBI00, funded by MCIN/AEI/10.13039/501100011033, by DGA/FEDER ...
Rosner, Michael +12 more
core +8 more sources
Acoustic Modelling for Under-Resourced Languages
Automatic speech recognition systems have so far been developed only for very few languages out of the 4,000-7,000 existing ones. In this thesis we examine methods to rapidly create acoustic models in new, possibly under-resourced languages, in a time and cost effective manner.
Stüker, Sebastian
openaire +5 more sources
Modeling under-resourced languages for speech recognition [PDF]
One particular problem in large-vocabulary continuous speech recognition for low-resourced languages is finding relevant training data for the statistical language models. A large amount of data is required, because the models should estimate the probability of all possible word sequences.
Mikko Kurimo +5 more
core +5 more sources
Improving Wordnets for Under-Resourced Languages Using Machine Translation
Wordnets are extensively used in natural language processing, but the current approaches for manually building a wordnet from scratch involve large research groups working for a long period of time, which are typically not available for under-resourced languages.
Chakravarthi, Bharathi Raja +2 more
openaire +3 more sources
Introduction to the special issue on processing under-resourced languages
School of Interdisciplinary Research and Graduate Studies, University of South Africa, Pretoria, South Africa; Institute for Computational Linguistics A.
Laurent Besacier +3 more
openaire +5 more sources
MaCoCu: Massive collection and curation of monolingual and bilingual data: focus on under-resourced languages [PDF]
We introduce the project MaCoCu: Massive collection and curation of monolingual and bilingual data: focus on under-resourced languages, funded by the Connecting Europe Facility, which is aimed at building monolingual and parallel corpora for under ...
Vanroy, Bram +28 more
core +8 more sources
Cross-lingual sentiment analysis for under-resourced languages [PDF]
Sentiment Analysis is a task that aims to calculate the polarity of text automatically. While some languages, such as English, have a vast array of resources to enable sentiment analysis, most under-resourced languages lack them. Cross-lingual Sentiment Analysis (CLSA) attempts to make use of resource-rich languages in order to create or improve ...
Barnes, Jeremy
core +3 more sources
Evaluating Language Tools for Fifteen EU-official Under-resourced Languages [PDF]
This article presents the results of the evaluation campaign of language tools available for fifteen EU-official under-resourced languages. The evaluation was conducted within the MSC ITN CLEOPATRA action that aims at building the cross-lingual event-centric knowledge processing on top of the application of linguistic processing chains (LPCs) for at ...
Valio Antunes Alves, Diego Fernando +2 more
openaire +4 more sources
Bilingual lexicon induction across orthographically-distinct under-resourced Dravidian languages [PDF]
Bilingual lexicons are a vital tool for under-resourced languages and recent state-of-the-art approaches to this leverage pretrained monolingual word embeddings using supervised or semi-supervised approaches.
Rajasekaran, Navaneethan +12 more
core +1 more source
RBMT as an alternative to SMT for under-resourced languages [PDF]
Although SMT (Statistical Machine Translation) has recently revolutionised MT for major language pairs, when addressing under-resourced and, to some extent, mildly-resourced languages, it still faces some difficulties, such as the need for large quantities of parallel texts, the limited guarantee of quality, etc. We thus speculate that RBMT (Rule Based
Guillaume de Malézieux +2 more
openaire +1 more source

