Word-length algorithm for language identification of under-resourced languages
Language identification is widely used in machine learning, text mining, information retrieval, and speech processing. Available techniques for solving the problem of language identification require large amounts of training text that are not available
Ali Selamat, Nicholas Akosu
doaj +4 more sources
Multilingual Sentiment Analysis for Under-Resourced Languages: A Systematic Review of the Landscape
Sentiment analysis automatically evaluates people’s opinions of products or services. It is an emerging research area with promising advancements in high-resource languages such as Indo-European languages (e.g. English).
Koena Ronny Mabokela +2 more
doaj +3 more sources
Eigentrigraphemes for under-resourced languages [PDF]
Grapheme-based modeling has an advantage over phone-based modeling in automatic speech recognition for under-resourced languages when a good dictionary is not available. Recently we proposed a new method for parameter estimation of context-dependent hidden Markov model (HMM) called eigentriphone modeling. Eigentriphone modeling outperforms conventional
Ko, Tom Yu Ting, Mak, Brian Kan Wing
openaire +4 more sources
Creating language resources for under-resourced languages: methodologies, and experiments with Arabic [PDF]
Language resources are important for those working on computational methods to analyse and study languages. These resources are needed to help advance research in fields such as natural language processing, machine learning, information retrieval ...
A Roberts +25 more
core +5 more sources
Artificial intelligence translation in healthcare: an urgent call for evidence-informed policy frameworks [PDF]
The deployment of artificial intelligence (AI) translation tools in healthcare is accelerating rapidly, yet regulatory frameworks lag dangerously behind clinical practice.
Jonathan H Chen +8 more
doaj +2 more sources
Spoken word corpus and dictionary definition for an African language [PDF]
The preservation of languages is critical to maintaining and strengthening the cultures and identities of communities, and this is especially true for under-resourced languages with a predominantly oral culture.
Wanjiku Nganga, Ikechukwu Achebe
doaj +3 more sources
Evaluating Language Tools for Fifteen EU-official Under-resourced Languages [PDF]
This article presents the results of the evaluation campaign of language tools available for fifteen EU-official under-resourced languages. The evaluation was conducted within the MSC ITN CLEOPATRA action that aims at building the cross-lingual event-centric knowledge processing on top of the application of linguistic processing chains (LPCs) for at ...
Valio Antunes Alves, Diego Fernando +2 more
openaire +4 more sources
Cross-Lingual Link Discovery for Under-Resourced Languages [PDF]
This article is based upon work from COST Action NexusLinguarum – "European network for Web-centered linguistic data science" (CA18209), supported by COST (European Cooperation in Science and Technology) www.cost.eu. This work is also partially supported by the I+D+i project PID2020-113903RBI00, funded by MCIN/AEI/10.13039/501100011033, by DGA/FEDER ...
Rosner, Michael +12 more
openaire +4 more sources
NCHLT Auxiliary speech data for ASR technology development in South Africa
The aim of the National Centre for Human Language Technology (NCHLT) project was to create speech and text resources that would enable Human Language Technology (HLT) development for the 11 official languages of South Africa. The speech data described in
Jaco Badenhorst, Febe de Wet
doaj +1 more source
Improving the Performance of Low-resourced Speaker Identification with Data Preprocessing
Automatic speaker identification is done to tackle daily security problems. Speech data collection is an essential but very challenging task for under-resourced languages like Burmese.
Win Lai Lai Phyu +2 more
doaj +1 more source

