Bidirectional Representations for Low-Resource Spoken Language Understanding
Speech representation models lack the ability to efficiently store semantic information and require fine-tuning to deliver decent performance. In this research, we introduce a transformer encoder–decoder framework with a multiobjective training strategy,
Quentin Meeus +2 more
doaj +3 more sources
An automated approach to identify sarcasm in low-resource language.
Sarcasm detection has emerged due to its applicability in natural language processing (NLP) but lacks substantial exploration in low-resource languages like Urdu, Arabic, Pashto, and Roman-Urdu. While fewer studies identifying sarcasm have focused on low-
Shumaila Khan +6 more
doaj +2 more sources
Enhancing African low-resource languages: Swahili data for language modelling
Language modelling using neural networks requires adequate data to guarantee quality word representation, which is important for natural language processing (NLP) tasks.
Casper S. Shikali, Refuoe Mokhosi
doaj +3 more sources
Empowering Low-Resource Languages: Javanese Machine Translation
This study addresses the critical need to preserve and revitalize the Javanese language, which, despite its widespread popularity, faces challenges as a low-resource language in Indonesia.
Danang Arbian Sulistyo +4 more
doaj +2 more sources
Offensive language detection in low resource languages: A use case of Persian language.
THIS ARTICLE USES WORDS OR LANGUAGE THAT IS CONSIDERED PROFANE, VULGAR, OR OFFENSIVE BY SOME READERS. Different types of abusive content such as offensive language, hate speech, aggression, etc. have become prevalent in social media and many efforts have
Marzieh Mozafari +3 more
doaj +3 more sources
Multilingual Offensive Language Identification for Low-resource Languages
Offensive content is pervasive in social media and a reason for concern to companies and government organizations. Several studies have been recently published investigating methods to detect the various forms of such content (e.g., hate speech, cyberbullying, and cyberaggression). The clear majority of these studies deal with English partially because
Ranasinghe, Tharindu, Zampieri, Marcos
openaire +3 more sources
GlotLID: Language Identification for Low-Resource Languages
Several recent papers have published good solutions for language identification (LID) for about 300 high-resource and medium-resource languages. However, there is no LID available that (i) covers a wide range of low-resource languages, (ii) is rigorously evaluated and reliable, and (iii) is efficient and easy to use.
Kargaran, Amir +3 more
openaire +3 more sources
Neural Network-Based Bilingual Lexicon Induction for Indonesian Ethnic Languages
Indonesia has a variety of ethnic languages, most of which belong to the same language family: the Austronesian languages. Due to the shared language family, words in Indonesian ethnic languages are very similar.
Kartika Resiandi +2 more
doaj +1 more source
LiDA: Language-Independent Data Augmentation for Text Classification
Developing a high-performance text classification model in a low-resource language is challenging due to the lack of labeled data. Meanwhile, collecting large amounts of labeled data is cost-inefficient.
Yudianto Sujana, Hung-Yu Kao
doaj +1 more source
Neural network language models for low resource languages
For resource-rich languages, recent works have shown Neural Network based Language Models (NNLMs) to be an effective modeling technique for Automatic Speech Recognition, outperforming standard n-gram language models (LMs). For low-resource languages, however, the performance of NNLMs has not been well explored.
Gandhe, Ankur, Metze, Florian, Lane, Ian
openaire +2 more sources

