Urdu is still considered a low-resource language despite being ranked as the world’s 10th most spoken language, with nearly 230 million speakers.
Abdul Ghafoor +6 more
doaj +3 more sources
AgglutiFiT: Efficient Low-Resource Agglutinative Language Model Fine-Tuning
Text classification tends to be difficult when data are inadequate considering the amount of manually labeled text corpora. For low-resource agglutinative languages including Uyghur, Kazakh, and Kyrgyz (UKK languages), in which words are manufactured via
Zhe Li +3 more
doaj +3 more sources
Voice Activation for Low-Resource Languages [PDF]
Voice activation systems are used to find a pre-defined word or phrase in the audio stream. Industry solutions, such as “OK, Google” for Android devices, are trained with millions of samples. In this work, we propose and investigate several ways to train a voice activation system when the in-domain data set is small.
Kolesau, Aliaksei, Šešok, Dmitrij
openaire +2 more sources
Endangered Languages are not Low-Resourced [PDF]
The term low-resourced has been tossed around in the field of natural language processing to a degree that almost any language that is not English can be called "low-resourced"; sometimes even just for the sake of making a mundane or mediocre paper appear more interesting and insightful.
openaire +3 more sources
Low-Resource Machine Translation Training Curriculum Fit for Low-Resource Languages
We conduct an empirical study of neural machine translation (NMT) for truly low-resource languages, and propose a training curriculum fit for cases when both parallel training data and compute resources are lacking, reflecting the reality of most of the world's languages and the researchers working on these languages. Previously, unsupervised NMT, which
Kuwanto, Garry +5 more
openaire +2 more sources
Transformers for Low-Resource Languages: Is Féidir Linn! [PDF]
The Transformer model is the state-of-the-art in Machine Translation. However, in general, neural translation models often underperform on language pairs with insufficient training data. As a consequence, relatively few experiments have been carried out using this architecture on low-resource language pairs.
Lankford, Séamus +2 more
openaire +2 more sources
Corpulyzer: A Novel Framework for Building Low Resource Language Corpora
The rapid proliferation of artificial intelligence has led to the development of sophisticated cutting-edge systems in natural language processing and computational linguistics domains.
Bilal Tahir, Muhammad Amir Mehmood
doaj +1 more source
Pre-trained transformer-based language models for Sundanese
The Sundanese language has over 32 million speakers worldwide, but the language has reaped little to no benefits from the recent advances in natural language understanding.
Wilson Wongso +2 more
doaj +1 more source
Towards Language Service Creation and Customization for Low-Resource Languages
The most challenging issue with low-resource languages is the difficulty of obtaining enough language resources. In this paper, we propose a language service framework for low-resource languages that enables the automatic creation and customization of ...
Donghui Lin, Yohei Murakami, Toru Ishida
doaj +1 more source
Data Augmentation for Low-Resource Neural Machine Translation [PDF]
The quality of a Neural Machine Translation system depends substantially on the availability of sizable parallel corpora. For low-resource language pairs this is not the case, resulting in poor translation quality. Inspired by work in computer vision, we
Bisazza, Arianna +2 more
core +2 more sources

