Abstract: Language technology is becoming increasingly important across a variety of application domains which have become commonplace in large, well-resourced languages. However, there is a danger that small, under-resourced languages are being increasingly pushed to the technological margins. Under-resourced languages face significant challenges in
Cunliffe, D +3 more
Multi-task learning in under-resourced Dravidian languages
Abstract: It is challenging to obtain extensive annotated data for under-resourced languages, so we investigate whether it is beneficial to train models using multi-task learning. Sentiment analysis and offensive language identification share similar discourse properties.
Adeep Hande +2 more
ASR for Under-Resourced Languages From Probabilistic Transcription
In many under-resourced languages it is possible to find text, and it is possible to find speech, but transcribed speech suitable for training automatic speech recognition (ASR) is unavailable. In the absence of native transcripts, this paper proposes the use of a probabilistic transcript: a probability mass function over possible phonetic transcripts of
Mark A. Hasegawa-Johnson +15 more
Offensive Language Detection in Under-Resourced Algerian Dialectal Arabic Language
This paper addresses the problem of detecting offensive and abusive content in Facebook comments, focusing on Algerian dialectal Arabic, which is one of the under-resourced languages. The latter has a variety of dialects mixed with different languages (i.e. Berber, French and English). In addition, we deal with texts written in both Arabic and
Boucherit, Oussama, Abainia, Kheireddine
Modeling under-resourced languages for speech recognition
One particular problem in large vocabulary continuous speech recognition for low-resourced languages is finding relevant training data for the statistical language models. A large amount of data is required, because the models should estimate the probability of all possible word sequences.
Enarvi, Seppo +6 more
Translation-Based Dictionary Alignment for Under-Resourced Bantu Languages
Despite a large number of active speakers, most Bantu languages can be considered under- or less-resourced languages. This applies especially to the current state of lexicographical data, which is highly unsatisfactory in terms of size, quality ...
Bosch, Sonja +4 more
This introduction presents a summary of the special issue that the Revista de Llengua i Dret, Journal of Language and Law devotes to legal translation and interpreting (TI) in the world of technologies. Although translation technologies
Christopher D. Mellinger +1 more
Comparison of Different Orthographies for Machine Translation of Under-Resourced Dravidian Languages
Under-resourced languages are a significant challenge for statistical approaches to machine translation, and recently it has been shown that using training data from closely related languages can improve the machine translation quality of these ...
Arcan, Mihael +2 more
Strategies for building wordnets for under-resourced languages: The case of African languages
The African Wordnet Project (AWN) aims at building wordnets for five African languages: Setswana, isiXhosa, isiZulu, Sesotho sa Leboa (also referred to as Sepedi or Northern Sotho) and Tshivenda.
Sonja E. Bosch, Marissa Griesel
Speech recognition for under-resourced languages: Data sharing in hidden Markov model systems
For purposes of automated speech recognition in under-resourced environments, techniques used to share acoustic data between closely related or similar languages become important.
Febe de Wet +3 more

