
Natural language processing for under-resourced languages: Developing a Welsh natural language toolkit

open access: yes, Computer Speech & Language, 2022
Abstract: Language technology is becoming increasingly important across a variety of application domains that have become commonplace in large, well-resourced languages. However, there is a danger that small, under-resourced languages are being increasingly pushed to the technological margins. Under-resourced languages face significant challenges in
Cunliffe, D   +3 more
openaire   +4 more sources

Multi-task learning in under-resourced Dravidian languages

open access: yes, Journal of Data, Information and Management, 2022
Abstract: It is challenging to obtain extensive annotated data for under-resourced languages, so we investigate whether it is beneficial to train models using multi-task learning. Sentiment analysis and offensive language identification share similar discourse properties.
Adeep Hande   +2 more
openaire   +3 more sources

ASR for Under-Resourced Languages From Probabilistic Transcription

open access: yes, IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2017
In many under-resourced languages it is possible to find text, and it is possible to find speech, but transcribed speech suitable for training automatic speech recognition (ASR) is unavailable. In the absence of native transcripts, this paper proposes the use of a probabilistic transcript: a probability mass function over possible phonetic transcripts of
Mark A. Hasegawa-Johnson   +15 more
openaire   +3 more sources

Offensive Language Detection in Under-Resourced Algerian Dialectal Arabic Language

open access: yes, 2023
This paper addresses the problem of detecting offensive and abusive content in Facebook comments, focusing on Algerian dialectal Arabic, which is one of the under-resourced languages. It has a variety of dialects mixed with different languages (i.e. Berber, French and English). In addition, we deal with texts written in both Arabic and
Boucherit, Oussama, Abainia, Kheireddine
openaire   +2 more sources

Modeling under-resourced languages for speech recognition [PDF]

open access: yes, Language Resources and Evaluation, 2016
One particular problem in large vocabulary continuous speech recognition for low-resourced languages is finding relevant training data for the statistical language models. A large amount of data is required, because the models should estimate the probability of all possible word sequences.
Enarvi, Seppo   +6 more
openaire   +4 more sources

Translation-Based Dictionary Alignment for Under-Resourced Bantu Languages [PDF]

open access: yes, 2019
Despite a large number of active speakers, most Bantu languages can be considered as under- or less-resourced languages. This includes especially the current situation of lexicographical data, which is highly unsatisfactory concerning the size, quality ...
Bosch, Sonja   +4 more
core   +1 more source

Use of technology in legal translation and interpreting: potential, availability and applications of resources

open access: yes, Revista de Llengua i Dret - Journal of Language and Law, 2022
This introduction presents a summary of the special issue that the Revista de Llengua i Dret, Journal of Language and Law devotes to legal translation and interpreting (TI) in the world of technology. Although translation technologies
Christopher D. Mellinger   +1 more
doaj   +1 more source

Comparison of Different Orthographies for Machine Translation of Under-Resourced Dravidian Languages [PDF]

open access: yes, 2019
Under-resourced languages are a significant challenge for statistical approaches to machine translation, and recently it has been shown that the usage of training data from closely-related languages can improve machine translation quality of these ...
Arcan, Mihael   +2 more
core   +1 more source

Strategies for building wordnets for under-resourced languages: The case of African languages

open access: yes, Literator, 2017
The African Wordnet Project (AWN) aims at building wordnets for five African languages: Setswana, isiXhosa, isiZulu, Sesotho sa Leboa (also referred to as Sepedi or Northern Sotho) and Tshivenda.
Sonja E. Bosch, Marissa Griesel
doaj   +1 more source

Speech recognition for under-resourced languages: Data sharing in hidden Markov model systems

open access: yes, South African Journal of Science, 2017
For purposes of automated speech recognition in under-resourced environments, techniques used to share acoustic data between closely related or similar languages become important.
Febe de Wet   +3 more
doaj   +1 more source
