International Conference on Learning Representations
Pre-trained large language models (LLMs) have become a cornerstone of modern natural language processing, with their capabilities extending across a wide range of applications and languages.
Wanru Zhao +6 more
semanticscholar +1 more source
Enhancing Code Generation for Low-Resource Languages: No Silver Bullet
IEEE International Conference on Program Comprehension
The advent of Large Language Models (LLMs) has significantly advanced the field of automated code generation. LLMs rely on large and diverse datasets to learn syntax, semantics, and usage patterns of programming languages. For low-resource languages (i.e. ...
Alessandro Giagnorio +2 more
semanticscholar +1 more source
Language Ranker: A Metric for Quantifying LLM Performance Across High and Low-Resource Languages
AAAI Conference on Artificial Intelligence
The development of Large Language Models (LLMs) relies on extensive text corpora, which are often unevenly distributed across languages. This imbalance results in LLMs performing significantly better on high-resource languages like English, German, and ...
Zihao Li +6 more
semanticscholar +1 more source
arXiv.org
Low-resource languages serve as invaluable repositories of human history, embodying cultural evolution and intellectual diversity. Despite their significance, these languages face critical challenges, including data scarcity and technological limitations, ...
Tianyang Zhong +11 more
semanticscholar +1 more source
Annual Meeting of the Association for Computational Linguistics
The evolution of speech technology has been spurred by the rapid increase in dataset sizes. Traditional speech models generally depend on a large amount of labeled training data, which is scarce for low-resource languages.
Yifan Yang +15 more
semanticscholar +1 more source
Zero-shot Sentiment Analysis in Low-Resource Languages Using a Multilingual Sentiment Lexicon
Conference of the European Chapter of the Association for Computational Linguistics
Improving multilingual language models' capabilities in low-resource languages is generally difficult due to the scarcity of large-scale data in those languages. In this paper, we relax the reliance on texts in low-resource languages by using multilingual ...
Fajri Koto +4 more
semanticscholar +1 more source
When Is Multilinguality a Curse? Language Modeling for 250 High- and Low-Resource Languages
Conference on Empirical Methods in Natural Language Processing, 2023
Multilingual language models are widely used to extend NLP systems to low-resource languages. However, concrete evidence for the effects of multilinguality on language modeling performance in individual languages remains scarce.
Tyler A. Chang +3 more
semanticscholar +1 more source
Teaching Large Language Models to Translate on Low-resource Languages with Textbook Prompting
International Conference on Language Resources and Evaluation
Large Language Models (LLMs) have achieved impressive results in Machine Translation by simply following instructions, even without training on parallel data.
Ping Guo +6 more
semanticscholar +1 more source
Word Embeddings in Low Resource Gujarati Language
2019 International Conference on Document Analysis and Recognition Workshops (ICDARW), 2019
Word embeddings/vectors are becoming an extremely important component of natural language processing tasks. Word2vec and fastText are a few of the most common word embedding techniques. While a large amount of work has been done to obtain embeddings in resource-rich languages like English, work still remains to be done for low-resource languages. Our focus ...
Ishani Joshi +2 more
openaire +1 more source
COLING Workshops
Multilingual LLMs support a variety of languages; however, their performance is suboptimal for low-resource languages. In this work, we emphasize the importance of continued pre-training of multilingual LLMs and the use of translation-based synthetic pre-training ...
Raviraj Joshi +8 more
semanticscholar +1 more source

