Results 261 to 270 of about 972,067
Some of the following articles may not be open access.

Breaking Physical and Linguistic Borders: Multilingual Federated Prompt Tuning for Low-Resource Languages

International Conference on Learning Representations
Pre-trained large language models (LLMs) have become a cornerstone of modern natural language processing, with their capabilities extending across a wide range of applications and languages.
Wanru Zhao   +6 more
semanticscholar   +1 more source

Enhancing Code Generation for Low-Resource Languages: No Silver Bullet

IEEE International Conference on Program Comprehension
The advent of Large Language Models (LLMs) has significantly advanced the field of automated code generation. LLMs rely on large and diverse datasets to learn syntax, semantics, and usage patterns of programming languages. For low-resource languages (i.e.
Alessandro Giagnorio   +2 more
semanticscholar   +1 more source

Language Ranker: A Metric for Quantifying LLM Performance Across High and Low-Resource Languages

AAAI Conference on Artificial Intelligence
The development of Large Language Models (LLMs) relies on extensive text corpora, which are often unevenly distributed across languages. This imbalance results in LLMs performing significantly better on high-resource languages like English, German, and ...
Zihao Li   +6 more
semanticscholar   +1 more source

Opportunities and Challenges of Large Language Models for Low-Resource Languages in Humanities Research

arXiv.org
Low-resource languages serve as invaluable repositories of human history, embodying cultural evolution and intellectual diversity. Despite their significance, these languages face critical challenges, including data scarcity and technological limitations,
Tianyang Zhong   +11 more
semanticscholar   +1 more source

GigaSpeech 2: An Evolving, Large-Scale and Multi-domain ASR Corpus for Low-Resource Languages with Automated Crawling, Transcription and Refinement

Annual Meeting of the Association for Computational Linguistics
The evolution of speech technology has been spurred by the rapid increase in dataset sizes. Traditional speech models generally depend on a large amount of labeled training data, which is scarce for low-resource languages.
Yifan Yang   +15 more
semanticscholar   +1 more source

Zero-shot Sentiment Analysis in Low-Resource Languages Using a Multilingual Sentiment Lexicon

Conference of the European Chapter of the Association for Computational Linguistics
Improving multilingual language models' capabilities in low-resource languages is generally difficult due to the scarcity of large-scale data in those languages. In this paper, we relax the reliance on texts in low-resource languages by using multilingual
Fajri Koto   +4 more
semanticscholar   +1 more source

When Is Multilinguality a Curse? Language Modeling for 250 High- and Low-Resource Languages

Conference on Empirical Methods in Natural Language Processing, 2023
Multilingual language models are widely used to extend NLP systems to low-resource languages. However, concrete evidence for the effects of multilinguality on language modeling performance in individual languages remains scarce.
Tyler A. Chang   +3 more
semanticscholar   +1 more source

Teaching Large Language Models to Translate on Low-resource Languages with Textbook Prompting

International Conference on Language Resources and Evaluation
Large Language Models (LLMs) have achieved impressive results in Machine Translation by simply following instructions, even without training on parallel data.
Ping Guo   +6 more
semanticscholar   +1 more source

Word Embeddings in Low Resource Gujarati Language

2019 International Conference on Document Analysis and Recognition Workshops (ICDARW), 2019
Word embeddings/vectors are becoming an extremely important component of natural language processing tasks. Word2vec and fastText are among the most common word embedding techniques. While a large amount of work has been done to obtain embeddings in resource-rich languages like English, work still remains to be done for low-resource languages. Our focus
Ishani Joshi   +2 more
openaire   +1 more source

Adapting Multilingual LLMs to Low-Resource Languages using Continued Pre-training and Synthetic Corpus

COLING Workshops
Multilingual LLMs support a variety of languages; however, their performance is suboptimal for low-resource languages. In this work, we emphasize the importance of continued pre-training of multilingual LLMs and the use of translation-based synthetic pre-
Raviraj Joshi   +8 more
semanticscholar   +1 more source
