From the Editors: The Way Ahead in Languages for Specific Purposes
The processes of globalisation, the increasing dominance of English in academic and professional spheres, and the ongoing changes in higher education worldwide have destabilised the ...
Language Value
doaj +1 more source
SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing [PDF]
This paper describes SentencePiece, a language-independent subword tokenizer and detokenizer designed for Neural-based text processing, including Neural Machine Translation. It provides open-source C++ and Python implementations for subword units.
Taku Kudo, John Richardson
semanticscholar +1 more source
SciBERT: A Pretrained Language Model for Scientific Text [PDF]
Obtaining large-scale annotated data for NLP tasks in the scientific domain is challenging and expensive. We release SciBERT, a pretrained language model based on BERT (Devlin et.
Iz Beltagy, Kyle Lo, Arman Cohan
semanticscholar +1 more source
Language Models as Knowledge Bases? [PDF]
Recent progress in pretraining language models on large textual corpora led to a surge of improvements for downstream NLP tasks. Whilst learning linguistic knowledge, these models may also be storing relational knowledge present in the training data, and
F. Petroni +6 more
semanticscholar +1 more source
Defining a Matrix Language in Language Mixing [PDF]
Researchers of bilingual code-switching often assume that one of the participating languages serves as the ‘base’ or ‘matrix’ into which elements of the other language are embedded.
Sharath, Vivek
core +1 more source
Table of Contents. From the editors: Begoña Bellés-Fortuño. Articles: Multimodal literacy in academic environments: PowerPoint as a motivational genre.
Language Value
doaj +6 more sources
The Geometry of Multilingual Language Model Representations [PDF]
We assess how multilingual language models maintain a shared multilingual representation space while still encoding language-sensitive information in each language. Using XLM-R as a case study, we show that languages occupy similar linear subspaces after mean-centering, evaluated based on causal effects on language modeling performance and direct ...
arxiv
Language Preference in a Bi-language Digital Library [PDF]
This paper examines user choice of interface language in a bi-language digital library (English and Maori, the language of the indigenous people of New Zealand); the majority of collection documents are in Maori, and the interface is available in both ...
Cunningham, Sally Jo +1 more
core +2 more sources
Language Identification for Austronesian Languages [PDF]
This paper provides language identification models for low- and under-resourced languages in the Pacific region with a focus on previously unavailable Austronesian languages. Accurate language identification is an important part of developing language resources.
arxiv