Dutch Parallel Corpus: A Balanced Copyright-Cleared Parallel Corpus [PDF]
This paper presents the Dutch Parallel Corpus, a high-quality parallel corpus for Dutch, French and English consisting of more than ten million words. The corpus contains five different text types and is balanced with respect to text type and translation direction. All texts included in the corpus have been cleared from copyright.
Macken, Lieve +2 more
openaire +1 more source
A Parallel Corpus of Translationese [PDF]
We describe a set of bilingual English-French and English-German parallel corpora in which the direction of translation is accurately and reliably annotated. The corpora are diverse, consisting of parliamentary proceedings, literary works, transcriptions of TED talks and political commentary.
Ella Rabinovich +2 more
openaire +2 more sources
This paper describes the system submitted by the IITP-MT team to the Computational Approaches to Linguistic Code-Switching (CALCS 2021) shared task on MT for English→Hinglish.
Ramakrishna Appicharla +3 more
semanticscholar +1 more source
Unsupervised Multilingual Sentence Embeddings for Parallel Corpus Mining [PDF]
Existing models of multilingual sentence embeddings require large parallel data resources which are not available for low-resource languages. We propose a novel unsupervised method to derive multilingual sentence embeddings relying only on monolingual ...
Ivana Kvapilíková +4 more
semanticscholar +1 more source
A Data Augmentation Method for English-Vietnamese Neural Machine Translation
The translation quality of machine translation systems depends on the parallel corpus used for training, particularly on the quantity and quality of the corpus.
Nghia Luan Pham +2 more
doaj +1 more source
Metadiscourse features refer to those elements by which interaction between writer-reader and/or speaker-audience is constructed. Taking this into account, the objective of this contrastive parallel corpus-based study was to explore the way metadiscourse
Mehrdad Vasheghani Farahani, R. Kazemian
semanticscholar +1 more source
Consumer Eroski parallel corpus
This paper introduces the Consumer Eroski Parallel Corpus, a collection of articles originally written in Spanish and later translated to three languages also spoken in Spain: Basque, Catalan and Galician. The articles have been correlated in the four
Asier Alcázar
doaj +1 more source
Parallel Corpus Filtering via Pre-trained Language Models [PDF]
Web-crawled data provides a good source of parallel corpora for training machine translation models. It is automatically obtained, but extremely noisy, and recent work shows that neural machine translation systems are more sensitive to noise than ...
Boliang Zhang, Ajay Nagesh, Kevin Knight
semanticscholar +1 more source
The Bulgarian-Polish-Russian parallel corpus
The Semantics Laboratory Team of the Institute of Slavic Studies of the Polish Academy of Sciences is planning to begin work on the creation of a Bulgarian-Polish-Russian parallel corpus.
Maksim Duškin +1 more
doaj +1 more source
Introduction – Languages in contrast 20 years on
This special issue of the Nordic Journal of English Studies comprises papers from the symposium Languages in Contrast held in Lund 5 December 2014 in celebration of the 20th anniversary of the Nordic Parallel Corpus Project which led to the English ...
Lene Nordrum +2 more
doaj +1 more source

