Some of the following articles may not be open access.

Extended Parallel Corpus for Amharic-English Machine Translation

International Conference on Language Resources and Evaluation, 2021
This paper describes the acquisition, preprocessing, segmentation, and alignment of an Amharic-English parallel corpus. It will be helpful for machine translation of a low-resource language, Amharic.
A. Gezmu, A. Nürnberger, T. Bati

Findings of the WMT 2020 Shared Task on Parallel Corpus Filtering and Alignment

Conference on Machine Translation, 2020
Following the two preceding WMT Shared Tasks on Parallel Corpus Filtering (Koehn et al., 2018, 2019), we again posed the challenge of assigning sentence-level quality scores for very noisy corpora of sentence pairs crawled from the web, with the goal of sub ...
Philipp Koehn   +5 more

KazParC: Kazakh Parallel Corpus for Machine Translation

International Conference on Language Resources and Evaluation
We introduce KazParC, a parallel corpus designed for machine translation across Kazakh, English, Russian, and Turkish. The first and largest publicly available corpus of its kind, KazParC contains a collection of 371,902 parallel sentences covering ...
Rustem Yeshpanov   +2 more

A Multidialectal Parallel Corpus of Arabic

2014
The daily spoken variety of Arabic is often termed the colloquial or dialect form of Arabic. There are many Arabic dialects across the Arab World and within other Arabic-speaking communities. These dialects vary widely from region to region and, to a lesser extent, from city to city in each region.
Habash, Nizar   +2 more

Filtering Noisy Parallel Corpus using Transformers with Proxy Task Learning

Conference on Machine Translation, 2020
This paper illustrates Huawei’s submission to the WMT20 low-resource parallel corpus filtering shared task. Our approach focuses on developing a proxy task learner on top of a transformer-based multilingual pre-trained language model to boost the ...
Haluk Açarçiçek   +4 more

Building an Italian-Chinese Parallel Corpus for Machine Translation from the Web

International Conference on Smart Objects and Technologies for Social Good, 2020
In an increasingly globalized world, being able to understand texts in different languages (even more so in different alphabets and charsets) has become a necessity.
Rita Tse   +4 more

Findings of the WMT 2019 Shared Task on Parallel Corpus Filtering for Low-Resource Conditions

Conference on Machine Translation, 2019
Following the WMT 2018 Shared Task on Parallel Corpus Filtering, we posed the challenge of assigning sentence-level quality scores for very noisy corpora of sentence pairs crawled from the web, with the goal of sub-selecting 2% and 10% of the highest ...
Philipp Koehn   +3 more

The IIT Bombay English-Hindi Parallel Corpus

International Conference on Language Resources and Evaluation, 2017
We present the IIT Bombay English-Hindi Parallel Corpus. The corpus is a compilation of parallel corpora previously available in the public domain as well as new parallel corpora we collected.
Anoop Kunchukuttan   +2 more

Annotating the Dutch Parallel Corpus

open access: yes, 2010
Proceedings of the Workshop on Annotation and Exploitation of Parallel Corpora AEPC 2010. Editors: Lars Ahrenberg, Jörg Tiedemann and Martin Volk. NEALT Proceedings Series, Vol. 10 (2010), 63-72. © 2010 The editors and contributors. Published by Northern European Association for Language Technology (NEALT) http://omilia.uio.no/nealt .
Tiedemann, Jörg (editor)   +4 more

Parallel Translation Corpus

2018
In this chapter, we have addressed some of the theoretical and practical issues relating to the generation, processing and management of a parallel translation corpus (PTC) with reference to some Indian languages. A PTC developed in a consortium-mode project under the aegis of DeitY, Govt. of India is discussed.
Niladri Sekhar Dash, S. Arulmozi
