A large English–Thai parallel corpus from the web and machine-generated text [PDF]

open access: yes, Language Resources and Evaluation, 2020
The primary objective of our work is to build a large-scale English–Thai dataset for training neural machine translation models. We construct scb-mt-en-th-2020, an English–Thai machine translation dataset with over 1 million segment pairs, curated from ...
Lalita Lowphansirikul   +3 more
semanticscholar   +1 more source

MulTed: a multilingual aligned and tagged parallel corpus [PDF]

open access: yes, Applied Computing and Informatics, 2022
Recently, more data-driven approaches are demanding multilingual parallel resources, primarily in cross-language studies. To meet these demands, building multilingual parallel corpora is becoming the focus of many Natural Language Processing (NLP ...
Imad Zeroual, Abdelhak Lakhouaja
doaj   +1 more source

PARALLEL CORPUS OF TEXTS: THEORETICAL, METHODOLOGICAL AND LEXICOGRAPHICAL ANALYSIS OF PRINCIPLES

open access: yes, Записки з українського мовознавства, 2017
The article deals with theoretical and methodological aspects of processing two or more texts from a translation corpus and points to the lexicographical benefits of parallel corpus translation (enriching texts with stable phrases, idioms, specification ...
Ю. І. Дем’янчук
doaj   +1 more source

Improving the Accuracy of English-Indonesian Machine Translation by Maximizing the Quality and Quantity of the Parallel Corpus

open access: yes, Jurnal Teknologi Informasi dan Ilmu Komputer, 2020
Parallel corpora play a very important role in statistical machine translation (SMT). Parallel corpora obtained from various sources are usually of poor quality, while the quantity of parallel corpora is a key requirement for the results ...
Herry Sujaini
doaj   +1 more source

PyThaiNLP/Thai-Lao-Parallel-Corpus: Thai Lao Parallel corpus v0.5

open access: yes, 2020
Thai Lao Parallel corpus, version: 0.5. File vientiane-thaiembassy.csv: data from the Royal Thai Embassy Vientiane, Lao PDR (http://vientiane.thaiembassy.org).
Wannaphong Phatthiyaphaibun
core   +1 more source

Corpus-based studies in conference interpreting

open access: yes, Слово.ру: балтийский акцент, 2019
Corpus-based interpreting studies (CIS) are a relatively recent “[…] Off-shoot of Corpus-based Translation Studies” to quote the seminal paper (1998) by the late Miriam Shlesinger, a constant source of inspiration for the T&I community.
Russo M.
doaj   +1 more source

Comparing k-means clusters on parallel Persian-English corpus [PDF]

open access: yes, Journal of Artificial Intelligence and Data Mining, 2015
This paper compares clusters of aligned Persian and English texts obtained with the k-means method. Text clustering has many applications in various fields of natural language processing.
A. Khazaei, M. Ghasemzadeh
doaj   +1 more source

An Enhanced Method for Neural Machine Translation via Data Augmentation Based on the Self-Constructed English-Chinese Corpus, WCC-EC

open access: yes, IEEE Access, 2023
In an era of increasing globalization, the imperative for understanding multilingual texts elevated the role of translation to an everyday necessity.
Jinyi Zhang   +4 more
doaj   +1 more source

On the Benefits of Foreign Language Learning Based on Parallel Language Corpus

open access: yes, Cognitive Studies | Études cognitives, 2015
A recently observed strong interest in language corpora, which can be defined as collections of texts in an electronic format, as well as my work within the European ...
Joanna Satoła-Staśkowiak
doaj   +1 more source

A Bilingual Parallel Corpus with Discourse Annotations

open access: yes, CoRR, 2022
Machine translation (MT) has almost achieved human parity at sentence-level translation. In response, the MT community has, in part, shifted its focus to document-level translation. However, the development of document-level MT systems is hampered by the lack of parallel document corpora.
Yuchen Eleanor Jiang   +5 more
openaire   +2 more sources
