Semi-Supervised Learning for Neural Machine Translation
While end-to-end neural machine translation (NMT) has made remarkable progress recently, NMT systems only rely on parallel corpora for parameter estimation. Since parallel corpora are usually limited in quantity, quality, and coverage, especially for low-
Cheng, Yong +6 more
core +1 more source
Exploiting parallel treebanks to improve phrase-based statistical machine translation [PDF]
We use existing tools to automatically build two parallel treebanks from existing parallel corpora. We then show that combining the data extracted from both the treebanks and the corpora into a single translation model can improve the translation quality
Hearne, Mary, Tinsley, John, Way, Andy
core +5 more sources
This chapter gives an overview of parallel corpora, i.e. corpora containing source texts in a given language, aligned with their translations in another language. More specifically, it focuses on directional corpora, i.e. parallel corpora where the source and target languages are clearly identified. These types of corpora are widely used in contrastive
openaire +2 more sources
Aligning sentences in parallel corpora [PDF]
In this paper we describe a statistical technique for aligning sentences with their translations in two parallel corpora. In addition to certain anchor points that are available in our data, the only information about the sentences that we use for calculating alignments is the number of tokens that they contain.
Peter F. Brown +2 more
openaire +1 more source
About Certain Semantic Annotation in Parallel Corpora
The semantic notation analyzed in this work is contained in the second stream of semantic theories presented here, in the direct approach semantics.
Violetta Koseska-Toszewa
doaj +1 more source
Automatic Identification of AltLexes using Monolingual Parallel Corpora
The automatic identification of discourse relations is still a challenging task in natural language processing. Discourse connectives, such as "since" or "but", are the most informative cues to identify explicit relations; however discourse parsers ...
Davoodi, Elnaz, Kosseim, Leila
core +1 more source
MultiMWE: building a multi-lingual multi-word expression (MWE) parallel corpora [PDF]
Multi-word expressions (MWEs) are a hot topic in research in natural language processing (NLP), including topics such as MWE detection, MWE decomposition, and research investigating the exploitation of MWEs in other NLP fields such as Machine Translation.
Han, Lifeng +2 more
core +1 more source
This research has been co-financed by the European Union (European Social Fund – ESF) and Greek national funds through the Operational Program "Education and Lifelong Learning" of the National Strategic Reference Framework (NSRF) - Research Funding ...
Eleni Tziafa
doaj +1 more source
On New Manually Aligned and Tagged Bilingual Parallel Corpora and Their Applications
This article is devoted to the manually aligned and tagged bilingual parallel CLARIN-PL-BIZ corpora of the Baltic and Slavic languages which are currently being ...
Roman Roszko
doaj +1 more source
Model Transfer for Tagging Low-resource Languages using a Bilingual Dictionary
Cross-lingual model transfer is a compelling and popular method for predicting annotations in a low-resource language, whereby parallel corpora provide a bridge to a high-resource language and its associated annotated corpora.
Cohn, Trevor, Fang, Meng
core +1 more source

