Parallel and comparable corpora - Open Access .click


Some of the following articles may not be open access.

Parallel and Comparable Bilingual Corpora in Language Teaching and Learning

2000
An abstract is not available.
Peters C, Picchi E, Biagini L
openaire +5 more sources

Parallel and Comparable Corpora for Terminology Analysis in the Domain of Migration

Language for International Communication: Linking Interdisciplinary Perspectives: Language for Specific Purposes in the Era of Multilingualism and Technologies. Volume 4, 2023
The aim of the paper is to present the bilingual (English – Lithuanian) corpora compiled for research on specialised language in the domain of migration. Migration has recently become one of the most significant topics of discussion.
Olga Ušinskienė, Sigita Rackevičienė
openaire +1 more source

Parallel Texts Extraction from Multimodal Comparable Corpora

2012
Statistical machine translation (SMT) systems depend on the availability of domain-specific bilingual parallel text. However, parallel corpora are a limited resource and are often unavailable for some domains or language pairs. We analyze the feasibility of extracting parallel sentences from multimodal comparable corpora.
Haithem Afli, Loïc Barrault, Holger Schwenk +2 more
openaire +1 more source

Mining Parallel Resources for Machine Translation from Comparable Corpora

2015
Good performance of Statistical Machine Translation (SMT) is usually achieved with huge parallel bilingual training corpora, because the translations of words or phrases are computed based on bilingual data. However, in the case of low-resource language pairs such as English-Bengali, the performance is affected by an insufficient amount of bilingual training
Santanu Pal +3 more
openaire +1 more source

Matching Graph, a Method for Extracting Parallel Information from Comparable Corpora

ACM Transactions on Asian and Low-Resource Language Information Processing, 2019
Comparable corpora are valuable alternatives to expensive parallel corpora. They contain informative parallel fragments that are useful resources for different natural language processing tasks. In this work, a generative model is proposed for efficient extraction of parallel fragments from a pair of comparable documents. The core of the proposed
Somayeh Bakhshaei, Reza Safabakhsh, Shahram Khadivi +2 more
openaire +1 more source

Comparable Multilingual Patents as Large-Scale Parallel Corpora

2013
Parallel corpora are critical resources for building many NLP applications, ranging from machine translation (MT) to cross-lingual information retrieval. In this chapter, we explore a new but important area involving patents by investigating the potential of cultivating large-scale parallel corpora from comparable multilingual patents. Two major issues
Bin Lu, Ka Po Chow, Benjamin K. Tsou
openaire +1 more source

From parallel to comparable text corpora

1996
We present a bilingual corpus management system under development in Pisa. The first component of this system was a set of procedures to create and query parallel text archives; we are now studying the implementation of a second set of procedures to interrogate comparable archives.
Peters C, Picchi E
openaire +2 more sources

Comparable and Parallel Corpora for Machine Translation

2023
Serge Sharoff, Reinhard Rapp, Pierre Zweigenbaum +2 more
openaire +1 more source

Unsupervised Construction of Quasi-comparable Corpora and Probing for Parallel Textual Data

2016
The multilingual nature of the world makes translation a crucial requirement today. Parallel dictionaries constructed by humans are a widely available resource, but they are limited and do not provide enough coverage for good-quality translation, due to out-of-vocabulary words and neologisms.
Krzysztof Wolk, Krzysztof Marasek
openaire +1 more source

The Conditional Perfect, A Quantitative Analysis in English-French Comparable-Parallel Corpora

2020
The frequency of the conditional perfect in English and French was observed in a corpus of nearly 12 million words comprising four comparable and parallel subcorpora of 2.9 million words each, tagged by part of speech and by lemma, and analysed using regular expressions (regex).
openaire +2 more sources

parallel corpora
comparable corpora
natural language processing

corpus linguistics
computational linguistics
corpus

[shs] humanities and social sciences
[info] computer science [cs]
corpus-based translation studies