Parallel corpora - Open Access .click

Bootstrapping parallel corpora

Proceedings of the HLT-NAACL 2003 Workshop on Building and Using Parallel Texts: Data Driven Machine Translation and Beyond, 2003
We present two methods for the automatic creation of parallel corpora. Whereas previous work into the automatic construction of parallel corpora has focused on harvesting them from the web, we examine the use of existing parallel corpora to bootstrap data for new language pairs.
Chris Callison-Burch, Miles Osborne

Building English – Punjabi Aligned Parallel Corpora of Nouns from Comparable Corpora

Applied Computer Systems, 2023
Comparable corpora are the right resources for extracting parallel data due to their abundant availability. It is of great importance where parallel data are scarce.
Kaur Dilshad, Singh Satwinder

Automatic Generation of Exercises for Second Language Learning from Parallel Corpus Data

International Journal of TESOL Studies, 2021
Creating language learning exercises is a time-consuming task and made-up sample sentences frequently lack authenticity. Authentic samples can be obtained from corpora, but it is necessary to identify material that is suitable for language learners ...
Arianna Zanetti, Elena Volodina, Johannes Graën +2 more

Automatic alignment in parallel corpora

Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, 1994
This paper addresses the alignment issue in the framework of exploitation of large bi-/multilingual corpora for translation purposes. A generic alignment scheme is proposed that can meet varying requirements of different applications. Depending on the level at which alignment is sought, appropriate surface linguistic information is invoked coupled with ...
Harris Papageorgiou, Lambros Cranias, Stelios Piperidis +2 more

MulTed: a multilingual aligned and tagged parallel corpus

Applied Computing and Informatics, 2022
Recently, more data-driven approaches are demanding multilingual parallel resources, primarily in cross-language studies. To meet these demands, building multilingual parallel corpora is becoming the focus of many Natural Language Processing (NLP ...
Imad Zeroual, Abdelhak Lakhouaja

Sense discrimination with parallel corpora

Proceedings of the ACL-02 Workshop on Word Sense Disambiguation: Recent Successes and Future Directions, 2002
This paper describes an experiment that uses translation equivalents derived from parallel corpora to determine sense distinctions that can be used for automatic sense-tagging and other disambiguation tasks. Our results show that sense distinctions derived from cross-lingual information are at least as reliable as those made by human annotators ...
Nancy Ide, Tomaz Erjavec, Dan Tufis

Parallel Corpus Research and Target Language Representativeness: The Contrastive, Typological, and Translation Mining Traditions

Languages, 2022
This paper surveys the strategies that the Contrastive, Typological, and Translation Mining parallel corpus traditions rely on to deal with the issue of target language representativeness of translations.
Bert Le Bruyn +6 more

Paraphrasing with bilingual parallel corpora

Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics - ACL '05, 2005
Previous work has used monolingual parallel corpora to extract and generate paraphrases. We show that this task can be done using bilingual parallel corpora, a much more commonly available resource. Using alignment techniques from phrase-based statistical machine translation, we show how paraphrases in one language can be identified using a phrase in ...
Colin Bannard, Chris Callison-Burch
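
As a rough illustration of the idea in the abstract above (finding paraphrases through a bilingual resource), the sketch below scores candidate phrases that share a foreign translation with the input phrase. The scoring rule (summing the product of the two translation probabilities over every shared translation), the function name pivot_paraphrases, and the toy probability tables are assumptions made for illustration; they are not taken from this listing, and a real system would estimate the tables from word-aligned parallel text.

```python
from collections import defaultdict


def pivot_paraphrases(phrase, e2f, f2e):
    """Rank paraphrase candidates for `phrase` via shared foreign translations.

    e2f[e][f] is an assumed estimate of p(f | e); f2e[f][e] of p(e | f).
    Both tables are hypothetical inputs, e.g. read from a phrase table.
    """
    scores = defaultdict(float)
    for foreign, p_f_given_e in e2f.get(phrase, {}).items():
        for candidate, p_e_given_f in f2e.get(foreign, {}).items():
            if candidate == phrase:
                continue  # a phrase is not its own paraphrase
            # Accumulate evidence over every shared foreign phrase.
            scores[candidate] += p_f_given_e * p_e_given_f
    # Highest-scoring candidates first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy tables (invented values, for illustration only).
E2F = {"under control": {"unter kontrolle": 0.75, "im griff": 0.25}}
F2E = {
    "unter kontrolle": {"under control": 0.5, "in check": 0.5},
    "im griff": {"under control": 0.5, "in hand": 0.5},
}

if __name__ == "__main__":
    for candidate, score in pivot_paraphrases("under control", E2F, F2E):
        print(candidate, score)
```

With these toy tables, "in check" ranks above "in hand" because it is reached through the more probable translation.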

Aligning sentences in parallel corpora

Proceedings of the 29th Annual Meeting of the Association for Computational Linguistics, 1991
In this paper we describe a statistical technique for aligning sentences with their translations in two parallel corpora. In addition to certain anchor points that are available in our data, the only information about the sentences that we use for calculating alignments is the number of tokens that they contain.
Peter F. Brown, Jennifer C. Lai, Robert L. Mercer +2 more
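
The abstract above describes aligning sentences using only their token counts. The sketch below shows one simplified way such an alignment could be computed with dynamic programming. It is not the authors' statistical model: the cost function (absolute difference in token counts plus a fixed penalty per alignment type), the penalty values, and the function name align_by_length are all assumptions chosen for illustration, and anchor points are ignored.

```python
def align_by_length(src_lens, tgt_lens, skip_penalty=5.0, merge_penalty=2.0):
    """Align two sentence sequences using only their token counts.

    Returns a list of (source_indices, target_indices) groups.
    Costs and penalties are illustrative assumptions, not a fitted model.
    """
    n, m = len(src_lens), len(tgt_lens)
    inf = float("inf")
    # Alignment types: (source sentences used, target sentences used, penalty).
    beads = [(1, 1, 0.0), (1, 0, skip_penalty), (0, 1, skip_penalty),
             (2, 1, merge_penalty), (1, 2, merge_penalty)]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == inf:
                continue
            for di, dj, penalty in beads:
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                # Length mismatch only applies when both sides are non-empty.
                mismatch = 0.0
                if di and dj:
                    mismatch = abs(sum(src_lens[i:ni]) - sum(tgt_lens[j:nj]))
                total = cost[i][j] + penalty + mismatch
                if total < cost[ni][nj]:
                    cost[ni][nj] = total
                    back[ni][nj] = (i, j)
    # Walk the back-pointers from the end to recover the alignment.
    groups = []
    i, j = n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        groups.append((list(range(pi, i)), list(range(pj, j))))
        i, j = pi, pj
    groups.reverse()
    return groups


if __name__ == "__main__":
    # Token counts per sentence (invented values, for illustration only).
    print(align_by_length([10, 4, 6, 12], [11, 10, 12]))
```

In the toy input, the second and third source sentences (4 and 6 tokens) are grouped against the single 10-token target sentence, since merging them gives the lowest total cost.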

Feasibility of using corpora as a tool in translation practice

Australian Journal of Applied Linguistics, 2023
Professional translators commonly employ various tools to streamline and ensure the accuracy and consistency of their work. One such tool is corpora, which becomes particularly crucial when dealing with authentic texts like those from the United Nations
Noureldin Abdelaal