Parallel corpus - Open Access .click


XDailyDialog: A Multilingual Parallel Dialogue Corpus

Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023
High-quality datasets are significant to the development of dialogue models. However, most existing datasets for open-domain dialogue modeling are limited to a single language. The absence of multilingual open-domain dialog datasets not only limits the research on multilingual or cross-lingual transfer learning, but also hinders the development of ...
Zeming Liu +7 more

Building a Parallel Corpus for English Translation Teaching Based on Computer-Aided Translation Software

Computer-Aided Design and Applications, 2020
This paper studies in depth the construction of a parallel corpus for English translation teaching using computer-aided translation software. The study adopts a combination of corpus statistics and analysis to portray and study the ...
Xiaolin Wang

Effective Parallel Corpus Mining using Bilingual Sentence Embeddings

Conference on Machine Translation, 2018
This paper presents an effective approach for parallel corpus mining using bilingual sentence embeddings. Our embedding models are trained to produce similar representations exclusively for bilingual sentence pairs that are translations of each other ...
Mandy Guo +10 more

The Italian-Russian Parallel Corpus of the Nacional’nyj Korpus Russkogo Jazyka (NKRJa). Evolution and Applications in Italian Slavistics Research

Studi Slavistici
This article presents a comprehensive overview of the studies conducted in Italy using the Italian-Russian parallel corpus of the Nacional’nyj Korpus russkogo jazyka (NKRJa), which was implemented in 2013 and has been expanded since 2015.
Francesca Biagini, Tatsiana Maiko, Valentina Noseda +2 more

Margin-based Parallel Corpus Mining with Multilingual Sentence Embeddings

Annual Meeting of the Association for Computational Linguistics, 2018
Machine translation is highly sensitive to the size and quality of the training data, which has led to an increasing interest in collecting and filtering large parallel corpora.
Mikel Artetxe, Holger Schwenk

Translation of Modal Verbs in Media Texts: Corpus-Based Approach

Научный диалог, 2023
This study examines the main modal verbs of English (can, could, may, must, should, need, will, would) in media texts, focusing on how they are translated into Russian.
Ya. A. Volkova, A. S. Korzin, A. D. Uryupina +2 more

Kilka uwag o ekwiwalencji międzyjęzykowej: na przykładzie konfrontacji polskiej i ukraińskiej frazeologii

Adeptus, 2021
Some Remarks on Interlingual Equivalence: Based on the Contrastive Analysis of Polish and Ukrainian Phraseology. The article focuses on the problem of interlingual equivalence in the context of Polish and Ukrainian contemporary phraseology.
Roman Tymoshuk

Representação gay em corpus literário paralelo / Gay representation in parallel literary corpus

Revista Brasileira de Linguística Aplicada, 2010
This article presents part of the results of my doctoral research, focusing on how gay characters and their world realities are represented through transitivity (Halliday and Matthiessen, 2004).
Adail Sebastião Rodrigues Júnior

Building an Owl-Ontology for Representing, Linking and Querying SemAF Discourse Annotations

Rasprave Instituta za Hrvatski Jezik i Jezikoslovlje, 2023
Linguistic Linked Open Data (LLOD) are technologies that provide a powerful instrument for representing and interpreting language phenomena on a web-scale.
Christian Chiarcos +8 more

The “New Normal” Terminology: A Corpus-Based Study Into Term Variation in COVID-19-Related EU Legislative Texts

Rasprave Instituta za Hrvatski Jezik i Jezikoslovlje, 2023
This paper provides a contrastive analysis of COVID-19-related EU legislative texts with emphasis on term variants. The analysis carried out on a parallel corpus consisting of English and Croatian texts has identified multiple examples of variation of ...
Katja Dobrić Basaneže, Martina Bajčić +1 more
