Parallel Strands: A Preliminary Investigation into Mining the Web for Bilingual Text [PDF]
Parallel corpora are a valuable resource for machine translation, but at present their availability and utility are limited by genre- and domain-specificity, licensing restrictions, and the basic difficulty of locating parallel texts in all but the most ...
Resnik, Philip
core +4 more sources
A Teacher-Student Framework for Zero-Resource Neural Machine Translation [PDF]
While end-to-end neural machine translation (NMT) has made remarkable progress recently, it still suffers from the data scarcity problem for low-resource language pairs and domains.
Chen, Yun +3 more
core +2 more sources
Language Resources – a Part of World Cultural Heritage
This article briefly reviews multilingual language resources for Bulgarian developed within the framework of several international projects: the first-ever annotated Bulgarian MTE digital lexical resources, Bulgarian-Polish corpus, Bulgarian-Slovak parallel and ...
Ludmila Dimitrova
doaj +1 more source
The paper reports on our ongoing work on the creation of a corpus of Bulgarian and Ukrainian parallel texts. We discuss some differences in the approaches and the interpretation of some concepts, as well as various problems associated with the ...
Olena Siruk, Ivan Derzhanski
doaj +1 more source
Translation of Modal Verbs in Media Texts: Corpus-Based Approach
This study examines the main English modal verbs (can, could, may, must, should, need, will, would) in media texts, specifically the ways they are translated into Russian.
Ya. A. Volkova +2 more
doaj +1 more source
NICT’s Corpus Filtering Systems for the WMT18 Parallel Corpus Filtering Task [PDF]
This paper presents NICT's participation in the WMT18 shared parallel corpus filtering task. The organizers provided a 1-billion-word German-English corpus crawled from the web as part of the Paracrawl project. This corpus is too noisy to build an acceptable neural machine translation (NMT) system.
Wang, Rui +3 more
openaire +2 more sources
Some Remarks on Interlingual Equivalence: Based on the Contrastive Analysis of Polish and Ukrainian Phraseology
The article focuses on the problem of interlingual equivalence in the context of Polish and Ukrainian contemporary phraseology.
Roman Tymoshuk
doaj +1 more source
The aim of this article is to present a comprehensive overview of the studies conducted in Italy using the Italian-Russian parallel corpus of the Nacional’nyj Korpus russkogo jazyka (NKRJa), which was implemented in 2013 and has been expanded since 2015.
Francesca Biagini +2 more
doaj +1 more source
Building an Owl-Ontology for Representing, Linking and Querying SemAF Discourse Annotations
Linguistic Linked Open Data (LLOD) technologies provide a powerful instrument for representing and interpreting language phenomena at web scale.
Christian Chiarcos +8 more
doaj +1 more source
A Large Parallel Corpus of Full-Text Scientific Articles
The Scielo database is an important source of scientific information in Latin America, containing articles from several research domains. A striking characteristic of Scielo is that many of its full-text contents are presented in more than one language ...
Becker, Karin +2 more
core +2 more sources

