A Teacher-Student Framework for Zero-Resource Neural Machine Translation [PDF]
While end-to-end neural machine translation (NMT) has made remarkable progress recently, it still suffers from the data scarcity problem for low-resource language pairs and domains.
Chen, Yun +3 more
core +2 more sources
The use of English, Czech and French punctuation marks in reference, parallel and comparable web corpora: a question of methodology [PDF]
This paper analyses the frequency of six punctuation marks (the comma, period, colon, semicolon, question mark and exclamation mark) in three languages (English, French and Czech) in three different types of corpora — comparable web corpora, large ...
Olga Nádvorníková
doaj
UPC: An Open Word-Sense Annotated Parallel Corpora for Machine Translation Study
Machine translation (MT) has recently attracted much research on various advanced techniques (i.e., statistical-based and deep learning-based) and achieved great results for popular languages.
Van-Hai Vu +3 more
doaj +1 more source
Semantics, contrastive linguistics and parallel corpora
In view of the ambiguity of the term “semantics”, the author shows the differences between the traditional lexical semantics and the contemporary semantics in the light of various semantic schools.
Violetta Koseska
doaj +1 more source
Semi-automatic ontological alignment of digitized books parallel corpora
In this paper, we present a method for general ontology management integration with an alignment of a digitized-books paraphrase corpus, which has been compiled from a bilingual parallel corpus.
Algirdas Laukaitis, Neda Laukaitytė
doaj +1 more source
Parallel Strands: A Preliminary Investigation into Mining the Web for Bilingual Text [PDF]
Parallel corpora are a valuable resource for machine translation, but at present their availability and utility is limited by genre- and domain-specificity, licensing restrictions, and the basic difficulty of locating parallel texts in all but the most ...
Resnik, Philip
core +4 more sources
Computer Corpora and the law: a new approach to the translation of legal terms [PDF]
The use of computer corpora for the analysis of legal language is not common practice; still less the use of parallel corpora for the comparison of legal terminology. The Bononia Legal Corpus project (BoLC) began two years ago, and now as the first stage
Philip, Gill
core +1 more source
Experimental Polish-Lithuanian Corpus with the Semantic Annotation Elements
In the article the authors present the experimental Polish-Lithuanian corpus (ECorpPL-LT) formed for the idea of Polish-Lithuanian theoretical contrastive studies, a Polish ...
Danuta Roszko, Roman Roszko
doaj +1 more source
Multilingual corpora in contrastive research on the vocative in Russian, Polish and Lithuanian
The aim of this paper was to conduct a contrastive analysis of the vocative forms in Russian, Polish and Lithuanian, which was to be prefaced by a short introduction that would discuss the benefits of using non-commercial multilingual corpora in such ...
Maksim Duszkin +2 more
doaj
Multilingual digital resources with Bulgarian language
The paper presents in brief Bulgarian language resources as a part of multilingual digital resources developed in the frame of some international projects, among them parallel annotated and aligned
Ludmila Dimitrova
doaj +1 more source

