The use of English, Czech and French punctuation marks in reference, parallel and comparable web corpora: a question of methodology [PDF]
This paper analyses the frequency of six punctuation marks (the comma, period, colon, semicolon, question mark and exclamation mark) in three languages (English, French and Czech) in three different types of corpora — comparable web corpora, large ...
Olga Nádvorníková
doaj
Experimental Polish-Lithuanian Corpus with the Semantic Annotation Elements
In the article the authors present the experimental Polish-Lithuanian corpus (ECorpPL-LT) formed for the idea of Polish-Lithuanian theoretical contrastive studies, a Polish ...
Danuta Roszko, Roman Roszko
doaj +1 more source
Parallel corpora are vital components in several applications of Natural Language Processing (NLP), particularly in machine translation. In this paper, we present a novel method for automatically creating parallel sentences from comparable corpora.
Maha Jarallah Althobaiti
doaj +1 more source
Exploiting comparable corpora with TER and TERp [PDF]
In this paper we present an extension of a successful, simple and effective method for extracting parallel sentences from comparable corpora, and we apply it to an Arabic/English NIST system. We experiment with a new TERp filter, along with WER and TER filters.
Sadaf Abdul-Rauf, Holger Schwenk
openaire +2 more sources
The paper investigates how reader engagement markers (Hyland 2005; Zou and Hyland 2020) are used in tourism promotion to establish interaction with potential customers on the web.
Dragana Vuković Vojnović
doaj +1 more source
Set-Theoretic Alignment for Comparable Corpora [PDF]
We describe and evaluate a simple method to extract parallel sentences from comparable corpora. The approach, termed STACC, is based on expanded lexical sets and the Jaccard similarity coefficient. We evaluate our system against state-of-the-art methods on a large range of datasets in different domains, for ten language pairs, showing that it either ...
Thierry Etchegoyhen, Andoni Azpeitia
openaire +1 more source
Bilingual Topic Models for Comparable Corpora
32 pages, 2 ...
Georgios Balikas +2 more
openaire +2 more sources
PyPlutchik: Visualising and comparing emotion-annotated corpora
The increasing availability of textual corpora and data fetched from social networks is fuelling a huge production of works based on the model proposed by psychologist Robert Plutchik, often referred to simply as the “Plutchik Wheel”. Related research ranges from annotation task descriptions to emotion detection tools.
Alfonso Semeraro +2 more
openaire +6 more sources
Online Parallel and Comparable Corpora for Legal Translations
The use of corpus linguistics for technical translations has largely been advocated by scholars over the years. This paper is aimed at providing instances of legal term search in online parallel and comparable corpora.
Patrizia Giampieri
doaj +1 more source
Electronic Corpora for Legal English Translator/Interpreter Training - A Case Study
This article looks into the experience of using parallel and comparable corpora in the training of future legal English translators and interpreters between English and Montenegrin.
Bulatović Vesna
doaj +1 more source

