Comparable corpora - Open Access .click

The use of English, Czech and French punctuation marks in reference, parallel and comparable web corpora: a question of methodology [PDF]

Linguistica Pragensia, 2020
This paper analyses the frequency of six punctuation marks (the comma, period, colon, semicolon, question mark and exclamation mark) in three languages (English, French and Czech) in three different types of corpora — comparable web corpora, large ...
Olga Nádvorníková
doaj

Experimental Polish-Lithuanian Corpus with the Semantic Annotation Elements

Cognitive Studies | Études cognitives, 2015
In this article the authors present the experimental Polish-Lithuanian corpus (ECorpPL-LT), built to support Polish-Lithuanian theoretical contrastive studies, a Polish ...
Danuta Roszko, Roman Roszko
doaj +1 more source

A Simple Yet Robust Algorithm for Automatic Extraction of Parallel Sentences: A Case Study on Arabic-English Wikipedia Articles

IEEE Access, 2022
Parallel corpora are vital components in several applications of Natural Language Processing (NLP), particularly in machine translation. In this paper, we present a novel method for automatically creating parallel sentences from comparable corpora.
Maha Jarallah Althobaiti
doaj +1 more source

Exploiting comparable corpora with TER and TERp [PDF]

Proceedings of the 2nd Workshop on Building and Using Comparable Corpora from Parallel to Non-parallel Corpora - BUCC '09, 2009
In this paper, we present an extension of a simple and effective method for extracting parallel sentences from comparable corpora, and we apply it to an Arabic/English NIST system. We experiment with a new TERp filter, along with WER and TER filters.
Sadaf Abdul-Rauf, Holger Schwenk
openaire +2 more sources

‘Experience Norfolk! Experience Fun!’ vs. ‘Doživi više od očekivanog’ – A Corpus-Based Contrastive Study of Reader Engagement Markers on the Web

ELOPE, 2023
The paper investigates how reader engagement markers (Hyland 2005; Zou and Hyland 2020) are used in tourism promotion to establish interaction with potential customers on the web.
Dragana Vuković Vojnović
doaj +1 more source

Set-Theoretic Alignment for Comparable Corpora [PDF]

Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2016
We describe and evaluate a simple method to extract parallel sentences from comparable corpora. The approach, termed STACC, is based on expanded lexical sets and the Jaccard similarity coefficient. We evaluate our system against state-of-the-art methods on a large range of datasets in different domains, for ten language pairs, showing that it either ...
Thierry Etchegoyhen, Andoni Azpeitia
openaire +1 more source
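
For readers unfamiliar with the Jaccard similarity coefficient named in the abstract above, the following is a minimal sketch of the plain coefficient computed over two token sets. It is not the STACC implementation: the lexical-set expansion step is not reproduced, and the function names and the naive whitespace tokenisation are assumptions made purely for illustration.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard coefficient |A ∩ B| / |A ∪ B|.

    Returning 0.0 for two empty sets is a convention chosen for this sketch.
    """
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def token_set(sentence: str) -> set:
    # Hypothetical helper: lowercase whitespace tokenisation, for illustration only.
    return set(sentence.lower().split())


src = token_set("the cat sat on the mat")
tgt = token_set("the cat lay on the mat")
score = jaccard(src, tgt)  # 4 shared tokens out of 6 distinct tokens
```

In a sentence-extraction setting, a score like this would typically be compared against a threshold to decide whether a candidate pair is kept; any such threshold would have to be tuned per dataset.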

Bilingual Topic Models for Comparable Corpora

CoRR, 2021
32 pages, 2 ...
Georgios Balikas, Massih-Reza Amini, Marianne Clausel +2 more
openaire +2 more sources

PyPlutchik: Visualising and comparing emotion-annotated corpora

PLOS ONE, 2021
The increasing availability of textual corpora and data fetched from social networks is fuelling a huge production of works based on the model proposed by psychologist Robert Plutchik, often referred to simply as the “Plutchik Wheel”. Related research ranges from descriptions of annotation tasks to emotion detection tools.
Alfonso Semeraro, Salvatore Vilella, Giancarlo Ruffo +2 more
openaire +6 more sources

Online Parallel and Comparable Corpora for Legal Translations

Altre Modernità, 2018
The use of corpus linguistics for technical translations has largely been advocated by scholars over the years. This paper is aimed at providing instances of legal term search in online parallel and comparable corpora.
Patrizia Giampieri
doaj +1 more source

Electronic Corpora for Legal English Translator/ Interpreter Training - A Case Study

Romanian Journal of English Studies, 2018
This article looks into the experience of using parallel and comparable corpora in the training of future legal English translators and interpreters between English and Montenegrin.
Bulatović Vesna
doaj +1 more source

parallel corpora
natural language processing
computational linguistics

fos: computer and information sciences
corpus linguistics
computation and language cs.cl

comparable corpus
parallel and comparable corpora
computer science - computation and language