Annotated web as corpus

open access: yes
Proceedings of the 2nd International Workshop on Web as Corpus - WAC '06, 2006
This paper presents a proposal to facilitate the use of the annotated web as corpus by alleviating the annotation bottleneck for corpus data drawn from the web. We describe a framework for large-scale distributed corpus annotation using peer-to-peer (P2P) technology to meet this need.
Rayson, P.   +3 more
openaire   +1 more source

An Emergent Approach to Text Analysis Based on a Connectionist Model and the Web

open access: yes
Algorithms, 2013
In this paper, we present a method to provide proactive assistance in text checking, based on usage relationships between words structuralized on the Web.
Gigliola Vaglini, Mario G.C.A. Cimino
doaj   +1 more source

Acoustic classification of focus: On the web and in the lab

open access: yes
Laboratory Phonology, 2017
We present a new methodological approach which combines both naturally-occurring speech harvested on the web and speech data elicited in the laboratory.
Jonathan Howell   +2 more
doaj   +2 more sources

Using web as corpus in phraseological researches: A damaging practice or a valuable resource?

open access: yes
Calidoscópio, 2016
This paper presents the methodological approach used in the search for Idiomatic Expressions of Low Metaphorical Deducibility (EIBDM) found in Italian dictionaries of phraseologisms.
Eloísa Moriel Valença   +1 more
doaj   +3 more sources

A learner corpus is born this way: From raw data to processed dataset

open access: yes
Data in Brief, 2022
This data article presents the development of a learner corpus (i.e. a systematic computerized web-based repository of written texts produced by language learners) from the initial phase of the development where written assignments were collected from ...
Chung Hong Danny Leung   +2 more
doaj   +1 more source

When Intuition Fails us: the World Wide Web as a Corpus

open access: yes
Glottodidactica, 2007
In some respects corpus linguistics has made a significant contribution to foreign language (L2) instruction: for example, reference tools like dictionaries and grammar books are at present enriched by various types of information derived from corpora ...
Paweł Scheffler
doaj   +1 more source

THE WEB AS CORPUS AND ONLINE CORPORA FOR LEGAL TRANSLATIONS

open access: yes
Comparative Legilinguistics, 2019
Legal language is hallmarked by a pedantic and user-unfriendly jargon whose constructs are all but intuitive, not to mention the legal system specificity which makes it unique in every country.
Patrizia GIAMPIERI
doaj   +4 more sources

A Manual for Web Corpus Crawling of Low Resource Languages

open access: yes
Umanistica Digitale, 2020
Since the seminal publication of “Web as Corpus” [1], the potential of creating corpora from the web has been realized for good for the creation of both online and offline corpora: noisy vs. clean, balanced vs. convenient, annotated vs.
Armin Hoenen   +2 more
doaj   +1 more source

A Web Corpus and Word Sketches for Japanese

open access: yes
Journal of Natural Language Processing, 2008
Of all the major world languages, Japanese is lagging behind in terms of publicly accessible and searchable corpora. In this paper we describe the development of JpWaC (Japanese Web as Corpus), a large corpus of 400 million words of Japanese web text, and its encoding for the Sketch Engine.
Erjavec, Irena Srdanovic   +2 more
openaire   +3 more sources
