This paper presents a proposal to facilitate the use of the annotated web as corpus by alleviating the annotation bottleneck for corpus data drawn from the web. We describe a framework for large-scale distributed corpus annotation using peer-to-peer (P2P) technology to meet this need.
Rayson, P. +3 more
openaire +1 more source
An Emergent Approach to Text Analysis Based on a Connectionist Model and the Web
In this paper, we present a method to provide proactive assistance in text checking, based on usage relationships between words structuralized on the Web.
Gigliola Vaglini, Mario G.C.A. Cimino
doaj +1 more source
Acoustic classification of focus: On the web and in the lab
We present a new methodological approach which combines both naturally-occurring speech harvested on the web and speech data elicited in the laboratory.
Jonathan Howell +2 more
doaj +2 more sources
Using web as corpus in phraseological researches: A damaging practice or a valuable resource?
This paper presents the methodological approach used to search for Idiomatic Expressions of Low Metaphorical Deducibility (EIBDM) found in Italian dictionaries of phraseologisms.
Eloísa Moriel Valença +1 more
doaj +3 more sources
A learner corpus is born this way: From raw data to processed dataset
This data article presents the development of a learner corpus (i.e. a systematic computerized web-based repository of written texts produced by language learners) from the initial phase of the development where written assignments were collected from ...
Chung Hong Danny Leung +2 more
doaj +1 more source
When Intuition Fails us: the World Wide Web as a Corpus
In some respects corpus linguistics has made a significant contribution to foreign language (L2) instruction: for example, reference tools like dictionaries and grammar books are at present enriched by various types of information derived from corpora ...
Paweł Scheffler
doaj +1 more source
THE WEB AS CORPUS AND ONLINE CORPORA FOR LEGAL TRANSLATIONS
Legal language is hallmarked by a pedantic and user-unfriendly jargon whose constructs are anything but intuitive, not to mention the specificity of each legal system, which makes it unique in every country.
Patrizia GIAMPIERI
doaj +4 more sources
The World Wide Web as a Linguistic Corpus
T Russon Wooldridge
doaj +1 more source
A Manual for Web Corpus Crawling of Low Resource Languages
Since the seminal publication of “Web as Corpus” [1], the potential of creating corpora from the web has been realized for good for the creation of both online and offline corpora: noisy vs. clean, balanced vs. convenient, annotated vs.
Armin Hoenen +2 more
doaj +1 more source
A Web Corpus and Word Sketches for Japanese
Of all the major world languages, Japanese is lagging behind in terms of publicly accessible and searchable corpora. In this paper we describe the development of JpWaC (Japanese Web as Corpus), a large corpus of 400 million words of Japanese web text, and its encoding for the Sketch Engine.
Erjavec, Irena Srdanovic +2 more
openaire +3 more sources

