Moving beyond Kucera and Francis: a critical evaluation of current word frequency norms and the introduction of a new and improved word frequency measure for American English [PDF]
Word frequency is the most important variable in research on word processing and memory. Yet, the main criterion for selecting word frequency norms has been the availability of the measure, rather than its quality.
Brysbaert, Marc, New, Boris
core +1 more source
University of Glasgow at WebCLEF 2005: experiments in per-field normalisation and language specific stemming [PDF]
We participated in the WebCLEF 2005 monolingual task. In this task, a search system aims to retrieve relevant documents from a multilingual corpus of Web documents from Web sites of European governments.
He, B. +4 more
core +1 more source
This work investigated understandings and conceptions of the corpus of analysis in dissertations and theses from a graduate program in Science and Mathematics Education.
Julio Murilo Trevas dos Santos +1 more
doaj +1 more source
Korpus XIX w. Uniwersytetu Warszawskiego i IJP PAN
Corpus of the 19th Century of the University of Warsaw and IJP PAN. The article describes a historical corpus that documents the 19th and early 20th centuries.
Marek Łaziński +2 more
doaj +1 more source
In this paper, we present a distributional word embedding model trained on one of the largest available Russian corpora: Araneum Russicum Maximum (over 10 billion words crawled from the web).
Kunilovskaya, Maria, Kutuzov, Andrey
core +1 more source
The article is devoted to the identification of relevant parameters of differentiation for the close emotion concepts ENVY and JEALOUSY based on the analysis of their names profiles in the iWeb web corpus.
Kostiantyn Mizin, Liudmyla Slavova
doaj +1 more source
More effective boilerplate removal – the GoldMiner algorithm [PDF]
The ever-increasing web is an important source for building large-scale corpora. However, dynamically generated web pages often contain much irrelevant and duplicated text, which impairs the quality of the corpus. To ensure the high quality of web-based
Endrédy, István, Novák, Attila
core +2 more sources
What Do Large Language Models Know About Materials?
If large language models (LLMs) are to be used inside the material discovery and engineering process, they must be benchmarked for the accurateness of intrinsic material knowledge. The current work introduces 1) a reasoning process through the processing–structure–property–performance chain and 2) a tool for benchmarking knowledge of LLMs concerning ...
Adrian Ehrenhofer +2 more
wiley +1 more source
The Web as Corpus in Translation
This thesis introduces corpus-based practices and studies in translation, including the use of monocorpus, parallel corpus and the web (as corpus) in translation. It demonstrates how to make effective internet queries with WebCONC, Webcorp and KWiCFinder, and lists 9 corpus tools to facilitate translation ...
openaire +2 more sources
The repair and regeneration of brain tissue face both biological and technical challenges. Injectable bioscaffolds offer new opportunities to stimulate tissue regrowth in the brain by recruiting neural stem cells. Here, the translational issues are reviewed that need to be addressed to advance this promising new therapeutic approach from the bench to ...
Michel Modo, Alena Kisel
wiley +1 more source

