An Analysis of Negation in Natural Language Understanding Corpora [PDF]
arXiv, 2022
This paper analyzes negation in eight popular corpora spanning six natural language understanding tasks. We show that these corpora have few negations compared to general-purpose English, and that the few negations in them are often unimportant. Indeed, one can often ignore negations and still make the right predictions.
Md Mosharaf Hossain+2 more
arxiv +3 more sources
Abstracts of the Papers Printed in the Philosophical Transactions of the Royal Society of London
In this paper the author describes the origin, growth, use, and decay of the Corpora lutea. The ovarium, before puberty, is a loose, open texture, in which are a number of globular cells. After puberty, the Corpus luteum forms in the substance of the ovarium. In the cow it appears, when magnified, as a mass of convolutions, somewhat like the brain.
Sir Everard Home
openalex +5 more sources
Voice Conversion from Non-parallel Corpora Using Variational Auto-encoder [PDF]
arXiv, 2016
We propose a flexible framework for spectral conversion (SC) that facilitates training with unaligned corpora. Many SC frameworks require parallel corpora, phonetic alignments, or explicit frame-wise correspondence for learning conversion functions or for synthesizing a target spectrum with the aid of alignments.
Chin-Cheng Hsu+4 more
arxiv +3 more sources
Corpora and Translation. Are Corpora Still an Academic Luxury?
Vertimo Studijos, 2019
This paper aims to consider the impact corpora have made on language studies and to touch upon the interface between corpora use and translator training/practice. A small-scale survey conducted among the translation trainers/professionals and translation
Jonė Grigaliūnienė
doaj +3 more sources
A Practical Handbook of Corpus Linguistics, 2020
This chapter deals with learner corpora, that is, collections of (spoken and/or written) texts produced by learners of a language. It describes their main characteristics, with particular emphasis on those that are distinctive of learner corpora. Special types of corpora are introduced, such as longitudinal learner corpora or local learner corpora. The
Gaëtanelle Gilquin
semanticscholar +3 more sources
The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only [PDF]
arXiv.org, 2023
Large language models are commonly trained on a mixture of filtered web data and curated high-quality corpora, such as social media conversations, books, or technical papers.
Guilherme Penedo+8 more
semanticscholar +1 more source
Documenting Large Webtext Corpora: A Case Study on the Colossal Clean Crawled Corpus [PDF]
Conference on Empirical Methods in Natural Language Processing, 2021
Large language models have led to remarkable progress on many NLP tasks, and researchers are turning to ever-larger text corpora to train them. Some of the largest corpora available are made by scraping significant portions of the internet, and are ...
Jesse Dodge+5 more
semanticscholar +1 more source
Word Alignment by Fine-tuning Embeddings on Parallel Corpora [PDF]
Conference of the European Chapter of the Association for Computational Linguistics, 2021
Word alignment over parallel corpora has a wide variety of applications, including learning translation lexicons, cross-lingual transfer of language processing tools, and automatic evaluation or analysis of translation outputs. The great majority of past
Zi-Yi Dou, Graham Neubig
semanticscholar +1 more source
Samanantar: The Largest Publicly Available Parallel Corpora Collection for 11 Indic Languages [PDF]
Transactions of the Association for Computational Linguistics, 2021
We present Samanantar, the largest publicly available parallel corpora collection for Indic languages. The collection contains a total of 49.7 million sentence pairs between English and 11 Indic languages (from two language families).
Gowtham Ramesh+16 more
semanticscholar +1 more source
Characteristics of expert's report as evidence [PDF]
SHS Web of Conferences, 2020
In recent years, there has been an increasing need for private expert's reports as judicial evidence. In practice, it appears that a well-developed expert's report is an important basis for the court's decision-making. It is no secret that the quality of
Kubica Milan, Švejdová Nikola
doaj +1 more source