
Variationist: Exploring Multifaceted Variation and Bias in Written Language Data [PDF]

open access: yes, arXiv
Exploring and understanding language data is a fundamental stage in all areas dealing with human language. It allows NLP practitioners to uncover quality concerns and harmful biases in data before training, and helps linguists and social scientists to gain insight into language use and human behavior.
arxiv  

Weaving words for textile museums: the development of the linked SILKNOW thesaurus. [PDF]

open access: yes, Herit Sci, 2022
Alba E   +4 more
europepmc   +1 more source

Kanembu-Kanuri relationship: a proposal [PDF]

open access: yes, 2006
The paper draws on oral tradition and linguistics to assess the assertion that the present-day Kanuri and Kanembu speech forms emerged from the same parent language.
Bulakarima, Shettima Umara
core  

ZuantuSet: A Collection of Historical Chinese Visualizations and Illustrations [PDF]

open access: yes
Historical visualizations are a valuable resource for studying the history of visualization and inspecting the cultural context where they were created. When investigating historical visualizations, it is essential to consider contributions from different cultural frameworks to gain a comprehensive understanding.
arxiv   +1 more source

Adapting Multilingual Embedding Models to Historical Luxembourgish [PDF]

open access: yes, arXiv
The growing volume of digitized historical texts requires effective semantic search using text embeddings. However, pre-trained multilingual models face challenges with historical content due to OCR noise and outdated spellings. This study examines multilingual embeddings for cross-lingual semantic search in historical Luxembourgish (LB), a low ...
arxiv  

Geography and language divergence: The case of Andic languages. [PDF]

open access: yes, PLoS One, 2022
Koile E, Chechuro I, Moroz G, Daniel M.
europepmc   +1 more source

The speech community [PDF]

open access: yes, 2001
The speech community (SpCom), a core concept in empirical linguistics, is at the intersection of many principal problems in sociolinguistic theory and method.
Patrick, Peter L
core  

Spanish dialect classifications [PDF]

open access: yes
(Catalogued as an article rather than a book chapter.) This paper presents an overview of the proposals for a dialect division of the Spanish language in Europe. Given the enormous number of studies published on this subject, rather than being exhaustive,
Molina Martos, Isabel
core   +1 more source

Contrastive Entity Coreference and Disambiguation for Historical Texts [PDF]

open access: yes, arXiv
Massive-scale historical document collections are crucial for social science research. Despite increasing digitization, these documents typically lack unique cross-document identifiers for individuals mentioned within the texts, as well as individual identifiers from external knowledgebases like Wikipedia/Wikidata.
arxiv  
