Variationist: Exploring Multifaceted Variation and Bias in Written Language Data [PDF]
Exploring and understanding language data is a fundamental stage in all areas dealing with human language. It allows NLP practitioners to uncover quality concerns and harmful biases in data before training, and helps linguists and social scientists to gain insight into language use and human behavior.
arxiv
Weaving words for textile museums: the development of the linked SILKNOW thesaurus. [PDF]
Alba E+4 more
europepmc +1 more source
Kanembu-Kanuri relationship: a proposal [PDF]
The paper draws on oral tradition and linguistics to test the assertion that the present-day Kanuri and Kanembu speech forms emerged from the same parent language.
Bulakarima, Shettima Umara
core
ZuantuSet: A Collection of Historical Chinese Visualizations and Illustrations [PDF]
Historical visualizations are a valuable resource for studying the history of visualization and inspecting the cultural context where they were created. When investigating historical visualizations, it is essential to consider contributions from different cultural frameworks to gain a comprehensive understanding.
arxiv +1 more source
Open Problems in Computational Historical Linguistics. [PDF]
List JM.
europepmc +1 more source
Adapting Multilingual Embedding Models to Historical Luxembourgish [PDF]
The growing volume of digitized historical texts requires effective semantic search using text embeddings. However, pre-trained multilingual models face challenges with historical content due to OCR noise and outdated spellings. This study examines multilingual embeddings for cross-lingual semantic search in historical Luxembourgish (LB), a low ...
arxiv
Geography and language divergence: The case of Andic languages. [PDF]
Koile E, Chechuro I, Moroz G, Daniel M.
europepmc +1 more source
The speech community (SpCom), a core concept in empirical linguistics, is at the intersection of many principal problems in sociolinguistic theory and method.
Patrick, Peter L
core
Spanish dialect classifications [PDF]
Classified as an article rather than a book chapter. This paper presents an overview of the proposals for a dialect division of the Spanish language in Europe. Given the enormous number of studies published on this subject, rather than being exhaustive,
Molina Martos, Isabel
core +1 more source
Contrastive Entity Coreference and Disambiguation for Historical Texts [PDF]
Massive-scale historical document collections are crucial for social science research. Despite increasing digitization, these documents typically lack unique cross-document identifiers for individuals mentioned within the texts, as well as individual identifiers from external knowledgebases like Wikipedia/Wikidata.
arxiv