
Double-Hard Debias: Tailoring Word Embeddings for Gender Bias Mitigation [PDF]

open access: yes; Annual Meeting of the Association for Computational Linguistics, 2020
Word embeddings derived from human-generated corpora inherit strong gender bias, which can be further amplified by downstream models. Some commonly adopted debiasing approaches, including the seminal Hard Debias algorithm, apply post-processing procedures ...
Tianlu Wang   +4 more
semanticscholar   +1 more source
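The abstract names the Hard Debias post-processing step; a minimal sketch of that projection, assuming a precomputed gender direction such as the difference between the "he" and "she" vectors, could look like this:

    import numpy as np

    def hard_debias(vectors, gender_direction):
        # Remove each embedding's component along the (unit-normalized)
        # gender direction, e.g. emb["he"] - emb["she"].
        # vectors: (n, d) array of word embeddings.
        g = gender_direction / np.linalg.norm(gender_direction)
        return vectors - np.outer(vectors @ g, g)

Double-Hard Debias, per the title, adds a further transformation before this projection to remove frequency-related distortions; the abstract is cut off before those details.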

Phonetic Word Embeddings

open access: yes, 2021
This work presents a novel methodology for calculating the phonetic similarity between words, taking motivation from the human perception of sounds. This metric is employed to learn a continuous vector embedding space that groups similar-sounding words together and can be used for various downstream computational phonology tasks.
Sharma, Rahul   +2 more
openaire   +2 more sources
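As an illustration of a perception-motivated phonetic distance (a hypothetical sketch, not the paper's exact metric), one can compare words as phoneme sequences with a normalized edit distance:

    def phonetic_similarity(ph1, ph2):
        # Similarity in [0, 1] from edit distance over phoneme sequences,
        # e.g. ph1 = ["F", "AO", "N"]. Illustrative only: the paper weights
        # edits by perceptual similarity rather than treating them equally.
        m, n = len(ph1), len(ph2)
        dp = [[0] * (n + 1) for _ in range(m + 1)]
        for i in range(m + 1):
            dp[i][0] = i
        for j in range(n + 1):
            dp[0][j] = j
        for i in range(1, m + 1):
            for j in range(1, n + 1):
                cost = 0 if ph1[i - 1] == ph2[j - 1] else 1
                dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                               dp[i][j - 1] + 1,        # insertion
                               dp[i - 1][j - 1] + cost) # substitution
        return 1.0 - dp[m][n] / max(m, n, 1)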

Efficient Estimate of Low-Frequency Words’ Embeddings Based on the Dictionary: A Case Study on Chinese

open access: yes; Applied Sciences, 2021
Obtaining high-quality embeddings for out-of-vocabulary (OOV) and low-frequency words is a challenge in natural language processing (NLP). To efficiently estimate the embeddings of OOV and low-frequency words, we propose a new method that uses the ...
Xianwen Liao   +5 more
doaj   +1 more source
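The abstract is cut off, but the stated idea of estimating a rare word's embedding from the dictionary suggests composing a vector from the words of its definition. A minimal sketch under that assumption (the function name is hypothetical):

    import numpy as np

    def estimate_from_definition(definition_tokens, emb, dim=300):
        # Estimate an embedding for an OOV word as the mean of the
        # embeddings of the known words in its dictionary gloss.
        # emb: dict mapping known words to (dim,) numpy vectors.
        known = [emb[t] for t in definition_tokens if t in emb]
        if not known:
            return np.zeros(dim)  # no usable definition words
        return np.mean(known, axis=0)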

BioWordVec, improving biomedical word embeddings with subword information and MeSH

open access: yes; Scientific Data, 2019
Distributed word representations have become an essential foundation for biomedical natural language processing (BioNLP), text mining and information retrieval. Word embeddings are traditionally computed at the word level from a large corpus of unlabeled ...
Yijia Zhang   +4 more
semanticscholar   +1 more source
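BioWordVec builds on fastText's subword model, in which a word vector is the sum of vectors for its character n-grams. A compressed sketch of that composition, with Python's built-in hash standing in for fastText's hashing scheme:

    import numpy as np

    def char_ngrams(word, n_min=3, n_max=6):
        # Character n-grams with boundary markers, as in fastText.
        w = "<" + word + ">"
        return [w[i:i + n] for n in range(n_min, n_max + 1)
                for i in range(len(w) - n + 1)]

    def subword_vector(word, ngram_vectors):
        # Sum hashed n-gram vectors; yields a vector even for OOV words.
        # ngram_vectors: (num_buckets, dim) array of trained n-gram embeddings.
        rows = [hash(g) % len(ngram_vectors) for g in char_ngrams(word)]
        return ngram_vectors[rows].sum(axis=0)

Per the title, the paper's addition is to train such subword vectors jointly with MeSH term sequences so that domain knowledge shapes the embedding space.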

Learning Chinese Word Embeddings With Words and Subcharacter N-Grams

open access: yes; IEEE Access, 2019
Co-occurrence information between words is the basis for training word embeddings. In addition, Chinese characters are composed of subcharacters, and words made up of the same characters or subcharacters usually have similar semantics, but this internal ...
Ruizhi Kang   +4 more
doaj   +1 more source
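A hedged sketch of the general idea (the decomposition table and names are illustrative, not the paper's): represent each Chinese word by the union of the word itself, its characters, its subcharacter components, and n-grams over those components, and let all of these units share co-occurrence contexts during training:

    def training_units(word, components, n=2):
        # Units whose vectors are combined for one Chinese word.
        # components: dict mapping a character to its subcharacter parts,
        # e.g. {"材": ["木", "才"]} (illustrative decomposition).
        units = [word]                      # the word itself
        units += list(word)                 # its characters
        subchars = [c for ch in word for c in components.get(ch, [])]
        units += subchars                   # subcharacter components
        # n-grams over the subcharacter sequence capture combinations
        units += ["".join(subchars[i:i + n])
                  for i in range(len(subchars) - n + 1)]
        return units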

Exploring the Privacy-Preserving Properties of Word Embeddings: Algorithmic Validation Study

open access: yes; Journal of Medical Internet Research, 2020
Background: Word embeddings are dense numeric vectors used to represent language in neural networks. Until recently, there had been no publicly released embeddings trained on clinical data.
Abdalla, Mohamed   +3 more
doaj   +1 more source
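The study's protocol is not given here; one common way to probe what released embeddings expose is a nearest-neighbor query around a sensitive token (an assumed illustration, not necessarily the paper's method):

    import numpy as np

    def nearest_neighbors(query, emb, k=5):
        # Top-k cosine neighbors of a query word; memorized clinical
        # associations, if any, would surface among these neighbors.
        q = emb[query] / np.linalg.norm(emb[query])
        scored = []
        for w, v in emb.items():
            if w == query:
                continue
            scored.append((float(q @ (v / np.linalg.norm(v))), w))
        return sorted(scored, reverse=True)[:k]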

Nurse is Closer to Woman than Surgeon? Mitigating Gender-Biased Proximities in Word Embeddings [PDF]

open access: yes; Transactions of the Association for Computational Linguistics, 2020
Word embeddings are the standard model for semantic and syntactic representations of words. Unfortunately, these models have been shown to exhibit undesirable word associations resulting from gender, racial, and religious biases. Existing post-processing ...
Vaibhav Kumar   +2 more
semanticscholar   +1 more source
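The biased proximities in the title can be made concrete with a simple per-word direct-bias score (a common diagnostic, assumed here rather than taken from the paper):

    import numpy as np

    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    def gender_proximity_bias(word, emb):
        # Positive if `word` sits closer to "woman" than to "man".
        return cos(emb[word], emb["woman"]) - cos(emb[word], emb["man"])

For example, gender_proximity_bias("nurse", emb) exceeding gender_proximity_bias("surgeon", emb) would reproduce the association questioned in the title.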

Learning Bilingual Word Embedding Mappings with Similar Words in Related Languages Using GAN

open access: yes; Applied Artificial Intelligence, 2022
Cross-lingual word embeddings represent words from different languages in the same vector space. They support reasoning about semantics and comparing word meanings across languages and in multilingual contexts, which is necessary for bilingual lexicon ...
Ghafour Alipour   +2 more
doaj   +1 more source
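The paper learns the cross-lingual mapping adversarially with a GAN; as a simpler illustration of the underlying shared-space idea, here is the supervised orthogonal (Procrustes) mapping between two embedding matrices aligned by a seed dictionary:

    import numpy as np

    def procrustes_mapping(X, Y):
        # Orthogonal W minimizing ||X W - Y||_F for row-aligned pairs.
        # X: (n, d) source-language vectors; Y: (n, d) target-language
        # vectors, where row i of each is one seed translation pair.
        U, _, Vt = np.linalg.svd(X.T @ Y)
        return U @ Vt  # map a source vector x into the target space as x @ W

In the GAN setting of the title, a discriminator is typically trained to distinguish mapped source vectors from real target vectors, and the mapping is trained to fool it, reducing the need for a large seed dictionary.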

A Survey On Neural Word Embeddings [PDF]

open access: yes; arXiv, 2021
Understanding human language has long been a sub-challenge on the way to intelligent machines. The study of meaning in natural language processing (NLP) relies on the distributional hypothesis, whereby language elements get their meaning from the words that co-occur with them in context.
arxiv  
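The distributional hypothesis the survey builds on can be made concrete with a toy co-occurrence count (illustrative only):

    from collections import Counter

    def cooccurrence_counts(tokens, window=2):
        # Count (word, context) pairs within a symmetric window; the
        # distributional hypothesis says these counts carry word meaning.
        counts = Counter()
        for i, w in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[(w, tokens[j])] += 1
        return counts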

Comparative Analysis of Word Embeddings for Capturing Word Similarities [PDF]

open access: yes; 6th International Conference on Natural Language Processing (NATP 2020), 2020
Distributed representation has become the most widely used technique for representing language in various natural language processing tasks. Most deep-learning-based natural language processing models use pre-trained distributed word representations, commonly called word embeddings.
arxiv   +1 more source
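Capturing word similarities is conventionally measured by correlating embedding cosines with human similarity ratings; a minimal sketch of that evaluation, with the benchmark word pairs assumed given:

    import numpy as np
    from scipy.stats import spearmanr

    def similarity_eval(pairs, human_scores, emb):
        # Spearman correlation between model cosines and human judgments.
        # pairs: list of (word1, word2); human_scores: parallel ratings.
        cosines = []
        for w1, w2 in pairs:
            a, b = emb[w1], emb[w2]
            cosines.append(float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b))))
        rho, _ = spearmanr(cosines, human_scores)
        return rho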
