Results 41 to 50 of about 89,062
Cultural Cartography with Word Embeddings [PDF]
Using the frequency of keywords is a classic approach in the formal analysis of text, but has the drawback of glossing over the relationality of word meanings. Word embedding models overcome this problem by constructing a standardized and continuous “meaning-space” where words are assigned a location based on relations of similarity to other words ...
Stoltz, Dustin, Taylor, Marshall
openaire +4 more sources
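The "meaning-space" idea in this abstract lends itself to a small illustration. Below is a minimal sketch, using toy stand-in vectors (a real analysis would use trained embeddings such as word2vec), of locating words along a cultural dimension built from an anchor-word pair; the words and axis are hypothetical examples, not the paper's data.

```python
# Sketch: project word vectors onto a "cultural dimension" defined by an
# antonym pair, in the spirit of the meaning-space described above.
import numpy as np

rng = np.random.default_rng(0)
# Stand-in embeddings; in practice these come from a trained model.
vocab = {w: rng.normal(size=50) for w in ["rich", "poor", "opera", "nascar"]}

def unit(v):
    return v / np.linalg.norm(v)

# A "class" axis as the normalized difference of an antonym pair.
axis = unit(vocab["rich"] - vocab["poor"])

for word in ["opera", "nascar"]:
    # The projection gives the word's position along the axis.
    print(word, float(unit(vocab[word]) @ axis))
```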
Semantic features are very important for machine learning-based drug name recognition (DNR) systems. The semantic features used in most DNR systems are based on drug dictionaries manually constructed by experts.
Shengyu Liu +3 more
doaj +1 more source
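To make the contrast in this snippet concrete, here is a hypothetical sketch of the two feature styles it mentions: a dictionary-membership flag versus an embedding lookup. The lexicon and embedding table are invented placeholders, not anything from the paper.

```python
# Toy lexicon standing in for an expert-built drug dictionary (assumption).
drug_dictionary = {"ibuprofen", "aspirin", "metformin"}

def dictionary_feature(token):
    # Binary feature: is the token in the drug dictionary?
    return 1.0 if token.lower() in drug_dictionary else 0.0

# Stand-in for a trained embedding table (assumption).
embedding_table = {"aspirin": [0.2, -0.1, 0.7]}

def embedding_feature(token, dim=3):
    # Dense feature: the token's embedding, or zeros if unseen.
    return embedding_table.get(token.lower(), [0.0] * dim)

print(dictionary_feature("Aspirin"), embedding_feature("Aspirin"))
```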
Better Word Embeddings by Disentangling Contextual n-Gram Information
Pre-trained word vectors are ubiquitous in Natural Language Processing applications. In this paper, we show how training word embeddings jointly with bigram and even trigram embeddings results in improved unigram embeddings.
Gupta, Prakhar +2 more
core +1 more source
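The paper's joint unigram/bigram/trigram training is not reproduced here. As a rough, hedged proxy for incorporating n-gram information, gensim's Phrases can merge frequent bigrams into single tokens before word2vec training, so those units receive their own vectors; the toy corpus below is an assumption.

```python
# Proxy, not the paper's method: merge frequent bigrams into single tokens,
# then train word2vec so unigrams and bigrams share one embedding space.
from gensim.models import Word2Vec
from gensim.models.phrases import Phrases, Phraser

sentences = [
    ["new", "york", "is", "a", "city"],
    ["she", "moved", "to", "new", "york"],
    ["the", "city", "is", "large"],
] * 50  # repeat so the bigram clears frequency thresholds

bigram = Phraser(Phrases(sentences, min_count=5, threshold=1.0))
model = Word2Vec([bigram[s] for s in sentences],
                 vector_size=32, min_count=1, seed=0)
print("new_york" in model.wv)  # True: the bigram token got its own vector
```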
Understanding and Creating Word Embeddings
Word embeddings let you analyze how different terms are used across a corpus of texts by capturing information about their contexts of use. Through a primarily theoretical lens, this lesson will teach you how to prepare a corpus and train a word ...
Avery Blankenship +2 more
doaj +1 more source
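As a companion to the workflow this lesson describes, here is a minimal prepare-then-train sketch with gensim; the three-sentence corpus is a placeholder, and a real lesson would use a substantial text collection.

```python
# Sketch: tokenize a tiny corpus, train word2vec, and query the result.
import re
from gensim.models import Word2Vec

raw_texts = [
    "The queen addressed the court.",
    "The king addressed the court.",
    "A royal decree was read aloud.",
]

def tokenize(text):
    # Lowercase and keep alphabetic tokens only.
    return re.findall(r"[a-z]+", text.lower())

corpus = [tokenize(t) for t in raw_texts] * 100  # tiny corpora need repetition
model = Word2Vec(corpus, vector_size=32, window=3,
                 min_count=1, epochs=5, seed=0)
print(model.wv.most_similar("queen", topn=3))
```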
GLTM: A Global and Local Word Embedding-Based Topic Model for Short Texts
Short texts have become a prevalent source of information, and discovering topical information from short text collections is valuable for many applications.
Wenxin Liang +4 more
doaj +1 more source
SPINE: SParse Interpretable Neural Embeddings
Prediction without justification has limited utility. Much of the success of neural models can be attributed to their ability to learn rich, dense and expressive representations.
Berg-Kirkpatrick, Taylor +4 more
core +1 more source
Gloss Alignment using Word Embeddings
Capturing and annotating sign language datasets is a time-consuming and costly process. Current datasets are orders of magnitude too small to successfully train unconstrained sign language translation (SLT) models. As a result, research has turned to TV broadcast content as a source of large-scale training data, consisting of both the sign language interpreter and the ...
Walsh, Harry +3 more
openaire +2 more sources
Learning Chinese Word Embeddings With Words and Subcharacter N-Grams
Co-occurrence information between words is the basis of training word embeddings. In addition, Chinese characters are composed of subcharacters, and words made up of the same characters or subcharacters usually have similar semantics, but this internal ...
Ruizhi Kang +4 more
doaj +1 more source
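For illustration only: fastText-style character n-gram extraction, a related but not identical scheme; the paper additionally draws on subcharacter components, which require an external decomposition table not shown here.

```python
# Extract character n-grams from a Chinese word, fastText-style.
def char_ngrams(word, n_min=1, n_max=2):
    grams = []
    for n in range(n_min, n_max + 1):
        grams += [word[i:i + n] for i in range(len(word) - n + 1)]
    return grams

print(char_ngrams("智能手机"))
# ['智', '能', '手', '机', '智能', '能手', '手机']
```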
This work presents a novel methodology for calculating the phonetic similarity between words, motivated by human perception of sounds. This metric is employed to learn a continuous vector embedding space that groups similar-sounding words together and can be used for various downstream computational phonology tasks.
Sharma, Rahul +2 more
openaire +2 more sources
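The paper's perception-motivated metric is not specified in the snippet; as a hedged baseline for comparison, here is plain edit distance over phoneme sequences (the ARPAbet-style strings are assumed, e.g. from a pronouncing dictionary).

```python
# Baseline phonetic distance: Levenshtein edit distance over phoneme lists.
def edit_distance(a, b):
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            # deletion, insertion, substitution (or match)
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]

print(edit_distance(["K", "AE", "T"], ["B", "AE", "T"]))  # cat vs. bat -> 1
```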
Closed Form Word Embedding Alignment [PDF]
We develop a family of techniques to align word embeddings which are derived from different source datasets or created using different mechanisms (e.g., GloVe or word2vec). Our methods are simple and have a closed form to optimally rotate, translate, and scale to minimize root mean squared errors or maximize the average cosine similarity between two ...
Sunipa Dev +2 more
openaire +2 more sources
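The rotation component of such a closed-form alignment is the classic orthogonal Procrustes solution; below is a minimal numpy sketch of the rotation step only, omitting the translation and scaling the abstract also mentions. The synthetic matrices are placeholders for two sets of embeddings of the same words.

```python
# Closed-form rotation between two embedding spaces (orthogonal Procrustes).
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(100, 50))                 # embeddings from source space
R_true, _ = np.linalg.qr(rng.normal(size=(50, 50)))
B = A @ R_true                                 # same words, rotated space

# R = U V^T from the SVD of A^T B minimizes ||A R - B||_F over rotations.
U, _, Vt = np.linalg.svd(A.T @ B)
R = U @ Vt
print(np.allclose(A @ R, B))                   # True: rotation recovered
```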