Energy and Policy Considerations for Deep Learning in NLP [PDF]
Recent progress in hardware and methodology for training neural networks has ushered in a new generation of large networks trained on abundant data. These models have obtained notable gains in accuracy across many NLP tasks.
Emma Strubell +2 more
semanticscholar +1 more source
Large scale text mining for deriving useful insights: A case study focused on microbiome
Text mining has been shown to be an auxiliary but key driver for modeling, data harmonization, and interpretation in bio-medicine. Scientific literature holds a wealth of information and embodies cumulative knowledge and remains the core basis on which ...
Syed Ashif Jardary Al Ahmed +8 more
doaj +1 more source
Challenges and Strategies in Cross-Cultural NLP [PDF]
Various efforts in the Natural Language Processing (NLP) community have been made to accommodate linguistic diversity and serve speakers of many different languages.
Daniel Hershcovich +13 more
semanticscholar +1 more source
Demonstrate-Search-Predict: Composing retrieval and language models for knowledge-intensive NLP [PDF]
Retrieval-augmented in-context learning has emerged as a powerful approach for addressing knowledge-intensive tasks using frozen language models (LM) and retrieval models (RM).
O. Khattab +6 more
semanticscholar +1 more source
UPRec: User-aware Pre-training for sequential Recommendation
Recent years witness the success of pre-trained models to alleviate the data sparsity problem in recommender systems. However, existing pre-trained models for recommendation mainly focus on leveraging universal sequence patterns from user behavior ...
Chaojun Xiao +6 more
doaj +1 more source
Dynabench: Rethinking Benchmarking in NLP [PDF]
We introduce Dynabench, an open-source platform for dynamic dataset creation and model benchmarking. Dynabench runs in a web browser and supports human-and-model-in-the-loop dataset creation: annotators seek to create examples that a target model will ...
Douwe Kiela +18 more
semanticscholar +1 more source
The speech of native speakers is full of idiosyncrasies. Especially prominent are lexically restricted binary word co-occurrences of the type high esteem, strong tea, run [an] experiment, war break(s) out, etc.
Alexander Shvets, Leo Wanner
doaj +1 more source
Word Sense Induction with Attentive Context Clustering [PDF]
This paper presents ACCWSI (Attentive Context Clustering WSI), a method for Word Sense Induction, suitable for languages with limited resources. Pretrained on a small corpus and given an ambiguous word (a query word) and a set of excerpts that contain it,
Moshe Stekel, Amos Azaria, Shai Gordin
doaj +1 more source
HESML: a real-time semantic measures library for the biomedical domain with a reproducible survey
Background: Ontology-based semantic similarity measures based on SNOMED-CT, MeSH, and Gene Ontology are being extensively used in many applications in biomedical text mining and genomics respectively, which has encouraged the development of semantic ...
Juan J. Lastra-Díaz +2 more
doaj +1 more source
BERT Rediscovers the Classical NLP Pipeline [PDF]
Pre-trained text encoders have rapidly advanced the state of the art on many NLP tasks. We focus on one such model, BERT, and aim to quantify where linguistic information is captured within the network.
Ian Tenney, Dipanjan Das, Ellie Pavlick
semanticscholar +1 more source

