Text Mining Infrastructure in R
During the last decade text mining has become a widely used discipline utilizing statistical and machine learning methods. We present the tm package which provides a framework for text mining applications within R.
Kurt Hornik, Ingo Feinerer, David Meyer
A large dataset of scientific text reuse in Open-Access publications
We present the Webis-STEREO-21 dataset, a massive collection of Scientific Text Reuse in Open-access publications. It contains 91 million cases of reused text passages found in 4.2 million unique open-access publications.
Lukas Gienapp +4 more
Comparing neural models for nested and overlapping biomedical event detection
Background: Nested and overlapping events are particularly frequent and informative structures in biomedical event extraction. However, state-of-the-art neural models either neglect those structures during learning or use syntactic features and external ...
Kurt Espinosa +5 more
Automatic identification of suicide notes with a transformer-based deep learning model
Suicide is one of the leading causes of death worldwide. At the same time, the widespread use of social media has led to an increase in people posting their suicide notes online. Therefore, designing a learning model that can aid the detection of suicide
Tianlin Zhang +2 more
Natural language processing applied to mental illness detection: a narrative review
Mental illness is highly prevalent nowadays, constituting a major cause of distress in people’s life with impact on society’s health and well-being. Mental illness is a complex multi-factorial disease associated with individual risk factors and a variety
Tianlin Zhang +3 more
MuSe: The Musical Sentiment Dataset
The MuSe (Music Sentiment) dataset contains sentiment information for 90,001 songs. We computed scores for the affective dimensions of valence, dominance, and arousal, based on the user-generated tags that are available for each song via Last.fm.
Christopher Akiki, Manuel Burghardt
COPIOUS: A gold standard corpus of named entities towards extracting species occurrence from biodiversity literature
Background: Species occurrence records are very important in the biodiversity domain. While several available corpora contain only annotations of species names or habitats and geographical locations, there is no consolidated corpus that covers all ...
Nhung Nguyen +2 more
PRESTOapp for health workers with mental health symptoms related to the COVID-19 pandemic
Introduction: The COVID-19 pandemic has had a significant impact on the mental health of health workers, which has led many hospitals to launch immediate preventive mental health programs.
M. Primé Tous +10 more
Background: Recently, automatically extracting biomedical relations has been a significant subject in biomedical research due to the rapid growth of biomedical literature.
Peng Su, K. Vijay-Shanker
Detecting the articles which consist of protein–protein interactions (PPI) is a significant step in biological information extraction. In this paper, we present a hybrid text classification (TC) method to identify protein–protein interaction articles ...
Sabenabanu Abdulkadhar +2 more

