Topic Concentration in Query Focused Summarization Datasets
Query-Focused Summarization (QFS) summarizes a document cluster in response to a specific input query. QFS algorithms must combine query relevance assessment, central content identification, and redundancy avoidance. Frustratingly, state-of-the-art algorithms designed for QFS do not significantly improve upon generic summarization ...
Tal Baumel +2 more
core +3 more sources
A social and news media benchmark dataset for topic modeling
Topic modeling is an active research area with several unanswered questions. The focus of recent research in this area is on the use of a vector embedding representation of the input text with both generative and evolutionary topic modeling techniques ...
Samuel Miles +4 more
doaj +4 more sources
Topic-driven Clustering for Document Datasets [PDF]
In this paper, we define the problem of topic-driven clustering, which organizes a document collection according to a given set of topics (either from domain experts, or as a requirement satisfying users' needs). We propose three topic-driven schemes that consider the similarity between the document and its topic and the relationship among the documents
Ying Zhao, George Karypis
openaire +2 more sources
MasakhaNEWS: News Topic Classification for African languages [PDF]
African languages are severely under-represented in NLP research due to lack of datasets covering several NLP tasks. While there are individual language specific datasets that are being expanded to different tasks, only a handful of NLP tasks (e.g. named
Adeeko, Adetola +64 more
core +3 more sources
This study constructs a comprehensive index to effectively judge the optimal number of topics in the LDA topic model. Based on the requirements for selecting the number of topics, a comprehensive judgment index of perplexity, isolation, stability, and ...
Jingxian Gan, Yong Qi
doaj +2 more sources
An Adaptive LDA Optimal Topic Number Selection Method in News Topic Identification
Nowadays, news text information is exploding, and people need more and more heterogeneous news content. Therefore, news text topic identification is needed to help viewers quickly and accurately screen and filter news related to their interests to save ...
Mingming Zheng +3 more
doaj +2 more sources
Detecting Similar Linked Datasets Using Topic Modelling
The Web of data is growing continuously with respect to both the size and number of the datasets published. Porting a dataset to five-star Linked Data, however, requires the publisher of this dataset to link it with the already available linked datasets.
Röder, Michael +3 more
openaire +2 more sources
Comparison of Artificial Intelligence Tools With Human Coding for Sentiment, Topic, and Thematic Analysis Tasks of Public Health Datasets During the COVID-19 Pandemic in Australia: Case Study [PDF]
Background: Public opinion, which may be influenced by personal experiences, news, and social media, can impact compliance with public health measures (PHMs) during health emergencies.
Danielle Hutchinson +5 more
doaj +2 more sources
Topic modeling for cluster analysis of large biological and medical datasets [PDF]
The big data moniker is nowhere better deserved than to describe the ever-increasing prodigiousness and complexity of biological and medical datasets. New methods are needed to generate and test hypotheses, foster biological interpretation, and build validated predictors.
Weizhong Zhao, Wen Zou, James J. Chen
openaire +3 more sources
The task of Argument Mining, that is extracting and classifying argument components for a specific topic from large document sources, is an inherently difficult task for machine learning models and humans alike, as large Argument Mining datasets are rare and recognition of argument components requires expert knowledge.
Benjamin Schiller +3 more
core +4 more sources

