Datasets as topic - Open Access .click


Topic Concentration in Query Focused Summarization Datasets

Proceedings of the AAAI Conference on Artificial Intelligence, 2016
Query-Focused Summarization (QFS) summarizes a document cluster in response to a specific input query. QFS algorithms must combine query relevance assessment, central content identification, and redundancy avoidance. Frustratingly, state of the art algorithms designed for QFS do not significantly improve upon generic summarization ...
Tal Baumel, Raphael Cohen, Michael Elhadad +2 more
core +3 more sources

A social and news media benchmark dataset for topic modeling

Data in Brief, 2022
Topic modeling is an active research area with several unanswered questions. The focus of recent research in this area is on the use of a vector embedding representation of the input text with both generative and evolutionary topic modeling techniques ...
Samuel Miles +4 more
doaj +4 more sources

Topic-driven Clustering for Document Datasets

Proceedings of the 2005 SIAM International Conference on Data Mining, 2005
In this paper, we define the problem of topic-driven clustering, which organizes a document collection according to a given set of topics (either from domain experts, or as a requirement satisfying users' needs). We propose three topic-driven schemes that consider the similarity between the document to its topic and the relationship among the documents ...
Ying Zhao, George Karypis
openaire +2 more sources

MasakhaNEWS: News Topic Classification for African languages

International Joint Conference on Natural Language Processing, 2023
African languages are severely under-represented in NLP research due to lack of datasets covering several NLP tasks. While there are individual language specific datasets that are being expanded to different tasks, only a handful of NLP tasks (e.g. named ...
Adeeko, Adetola +64 more
core +3 more sources

Selection of the Optimal Number of Topics for LDA Topic Model—Taking Patent Policy Analysis as an Example

Entropy, 2021
This study constructs a comprehensive index to effectively judge the optimal number of topics in the LDA topic model. Based on the requirements for selecting the number of topics, a comprehensive judgment index of perplexity, isolation, stability, and ...
Jingxian Gan, Yong Qi
doaj +2 more sources

An Adaptive LDA Optimal Topic Number Selection Method in News Topic Identification

IEEE Access, 2023
Nowadays, news text information is exploding, and people need more and more heterogeneous news content. Therefore, news text topic identification is needed to help viewers quickly and accurately screen and filter news related to their interests to save ...
Mingming Zheng, Kaizhong Jiang, Ranhui Xu, Lulu Qi +3 more
doaj +2 more sources

Extended Semantic Web Conference, 2016
The Web of data is growing continuously with respect to both the size and number of the datasets published. Porting a dataset to five-star Linked Data however requires the publisher of this dataset to link it with the already available linked datasets.
Röder, Michael +3 more
openaire +2 more sources

Comparison of Artificial Intelligence Tools With Human Coding for Sentiment, Topic, and Thematic Analysis Tasks of Public Health Datasets During the COVID-19 Pandemic in Australia: Case Study

Online Journal of Public Health Informatics
Background: Public opinion, which may be influenced by personal experiences, news, and social media, can impact compliance with public health measures (PHMs) during health emergencies.
Danielle Hutchinson +5 more
doaj +2 more sources

Topic modeling for cluster analysis of large biological and medical datasets

BMC Bioinformatics, 2014
The big data moniker is nowhere better deserved than to describe the ever-increasing prodigiousness and complexity of biological and medical datasets. New methods are needed to generate and test hypotheses, foster biological interpretation, and build validated predictors.
Weizhong Zhao, Wen Zou, James J. Chen
openaire +3 more sources

Diversity Over Size: On the Effect of Sample and Topic Sizes for Topic-Dependent Argument Mining Datasets

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing
The task of Argument Mining, that is extracting and classifying argument components for a specific topic from large document sources, is an inherently difficult task for machine learning models and humans alike, as large Argument Mining datasets are rare and recognition of argument components requires expert knowledge.
Benjamin Schiller +3 more
core +4 more sources

topic modeling
fos: computer and information sciences
computation and language cs.cl

machine learning
computer science - computation and language
natural language processing

random forest
information retrieval cs.ir
artificial intelligence cs.ai