NaturalConv: A Chinese Dialogue Dataset Towards Multi-turn Topic-driven Conversation [PDF]

open access: gold, 2021
In this paper, we propose a Chinese multi-turn topic-driven conversation dataset, NaturalConv, which allows the participants to chat anything they want as long as any element from the topic is mentioned and the topic shift is smooth. Our corpus contains 19.9K conversations from six domains, and 400K utterances with an average turn number of 20.1. These
Xiaoyang Wang   +3 more
arxiv   +3 more sources

Clustering and topic modeling over tweets: A comparison over a health dataset

open access: green, 2019 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2019
Twitter became the most popular form of social interactions in the healthcare domain. Thus, various teams have evaluated Twitter as an additional source where patients share information about their healthcare with the potential goal to improve their outcomes.
Juan Antonio Lossio-Ventura   +4 more
openalex   +5 more sources

WET: Word embedding-topic distribution vectors for MOOC video lectures dataset

open access: gold, Data in Brief, 2020
In this article, we present a dataset containing word embeddings and document topic distribution vectors generated from MOOCs video lecture transcripts. Transcripts of 12,032 video lectures from 200 courses were collected from Coursera learning platform.
Zenun Kastrati   +2 more
openalex   +7 more sources

A Multi-Faceted Approach to Trending Topic Attack Detection Using Semantic Similarity and Large-Scale Datasets

open access: gold, IEEE Access
Twitter’s widespread popularity has made it a prime target for malicious actors exploiting trending hashtags to disseminate harmful content. This study marks the first systematic exploration of semantic consistency in tweets to detect trending ...
Insaf Kraidia   +2 more
doaj   +2 more sources

Topic Concentration in Query Focused Summarization Datasets

open access: bronze, Proceedings of the AAAI Conference on Artificial Intelligence, 2016
Query-Focused Summarization (QFS) summarizes a document cluster in response to a specific input query. QFS algorithms must combine query relevance assessment, central content identification, and redundancy avoidance. Frustratingly, state of the art algorithms designed for QFS do not significantly improve upon generic summarization ...
Tal Baumel   +2 more
openalex   +3 more sources

Unsupervised Text Topic-Related Gene Extraction for Large Unbalanced Datasets

open access: yes, Tehnički Vjesnik, 2020
There is a common notion that traditional unsupervised feature extraction algorithms follow the assumption that the distribution of the different clusters in a dataset is balanced.
Li Jing-Ming   +5 more
doaj   +3 more sources

Topic-driven Clustering for Document Datasets [PDF]

open access: green, Proceedings of the 2005 SIAM International Conference on Data Mining, 2005
Ying Zhao, George Karypis
openalex   +3 more sources

Topic selection for text classification using ensemble topic modeling with grouping, scoring, and modeling approach [PDF]

open access: yes, Scientific Reports
TextNetTopics (Yousef et al. in Front Genet 13:893378, 2022. https://doi.org/10.3389/fgene.2022.893378 ) is a recently developed approach that performs text classification-based topics (a topic is a group of terms or words) extracted from a Latent ...
Daniel Voskergian   +2 more
doaj   +2 more sources

Interactive Tool for Visualization of Topic Models [PDF]

open access: yes, Acta Electrotechnica et Informatica, 2019
Digital data are all around us and occur in various forms, such as videos, pictures, or texts. Digital documents represent the vast majority of such data. They can be e-news, social media contributions, and so on.
Miroslav SMATANA   +3 more
doaj   +1 more source

Lessons Learned: It Takes a Village to Understand Inter-Sectoral Care Using Administrative Data across Jurisdictions

open access: yes, International Journal of Population Data Science, 2018
Cancer care is complex and exists within the broader healthcare system. The CanIMPACT team sought to enhance primary cancer care capacity and improve integration between primary and cancer specialist care, focusing on breast cancer.
Patti Ann Groome   +7 more
doaj   +1 more source
