Datasets as topic - Open Access .click

Some of the following articles may not be open access.

Datasets for Large Language Models: A Comprehensive Survey

arXiv.org
This paper embarks on an exploration into the Large Language Model (LLM) datasets, which play a crucial role in the remarkable advancements of LLMs.
Yang Liu +4 more
semanticscholar +1 more source

Data Aggregation Of Tweets And Topic Modelling Based On The Twitter Dataset

2021 the 3rd International Conference on Big Data Engineering and Technology (BDET), 2021
Twitter is one of the most popular online social networks. It has a relatively simple data model and an intuitive API to access Twitter data. This makes it easy to collect social data and analyse the patterns of online behaviour. Twitter has an impactful presence among politicians, entrepreneurs, news agencies, public figures, and this makes it a ...
Vignesh Srinivasan, K. Chandrasekaran +1 more
openaire +1 more source

Topic Discovery and Topic-Driven Clustering for Audit Method Datasets

2011
With the promotion of China's Golden Auditing Project and the fast growth of online auditing, thousands of new computer audit methods emerge every year to meet the various needs of audit practice. How to organize these existing computer audit methods and use them intelligently has become a fundamental and challenging problem.
Ying Zhao, Wanyu Fu, Shaobin Huang
openaire +1 more source

A systematic literature review of swimming performance prediction: methods, datasets, techniques and research trends

Retos
Introduction: Predicting swimming success in competitive sports, primarily the outcomes of forthcoming Olympic swimming competitions. Objective: This paper provides an extensive and systematic review of research in swimming performance prediction ...
Ari Tri Fitrianto +3 more
semanticscholar +1 more source

MUSED: A multimedia multi-document dataset for topic segmentation

Natural Language Engineering, 2018
Research on topic segmentation has recently focused on segmenting documents by taking advantage of documents covering the same topics. In order to properly evaluate such approaches, a dataset of related documents is needed. However, existing datasets are limited in the number of related documents per domain.
Pedro Mota, Maxine Eskénazi, Luísa Coheur +2 more
openaire +1 more source

Topic Summary Views for Exploration of Large Scholarly Datasets

Journal on Data Semantics, 2018
In this article, we present the E-sch approach for exploration of large scholarly datasets based on topic summary views. The goal of E-sch is to semantically summarize the dataset related to a potentially very large number of scholarly publications (e.g., millions) by a list of a few thousand topics, up to an ultimate list of hundreds of topic summaries ...
S. Castano, A. Ferrara, S. Montanelli
openaire +2 more sources

Topic-Centric Recommender Systems for Bibliographic Datasets

2012
In this paper, we introduce a novel and efficient approach for Recommender Systems in the academic world. With the world of academia growing at a tremendous rate, we have an enormous number of researchers working on hosts of research topics. Providing personalized recommendations to a researcher that could assist him in expanding his research base is ...
Aditya Pratap Singh, Kumar Shubhankar, Vikram Pudi +2 more
openaire +1 more source

Topic Modeling To Contextualize Event-Based Datasets

Proceedings of the 2nd ACM SIGSPATIAL International Workshop on Advances on Resilient and Intelligent Cities, 2019
Colombia suffered civil conflict for over five decades resulting in thousands of deaths and kidnappings and millions of displaced citizens. A peace process between the government and the Revolutionary Armed Forces of Colombia (FARC) was negotiated in 2016.
Ashlynn R. Daughton +3 more
openaire +1 more source

Cross-Validation for Imbalanced Datasets: Avoiding Overoptimistic and Overfitting Approaches [Research Frontier]

IEEE Computational Intelligence Magazine, 2018
Although cross-validation is a standard procedure for performance evaluation, its joint application with oversampling remains an open question for researchers farther from the imbalanced data topic.
M. Santos +4 more
semanticscholar +1 more source

Experimental explorations on short text topic mining between LDA and NMF based Schemes

Knowledge-Based Systems, 2019
Learning topics from short texts has become a critical and fundamental task for understanding the widely-spread streaming social messages, e.g., tweets, snippets and questions/answers.
Yong Chen +4 more
semanticscholar +1 more source
