Some of the following articles may not be open access.
Datasets for Large Language Models: A Comprehensive Survey
arXiv.org
This paper embarks on an exploration into the Large Language Model (LLM) datasets, which play a crucial role in the remarkable advancements of LLMs.
Yang Liu +4 more
semanticscholar +1 more source
Data Aggregation Of Tweets And Topic Modelling Based On The Twitter Dataset
2021 the 3rd International Conference on Big Data Engineering and Technology (BDET), 2021
Twitter is one of the most popular online social networks. It has a relatively simple data model and an intuitive API to access Twitter data. This makes it easy to collect social data and analyse the patterns of online behaviour. Twitter has an impactful presence among politicians, entrepreneurs, news agencies, public figures, and this makes it a ...
Vignesh Srinivasan +1 more
openaire +1 more source
Topic Discovery and Topic-Driven Clustering for Audit Method Datasets
2011
As the promotion of China's Golden Auditing Project and the fast growth of on-line auditing, there are thousands of new computer audit methods emerged every year to fulfill various needs of audit practices. How to organize these existing computer audit methods and use them intelligently have become a fundamental and challenging problem.
Ying Zhao, Wanyu Fu, Shaobin Huang
openaire +1 more source
Retos
Introduction: Predicting swimming success in competitive sports, primarily the outcomes of forthcoming Olympic swimming competitions. Objective: This paper provides an extensive and systematic review of research in swimming performance prediction ...
Ari Tri Fitrianto +3 more
semanticscholar +1 more source
MUSED: A multimedia multi-document dataset for topic segmentation
Natural Language Engineering, 2018
Research on topic segmentation has recently focused on segmenting documents by taking advantage of documents covering the same topics. In order to properly evaluate such approaches, a dataset of related documents is needed. However, existing datasets are limited in the number of related documents per domain.
Pedro Mota +2 more
openaire +1 more source
Topic Summary Views for Exploration of Large Scholarly Datasets
Journal on Data Semantics, 2018
In this article, we present the E-sch approach for exploration of large scholarly datasets based on topic summary views. The goal of E-sch is to semantically summarize the dataset related to a potentially very large number of scholar publications (e.g., millions) by a list of few thousands topics, up to an ultimate list of hundreds of topic summaries ...
S. Castano, A. Ferrara, S. Montanelli
openaire +2 more sources
Topic-Centric Recommender Systems for Bibliographic Datasets
2012
In this paper, we introduce a novel and efficient approach for Recommender Systems in the academic world. With the world of academia growing at a tremendous rate, we have an enormous number of researchers working on hosts of research topics. Providing personalized recommendations to a researcher that could assist him in expanding his research base is ...
Aditya Pratap Singh +2 more
openaire +1 more source
Topic Modeling To Contextualize Event-Based Datasets
Proceedings of the 2nd ACM SIGSPATIAL International Workshop on Advances on Resilient and Intelligent Cities, 2019
Colombia suffered civil conflict for over five decades resulting in thousands of deaths and kidnappings and millions of displaced citizens. A peace process between the government and the Revolutionary Armed Forces of Colombia (FARC) was negotiated in 2016.
Ashlynn R. Daughton +3 more
openaire +1 more source
IEEE Computational Intelligence Magazine, 2018
Although cross-validation is a standard procedure for performance evaluation, its joint application with oversampling remains an open question for researchers farther from the imbalanced data topic.
M. Santos +4 more
semanticscholar +1 more source
Experimental explorations on short text topic mining between LDA and NMF based Schemes
Knowledge-Based Systems, 2019
Learning topics from short texts has become a critical and fundamental task for understanding the widely-spread streaming social messages, e.g., tweets, snippets and questions/answers.
Yong Chen +4 more
semanticscholar +1 more source

