Data clustering - Open Access .click


Some of the following articles may not be open access.

WIREs Data Mining and Knowledge Discovery, 2011
Mixture model clustering proceeds by fitting a finite mixture of multivariate distributions to data, the fitted mixture density then being used to allocate the data to one of the components. Common model formulations assume that either all the attributes are continuous or all the attributes are categorical.
Lynette A. Hunt, Murray A. Jorgensen
openaire +1 more source
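
The fit-then-allocate idea in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' method: it assumes univariate continuous data and exactly two Gaussian components, whereas the abstract concerns multivariate mixtures. The function names (`normal_pdf`, `fit_mixture`, `allocate`) and the initialisation are hypothetical choices made for this example.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_mixture(data, n_iter=50):
    """Fit a two-component 1-D Gaussian mixture by EM (illustrative assumption)."""
    means = [min(data), max(data)]  # crude initialisation at the data extremes
    variances = [1.0, 1.0]
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each point
        resp = []
        for x in data:
            dens = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(dens) or 1e-300  # guard against numerical underflow
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means and variances from the posteriors
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # leave an empty component unchanged
            weights[k] = nk / len(data)
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / nk
            variances[k] = max(var, 1e-6)  # floor keeps the density well defined
    return weights, means, variances


def allocate(x, weights, means, variances):
    """Assign x to the component with the largest weighted density."""
    dens = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
    return max(range(2), key=lambda k: dens[k])
```

Calling `fit_mixture` on a list of floats returns the estimated weights, means and variances, and `allocate` then assigns any point to the component with the highest posterior probability under the fitted mixture.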

Clustering for Data Matching

2006
One of the major bottlenecks in data matching is the rapid deterioration in speed and accuracy as the amount of data to be processed increases. One reason for this deterioration in performance is the cost incurred by data matching systems when comparing data records to determine their similarity (or dissimilarity ...
Edward Tersoo Apeh, Bogdan Gabrys

Inference for Clustered Data

The Stata Journal: Promoting communications on statistics and Stata, 2018
In this article, we introduce clusteff, a community-contributed command for checking the severity of cluster heterogeneity in cluster–robust analyses. Cluster heterogeneity can cause a size distortion leading to underrejection of the null hypothesis. Carter, Schnepel, and Steigerwald (2017, Review of Economics and Statistics 99: 698–709) develop the ...
Chang Hyung Lee, Douglas G. Steigerwald +1 more

Non-redundant data clustering

Knowledge and Information Systems, 2005
Data clustering is a popular approach for automatically finding classes, concepts, or groups of patterns. In practice, this discovery process should avoid redundancies with existing knowledge about class structures or groupings, and reveal novel, previously unknown aspects of the data.
David Gondek, Thomas Hofmann

Inference on Distributed Data Clustering

Engineering Applications of Artificial Intelligence, 2005
In this paper we address confidentiality issues in distributed data clustering, particularly the inference problem. We present a measure of inference risk as a function of reconstruction precision and number of colluders in a distributed data mining group.
Josenildo Costa da Silva, Matthias Klusch +1 more

Clusters in Aggregated Health Data

2008
Spatial information plays an important role in the identification of sources of outbreaks for many different health-related conditions. In the public health domain, as in many other domains, the available data is often aggregated into geographical regions, such as zip codes or municipalities.
Kevin Buchin +5 more

Overlapping Clustering for Textual Data

Proceedings of the 2018 7th International Conference on Software and Computer Applications, 2018
Texts overlap inherently, so overlapping clustering algorithms are more appropriate for clustering textual data. A major challenge is that these algorithms are very slow when clustering large volumes of textual data. OKM and OSOM are two important overlapping clustering algorithms. In this study, we have implemented and ...
Atefeh Khazaei +2 more

On Discrete Data Clustering

2008
Finite mixture modeling has been applied to a range of data mining tasks. Most of the work on finite mixture models has focused on mixtures for continuous data. However, many applications involve and generate discrete data, for which discrete mixtures are better suited.
Nizar Bouguila, Walid ElGuebaly

Multiscale Clustering for Functional Data

Journal of Classification, 2019
Yaeji Lim, Hee-Seok Oh, Ying-Kuen K. Cheung +2 more

A Clustering and Data-Reorganizing Algorithm

IEEE Transactions on Systems, Man, and Cybernetics, 1975
A clustering and data-reorganizing algorithm based on the concept of the shortest spanning path of a graph is given. This algorithm can be used to reorganize and/or cluster a large file of data.
James R. Slagle, Chin-Liang Chang, Stephen R. Heller +2 more

machine learning
unsupervised learning
k-means algorithm

particle swarm optimization