Clustering Approaches for Mixed-Type Data: A Comparative Study [PDF]
Clustering is widely used in unsupervised learning to find homogeneous groups of observations within a dataset. However, clustering mixed-type data remains a challenge, as few existing approaches are suited for this task. This study presents the state-of-
Badih Ghattas, Alvaro Sanchez San-Benito
doaj +5 more sources
Spectral Clustering of Mixed-Type Data [PDF]
Cluster analysis seeks to assign objects with similar characteristics into groups called clusters so that objects within a group are similar to each other and dissimilar to objects in other groups.
Felix Mbuga, Cristina Tortora
doaj +3 more sources
Missing-Values Adjustment for Mixed-Type Data [PDF]
We propose a new method of single imputation, reconstruction, and estimation of nonreported, incorrect, implausible, or excluded values in more than one field of the record.
Agostino Tarsitano, Marianna Falcone
doaj +3 more sources
Holdout-Based Empirical Assessment of Mixed-Type Synthetic Data [PDF]
AI-based data synthesis has seen rapid progress over the last several years and is increasingly recognized for its promise to enable privacy-respecting high-fidelity data sharing.
Michael Platzer, Thomas Reutterer
doaj +4 more sources
Bayesian nonparametric models for spatially indexed data of mixed type [PDF]
We develop Bayesian nonparametric models for spatially indexed data of mixed type. Our work is motivated by challenges that occur in environmental epidemiology, where the usual presence of several confounding variables that exhibit complex interactions ...
Γεώργιος Παπαγεωργίου +2 more
openalex +4 more sources
MissForest - nonparametric missing value imputation for mixed-type data [PDF]
Modern data acquisition based on high-throughput technology is often facing the problem of missing data. Algorithms commonly used in the analysis of such large-scale data often depend on a complete set.
D. J. Stekhoven +11 more
core +5 more sources
DAGSLAM: causal Bayesian network structure learning of mixed type data and its application in identifying disease risk factors [PDF]
Background: Identifying and understanding disease risk factors is crucial in epidemiology, particularly for chronic and noncommunicable diseases that often have complex interrelationships.
Yuanyuan Zhao, Jinzhu Jia
doaj +2 more sources
Clustering Mixed-Type Data Using a Probabilistic Distance Algorithm
Cluster analysis is a broadly used unsupervised data analysis technique for finding groups of homogeneous units in a data set. Probabilistic distance clustering adjusted for cluster size (PDQ), discussed in this contribution, falls within the broad category of clustering methods initially developed to deal with continuous data; it has the advantage of ...
Cristina Tortora, Francesco Palumbo
openalex +3 more sources
Clustering of Mixed-Type Data Considering Concept Hierarchies
Most clustering algorithms have been designed only for purely numerical or purely categorical data sets, while many applications now generate mixed data. This raises the question of how to integrate various types of attributes so that objects can be grouped efficiently without loss of information.
Sahar Behzadi +3 more
openalex +5 more sources
Cluster Validation for Mixed-Type Data
For cluster analysis based on mixed-type data (i.e. data consisting of numerical and categorical variables), comparatively few clustering methods are available. One popular approach to this kind of problem is an extension of the k-means algorithm (Huang, 1998), the so-called k-prototype algorithm, which is implemented in the R package ...
Rabea Aschenbruck, Gero Szepannek
openalex +3 more sources

