Can k-NN imputation improve the performance of C4.5 with small software project data sets? A comparative evaluation
Missing data is a widespread problem that can affect the ability to use data to construct effective prediction systems. We investigate a common machine learning technique that can tolerate missing values, namely C4.5, to predict cost using six real world
Albrecht +60 more
core +1 more source
MissForest - nonparametric missing value imputation for mixed-type data
Modern data acquisition based on high-throughput technology often faces the problem of missing data. Algorithms commonly used in the analysis of such large-scale data often depend on a complete set.
D. J. Stekhoven +11 more
core +1 more source
Reuse of imputed data in microarray analysis increases imputation efficiency
Background: The imputation of missing values is necessary for the efficient use of DNA microarray data, because many clustering algorithms and some statistical analyses require a complete data set.
Ki-Yeol Kim, Byoung-Jin Kim, Gwan-Su Yi
openaire +4 more sources
Evaluating the state of the art in missing data imputation for clinical data
Clinical data are increasingly being mined to derive new medical knowledge with a goal of enabling greater diagnostic precision, better-personalized therapeutic regimens, improved clinical outcomes and more efficient utilization of health-care resources.
Yuan Luo
semanticscholar +1 more source
Context-Aware Data Imputation: Application of Domain-Agnostic Deep Imputation Network
Data imputation (DI) is a crucial task to manage missing data across different domains, such as healthcare and finance. Traditional imputation methods often fail to account for contextual nuances within specific domains due to the heterogeneity of data ...
Mohammed Gh. Al Zamil +1 more
doaj +1 more source
Fast and accurate imputation of summary statistics enhances evidence of functional enrichment
Imputation using external reference panels is a widely used approach for increasing power in GWAS and meta-analysis. Existing HMM-based imputation approaches require individual-level genotypes.
Bhatia, Gaurav +9 more
core +2 more sources
Nearest neighbours in least-squares data imputation algorithms with different missing patterns
Methods for imputation of missing data in the so-called least-squares approximation approach, a non-parametric computationally efficient multidimensional technique, are experimentally compared.
Atkeson +30 more
core +1 more source
Background: Commercial physical activity monitors have wide utility in the assessment of physical activity in research and clinical settings; however, the removal of devices results in missing data and has the potential to bias study conclusions.
R O'Driscoll +8 more
doaj +1 more source
Integration of survey data and big observational data for finite population inference using mass imputation
Multiple data sources are becoming increasingly available for statistical analyses in the era of big data. As an important example in finite-population inference, we consider an imputation approach to combining a probability sample with big observational
Kim, Jae Kwang +2 more
core +3 more sources
This work identified serum proteins associated with pancreatic epithelial neoplasms (PanINs) and early‐stage PDAC. Proteomics screens assessed genetically engineered mice with abundant PanINs, KPC mice (Lox‐STOP‐Lox‐KrasG12D/+ Lox‐STOP‐Lox‐Trp53R172H/+ Pdx1‐Cre) before PDAC development and also early‐stage PDAC patients (n = 31), compared to benign ...
Hannah Mearns +10 more
wiley +1 more source

