The Optimal Machine Learning-Based Missing Data Imputation for the Cox Proportional Hazard Model
An adequate imputation of missing data would significantly preserve the statistical power and avoid erroneous conclusions. In the era of big data, machine learning is a great tool to infer the missing values.
Chao-Yu Guo +4 more
doaj +1 more source
Comparison of regression imputation methods of baseline covariates that predict survival outcomes
Introduction: Missing data are inevitable in medical research and appropriate handling of missing data is critical for statistical estimation and making inferences.
Nicole Solomon +2 more
doaj +1 more source
DIST: direct imputation of summary statistics for unmeasured SNPs [PDF]
Motivation: Genotype imputation methods are used to enhance the resolution of genome-wide association studies, and thus increase the detection rate for genetic signals. Although most studies report all univariate summary statistics, many of them limit the access to subject-level genotypes.
Lee, Donghyung +4 more
openaire +2 more sources
Multiple Imputation to Balance Unbalanced Designs for Two-Way Analysis of Variance
A balanced ANOVA design provides an unambiguous interpretation of the F-tests, and has more power than an unbalanced design. In earlier literature, multiple imputation was proposed to create balance in unbalanced designs, as an alternative to Type-III ...
Joost R. van Ginkel +1 more
doaj +1 more source
DISTMIX: direct imputation of summary statistics for unmeasured SNPs from mixed ethnicity cohorts [PDF]
Motivation: To increase the signal resolution for large-scale meta-analyses of genome-wide association studies, genotypes at unmeasured single nucleotide polymorphisms (SNPs) are commonly imputed using large multi-ethnic reference panels.
Ayman H. Fanous +10 more
core +3 more sources
A New Statistic to Evaluate Imputation Reliability
As the amount of data from genome-wide association studies grows dramatically, many interesting scientific questions require imputation to combine or expand datasets. However, there are two situations for which imputation has been problematic: (1) polymorphisms with low minor allele frequency (MAF), and (2) datasets where subjects are genotyped on ...
Lin, Peng +11 more
openaire +4 more sources
Evaluation and application of summary statistic imputation to discover new height-associated loci.
As most of the heritability of complex traits is attributed to common and low frequency genetic variants, imputing them by combining genotyping chips and large sequenced reference panels is the most cost-effective approach to discover the genetic basis ...
Sina Rüeger +2 more
doaj +1 more source
Transposable regularized covariance models with an application to missing data imputation [PDF]
Missing data estimation is an important challenge with high-dimensional data arranged in the form of a matrix. Typically this data matrix is transposable, meaning that either the rows, columns or both can be treated as features.
Allen, Genevera I., Tibshirani, Robert
core +1 more source
Quality Assessment of Imputations in Administrative Data [PDF]
This article contributes a framework for the quality assessment of imputations within a broader structure to evaluate the quality of register-based data.
Astleithner, Franz +5 more
core +1 more source
A Comparative Study of Various Methods for Handling Missing Data in UNSODA
UNSODA, a free international soil database, is very popular and has been used in many fields. However, missing soil property data have limited the utility of this dataset, especially for data-driven models. Here, three machine learning-based methods, i.e.
Yingpeng Fu, Hongjian Liao, Longlong Lv
doaj +1 more source

