Genetic Association Studies

Chapter in: Simultaneous Statistical Inference

Abstract

In genetic association studies, one analyzes associations between a (potentially very large) set of genetic markers and a phenotype of interest. This is a particular multiple test problem with several challenging aspects, for instance the high dimensionality of the statistical parameter and the discreteness of the statistical model. In this chapter, we discuss how to fine-tune the multiple tests described theoretically in Part I in order to address these challenges. In particular, we propose the use of realized randomized \(p\)-values in data-adaptive multiple tests, and we show how linkage disequilibrium among genetic markers can be employed to construct simultaneous test procedures and to establish probability bounds that lead to effective numbers of tests. Finally, we analyze (positive) dependency properties among test statistics and the applicability of standard margin-based multiple tests. The methods are applied to two real-life datasets.
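
To make the notion of a realized randomized \(p\)-value concrete, the following is a minimal sketch and not the chapter's own implementation. It assumes a one-sided (upper-tail) test with a binomially distributed statistic, which is an illustrative choice; the chapter itself concerns contingency-table tests for genetic markers. The function names `binom_pmf`, `randomized_p_value` and `conservative_p_value` are hypothetical. The textbook construction used here is \(P(T > t) + u \cdot P(T = t)\), where \(u\) is a realization of a uniform random variable on \([0, 1]\) drawn independently of the data.

```python
from math import comb
import random


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability that a Binomial(n, p) variable equals k."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def randomized_p_value(t: int, n: int, p0: float, u: float) -> float:
    """Upper-tail realized randomized p-value for T ~ Binomial(n, p0).

    Assumption for illustration: large values of T speak against the null.
    u is a realization of Uniform(0, 1), drawn independently of the data.
    """
    if not 0.0 <= u <= 1.0:
        raise ValueError("u must lie in [0, 1]")
    # Probability of strictly more extreme outcomes than the observed t.
    strict_tail = sum(binom_pmf(k, n, p0) for k in range(t + 1, n + 1))
    # Add a randomly chosen fraction of the point mass at t.
    return strict_tail + u * binom_pmf(t, n, p0)


def conservative_p_value(t: int, n: int, p0: float) -> float:
    """Usual non-randomized p-value P(T >= t); equals the case u = 1."""
    return randomized_p_value(t, n, p0, 1.0)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so that the draw is reproducible
    u = rng.random()
    print(randomized_p_value(8, 10, 0.5, u), conservative_p_value(8, 10, 0.5))
```

The randomized version is never larger than the non-randomized one and is exactly uniformly distributed under the null hypothesis. This is what makes it attractive for data-adaptive procedures, which are sensitive to the conservativeness that discrete models induce.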

Acknowledgments

This chapter makes use of data generated by the Wellcome Trust Case Control Consortium. A full list of the investigators who contributed to the generation of the data is available from http://www.wtccc.org.uk. Funding for the Wellcome Trust Case Control Consortium project was provided by the Wellcome Trust under award 076113. Parts of this chapter originated from joint work with Klaus Straßburger, Daniel Schunk, Carlos Morcillo-Suarez, Thomas Illig, Arcadi Navarro and Jens Stange. I am grateful to Mette Langaas and Øyvind Bakke for inviting me to the Norwegian University of Science and Technology (NTNU) and for their hospitality during my visit, for many fruitful discussions, and for some valuable references.

Author information

Corresponding author

Correspondence to Thorsten Dickhaus.

Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Dickhaus, T. (2014). Genetic Association Studies. In: Simultaneous Statistical Inference. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-45182-9_9
