Informed Conditioning on Clinical Covariates Increases Power in Case-Control Association Studies

Zaitlen, Noah; Lindstroem, Sara; Pasaniuc, Bogdan; Cornelis, Marilyn; Genovese, Giulio; Pollack, Samuela; Barton, Anne; Bickeboeller, Heike; Bowden, Donald W.; Eyre, Steve, et al. (2012) In PLoS Genetics 8(11).
Abstract
Genetic case-control association studies often include data on clinical covariates, such as body mass index (BMI), smoking status, or age, that may modify the underlying genetic risk of case or control samples. For example, in type 2 diabetes, odds ratios for established variants estimated from low-BMI cases are larger than those estimated from high-BMI cases. An unanswered question is how to use this information to maximize statistical power in case-control studies that ascertain individuals on the basis of phenotype (case-control ascertainment) or phenotype and clinical covariates (case-control-covariate ascertainment). While current approaches improve power in studies with random ascertainment, they often lose power under case-control ascertainment and fail to capture available power increases under case-control-covariate ascertainment. We show that an informed conditioning approach, based on the liability threshold model with parameters informed by external epidemiological information, fully accounts for disease prevalence and non-random ascertainment of phenotype as well as covariates and provides a substantial increase in power while maintaining a properly controlled false-positive rate. Our method outperforms standard case-control association tests with or without covariates, tests of gene × covariate interaction, and previously proposed tests for dealing with covariates in ascertained data, with especially large improvements in the case of case-control-covariate ascertainment. We investigate empirical case-control studies of type 2 diabetes, prostate cancer, lung cancer, breast cancer, rheumatoid arthritis, age-related macular degeneration, and end-stage kidney disease over a total of 89,726 samples. In these datasets, informed conditioning outperforms logistic regression for 115 of the 157 known associated variants investigated (P-value = 1 × 10⁻⁹). The improvement varied across diseases with a 16% median increase in χ² test statistics and a commensurate increase in power. This suggests that applying our method to existing and future association studies of these diseases may identify novel disease loci.
type
Contribution to journal
publication status
published
in
PLoS Genetics
volume
8
issue
11
publisher
Public Library of Science
external identifiers
  • wos:000311891600020
  • scopus:84870718214
ISSN
1553-7404
DOI
10.1371/journal.pgen.1003032
language
English
LU publication?
yes
id
040d57c3-4c19-4bab-8cf4-6ce3342cc141 (old id 3577222)
date added to LUP
2013-04-02 07:43:48
date last changed
2017-09-17 03:24:09
@article{040d57c3-4c19-4bab-8cf4-6ce3342cc141,
  abstract     = {Genetic case-control association studies often include data on clinical covariates, such as body mass index (BMI), smoking status, or age, that may modify the underlying genetic risk of case or control samples. For example, in type 2 diabetes, odds ratios for established variants estimated from low-BMI cases are larger than those estimated from high-BMI cases. An unanswered question is how to use this information to maximize statistical power in case-control studies that ascertain individuals on the basis of phenotype (case-control ascertainment) or phenotype and clinical covariates (case-control-covariate ascertainment). While current approaches improve power in studies with random ascertainment, they often lose power under case-control ascertainment and fail to capture available power increases under case-control-covariate ascertainment. We show that an informed conditioning approach, based on the liability threshold model with parameters informed by external epidemiological information, fully accounts for disease prevalence and non-random ascertainment of phenotype as well as covariates and provides a substantial increase in power while maintaining a properly controlled false-positive rate. Our method outperforms standard case-control association tests with or without covariates, tests of gene $\times$ covariate interaction, and previously proposed tests for dealing with covariates in ascertained data, with especially large improvements in the case of case-control-covariate ascertainment. We investigate empirical case-control studies of type 2 diabetes, prostate cancer, lung cancer, breast cancer, rheumatoid arthritis, age-related macular degeneration, and end-stage kidney disease over a total of 89,726 samples. In these datasets, informed conditioning outperforms logistic regression for 115 of the 157 known associated variants investigated (P-value = $1 \times 10^{-9}$). The improvement varied across diseases with a 16\% median increase in $\chi^2$ test statistics and a commensurate increase in power. This suggests that applying our method to existing and future association studies of these diseases may identify novel disease loci.},
  author       = {Zaitlen, Noah and Lindstroem, Sara and Pasaniuc, Bogdan and Cornelis, Marilyn and Genovese, Giulio and Pollack, Samuela and Barton, Anne and Bickeboeller, Heike and Bowden, Donald W. and Eyre, Steve and Freedman, Barry I. and Friedman, David J. and Field, John K. and Groop, Leif and Haugen, Aage and Heinrich, Joachim and Henderson, Brian E. and Hicks, Pamela J. and Hocking, Lynne J. and Kolonel, Laurence N. and Landi, Maria Teresa and Langefeld, Carl D. and Le Marchand, Loic and Meister, Michael and Morgan, Ann W. and Raji, Olaide Y. and Risch, Angela and Rosenberger, Albert and Scherf, David and Steer, Sophia and Walshaw, Martin and Waters, Kevin M. and Wilson, Anthony G. and Wordsworth, Paul and Zienolddiny, Shanbeh and Tchetgen, Eric Tchetgen and Haiman, Christopher and Hunter, David J. and Plenge, Robert M. and Worthington, Jane and Christiani, David C. and Schaumberg, Debra A. and Chasman, Daniel I. and Altshuler, David and Voight, Benjamin and Kraft, Peter and Patterson, Nick and Price, Alkes L.},
  issn         = {1553-7404},
  language     = {eng},
  number       = {11},
  publisher    = {Public Library of Science},
  journal      = {PLoS Genetics},
  title        = {Informed Conditioning on Clinical Covariates Increases Power in Case-Control Association Studies},
  url          = {http://dx.doi.org/10.1371/journal.pgen.1003032},
  volume       = {8},
  year         = {2012},
}