An efficient sampling strategy for selection of biobank samples using risk scores

Björk, Jonas LU ; Malmqvist, Ebba; Rylander, Lars LU and Rignell-Hydbom, Anna LU (2017) In Scandinavian Journal of Public Health 45(17_suppl). p.41-44
Abstract

Aim: The aim of this study was to suggest a new sample-selection strategy based on risk scores in case-control studies with biobank data. Methods: An ongoing Swedish case-control study on fetal exposure to endocrine disruptors and overweight in early childhood was used as the empirical example. Cases were defined as children with a body mass index (BMI) ≥18 kg/m² (n=545) at four years of age, and controls as children with a BMI of ≤17 kg/m² (n=4472 available). The risk of being overweight was modelled using logistic regression based on available covariates from the health examination and prior to selecting samples from the biobank. A risk score was estimated for each child and categorised as low (0-5%), medium (6-13%) or high (≥14%) risk of being overweight. Results: The final risk-score model, with smoking during pregnancy (p=0.001), birth weight (p<0.001), BMI of both parents (p<0.001 for both), type of residence (p=0.04) and economic situation (p=0.12), yielded an area under the receiver operating characteristic curve of 67% (n=3945 with complete data). The case group (n=416) had the following risk-score profile: low (12%), medium (46%) and high risk (43%). Twice as many controls were selected from each risk group, with further matching on sex. Computer simulations showed that the proposed selection strategy with stratification on risk scores yielded consistent improvements in statistical precision. Conclusions: Using risk scores based on available survey or register data as a basis for sample selection may improve possibilities to study heterogeneity of exposure effects in biobank-based studies.
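
The selection step described in the abstract could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the risk scores are taken as already estimated (the logistic regression itself is not shown), the record keys `id`, `risk` and `sex` are hypothetical, scores are assumed to be rounded to whole percent before categorisation, and "matching on sex" is interpreted here as drawing controls within each combination of risk group and sex.

```python
import random
from collections import Counter


def risk_group(score):
    """Map a predicted risk (0..1) to the categories named in the abstract."""
    pct = round(score * 100)  # assumption: cut-offs apply to whole percentages
    if pct <= 5:
        return "low"
    if pct <= 13:
        return "medium"
    return "high"


def select_controls(cases, controls, ratio=2, seed=0):
    """Draw `ratio` controls per case within each (risk group, sex) stratum.

    `cases` and `controls` are lists of dicts with the hypothetical keys
    'id', 'risk' (predicted probability of overweight) and 'sex'.
    """
    rng = random.Random(seed)

    # Number of cases in each stratum determines how many controls to draw.
    need = Counter((risk_group(c["risk"]), c["sex"]) for c in cases)

    # Group the available controls by the same strata.
    pool = {}
    for c in controls:
        pool.setdefault((risk_group(c["risk"]), c["sex"]), []).append(c)

    selected = []
    for stratum, n_cases in need.items():
        available = pool.get(stratum, [])
        # Cap the draw if a stratum has too few available controls.
        k = min(ratio * n_cases, len(available))
        selected.extend(rng.sample(available, k))
    return selected
```

With `ratio=2` the selected control group mirrors the risk-score profile of the cases at twice the size, which is the property the abstract attributes to the proposed strategy.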

type
Contribution to journal
publication status
published
keywords
case-control study, childhood obesity, Endocrine disruptors, overweight, propensity score, research design
in
Scandinavian Journal of Public Health
volume
45
issue
17_suppl
pages
4 pages
publisher
Taylor & Francis
external identifiers
  • scopus:85022192692
  • wos:000405007800008
ISSN
1403-4948
DOI
10.1177/1403494817702329
language
English
LU publication?
yes
id
9c8faac0-19d7-4a26-8242-149f93046f33
date added to LUP
2017-07-25 14:19:37
date last changed
2017-09-18 11:39:20
@article{9c8faac0-19d7-4a26-8242-149f93046f33,
  abstract     = {Aim: The aim of this study was to suggest a new sample-selection strategy based on risk scores in case-control studies with biobank data. Methods: An ongoing Swedish case-control study on fetal exposure to endocrine disruptors and overweight in early childhood was used as the empirical example. Cases were defined as children with a body mass index (BMI) ≥18 kg/m² (n=545) at four years of age, and controls as children with a BMI of ≤17 kg/m² (n=4472 available). The risk of being overweight was modelled using logistic regression based on available covariates from the health examination and prior to selecting samples from the biobank. A risk score was estimated for each child and categorised as low (0-5%), medium (6-13%) or high (≥14%) risk of being overweight. Results: The final risk-score model, with smoking during pregnancy (p=0.001), birth weight (p<0.001), BMI of both parents (p<0.001 for both), type of residence (p=0.04) and economic situation (p=0.12), yielded an area under the receiver operating characteristic curve of 67% (n=3945 with complete data). The case group (n=416) had the following risk-score profile: low (12%), medium (46%) and high risk (43%). Twice as many controls were selected from each risk group, with further matching on sex. Computer simulations showed that the proposed selection strategy with stratification on risk scores yielded consistent improvements in statistical precision. Conclusions: Using risk scores based on available survey or register data as a basis for sample selection may improve possibilities to study heterogeneity of exposure effects in biobank-based studies.},
  author       = {Björk, Jonas and Malmqvist, Ebba and Rylander, Lars and Rignell-Hydbom, Anna},
  issn         = {1403-4948},
  keyword      = {case-control study,childhood obesity,Endocrine disruptors,overweight,propensity score,research design},
  language     = {eng},
  month        = {07},
  number       = {17_suppl},
  pages        = {41--44},
  publisher    = {Taylor \& Francis},
  journal      = {Scandinavian Journal of Public Health},
  title        = {An efficient sampling strategy for selection of biobank samples using risk scores},
  url          = {http://dx.doi.org/10.1177/1403494817702329},
  volume       = {45},
  year         = {2017},
}