An efficient sampling strategy for selection of biobank samples using risk scores

Björk, Jonas; Malmqvist, Ebba; Rylander, Lars; Rignell-Hydbom, Anna

Björk, Jonas; Malmqvist, Ebba; Rylander, Lars and Rignell-Hydbom, Anna (2017) In Scandinavian Journal of Public Health 45(17_suppl). p.41-44

Abstract

Aim: The aim of this study was to suggest a new sample-selection strategy based on risk scores in case-control studies with biobank data.

Methods: An ongoing Swedish case-control study on fetal exposure to endocrine disruptors and overweight in early childhood was used as the empirical example. Cases were defined as children with a body mass index (BMI) ≥18 kg/m² (n=545) at four years of age, and controls as children with a BMI of ≤17 kg/m² (n=4472 available). The risk of being overweight was modelled using logistic regression based on available covariates from the health examination and prior to selecting samples from the biobank. A risk score was estimated for each child and categorised as low (0-5%), medium (6-13%) or high (≥14%) risk of being overweight.

Results: The final risk-score model, with smoking during pregnancy (p=0.001), birth weight (p<0.001), BMI of both parents (p<0.001 for both), type of residence (p=0.04) and economic situation (p=0.12), yielded an area under the receiver operating characteristic curve of 67% (n=3945 with complete data). The case group (n=416) had the following risk-score profile: low (12%), medium (46%) and high risk (43%). Twice as many controls were selected from each risk group, with further matching on sex. Computer simulations showed that the proposed selection strategy with stratification on risk scores yielded consistent improvements in statistical precision.

Conclusions: Using risk scores based on available survey or register data as a basis for sample selection may improve possibilities to study heterogeneity of exposure effects in biobank-based studies.
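
The selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that a risk score (a predicted probability of overweight, for example from a logistic regression) is already available for every child, that the category boundaries fall at 0.06 and 0.14, and that "matching on sex" means frequency matching within each risk group. The record keys `id`, `risk_score` and `sex` and the function names are hypothetical.

```python
import random
from collections import Counter, defaultdict


def risk_group(score):
    """Map a predicted probability of overweight to a risk stratum.

    The abstract gives the categories as 0-5%, 6-13% and >=14%;
    the exact handling of boundary values here is an assumption.
    """
    if score < 0.06:
        return "low"
    if score < 0.14:
        return "medium"
    return "high"


def select_controls(cases, controls, ratio=2, seed=0):
    """Frequency-match controls to cases on (risk group, sex).

    Each person is a dict with the hypothetical keys
    "id", "risk_score" and "sex". Returns the selected controls.
    """
    rng = random.Random(seed)

    # Count the cases in each (risk group, sex) cell.
    case_counts = Counter(
        (risk_group(c["risk_score"]), c["sex"]) for c in cases
    )

    # Pool the available controls by the same cells.
    pools = defaultdict(list)
    for c in controls:
        pools[(risk_group(c["risk_score"]), c["sex"])].append(c)

    selected = []
    for cell, n_cases in sorted(case_counts.items()):
        pool = pools[cell]
        # Draw `ratio` controls per case, or the whole pool if it is too small.
        k = min(ratio * n_cases, len(pool))
        selected.extend(rng.sample(pool, k))
    return selected
```

Because controls are drawn in proportion to the cases in each cell, the selected control group mirrors the risk-score profile of the case group rather than that of the full cohort, in which most children would be expected to fall in the lower risk groups.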

Please use this url to cite or link to this publication: https://lup.lub.lu.se/record/9c8faac0-19d7-4a26-8242-149f93046f33

author

Björk, Jonas; Malmqvist, Ebba; Rylander, Lars and Rignell-Hydbom, Anna

publishing date

2017-07-01

type

Contribution to journal

publication status

published

subject

Health Sciences

keywords

case-control study, childhood obesity, Endocrine disruptors, overweight, propensity score, research design

in

Scandinavian Journal of Public Health

volume

45

issue

17_suppl

pages

4 pages

publisher

SAGE Publications

external identifiers

pmid:28683661
wos:000405007800008
scopus:85022192692

ISSN

1403-4948

DOI

10.1177/1403494817702329

language

English

LU publication?

yes

id

9c8faac0-19d7-4a26-8242-149f93046f33

date added to LUP

2017-07-25 14:19:37

date last changed

2026-03-18 10:55:09

@article{9c8faac0-19d7-4a26-8242-149f93046f33,
  abstract     = {{Aim: The aim of this study was to suggest a new sample-selection strategy based on risk scores in case-control studies with biobank data. Methods: An ongoing Swedish case-control study on fetal exposure to endocrine disruptors and overweight in early childhood was used as the empirical example. Cases were defined as children with a body mass index (BMI) ≥18 kg/m² (n=545) at four years of age, and controls as children with a BMI of ≤17 kg/m² (n=4472 available). The risk of being overweight was modelled using logistic regression based on available covariates from the health examination and prior to selecting samples from the biobank. A risk score was estimated for each child and categorised as low (0-5%), medium (6-13%) or high (≥14%) risk of being overweight. Results: The final risk-score model, with smoking during pregnancy (p=0.001), birth weight (p<0.001), BMI of both parents (p<0.001 for both), type of residence (p=0.04) and economic situation (p=0.12), yielded an area under the receiver operating characteristic curve of 67% (n=3945 with complete data). The case group (n=416) had the following risk-score profile: low (12%), medium (46%) and high risk (43%). Twice as many controls were selected from each risk group, with further matching on sex. Computer simulations showed that the proposed selection strategy with stratification on risk scores yielded consistent improvements in statistical precision. Conclusions: Using risk scores based on available survey or register data as a basis for sample selection may improve possibilities to study heterogeneity of exposure effects in biobank-based studies.}},
  author       = {{Björk, Jonas and Malmqvist, Ebba and Rylander, Lars and Rignell-Hydbom, Anna}},
  issn         = {{1403-4948}},
  keywords     = {{case-control study; childhood obesity; Endocrine disruptors; overweight; propensity score; research design}},
  language     = {{eng}},
  month        = {{07}},
  number       = {{17_suppl}},
  pages        = {{41--44}},
  publisher    = {{SAGE Publications}},
  journal      = {{Scandinavian Journal of Public Health}},
  title        = {{An efficient sampling strategy for selection of biobank samples using risk scores}},
  url          = {{http://dx.doi.org/10.1177/1403494817702329}},
  doi          = {{10.1177/1403494817702329}},
  volume       = {{45}},
  year         = {{2017}},
}

Lund University Publications
