An efficient sampling strategy for selection of biobank samples using risk scores
(2017) In Scandinavian Journal of Public Health 45(17_suppl). p.41-44
- Abstract
Aim: The aim of this study was to suggest a new sample-selection strategy based on risk scores in case-control studies with biobank data. Methods: An ongoing Swedish case-control study on fetal exposure to endocrine disruptors and overweight in early childhood was used as the empirical example. Cases were defined as children with a body mass index (BMI) ≥18 kg/m² (n=545) at four years of age, and controls as children with a BMI of ≤17 kg/m² (n=4472 available). The risk of being overweight was modelled using logistic regression based on available covariates from the health examination and prior to selecting samples from the biobank. A risk score was estimated for each child and categorised as low (0-5%), medium (6-13%) or high (≥14%) risk of being overweight. Results: The final risk-score model, with smoking during pregnancy (p=0.001), birth weight (p<0.001), BMI of both parents (p<0.001 for both), type of residence (p=0.04) and economic situation (p=0.12), yielded an area under the receiver operating characteristic curve of 67% (n=3945 with complete data). The case group (n=416) had the following risk-score profile: low (12%), medium (46%) and high risk (43%). Twice as many controls were selected from each risk group, with further matching on sex. Computer simulations showed that the proposed selection strategy with stratification on risk scores yielded consistent improvements in statistical precision. Conclusions: Using risk scores based on available survey or register data as a basis for sample selection may improve possibilities to study heterogeneity of exposure effects in biobank-based studies.
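The selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes that each child already has a predicted risk from some logistic model, that risks are rounded to whole percent before applying the 0-5%, 6-13% and ≥14% cut-offs, and that "matching on sex" means frequency matching within each risk group. The function names and the record keys `id`, `risk` and `sex` are hypothetical.

```python
import random
from collections import defaultdict


def risk_group(risk):
    """Map a predicted risk (0..1) to the categories named in the abstract."""
    pct = round(risk * 100)  # assumption: scores are rounded to whole percent
    if pct <= 5:
        return "low"
    if pct <= 13:
        return "medium"
    return "high"


def select_controls(cases, controls, ratio=2, seed=0):
    """Sample `ratio` controls per case within each (risk group, sex) stratum.

    `cases` and `controls` are lists of dicts with the hypothetical keys
    "id", "risk" and "sex".
    """
    rng = random.Random(seed)

    # Group the available controls by stratum.
    pool = defaultdict(list)
    for child in controls:
        pool[(risk_group(child["risk"]), child["sex"])].append(child)

    # Count how many controls each stratum should contribute.
    needed = defaultdict(int)
    for child in cases:
        needed[(risk_group(child["risk"]), child["sex"])] += ratio

    selected = []
    for stratum, n in needed.items():
        available = pool[stratum]
        # If a stratum holds fewer controls than requested, take all of them.
        selected.extend(rng.sample(available, min(n, len(available))))
    return selected
```

Sampling within strata in this way gives the selected controls roughly the same risk-score profile as the cases, which is the property the abstract attributes to the proposed strategy.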
- author
- Björk, Jonas LU ; Malmqvist, Ebba LU ; Rylander, Lars LU and Rignell-Hydbom, Anna LU
- organization
- publishing date
- 2017-07-01
- type
- Contribution to journal
- publication status
- published
- subject
- keywords
- case-control study, childhood obesity, Endocrine disruptors, overweight, propensity score, research design
- in
- Scandinavian Journal of Public Health
- volume
- 45
- issue
- 17_suppl
- pages
- 41-44 (4 pages)
- publisher
- SAGE Publications
- external identifiers
- pmid:28683661
- wos:000405007800008
- scopus:85022192692
- ISSN
- 1403-4948
- DOI
- 10.1177/1403494817702329
- language
- English
- LU publication?
- yes
- id
- 9c8faac0-19d7-4a26-8242-149f93046f33
- date added to LUP
- 2017-07-25 14:19:37
- date last changed
- 2025-03-14 02:17:23
@article{9c8faac0-19d7-4a26-8242-149f93046f33,
  abstract  = {{Aim: The aim of this study was to suggest a new sample-selection strategy based on risk scores in case-control studies with biobank data. Methods: An ongoing Swedish case-control study on fetal exposure to endocrine disruptors and overweight in early childhood was used as the empirical example. Cases were defined as children with a body mass index (BMI) ≥18 kg/m² (n=545) at four years of age, and controls as children with a BMI of ≤17 kg/m² (n=4472 available). The risk of being overweight was modelled using logistic regression based on available covariates from the health examination and prior to selecting samples from the biobank. A risk score was estimated for each child and categorised as low (0-5%), medium (6-13%) or high (≥14%) risk of being overweight. Results: The final risk-score model, with smoking during pregnancy (p=0.001), birth weight (p<0.001), BMI of both parents (p<0.001 for both), type of residence (p=0.04) and economic situation (p=0.12), yielded an area under the receiver operating characteristic curve of 67% (n=3945 with complete data). The case group (n=416) had the following risk-score profile: low (12%), medium (46%) and high risk (43%). Twice as many controls were selected from each risk group, with further matching on sex. Computer simulations showed that the proposed selection strategy with stratification on risk scores yielded consistent improvements in statistical precision. Conclusions: Using risk scores based on available survey or register data as a basis for sample selection may improve possibilities to study heterogeneity of exposure effects in biobank-based studies.}},
  author    = {Björk, Jonas and Malmqvist, Ebba and Rylander, Lars and Rignell-Hydbom, Anna},
  issn      = {{1403-4948}},
  journal   = {{Scandinavian Journal of Public Health}},
  keywords  = {{case-control study; childhood obesity; Endocrine disruptors; overweight; propensity score; research design}},
  language  = {{eng}},
  month     = {{07}},
  number    = {{17_suppl}},
  pages     = {{41--44}},
  publisher = {{SAGE Publications}},
  title     = {{An efficient sampling strategy for selection of biobank samples using risk scores}},
  url       = {{http://dx.doi.org/10.1177/1403494817702329}},
  doi       = {{10.1177/1403494817702329}},
  volume    = {{45}},
  year      = {{2017}}
}