The value of combining individual and small area sociodemographic data for assessing and handling selective participation in cohort studies: Evidence from the Swedish CardioPulmonary bioImage Study
(2022) In PLoS ONE 17(3).
- Abstract
Objectives: To study the value of combining individual- and neighborhood-level sociodemographic data to predict study participation and assess the effects of baseline selection on the distribution of metabolic risk factors and lifestyle factors in the Swedish CardioPulmonary bioImage Study (SCAPIS).

Methods: We linked sociodemographic register data to SCAPIS participants (n = 30,154, ages: 50-64 years) and a random sample of the study's target population (n = 59,909). We assessed the classification ability of participation models based on individual-level data, neighborhood-level data, and combinations of both. Standardized mean differences (SMD) were used to examine how reweighting the sample to match the population affected the averages of 32 cardiopulmonary risk factors at baseline. Absolute SMDs >0.10 were considered meaningful.

Results: Combining both individual-level and neighborhood-level data gave rise to a model with better classification ability (AUC: 71.3%) than models with only individual-level (AUC: 66.9%) or neighborhood-level data (AUC: 65.5%). We observed a greater change in the distribution of risk factors when we reweighted the participants using both individual and area data. The only meaningful change was related to the (self-reported) frequency of alcohol consumption, which appears to be higher in the SCAPIS sample than in the population. The remaining risk factors did not change meaningfully.

Conclusions: Both individual- and neighborhood-level characteristics are informative in assessing study selection effects. Future analyses of cardiopulmonary outcomes in the SCAPIS cohort can benefit from our study, though the average impact of selection on risk factor distributions at baseline appears small.
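As an illustration of the SMD criterion described in the abstract, the following is a minimal Python sketch, not the authors' implementation. It assumes that the SMD compares the reweighted and unweighted participant means of one risk factor, scaled by the unweighted sample standard deviation; the abstract does not state the exact formula. The function names, the toy values, and the weights are hypothetical. In practice the weights would come from a participation model.

```python
import math


def weighted_mean(values, weights):
    """Mean of `values` with one non-negative weight per observation."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def reweighting_smd(values, weights):
    """Standardized change in the mean of a risk factor after reweighting.

    Assumption: the difference is scaled by the unweighted sample SD.
    """
    n = len(values)
    unweighted = sum(values) / n
    variance = sum((v - unweighted) ** 2 for v in values) / (n - 1)
    reweighted = weighted_mean(values, weights)
    return (reweighted - unweighted) / math.sqrt(variance)


# Hypothetical toy data: an alcohol-frequency score for four participants,
# with larger weights on the low-frequency participants.
alcohol_freq = [1.0, 2.0, 3.0, 4.0]
weights = [4.0, 2.0, 1.0, 1.0]

smd = reweighting_smd(alcohol_freq, weights)
meaningful = abs(smd) > 0.10  # threshold taken from the abstract
print(f"SMD = {smd:.3f}, meaningful = {meaningful}")
```

A negative value in this sketch means the reweighted mean is lower than the unweighted one, that is, the unweighted sample overstates the factor relative to the target population.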
- author
- Bonander, Carl ; Nilsson, Anton LU ; Björk, Jonas LU ; Blomberg, Anders ; Engström, Gunnar LU ; Jernberg, Tomas ; Sundström, Johan ; Östgren, Carl Johan LU ; Bergström, Göran and Strömberg, Ulf
- publishing date
- 2022-03
- type
- Contribution to journal
- publication status
- published
- in
- PLoS ONE
- volume
- 17
- issue
- 3
- article number
- e0265088
- publisher
- Public Library of Science (PLoS)
- external identifiers
- pmid:35259202
- scopus:85126077455
- ISSN
- 1932-6203
- DOI
- 10.1371/journal.pone.0265088
- language
- English
- LU publication?
- yes
- id
- 17b96f9a-e4b5-4977-a4b0-f03263638cc0
- date added to LUP
- 2022-04-27 08:23:02
- date last changed
- 2024-03-23 20:08:48
@article{17b96f9a-e4b5-4977-a4b0-f03263638cc0,
  abstract  = {{Objectives To study the value of combining individual- and neighborhood-level sociodemographic data to predict study participation and assess the effects of baseline selection on the distribution of metabolic risk factors and lifestyle factors in the Swedish CardioPulmonary bioImage Study (SCAPIS). Methods We linked sociodemographic register data to SCAPIS participants (n = 30,154, ages: 50-64 years) and a random sample of the study's target population (n = 59,909). We assessed the classification ability of participation models based on individual-level data, neighborhood-level data, and combinations of both. Standardized mean differences (SMD) were used to examine how reweighting the sample to match the population affected the averages of 32 cardiopulmonary risk factors at baseline. Absolute SMDs >0.10 were considered meaningful. Results Combining both individual-level and neighborhood-level data gave rise to a model with better classification ability (AUC: 71.3%) than models with only individual-level (AUC: 66.9%) or neighborhood-level data (AUC: 65.5%). We observed a greater change in the distribution of risk factors when we reweighted the participants using both individual and area data. The only meaningful change was related to the (self-reported) frequency of alcohol consumption, which appears to be higher in the SCAPIS sample than in the population. The remaining risk factors did not change meaningfully. Conclusions Both individual- and neighborhood-level characteristics are informative in assessing study selection effects. Future analyses of cardiopulmonary outcomes in the SCAPIS cohort can benefit from our study, though the average impact of selection on risk factor distributions at baseline appears small.}},
  author    = {Bonander, Carl and Nilsson, Anton and Björk, Jonas and Blomberg, Anders and Engström, Gunnar and Jernberg, Tomas and Sundström, Johan and Östgren, Carl Johan and Bergström, Göran and Strömberg, Ulf},
  issn      = {{1932-6203}},
  journal   = {{PLoS ONE}},
  language  = {{eng}},
  number    = {{3}},
  publisher = {{Public Library of Science (PLoS)}},
  title     = {{The value of combining individual and small area sociodemographic data for assessing and handling selective participation in cohort studies: Evidence from the Swedish CardioPulmonary bioImage Study}},
  url       = {{http://dx.doi.org/10.1371/journal.pone.0265088}},
  doi       = {{10.1371/journal.pone.0265088}},
  volume    = {{17}},
  year      = {{2022}},
}