Lund University Publications

Predicting Monthly Community-Level Domestic Radon Concentrations in the Greater Boston Area with an Ensemble Learning Model

Li, Longxiang; Blomberg, Annelise J; Stern, Rebecca A; Kang, Choong-Min; Papatheodorou, Stefania; Wei, Yaguang; Liu, Man; Peralta, Adjani A; Vieira, Carolina L Z and Koutrakis, Petros (2021). In Environmental Science & Technology 55(10), p. 7157-7166
Abstract

Inhaling radon and its progeny is associated with adverse health outcomes. However, previous studies of the health effects of residential exposure to radon in the United States were commonly based on a county-level temporally invariant radon model that was developed using measurements collected in the mid- to late 1980s. We developed a machine learning model to predict monthly radon concentrations for each ZIP Code Tabulation Area (ZCTA) in the Greater Boston area based on 363,783 short-term measurements by Spruce Environmental Technologies, Inc., during the period 2005-2018. A two-stage ensemble-based model was developed to predict radon concentrations for all ZCTAs and months. Stage one included 12 base statistical models that independently predicted ZCTA-level radon concentrations based on geological, architectural, socioeconomic, and meteorological factors for each ZCTA. Stage two aggregated the predictions of these 12 base models using an ensemble learning method. The results of a 10-fold cross-validation showed that the stage-two model has a good prediction accuracy with a weighted R² of 0.63 and root mean square error of 22.6 Bq/m³. The community-level time-varying predictions from our model have good predictive precision and accuracy and can be used in future prospective epidemiological studies in the Greater Boston area.
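The abstract reports a weighted R² and a root mean square error but does not say how the weights were defined. The following is a minimal sketch of how such metrics could be computed, assuming each ZCTA-month prediction is weighted by a hypothetical per-observation weight (for example, the number of measurements behind it). The function names and the weighting scheme are illustrative assumptions, not the authors' implementation, and the RMSE in the paper may be unweighted.

```python
from math import sqrt


def weighted_r2(y_true, y_pred, weights):
    """Weighted coefficient of determination (assumed definition).

    R^2 = 1 - sum(w * (y - p)^2) / sum(w * (y - ybar_w)^2),
    where ybar_w is the weighted mean of the observed values.
    """
    total_w = sum(weights)
    mean_w = sum(w * y for w, y in zip(weights, y_true)) / total_w
    ss_res = sum(w * (y - p) ** 2 for w, y, p in zip(weights, y_true, y_pred))
    ss_tot = sum(w * (y - mean_w) ** 2 for w, y in zip(weights, y_true))
    return 1.0 - ss_res / ss_tot


def weighted_rmse(y_true, y_pred, weights):
    """Weighted root mean square error, in the units of y (e.g. Bq/m^3)."""
    total_w = sum(weights)
    ss_res = sum(w * (y - p) ** 2 for w, y, p in zip(weights, y_true, y_pred))
    return sqrt(ss_res / total_w)
```

With equal weights both functions reduce to the ordinary R² and RMSE, so the unweighted case can be checked by passing a list of ones.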

type
Contribution to journal
publication status
published
keywords
Air Pollutants, Radioactive/analysis, Air Pollution, Indoor/analysis, Boston, Housing, Machine Learning, Models, Statistical, Radon/analysis, United States
in
Environmental Science & Technology
volume
55
issue
10
pages
7157 - 7166
publisher
The American Chemical Society (ACS)
external identifiers
  • pmid:33939421
  • scopus:85106504351
ISSN
1520-5851
DOI
10.1021/acs.est.0c08792
language
English
LU publication?
no
id
9d4d0778-b4b8-4095-8afa-5c13711ba960
date added to LUP
2021-09-09 11:31:21
date last changed
2024-06-15 16:02:21
@article{9d4d0778-b4b8-4095-8afa-5c13711ba960,
  abstract     = {{Inhaling radon and its progeny is associated with adverse health outcomes. However, previous studies of the health effects of residential exposure to radon in the United States were commonly based on a county-level temporally invariant radon model that was developed using measurements collected in the mid- to late 1980s. We developed a machine learning model to predict monthly radon concentrations for each ZIP Code Tabulation Area (ZCTA) in the Greater Boston area based on 363,783 short-term measurements by Spruce Environmental Technologies, Inc., during the period 2005-2018. A two-stage ensemble-based model was developed to predict radon concentrations for all ZCTAs and months. Stage one included 12 base statistical models that independently predicted ZCTA-level radon concentrations based on geological, architectural, socioeconomic, and meteorological factors for each ZCTA. Stage two aggregated the predictions of these 12 base models using an ensemble learning method. The results of a 10-fold cross-validation showed that the stage-two model has a good prediction accuracy with a weighted R2 of 0.63 and root mean square error of 22.6 Bq/m3. The community-level time-varying predictions from our model have good predictive precision and accuracy and can be used in future prospective epidemiological studies in the Greater Boston area.}},
  author       = {{Li, Longxiang and Blomberg, Annelise J and Stern, Rebecca A and Kang, Choong-Min and Papatheodorou, Stefania and Wei, Yaguang and Liu, Man and Peralta, Adjani A and Vieira, Carolina L Z and Koutrakis, Petros}},
  issn         = {{1520-5851}},
  keywords     = {{Air Pollutants, Radioactive/analysis; Air Pollution, Indoor/analysis; Boston; Housing; Machine Learning; Models, Statistical; Radon/analysis; United States}},
  language     = {{eng}},
  number       = {{10}},
  pages        = {{7157--7166}},
  publisher    = {{The American Chemical Society (ACS)}},
  journal      = {{Environmental Science \& Technology}},
  title        = {{Predicting Monthly Community-Level Domestic Radon Concentrations in the Greater Boston Area with an Ensemble Learning Model}},
  url          = {{http://dx.doi.org/10.1021/acs.est.0c08792}},
  doi          = {{10.1021/acs.est.0c08792}},
  volume       = {{55}},
  year         = {{2021}},
}