Predicting preterm birth with machine learning methods
(2022) DABN01 20221
Department of Economics
- Abstract
- Preterm birth is a leading cause of birth complications and neonatal mortality worldwide. It remains difficult to predict whether a preterm birth will occur, which hinders the use of preventive treatments. This thesis investigates the use of machine learning models to predict spontaneous preterm birth. In addition, it explores whether the performance of these models differs among racial groups. The models were trained and evaluated on birth certificate data retrieved from the Natality Birth Data Sets in the National Vital Statistics System. Four machine learning methods were employed: logistic regression, random forests, eXtreme gradient boosting and neural networks. Performance was similar across methods. The logistic regression model achieved the lowest test AUC, 0.6710, and the lowest TPR, 30.14% at the 10% FPR level. The eXtreme gradient boosting model performed best, with a test AUC of 0.6994 and a TPR of 34.15%. All models performed similarly for black and non-black women. These results confirm previous evidence that this type of easily accessible patient data does not seem to be sufficient to construct high-performing machine learning models.
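The abstract reports two metrics, test AUC and TPR at the 10% FPR level. As an illustration of what those numbers measure, here is a minimal pure-Python sketch that computes both from predicted scores. The function names and the toy labels and scores are assumptions made for this example; they are not taken from the thesis and do not reproduce its data or code.

```python
def roc_points(y_true, scores):
    """Return the ROC curve as a list of (FPR, TPR) points.

    The threshold is swept from the highest score downwards; tied scores
    are added together so they produce a single point.
    """
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])

    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        current = ranked[i][0]
        while i < len(ranked) and ranked[i][0] == current:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


def tpr_at_fpr(points, max_fpr=0.10):
    """Highest TPR reachable while keeping FPR at or below max_fpr."""
    return max(tpr for fpr, tpr in points if fpr <= max_fpr)


# Hypothetical toy data: 4 positive cases (1) and 10 negative cases (0).
y_true = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0]
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.50, 0.40,
          0.30, 0.25, 0.20, 0.15, 0.10, 0.05, 0.02]

points = roc_points(y_true, scores)
print(f"AUC: {auc(points):.2f}")
print(f"TPR at 10% FPR: {tpr_at_fpr(points):.2f}")
```

On this toy data, one false positive out of ten negatives is allowed at the 10% FPR level, and three of the four positives score above the second-highest negative, so the TPR at that level is 0.75. In the thesis setting, a TPR of about 30 to 34% means that roughly one third of preterm births would be flagged when 10% of term births are flagged incorrectly.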
Please use this url to cite or link to this publication:
http://lup.lub.lu.se/student-papers/record/9083000
- author
- Dresse, Zélie Isabelle E LU
- supervisor
- organization
- course
- DABN01 20221
- year
- 2022
- type
- H1 - Master's Degree (One Year)
- subject
- keywords
- preterm birth, prediction, machine learning, race
- language
- English
- id
- 9083000
- date added to LUP
- 2022-06-08 12:50:45
- date last changed
- 2022-06-08 12:50:45
@misc{9083000,
  abstract = {{Preterm birth is a leading cause for birth complications and neonatal mortality in the world. It remains difficult to predict whether a preterm birth will occur, which hinders the possible use of prevention treatments. This thesis investigates the use of machine learning models in the prediction of spontaneous preterm birth. In addition, possible heterogeneous performance of these models among different racial groups is explored. Using birth certificate data, retrieved from the Natality Birth Data Sets in the National Vital Statistics System, machine learning models were trained and evaluated. Four machine learning methods are employed: logistic regression, random forests, eXtreme gradient boosting and neural networks. The models’ performance is similar across methods, the logistic regression model achieved the lowest test AUC of 0.6710 and the lowest TPR of 30.14% at the 10% FPR level. The eXtreme gradient boosting model performed best with a test AUC of 0.6994 and TPR of 34.15%. All models performed similarly for both black and non-black women. These results confirm previous evidence that this type of easily accessible patient data does not seem to be sufficient to construct high-performing machine learning models.}},
  author = {{Dresse, Zélie Isabelle E}},
  language = {{eng}},
  note = {{Student Paper}},
  title = {{Predicting preterm birth with machine learning methods}},
  year = {{2022}},
}