Machine Learning for Mortality Risk Prediction: A Study on Cohort Data
(2025) Department of Automatic Control
- Abstract
- This thesis explores the application of machine learning techniques to predict mortality risk using data from the Swedish Adoption/Twin Study of Aging (SATSA). Through the development and evaluation of multiple supervised learning models, including Random Forest and Histogram Gradient Boosting classifiers, the study examines mortality predictors across 10-year and 20-year timeframes. The models achieve strong predictive performance for 20-year mortality, with accuracies ranging from 79-81% and Area-Under-Curve scores of 0.84-0.88. Analysis reveals that while chronological age remains the strongest predictor, cognitive function, cardiovascular health markers, and pulmonary capacity emerge as significant predictors, with their relative importance varying between prediction windows. The study introduces a novel approach to missing data, treating non-completion of physical assessments as informative signals rather than applying traditional imputation methods. This work demonstrates the feasibility of developing accurate mortality prediction models using cohort data while highlighting the complex nature of mortality risk factors in aging populations. The findings have potential implications for both clinical practice and aging research methodology.
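The abstract's idea of treating non-completion of a physical assessment as a signal, rather than imputing a value, can be illustrated with a minimal sketch. This is an assumption about one possible encoding, not the thesis's actual method: the column names (`grip_strength`, `peak_flow`), the `_not_completed` suffix, and the helper `add_missingness_indicators` are all hypothetical.

```python
from typing import Dict, List, Optional

Record = Dict[str, Optional[float]]


def add_missingness_indicators(
    records: List[Record], assessment_columns: List[str]
) -> List[Record]:
    """Return copies of the records with one 0/1 flag per assessment column.

    A flag of 1.0 means the assessment was not completed (value is None).
    The original value is left as None; nothing is imputed.
    """
    encoded = []
    for record in records:
        row = dict(record)  # copy so the input records are not modified
        for column in assessment_columns:
            not_completed = record.get(column) is None
            row[f"{column}_not_completed"] = 1.0 if not_completed else 0.0
        encoded.append(row)
    return encoded


# Hypothetical participants; the values are illustrative, not SATSA data.
participants = [
    {"age": 72.0, "grip_strength": 31.5, "peak_flow": 410.0},
    {"age": 85.0, "grip_strength": None, "peak_flow": None},
]

encoded = add_missingness_indicators(participants, ["grip_strength", "peak_flow"])
```

With this encoding a classifier can learn from the fact that a test was skipped. The untouched missing values could then be passed on as NaN to a model that supports them natively, such as scikit-learn's `HistGradientBoostingClassifier`; whether the thesis combines the two in this way is not stated in the abstract.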
Please use this URL to cite or link to this publication:
http://lup.lub.lu.se/student-papers/record/9186855
- author
- Jakobsson, Jim
- supervisor
- organization
- alternative title
- Maskininlärning för prediktion av mortalitetsrisk: En studie på kohortdata
- year
- 2025
- type
- H3 - Professional qualifications (4 Years - )
- subject
- report number
- TFRT-6271
- other publication id
- 0280-5316
- language
- English
- id
- 9186855
- date added to LUP
- 2025-05-09 08:20:50
- date last changed
- 2025-05-09 08:20:50
@misc{9186855,
  abstract     = {{This thesis explores the application of machine learning techniques to predict mortality risk using data from the Swedish Adoption/Twin Study of Aging (SATSA). Through the development and evaluation of multiple supervised learning models, including Random Forest and Histogram Gradient Boosting classifiers, the study examines mortality predictors across 10-year and 20-year timeframes. The models achieve strong predictive performance for 20-year mortality, with accuracies ranging from 79-81% and Area-Under-Curve scores of 0.84-0.88. Analysis reveals that while chronological age remains the strongest predictor, cognitive function, cardiovascular health markers, and pulmonary capacity emerge as significant predictors, with their relative importance varying between prediction windows. The study introduces a novel approach to missing data, treating non-completion of physical assessments as informative signals rather than applying traditional imputation methods. This work demonstrates the feasibility of developing accurate mortality prediction models using cohort data while highlighting the complex nature of mortality risk factors in aging populations. The findings have potential implications for both clinical practice and aging research methodology.}},
  author       = {{Jakobsson, Jim}},
  language     = {{eng}},
  note         = {{Student Paper}},
  title        = {{Machine Learning for Mortality Risk Prediction: A Study on Cohort Data}},
  year         = {{2025}},
}