Comparing Machine Learning Models for Predicting Lung Cancer Mortality Rates

Welander, Mikael

Welander, Mikael (LU) (2024) STAH11 20222
Department of Statistics

Abstract: Every year, approximately 2.2 million people are diagnosed with lung cancer worldwide, and 1.7 million die as a result of the disease. This thesis employs machine learning to predict mean per-capita lung cancer mortality rates in US counties, using a dataset consisting of a wide array of demographic, socio-economic, educational, and healthcare-related variables. The primary objective is to compare the predictive performance of several machine learning methods: ordinary least squares, ridge regression, the lasso, and neural networks with hyperparameters optimized through random search and 5-fold cross-validation.
A neural network using mean squared error (MSE) as the loss function achieved the lowest mean absolute error (MAE) and root mean squared error (RMSE) on the test set. However, the overall differences between models were small. Regularization through ridge regression and the lasso did not improve predictive performance compared to ordinary least squares. Furthermore, a comparison with previous research revealed substantial differences in model performance, with past studies reporting better predictive results. Consequently, several avenues were suggested as potential paths for future research endeavours.
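The evaluation procedure named in the abstract (random search with 5-fold cross-validation, then MAE and RMSE on a held-out test set) might be sketched roughly as follows. This is an illustrative sketch only, not the thesis's code: the data are synthetic stand-ins rather than the US county dataset, the function names (fit_ridge, cv_rmse, random_search), the penalty range, and the train/test split are all assumptions, and only ordinary least squares and ridge regression are covered. The lasso and the neural networks are omitted for brevity.

```python
import numpy as np


def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(y_true - y_pred)))


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def fit_ridge(X, y, lam):
    """Closed-form ridge fit; lam = 0 reduces to ordinary least squares.

    Features and target are centred so the intercept is not penalised.
    """
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    coef = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ yc)
    return coef, y_mean - x_mean @ coef


def cv_rmse(X, y, lam, k=5, seed=0):
    """Mean validation RMSE over k shuffled folds."""
    idx = np.random.default_rng(seed).permutation(len(y))
    scores = []
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)  # every index not in this fold
        coef, intercept = fit_ridge(X[train], y[train], lam)
        scores.append(rmse(y[fold], X[fold] @ coef + intercept))
    return float(np.mean(scores))


def random_search(X, y, n_trials=20, seed=0):
    """Sample penalties log-uniformly; keep the one with the lowest CV RMSE."""
    rng = np.random.default_rng(seed)
    # Include 0.0 so plain OLS competes with the ridge candidates.
    candidates = [0.0] + list(10 ** rng.uniform(-3, 3, size=n_trials))
    return min(candidates, key=lambda lam: cv_rmse(X, y, lam))


# Synthetic stand-in data (NOT the thesis dataset).
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 5))
true_coef = np.array([3.0, -2.0, 0.0, 1.5, 0.0])
y = X @ true_coef + 10.0 + rng.normal(scale=0.5, size=200)

# Hold out the last 50 rows as a test set (an arbitrary choice here).
X_train, X_test, y_train, y_test = X[:150], X[150:], y[:150], y[150:]
best_lam = random_search(X_train, y_train)
coef, intercept = fit_ridge(X_train, y_train, best_lam)
pred = X_test @ coef + intercept
test_mae, test_rmse = mae(y_test, pred), rmse(y_test, pred)
print(f"lambda={best_lam:.4g}  MAE={test_mae:.3f}  RMSE={test_rmse:.3f}")
```

Because the candidate list includes a penalty of zero, the search can select plain ordinary least squares, which is one way a comparison could end up showing no gain from regularization.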

Please use this url to cite or link to this publication: http://lup.lub.lu.se/student-papers/record/9159269

author

Welander, Mikael (LU)

supervisor

Farrukh Javed (LU)

organization

Department of Statistics

course

STAH11 20222

year

2024

type

M2 - Bachelor Degree

subject

Mathematics and Statistics

keywords

Machine learning, neural networks, ridge regression, lasso, ordinary least squares, cross-validation, gradient descent.

language

English

id

9159269

date added to LUP

2024-06-11 11:17:00

date last changed

2024-07-05 03:42:53

@misc{9159269,
  abstract     = {{Every year, approximately 2.2 million people are diagnosed with lung cancer worldwide, and 1.7 million die as a result of the disease. This thesis employs machine learning to predict mean per-capita lung cancer mortality rates in US counties, using a dataset consisting of a wide array of demographic, socio-economic, educational, and healthcare-related variables. The primary objective is to compare the predictive performance of several machine learning methods: ordinary least squares, ridge regression, the lasso, and neural networks with hyperparameters optimized through random search and 5-fold cross-validation.
A neural network using mean squared error (MSE) as loss function achieved the lowest mean absolute error (MAE) and root mean squared error (RMSE) on the test set. However, the overall differences between models were small. Regularization through ridge regression and the lasso did not improve predictive performance compared to ordinary least squares. Furthermore, a comparison with previous research revealed substantial differences in model performance, with past studies reporting better predictive results. Consequently, several avenues were suggested as potential paths for future research endeavours.}},
  author       = {{Welander, Mikael}},
  language     = {{eng}},
  note         = {{Student Paper}},
  title        = {{Comparing Machine Learning Models for Predicting Lung Cancer Mortality Rates}},
  year         = {{2024}},
}

LUP Student Papers

LUND UNIVERSITY LIBRARIES
