
LUP Student Papers

LUND UNIVERSITY LIBRARIES

Comparing Machine Learning Models for Predicting Lung Cancer Mortality Rates

Welander, Mikael LU (2024) STAH11 20222
Department of Statistics
Abstract
Every year, approximately 2.2 million people are diagnosed with lung cancer worldwide, and 1.7 million die as a result of the disease. This thesis employs machine learning to predict mean per-capita lung cancer mortality rates in US counties, using a dataset consisting of a wide array of demographic, socio-economic, educational, and healthcare-related variables. The primary objective is to compare the predictive performance of several machine learning methods: ordinary least squares, ridge regression, the lasso, and neural networks with hyperparameters optimized through random search and 5-fold cross-validation.
A neural network using mean squared error (MSE) as the loss function achieved the lowest mean absolute error (MAE) and root mean squared error (RMSE) on the test set. However, the overall differences between models were small. Regularization through ridge regression and the lasso did not improve predictive performance compared to ordinary least squares. Furthermore, a comparison with previous research revealed substantial differences in model performance, with past studies reporting better predictive results. Consequently, several avenues were suggested as potential paths for future research endeavours.
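The evaluation setup named in the abstract (MAE and RMSE as error measures, hyperparameters chosen by random search with 5-fold cross-validation) can be illustrated with a minimal sketch. This is not the thesis's code: the one-predictor ridge model, the synthetic data, the search range for the penalty, and all function names below are assumptions made purely for illustration.

```python
import math
import random


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def kfold_indices(n, k, rng):
    """Shuffle the indices 0..n-1 and split them into k disjoint folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_ridge_1d(x, y, lam):
    """Ridge fit with one predictor and an unpenalized intercept.

    lam = 0 reduces to ordinary least squares.
    """
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    slope = sxy / (sxx + lam)
    return slope, y_bar - slope * x_bar


def cv_rmse(x, y, lam, folds):
    """Average held-out RMSE over the given folds."""
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(len(x)) if i not in held]
        slope, intercept = fit_ridge_1d([x[i] for i in train], [y[i] for i in train], lam)
        preds = [slope * x[i] + intercept for i in held_out]
        scores.append(rmse([y[i] for i in held_out], preds))
    return sum(scores) / len(scores)


def random_search(x, y, n_trials=20, k=5, seed=0):
    """Sample penalties log-uniformly (assumed range) and keep the best CV score."""
    rng = random.Random(seed)
    folds = kfold_indices(len(x), k, rng)
    best = None
    for _ in range(n_trials):
        lam = 10 ** rng.uniform(-3, 2)
        score = cv_rmse(x, y, lam, folds)
        if best is None or score < best[1]:
            best = (lam, score)
    return best


# Synthetic data, hypothetical and unrelated to the thesis dataset.
data_rng = random.Random(42)
x = [data_rng.uniform(0, 10) for _ in range(100)]
y = [2.0 * a + 1.0 + data_rng.gauss(0, 1) for a in x]
best_lam, best_score = random_search(x, y)
```

Here `best_lam` is the sampled penalty with the lowest average held-out RMSE across the five folds, and `best_score` is that RMSE. A final comparison between models would then be made on a separate test set, as the abstract describes.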
author
Welander, Mikael LU
supervisor
organization
course
STAH11 20222
year
2024
type
M2 - Bachelor Degree
subject
keywords
Machine learning, neural networks, ridge regression, lasso, ordinary least squares, cross-validation, gradient descent.
language
English
id
9159269
date added to LUP
2024-06-11 11:17:00
date last changed
2024-06-11 11:17:00
@misc{9159269,
  abstract     = {{Every year, approximately 2.2 million people are diagnosed with lung cancer worldwide, and 1.7 million die as a result of the disease. This thesis employs machine learning to predict mean per-capita lung cancer mortality rates in US counties, using a dataset consisting of a wide array of demographic, socio-economic, educational, and healthcare-related variables. The primary objective is to compare the predictive performance of several machine learning methods: ordinary least squares, ridge regression, the lasso, and neural networks with hyperparameters optimized through random search and 5-fold cross-validation.
A neural network using mean squared error (MSE) as the loss function achieved the lowest mean absolute error (MAE) and root mean squared error (RMSE) on the test set. However, the overall differences between models were small. Regularization through ridge regression and the lasso did not improve predictive performance compared to ordinary least squares. Furthermore, a comparison with previous research revealed substantial differences in model performance, with past studies reporting better predictive results. Consequently, several avenues were suggested as potential paths for future research endeavours.}},
  author       = {{Welander, Mikael}},
  language     = {{eng}},
  note         = {{Student Paper}},
  title        = {{Comparing Machine Learning Models for Predicting Lung Cancer Mortality Rates}},
  year         = {{2024}},
}