LUP Student Papers

LUND UNIVERSITY LIBRARIES

Comparative Analysis of Machine Learning Algorithms on Comprehensive and Cluster-Specific Data in the Auto Insurance Industry

Balabanova, Veselina LU and Bhattarai, Shreeya LU (2024) DABN01 20241
Department of Economics
Department of Statistics
Abstract
In recent years, businesses have focused on Customer Lifetime Value (CLV) to build better customer relationships and to identify high-value customers for more tailored marketing strategies. This thesis compares the performance of different machine learning models on cluster-specific data and on the complete dataset from the auto insurance industry. It also identifies the most valuable customer cluster and proposes customer retention strategies based on the significant features that influence CLV.

For the empirical analysis, we use Principal Component Analysis (PCA) and k-means clustering for customer segmentation, and Random Forest, XGBoost, and Neural Networks to predict CLV on both the comprehensive and the cluster-specific data. Feature importance analysis and hyperparameter tuning provide further insights. Overall, the findings suggest that Random Forest performs best among the models: its R^2 improved by 27% and its RMSE dropped by 39% after the models were applied to every cluster to predict CLV. For future research, the approach from this study can be applied in other insurance industries to examine how clustering techniques help improve the performance of machine learning models.
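The abstract reports the comparison in terms of R^2 and RMSE. As a minimal sketch of how such figures can be computed, the snippet below evaluates both metrics for two sets of predictions and then expresses the difference as a relative change. It assumes the reported percentages are relative changes, which the abstract does not state. All numbers and names (`pred_full`, `pred_cluster`, `relative_change`) are illustrative toy values and do not come from the thesis.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def relative_change(before, after):
    """Change from `before` to `after` as a fraction of `before`."""
    return (after - before) / before


# Toy CLV values (hypothetical, not data from the thesis).
y_true = [10.0, 20.0, 30.0, 40.0]
pred_full = [14.0, 16.0, 34.0, 36.0]     # stand-in for a model fit on all data
pred_cluster = [12.0, 18.0, 32.0, 38.0]  # stand-in for per-cluster models

rmse_change = relative_change(rmse(y_true, pred_full), rmse(y_true, pred_cluster))
r2_change = relative_change(r2(y_true, pred_full), r2(y_true, pred_cluster))

print(f"RMSE: {rmse(y_true, pred_full):.3f} -> {rmse(y_true, pred_cluster):.3f} ({rmse_change:+.1%})")
print(f"R^2:  {r2(y_true, pred_full):.3f} -> {r2(y_true, pred_cluster):.3f} ({r2_change:+.1%})")
```

A relative change is not the same as a difference in percentage points. For R^2 in particular, the two readings can give quite different numbers, so it is worth checking which one a reported figure refers to.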
author: Balabanova, Veselina LU and Bhattarai, Shreeya LU
course: DABN01 20241
year: 2024
type: H1 - Master's Degree (One Year)
keywords: Auto Insurance Industry, Machine Learning, Random Forest, XGBoost, Neural Network, k-Means Clustering, Principal Component Analysis, Customer Lifetime Value (CLV)
language: English
id: 9154968
date added to LUP: 2024-09-24 08:32:17
date last changed: 2024-09-24 08:32:17
@misc{9154968,
  abstract     = {{In recent years, businesses have focused on Customer Lifetime Value (CLV) to build better customer relationships and to identify high-value customers for more tailored marketing strategies. This thesis compares the performance of different machine learning models on cluster-specific data and on the complete dataset from the auto insurance industry. It also identifies the most valuable customer cluster and proposes customer retention strategies based on the significant features that influence CLV.

For the empirical analysis, we use Principal Component Analysis (PCA) and k-means clustering for customer segmentation, and Random Forest, XGBoost, and Neural Networks to predict CLV on both the comprehensive and the cluster-specific data. Feature importance analysis and hyperparameter tuning provide further insights. Overall, the findings suggest that Random Forest performs best among the models: its R^2 improved by 27% and its RMSE dropped by 39% after the models were applied to every cluster to predict CLV. For future research, the approach from this study can be applied in other insurance industries to examine how clustering techniques help improve the performance of machine learning models.}},
  author       = {{Balabanova, Veselina and Bhattarai, Shreeya}},
  language     = {{eng}},
  note         = {{Student Paper}},
  title        = {{Comparative Analysis of Machine Learning Algorithms on Comprehensive and Cluster-Specific Data in the Auto Insurance Industry}},
  year         = {{2024}},
}