LUP Student Papers

LUND UNIVERSITY LIBRARIES

Analysis and Prediction of Cyber-Threats using Machine Learning algorithms

Kebande, Victor Rigworo LU and Nguyen, Vinh (2025) DABN01 20241
Department of Economics
Department of Statistics
Abstract
As digital transformation continues to expand, the threat landscape for cyber-attacks has become increasingly complex. Proactive detection and prevention of cyber-threats are essential to secure critical infrastructures. This thesis proposes an approach that uses data analysis and Machine Learning (ML) to analyze and predict the existence of potential cyber-threats. By applying advanced analytics techniques and predictive modeling, the study aims to uncover hidden insights from historical cyber-threat data. The study uses the Microsoft Malware Prediction dataset, which was curated by Microsoft to advance research in malware detection and prediction using ML classification models. We then evaluate and compare the effectiveness of several classifiers: Random Forest (RF), Logistic Regression (LR), Quadratic Discriminant Analysis (QDA), Linear Discriminant Analysis (LDA), XGBoost, and a Fully Connected Neural Network (FCNN). RF achieved an average accuracy of 62.24%, LR 61.41%, QDA 59.15%, LDA 60.86%, XGBoost 60.75%, and FCNN 63.05%. These results indicate that RF, an ensemble method, and FCNN, a deep learning technique, outperformed the other models in accuracy and other evaluation metrics, and performed consistently well on new, unseen data. LR and LDA also demonstrated competitive performance. These findings suggest that RF and FCNN are promising approaches for cyber-threat detection tasks.
author: Kebande, Victor Rigworo LU and Nguyen, Vinh
course: DABN01 20241
year: 2025
type: H1 - Master's Degree (One Year)
keywords: Machine Learning Algorithms, data analysis, Prediction
language: English
additional info: NA
id: 9186430
date added to LUP: 2025-09-17 08:31:04
date last changed: 2025-09-17 08:31:04
@misc{9186430,
  abstract     = {{As digital transformation continues to expand, the threat landscape for cyber-attacks has become increasingly complex. Proactive detection and prevention of cyber-threats are essential to secure critical infrastructures. This thesis proposes an approach that uses data analysis and Machine Learning (ML) to analyze and predict the existence of potential cyber-threats. By applying advanced analytics techniques and predictive modeling, the study aims to uncover hidden insights from historical cyber-threat data. The study uses the Microsoft Malware Prediction dataset, which was curated by Microsoft to advance research in malware detection and prediction using ML classification models. We then evaluate and compare the effectiveness of several classifiers: Random Forest (RF), Logistic Regression (LR), Quadratic Discriminant Analysis (QDA), Linear Discriminant Analysis (LDA), XGBoost, and a Fully Connected Neural Network (FCNN). RF achieved an average accuracy of 62.24\%, LR 61.41\%, QDA 59.15\%, LDA 60.86\%, XGBoost 60.75\%, and FCNN 63.05\%. These results indicate that RF, an ensemble method, and FCNN, a deep learning technique, outperformed the other models in accuracy and other evaluation metrics, and performed consistently well on new, unseen data. LR and LDA also demonstrated competitive performance. These findings suggest that RF and FCNN are promising approaches for cyber-threat detection tasks.}},
  author       = {{Kebande, Victor Rigworo and Nguyen, Vinh}},
  language     = {{eng}},
  note         = {{Student Paper}},
  title        = {{Analysis and Prediction of Cyber-Threats using Machine Learning algorithms}},
  year         = {{2025}},
}