Analysis and Prediction of Cyber-Threats using Machine Learning algorithms
(2025) DABN01 20241
Department of Economics
Department of Statistics
- Abstract
- As digital transformation expands, the threat landscape for cyber-attacks has become increasingly complex. Proactive detection and prevention of cyber-threats are essential for securing critical infrastructure. This thesis proposes an approach that uses data analysis and Machine Learning (ML) to analyze and predict the presence of potential cyber-threats. Using advanced analytics and predictive modeling, the study aims to uncover hidden insights in historical cyber-threat data. It uses the Microsoft Malware Prediction dataset, which Microsoft curated to advance research in malware detection and prediction with ML classification models. We evaluate and compare several classifiers: Random Forest (RF), Logistic Regression (LR), Quadratic Discriminant Analysis (QDA), Linear Discriminant Analysis (LDA), XGBoost, and a Fully Connected Neural Network (FCNN). RF achieved an average accuracy of 62.24%, LR 61.41%, QDA 59.15%, LDA 60.86%, XGBoost 60.75%, and FCNN 63.05%. These results indicate that ensemble methods such as RF and deep learning techniques such as neural networks outperformed the other models in accuracy and in other evaluation metrics, and performed consistently well on new, unseen data. LR and LDA were also competitive. The findings suggest that RF and FCNN are promising approaches for cyber-threat detection.
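The accuracy figures quoted in the abstract can be ranked with a few lines of Python. This is only an illustrative sketch: the dictionary holds the six percentages exactly as reported above, while the names `reported_accuracy`, `rank_models`, and `accuracy_spread` are hypothetical helpers introduced here and are not taken from the thesis.

```python
# Accuracies (in percent) exactly as reported in the abstract above.
# The variable and function names are illustrative assumptions.
reported_accuracy = {
    "RF": 62.24,
    "LR": 61.41,
    "QDA": 59.15,
    "LDA": 60.86,
    "XGBoost": 60.75,
    "FCNN": 63.05,
}


def rank_models(scores):
    """Return (name, accuracy) pairs sorted from highest to lowest accuracy."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def accuracy_spread(scores):
    """Return the gap, in percentage points, between the best and worst model."""
    return round(max(scores.values()) - min(scores.values()), 2)


ranking = rank_models(reported_accuracy)
for position, (name, accuracy) in enumerate(ranking, start=1):
    print(f"{position}. {name}: {accuracy:.2f}%")
print(f"Spread: {accuracy_spread(reported_accuracy):.2f} percentage points")
```

Read this way, the reported order is FCNN, RF, LR, LDA, XGBoost, QDA, and all six models fall within roughly 3.9 percentage points of each other.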
Please use this URL to cite or link to this publication:
http://lup.lub.lu.se/student-papers/record/9186430
- author
- Kebande, Victor Rigworo LU and Nguyen, Vinh
- supervisor
- organization
- course
- DABN01 20241
- year
- 2025
- type
- H1 - Master's Degree (One Year)
- subject
- keywords
- Machine Learning Algorithms, data analysis, Prediction
- language
- English
- additional info
- NA
- id
- 9186430
- date added to LUP
- 2025-09-17 08:31:04
- date last changed
- 2025-09-17 08:31:04
@misc{9186430,
  abstract     = {{As digital transformation expands, the threat landscape for cyber-attacks has become increasingly complex. Proactive detection and prevention of cyber-threats are essential for securing critical infrastructure. This thesis proposes an approach that uses data analysis and Machine Learning (ML) to analyze and predict the presence of potential cyber-threats. Using advanced analytics and predictive modeling, the study aims to uncover hidden insights in historical cyber-threat data. It uses the Microsoft Malware Prediction dataset, which Microsoft curated to advance research in malware detection and prediction with ML classification models. We evaluate and compare several classifiers: Random Forest (RF), Logistic Regression (LR), Quadratic Discriminant Analysis (QDA), Linear Discriminant Analysis (LDA), XGBoost, and a Fully Connected Neural Network (FCNN). RF achieved an average accuracy of 62.24\%, LR 61.41\%, QDA 59.15\%, LDA 60.86\%, XGBoost 60.75\%, and FCNN 63.05\%. These results indicate that ensemble methods such as RF and deep learning techniques such as neural networks outperformed the other models in accuracy and in other evaluation metrics, and performed consistently well on new, unseen data. LR and LDA were also competitive. The findings suggest that RF and FCNN are promising approaches for cyber-threat detection.}},
  author       = {{Kebande, Victor Rigworo and Nguyen, Vinh}},
  language     = {{eng}},
  note         = {{Student Paper}},
  title        = {{Analysis and Prediction of Cyber-Threats using Machine Learning algorithms}},
  year         = {{2025}},
}