
LUP Student Papers

LUND UNIVERSITY LIBRARIES

Machine Learning Classification on Behavior-Based Security Alerts : A Comparative Study of Three Algorithms

Zhang, Junchao LU and Walther, Isak LU (2024) EITL05 20241
Department of Electrical and Information Technology
Abstract
In the cybersecurity industry, security analysts are plagued by a high number of false positive alerts of various types. This takes up time and resources, and makes security analysts more prone to overlook true security threats. In collaboration with Orange Cyberdefense, this thesis investigates the ability of three machine learning algorithms, Decision Trees, Naive Bayes and Support Vector Machines (SVM), to classify behavioral security alerts. Using Scikit-learn, these three algorithms were trained and tested on synthetic data that consists of thousands of alerts. The results show that the Decision Tree algorithm has the highest performance in this alert classification task, closely followed by the SVM algorithm, with the Naive Bayes algorithm having the lowest performance. With the performance demonstrated by the algorithms, this thesis concludes that machine learning algorithms are able to assist security analysts in prioritizing true security threats.
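The comparison described in the abstract can be illustrated with a minimal Scikit-learn sketch. It is not the thesis's code: the data comes from `make_classification` as a hypothetical stand-in for the synthetic alerts, and the feature counts, class balance, default hyperparameters, and the F1 metric are all assumptions chosen for illustration. The function name `compare_classifiers` is likewise hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def compare_classifiers(n_alerts=5000, seed=0):
    """Train three classifiers on stand-in alert data and return F1 scores."""
    # Assumption: label 1 = true threat (minority), label 0 = false positive.
    # The real alert features are not described in the abstract.
    X, y = make_classification(
        n_samples=n_alerts,
        n_features=20,
        n_informative=8,
        weights=[0.9, 0.1],
        random_state=seed,
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )

    models = {
        "decision_tree": DecisionTreeClassifier(random_state=seed),
        "naive_bayes": GaussianNB(),
        # SVMs are sensitive to feature scale, so standardize first.
        "svm": make_pipeline(StandardScaler(), SVC()),
    }

    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        # F1 on the "true threat" class (pos_label defaults to 1).
        scores[name] = f1_score(y_test, model.predict(X_test))
    return scores


if __name__ == "__main__":
    for name, score in compare_classifiers().items():
        print(f"{name}: F1 = {score:.3f}")
```

The ranking this sketch produces depends on the generated data and need not match the ordering reported in the thesis.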
Popular Abstract (translated from Swedish)
In the cybersecurity industry, security analysts are plagued by a large number of false positive alerts. This takes up time and resources and makes security analysts more prone to overlook real security threats. In collaboration with Orange Cyberdefense, this thesis investigates the ability of three machine learning algorithms, Decision Trees, Naive Bayes and Support Vector Machines (SVM), to classify behavioral security alerts. Using Scikit-learn, these three algorithms were trained and tested on synthetic data consisting of thousands of alerts. The results show that the Decision Tree algorithm has the highest performance in this alert classification task, closely followed by the SVM algorithm, with the Naive Bayes algorithm having the lowest performance. Given the performance shown by the algorithms, this thesis concludes that machine learning algorithms can help security analysts prioritize real security threats.
author
Zhang, Junchao LU and Walther, Isak LU
course
EITL05 20241
year
2024
type
M2 - Bachelor Degree
keywords
Machine learning, Alert classification, False positive alerts, Decision tree, Naive Bayes, Support vector machine, Alert fatigue
report number
LU/LTH-EIT 2024-997
language
English
id
9170236
date added to LUP
2024-07-05 13:15:30
date last changed
2024-07-05 13:15:30
@misc{9170236,
  abstract     = {{In the cybersecurity industry, security analysts are plagued by a high number of false positive alerts of various types. This takes up time and resources, and makes security analysts more prone to overlook true security threats. In collaboration with Orange Cyberdefense, this thesis investigates the ability of three machine learning algorithms, Decision Trees, Naive Bayes and Support Vector Machines (SVM), to classify behavioral security alerts. Using Scikit-learn, these three algorithms were trained and tested on synthetic data that consists of thousands of alerts. The results show that the Decision Tree algorithm has the highest performance in this alert classification task, closely followed by the SVM algorithm, with the Naive Bayes algorithm having the lowest performance. With the performance demonstrated by the algorithms, this thesis concludes that machine learning algorithms are able to assist security analysts in prioritizing true security threats.}},
  author       = {{Zhang, Junchao and Walther, Isak}},
  language     = {{eng}},
  note         = {{Student Paper}},
  title        = {{Machine Learning Classification on Behavior-Based Security Alerts : A Comparative Study of Three Algorithms}},
  year         = {{2024}},
}