Machine Learning Classification on Behavior-Based Security Alerts : A Comparative Study of Three Algorithms

Zhang, Junchao; Walther, Isak

Zhang, Junchao (LU) and Walther, Isak (LU) (2024) EITL05 20241
Department of Electrical and Information Technology

Abstract: In the cybersecurity industry, security analysts are plagued by a high number of false positive alerts of various types. These alerts consume time and resources, and make security analysts more prone to overlooking true security threats. In collaboration with Orange Cyberdefense, this thesis investigates the ability of three machine learning algorithms, Decision Trees, Naive Bayes and Support Vector Machines (SVM), to classify behavioral security alerts. Using Scikit-learn, these three algorithms were trained and tested on synthetic data consisting of thousands of alerts. The results show that the Decision Tree algorithm has the highest performance in this alert classification task, closely followed by the SVM algorithm, with the Naive Bayes algorithm having the lowest performance. Given the performance demonstrated by the algorithms, this thesis concludes that machine learning algorithms are able to help security analysts prioritize true security threats.
Popular Abstract (translated from Swedish): In the cybersecurity industry, security analysts are plagued by a large number of false positive alerts. This takes time and resources and makes security analysts more prone to overlooking real security threats. In collaboration with Orange Cyberdefense, this thesis examines the ability of three machine learning algorithms, Decision Trees, Naive Bayes and Support Vector Machines (SVM), to classify behavioral security alerts. Using Scikit-learn, these three algorithms were trained and tested on synthetic data consisting of thousands of alerts. The results show that the Decision Tree algorithm has the highest performance in this alert classification task, closely followed by the SVM algorithm, with the Naive Bayes algorithm having the lowest performance. Given the performance shown by the algorithms, this thesis concludes that machine learning algorithms can help security analysts prioritize real security threats.
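
The comparison described in the abstract can be illustrated with a minimal Scikit-learn sketch. Everything specific here is an assumption made for illustration and is not taken from the thesis: the stand-in data comes from `make_classification` rather than the thesis's synthetic alerts, the label encoding (1 = true threat, 0 = false positive), the class balance, the default hyperparameters, the feature scaling before the SVM, and the choice of F1 as the score. The function name `compare_classifiers` is hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def compare_classifiers(n_alerts=2000, seed=0):
    """Train three classifiers on stand-in alert data and return F1 scores."""
    # Assumed stand-in data: 10 numeric features per alert, with roughly
    # 20% of alerts labelled 1 (true threat) and 80% labelled 0 (false positive).
    X, y = make_classification(
        n_samples=n_alerts,
        n_features=10,
        n_informative=5,
        weights=[0.8, 0.2],
        random_state=seed,
    )
    # Hold out 25% of the alerts for testing, preserving the class ratio.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    models = {
        "decision_tree": DecisionTreeClassifier(random_state=seed),
        "naive_bayes": GaussianNB(),
        # SVMs are sensitive to feature scale, so standardize first.
        "svm": make_pipeline(StandardScaler(), SVC()),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = f1_score(y_test, model.predict(X_test))
    return scores


if __name__ == "__main__":
    for name, score in compare_classifiers().items():
        print(f"{name}: F1 = {score:.3f}")
```

The ranking this sketch produces on generated data says nothing about the thesis's results; it only shows the shape of a train-and-test comparison across the three algorithm families.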

- Open Access
- PDF

Related Materials

Related object is supplementary material:
Poster

Please use this url to cite or link to this publication: http://lup.lub.lu.se/student-papers/record/9170236

author

Zhang, Junchao (LU) and Walther, Isak (LU)

supervisor

Christian Gehrmann (LU)

organization

Department of Electrical and Information Technology

course

EITL05 20241

year

2024

type

M2 - Bachelor Degree

subject

Technology and Engineering

keywords

Machine learning, Alert classification, False positive alerts, Decision tree, Naive Bayes, Support Vector Machine, Alert fatigue

report number

LU/LTH-EIT 2024-997

language

English

id

9170236

date added to LUP

2024-07-05 13:15:30

date last changed

2024-07-05 13:15:30

@misc{9170236,
  abstract     = {{In the cybersecurity industry, security analysts are plagued by a high number of false positive alerts of various types. These alerts consume time and resources, and make security analysts more prone to overlooking true security threats. In collaboration with Orange Cyberdefense, this thesis investigates the ability of three machine learning algorithms, Decision Trees, Naive Bayes and Support Vector Machines (SVM), to classify behavioral security alerts. Using Scikit-learn, these three algorithms were trained and tested on synthetic data consisting of thousands of alerts. The results show that the Decision Tree algorithm has the highest performance in this alert classification task, closely followed by the SVM algorithm, with the Naive Bayes algorithm having the lowest performance. Given the performance demonstrated by the algorithms, this thesis concludes that machine learning algorithms are able to help security analysts prioritize true security threats.}},
  author       = {{Zhang, Junchao and Walther, Isak}},
  language     = {{eng}},
  note         = {{Student Paper}},
  title        = {{Machine Learning Classification on Behavior-Based Security Alerts : A Comparative Study of Three Algorithms}},
  year         = {{2024}},
}

LUP Student Papers

LUND UNIVERSITY LIBRARIES