Security Issue Classification for Vulnerability Management with Semi-supervised Learning

Wåreus, Emil; Duppils, Anton; Tullberg, Magnus; Hell, Martin

Wåreus, Emil; Duppils, Anton; Tullberg, Magnus and Hell, Martin (2022), p. 84-95

Abstract: Open-Source Software (OSS) is increasingly common in industry software and enables developers to build better applications, at a higher pace, and with better security. These advantages also come with the cost of including vulnerabilities through these third-party libraries. The largest publicly available database of easily machine-readable vulnerabilities is the National Vulnerability Database (NVD). However, reporting to this database is a human-dependent process, and it fails to provide an acceptable coverage of all open source vulnerabilities. We propose the use of semi-supervised machine learning to classify issues as security-related to provide additional vulnerabilities in an automated pipeline. Our models, based on a Hierarchical Attention Network (HAN), outperform previously proposed models on our manually labelled test dataset, with an F1 score of 71%. Based on the results and the vast number of GitHub issues, our model potentially identifies about 191 036 security-related issues with prediction power over 80%.

Please use this url to cite or link to this publication: https://lup.lub.lu.se/record/93a45265-e5cb-4d38-a440-dcd41a07552a

author

Wåreus, Emil; Duppils, Anton; Tullberg, Magnus and Hell, Martin

organization

publishing date

2022

type

Chapter in Book/Report/Conference proceeding

publication status

published

subject

Computer Sciences

host publication

8th International Conference on Information Systems Security and Privacy, ICISSP 2022

pages

84 - 95

publisher

SciTePress

external identifiers

scopus:85176328863

ISBN

978-989-758-553-1

DOI

10.5220/0010813000003120

project

Säkra mjukvaruuppdateringar för den smarta staden (Secure software updates for the smart city)

language

English

LU publication?

yes

id

93a45265-e5cb-4d38-a440-dcd41a07552a

date added to LUP

2022-04-04 14:51:47

date last changed

2025-10-14 11:17:29

@inproceedings{93a45265-e5cb-4d38-a440-dcd41a07552a,
  abstract     = {{Open-Source Software (OSS) is increasingly common in industry software and enables developers to build better applications, at a higher pace, and with better security. These advantages also come with the cost of including vulnerabilities through these third-party libraries. The largest publicly available database of easily machine-readable vulnerabilities is the National Vulnerability Database (NVD). However, reporting to this database is a human-dependent process, and it fails to provide an acceptable coverage of all open source vulnerabilities. We propose the use of semi-supervised machine learning to classify issues as security-related to provide additional vulnerabilities in an automated pipeline. Our models, based on a Hierarchical Attention Network (HAN), outperform previously proposed models on our manually labelled test dataset, with an F1 score of 71%. Based on the results and the vast number of GitHub issues, our model potentially identifies about 191 036 security-related issues with prediction power over 80%.}},
  author       = {{Wåreus, Emil and Duppils, Anton and Tullberg, Magnus and Hell, Martin}},
  booktitle    = {{8th International Conference on Information Systems Security and Privacy, ICISSP 2022}},
  isbn         = {{978-989-758-553-1}},
  language     = {{eng}},
  pages        = {{84--95}},
  publisher    = {{SciTePress}},
  title        = {{Security Issue Classification for Vulnerability Management with Semi-supervised Learning}},
  url          = {{http://dx.doi.org/10.5220/0010813000003120}},
  doi          = {{10.5220/0010813000003120}},
  year         = {{2022}},
}
