Security Issue Classification for Vulnerability Management with Semi-supervised Learning
(2022) p.84-95
- Abstract
- Open-Source Software (OSS) is increasingly common in industry software and enables developers to build better applications, at a higher pace, and with better security. These advantages also come with the cost of including vulnerabilities through these third-party libraries. The largest publicly available database of easily machine-readable vulnerabilities is the National Vulnerability Database (NVD). However, reporting to this database is a human-dependent process, and it fails to provide an acceptable coverage of all open source vulnerabilities. We propose the use of semi-supervised machine learning to classify issues as security-related to provide additional vulnerabilities in an automated pipeline. Our models, based on a Hierarchical Attention Network (HAN), outperform previously proposed models on our manually labelled test dataset, with an F1 score of 71%. Based on the results and the vast number of GitHub issues, our model potentially identifies about 191 036 security-related issues with prediction power over 80%.
Please use this url to cite or link to this publication:
https://lup.lub.lu.se/record/93a45265-e5cb-4d38-a440-dcd41a07552a
- author
- Wåreus, Emil LU ; Duppils, Anton ; Tullberg, Magnus and Hell, Martin LU
- publishing date
- 2022
- type
- Chapter in Book/Report/Conference proceeding
- publication status
- published
- host publication
- 8th International Conference on Information Systems Security and Privacy, ICISSP 2022
- pages
- 84 - 95
- publisher
- SciTePress
- external identifiers
- scopus:85176328863
- ISBN
- 978-989-758-553-1
- DOI
- 10.5220/0010813000003120
- project
- Säkra mjukvaruuppdateringar för den smarta staden
- language
- English
- LU publication?
- yes
- id
- 93a45265-e5cb-4d38-a440-dcd41a07552a
- date added to LUP
- 2022-04-04 14:51:47
- date last changed
- 2024-01-10 13:23:49
@inproceedings{93a45265-e5cb-4d38-a440-dcd41a07552a,
  abstract     = {{Open-Source Software (OSS) is increasingly common in industry software and enables developers to build better applications, at a higher pace, and with better security. These advantages also come with the cost of including vulnerabilities through these third-party libraries. The largest publicly available database of easily machine-readable vulnerabilities is the National Vulnerability Database (NVD). However, reporting to this database is a human-dependent process, and it fails to provide an acceptable coverage of all open source vulnerabilities. We propose the use of semi-supervised machine learning to classify issues as security-related to provide additional vulnerabilities in an automated pipeline. Our models, based on a Hierarchical Attention Network (HAN), outperform previously proposed models on our manually labelled test dataset, with an F1 score of 71%. Based on the results and the vast number of GitHub issues, our model potentially identifies about 191 036 security-related issues with prediction power over 80%.}},
  author       = {{Wåreus, Emil and Duppils, Anton and Tullberg, Magnus and Hell, Martin}},
  booktitle    = {{8th International Conference on Information Systems Security and Privacy, ICISSP 2022}},
  isbn         = {{978-989-758-553-1}},
  language     = {{eng}},
  pages        = {{84--95}},
  publisher    = {{SciTePress}},
  title        = {{Security Issue Classification for Vulnerability Management with Semi-supervised Learning}},
  url          = {{http://dx.doi.org/10.5220/0010813000003120}},
  doi          = {{10.5220/0010813000003120}},
  year         = {{2022}},
}