An Exploratory Study on Machine Learning to Combine Security Vulnerability Alerts from Static Analysis Tools

Pereira, Jose D'Abruzzo; Campos, João R.; Vieira, Marco

Bibliographic details
Main author:	Pereira, Jose D'Abruzzo
Publication date:	2019
Other authors:	Campos, João R.; Vieira, Marco
Document type:	Article
Language:	eng
Source title:	Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
Full text:	https://hdl.handle.net/10316/117479 https://doi.org/10.1109/LADC48089.2019.8995685
Abstract:	Due to time-to-market needs and cost of manual validation techniques, software systems are often deployed with vulnerabilities that may be exploited to gain illegitimate access/control, ultimately resulting in non-negligible consequences. Static Analysis Tools (SATs) are widely used for vulnerability detection, where the source code is analyzed without executing it. However, the performance of SATs varies considerably and a high detection rate usually comes with significant false alarms. Recent studies considered combining various SATs to improve the overall detection ability, but they do not allow exploring different performance trade-offs, as basic and rigid rules are normally followed. Machine Learning (ML) algorithms have shown promising results in several complex problems, due to their ability to fit specific needs. This paper presents an exploratory study on the combination of the output of SATs through ML algorithms to improve vulnerability detection while trying to reduce false alarms. The dataset consists of SQL Injection (SQLi) and Cross-Site Scripting (XSS) vulnerabilities detected by five different SATs in a large set of WordPress plugins developed in PHP. Results show that, for the case of SQLi, a false alarm reduction is possible without compromising the vulnerabilities detected, and that using ML allows trade-offs (e.g., reduction in false alarms at the expense of a few vulnerabilities) that are not possible with existing techniques. The paper also proposes a regression-based approach for ranking source code files considering estimates of vulnerabilities computed using the output of SATs. Results show that the approach allows creating a ranking of the source code files that largely overlaps the real ranking (based on real known vulnerabilities).
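
The contrast the abstract draws between rigid combination rules and a learned combiner can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy alert matrix, the labels, the function names, and the choice of a small hand-written logistic regression are not taken from the paper, which does not name its algorithms in this record. The model is also trained and evaluated on the same eight rows, so the numbers say nothing about real-world performance.

```python
import math

# Hypothetical toy data (not from the paper): each row holds the binary
# output of five SATs for one code location (1 = alert raised).
ALERTS = [
    (1, 1, 0, 1, 0), (1, 0, 0, 1, 0), (0, 0, 1, 0, 1), (0, 0, 1, 0, 0),
    (1, 1, 1, 1, 0), (0, 0, 0, 0, 1), (1, 0, 0, 0, 0), (0, 1, 0, 1, 0),
]
# Hypothetical ground truth: 1 = real vulnerability, 0 = false alarm.
LABELS = [1, 1, 0, 0, 1, 0, 0, 1]


def one_out_of_n(row):
    """Rigid rule: flag the location if any tool raised an alert."""
    return int(sum(row) >= 1)


def majority(row):
    """Rigid rule: flag the location if more than half of the tools agree."""
    return int(sum(row) > len(row) / 2)


def predict_proba(weights, bias, row):
    """Logistic model: estimated probability that the location is vulnerable."""
    z = bias + sum(w * x for w, x in zip(weights, row))
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(rows, labels, epochs=2000, lr=0.1):
    """Fit one weight per tool with plain stochastic gradient descent."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for row, y in zip(rows, labels):
            err = predict_proba(weights, bias, row) - y
            weights = [w - lr * err * x for w, x in zip(weights, row)]
            bias -= lr * err
    return weights, bias


def confusion(predictions, labels):
    """Return (detected vulnerabilities, false alarms)."""
    tp = sum(1 for p, y in zip(predictions, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(predictions, labels) if p == 1 and y == 0)
    return tp, fp


WEIGHTS, BIAS = train_logistic(ALERTS, LABELS)


def learned(row, threshold=0.5):
    """Learned combiner: the threshold is the knob for the trade-off."""
    return int(predict_proba(WEIGHTS, BIAS, row) >= threshold)


if __name__ == "__main__":
    for name, rule in [("1-out-of-n", one_out_of_n), ("majority", majority),
                       ("learned", learned)]:
        tp, fp = confusion([rule(r) for r in ALERTS], LABELS)
        print(f"{name}: detected={tp} false_alarms={fp}")
```

Lowering the threshold passed to `learned` can only flag more locations and raising it can only flag fewer, which is the kind of adjustable trade-off that fixed voting rules do not offer.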

Item metadata

id	RCAP_9ec7f1ad151856a77a682f0281a5511a
oai_identifier_str	oai:estudogeral.uc.pt:10316/117479
network_acronym_str	RCAP
network_name_str	Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository_id_str	https://opendoar.ac.uk/repository/7160
funding	This work was partially funded by FCT grant no. SFRH/BD/140221/2018, project ATMOSPHERE, funded by the European Commission under the Cooperation Programme, H2020 grant agreement no. 777154, and project METRICS, funded by the FCT, agreement no. POCI-01-0145-FEDER-032504.
dc.title.none.fl_str_mv	An Exploratory Study on Machine Learning to Combine Security Vulnerability Alerts from Static Analysis Tools
author	Pereira, Jose D'Abruzzo
author_facet	Pereira, Jose D'Abruzzo; Campos, João R.; Vieira, Marco
author_role	author
author2	Campos, João R.; Vieira, Marco
author2_role	author; author
dc.contributor.author.fl_str_mv	Pereira, Jose D'Abruzzo; Campos, João R.; Vieira, Marco
dc.subject.por.fl_str_mv	Security; Vulnerability Detection; Static Code Analysis; Machine Learning
topic	Security; Vulnerability Detection; Static Code Analysis; Machine Learning
publishDate	2019
dc.date.none.fl_str_mv	2019
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv	info:eu-repo/semantics/article
format	article
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	https://hdl.handle.net/10316/117479 https://doi.org/10.1109/LADC48089.2019.8995685
url	https://hdl.handle.net/10316/117479 https://doi.org/10.1109/LADC48089.2019.8995685
dc.language.iso.fl_str_mv	eng
language	eng
dc.relation.none.fl_str_mv	978-1-7281-6622-3 https://ieeexplore.ieee.org/document/8995685
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.publisher.none.fl_str_mv	IEEE
publisher.none.fl_str_mv	IEEE
dc.source.none.fl_str_mv	reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP); instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia; instacron:RCAAP
instname_str	FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron_str	RCAAP
institution	RCAAP
reponame_str	Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
collection	Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository.name.fl_str_mv	Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
repository.mail.fl_str_mv	info@rcaap.pt
_version_	1833602607470346240
