Explainable machine learning for malware detection on Android applications

Bibliographic Details
Main Author: Palma, Catarina
Publication Date: 2024
Other Authors: J. Ferreira, Artur; Figueiredo, Mário
Format: Article
Language: eng
Source: Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
Download full: http://hdl.handle.net/10400.21/17228
Summary: Malicious software (malware) in Android applications (apps) can have harmful or irreparable consequences for the user and/or the device. Despite the protections that app stores provide against malware, it keeps growing in sophistication and diffusion. In this paper, we explore the use of machine learning (ML) techniques to detect malware in Android apps. We study different data pre-processing, dimensionality reduction, and classification techniques, and assess the generalization ability of the learned models using public domain datasets and specifically developed apps. We find that the classifiers that perform best on this task are support vector machines (SVM) and random forests (RF). We emphasize the use of feature selection (FS) techniques both to reduce the data dimensionality and to identify the features most relevant to classifying an app as malware, which makes the classification explainable. In particular, we conclude that permissions play a prominent role in Android malware detection. The proposed approach reduces the data dimensionality while achieving high accuracy in identifying malware in Android apps.
id RCAP_92e9314eca6d4c573ac1db3c84dd35cf
oai_identifier_str oai:repositorio.ipl.pt:10400.21/17228
network_acronym_str RCAP
network_name_str Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository_id_str https://opendoar.ac.uk/repository/7160
dc.contributor.none.fl_str_mv RCIPL
dc.contributor.author.fl_str_mv Palma, Catarina
J. Ferreira, Artur
Figueiredo, Mário
dc.subject.por.fl_str_mv android applications
datasets
explainability
feature selection
machine learning
malware detection
numerosity balancing
security
soft computing
supervised learning
publishDate 2024
dc.date.none.fl_str_mv 2024-03-27T16:47:22Z
2024
2024-01-01T00:00:00Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10400.21/17228
dc.language.iso.fl_str_mv eng
dc.relation.none.fl_str_mv 10.3390/info15010025
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv MDPI
dc.source.none.fl_str_mv reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron:RCAAP
repository.name.fl_str_mv Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
repository.mail.fl_str_mv info@rcaap.pt