Class Incremental Deep Learning: A Computational Scheme to Avoid Catastrophic Forgetting in Domain Generation Algorithm Multiclass Classification
| Main Author: | Gregório, João Rafael [UNESP] |
|---|---|
| Publication Date: | 2024 |
| Other Authors: | Cansian, Adriano Mauro [UNESP]; Neves, Leandro Alves [UNESP] |
| Format: | Article |
| Language: | eng |
| Source: | Repositório Institucional da UNESP |
| Download full: | http://dx.doi.org/10.3390/app14167244 https://hdl.handle.net/11449/306709 |
| Summary: | Domain Generation Algorithms (DGAs) are present in most malware used by botnets and advanced persistent threats. These algorithms dynamically generate domain names to maintain and obfuscate communication between the infected device and the attacker’s command and control server. Because many threats use DGAs, it is extremely important to classify a given DGA according to the threat it is related to. In addition, as new threats emerge daily, classifier models tend to become obsolete over time. Deep neural networks tend to lose their classification ability when retrained with a dataset that is significantly different from the initial one, a phenomenon known as catastrophic forgetting. This work presents a computational scheme that combines a deep learning model based on a CNN and natural language processing with a class-incremental learning technique based on transfer learning. The scheme classifies 60 DGA families and adds a new family to the classifier model by training it incrementally with some examples from known families, which avoids catastrophic forgetting and maintains metric levels. The proposed methodology achieved an average precision of 86.75%, an average recall of 83.06%, and an average F1 score of 83.78% with the full dataset, and suffered minimal losses when applying the class increment. |
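The summary reports averaged precision, recall, and F1 but this record does not say how the averages were taken. A minimal sketch, assuming per-family (macro) averaging, shows why the averaged F1 (83.78%) need not equal the harmonic mean of the averaged precision and recall (about 84.9% for 86.75% and 83.06%). The function name `macro_scores` and the toy labels are hypothetical and are not taken from the article.

```python
from collections import Counter

# Toy labels for three hypothetical DGA families (not data from the article).
Y_TRUE = ["a", "a", "a", "a", "b", "b", "c", "c"]
Y_PRED = ["a", "a", "a", "b", "b", "b", "c", "a"]


def macro_scores(y_true, y_pred):
    """Return (precision, recall, f1), each averaged over the classes in y_true.

    Assumption: macro averaging, i.e. every class counts equally regardless
    of how many samples it has. Classes that appear only in y_pred are ignored.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    precisions, recalls, f1s = [], [], []
    for c in sorted(set(y_true)):
        p = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        r = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    n = len(f1s)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


if __name__ == "__main__":
    p, r, f1 = macro_scores(Y_TRUE, Y_PRED)
    harmonic = 2 * p * r / (p + r)
    print(f"macro P={p:.4f} R={r:.4f} F1={f1:.4f} harmonic(P,R)={harmonic:.4f}")
```

Under this assumption F1 is computed per family and then averaged, so it generally differs from the harmonic mean of the two averaged values.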
| id |
UNSP_a07b8f6e28591781510e1ac9a8527068 |
|---|---|
| oai_identifier_str |
oai:repositorio.unesp.br:11449/306709 |
| network_acronym_str |
UNSP |
| network_name_str |
Repositório Institucional da UNESP |
| repository_id_str |
2946 |
| spelling |
Class Incremental Deep Learning: A Computational Scheme to Avoid Catastrophic Forgetting in Domain Generation Algorithm Multiclass Classification; botnets; cybersecurity; deep learning; DGA; incremental learning; multiclass classification; Department of Computer Science and Statistics (DCCE), São Paulo State University (UNESP), São Paulo; Universidade Estadual Paulista (UNESP); Gregório, João Rafael [UNESP]; Cansian, Adriano Mauro [UNESP]; Neves, Leandro Alves [UNESP]; 2025-04-29T20:06:56Z; 2024-08-01; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/article; http://dx.doi.org/10.3390/app14167244; Applied Sciences (Switzerland), v. 14, n. 16, 2024; 2076-3417; https://hdl.handle.net/11449/306709; 10.3390/app14167244; 2-s2.0-85202448519; Scopus; reponame:Repositório Institucional da UNESP; instname:Universidade Estadual Paulista (UNESP); instacron:UNESP; eng; Applied Sciences (Switzerland); info:eu-repo/semantics/openAccess; 2025-04-30T14:37:07Z; oai:repositorio.unesp.br:11449/306709; Repositório Institucional; PUB; http://repositorio.unesp.br/oai/request; repositoriounesp@unesp.br; opendoar:2946; 2025-04-30T14:37:07; Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP); false |
| dc.title.none.fl_str_mv |
Class Incremental Deep Learning: A Computational Scheme to Avoid Catastrophic Forgetting in Domain Generation Algorithm Multiclass Classification |
| author |
Gregório, João Rafael [UNESP] |
| author_facet |
Gregório, João Rafael [UNESP]; Cansian, Adriano Mauro [UNESP]; Neves, Leandro Alves [UNESP] |
| author_role |
author |
| author2 |
Cansian, Adriano Mauro [UNESP]; Neves, Leandro Alves [UNESP] |
| author2_role |
author; author |
| dc.contributor.none.fl_str_mv |
Universidade Estadual Paulista (UNESP) |
| dc.contributor.author.fl_str_mv |
Gregório, João Rafael [UNESP]; Cansian, Adriano Mauro [UNESP]; Neves, Leandro Alves [UNESP] |
| dc.subject.por.fl_str_mv |
botnets; cybersecurity; deep learning; DGA; incremental learning; multiclass classification |
| publishDate |
2024 |
| dc.date.none.fl_str_mv |
2024-08-01; 2025-04-29T20:06:56Z |
| dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
| dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
| format |
article |
| status_str |
publishedVersion |
| dc.identifier.uri.fl_str_mv |
http://dx.doi.org/10.3390/app14167244; Applied Sciences (Switzerland), v. 14, n. 16, 2024; 2076-3417; https://hdl.handle.net/11449/306709; 10.3390/app14167244; 2-s2.0-85202448519 |
| url |
http://dx.doi.org/10.3390/app14167244 https://hdl.handle.net/11449/306709 |
| identifier_str_mv |
Applied Sciences (Switzerland), v. 14, n. 16, 2024; 2076-3417; 10.3390/app14167244; 2-s2.0-85202448519 |
| dc.language.iso.fl_str_mv |
eng |
| language |
eng |
| dc.relation.none.fl_str_mv |
Applied Sciences (Switzerland) |
| dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
| eu_rights_str_mv |
openAccess |
| dc.source.none.fl_str_mv |
Scopus; reponame:Repositório Institucional da UNESP; instname:Universidade Estadual Paulista (UNESP); instacron:UNESP |
| instname_str |
Universidade Estadual Paulista (UNESP) |
| instacron_str |
UNESP |
| institution |
UNESP |
| reponame_str |
Repositório Institucional da UNESP |
| collection |
Repositório Institucional da UNESP |
| repository.name.fl_str_mv |
Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP) |
| repository.mail.fl_str_mv |
repositoriounesp@unesp.br |