Uma Análise comparativa de técnicas de subamostragem para projetos de sistemas de detecção de intrusão em redes de computadores

Detalhes bibliográficos
Ano de defesa: 2020
Autor(a) principal: Silva, Bruno Riccelli dos Santos
Orientador(a): Não Informado pela instituição
Banca de defesa: Não Informado pela instituição
Tipo de documento: Dissertação
Tipo de acesso: Acesso aberto
Idioma: por
Instituição de defesa: Não Informado pela instituição
Programa de Pós-Graduação: Não Informado pela instituição
Departamento: Não Informado pela instituição
País: Não Informado pela instituição
Palavras-chave em Português:
Link de acesso: http://www.repositorio.ufc.br/handle/riufc/52808
Resumo: Intrusion Detection Systems (IDS) figure as one of the leading solutions adopted in the area of network security to prevent network intrusion and ensure the security of data and services. However, this type of problem requires IDS to be assertive and efficient concerning processing time. Undersampling techniques allow classifiers to be evaluated from smaller sub-databases in a representative manner, seeking better assertiveness in less processing time. Some works in the literature present this kind of solution in the IDS project, but criteria such as the adoption of a replicable methodology, are generally not respected. Three sub-sampling methodologies were selected: random selection, by Cluster centroids and Nearmiss in two recent databases (CICIDS 2017 and CICIDS 2018) and comparison purposes between the classifiers. Thus, based on the results obtained and on the criteria adopted for the choice of classifiers, in the complete CIC2017 and CIC2018 databases, the random forest classifier obtains the best results. As for the sub-base generated, from the CIC2017 database, by the random under-sampling, the KNN classifier was considered the best for its average metrics of accuracy, efficiency, and training time. In the sub-base using the Cluster centroids under-sampling technique, generated from CIC2018, the classifier Naive Bayes gets the best results. As for the subbases generated from CIC2017 and CIC2018, using the NearMiss sub-sampling technique, the best classifiers, for their average metrics of accuracy, efficiency and training time, were KNN and Naive Bayes, respectively. Also, the results indicate that the sub-sampling by Cluster centroids presents the best performance when applied to classifiers based on distance, it follows that the technique of under-sampling influences the process of choosing the best classifier in the design of an Intrusion Detection Systems.