Phishing website detection using genetic algorithm-based feature selection and parameter hypertuning

Bibliographic Details
Main Author: Silva, Ana Sofia Pulquério
Publication Date: 2023
Format: Master thesis
Language: eng
Source: Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
Download full: http://hdl.handle.net/10362/152538
Summary: Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Business Analytics
id RCAP_4a339973e22f02e8a4ecde045c4bb9d1
oai_identifier_str oai:run.unl.pt:10362/152538
network_acronym_str RCAP
network_name_str Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository_id_str https://opendoar.ac.uk/repository/7160
spelling Phishing website detection using genetic algorithm-based feature selection and parameter hypertuning
Phishing
Artificial Intelligence
Machine Learning
Deep Learning
Evolutionary Algorithms
Genetic Algorithms
Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Business Analytics
False webpages are created by cyber attackers who seek to mislead users into revealing sensitive and personal information, from credit card details to passwords. Phishing is a class of cyber attacks that mislead users into clicking on false websites and logging into related accounts, after which the attackers steal funds. These attacks increase annually as the number of e-commerce customers grows exponentially, which makes it difficult to distinguish between harmless and false websites. The conventional methods for detecting phishing websites rely on databases of blacklisted and whitelisted websites. Such methods are not capable of detecting new phishing websites. To solve this problem, researchers are developing methods based on machine learning (ML) and deep learning (DL). In this dissertation, a hybrid solution is proposed that uses genetic algorithms and ML algorithms for phishing detection based on the URL of the website. Regarding evaluation, comparisons between conventional ML and DL models are performed using various feature sets resulting from commonly used feature selection methods, such as mutual information and recursive feature elimination. This dissertation proposes a final model with an accuracy of 95.34% on the test set.
Henriques, Roberto André Pereira
RUN
Silva, Ana Sofia Pulquério
2023-05-09T17:21:33Z
2023-04-10
2023-04-10T00:00:00Z
info:eu-repo/semantics/publishedVersion
info:eu-repo/semantics/masterThesis
application/pdf
http://hdl.handle.net/10362/152538
TID:203286367
eng
info:eu-repo/semantics/openAccess
reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron:RCAAP
2024-05-22T18:11:16Z
oai:run.unl.pt:10362/152538
Portal Agregador
ONG
https://www.rcaap.pt/oai/openaire
info@rcaap.pt
opendoar:https://opendoar.ac.uk/repository/7160
2025-05-28T17:41:30.760617
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
false
dc.title.none.fl_str_mv Phishing website detection using genetic algorithm-based feature selection and parameter hypertuning
title Phishing website detection using genetic algorithm-based feature selection and parameter hypertuning
author Silva, Ana Sofia Pulquério
author_facet Silva, Ana Sofia Pulquério
author_role author
dc.contributor.none.fl_str_mv Henriques, Roberto André Pereira
RUN
dc.contributor.author.fl_str_mv Silva, Ana Sofia Pulquério
dc.subject.por.fl_str_mv Phishing
Artificial Intelligence
Machine Learning
Deep Learning
Evolutionary Algorithms
Genetic Algorithms
topic Phishing
Artificial Intelligence
Machine Learning
Deep Learning
Evolutionary Algorithms
Genetic Algorithms
description Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Business Analytics
publishDate 2023
dc.date.none.fl_str_mv 2023-05-09T17:21:33Z
2023-04-10
2023-04-10T00:00:00Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10362/152538
TID:203286367
url http://hdl.handle.net/10362/152538
identifier_str_mv TID:203286367
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron:RCAAP
instname_str FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron_str RCAAP
institution RCAAP
reponame_str Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
collection Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository.name.fl_str_mv Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
repository.mail.fl_str_mv info@rcaap.pt