Comparing the performance of oversampling techniques for imbalanced learning in insurance fraud detection
Main author: | Moreno, María Fernanda Osorio |
---|---|
Publication date: | 2018 |
Document type: | Dissertation |
Language: | eng |
Source title: | Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
Full text: | http://hdl.handle.net/10362/33863 |
Abstract: | Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics |
id |
RCAP_a1aa338ceba38709210833f21c7aee39 |
---|---|
oai_identifier_str |
oai:run.unl.pt:10362/33863 |
network_acronym_str |
RCAP |
network_name_str |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
repository_id_str |
https://opendoar.ac.uk/repository/7160 |
spelling |
Comparing the performance of oversampling techniques for imbalanced learning in insurance fraud detection. Imbalanced datasets; Fraud; oversampling; Insurance. Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics. Although data is now produced in enormous volumes every second, there are situations in which the target category is represented extremely unequally, giving rise to imbalanced datasets. Analyzing them correctly can lead to relevant decisions that produce appropriate business strategies. Fraud modeling is one example: fewer fraudulent transactions are expected than legitimate ones, and predicting them can be crucial for improving decisions and processes in a company. However, class imbalance has a negative effect on the traditional techniques used for this problem. Many techniques have been proposed to address it, and oversampling is one of them. This work analyses the behavior of different oversampling techniques, such as Random oversampling, SOMO and SMOTE, across different classifiers and evaluation metrics. The exercise uses real data from an insurance company in Colombia to predict fraudulent claims for its compulsory auto product.
The conclusions of this research demonstrate the advantages of using oversampling under class imbalance, but also the importance of comparing different evaluation metrics and classifiers to obtain accurate, appropriate conclusions and comparable results. Bação, Fernando José Ferreira Lucas; RUN; Moreno, María Fernanda Osorio; 2018-04-05T13:24:16Z; 2018-03-26; 2018-03-26T00:00:00Z; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/masterThesis; application/pdf; http://hdl.handle.net/10362/33863; TID:201894289; eng; info:eu-repo/semantics/openAccess; reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP); instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia; instacron:RCAAP; 2024-05-22T17:31:43Z; oai:run.unl.pt:10362/33863; Portal Agregador; ONG; https://www.rcaap.pt/oai/openaire; info@rcaap.pt; opendoar:https://opendoar.ac.uk/repository/7160; 2025-05-28T17:03:00.594509; Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia; false |
dc.title.none.fl_str_mv |
Comparing the performance of oversampling techniques for imbalanced learning in insurance fraud detection |
title |
Comparing the performance of oversampling techniques for imbalanced learning in insurance fraud detection |
spellingShingle |
Comparing the performance of oversampling techniques for imbalanced learning in insurance fraud detection; Moreno, María Fernanda Osorio; Imbalanced datasets; Fraud; oversampling; Insurance |
author |
Moreno, María Fernanda Osorio |
author_facet |
Moreno, María Fernanda Osorio |
author_role |
author |
dc.contributor.none.fl_str_mv |
Bação, Fernando José Ferreira Lucas; RUN |
dc.contributor.author.fl_str_mv |
Moreno, María Fernanda Osorio |
dc.subject.por.fl_str_mv |
Imbalanced datasets; Fraud; oversampling; Insurance |
topic |
Imbalanced datasets; Fraud; oversampling; Insurance |
description |
Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics |
publishDate |
2018 |
dc.date.none.fl_str_mv |
2018-04-05T13:24:16Z 2018-03-26 2018-03-26T00:00:00Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
format |
masterThesis |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10362/33863 TID:201894289 |
url |
http://hdl.handle.net/10362/33863 |
identifier_str_mv |
TID:201894289 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.source.none.fl_str_mv |
reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia instacron:RCAAP |
instname_str |
FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
collection |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
repository.name.fl_str_mv |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia |
repository.mail.fl_str_mv |
info@rcaap.pt |
_version_ |
1833596395014062080 |
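
The abstract in this record names random oversampling as one of the techniques compared. As a minimal sketch of that idea only (not the dissertation's implementation), the Python below duplicates randomly chosen minority-class rows until every class matches the majority count. The function name `random_oversample`, the `seed` parameter, and the toy claims data are all hypothetical and chosen for illustration. SMOTE differs in that it synthesizes new minority points by interpolating between neighbours rather than copying existing rows.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Return copies of X and y with minority classes padded by random duplicates.

    Hypothetical helper for illustration; every class ends up with as many
    rows as the largest class in y.
    """
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())  # size of the majority class

    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        # Rows that belong to this class in the original data.
        rows = [x for x, lab in zip(X, y) if lab == label]
        # Append random duplicates until the class reaches the target size.
        for _ in range(target - n):
            X_out.append(rng.choice(rows))
            y_out.append(label)
    return X_out, y_out


# Toy data (assumed): 8 legitimate claims (label 0) and 2 fraudulent ones (label 1).
X = [[i, i * 2] for i in range(10)]
y = [0] * 8 + [1] * 2

X_res, y_res = random_oversample(X, y)
print(Counter(y_res))
```

On the toy data both classes end up with 8 rows, and every added row is a copy of one of the two original fraudulent rows. In practice the resampling is usually applied to the training split only, so that duplicated rows do not leak into the evaluation data.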