A performance comparison of oversampling methods for data generation in imbalanced learning tasks

Bibliographic Details
Main Author: Dattagupta, Samrat Jayanta
Publication Date: 2018
Format: Master thesis
Language: eng
Source: Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
Download full: http://hdl.handle.net/10362/31307
Summary: Dissertation presented as a partial requirement for obtaining a Master's degree in Statistics and Information Management, specialization in Marketing Research and CRM
id RCAP_3d32ad50b74eee34bcb985f1eecfe3bc
oai_identifier_str oai:run.unl.pt:10362/31307
network_acronym_str RCAP
network_name_str Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository_id_str https://opendoar.ac.uk/repository/7160
spelling A performance comparison of oversampling methods for data generation in imbalanced learning tasks
Imbalanced learning
Oversampling methods
Evaluation metrics
Classifier performance
Dissertation presented as a partial requirement for obtaining a Master's degree in Statistics and Information Management, specialization in Marketing Research and CRM
The class imbalance problem is one of the most fundamental challenges faced by the machine learning community. The imbalance refers to the number of instances in the class of interest being relatively low compared to the rest of the data. Sampling is a common technique for dealing with this problem, and a number of oversampling approaches have been applied in an attempt to balance the classes. This study provides an overview of the issue of class imbalance and examines some common oversampling approaches for dealing with it. To illustrate the differences, an experiment is conducted using multiple simulated data sets to compare the performance of these oversampling methods on different classifiers based on various evaluation criteria. In addition, the effect of different parameters, such as the number of features and the imbalance ratio, on classifier performance is also evaluated.
Bação, Fernando José Ferreira Lucas
Douzas, Georgios
RUN
Dattagupta, Samrat Jayanta
2018-02-26T16:07:15Z
2018-02-02
2018-02-02T00:00:00Z
info:eu-repo/semantics/publishedVersion
info:eu-repo/semantics/masterThesis
application/pdf
http://hdl.handle.net/10362/31307
TID:201851938
eng
info:eu-repo/semantics/openAccess
reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron:RCAAP
2024-05-22T17:30:56Z
oai:run.unl.pt:10362/31307
Portal Agregador
ONG
https://www.rcaap.pt/oai/openaire
info@rcaap.pt
opendoar:https://opendoar.ac.uk/repository/7160
2025-05-28T17:02:06.228928
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
false
dc.title.none.fl_str_mv A performance comparison of oversampling methods for data generation in imbalanced learning tasks
title A performance comparison of oversampling methods for data generation in imbalanced learning tasks
spellingShingle A performance comparison of oversampling methods for data generation in imbalanced learning tasks
Dattagupta, Samrat Jayanta
Imbalanced learning
Oversampling methods
Evaluation metrics
Classifier performance
author Dattagupta, Samrat Jayanta
author_facet Dattagupta, Samrat Jayanta
author_role author
dc.contributor.none.fl_str_mv Bação, Fernando José Ferreira Lucas
Douzas, Georgios
RUN
dc.contributor.author.fl_str_mv Dattagupta, Samrat Jayanta
dc.subject.por.fl_str_mv Imbalanced learning
Oversampling methods
Evaluation metrics
Classifier performance
topic Imbalanced learning
Oversampling methods
Evaluation metrics
Classifier performance
description Dissertation presented as a partial requirement for obtaining a Master's degree in Statistics and Information Management, specialization in Marketing Research and CRM
publishDate 2018
dc.date.none.fl_str_mv 2018-02-26T16:07:15Z
2018-02-02
2018-02-02T00:00:00Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10362/31307
TID:201851938
url http://hdl.handle.net/10362/31307
identifier_str_mv TID:201851938
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron:RCAAP
instname_str FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron_str RCAAP
institution RCAAP
reponame_str Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
collection Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository.name.fl_str_mv Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
repository.mail.fl_str_mv info@rcaap.pt
_version_ 1833596386815246336