A performance comparison of oversampling methods for data generation in imbalanced learning tasks
Main Author: | Dattagupta, Samrat Jayanta |
---|---|
Publication Date: | 2018 |
Format: | Master thesis |
Language: | eng |
Source: | Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
Download full: | http://hdl.handle.net/10362/31307 |
Summary: | Dissertation presented as a partial requirement for obtaining a Master's degree in Statistics and Information Management, specialization in Marketing Research and CRM |
id |
RCAP_3d32ad50b74eee34bcb985f1eecfe3bc |
---|---|
oai_identifier_str |
oai:run.unl.pt:10362/31307 |
network_acronym_str |
RCAP |
network_name_str |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
repository_id_str |
https://opendoar.ac.uk/repository/7160 |
spelling |
A performance comparison of oversampling methods for data generation in imbalanced learning tasks
Imbalanced learning; Oversampling methods; Evaluation metrics; Classifier performance
Dissertation presented as a partial requirement for obtaining a Master's degree in Statistics and Information Management, specialization in Marketing Research and CRM
The class imbalance problem is one of the most fundamental challenges faced by the machine learning community. The imbalance refers to the number of instances in the class of interest being relatively low compared to the rest of the data. Sampling is a common technique for dealing with this problem. A number of oversampling approaches have been applied in an attempt to balance the classes. This study provides an overview of the issue of class imbalance and examines some common oversampling approaches for dealing with it. To illustrate the differences, an experiment is conducted using multiple simulated data sets to compare the performance of these oversampling methods on different classifiers based on various evaluation criteria. In addition, the effect of different parameters, such as the number of features and the imbalance ratio, on classifier performance is also evaluated.
Bação, Fernando José Ferreira Lucas; Douzas, Georgios; RUN; Dattagupta, Samrat Jayanta
2018-02-26T16:07:15Z; 2018-02-02; 2018-02-02T00:00:00Z; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/masterThesis; application/pdf; http://hdl.handle.net/10362/31307; TID:201851938; eng; info:eu-repo/semantics/openAccess
reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP); instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia; instacron:RCAAP
2024-05-22T17:30:56Z; oai:run.unl.pt:10362/31307; Portal Agregador; ONG; https://www.rcaap.pt/oai/openaire; info@rcaap.pt; opendoar:https://opendoar.ac.uk/repository/7160; 2025-05-28T17:02:06.228928; Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia; false |
dc.title.none.fl_str_mv |
A performance comparison of oversampling methods for data generation in imbalanced learning tasks |
title |
A performance comparison of oversampling methods for data generation in imbalanced learning tasks |
spellingShingle |
A performance comparison of oversampling methods for data generation in imbalanced learning tasks; Dattagupta, Samrat Jayanta; Imbalanced learning; Oversampling methods; Evaluation metrics; Classifier performance |
title_short |
A performance comparison of oversampling methods for data generation in imbalanced learning tasks |
title_full |
A performance comparison of oversampling methods for data generation in imbalanced learning tasks |
title_fullStr |
A performance comparison of oversampling methods for data generation in imbalanced learning tasks |
title_full_unstemmed |
A performance comparison of oversampling methods for data generation in imbalanced learning tasks |
title_sort |
A performance comparison of oversampling methods for data generation in imbalanced learning tasks |
author |
Dattagupta, Samrat Jayanta |
author_facet |
Dattagupta, Samrat Jayanta |
author_role |
author |
dc.contributor.none.fl_str_mv |
Bação, Fernando José Ferreira Lucas; Douzas, Georgios; RUN |
dc.contributor.author.fl_str_mv |
Dattagupta, Samrat Jayanta |
dc.subject.por.fl_str_mv |
Imbalanced learning; Oversampling methods; Evaluation metrics; Classifier performance |
topic |
Imbalanced learning; Oversampling methods; Evaluation metrics; Classifier performance |
description |
Dissertation presented as a partial requirement for obtaining a Master's degree in Statistics and Information Management, specialization in Marketing Research and CRM |
publishDate |
2018 |
dc.date.none.fl_str_mv |
2018-02-26T16:07:15Z; 2018-02-02; 2018-02-02T00:00:00Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/masterThesis |
format |
masterThesis |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://hdl.handle.net/10362/31307; TID:201851938 |
url |
http://hdl.handle.net/10362/31307 |
identifier_str_mv |
TID:201851938 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.source.none.fl_str_mv |
reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP); instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia; instacron:RCAAP |
instname_str |
FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia |
instacron_str |
RCAAP |
institution |
RCAAP |
reponame_str |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
collection |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) |
repository.name.fl_str_mv |
Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia |
repository.mail.fl_str_mv |
info@rcaap.pt |
_version_ |
1833596386815246336 |