A comprehensive study among distance measures on supervised optimum-path forest classification
Main Author: | de Rosa, Gustavo H. |
---|---|
Publication Date: | 2024 |
Other Authors: | Roder, Mateus; Passos, Leandro A.; Papa, João Paulo |
Format: | Article |
Language: | eng |
Source: | Repositório Institucional da UNESP |
Download full: | http://dx.doi.org/10.1016/j.asoc.2024.112021; https://hdl.handle.net/11449/306869 |
Summary: | Supervised pattern classification relies on a labeled training set to learn decision boundaries that separate samples from different classes. Such samples can be either weakly or reliably labeled; in the first case, one can employ techniques specifically designed to cope with uncertainty during labeling, and in the second, one can rely on numerous alternatives, including metric learning. Pattern classifiers usually adopt the Euclidean distance to compare samples and assess their proximity, but this implicitly assumes a flat feature space. In some applications, however, samples are embedded in curved spaces, although this is not straightforward to prove. In this manuscript, we assess the performance of the Optimum-Path Forest (OPF) classifier under different distance functions, which are used to weigh the arcs between samples in a graph that encodes the feature space. This work compares 47 distance measures applied to the OPF classifier across 22 datasets, alongside Decision Trees, Logistic Regression, and Support Vector Machines. The experiments show that OPF is user-friendly when handling distance measures and, in some situations, obtains better accuracy than both its standard (Euclidean) counterpart and the classifiers mentioned above. On the other hand, time-consuming distance calculations may affect OPF's efficiency during inference. |
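
The summary describes distance functions being used to weigh the arcs of a graph built over the samples. The following is a minimal sketch of that single idea, not the authors' implementation and not the OPF algorithm itself: it computes the arc weights of a complete graph under interchangeable distance functions. The function names (`arc_weights`, `euclidean`, `manhattan`, `chebyshev`) and the three distances are assumptions chosen for illustration; the record does not list which 47 measures were evaluated.

```python
import math
from typing import Callable, Dict, Sequence, Tuple

Vector = Sequence[float]
Distance = Callable[[Vector, Vector], float]


def euclidean(a: Vector, b: Vector) -> float:
    """Straight-line distance (the usual default for pattern classifiers)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a: Vector, b: Vector) -> float:
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a: Vector, b: Vector) -> float:
    """Largest absolute coordinate difference."""
    return max(abs(x - y) for x, y in zip(a, b))


def arc_weights(
    samples: Sequence[Vector], distance: Distance
) -> Dict[Tuple[int, int], float]:
    """Weigh every arc (i, j), i < j, of the complete graph over `samples`.

    Hypothetical helper: swapping `distance` changes the graph's weights
    without touching any other part of the code.
    """
    weights: Dict[Tuple[int, int], float] = {}
    for i in range(len(samples)):
        for j in range(i + 1, len(samples)):
            weights[(i, j)] = distance(samples[i], samples[j])
    return weights


if __name__ == "__main__":
    points = [(0.0, 0.0), (3.0, 4.0), (6.0, 0.0)]
    for name, fn in (
        ("euclidean", euclidean),
        ("manhattan", manhattan),
        ("chebyshev", chebyshev),
    ):
        print(name, arc_weights(points, fn))
```

Because the distance is passed in as a plain callable, a costlier measure only changes the time spent inside `arc_weights`, which is consistent with the summary's remark that expensive distance calculations can slow inference.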
id |
UNSP_fe34f221ee7c91387c33ccbcf163a9e8 |
---|---|
oai_identifier_str |
oai:repositorio.unesp.br:11449/306869 |
network_acronym_str |
UNSP |
network_name_str |
Repositório Institucional da UNESP |
repository_id_str |
2946 |
spelling |
A comprehensive study among distance measures on supervised optimum-path forest classification; Metric learning; Optimum-path forest; Supervised learning; International Business Machines Corporation; Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP); Department of Computing São Paulo State University, Av. Eng. Luiz Edmundo Carrijo Coube, 14-01; FAPESP: #2013/07375-0; FAPESP: #2014/12236-1; FAPESP: #2019/02205-5; FAPESP: #2019/07665-4; FAPESP: #2020/12101-0; FAPESP: #2023/10823-6; Universidade Estadual Paulista (UNESP); de Rosa, Gustavo H. [UNESP]; Roder, Mateus [UNESP]; Passos, Leandro A. [UNESP]; Papa, João Paulo [UNESP]; 2025-04-29T20:07:31Z; 2024-10-01; info:eu-repo/semantics/publishedVersion; info:eu-repo/semantics/article; http://dx.doi.org/10.1016/j.asoc.2024.112021; Applied Soft Computing, v. 164; 1568-4946; https://hdl.handle.net/11449/306869; 10.1016/j.asoc.2024.112021; 2-s2.0-85199793906; Scopus; reponame:Repositório Institucional da UNESP; instname:Universidade Estadual Paulista (UNESP); instacron:UNESP; eng; Applied Soft Computing; info:eu-repo/semantics/openAccess; 2025-04-30T14:36:56Z; oai:repositorio.unesp.br:11449/306869; Repositório Institucional; PUB; http://repositorio.unesp.br/oai/request; repositoriounesp@unesp.br; opendoar:2946; 2025-04-30T14:36:56; Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP); false |
dc.title.none.fl_str_mv |
A comprehensive study among distance measures on supervised optimum-path forest classification |
spellingShingle |
A comprehensive study among distance measures on supervised optimum-path forest classification; de Rosa, Gustavo H. [UNESP]; Metric learning; Optimum-path forest; Supervised learning |
author |
de Rosa, Gustavo H. [UNESP] |
author_facet |
de Rosa, Gustavo H. [UNESP]; Roder, Mateus [UNESP]; Passos, Leandro A. [UNESP]; Papa, João Paulo [UNESP] |
author_role |
author |
author2 |
Roder, Mateus [UNESP]; Passos, Leandro A. [UNESP]; Papa, João Paulo [UNESP] |
author2_role |
author; author; author |
dc.contributor.none.fl_str_mv |
Universidade Estadual Paulista (UNESP) |
dc.contributor.author.fl_str_mv |
de Rosa, Gustavo H. [UNESP]; Roder, Mateus [UNESP]; Passos, Leandro A. [UNESP]; Papa, João Paulo [UNESP] |
dc.subject.por.fl_str_mv |
Metric learning; Optimum-path forest; Supervised learning |
topic |
Metric learning; Optimum-path forest; Supervised learning |
publishDate |
2024 |
dc.date.none.fl_str_mv |
2024-10-01; 2025-04-29T20:07:31Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
http://dx.doi.org/10.1016/j.asoc.2024.112021; Applied Soft Computing, v. 164; 1568-4946; https://hdl.handle.net/11449/306869; 10.1016/j.asoc.2024.112021; 2-s2.0-85199793906 |
url |
http://dx.doi.org/10.1016/j.asoc.2024.112021; https://hdl.handle.net/11449/306869 |
identifier_str_mv |
Applied Soft Computing, v. 164; 1568-4946; 10.1016/j.asoc.2024.112021; 2-s2.0-85199793906 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
Applied Soft Computing |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.source.none.fl_str_mv |
Scopus; reponame:Repositório Institucional da UNESP; instname:Universidade Estadual Paulista (UNESP); instacron:UNESP |
instname_str |
Universidade Estadual Paulista (UNESP) |
instacron_str |
UNESP |
institution |
UNESP |
reponame_str |
Repositório Institucional da UNESP |
collection |
Repositório Institucional da UNESP |
repository.name.fl_str_mv |
Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP) |
repository.mail.fl_str_mv |
repositoriounesp@unesp.br |
_version_ |
1834482378857250816 |