A comprehensive study among distance measures on supervised optimum-path forest classification

Bibliographic Details
Main Author: de Rosa, Gustavo H. [UNESP]
Publication Date: 2024
Other Authors: Roder, Mateus [UNESP], Passos, Leandro A. [UNESP], Papa, João Paulo [UNESP]
Format: Article
Language: eng
Source: Repositório Institucional da UNESP
Download full: http://dx.doi.org/10.1016/j.asoc.2024.112021
https://hdl.handle.net/11449/306869
Summary: Supervised pattern classification relies on a labeled training set to learn decision boundaries that separate samples from different classes. Such samples can be either weakly or reliably labeled; in the first case, one can employ techniques specifically designed to cope with uncertainty during labeling, and in the second, one can rely on numerous alternatives, including metric learning. Pattern classifiers usually adopt the Euclidean distance to compare samples and assess their proximity, but this implies the feature space is embedded in a plane. For some applications, however, samples are embedded in curved spaces, although this is not straightforward to prove. In this manuscript, we assessed the performance of the Optimum-Path Forest (OPF) classifier under different distance functions, which are used to weight the arcs between samples in a graph encoding the feature space. This work compared 47 distance measures applied to the OPF classifier on 22 datasets, alongside Decision Trees, Logistic Regression, and Support Vector Machines. The experiments highlighted that OPF is user-friendly when handling distance measures and, in some situations, can obtain better accuracies than its standard (Euclidean) counterpart and the classifiers mentioned above. On the other hand, time-consuming distance calculations may affect OPF's efficiency during inference.
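The role of the distance function described in the summary can be illustrated with a minimal sketch of a supervised OPF classifier over a complete graph. It is not the authors' implementation: it assumes the common formulation in which prototypes are the endpoints of minimum-spanning-tree arcs that join different classes and the path cost is the maximum arc weight along the path. All function names (`euclidean`, `manhattan`, `find_prototypes`, `fit`, `predict`) are hypothetical and chosen only for illustration.

```python
import heapq
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def find_prototypes(X, y, dist):
    """Prim's MST on the complete graph; endpoints of MST arcs that
    connect samples of different classes become prototypes."""
    n = len(X)
    in_tree = [False] * n
    best = [math.inf] * n
    parent = [-1] * n
    best[0] = 0.0
    prototypes = set()
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] != -1 and y[parent[u]] != y[u]:
            prototypes.update((u, parent[u]))
        for v in range(n):
            if not in_tree[v]:
                w = dist(X[u], X[v])  # arc weight given by the chosen distance
                if w < best[v]:
                    best[v], parent[v] = w, u
    return prototypes


def fit(X, y, dist):
    """Propagate costs and labels from the prototypes, where the cost of a
    path is the largest arc weight on it."""
    n = len(X)
    cost = [math.inf] * n
    label = list(y)
    done = [False] * n
    heap = []
    for p in find_prototypes(X, y, dist):
        cost[p] = 0.0
        heapq.heappush(heap, (0.0, p))
    while heap:
        c, s = heapq.heappop(heap)
        if done[s]:
            continue
        done[s] = True
        for t in range(n):
            if not done[t]:
                new_cost = max(c, dist(X[s], X[t]))
                if new_cost < cost[t]:
                    cost[t], label[t] = new_cost, label[s]
                    heapq.heappush(heap, (new_cost, t))
    return cost, label


def predict(X_train, cost, label, x, dist):
    """Assign x the label of the training sample offering the cheapest path."""
    best = min(range(len(X_train)),
               key=lambda s: max(cost[s], dist(X_train[s], x)))
    return label[best]


# Toy data (made up for illustration): two well-separated classes.
X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
y = [0, 0, 0, 1, 1, 1]
for dist in (euclidean, manhattan):
    cost, label = fit(X, y, dist)
    print(dist.__name__, predict(X, cost, label, (0.5, 0.5), dist))
```

Because `dist` is an ordinary argument, swapping the distance measure changes only the arc weights and leaves the rest of the procedure untouched. Each prediction evaluates the distance to every training sample, which is consistent with the summary's remark that costly distance calculations may affect efficiency during inference.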
id UNSP_fe34f221ee7c91387c33ccbcf163a9e8
oai_identifier_str oai:repositorio.unesp.br:11449/306869
network_acronym_str UNSP
network_name_str Repositório Institucional da UNESP
repository_id_str 2946
spelling A comprehensive study among distance measures on supervised optimum-path forest classification
Metric learning
Optimum-path forest
Supervised learning
International Business Machines Corporation
Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP)
Department of Computing São Paulo State University, Av. Eng. Luiz Edmundo Carrijo Coube, 14-01
FAPESP: #2013/07375-0
FAPESP: #2014/12236-1
FAPESP: #2019/02205-5
FAPESP: #2019/07665-4
FAPESP: #2020/12101-0
FAPESP: #2023/10823-6
Universidade Estadual Paulista (UNESP)
de Rosa, Gustavo H. [UNESP]
Roder, Mateus [UNESP]
Passos, Leandro A. [UNESP]
Papa, João Paulo [UNESP]
2025-04-29T20:07:31Z
2024-10-01
info:eu-repo/semantics/publishedVersion
info:eu-repo/semantics/article
http://dx.doi.org/10.1016/j.asoc.2024.112021
Applied Soft Computing, v. 164.
1568-4946
https://hdl.handle.net/11449/306869
10.1016/j.asoc.2024.112021
2-s2.0-85199793906
Scopus
reponame:Repositório Institucional da UNESP
instname:Universidade Estadual Paulista (UNESP)
instacron:UNESP
eng
Applied Soft Computing
info:eu-repo/semantics/openAccess
2025-04-30T14:36:56Z
oai:repositorio.unesp.br:11449/306869
Repositório Institucional
PUB
http://repositorio.unesp.br/oai/request
repositoriounesp@unesp.br
opendoar:2946
2025-04-30T14:36:56
Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP)
false
dc.title.none.fl_str_mv A comprehensive study among distance measures on supervised optimum-path forest classification
title A comprehensive study among distance measures on supervised optimum-path forest classification
spellingShingle A comprehensive study among distance measures on supervised optimum-path forest classification
de Rosa, Gustavo H. [UNESP]
Metric learning
Optimum-path forest
Supervised learning
author de Rosa, Gustavo H. [UNESP]
author_facet de Rosa, Gustavo H. [UNESP]
Roder, Mateus [UNESP]
Passos, Leandro A. [UNESP]
Papa, João Paulo [UNESP]
author_role author
author2 Roder, Mateus [UNESP]
Passos, Leandro A. [UNESP]
Papa, João Paulo [UNESP]
author2_role author
author
author
dc.contributor.none.fl_str_mv Universidade Estadual Paulista (UNESP)
dc.contributor.author.fl_str_mv de Rosa, Gustavo H. [UNESP]
Roder, Mateus [UNESP]
Passos, Leandro A. [UNESP]
Papa, João Paulo [UNESP]
dc.subject.por.fl_str_mv Metric learning
Optimum-path forest
Supervised learning
topic Metric learning
Optimum-path forest
Supervised learning
publishDate 2024
dc.date.none.fl_str_mv 2024-10-01
2025-04-29T20:07:31Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://dx.doi.org/10.1016/j.asoc.2024.112021
Applied Soft Computing, v. 164.
1568-4946
https://hdl.handle.net/11449/306869
10.1016/j.asoc.2024.112021
2-s2.0-85199793906
url http://dx.doi.org/10.1016/j.asoc.2024.112021
https://hdl.handle.net/11449/306869
identifier_str_mv Applied Soft Computing, v. 164.
1568-4946
10.1016/j.asoc.2024.112021
2-s2.0-85199793906
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv Applied Soft Computing
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.source.none.fl_str_mv Scopus
reponame:Repositório Institucional da UNESP
instname:Universidade Estadual Paulista (UNESP)
instacron:UNESP
instname_str Universidade Estadual Paulista (UNESP)
instacron_str UNESP
institution UNESP
reponame_str Repositório Institucional da UNESP
collection Repositório Institucional da UNESP
repository.name.fl_str_mv Repositório Institucional da UNESP - Universidade Estadual Paulista (UNESP)
repository.mail.fl_str_mv repositoriounesp@unesp.br
_version_ 1834482378857250816