Dealing with repeated objects in SNNagg

Bibliographic details
Main author: Galvão, João Rui Magalhães Velho da Cunha
Publication date: 2016
Other authors: Santos, Maribel Yasmina, Pires, João Moura, Costa, Carlos
Document type: Article
Language: eng
Source title: Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
Full text: http://hdl.handle.net/1822/42342
Abstract: Due to constant technological advances and the massive use of electronic devices, the amount of data generated has increased at a very high rate, creating an urgent need to process larger amounts of data in less time. To handle these large amounts of data, several techniques and algorithms have been developed in the area of knowledge discovery in databases, a process that consists of several stages, including data mining, which analyzes vast amounts of data to identify patterns, models, or trends. Among the several data mining techniques, this work focuses on clustering spatial data with a density-based approach that uses the Shared Nearest Neighbor (SNN) algorithm. SNN has shown several advantages when analyzing this type of data: it identifies clusters of different sizes, shapes, and densities, and it also deals with noise. This paper presents and evaluates a new extension of SNN that is able to deal with repeated objects, creating aggregates that reduce the processing time required to cluster a given dataset, as repeated objects are excluded from the most time-consuming step, the identification of the k-nearest neighbors of a point. The proposed approach, SNNagg, was evaluated, and the results show that the processing time is reduced without compromising the quality of the obtained clusters.
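The aggregation idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' SNNagg implementation: the function names (`aggregate`, `knn_unique`, `shared_neighbours`), the brute-force neighbor search, and the sample points are all assumptions made for illustration. The sketch only shows how collapsing exact duplicates into (point, count) aggregates shrinks the input to the k-nearest-neighbor step.

```python
from collections import Counter
from math import dist


def aggregate(points):
    """Collapse exact duplicates into (point, count) pairs, in first-seen order."""
    # Counter keeps insertion order, so aggregates follow the input order.
    return list(Counter(points).items())


def knn_unique(aggregates, k):
    """Brute-force k nearest neighbours, computed over unique points only."""
    pts = [p for p, _ in aggregates]
    neighbours = {}
    for p in pts:
        # Repeated objects are already folded into one aggregate,
        # so each distance is evaluated once per unique point.
        others = sorted((q for q in pts if q != p), key=lambda q: dist(p, q))
        neighbours[p] = others[:k]
    return neighbours


def shared_neighbours(neighbours, a, b):
    """SNN-style similarity: how many neighbours the two k-NN lists share."""
    return len(set(neighbours[a]) & set(neighbours[b]))


# Hypothetical dataset: 8 objects, only 5 distinct locations.
points = [(0, 0), (0, 0), (0, 0), (1, 0), (0, 1), (5, 5), (5, 5), (6, 5)]
aggs = aggregate(points)
nn = knn_unique(aggs, k=2)
```

Here the neighbor search runs over 5 aggregates instead of 8 objects. The stored counts would still be available to later steps (for example, density estimation), though how SNNagg uses them is not stated in this record.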
id RCAP_a92abd5e913e2d9fd2cade0f357df8c3
oai_identifier_str oai:repositorium.sdum.uminho.pt:1822/42342
network_acronym_str RCAP
network_name_str Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository_id_str https://opendoar.ac.uk/repository/7160
Funding: This work has been supported by FCT, Fundação para a Ciência e Tecnologia, within the Project Scope UID/CEC/00319/2013.
dc.title.none.fl_str_mv Dealing with repeated objects in SNNagg
dc.contributor.none.fl_str_mv Universidade do Minho
dc.contributor.author.fl_str_mv Galvão, João Rui Magalhães Velho da Cunha
Santos, Maribel Yasmina
Pires, João Moura
Costa, Carlos
dc.subject.por.fl_str_mv Spatial Data
Spatio-Temporal Data
Clustering
SNN
Density-based Clustering
Engenharia e Tecnologia::Engenharia Eletrotécnica, Eletrónica e Informática
publishDate 2016
dc.date.none.fl_str_mv 2016-02-16
2016-02-16T00:00:00Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/1822/42342
url http://hdl.handle.net/1822/42342
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv Joao Galvão, Maribel Yasmina Santos, Joao Moura Pires, and Carlos Costa, "Dealing with Repeated Objects in SNNagg", IAENG International Journal of Computer Science, vol. 43, no. 1, pp. 115-125, 2016, ISSN: 1819656X.
1819656X
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv IAENG
publisher.none.fl_str_mv IAENG
dc.source.none.fl_str_mv reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron:RCAAP
repository.name.fl_str_mv Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
repository.mail.fl_str_mv info@rcaap.pt