Classificação de dados combinando mapas auto-organizáveis com vizinho informativo mais próximo

Detalhes bibliográficos
Ano de defesa: 2016
Autor(a) principal: Moreira, Lenadro Juvêncio lattes
Orientador(a): Silva, Leandro Augusto da lattes
Banca de defesa: Não Informado pela instituição
Tipo de documento: Dissertação
Tipo de acesso: Acesso aberto
Idioma: por
Instituição de defesa: Universidade Presbiteriana Mackenzie
Programa de Pós-Graduação: Não Informado pela instituição
Departamento: Não Informado pela instituição
País: Não Informado pela instituição
Palavras-chave em Português:
Área do conhecimento CNPq:
Link de acesso: http://dspace.mackenzie.br/handle/10899/24442
Resumo: The data classification is a data mining task with relevant utilization in various areas of application, such as medicine, industry, marketing, financial market, teaching and many others. Although this task is an element search for many autors, there are open issues such as, e.g., in situations where there is so much data, noise data and unbalanced classes. In this way, this work will present a data classifier proposal that combines the SOM (Self-Organizing Map) neural network with INN (Informative Nearest Neighbors). The combination of these two algorithms will be called in this work as SOM-INN. Therefore, the SOM-INN process to classify a new object will be done in a first step with the SOM that has a functionality to map a reduced dataset through an approach that utilizes the prototype generation concept, also called the winning neuron and, in a second step, with the INN algorithm that is used to classify the new object through an approach that finds in the reduced dataset by SOM the most informative object. Were made experiments using 21 public datasets comparing classic data classification algorithms of the literature, from the indicators of reduction training set, accuracy, kappa and time consumed in the classification process. The results obtained show that the proposed SOM-INN algorithm, when compared with the others classifiers of the literature, presents better accuracy in databases where the border region is not well defined. The main differential of the SOM-INN is in the classification time, which is extremely important for real applications. Keywords: data classification; prototype generation; K nearest neighbors; self-organizing