Unsupervised method based on principal curves for pattern recognition (original title: Método não supervisionado baseado em curvas principais para reconhecimento de padrões)

Bibliographic details
Year of defense: 2015
Main author: Moraes, Elson Claudio Correa
Advisor: Not provided by the institution
Defense committee: Not provided by the institution
Document type: Dissertation
Access type: Open access
Language: por
Defending institution: Universidade Federal de Lavras
Programa de Pós-Graduação em Engenharia de Sistemas e Automação
UFLA
Brazil
Departamento de Engenharia
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese:
Access link: http://repositorio.ufla.br/jspui/handle/1/10839
Abstract: This work presents a new method for data clustering and pattern classification based on principal curves. Principal curves are a nonlinear generalization of Principal Component Analysis: smooth, one-dimensional curves that model a multidimensional dataset and provide a one-dimensional summary of it. In the proposed method, the principal curves are extracted by the k-segments algorithm. The method divides the principal curves originally obtained by the k-segments algorithm into two or more curves, according to the number of clusters previously defined by the user. The distances from the data to the resulting curves are then computed, and each data point is assigned to the curve at the smallest distance. The squared Euclidean distance is used. The method was applied to five datasets, two two-dimensional and three multidimensional. The results were compared with those of k-means and Self-Organizing Maps: the proposed method outperformed both on the two two-dimensional datasets and obtained the second-best result on the other datasets. The method proved suitable for elongated and circular clusters. Despite its high performance, it proved very sensitive to its input parameters (the segment length and the number of segments). The author intends to investigate this sensitivity in future work.
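The final assignment step described in the abstract (labeling each data point by the curve at the smallest squared Euclidean distance) can be illustrated with a minimal Python sketch. This is not the author's implementation: it assumes each curve is already available as a polyline (an ordered list of at least two vertices), which is only a plausible stand-in for the output of the k-segments extraction and curve-splitting steps, neither of which is reproduced here. The function names and the toy data are hypothetical.

```python
from typing import List, Sequence

Point = Sequence[float]
Polyline = List[Point]  # assumed curve representation: ordered vertices (>= 2)


def sq_dist_point_segment(p: Point, a: Point, b: Point) -> float:
    """Squared Euclidean distance from point p to the line segment a-b."""
    ab = [bi - ai for ai, bi in zip(a, b)]
    ap = [pi - ai for ai, pi in zip(a, p)]
    ab_len2 = sum(c * c for c in ab)
    if ab_len2 == 0.0:
        t = 0.0  # degenerate segment: a and b coincide
    else:
        # Projection parameter of p onto the line through a and b,
        # clamped to [0, 1] so the closest point stays on the segment.
        t = sum(x * y for x, y in zip(ap, ab)) / ab_len2
        t = max(0.0, min(1.0, t))
    closest = [ai + t * c for ai, c in zip(a, ab)]
    return sum((pi - ci) ** 2 for pi, ci in zip(p, closest))


def sq_dist_point_curve(p: Point, curve: Polyline) -> float:
    """Smallest squared distance from p to any segment of a polyline."""
    return min(sq_dist_point_segment(p, a, b) for a, b in zip(curve, curve[1:]))


def assign_to_nearest_curve(data: List[Point], curves: List[Polyline]) -> List[int]:
    """Label each point with the index of its closest curve."""
    labels = []
    for p in data:
        dists = [sq_dist_point_curve(p, c) for c in curves]
        labels.append(dists.index(min(dists)))
    return labels


# Toy example with made-up values: two horizontal curves in the plane.
curves = [
    [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)],
    [(0.0, 3.0), (1.0, 3.0), (2.0, 3.0)],
]
data = [(0.5, 0.2), (1.5, 2.9), (3.0, 1.0)]
labels = assign_to_nearest_curve(data, curves)
print(labels)  # [0, 1, 0]
```

Because the comparison only needs the ordering of distances, working with squared distances avoids a square root without changing which curve is selected. The code is dimension-agnostic, so the same functions apply to the multidimensional case mentioned in the abstract.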