Comparative analysis of advanced clustering techniques

Bibliographic details
Year of defense: 2016
Main author: Piantoni, Jane
Advisor: Faceli, Katti
Examination committee: Not reported by the institution
Document type: Master's thesis
Access type: Open access
Language: Portuguese
Institution of defense: Universidade Federal de São Carlos
Sorocaba Campus
Graduate program: Programa de Pós-Graduação em Ciência da Computação - PPGCC-So
Department: Not reported by the institution
Country: Not reported by the institution
Keywords in Portuguese:
Keywords in English:
CNPq knowledge area:
Access link: https://repositorio.ufscar.br/handle/ufscar/8252
Abstract: The goal of this study is to investigate the characteristics of recent data clustering approaches through a comparative study of clustering techniques that combine or select multiple solutions, analyzing these techniques with respect to the variety and completeness of the knowledge that can be extracted by applying them. One study examined the influence of the base partitions on traditional ensembles and multi-objective ensembles. The performance of the methods was evaluated by applying them to different sets of base partitions, in order to assess their ability to identify quality partitions from different initial scenarios. A second study evaluated the ability of the techniques to recover the information available in the data. To this end, investigations were carried out in two contexts: partitions, which is the traditional form of analysis, and clusters, to verify internally whether the recovered partitions contain more relevant information than the partition-level analysis shows. These analyses considered the quality of the partitions and clusters, the percentage of true information (partitions and clusters) actually recovered in both contexts, and the volume of irrelevant information that each technique produces. The analyses include the search for partitions that are novel and more robust than those in the sets of base partitions used in the experiments, the influence of the base partitions on the ensembles, the ability of the techniques to obtain multiple partitions, and the analysis of the extracted clusters.