Estratégia híbrida de seleção de partições para o problema de agrupamento de dados (Hybrid partition selection strategy for the data clustering problem)

Bibliographic details
Year of defense: 2018
Main author: Antunes, Vanessa
Advisor: Sakata, Tiemi Christine
Examination committee: Not provided by the institution
Document type: Master's dissertation
Access type: Open access
Language: Portuguese (por)
Defending institution: Universidade Federal de São Carlos, Câmpus Sorocaba
Graduate program: Programa de Pós-Graduação em Ciência da Computação - PPGCC-So
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese:
Keywords in English:
CNPq knowledge area:
Access link: https://repositorio.ufscar.br/handle/20.500.14289/9555
Abstract: The inability to identify partitions of different sizes and shapes is a fundamental limitation of any clustering algorithm, especially when different regions of the search space contain clusters with varied characteristics. One can apply several clustering algorithms with different parameters, but this produces a large number of partitions to deal with. Techniques such as ensemble and multiobjective clustering address this problem using distinct criteria, but at a high computational cost. Moreover, the ensemble technique generates a single solution, which may not represent every real partition present in the data. Multiobjective clustering, on the other hand, may generate a large number of partitions, which are difficult to analyze manually. In this dissertation, we propose a hybrid multiobjective algorithm, HSS (Hybrid Selection Strategy), that aims to return a reduced yet diverse set of solutions. It can be divided into three steps: (i) the application of a multiobjective algorithm to a set of base partitions to generate an approximation of the Pareto Front, (ii) the division of the solutions in that approximation into a certain number of regions, and (iii) the selection of one solution per region through the application of the Adjusted Rand Index. Experiments show the effectiveness of HSS in selecting a reduced number of partitions.
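
The three steps listed in the abstract can be illustrated with a minimal Python sketch. It is not the dissertation's implementation, and it rests on several assumptions that the abstract does not confirm. Step (i) is replaced by a plain non-dominated filter over objective values that are assumed to be already computed and minimized. Step (ii) assumes regions are contiguous, near-equal slices of the front ordered by the first objective. Step (iii) assumes the representative of a region is the partition with the highest mean Adjusted Rand Index against the other partitions in that region. All function names and the toy data are hypothetical.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(a, b):
    """Adjusted Rand Index between two label sequences of equal length."""
    total = comb(len(a), 2)
    if total == 0:  # fewer than two objects: nothing to compare
        return 1.0
    sum_cells = sum(comb(c, 2) for c in Counter(zip(a, b)).values())
    sum_rows = sum(comb(c, 2) for c in Counter(a).values())
    sum_cols = sum(comb(c, 2) for c in Counter(b).values())
    expected = sum_rows * sum_cols / total
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:  # degenerate case, e.g. both partitions trivial
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


def pareto_front(points):
    """Indices of non-dominated points (assumption: all objectives minimized)."""
    def dominates(p, q):
        return (all(x <= y for x, y in zip(p, q))
                and any(x < y for x, y in zip(p, q)))
    return [i for i, p in enumerate(points)
            if not any(dominates(q, p) for j, q in enumerate(points) if j != i)]


def split_into_regions(indices, n_regions):
    """Split an ordered list into at most n_regions contiguous chunks."""
    if not indices:
        return []
    n_regions = max(1, min(n_regions, len(indices)))
    size, extra = divmod(len(indices), n_regions)
    regions, start = [], 0
    for r in range(n_regions):
        end = start + size + (1 if r < extra else 0)
        regions.append(indices[start:end])
        start = end
    return regions


def select_per_region(partitions, objectives, n_regions):
    """Pick one partition index per region of the non-dominated set."""
    # Stand-in for step (i): filter instead of running a multiobjective algorithm.
    front = sorted(pareto_front(objectives), key=lambda i: objectives[i])
    selected = []
    # Step (ii): hypothetical region rule, contiguous slices of the ordered front.
    for region in split_into_regions(front, n_regions):
        def mean_ari(i):
            others = [j for j in region if j != i]
            if not others:
                return 1.0
            return sum(adjusted_rand_index(partitions[i], partitions[j])
                       for j in others) / len(others)
        # Step (iii): hypothetical rule, keep the most "central" partition by ARI.
        selected.append(max(region, key=mean_ari))
    return selected


# Toy data: five candidate partitions of six objects, two objectives each.
partitions = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1],
    [0, 1, 2, 0, 1, 2],
    [0, 0, 1, 1, 2, 2],
]
objectives = [(1.0, 5.0), (2.0, 4.0), (3.0, 3.0), (4.0, 4.0), (5.0, 1.0)]

selected = select_per_region(partitions, objectives, n_regions=2)
print(selected)
```

In this toy run the fourth partition is dominated and never reaches the region step, and the function returns one index per region, so the caller receives at most `n_regions` partitions to inspect.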