Classificação de dados híbridos através de algoritmos evolucionários (Classification of hybrid data using evolutionary algorithms)
Year of defense: | 2012 |
---|---|
Main author: | |
Advisor: | |
Examination committee: | |
Document type: | Thesis |
Access type: | Open access |
Language: | por |
Defense institution: | Universidade Federal de Minas Gerais (UFMG) |
Graduate program: | Not provided by the institution |
Department: | Not provided by the institution |
Country: | Not provided by the institution |
Keywords in Portuguese: | |
Access link: | http://hdl.handle.net/1843/BUOS-9AYG7M |
Abstract: | Data mining has become an ally of decision makers in both large and small corporations: it identifies nontrivial information that allows corrections and adjustments to economic and administrative actions. At the same time, georeferenced information is increasingly used, and conventional data mining cannot answer all of a corporation's questions about it. A survey of geographical data mining showed that few tools are capable of extracting knowledge from georeferenced data, especially when the data come from a database that stores both conventional (numerical and textual) and geographical (points, lines and polygons) attributes. The main objective of this work is to develop new algorithms that can explore all the attributes of a database, conventional and geographical, in order to extract relevant information. Evolutionary algorithms were chosen as the starting point for these new algorithms for the following reasons: (a) they are flexible, since they can be applied in different contexts, and (b) they are robust, since they tend to explore the search space adequately and find viable solutions. The first algorithm described in the thesis, NGAE, is based on a genetic algorithm and obtains classification rules; it can be applied to databases storing numeric data. The second algorithm, DMGeo, is based on genetic programming and aims to obtain classification rules for patterns that have both numerical and spatial attributes. Finally, DMGeo was extended to a multiobjective version, called MDMGeo, which is more robust and efficient. All proposed algorithms were compared with other efficient classification algorithms, using benchmark datasets and real datasets. The experiments show that the final result is a set of robust and efficient tools, in particular when applied to a database composed of hybrid attributes. |