Evolutionary algorithm with integer representation for feature selection (original title: Algoritmo evolutivo com representação inteira para seleção de características)

Sousa, Rhelcris Salvino de


Bibliographic details
Year of defense:	2017
Main author:	Sousa, Rhelcris Salvino de
Advisor:	Soares, Telma Woerle de Lima
Examination committee:	Soares, Telma Woerle de Lima; Soares, Anderson da Silva; Camilo Junior, Celso Gonçalves; Dias, Jailson Cardoso
Document type:	Master's dissertation
Access type:	Open access
Language:	por
Defending institution:	Universidade Federal de Goiás
Graduate program:	Programa de Pós-graduação em Ciência da Computação (INF)
Department:	Instituto de Informática - INF (RG)
Country:	Brazil
Keywords in Portuguese:	Seleção de características; Computação evolutiva; Calibração multivariada; Regressão linear múltipla
Keywords in English:	Feature selection; Evolutionary computation; Multivariate calibration; Multiple linear regression
CNPq knowledge area:	CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO
Access link:	http://repositorio.bc.ufg.br/tede/handle/tede/7395
Abstract:	Machine learning problems usually involve a large number of features, or variables. Feature selection algorithms face the challenge of determining a reduced subset of the original set, and the main difficulty is the very large number of candidate solutions in the search space. Genetic algorithms are among the most widely used techniques for this type of problem because of their implicit parallelism in exploring the search space. However, solutions are usually encoded with a binary representation. This work proposes an implementation, called intEA-MLR, that uses an integer representation instead. The integer representation makes the data easier to interpret, since the features to be selected are represented directly by integer values, and it reduces the size of the chromosome used in the search process. intEA-MLR is presented as an alternative for solving high-dimensional regression problems. As a case study, three data sets are used, concerning the determination of properties of interest in samples of 1) wheat grain, 2) medicine tablets, and 3) petroleum. These data sets were used in competitions held at the International Diffuse Reflectance Conference (IDRC) (http://cnirs.clubexpress.com/content.aspx?page_id=22&club_id=409746&module_id=190211) in 2008, 2012, and 2014, respectively. The results show that the proposed solution improved on the classical implementation with binary coding, yielding prediction models that are both more accurate and built on fewer features. intEA-MLR also outperformed the competition winners, with a result 91.17% better than the competition winner's for the petroleum data set. In addition, the results indicate that the computation time required by intEA-MLR becomes relatively smaller as more features are available.
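The contrast between the two encodings described in the abstract can be illustrated with a minimal sketch. This is not the intEA-MLR implementation: the function names (`decode_binary`, `decode_integer`, `mlr_rmse`), the synthetic data, and the use of validation RMSE as the quality measure are all assumptions made for illustration, since the record does not describe the dissertation's operators or fitness function. The sketch only shows that a binary chromosome needs one gene per available feature, while an integer chromosome needs one gene per selected feature, and that both can decode to the same column subset for a multiple linear regression (MLR) fit.

```python
import numpy as np

def decode_binary(mask):
    """Binary chromosome: one bit per available feature (1 = selected)."""
    return np.flatnonzero(mask)

def decode_integer(chromosome):
    """Integer chromosome: each gene is the index of a selected feature."""
    return np.unique(chromosome)

def mlr_rmse(X_train, y_train, X_val, y_val, features):
    """Fit least-squares MLR with an intercept on the chosen columns
    and return the RMSE on the validation samples (assumed fitness)."""
    A_train = np.column_stack([np.ones(len(X_train)), X_train[:, features]])
    A_val = np.column_stack([np.ones(len(X_val)), X_val[:, features]])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    residuals = A_val @ coef - y_val
    return float(np.sqrt(np.mean(residuals ** 2)))

# Synthetic, noise-free data (hypothetical): 60 samples, 200 features,
# with the target depending on only three of them.
rng = np.random.default_rng(0)
n_features = 200
X = rng.normal(size=(60, n_features))
y = 2.0 * X[:, 3] - 1.5 * X[:, 17] + 0.5 * X[:, 42]

# The same subset in both encodings: 3 genes versus 200 genes.
int_chromosome = np.array([3, 17, 42])
bin_chromosome = np.zeros(n_features, dtype=int)
bin_chromosome[int_chromosome] = 1

same = np.array_equal(decode_binary(bin_chromosome),
                      decode_integer(int_chromosome))
rmse = mlr_rmse(X[:40], y[:40], X[40:], y[40:],
                decode_integer(int_chromosome))
```

In this toy setting the integer chromosome has 3 genes against 200 for the binary mask, which is the chromosome-size reduction the abstract refers to; how the dissertation handles duplicate genes or variable subset sizes is not stated in this record.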

