Imputação de dados faltantes via algoritmo EM e rede neural MLP com o método de estimativa de máxima verossimilhança para aumentar a acurácia das estimativas

Detalhes bibliográficos
Ano de defesa: 2015
Autor(a) principal: Ribeiro, Elisalvo Alves lattes
Orientador(a): Montesco, Carlos Alberto Estombelo lattes
Banca de defesa: Não Informado pela instituição
Tipo de documento: Dissertação
Tipo de acesso: Acesso aberto
Idioma: por
Instituição de defesa: Universidade Federal de Sergipe
Programa de Pós-Graduação: Pós-Graduação em Ciência da Computação
Departamento: Não Informado pela instituição
País: BR
Palavras-chave em Português:
Palavras-chave em Inglês:
Área do conhecimento CNPq:
Link de acesso: https://ri.ufs.br/handle/riufs/3333
Resumo: Database with missing values it is an occurrence often found in the real world, beiging of this problem caused by several reasons (equipment failure that transmits and stores the data, handler failure, failure who provides information, etc.). This may make the data inconsistent and unable to be analyzed, leading to very skewed conclusions. This dissertation aims to explore the use of Multilayer Perceptron Artificial Neural Network (ANN MLP), with new activation functions, considering two approaches (single imputation and multiple imputation). First, we propose the use of Maximum Likelihood Estimation Method (MLE) in each network neuron activation function, against the approach currently used, which is without the use of such a method or when is used only in the cost function (network output). It is then analyzed the results of these approaches compared with the Expectation Maximization algorithm (EM) is that the state of the art to treat missing data. The results indicate that when using the Artificial Neural Network MLP with Maximum Likelihood Estimation Method, both in all neurons and only in the output function, lead the an imputation with lower error. These experimental results, evaluated by metrics such as MAE (Mean Absolute Error) and RMSE (Root Mean Square Error), showed that the better results in most experiments occured when using the MLP RNA addressed in this dissertation to single imputation and multiple.