Fast Deep Stacked Network: An Extreme Learning Machine-based algorithm for fast training of a stacked architecture with shared weights

Bibliographic details
Year of defense: 2022
Main author: Silva, Bruno Legora Souza da
Advisor: Not informed by the institution
Defense committee: Not informed by the institution
Document type: Thesis
Access type: Open access
Language: Portuguese (por)
Defense institution: Universidade Federal do Espírito Santo (UFES), Centro Tecnológico, Brazil
Degree: Doctorate in Electrical Engineering
Graduate program: Programa de Pós-Graduação em Engenharia Elétrica
Department: Not informed by the institution
Keywords in Portuguese:
Access link: http://repositorio.ufes.br/handle/10/15980
Abstract: Artificial Neural Networks have been applied to classification and regression problems, gaining popularity especially since the proposal of the backpropagation algorithm for their training. In recent years, the growing volume of generated data and the increased processing power of computers and Graphics Processing Units (GPUs) have enabled the training of large (deep) architectures, capable of extracting and predicting information in complex problems, although such training is usually computationally expensive. In contrast, fast algorithms have been proposed to train small (shallow) architectures, such as the single-hidden-layer feedforward network (SLFN), which is nevertheless capable of approximating any continuous function. One of them, the Extreme Learning Machine (ELM), has a fast, closed-form solution and has been applied to a wide range of problems, often obtaining better performance than other methods such as backpropagation-trained neural networks and Support Vector Machines (SVMs). ELM variants have been proposed to address issues such as underfitting, overfitting, and outliers, but they still struggle with large datasets and/or when more neurons are required to extract additional information. Stacked ELM (S-ELM) was thus proposed: it stacks ELM-trained modules, feeding information from one module into the next, and improves results on large datasets; however, it is limited by its memory consumption and is not adequate for problems with a single output, such as some regression tasks. Another stacked method, the Deep Stacked Network (DSN), suffers from long training times and high memory usage, but does not share the application limitation of S-ELM. This work therefore proposes combining the DSN architecture with the ELM and Kernel ELM algorithms to obtain a model composed of small modules, with fast training and reduced memory usage, yet capable of performance similar to that of more complex models. We also propose a variant of this model that handles data arriving gradually, known as incremental learning (or online learning in the ELM context). Extensive experiments were conducted to evaluate the proposed methods on regression and classification tasks; for the online approach, only regression tasks were considered. The results show that the methods can train stacked architectures whose error or accuracy metrics are statistically equivalent to those of an SLFN with a large number of neurons (or of other online methods). Regarding training time, the proposed methods were faster in many cases; regarding memory usage, some of the proposed methods were statistically better, which favors their use in resource-constrained environments.
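The abstract describes two ideas that a short sketch can make concrete: the ELM closed-form training of a single module, and the DSN-style stacking in which each module's output feeds the next. Below is a minimal Python/NumPy sketch of these two generic ideas only; the function names (train_elm, train_stacked), the tanh activation, the ridge regularizer reg, and the input-augmentation scheme are illustrative assumptions, not the thesis's actual Fast Deep Stacked Network algorithm (which, per the abstract, also shares weights across modules and has Kernel ELM and online variants).

```python
import numpy as np

def train_elm(X, T, n_hidden=100, reg=1e-3, seed=None):
    """Train one ELM module: random hidden layer, closed-form output weights.

    Solves the ridge-regularized least squares
        beta = (H'H + reg*I)^{-1} H'T,  with  H = tanh(X W + b),
    where the input weights W and biases b are random and never updated.
    """
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))  # random input weights
    b = rng.standard_normal(n_hidden)                # random biases
    H = np.tanh(X @ W + b)                           # hidden-layer activations
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ T)
    return W, b, beta

def elm_predict(X, module):
    W, b, beta = module
    return np.tanh(X @ W + b) @ beta

def train_stacked(X, T, n_modules=3, **elm_kw):
    """DSN-style stack (illustrative scheme): each module is trained on the
    raw input augmented with the previous module's prediction."""
    modules, Z = [], X
    for _ in range(n_modules):
        m = train_elm(Z, T, **elm_kw)
        modules.append(m)
        Z = np.column_stack([X, elm_predict(Z, m)])  # augment input
    return modules

def predict_stacked(X, modules):
    """Replay the same input augmentation at inference time."""
    Z = X
    for m in modules:
        pred = elm_predict(Z, m)
        Z = np.column_stack([X, pred])
    return pred

# Toy regression usage
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.random((500, 5))
    y = np.sin(X.sum(axis=1))
    stack = train_stacked(X, y, n_modules=3, n_hidden=50, seed=0)
    print(np.mean((predict_stacked(X, stack) - y) ** 2))  # training MSE
```

Because each module only requires solving a small linear system, training avoids iterative gradient descent entirely, which is what makes ELM-based stacks attractive in time- and memory-constrained settings.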