LDMAT: a loss function based on distance matrices for training convolutional neural networks
Year of defense: | 2021 |
---|---|
Main author: | |
Advisor: | |
Defense committee: | |
Document type: | Thesis |
Access type: | Open access |
Language: | por |
Defending institution: | Universidade Federal de Minas Gerais, Brasil, ENG - DEPARTAMENTO DE ENGENHARIA ELÉTRICA, Programa de Pós-Graduação em Engenharia Elétrica UFMG |
Graduate program: | Not informed by the institution |
Department: | Not informed by the institution |
Country: | Not informed by the institution |
Keywords in Portuguese: | |
Access link: | http://hdl.handle.net/1843/39773 https://orcid.org/0000-0002-7510-3878 |
Abstract: | Convolutional Neural Networks (CNNs) have been at the forefront of neural network research in recent years. Their breakthrough performance in fields such as image classification has concentrated research efforts on the development of new CNN-based architectures, and more recently a search for new loss functions for CNNs has emerged. Softmax loss remains the most popular loss function, mainly due to its efficiency in class separation, but it is unsatisfactory in terms of intra-class compactness. In this thesis, a new loss function based on distance matrices (LDMAT) is presented. It performs a pairwise analysis of a set of features and operates directly on the features extracted by convolutional filters, which allows its use with arbitrary classifiers. A regularization method, based on inserting a disturbance into the label distance matrix, is also presented, with performance similar to dropout. The proposed approach allows combining models trained with the LDMAT loss function under different levels of regularization, which improves the generalization of the final model. To validate the proposed loss function, experiments were performed to demonstrate its efficiency in training CNNs on classification tasks. On the FMNIST, EMNIST, CIFAR10, CIFAR100 and SVHN datasets, the LDMAT loss function outperforms other approaches that use similar architectures, and on the MNIST dataset its accuracy was close to the results of other loss functions proposed in the literature. |
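The abstract describes the general mechanism (a pairwise distance matrix over CNN features compared against a label distance matrix, regularized by perturbing the label matrix) but not its exact formulation. The PyTorch sketch below is only an illustration of that idea under explicit assumptions: Euclidean distances, a fixed `margin` as the target distance for different-class pairs, mean squared error as the discrepancy measure, and Gaussian noise as the disturbance. The function name `ldmat_style_loss` and its parameters are hypothetical, not the thesis's own definitions.

```python
# Illustrative sketch of a distance-matrix loss in the spirit of the abstract.
# Distance metric, target values, noise model and discrepancy measure are
# assumptions, not the formulation given in the thesis.
import torch
import torch.nn.functional as F


def ldmat_style_loss(features, labels, margin=1.0, noise_std=0.0):
    """features: (N, D) tensor taken from a convolutional feature extractor.
    labels:    (N,) integer class labels.
    margin:    assumed target distance for pairs of different classes.
    noise_std: std of the disturbance added to the label distance matrix
               (the regularization idea mentioned in the abstract)."""
    # Pairwise Euclidean distances between all feature vectors in the batch.
    feat_dist = torch.cdist(features, features, p=2)

    # Label distance matrix: 0 for same-class pairs, `margin` otherwise.
    same_class = labels.unsqueeze(0) == labels.unsqueeze(1)
    label_dist = torch.where(same_class,
                             torch.zeros_like(feat_dist),
                             torch.full_like(feat_dist, margin))

    # Regularization: perturb the label distance matrix with random noise.
    if noise_std > 0:
        label_dist = label_dist + noise_std * torch.randn_like(label_dist)

    # Penalize the discrepancy between the two matrices (MSE assumed here).
    return F.mse_loss(feat_dist, label_dist)


# Usage example with random tensors standing in for CNN features.
if __name__ == "__main__":
    feats = torch.randn(8, 64, requires_grad=True)
    labs = torch.randint(0, 10, (8,))
    loss = ldmat_style_loss(feats, labs, margin=1.0, noise_std=0.1)
    loss.backward()
    print(loss.item())
```

Because such a loss depends only on a batch of feature vectors and their labels, it can be attached to the output of any feature extractor, which is consistent with the abstract's claim that the method applies to arbitrary classifiers.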