Towards robust deep learning using entropic losses

Bibliographic details
Year of defense: 2022
Main author: MACÊDO, David Lopes de
Advisor: LUDERMIR, Teresa Bernarda
Defense committee: Not informed by the institution
Document type: Thesis
Access type: Embargoed access
Language: eng
Defending institution: Universidade Federal de Pernambuco
Graduate program: Programa de Pós-Graduação em Ciência da Computação
Department: Not informed by the institution
Country: Brazil
Access link: https://repositorio.ufpe.br/handle/123456789/46127
Abstract: Despite theoretical achievements and encouraging practical results, deep learning still presents limitations in many areas, such as reasoning, causal inference, interpretability, and explainability. From an application point of view, one of the most consequential limitations is the restricted robustness of these systems. Indeed, current deep learning solutions are notorious for not indicating whether they can reliably classify an example at inference time. Modern neural networks are usually overconfident, even when they are wrong. Building robust deep learning applications is therefore a cutting-edge research topic in computer vision, natural language processing, and many other areas. One of the most effective ways to build more reliable deep learning solutions is to improve their performance on the so-called out-of-distribution detection task, which essentially consists of “knowing that you do not know” or “knowing the unknown”. In other words, systems capable of out-of-distribution detection may refuse to produce a meaningless classification when presented with instances of classes on which the neural network was not trained. This thesis tackles the challenging out-of-distribution detection task by proposing novel loss functions and detection scores. Uncertainty estimation is also a crucial auxiliary task in building more robust deep learning systems. Therefore, we also address this robustness-related task, which assesses how realistic the probabilities produced by the deep neural network are. To demonstrate the effectiveness of our approach, in addition to a substantial set of experiments that includes state-of-the-art results, we use arguments based on the principle of maximum entropy to establish the theoretical foundation of the proposed approaches. Unlike most current methods, our losses and scores are seamless and principled solutions that deliver accurate predictions together with fast and efficient inference. Moreover, our approaches can be incorporated into current and future projects simply by replacing the loss used to train the deep neural network and computing a rapid score for detection.
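
The abstract stops short of code, but the inference step it describes, computing a rapid score to decide whether to accept a classification, can be sketched. The snippet below is a hypothetical PyTorch illustration, assuming the detection score is the negative entropy of the network's output probabilities, in line with the maximum-entropy framing of the thesis; the function name `entropic_score` and the threshold calibration are illustrative, not taken from the thesis itself.

```python
import torch
import torch.nn.functional as F


def entropic_score(logits: torch.Tensor) -> torch.Tensor:
    """Return the negative entropy of the predicted class probabilities.

    High scores (low entropy) indicate confident, likely in-distribution
    predictions; low scores (high entropy) flag inputs the classifier
    may reject as out-of-distribution.
    """
    log_probs = F.log_softmax(logits, dim=-1)
    probs = log_probs.exp()
    entropy = -(probs * log_probs).sum(dim=-1)
    return -entropy


# Hypothetical usage: score a batch of logits and threshold the scores.
# In practice, the threshold is calibrated on held-out in-distribution data.
logits = torch.randn(4, 10)              # 4 examples, 10 classes
scores = entropic_score(logits)
threshold = scores.median()              # placeholder calibration
is_in_distribution = scores > threshold
```

Because the score is a single pass over the output probabilities, it adds negligible cost at inference, which is consistent with the abstract's claim of fast and efficient detection.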