Advancing deep learning models for robustness and interpretability in image recognition

Bibliographic details
Year of defense: 2023
Main author: SANTOS, Flávio Arthur Oliveira
Advisor: Not informed by the institution
Defense committee: Not informed by the institution
Document type: Thesis
Access type: Open access
Language: eng
Defending institution: Universidade Federal de Pernambuco
UFPE
Brazil
Programa de Pós-Graduação em Ciência da Computação
Graduate program: Not informed by the institution
Department: Not informed by the institution
Country: Not informed by the institution
Keywords in Portuguese:
Access link: https://repositorio.ufpe.br/handle/123456789/57293
Abstract: Deep Learning architectures are among the most promising machine learning models today. They are used in various domains, including drug discovery, speech recognition, object recognition, question answering, machine translation, and image description. Surprisingly, some studies even report superhuman performance, that is, performance superior to that of human experts on certain tasks. Although these models exhibit high precision and coverage, the literature shows that they also have several limitations: (1) they are vulnerable to adversarial attacks, (2) they have difficulty inferring on data outside the training distribution, (3) they can produce correct inferences based on spurious information, and (4) their inferences are difficult for a domain expert to interpret. These limitations make it challenging to adopt these models in high-risk applications, such as autonomous cars or medical diagnostics. Overcoming them requires robustness, reliability, and interpretability. This thesis conducts a comprehensive exploration of techniques and tools to improve the robustness and interpretability of Deep Learning models in the domain of image processing. Its contributions cover four key areas: (1) the development of the Active Image Data Augmentation (ADA) method to improve model robustness, (2) the proposal of the Adversarial Right for Right Reasons (ARRR) loss function to ensure that models are "right for the right reasons" and adversarially robust, (3) the introduction of the Right for Right Reasons Data Augmentation (RRDA) method, which enriches the context of the information represented in the training data so as to steer the model's focus toward signal characteristics, and (4) the presentation of a new method for interpreting the behavior of models during inference. We also present a tool for manipulating visual features and assessing the robustness of models trained under different usage scenarios.
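For readers unfamiliar with the "right for the right reasons" (RRR) family of losses that ARRR builds on, the general idea is to add a penalty on the model's input gradient wherever the input is known to be irrelevant. The sketch below is a minimal NumPy illustration of that general recipe for a binary logistic model, where the input gradient has a closed form; it is not the thesis's actual ARRR formulation, and the function names, the mask convention, and the weight `lam` are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def rrr_loss(w, x, y, irrelevance_mask, lam=1.0):
    """RRR-style loss for a binary logistic model (illustrative only).

    w                : weight vector
    x                : one input example (flattened image features)
    y                : binary label (0 or 1)
    irrelevance_mask : 1 where the input is known to be irrelevant
                       (e.g. background pixels), 0 elsewhere
    lam              : strength of the "right reasons" penalty
    """
    p = sigmoid(w @ x)
    # "Right answers" term: the usual cross-entropy.
    ce = -(y * np.log(p) + (1 - y) * np.log(1 - p))
    # Gradient of log p(y=1|x) w.r.t. the input x; for a linear
    # model this is analytic: d log p / dx = (1 - p) * w.
    input_grad = (1 - p) * w
    # "Right reasons" term: penalize explanation mass that falls
    # on regions marked as irrelevant.
    penalty = np.sum((irrelevance_mask * input_grad) ** 2)
    return ce + lam * penalty
```

With an all-zero mask the loss reduces to plain cross-entropy; a nonzero mask makes relying on the masked region strictly more expensive, which is what pushes the model toward the intended evidence.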
The analyses demonstrate that the ADA method improves the robustness of models without compromising traditional performance metrics. The ARRR method is robust to the color bias of images in problems based on the structural information of the images. In addition, the RRDA method significantly improves the model's robustness to background shifts in the image, outperforming other traditional RRR methods. Finally, the proposed model analysis tool reveals counterintuitive interdependencies among features and exposes weaknesses in the models' inference decisions. These contributions represent significant advances in Deep Learning applied to image processing, providing valuable insights and innovative solutions to challenges associated with the reliability and interpretation of these complex models.
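The "background shift" evaluations mentioned above are typically built by compositing an object's foreground onto unrelated backgrounds using a segmentation mask. The snippet below is a generic NumPy sketch of that compositing step, offered only to make the notion of a background shift concrete; the function name and array conventions are illustrative assumptions, not the thesis's tool or API.

```python
import numpy as np

def swap_background(image, fg_mask, new_background):
    """Composite an object onto a different background (illustrative).

    A model that still classifies the result correctly is not relying
    on background cues, which is one way to probe background-shift
    robustness.

    image, new_background : H x W x C float arrays
    fg_mask               : H x W array, 1 on the object, 0 elsewhere
    """
    m = fg_mask[..., None]  # broadcast the mask over the channel axis
    # Keep foreground pixels from `image`, take the rest from the new background.
    return m * image + (1 - m) * new_background
```

A linear blend like this also extends naturally to soft (fractional) masks from a segmentation model.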