Meta-Learning applied to Neural Architecture Search. Towards new interactive learning approaches for indexing and analyzing images from expert domains

Bibliographic details
Year of defense: 2024
Main author: Pereira, Gean Trindade
Advisor: Not informed by the institution
Defense committee: Not informed by the institution
Document type: Thesis
Access type: Open access
Language: eng
Defending institution: Biblioteca Digital de Teses e Dissertações da USP
Graduate program: Not informed by the institution
Department: Not informed by the institution
Country: Not informed by the institution
Keywords in Portuguese:
Access link: https://www.teses.usp.br/teses/disponiveis/55/55134/tde-30042024-135847/
Abstract: A critical factor in the progress of Deep Learning over the years has been the proposal of novel architectures that enabled considerable advances in the learning capabilities of Neural Networks. However, experts still mostly define neural architectures through a time-consuming trial-and-error process. The need to optimize this process led to the emergence of Neural Architecture Search (NAS), which has two main advantages over the status quo: it saves practitioners' time by automating architecture design, and it enables the discovery of novel architectures. The NAS framework has three main components: (i) the Search Space, which defines the space of candidate architectures; (ii) the Search Strategy, which specifies how the Search Space is explored; and (iii) the Performance Estimation Strategy, which defines how an architecture's performance is estimated. While the Cell-based Search Space dominates popular NAS solutions, the same is not true for Search and Performance Estimation Strategies, for which no single approach prevails. Many NAS methods explore the space of architectures using Reinforcement Learning, Evolutionary Computation, or Gradient-based Optimization. As Performance Estimation Strategies, the so-called One-Shot models and the more recent Training-Free and Prediction-based methods have also gained prominence. Despite presenting good predictive performance at reduced cost, existing NAS methods based on these approaches still suffer from model complexity, requiring many powerful GPUs and long training times. Furthermore, several popular solutions require large amounts of data to converge, involve inefficient and complex procedures, and lack interpretability.

In this context, a potential solution is Meta-Learning (MtL). MtL methods have the advantage of being faster and cheaper than mainstream solutions because they use previous experience to build new knowledge. Among MtL approaches, three stand out: (i) Learning from Task Properties; (ii) Learning from Model Evaluations; and (iii) Learning from Prior Models. This thesis proposes two methods that use prior knowledge to optimize the NAS framework: Model-based Meta-Learning for Neural Architecture Search (MbML-NAS) and Active Differentiable Network Topology Search (Active-DiNTS). MbML-NAS learns from both task characteristics, encoded as architectural meta-features, and the performances of pre-trained architectures to predict and select ConvNets for Image Classification. Active-DiNTS learns from model evaluations, prior models, and task properties in the form of an Active Learning framework that iteratively draws on model outputs, uncertainty estimates, and newly labeled examples.

Experiments with MbML-NAS showed that the method generalizes to different search spaces and datasets using a minimal set of six interpretable meta-features. Using a simple approach with traditional regressors, MbML-NAS reported predictive performance comparable to the state of the art using as few as 172 examples, i.e., just 0.04% and 1.1% of the NAS-Bench-101 and NAS-Bench-201 search spaces, respectively.
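The abstract does not name the six meta-features or the specific regressor, so the following is only a minimal sketch of the prediction-based selection idea behind MbML-NAS: fit a traditional regressor on (meta-features, accuracy) pairs sampled from a benchmark, then rank unseen candidates by predicted performance instead of training them. The meta-feature names and the choice of scikit-learn's RandomForestRegressor are illustrative assumptions, not the thesis' actual implementation.

```python
# Minimal sketch of prediction-based NAS in the spirit of MbML-NAS:
# a traditional regressor maps architectural meta-features to expected
# performance, then ranks unseen candidate architectures.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical meta-feature names, for illustration only; the thesis'
# six interpretable meta-features may differ.
META_FEATURES = [
    "num_conv_ops", "num_pool_ops", "depth",
    "num_edges", "num_vertices", "num_params",
]

def train_performance_predictor(meta_features, accuracies):
    """Fit a traditional regressor on (meta-features, accuracy) pairs
    sampled from a benchmark such as NAS-Bench-101/201."""
    model = RandomForestRegressor(n_estimators=100, random_state=0)
    model.fit(meta_features, accuracies)
    return model

def select_best_architectures(model, candidate_meta_features, k=10):
    """Predict performance for every candidate and return the indices
    of the top-k architectures, with no candidate training required."""
    predicted = model.predict(candidate_meta_features)
    return np.argsort(predicted)[::-1][:k]

# Toy usage: 172 labeled examples, matching the sample size in the abstract.
rng = np.random.default_rng(0)
X_train = rng.random((172, len(META_FEATURES)))
y_train = rng.random(172)  # stand-in for benchmark accuracies
X_candidates = rng.random((1000, len(META_FEATURES)))

predictor = train_performance_predictor(X_train, y_train)
top_k = select_best_architectures(predictor, X_candidates)
```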
Active-DiNTS obtained state-of-the-art results in segmenting images from the Brain dataset of the MSD challenge, surpassing its main baseline, DiNTS, by up to 15%. In terms of efficiency, alternative configurations achieved results comparable to DiNTS using less than 20% of the original data. Furthermore, Active-DiNTS is computationally efficient, generating models with fewer parameters and better memory allocation on a single GPU.
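For context, below is a minimal, generic sketch of the iterative Active Learning loop the abstract attributes to Active-DiNTS: train on the labeled examples, estimate uncertainty over the unlabeled pool, query labels for the most uncertain examples, and repeat. The uncertainty measure (predictive entropy) and the stand-in classifier are assumptions made for illustration; the actual Active-DiNTS couples such a loop with differentiable topology search for 3D medical image segmentation.

```python
# Generic uncertainty-driven Active Learning loop; NOT the thesis'
# Active-DiNTS code. A simple classifier stands in for the model.
import numpy as np
from sklearn.linear_model import LogisticRegression

def predictive_entropy(probs):
    """Uncertainty score: entropy of the model's class probabilities."""
    return -np.sum(probs * np.log(probs + 1e-12), axis=1)

def active_learning_loop(X_pool, y_pool, n_rounds=5, batch_size=20, seed=0):
    rng = np.random.default_rng(seed)
    labeled = rng.choice(len(X_pool), size=batch_size, replace=False).tolist()
    unlabeled = [i for i in range(len(X_pool)) if i not in labeled]

    model = LogisticRegression(max_iter=1000)
    for _ in range(n_rounds):
        # 1) Train on the examples labeled so far.
        model.fit(X_pool[labeled], y_pool[labeled])
        # 2) Estimate uncertainty on the unlabeled pool.
        scores = predictive_entropy(model.predict_proba(X_pool[unlabeled]))
        # 3) Query labels for the most uncertain examples.
        queried = np.argsort(scores)[::-1][:batch_size]
        newly_labeled = [unlabeled[i] for i in queried]
        labeled.extend(newly_labeled)
        unlabeled = [i for i in unlabeled if i not in newly_labeled]
    return model, labeled

# Toy usage: the loop labels only a fraction of the pool, echoing the
# abstract's finding of comparable results with <20% of the data.
rng = np.random.default_rng(1)
X = rng.normal(size=(1000, 8))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
model, used = active_learning_loop(X, y)
print(f"labeled {len(used)} of {len(X)} examples")
```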