Analysis of fine-tuning techniques for text classification

Bibliographic details
Year of defense: 2024
Main author: Pires, Tobias Gonçalves
Advisor: Soares, Anderson da Silva
Defense committee: Soares, Anderson da Silva, Fanucchi, Rodrigo Zempulski, Galvão Filho, Arlindo Rodrigues
Document type: Dissertation
Access type: Open access
Language: Portuguese
Defense institution: Universidade Federal de Goiás
Graduate program: Programa de Pós-graduação em Ciência da Computação (INF)
Department: Instituto de Informática - INF (RMG)
Country: Brazil
Keywords in Portuguese:
Keywords in English:
CNPq knowledge area:
Access link: http://repositorio.bc.ufg.br/tede/handle/tede/13751
Abstract: Natural Language Processing (NLP) aims to develop models that enable computers to understand, interpret, process and generate text in a way similar to human communication. The last decade has seen significant advances in the field, beginning with the introduction of deep neural network models and the subsequent evolution of their architectures, such as the attention mechanism and the Transformer architecture, culminating in language models such as ELMo, BERT and GPT. Later models, known as Large Language Models (LLMs), further improved the ability to understand and generate text in a sophisticated way. Pre-trained models offer the advantage of reusing knowledge accumulated from vast datasets, although task-specific fine-tuning is still required. However, training and tuning these models consumes substantial processing resources, making it unfeasible for many organizations due to high costs. For resource-constrained environments, efficient fine-tuning techniques such as LoRA (Low-Rank Adaptation) were developed to optimize the model adaptation process, minimizing the number of adjustable parameters and reducing the risk of overfitting. These techniques allow faster and more economical training while preserving the robustness and generalization of the models. This work evaluates three efficient fine-tuning techniques (LoRA, AdaLoRA and IA3), in addition to full fine-tuning, in terms of memory consumption, training time and accuracy, using the DistilBERT, RoBERTa-base and TinyLlama models on three datasets (AG News, IMDb and SNLI).
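To illustrate the kind of setup the abstract describes, the sketch below shows LoRA fine-tuning of DistilBERT for AG News topic classification using the Hugging Face Transformers, Datasets and PEFT libraries. It is a minimal, hedged example, not the author's actual experimental configuration: the hyperparameters (rank r, lora_alpha, dropout, learning rate, batch size) and the choice of target modules are illustrative assumptions.

```python
# Minimal LoRA fine-tuning sketch (assumed configuration, not the dissertation's setup).
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)
from peft import LoraConfig, TaskType, get_peft_model

model_name = "distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=4)

# LoRA: freeze the pre-trained weights and learn small low-rank update matrices,
# which keeps the number of adjustable parameters a tiny fraction of the full model.
lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,
    r=8,                                 # rank of the low-rank update (assumption)
    lora_alpha=16,                       # scaling factor (assumption)
    lora_dropout=0.1,
    target_modules=["q_lin", "v_lin"],   # DistilBERT attention projections
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()       # reports the reduced trainable-parameter count

# AG News: 4-class news topic classification, one of the datasets cited in the abstract.
dataset = load_dataset("ag_news")
def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128)
dataset = dataset.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="lora-agnews",
        per_device_train_batch_size=16,
        num_train_epochs=1,
        learning_rate=2e-4,
    ),
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
)
trainer.train()
```

Swapping LoraConfig for an AdaLoraConfig or an IA3Config from the same PEFT library would follow the same pattern, which is what makes a side-by-side comparison of memory use, training time and accuracy across techniques straightforward.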