Transfer learning with contextual embeddings for text classification in low-data scenarios
Year of defense: | 2020 |
---|---|
Main author: | |
Advisor: | |
Defense committee: | |
Document type: | Dissertation |
Access type: | Open access |
Language: | por |
Defending institution: | Universidade Federal de Minas Gerais, Brasil, ENG - DEPARTAMENTO DE ENGENHARIA ELÉTRICA, Programa de Pós-Graduação em Engenharia Elétrica, UFMG |
Graduate program: | Not informed by the institution |
Department: | Not informed by the institution |
Country: | Not informed by the institution |
Keywords in Portuguese: | |
Access link: | http://hdl.handle.net/1843/60359 |
Abstract: | Recent developments in the field of NLP (Natural Language Processing) have shown that deep transformer-based language model architectures trained on large corpora of unlabeled data can transfer knowledge to downstream tasks efficiently through fine-tuning. In particular, BERT and XLNet have shown impressive results, achieving state-of-the-art performance on many tasks through this process. This is partially due to these models' ability to create better representations of text in the form of contextual embeddings. However, not much has been explored in the literature about the robustness of the transfer learning process of these models in a small-data scenario. Likewise, little effort has been devoted to analyzing how the fine-tuning process of the two models behaves with different amounts of training data available. Besides that, there are no studies about the difference in performance between contextual embedding representations and traditional embedding representations in a small-data scenario. This work addresses these questions through an empirical evaluation of these models on several datasets when fine-tuned on progressively smaller fractions of training data, for the task of text classification. The performance gains from this new form of text representation are also evaluated by using the models as feature extractors for transfer learning. It is shown that BERT and XLNet perform well with small data and, in most cases, can achieve good performance with very few labels available. Results obtained with varying fractions of training data indicate that few examples are necessary to fine-tune the models and that, although training with more labeled data has a positive effect, using only a subset of the data is already enough to achieve performance comparable to other models trained with substantially more data. It is also possible to observe that part of these models' power is in fact due to more robust representations, given that, in most cases, they yield better results than traditional embedding representations when used as features for other models. However, it is noticeable that the transformer architecture as a whole, after the fine-tuning process, yields substantially better results than using the model as a feature extractor. |
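The abstract contrasts two transfer-learning modes: fine-tuning the whole transformer on a small labeled subset versus freezing it and using its contextual embeddings as features for a conventional classifier. Below is a minimal sketch of the second (feature-extractor) setup, assuming a Hugging Face `bert-base-uncased` checkpoint, toy texts and labels, and a 10% subsampling fraction; these choices are illustrative assumptions and are not taken from the dissertation itself.

```python
# Sketch: frozen BERT as a feature extractor in a simulated low-data regime.
# The [CLS] contextual embedding of each text feeds a logistic regression.
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Toy corpus (placeholder for the dissertation's datasets).
texts = ["great movie, loved it", "terrible plot and acting",
         "an instant classic", "a complete waste of time"] * 50
labels = [1, 0, 1, 0] * 50

# Simulate the low-data scenario: keep only 10% of the labeled examples.
train_texts, _, train_labels, _ = train_test_split(
    texts, labels, train_size=0.1, stratify=labels, random_state=0)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")
encoder.eval()  # no fine-tuning: the encoder stays frozen

def embed(batch_texts):
    """Return the [CLS] contextual embedding for each text."""
    enc = tokenizer(batch_texts, padding=True, truncation=True,
                    max_length=128, return_tensors="pt")
    with torch.no_grad():
        out = encoder(**enc)
    return out.last_hidden_state[:, 0, :].numpy()

X_train = embed(train_texts)
clf = LogisticRegression(max_iter=1000).fit(X_train, train_labels)

X_new = embed(["what a wonderful film", "boring from start to finish"])
print(clf.predict(X_new))  # expected [1, 0] if the embeddings separate the classes
```

A fine-tuning run on the same subsample would instead load `AutoModelForSequenceClassification` and update all transformer weights end to end, which is the setting in which the abstract reports substantially better results than the feature-extractor baseline.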