Paragraph Vectors Applied to Software Feature and Bug Location (original title: Vetores de Parágrafo aplicados à localização de características e bugs de software)

Bibliographic details
Year of defense: 2020
Main author: Silva, Allysson Costa e
Advisor: Not provided by the institution
Examination committee: Not provided by the institution
Document type: Thesis
Access type: Open access
Language: por
Defense institution: Universidade Federal de Uberlândia
Brazil
Programa de Pós-graduação em Ciência da Computação
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese:
Access link: https://repositorio.ufu.br/handle/123456789/29339
http://doi.org/10.14393/ufu.te.2020.404
Abstract: During software maintenance, feature location and bug location play an important role in recovering traceability between natural language and source code. However, existing automated tools designed for these tasks have not yet reached the desired accuracy. The main objective of this work is therefore to improve the paragraph vectors produced by the artificial neural network of the Doc2vec algorithm and to apply them to feature and bug location. The improved vectors were used to compute similarities between feature descriptions and source code methods. Improvements were also observed in identifying similarity between bugs and source code classes. To reach this result, a cyclic learning rate was applied and the loss function of the neural network used by the Doc2vec algorithm was customized. Further gains in accuracy for the Doc2vec algorithm were obtained by combining rankings produced by state-of-the-art Software Engineering tools. A set of minor approaches was also applied: improving the quality of the vectors that represent source code methods by drawing on the source code of other systems; using the syntactic influence of words within paragraphs extracted from source code; and using the fastText algorithm to generate paragraph vectors. Each proposed approach was evaluated for its effectiveness in feature and bug location tasks. In sum, the improvements to the artificial neural network proposed in this work improved on the state of the art in feature location tasks. Bug location was also improved by using the Doc2vec algorithm, and the most influential hyperparameters for accuracy were identified.
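As a rough illustration of two ideas named in the abstract, the sketch below shows a triangular cyclic learning rate schedule and a cosine-similarity ranking of source code methods against a feature description vector. It is not the thesis implementation: the triangular shape of the schedule, the function names (`triangular_lr`, `rank_methods`), the parameter names, and the toy two-dimensional vectors are all assumptions made for the example, and the paragraph vectors are taken as already produced by some embedding model.

```python
import math


def triangular_lr(step, base_lr, max_lr, half_cycle):
    """Triangular cyclic learning rate (assumed schedule shape).

    The rate rises linearly from base_lr to max_lr over half_cycle steps,
    falls back to base_lr over the next half_cycle steps, and repeats.
    """
    cycle_pos = step % (2 * half_cycle)
    # 1.0 at the start/end of a cycle, 0.0 at the peak.
    distance_from_peak = abs(cycle_pos - half_cycle) / half_cycle
    return base_lr + (max_lr - base_lr) * (1.0 - distance_from_peak)


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_methods(query_vec, method_vecs):
    """Rank source code methods by similarity to a feature description vector.

    method_vecs maps a (hypothetical) method name to its paragraph vector.
    Returns (name, similarity) pairs, most similar first.
    """
    scored = [(name, cosine(query_vec, vec)) for name, vec in method_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Toy vectors standing in for paragraph vectors (hypothetical data).
    methods = {"A.save": [1.0, 0.0], "B.load": [0.0, 1.0], "C.open": [1.0, 1.0]}
    for name, score in rank_methods([1.0, 0.0], methods):
        print(name, round(score, 3))
    print([round(triangular_lr(s, 0.01, 0.1, 10), 4) for s in (0, 5, 10, 15, 20)])
```

In a feature location setting, the top of such a ranking would be the methods suggested to the developer first; how the thesis combines several rankings is not specified in the abstract and is not modeled here.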