Vetores de Parágrafo aplicados à localização de características e bugs de software (Paragraph Vectors Applied to Software Feature and Bug Location)
Year of defense: | 2020 |
---|---|
Main author: | |
Advisor: | |
Defense committee: | |
Document type: | Thesis |
Access type: | Open access |
Language: | por |
Defending institution: | Universidade Federal de Uberlândia, Brasil, Programa de Pós-graduação em Ciência da Computação |
Graduate program: | Not provided by the institution |
Department: | Not provided by the institution |
Country: | Not provided by the institution |
Keywords in Portuguese: | |
Access link: | https://repositorio.ufu.br/handle/123456789/29339 http://doi.org/10.14393/ufu.te.2020.404 |
Abstract: | During software maintenance, feature location and bug location play an important role in recovering traceability between natural language and source code. However, the existing automated tools designed for these tasks have not yet reached the desired accuracy. The main objective of this work is therefore to improve the paragraph vectors produced by the artificial neural network of the Doc2vec algorithm and to apply them to feature location and software bug location. The improved vectors were used to calculate similarities between feature descriptions and source code methods, and improvements were also observed in identifying similarity between bugs and source code classes. To reach this result, a cyclic learning rate was applied and the loss function of the neural network associated with the Doc2vec algorithm was customized. Further gains in accuracy for the Doc2vec algorithm came from combining its rankings with rankings obtained from tools that represent the state of the art in Software Engineering. A set of minor approaches was also applied: improving the quality of the vectors that represent source code methods by drawing on the source code of other systems; using the syntactic influence of words within paragraphs extracted from source code; and using the fastText algorithm to generate paragraph vectors. Each proposed approach was evaluated for its effectiveness in feature and bug location tasks. In sum, the improvements to the artificial neural network proposed in this work improved on state-of-the-art results in feature location tasks. Moreover, bug location was improved by using the Doc2vec algorithm, and the most influential hyperparameters were defined in order to improve accuracy. |
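
The abstract names two mechanisms that a small sketch may make more concrete: a cyclic learning rate, and ranking source code methods by their similarity to a feature description. The record does not say which cyclic schedule or which similarity measure the thesis uses, so the triangular schedule and the cosine similarity below are assumptions chosen for illustration. The function names, the learning-rate bounds, and the toy vectors are hypothetical and are not taken from the thesis.

```python
import math


def triangular_lr(step, base_lr, max_lr, half_cycle):
    """Triangular cyclic learning rate (an assumed schedule).

    The rate rises linearly from base_lr to max_lr over half_cycle steps,
    then falls back to base_lr over the next half_cycle steps, and repeats.
    """
    cycle_pos = step % (2 * half_cycle)
    # distance is 1.0 at the ends of a cycle and 0.0 at its peak
    distance = abs(cycle_pos - half_cycle) / half_cycle
    return base_lr + (max_lr - base_lr) * (1.0 - distance)


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_by_similarity(query_vec, candidates):
    """Return candidate names ordered from most to least similar to query_vec.

    candidates maps a name (for example, a method identifier) to its vector.
    """
    return sorted(
        candidates,
        key=lambda name: cosine(query_vec, candidates[name]),
        reverse=True,
    )


if __name__ == "__main__":
    # Toy 2-dimensional vectors standing in for paragraph vectors.
    feature_vec = [1.0, 0.0]
    method_vecs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
    print(rank_by_similarity(feature_vec, method_vecs))
    print([round(triangular_lr(s, 0.01, 0.05, 10), 3) for s in (0, 5, 10, 15, 20)])
```

In a real pipeline, the vectors would come from a trained paragraph-vector model rather than being written by hand, and the learning rate would be fed to the training loop at each step.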