Enhancing the SZZ Algorithm to Deal with Refactoring Changes

Campos Neto, Edmilson Barbalho

Enhancing the SZZ Algorithm to Deal with Refactoring Changes

Detalhes bibliográficos
Ano de defesa:	2018
Autor(a) principal:	Campos Neto, Edmilson Barbalho
Orientador(a):	Kulesza, Uira
Banca de defesa:	Não Informado pela instituição
Tipo de documento:	Tese
Tipo de acesso:	Acesso aberto
Idioma:	por
Instituição de defesa:	Não Informado pela instituição
Programa de Pós-Graduação:	PROGRAMA DE PÓS-GRADUAÇÃO EM SISTEMAS E COMPUTAÇÃO
Departamento:	Não Informado pela instituição
País:	Brasil
Palavras-chave em Português:	SZZ algorithm Fix-inducing changes Bug-fix changes Refactoring changes
Área do conhecimento CNPq:	CNPQ::CIENCIAS EXATAS E DA TERRA::CIENCIA DA COMPUTACAO::SISTEMAS DE COMPUTACAO
Link de acesso:	https://repositorio.ufrn.br/jspui/handle/123456789/26204
Resumo:	SZZ was proposed by Śliwerski, Zimmermann, and Zeller (hence the SZZ abbreviation) to identify fix-inducing changes, i.e., changes that are likely to induce bugs. Despite the wide adoption of this algorithm, SZZ still faces limitations, which have been recently reported. No existing research work widely surveys how SZZ has been used, extended, and evaluated by the software engineering community. Furthermore, only a few studies have investigated the SZZ improvements. Therefore, this thesis aims to explore the existing limitations that have been documented in the SZZ literature and to enhance the state-of-the-art of SZZ by proposing solutions to some of its limitations. First, we perform a systematic mapping study to determine the current state-of-the-art of the SZZ algorithm. Our results exhibit that majority of the analyzed studies use SZZ as a foundation to conduct their empirical studies (79%), while only a few studies have proposed direct improvements to SZZ (6%) or evaluated it (4%). We further observe that SZZ exhibits several unaddressed limitations, such as the bias related to the refactoring changes. Second, we conducted an empirical study to investigate the relationship between the refactoring changes and the SZZ results. We use RefDiff — a tool that detects code refactoring with high precision — to analyze an extensive dataset, including 31,518 issues of ten systems, with 64,855 bug-fixes and 20,298 fix-inducing changes. We run RefDiff for both bug-fix and fix-inducing changes that were generated by a recent SZZ implementation, meta-change aware SZZ (MASZZ). The results indicate a refactoring ratio of 6.5% for fix-inducing changes and 19.9% for bug-fix changes. We incorporated RefDiff into MA-SZZ and proposed a refactoring aware SZZ implementation (RA-SZZ). RA-SZZ reduces the number of lines flagged as fix-inducing changes by MA-SZZ by 20.8%. These results indicate that refactoring can impact the SZZ results. Using an evaluation framework, we observe that RA-SZZ reduces the disagreement ratio compared to previous implementations; however, our results suggest the SZZ accuracy must still be improved. Finally, we evaluated the RA-SZZ accuracy using a well-accepted dataset, which we validated for evaluating SZZ implementations. Furthermore, we revisited the known RA-SZZ limitations to improve the accuracy of the algorithm by integrating a novel refactoring-detection tool, RMiner. We observed that after refining RA-SZZ, in the median, 44% of the lines that were flagged as fix-inducing are accurate, while only 29% were flagged accurately in case of the MA-SZZ-generated results. We also manually analyzed the RA-SZZ results and observed that there are still refactoring (31.17%) and equivalent changes (13.64%) to be recognized by SZZ. This result reiterates that detecting refactoring indeed increases the SZZ accuracy. Our thesis results contribute to SZZ maturation and indicate that the impact of refactoring upon SZZ may be even higher if further improvements can be made in future studies.

Enhancing the SZZ Algorithm to Deal with Refactoring Changes

Registros relacionados