How they relate and leave: understanding atoms of confusion in open-source Java projects

Bibliographic details
Year of defense: 2024
Main author: Pinheiro Neto, Francisco Oton
Advisor: Not provided by the institution
Examination committee: Not provided by the institution
Document type: Dissertation
Access type: Open access
Language: eng
Defense institution: Not provided by the institution
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese:
Access link: http://repositorio.ufc.br/handle/riufc/76456
Abstract: Software comprehension is essential to improve understanding and avoid mistakes in the software development lifecycle. Code confusion occurs when a developer and the computer reach different interpretations of the behavior of the same piece of code. Such pieces of code can be represented as small, isolated code patterns called Atoms of Confusion (ACs). In this study, we empirically investigated the effects of ACs on the software development lifecycle of 21 open-source Java projects. We built a dataset linking more than 8,000 commits, 4,000 reported issues, and 7,000 ACs from the subject projects. Our findings showed a positive correlation between the number of ACs and the number of reported bugs and improvements. We also investigated the changes made in commits to better understand the contexts in which ACs are removed. As each commit is linked to at least one reported issue (e.g., bug or improvement), we were able to compare the ratio of AC removal across the kinds of commits and use it as a proxy to indicate whether ACs are likely to be the cause behind a reported issue. For 14 of the 19 studied projects that had ACs removed in commits, we found a higher ratio of removed ACs in bug-fix and improvement commits than in the other kinds of commits (task, sub-task, new feature, wish, and test). Finally, to support our quantitative results, we conducted a qualitative analysis to better understand how often atoms of confusion contributed to the occurrence of a bug or improvement. We inspected ACs removed in bug-fix and improvement commits with up to ten lines removed, analyzing the source code, the message of each involved commit, and the title, description, and comments of the related Jira issues. Out of a universe of 8,641 commits from the 21 analyzed projects, 391 removed ACs. Among them, 53 met the condition for our qualitative analysis. In 7 of these commits, 9 removed ACs were likely to have contributed directly to the occurrence of a bug or improvement. To the best of our knowledge, our research is the first to investigate the connection between Atoms of Confusion and the source of bugs or the cause of improvements in Java projects.
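
To make the notion of an atom of confusion concrete, the following is a minimal Java sketch. It assumes two patterns commonly listed in atom-of-confusion catalogs (a post-increment used as a value, and omitted curly braces); the abstract does not state which atoms the study covers, so these are illustrative only. The class and method names are hypothetical and do not come from the dissertation or its subject projects.

```java
public class AtomExample {

    // Assumed atom: post-increment used as a value.
    // The assignment receives the OLD value of counter, which a reader may misjudge.
    static int[] confusingIncrement(int counter) {
        int result = counter++;
        return new int[] { result, counter };
    }

    // Clarified equivalent: the read and the update are separate statements.
    static int[] clearIncrement(int counter) {
        int result = counter;
        counter = counter + 1;
        return new int[] { result, counter };
    }

    // Assumed atom: omitted curly braces.
    // The indentation suggests both updates are guarded, but only the first one is.
    static int confusingBraces(boolean flag) {
        int x = 0;
        if (flag)
            x += 1;
            x += 10; // always executes, regardless of flag
        return x;
    }

    // Clarified equivalent: braces make the guarded region explicit.
    static int clearBraces(boolean flag) {
        int x = 0;
        if (flag) {
            x += 1;
        }
        x += 10;
        return x;
    }

    public static void main(String[] args) {
        int[] a = confusingIncrement(5);
        System.out.println("result=" + a[0] + ", counter=" + a[1]);
        System.out.println("confusingBraces(false)=" + confusingBraces(false));
    }
}
```

In both pairs the confusing and clarified methods compute the same values; the difference lies only in how easily a human reader predicts them. A commit that replaces the first form with the second is the kind of change that would count as an AC removal in the sense described in the abstract.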