Contributions to bug-fixing time estimation: an empirical study in open source projects of Apache ecosystem

Bibliographic details
Year of defense: 2022
Main author: Vieira, Renan Gomes
Advisor: Not informed by the institution
Defense committee: Not informed by the institution
Document type: Thesis
Access type: Open access
Language: eng
Defending institution: Not informed by the institution
Graduate program: Not informed by the institution
Department: Not informed by the institution
Country: Not informed by the institution
Keywords in Portuguese:
Access link: http://www.repositorio.ufc.br/handle/riufc/66693
Abstract: Fixing bugs is a crucial aspect of software maintenance. Developers and managers must deal with many bug reports that need immediate attention despite limited resources and tight deadlines. Generally, software projects use issue tracking systems to report and monitor bug-fixing tasks. Several researchers have used this data source to better understand the problem and to provide means of reducing costs and improving efficiency in the correction task. This thesis presents three contributions to the bug-correction process. The first is a dataset and its mining script, along with a series of analyses and visualizations. We describe the data acquisition process, explain why a new dataset had to be mined, and provide a deeper analysis of some report fields that we use in the subsequent contributions. The second contribution is a new approach to estimating the time to fix bugs. We consider the concept of bug report evolution to create a dataset containing all investigated report states. First, we check how often the bug reports and their fields are updated. Next, we evaluate our approach with different machine learning methods, framing the estimation as a classification problem with several output configurations and class balancing techniques. Using the best models (considering all possible designs) for the different stages of the evolution of a bug report, we evaluate whether the estimation capacity of the models differs significantly according to the report state. We gathered evidence that report fields are frequently updated, which characterizes the evolution of reports and affects the creation of bug-fixing time estimation models. The evaluation of the models shows promising results in predicting whether a bug will be fixed in less or more than five days, especially in the initial states of the reports.
The third contribution is a study of the relationship between bug correction time and three fields: priority, links (the relationships between reports), and code-churn (related to the fixing patch associated with the bug report). Through Bayesian data analysis, we evaluated two different models: one ‘specific’ for each project and one ‘hierarchical’ considering all projects at once. We also explored three other hierarchical models to illustrate the flexibility of this type of modeling. We gathered evidence that bug reports with links and higher values of code-churn (above the project’s median) tend to take longer to fix. On the other hand, the priority level appears to have no significant influence on the time to fix a bug.
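
The abstract mentions two derived quantities: a binary target that separates bugs fixed in less or more than five days, and an indicator for code-churn above the project's median. The following is a minimal sketch of how such variables could be computed, not the thesis's actual mining script. The function names, the record layout (`project` and `churn` keys), the class labels, and the choice to count exactly five days as "fast" are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median


def fixing_time_class(created, resolved, threshold_days=5):
    """Label a bug by whether it was fixed within threshold_days.

    Assumption: a bug resolved in exactly threshold_days counts as "fast";
    the abstract does not say how the boundary is handled.
    """
    elapsed_seconds = (resolved - created).total_seconds()
    return "fast" if elapsed_seconds <= threshold_days * 86400 else "slow"


def high_churn_flags(reports):
    """Flag each report whose code-churn exceeds its own project's median.

    `reports` is a list of dicts with hypothetical keys "project" and "churn".
    """
    churn_by_project = defaultdict(list)
    for report in reports:
        churn_by_project[report["project"]].append(report["churn"])
    medians = {p: median(values) for p, values in churn_by_project.items()}
    return [r["churn"] > medians[r["project"]] for r in reports]


# Made-up example data, for illustration only.
example_reports = [
    {"project": "A", "churn": 10},
    {"project": "A", "churn": 50},
    {"project": "A", "churn": 200},
    {"project": "B", "churn": 5},
    {"project": "B", "churn": 7},
]
example_flags = high_churn_flags(example_reports)
example_label = fixing_time_class(datetime(2022, 1, 1), datetime(2022, 1, 3))
```

Computing the median per project, rather than over all reports pooled together, keeps the indicator comparable across projects whose patches differ in typical size.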