Bibliographic details
Year of defense: 2024
Main author: Pinheiro, Darwin de Oliveira
Advisor: Not provided by the institution
Examination committee: Not provided by the institution
Document type: Master's dissertation
Access type: Open access
Language: eng
Defending institution: Not provided by the institution
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese:
Access link: http://repositorio.ufc.br/handle/riufc/78984
Abstract:
Refactoring changes the internal structure of code without changing its external behavior, improving quality, maintainability, and readability, in addition to reducing technical debt. Studies indicate the need to improve the detection and correction of refactorings, recommending the use of machine learning to investigate motivations, difficulties, and improvements in software. This Master's dissertation aims to identify the relationship between trivial and non-trivial refactorings and to propose a metric that evaluates the triviality of implementing a refactoring. First, we use supervised learning classifiers to examine the impact of trivial refactorings on the prediction of non-trivial ones. We analyzed three datasets, covering 1,291 open-source projects and approximately 1.9M refactoring operations, using 45 code metrics; five classification models were evaluated across different dataset configurations. Second, we propose an ML-based metric to evaluate the triviality of a refactoring, considering its complexity, speed, and risk. The study examined how the prioritization of 58 features, identified by 15 developers, affected the effectiveness of seven regression and ensemble models. In addition, the alignment between the perceptions of 16 experienced developers and the results of the models was verified.
Our results are promising: (i) algorithms such as Random Forest, Decision Tree, and Neural Network performed best when using code metrics to identify refactoring opportunities; (ii) separating trivial and non-trivial refactorings improves model effectiveness, even across different datasets; (iii) using all available features outperforms the prioritization made by developers in predictive models; (iv) ensemble models, such as Random Forest and Gradient Boosting, outperform linear models regardless of feature prioritization; and (v) there is strong alignment between the perceptions of experts and the results of the models. In summary, this Master's dissertation contributes to the refactoring process, offering important support for developers, as it can inform the decision of whether or not to apply a refactoring. In addition, it highlights insights, challenges, and opportunities for future work.
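To make the classification setup described in the abstract concrete, the following is a minimal sketch in Python/scikit-learn of training a Random Forest to separate trivial from non-trivial refactorings from code metrics. The data here is synthetic and the labeling rule is invented for illustration; the dissertation's actual features, datasets, and pipeline differ.

```python
# Illustrative sketch only: synthetic stand-ins for code metrics
# (e.g., size, complexity, coupling) and a hypothetical labeling rule.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score

rng = np.random.default_rng(42)
n = 1000

# Three synthetic "code metric" features per refactoring operation.
X = rng.normal(size=(n, 3))
# Hypothetical rule: operations on small, simple code are "trivial" (label 1).
y = ((X[:, 0] + 0.5 * X[:, 1]) < 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

f1 = f1_score(y_test, clf.predict(X_test))
print(f"F1 on held-out synthetic data: {f1:.2f}")
```

In the same spirit as result (i), an ensemble like Random Forest learns the metric/label relationship directly from examples; swapping in a Decision Tree or Gradient Boosting model requires changing only the classifier class.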