An Assessment of Machine Learning Algorithms and Models for Prediction of Change-Prone Java Methods

Bibliographic Details
Main Author: Silva R.
Publication Date: 2023
Other Authors: Vergilio S.; Farah, Paulo Roberto
Format: Conference object
Language: eng
Source: Repositório Institucional da Udesc
dARK ID: ark:/33523/0013000002pjq
Download full: https://repositorio.udesc.br/handle/UDESC/2205
Summary: © 2023 ACM. Identifying which parts of the code are prone to change during software evolution allows developers to prioritize and allocate resources efficiently. Focusing on a smaller scope makes change management easier and allows monitoring of the type of modification and its impact. However, existing change-proneness prediction approaches focus mainly on system classes. Classes aggregate many characteristics of different software attributes, and some software behaviors are more granular and better captured at the method level. Motivated by these facts, in this paper we empirically assess the performance of four machine learning algorithms for change-prone method prediction in seven open-source software projects. We derived and compared models obtained with three sets of independent variables (features): a set composed of structural metrics, a second set composed of evolution-based metrics, and a third that combines both kinds of metrics. The results show that Random Forest presents the best overall performance, regardless of the indicator and feature set used. The model that combines both sets of metrics outperforms the other two. Two features based on the frequency of changes in the evolution history of the method are pointed out as the most important for our problem.
OAI identifier: oai:repositorio.udesc.br:UDESC/2205
OpenDOAR repository ID: 6391
Record date: 2024-12-05T13:51:25Z
Status: Published version
DOI: 10.1145/3613372.3613395
Published in: ACM International Conference Proceeding Series
Pages: 322-331
Access rights: Open access
Institution: Universidade do Estado de Santa Catarina (UDESC)
Repository contact: ri@udesc.br