Predicting software defects with causality tests = Predizendo defeitos de software com testes de causalidade

Bibliographic details
Year of defense: 2013
Main author: César Francisco de Moura Couto
Advisor: Not provided by the institution
Defense committee: Not provided by the institution
Document type: Thesis
Access type: Open access
Language: por
Defense institution: Universidade Federal de Minas Gerais
UFMG
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese:
Access link: http://hdl.handle.net/1843/ESBF-9GMMLN
Abstract: Defect prediction is a central area of research in software engineering that aims to identify the components of a software system that are more likely to present defects. Despite the large investment in research aiming to identify an effective way to predict defects in software systems, there is still no widely used solution to this problem. Current defect prediction approaches present at least two main problems. First, most approaches do not consider the idea of causality between software metrics and defects. More specifically, the studies performed to evaluate defect prediction techniques do not investigate in depth whether the discovered relationships indicate cause-effect relations or whether they are statistical coincidences. The second problem concerns the output of current defect prediction models. Typically, most models indicate the number or the existence of defects in a component in the future. Clearly, the availability of this information is important to foster software quality. However, predicting defects as soon as they are introduced in the code is more useful to maintainers than simply signaling the future occurrence of defects.
To tackle these questions, in this thesis we propose a defect prediction approach centered on more robust evidence of causality between source code metrics (as predictors) and the occurrence of defects. More specifically, we rely on a statistical hypothesis test proposed by Clive Granger to evaluate whether past variations in source code metric values can be used to forecast changes in time series of defects. The Granger Causality Test was originally proposed to evaluate causality between time series of economic data. Our approach triggers alarms whenever changes made to the source code of a target system are likely to present defects. We evaluated our approach in several life stages of four Java-based systems. We reached an average precision greater than 50% in three out of the four systems we evaluated. Moreover, compared with baselines that are not based on causality tests, our approach achieved better precision.
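As an illustration of the kind of test the abstract refers to, the following minimal Python sketch computes the F-statistic of a Granger-style test: it checks whether adding lagged values of a metric series reduces the error of an autoregressive forecast of a defect series. This is not the thesis's implementation. The function names (`ols_rss`, `granger_f`), the lag of 1, and the synthetic data are assumptions made only for this example, and the sketch stops at the F-statistic without computing a p-value. In practice, a statistics library such as statsmodels (`grangercausalitytests`) would normally be used instead.

```python
import random


def ols_rss(X, y):
    """Residual sum of squares of the least-squares fit y ~ X (X is a list of rows)."""
    k = len(X[0])
    # Build the normal equations (X'X) b = X'y as an augmented matrix.
    M = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(k):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    beta = [M[i][k] / M[i][i] for i in range(k)]
    return sum((yi - sum(b * x for b, x in zip(beta, r))) ** 2
               for r, yi in zip(X, y))


def granger_f(metric, defects, lag=1):
    """F-statistic for 'metric Granger-causes defects' with the given lag."""
    n = len(defects)
    y = defects[lag:]
    # Restricted model: defects explained only by their own past values.
    restricted = [[1.0] + [defects[t - i] for i in range(1, lag + 1)]
                  for t in range(lag, n)]
    # Unrestricted model: past values of the metric are added as predictors.
    unrestricted = [row + [metric[t - i] for i in range(1, lag + 1)]
                    for row, t in zip(restricted, range(lag, n))]
    rss_r = ols_rss(restricted, y)
    rss_u = ols_rss(unrestricted, y)
    dof = len(y) - (2 * lag + 1)
    return ((rss_r - rss_u) / lag) / (rss_u / dof)


# Synthetic (hypothetical) data: the defect series follows the metric
# series with a delay of one step, plus a small amount of noise.
rng = random.Random(0)
metric = [rng.gauss(0, 1) for _ in range(60)]
defects = [0.0] + [0.9 * metric[t - 1] + 0.3 * rng.gauss(0, 1)
                   for t in range(1, 60)]

f_forward = granger_f(metric, defects)   # does the metric help forecast defects?
f_reverse = granger_f(defects, metric)   # do defects help forecast the metric?
print(f_forward, f_reverse)
```

On this synthetic data the forward statistic should be much larger than the reverse one, because only the metric series carries information about the future of the defect series. A full test would compare the statistic against an F distribution to obtain a p-value before deciding whether to raise an alarm.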