On the use of control- and data-ow in fault localization

Ribeiro, Henrique Lemos

On the use of control- and data-ow in fault localization

Detalhes bibliográficos
Ano de defesa:	2016
Autor(a) principal:	Ribeiro, Henrique Lemos
Orientador(a):	Não Informado pela instituição
Banca de defesa:	Não Informado pela instituição
Tipo de documento:	Dissertação
Tipo de acesso:	Acesso aberto
Idioma:	eng
Instituição de defesa:	Biblioteca Digitais de Teses e Dissertações da USP
Programa de Pós-Graduação:	Não Informado pela instituição
Departamento:	Não Informado pela instituição
País:	Não Informado pela instituição
Palavras-chave em Português:	Control-flow Data-flow Engenharia de software Fault localization Fluxo de controle Fluxo de dados Localização de defeitos Software engineering
Link de acesso:	http://www.teses.usp.br/teses/disponiveis/100/100131/tde-18102016-092654/
Resumo:	Testing and debugging are key tasks during the development cycle. However, they are among the most expensive activities during the development process. To improve the productivity of developers during the debugging process various fault localization techniques have been proposed, being Spectrum-based Fault Localization (SFL), or Coverage-based Fault Localization (CBFL), one of the most promising. SFL techniques pinpoints program elements (e.g., statements, branches, and definition-use associations), sorting them by their suspiciousness. Heuristics are used to rank the most suspicious program elements which are then mapped into lines to be inspected by developers. Although data-flow spectra (definition-use associations) has been shown to perform better than control-flow spectra (statements and branches) to locate the bug site, the high overhead to collect data-flow spectra has prevented their use on industry-level code. A data-flow coverage tool was recently implemented presenting on average 38% run-time overhead for large programs. Such a fairly modest overhead motivates the study of SFL techniques using data-flow information in programs similar to those developed in the industry. To achieve such a goal, we implemented Jaguar (JAva coveraGe faUlt locAlization Ranking), a tool that employ control-flow and data-flow coverage on SFL techniques. The effectiveness and efficiency of both coverages are compared using 173 faulty versions with sizes varying from 10 to 96 KLOC. Ten known SFL heuristics to rank the most suspicious lines are utilized. The results show that the behavior of the heuristics are similar both to control- and data-flow coverage: Kulczynski2 and Mccon perform better for small number of lines investigated (from 5 to 30 lines) while Ochiai performs better when more lines are inspected (30 to 100 lines). The comparison between control- and data-flow coverages shows that data-flow locates more defects in the range of 10 to 50 inspected lines, being up to 22% more effective. Moreover, in the range of 20 and 100 lines, data-flow ranks the bug better than control-flow with statistical significance. However, data-flow is still more expensive than control-flow: it takes from 23% to 245% longer to obtain the most suspicious lines; on average data-flow is 129% more costly. Therefore, our results suggest that data-flow is more effective in locating faults because it tracks more relationships during the program execution. On the other hand, SFL techniques supported by data-flow coverage needs to be improved for practical use at industrial settings

On the use of control- and data-ow in fault localization

Registros relacionados