Comparing integration effort and correctness of different merge approaches in version control systems

Bibliographic details
Year of defense: 2016
Main author: CAVALCANTI, Guilherme José Carvalho
Advisor: BORBA, Paulo Henrique Monteiro
Examination committee: Not provided by the institution
Document type: Master's dissertation
Access type: Open access
Language: eng
Institution of defense: Universidade Federal de Pernambuco
Graduate program: Programa de Pos Graduacao em Ciencia da Computacao
Department: Not provided by the institution
Country: Brazil
Keywords in Portuguese:
Access link: https://repositorio.ufpe.br/handle/123456789/17923
Abstract: When integrating code contributions that result from development tasks, one likely has to deal with conflicting changes and dedicate substantial effort to resolving conflicts. While unstructured merge tools try to automatically resolve part of the conflicts via textual similarity, semistructured tools try to go further by exploiting the syntactic structure of part of the artefacts involved. To understand the impact of the unstructured and semistructured merge approaches on integration effort (Productivity) and on the correctness of the merging process (Quality), we conduct two empirical studies. In the first one, aiming to increase the existing body of evidence and to assess results for systems developed under an alternative version control paradigm, we replicate an experiment that compares the unstructured and semistructured approaches with respect to the number of conflicts each reports. We applied both semistructured and unstructured merge to a sample 2.5 times larger than that of the original study in number of projects and 18 times larger in number of performed merges, and we compared the occurrence of conflicts. As in the original study, we observed that semistructured merge reduces the number of conflicts in 55% of the performed merges of the new sample. In addition, the observed average conflict reduction of 62% in these merges is far higher than what had been observed before. We also bring new evidence that the use of semistructured merge can reduce the occurrence of conflicting merges by half. To verify the frequency of false positives and false negatives arising from the use of these merge approaches, we then conduct a second empirical study.
We compare the unstructured and semistructured merge approaches by reproducing more than 30,000 merges from 50 projects and collecting evidence about reported conflicts that do not represent interferences between development tasks (false positives) and interferences not reported as conflicts (false negatives). In particular, we assume that false positives amount to unnecessary integration effort, because developers have to resolve conflicts that do not actually represent interferences. False negatives, in turn, amount to build issues or bugs, negatively impacting software quality and the correctness of the merging process. By analyzing these critical factors, we hope to guide developers in deciding which approach to use in practice. Our results show that semistructured merge eliminates a significant part of the false positives reported by unstructured merge, but introduces false positives of its own. The overall number of false positives is reduced with semistructured merge, and we argue that the conflicts associated with its false positives are easier to resolve than those associated with the false positives reported by unstructured merge. We also observe that more interferences were missed by unstructured merge and reported by semistructured merge than the other way around, but we argue that the interferences missed by semistructured merge are harder to detect and resolve than those missed by unstructured merge. Finally, our study suggests how a semistructured merge tool could be improved to eliminate the extra false positives and false negatives it has in relation to unstructured merge.