Type inference for C: applications to the analysis of incomplete programs

Bibliographic details
Year of defense: 2019
Main author: Leandro Terra Cunha Melo
Advisor: Not provided by the institution
Defense committee: Not provided by the institution
Document type: Thesis
Access type: Open access
Language: por
Defending institution: Universidade Federal de Minas Gerais
Brazil
ICX - DEPARTAMENTO DE CIÊNCIA DA COMPUTAÇÃO
Programa de Pós-Graduação em Ciência da Computação
UFMG
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese:
Access link: http://hdl.handle.net/1843/30386
Abstract: Type inference is a feature common to a variety of programming languages. While in the past it was found mainly in functional languages (e.g., ML and Haskell), today many object-oriented and multi-paradigm languages, such as C# and C++, offer it to a certain extent. Nevertheless, whole-program type inference is still an unsolved problem in C. The first difficulty encountered when tackling this problem is that parsing C requires not only syntactic but also semantic information. Even greater challenges emerge from C's intricate type system. In this work, we present a solution to this problem: a unification-based approach that lets us infer types that are not declared. As a primary application of our technique, we investigate the reconstruction of partial C programs. Incomplete source code naturally appears in software development: during design, and while evolving, testing, and analyzing programs. Therefore, the ability to understand it is a valuable asset. Reconstructing a partial program into a complete, well-typed one can: (i) enable static analysis tools in scenarios where components may be absent; (ii) improve the precision of static analysis tools that require no build specifications; (iii) allow stub-generation and testing tools to work on code snippets; and (iv) assist programmers in extracting data structures from algorithms. We evaluate our technique on code from a variety of C libraries such as GNU's Coreutils, GNULib, GNOME's GLib, and GDSL; from implementations of a book; and on snippets from popular projects like CPython, FreeBSD, and Git.
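To make the idea of reconstructing a partial program concrete, the sketch below shows a snippet that uses an undeclared type, followed by declarations that would make it compile. The names `T`, `node_rec`, `count`, `next`, and `bump` are hypothetical, and the declarations were derived by hand from the way the snippet uses its values. They illustrate the kind of result such an inference could produce and are not output of the thesis's tool.

```c
#include <assert.h>
#include <stddef.h>

/* Partial snippet (does not compile on its own, because T is undeclared):
 *
 *     int bump(T v) {
 *         v->count = v->count + 1;
 *         v->next = v;
 *         return v->count;
 *     }
 */

/* Hypothetical reconstructed declarations, derived by hand from the uses:
 *   v->...                    : T must be a pointer to a struct
 *   v->count + 1, returned as
 *   the int result of bump    : int is one type consistent with count
 *   v->next = v               : next must have the same type as v
 */
struct node_rec {
    int count;
    struct node_rec *next;
};
typedef struct node_rec *T;

/* The snippet's body, unchanged apart from a defensive null check. */
int bump(T v)
{
    assert(v != NULL);
    v->count = v->count + 1;
    v->next = v;
    return v->count;
}
```

With these declarations in place, the function can be compiled and called from any `main`, which is what lets analysis or testing tools operate on the snippet even though its original headers are absent.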