GPT-3-powered type error debugging: investigating the use of large language models for code repair

Bibliographic Details
Main Author: Ribeiro, Francisco José Torres
Publication Date: 2023
Other Authors: Macedo, José Nuno Castro; Tsushima, Kanae; Abreu, Rui; Saraiva, João
Language: eng
Source: Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
Download full: https://hdl.handle.net/1822/89919
Summary: Type systems are responsible for assigning types to terms in programs. That way, they enforce the actions that can be taken and can, consequently, detect type errors during compilation. However, while they are able to flag the existence of an error, they often fail to pinpoint its cause or provide a helpful error message. Thus, without adequate support, debugging this kind of error can take a considerable amount of effort. Recently, neural network models have been developed that are able to understand programming languages and perform several downstream tasks. We argue that type error debugging can be enhanced by taking advantage of this deeper understanding of the language’s structure. In this paper, we present a technique that leverages GPT-3’s capabilities to automatically fix type errors in OCaml programs. We perform multiple source code analysis tasks to produce useful prompts that are then provided to GPT-3 to generate potential patches. Our publicly available tool, Mentat, supports multiple modes and was validated on an existing public dataset with thousands of OCaml programs. We automatically validate successful repairs by using Quickcheck to verify which generated patches produce the same output as the user-intended fixed version, achieving a 39% repair rate. In a comparative study, Mentat outperformed two other techniques in automatically fixing ill-typed OCaml programs.
id RCAP_67a9055ec8e0dd0c413adf11c5f6f134
oai_identifier_str oai:repositorium.sdum.uminho.pt:1822/89919
repository_id_str https://opendoar.ac.uk/repository/7160
Funding: This work is financed by National Funds through the Portuguese funding agency, FCT - Fundação para a Ciência e a Tecnologia, within project UIDP/50014/2020. Francisco Ribeiro and José Nuno Macedo acknowledge FCT PhD grants SFRH/BD/144938/2019 and 2021.08184.BD, respectively. Additional funding: JSPS KAKENHI-JP19K20248.
Contributor: Universidade do Minho
Subjects: Automated Program Repair; GPT-3; Fault Localization; Code Generation
Type: conference paper (published version)
ISBN: 979-8-4007-0396-6
DOI: 10.1145/3623476.3623522
Publisher version: https://dl.acm.org/doi/10.1145/3623476.3623522
Access: open access
Format: application/pdf
Publisher: Association for Computing Machinery (ACM)