AI-driven software development: how has the rise of ChatGPT impacted GitHub users? - Impact on code quality

Bibliographic details
Main author: Menezes, Guilherme Nascimento Ortega de Paiva
Publication date: 2023
Document type: Master's dissertation
Language: English (eng)
Source title: Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
Full text: http://hdl.handle.net/10362/176254
Abstract: This thesis analyses AI-driven software development practices, investigating how Italy's ban of ChatGPT affected GitHub activity using a difference-in-differences analysis. It covers the retrieval and processing of data from GitHub Archive to form a four-week dataset spanning 17/03/2023 to 14/04/2023. A scoring and selection approach to identify users from Italy (treatment group) and Germany (control group) yielded a dataset of 244,401 commits from 10,520 individual GitHub users. Results suggest that users affiliated with organizations show a 12.12% increase in GitHub events after the ban, implying a decrease in coding efficiency. In contrast, individual users' activity remains largely unaffected. Coding errors rose by 8.91% on business days, further indicating a reduction in code quality, while results for weekends and public holidays were not statistically significant. Lastly, organization-affiliated users active on business days exhibited a 13.39% increase in GitHub events post-ban, suggesting reduced coding efficiency, and a 20.6% increase in coding errors, pointing to a decline in code quality. No empirical evidence is found for an effect of the ban on collaboration practices. These findings suggest that ChatGPT is well integrated into the daily software development workflow and actively used to assist in writing and debugging code, especially in professional settings.
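
Illustrative note: the difference-in-differences design named in the abstract can be sketched in its simplest two-group, two-period form. The snippet below is only a minimal sketch under stated assumptions. It computes the canonical 2x2 estimate from group means, whereas the thesis may well use a regression specification with controls or transformed outcomes, which this record does not describe. The function name `did_estimate` and the toy numbers are hypothetical and are not taken from the thesis.

```python
from statistics import mean


def did_estimate(rows):
    """Canonical 2x2 difference-in-differences estimate.

    rows: iterable of (treated, post, outcome) tuples, where
      treated is True for the treatment group (e.g. Italy),
      post is True for observations after the intervention,
      outcome is the measured quantity (e.g. events per user).
    """
    # Collect outcomes into the four (group, period) cells.
    cells = {(t, p): [] for t in (False, True) for p in (False, True)}
    for treated, post, outcome in rows:
        cells[(treated, post)].append(outcome)

    means = {key: mean(values) for key, values in cells.items()}

    # Change over time within each group.
    treated_change = means[(True, True)] - means[(True, False)]
    control_change = means[(False, True)] - means[(False, False)]

    # The control group's change stands in for the trend the treated
    # group would have followed without the intervention.
    return treated_change - control_change


# Toy numbers invented for illustration only; NOT data from the thesis.
toy = [
    (True, False, 10.0), (True, False, 12.0),   # treated, before
    (True, True, 13.0), (True, True, 15.0),     # treated, after
    (False, False, 9.0), (False, False, 11.0),  # control, before
    (False, True, 10.0), (False, True, 12.0),   # control, after
]
print(did_estimate(toy))
```

The estimate is only interpretable as a causal effect if the two groups would have followed parallel trends in the absence of the intervention, which is the standard identifying assumption of this design.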
id RCAP_59eaaa65a75167b49f5934967a3768d2
oai_identifier_str oai:run.unl.pt:10362/176254
network_acronym_str RCAP
network_name_str Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository_id_str https://opendoar.ac.uk/repository/7160
author Menezes, Guilherme Nascimento Ortega de Paiva
author_facet Menezes, Guilherme Nascimento Ortega de Paiva
author_role author
dc.contributor.none.fl_str_mv Batikas, Michail
RUN
dc.contributor.author.fl_str_mv Menezes, Guilherme Nascimento Ortega de Paiva
dc.subject.por.fl_str_mv Software development
ChatGPT
GitHub
Code quality
Differences in differences
LLM
Machine learning
Data analysis
Text classification
Data curation
Domain/Scientific Area::Social Sciences::Economics and Management
publishDate 2023
dc.date.none.fl_str_mv 2023-12-19
2024-12-06T10:56:12Z
2024-01-12
2024-01-12T00:00:00Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10362/176254
TID:203680731
url http://hdl.handle.net/10362/176254
identifier_str_mv TID:203680731
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron:RCAAP
instname_str FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron_str RCAAP
institution RCAAP
reponame_str Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
collection Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository.name.fl_str_mv Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
repository.mail.fl_str_mv info@rcaap.pt
_version_ 1833597999809298432