The epidemics of programming language adoption

Bibliographic Details
Main Author: BARREIROS, Emanoel Francisco Spósito
Publication Date: 2016
Format: Doctoral thesis
Language: eng
Source: Repositório Institucional da UFPE
dARK ID: ark:/64986/001300000xrng
Download full: https://repositorio.ufpe.br/handle/123456789/18000
Summary: Context: In Software Engineering, technology transfer has been treated as a problem that concernsonly two agents (innovation and adoption agents) working together to fill the knowledge gap between them. In this scenario, the transfer is carried out in a “peer-to-peer” fashion, not changing the reality of individuals and organizations around them. This approach works well when one is just seeking the adoption of a technology by a“specific client”. However, it can not solve a common problem that is the adoption of new technologies by a large mass of potential new users. In a wider context like this, it no longer makes sense to focus on “peer-to-peer” transfer. A new way of looking at the problem is necessary. It makes more sense to approach it as diffusion of innovations, where there is an information spreading in a community, similar to that observed in epidemics. Objective: This thesis proposes a paradigm shift to show the adoption of programming languages can be formally addressed as an epidemic. This focus shift allows the dynamics of programming language adoption to be mathematically modelled as such, and besides finding models that explain the community’s behaviour when adopting programming languages, it allows some predictions to be made, helping both individuals who wish to adopt a new language that might seem to be a new industry standard, and language designers to understand in real time the adoption of a particular language by a community. Method: After a proof of concept with data from Sourceforge (2000 to 2009), data from GitHub (2009 to January 2016), a well-known open source software repository, and Stack Overflow (2008 to March 2016), a popular Q&A system for software developers, were obtained and preprocessed. Using cumulative biological growth functions, often used in epidemiological contexts, we obtained adjusted models to the data. Once with the adjusted models, we evaluated their predictive capabilities through repeated applications of hypothesis testing and statistical calculations in different versions of the models obtained after adjusting the functions to samples of different time frames from the repositories. Results: We show that programming language adoption can be formally considered an epidemiological phenomenon by adjusting a well-known mathematical function used to describe such phenomena. We also show that, using the models found, it is possible to forecast programming languages adoption. We also show that it is possible to have similar insights by observing user data, as well as data from the community itself, not using software developers as susceptible individuals. Limitations: The forecast of the adoption outcome (asymptote) needs to be taken with care because it varies depending on the sample size, which also influences the quality of forecasts in general. Unfortunately, we not always have control over the sample size, because it depends on the population under analysis. The forecast of programming language adoption is only valid for the analysed population; generalizations should be made with caution. Conclusion: Addressing programming languages adoption as an epidemiological phenomenon allows us to perform analyses not possible otherwise. We can have an overview of a population in real time regarding the use of a programming language, which allows us, as innovation agents, to adjust our technology if it is not achieving the desired “penetration”; as adoption agents, we may decide, ahead of our competitors, to adopt a seemingly promising technology that may ultimately become a standard.
id UFPE_2008e39dbdda5c93e85c150c81b768a7
oai_identifier_str oai:repositorio.ufpe.br:123456789/18000
network_acronym_str UFPE
network_name_str Repositório Institucional da UFPE
repository_id_str 2221
spelling The epidemics of programming language adoptionSoftware Engineering. Technology Transfer. Diffusion of Innovations. Programming Languages Adoption. Epidemic Models. Computational EpidemiologyEngenharia de Software. Transferência de Tecnologia. Difusão de Inovações. Adoção de Linguagens de Programação. Modelos Epidemiológicos. Epidemiologia ComputacionalContext: In Software Engineering, technology transfer has been treated as a problem that concernsonly two agents (innovation and adoption agents) working together to fill the knowledge gap between them. In this scenario, the transfer is carried out in a “peer-to-peer” fashion, not changing the reality of individuals and organizations around them. This approach works well when one is just seeking the adoption of a technology by a“specific client”. However, it can not solve a common problem that is the adoption of new technologies by a large mass of potential new users. In a wider context like this, it no longer makes sense to focus on “peer-to-peer” transfer. A new way of looking at the problem is necessary. It makes more sense to approach it as diffusion of innovations, where there is an information spreading in a community, similar to that observed in epidemics. Objective: This thesis proposes a paradigm shift to show the adoption of programming languages can be formally addressed as an epidemic. This focus shift allows the dynamics of programming language adoption to be mathematically modelled as such, and besides finding models that explain the community’s behaviour when adopting programming languages, it allows some predictions to be made, helping both individuals who wish to adopt a new language that might seem to be a new industry standard, and language designers to understand in real time the adoption of a particular language by a community. Method: After a proof of concept with data from Sourceforge (2000 to 2009), data from GitHub (2009 to January 2016), a well-known open source software repository, and Stack Overflow (2008 to March 2016), a popular Q&A system for software developers, were obtained and preprocessed. Using cumulative biological growth functions, often used in epidemiological contexts, we obtained adjusted models to the data. Once with the adjusted models, we evaluated their predictive capabilities through repeated applications of hypothesis testing and statistical calculations in different versions of the models obtained after adjusting the functions to samples of different time frames from the repositories. Results: We show that programming language adoption can be formally considered an epidemiological phenomenon by adjusting a well-known mathematical function used to describe such phenomena. We also show that, using the models found, it is possible to forecast programming languages adoption. We also show that it is possible to have similar insights by observing user data, as well as data from the community itself, not using software developers as susceptible individuals. Limitations: The forecast of the adoption outcome (asymptote) needs to be taken with care because it varies depending on the sample size, which also influences the quality of forecasts in general. Unfortunately, we not always have control over the sample size, because it depends on the population under analysis. The forecast of programming language adoption is only valid for the analysed population; generalizations should be made with caution. Conclusion: Addressing programming languages adoption as an epidemiological phenomenon allows us to perform analyses not possible otherwise. We can have an overview of a population in real time regarding the use of a programming language, which allows us, as innovation agents, to adjust our technology if it is not achieving the desired “penetration”; as adoption agents, we may decide, ahead of our competitors, to adopt a seemingly promising technology that may ultimately become a standard.FACEPEContexto: Em Engenharia de Software, transferência de tecnologia tem sido tratada como um problema pontual, um processo que diz respeito a dois agentes (os agentes de inovação e adoção) trabalhando juntos para preencher uma lacuna no conhecimento entre estes dois. Neste cenário, a transferência é realizada “ponto a ponto”, envolvendo e tendo efeito apenas nos indivíduos que participam do processo. Esta abordagem funciona bem quando se está buscando apenas a adoção da tecnologia por um “cliente” específico. No entanto, ela não consegue resolver um problema bastante comum que é a adoção de novas tecnologias por uma grande massa de potenciais novos usuários. Neste contexto mais amplo, não faz mais sentido focar em transferência ponto a ponto, faz-se necessária uma nova maneira de olhar para o problema. É mais interessante abordá-lo como difusão de inovações, onde existe um espalhamento da informação em uma comunidade, de maneira semelhante ao que se observa em epidemias. Objetivo: Esta tese de doutorado mostra que a adoção de linguagens de programação pode ser tratada formalmente como uma epidemia. Esta mudança conceitual na maneira de olhar para o fenômeno permite que a dinâmica da adoção de linguagens de programação seja modelada matematicamente como tal, e além de encontrar modelos que expliquem o comportamento da comunidade quando da adoção de uma linguagem de programação, permite que algumas previsões sejam realizadas, ajudando tanto indivíduos que desejem adotar uma nova linguagem que parece se apresentar como um novo padrão industrial, quanto ajudando projetistas de linguagens a entender em tempo real a adoção de uma determinada linguagem pela comunidade. Método: Após uma prova de conceito com dados do Sourceforge (2000 a 2009), dados do GitHub (2009 a janeiro de 2016) um repositório de projetos software de código aberto, e Stack Overflow (2008 a março de 2016) um popular sistema de perguntas e respostas para desenvolvedores de software, from obtidos e pré processados. Utilizando uma função de crescimento biológico cumulativo, frequentemente usada em contextos epidemiológicos, obtivemos modelos ajustados aos dados. Uma vez com os modelos ajustados, realizamos avaliações de sua precisão. Avaliamos suas capacidades de previsão através de repetidas aplicações de testes de hipóteses e cálculos de estatísticas em diferentes versões dos modelos, obtidas após ajustes das funções a amostras de diferentes tamanhos dos dados obtidos. Resultados: Mostramos que a adoção de linguagens de programação pode ser considerada formalmente um fenômeno epidemiológico através do ajuste de uma função matemática reconhecidamente útil para descrever tais fenômenos. Mostramos também que é possível, utilizando os modelos encontrados, realizar previsões da adoção de linguagens de programação em uma determinada comunidade. Ainda, mostramos que é possível obter conclusões semelhantes observando dados de usuários e dados da comunidade apenas, não usando desenvolvedores de software como indivíduos suscetíveis. Limitações: A previsão do limite superior da adoção (assíntota) não é confiável, variando muito dependendo do tamanho da amostra, que também influencia na qualidade das previsões em geral. Infelizmente, nem sempre teremos controle sob o tamanho da amostra, pois ela depende da população em análise. A adoção da linguagem de programação só é válida para a população em análise; generalizações devem ser realizadas com cautela. Conclusão: Abordar o fenômeno de adoção de linguagens de programação como um fenômeno epidemiológico nos permite realizar análises que não são possíveis de outro modo. Podemos ter uma visão geral de uma população em tempo real no que diz respeito ao uso de uma linguagem de programação, o que nos permite, com agentes de inovação, ajustar a tecnologia caso ela não esteja alcançando o alcance desejado; como agentes de adoção, podemos decidir por adotar uma tecnologia aparentemente promissora que pode vir a se tornar um padrão.Universidade Federal de PernambucoUFPEBrasilPrograma de Pos Graduacao em Ciencia da ComputacaoSOARES, Sérgio Castelo BrancoALBUQUERQUE, Jones Oliveira dehttp://lattes.cnpq.br/9019769374377941http://lattes.cnpq.br/6456667887502521BARREIROS, Emanoel Francisco Spósito2016-10-17T18:29:55Z2016-10-17T18:29:55Z2016-08-29info:eu-repo/semantics/publishedVersioninfo:eu-repo/semantics/doctoralThesisapplication/pdfhttps://repositorio.ufpe.br/handle/123456789/18000ark:/64986/001300000xrngengAttribution-NonCommercial-NoDerivs 3.0 Brazilhttp://creativecommons.org/licenses/by-nc-nd/3.0/br/info:eu-repo/semantics/openAccessreponame:Repositório Institucional da UFPEinstname:Universidade Federal de Pernambuco (UFPE)instacron:UFPE2019-10-25T05:19:43Zoai:repositorio.ufpe.br:123456789/18000Repositório InstitucionalPUBhttps://repositorio.ufpe.br/oai/requestattena@ufpe.bropendoar:22212019-10-25T05:19:43Repositório Institucional da UFPE - Universidade Federal de Pernambuco (UFPE)false
dc.title.none.fl_str_mv The epidemics of programming language adoption
title The epidemics of programming language adoption
spellingShingle The epidemics of programming language adoption
BARREIROS, Emanoel Francisco Spósito
Software Engineering. Technology Transfer. Diffusion of Innovations. Programming Languages Adoption. Epidemic Models. Computational Epidemiology
Engenharia de Software. Transferência de Tecnologia. Difusão de Inovações. Adoção de Linguagens de Programação. Modelos Epidemiológicos. Epidemiologia Computacional
title_short The epidemics of programming language adoption
title_full The epidemics of programming language adoption
title_fullStr The epidemics of programming language adoption
title_full_unstemmed The epidemics of programming language adoption
title_sort The epidemics of programming language adoption
author BARREIROS, Emanoel Francisco Spósito
author_facet BARREIROS, Emanoel Francisco Spósito
author_role author
dc.contributor.none.fl_str_mv SOARES, Sérgio Castelo Branco
ALBUQUERQUE, Jones Oliveira de
http://lattes.cnpq.br/9019769374377941
http://lattes.cnpq.br/6456667887502521
dc.contributor.author.fl_str_mv BARREIROS, Emanoel Francisco Spósito
dc.subject.por.fl_str_mv Software Engineering. Technology Transfer. Diffusion of Innovations. Programming Languages Adoption. Epidemic Models. Computational Epidemiology
Engenharia de Software. Transferência de Tecnologia. Difusão de Inovações. Adoção de Linguagens de Programação. Modelos Epidemiológicos. Epidemiologia Computacional
topic Software Engineering. Technology Transfer. Diffusion of Innovations. Programming Languages Adoption. Epidemic Models. Computational Epidemiology
Engenharia de Software. Transferência de Tecnologia. Difusão de Inovações. Adoção de Linguagens de Programação. Modelos Epidemiológicos. Epidemiologia Computacional
description Context: In Software Engineering, technology transfer has been treated as a problem that concernsonly two agents (innovation and adoption agents) working together to fill the knowledge gap between them. In this scenario, the transfer is carried out in a “peer-to-peer” fashion, not changing the reality of individuals and organizations around them. This approach works well when one is just seeking the adoption of a technology by a“specific client”. However, it can not solve a common problem that is the adoption of new technologies by a large mass of potential new users. In a wider context like this, it no longer makes sense to focus on “peer-to-peer” transfer. A new way of looking at the problem is necessary. It makes more sense to approach it as diffusion of innovations, where there is an information spreading in a community, similar to that observed in epidemics. Objective: This thesis proposes a paradigm shift to show the adoption of programming languages can be formally addressed as an epidemic. This focus shift allows the dynamics of programming language adoption to be mathematically modelled as such, and besides finding models that explain the community’s behaviour when adopting programming languages, it allows some predictions to be made, helping both individuals who wish to adopt a new language that might seem to be a new industry standard, and language designers to understand in real time the adoption of a particular language by a community. Method: After a proof of concept with data from Sourceforge (2000 to 2009), data from GitHub (2009 to January 2016), a well-known open source software repository, and Stack Overflow (2008 to March 2016), a popular Q&A system for software developers, were obtained and preprocessed. Using cumulative biological growth functions, often used in epidemiological contexts, we obtained adjusted models to the data. Once with the adjusted models, we evaluated their predictive capabilities through repeated applications of hypothesis testing and statistical calculations in different versions of the models obtained after adjusting the functions to samples of different time frames from the repositories. Results: We show that programming language adoption can be formally considered an epidemiological phenomenon by adjusting a well-known mathematical function used to describe such phenomena. We also show that, using the models found, it is possible to forecast programming languages adoption. We also show that it is possible to have similar insights by observing user data, as well as data from the community itself, not using software developers as susceptible individuals. Limitations: The forecast of the adoption outcome (asymptote) needs to be taken with care because it varies depending on the sample size, which also influences the quality of forecasts in general. Unfortunately, we not always have control over the sample size, because it depends on the population under analysis. The forecast of programming language adoption is only valid for the analysed population; generalizations should be made with caution. Conclusion: Addressing programming languages adoption as an epidemiological phenomenon allows us to perform analyses not possible otherwise. We can have an overview of a population in real time regarding the use of a programming language, which allows us, as innovation agents, to adjust our technology if it is not achieving the desired “penetration”; as adoption agents, we may decide, ahead of our competitors, to adopt a seemingly promising technology that may ultimately become a standard.
publishDate 2016
dc.date.none.fl_str_mv 2016-10-17T18:29:55Z
2016-10-17T18:29:55Z
2016-08-29
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/doctoralThesis
format doctoralThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://repositorio.ufpe.br/handle/123456789/18000
dc.identifier.dark.fl_str_mv ark:/64986/001300000xrng
url https://repositorio.ufpe.br/handle/123456789/18000
identifier_str_mv ark:/64986/001300000xrng
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv Attribution-NonCommercial-NoDerivs 3.0 Brazil
http://creativecommons.org/licenses/by-nc-nd/3.0/br/
info:eu-repo/semantics/openAccess
rights_invalid_str_mv Attribution-NonCommercial-NoDerivs 3.0 Brazil
http://creativecommons.org/licenses/by-nc-nd/3.0/br/
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Universidade Federal de Pernambuco
UFPE
Brasil
Programa de Pos Graduacao em Ciencia da Computacao
publisher.none.fl_str_mv Universidade Federal de Pernambuco
UFPE
Brasil
Programa de Pos Graduacao em Ciencia da Computacao
dc.source.none.fl_str_mv reponame:Repositório Institucional da UFPE
instname:Universidade Federal de Pernambuco (UFPE)
instacron:UFPE
instname_str Universidade Federal de Pernambuco (UFPE)
instacron_str UFPE
institution UFPE
reponame_str Repositório Institucional da UFPE
collection Repositório Institucional da UFPE
repository.name.fl_str_mv Repositório Institucional da UFPE - Universidade Federal de Pernambuco (UFPE)
repository.mail.fl_str_mv attena@ufpe.br
_version_ 1846272633573736448