Deep learning for early performance prediction of introductory programming students: a comparative and explanatory study

Bibliographic details
Main author: Pereira, Filipe Dwan
Publication date: 2020
Other authors: Fonseca, Samuel C., Oliveira, Elaine H. T., Oliveira, David B. F., Cristea, Alexandra I., Carvalho, Leandro S. G.
Document type: Article
Language: eng
Source title: Revista Brasileira de Informática na Educação
Full text: https://journals-sol.sbc.org.br/index.php/rbie/article/view/3959
Abstract: Introductory programming can be complex for many students, and these courses have high failure and dropout rates. One potential way to tackle this problem is to predict student performance at an early stage, since this facilitates human-AI collaboration towards prescriptive analytics, in which instructors and monitors are told how to intervene and support students, a setting in which early intervention is crucial. However, the literature states that there is not yet a reliable predictor of programming students’ performance, since even large-scale analyses of multiple features have yielded only limited predictive power. Deep Learning (DL) can provide high-quality results for large amounts of data and complex problems. We therefore employed DL for early prediction of students’ performance, using data collected in the first two weeks of introductory programming courses offered to a total of 2058 students over 6 semesters (a longitudinal study). We compared our results with the state of the art, an Evolutionary Algorithm (EA) that automatically creates and optimises machine learning pipelines. Our DL model achieved an average accuracy of 82.5%, which is statistically superior to the model constructed and optimised by the EA (p-value << 0.05 even with Bonferroni correction). We also adapted the DL model into a stacking ensemble for continuous prediction. The resulting regression model explained ~62% of the final grade variance. Finally, we provide results on the interpretation of our regression model to understand the leading factors of success and failure in introductory programming.
id SBC-6_59e1fba1decf80ddbf49561161a7e205
oai_identifier_str oai:journals-sol.sbc.org.br:article/3959
network_acronym_str SBC-6
network_name_str Revista Brasileira de Informática na Educação
repository_id_str
spelling Copyright (c) 2020 Filipe Dwan Pereira, Samuel C. Fonseca, Elaine H. T. Oliveira, David B. F. Oliveira, Alexandra I. Cristea, Leandro S. G. Carvalho
2023-12-28T21:06:21Z
https://journals-sol.sbc.org.br/index.php/rbie/oai
dc.title.none.fl_str_mv Deep learning for early performance prediction of introductory programming students: a comparative and explanatory study
title Deep learning for early performance prediction of introductory programming students: a comparative and explanatory study
spellingShingle Deep learning for early performance prediction of introductory programming students: a comparative and explanatory study
Pereira, Filipe Dwan
Online judges
Deep Learning
CS1
introductory programming
prediction
title_short Deep learning for early performance prediction of introductory programming students: a comparative and explanatory study
title_full Deep learning for early performance prediction of introductory programming students: a comparative and explanatory study
title_fullStr Deep learning for early performance prediction of introductory programming students: a comparative and explanatory study
title_full_unstemmed Deep learning for early performance prediction of introductory programming students: a comparative and explanatory study
title_sort Deep learning for early performance prediction of introductory programming students: a comparative and explanatory study
author Pereira, Filipe Dwan
author_facet Pereira, Filipe Dwan
Fonseca, Samuel C.
Oliveira, Elaine H. T.
Oliveira, David B. F.
Cristea, Alexandra I.
Carvalho, Leandro S. G.
author_role author
author2 Fonseca, Samuel C.
Oliveira, Elaine H. T.
Oliveira, David B. F.
Cristea, Alexandra I.
Carvalho, Leandro S. G.
author2_role author
author
author
author
author
dc.contributor.author.fl_str_mv Pereira, Filipe Dwan
Fonseca, Samuel C.
Oliveira, Elaine H. T.
Oliveira, David B. F.
Cristea, Alexandra I.
Carvalho, Leandro S. G.
dc.subject.por.fl_str_mv Online judges
Deep Learning
CS1
introductory programming
prediction
topic Online judges
Deep Learning
CS1
introductory programming
prediction
description Introductory programming can be complex for many students, and these courses have high failure and dropout rates. One potential way to tackle this problem is to predict student performance at an early stage, since this facilitates human-AI collaboration towards prescriptive analytics, in which instructors and monitors are told how to intervene and support students, a setting in which early intervention is crucial. However, the literature states that there is not yet a reliable predictor of programming students’ performance, since even large-scale analyses of multiple features have yielded only limited predictive power. Deep Learning (DL) can provide high-quality results for large amounts of data and complex problems. We therefore employed DL for early prediction of students’ performance, using data collected in the first two weeks of introductory programming courses offered to a total of 2058 students over 6 semesters (a longitudinal study). We compared our results with the state of the art, an Evolutionary Algorithm (EA) that automatically creates and optimises machine learning pipelines. Our DL model achieved an average accuracy of 82.5%, which is statistically superior to the model constructed and optimised by the EA (p-value << 0.05 even with Bonferroni correction). We also adapted the DL model into a stacking ensemble for continuous prediction. The resulting regression model explained ~62% of the final grade variance. Finally, we provide results on the interpretation of our regression model to understand the leading factors of success and failure in introductory programming.
publishDate 2020
dc.date.none.fl_str_mv 2020-10-12
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
info:eu-repo/semantics/publishedVersion
Peer-reviewed article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://journals-sol.sbc.org.br/index.php/rbie/article/view/3959
10.5753/rbie.2020.28.0.723
url https://journals-sol.sbc.org.br/index.php/rbie/article/view/3959
identifier_str_mv 10.5753/rbie.2020.28.0.723
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv https://journals-sol.sbc.org.br/index.php/rbie/article/view/3959/2462
dc.rights.driver.fl_str_mv https://creativecommons.org/licenses/by-nc-nd/4.0
info:eu-repo/semantics/openAccess
rights_invalid_str_mv https://creativecommons.org/licenses/by-nc-nd/4.0
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Sociedade Brasileira de Computação
publisher.none.fl_str_mv Sociedade Brasileira de Computação
dc.source.none.fl_str_mv Revista Brasileña de Informática en la Educación; Vol. 28 (2020); 723-748
Revista Brasileira de Informática na Educação; Vol. 28 (2020); 723-748
Brazilian Journal of Computers in Education; Vol. 28 (2020); 723-748
2317-6121
1414-5685
reponame:Revista Brasileira de Informática na Educação
instname:Sociedade Brasileira de Computação (SBC)
instacron:SBC
instname_str Sociedade Brasileira de Computação (SBC)
instacron_str SBC
institution SBC
reponame_str Revista Brasileira de Informática na Educação
collection Revista Brasileira de Informática na Educação
repository.name.fl_str_mv Revista Brasileira de Informática na Educação - Sociedade Brasileira de Computação (SBC)
repository.mail.fl_str_mv publicacoes@sbc.org.br
_version_ 1832111043536486400