Deep learning for early performance prediction of introductory programming students: a comparative and explanatory study
Main author: | Pereira, Filipe Dwan |
---|---|
Publication date: | 2020 |
Other authors: | Fonseca, Samuel C.; Oliveira, Elaine H. T.; Oliveira, David B. F.; Cristea, Alexandra I.; Carvalho, Leandro S. G. |
Document type: | Article |
Language: | eng |
Source title: | Revista Brasileira de Informática na Educação |
Full text: | https://journals-sol.sbc.org.br/index.php/rbie/article/view/3959 |
Abstract: | Introductory programming can be difficult for many students, and these courses have high failure and dropout rates. One potential way to tackle this problem is to predict student performance at an early stage, since this facilitates human-AI collaboration towards prescriptive analytics, in which instructors and monitors are told how to intervene and support students, and early intervention is crucial. However, the literature states that there is not yet a reliable predictor of programming students’ performance, since even large-scale analyses of multiple features have yielded only limited predictive power. Deep Learning (DL) can provide high-quality results for large amounts of data and complex problems. We therefore employed DL for early prediction of students’ performance, using data collected in the first two weeks of introductory programming courses offered to a total of 2058 students over 6 semesters (a longitudinal study). We compared our results with the state of the art, an Evolutionary Algorithm (EA) that automatically creates and optimises machine learning pipelines. Our DL model achieved an average accuracy of 82.5%, which is statistically superior to the model constructed and optimised by the EA (p-value << 0.05 even with Bonferroni correction). We also adapted the DL model into a stacking ensemble for continuous prediction. The resulting regression model explained ~62% of the final grade variance. Finally, we provide results on the interpretation of our regression model to understand the leading factors of success and failure in introductory programming. |
id |
SBC-6_59e1fba1decf80ddbf49561161a7e205 |
---|---|
oai_identifier_str |
oai:journals-sol.sbc.org.br:article/3959 |
network_acronym_str |
SBC-6 |
network_name_str |
Revista Brasileira de Informática na Educação |
repository_id_str |
|
spelling |
Deep learning for early performance prediction of introductory programming students: a comparative and explanatory study; Online judges; Deep Learning; CS1; introductory programming; prediction; Sociedade Brasileira de Computação; 2020-10-12; info:eu-repo/semantics/article; info:eu-repo/semantics/publishedVersion; Peer-reviewed article; application/pdf; https://journals-sol.sbc.org.br/index.php/rbie/article/view/3959; 10.5753/rbie.2020.28.0.723; Revista Brasileira de Informática na Educação, Vol. 28 (2020), 723-748; 2317-6121; 1414-5685; reponame:Revista Brasileira de Informática na Educação; instname:Sociedade Brasileira de Computação (SBC); instacron:SBC; eng; https://journals-sol.sbc.org.br/index.php/rbie/article/view/3959/2462; Copyright (c) 2020 Filipe Dwan Pereira, Samuel C. Fonseca, Elaine H. T. Oliveira, David B. F. Oliveira, Alexandra I. Cristea, Leandro S. G. Carvalho; https://creativecommons.org/licenses/by-nc-nd/4.0; info:eu-repo/semantics/openAccess; Pereira, Filipe Dwan; Fonseca, Samuel C.; Oliveira, Elaine H. T.; Oliveira, David B. F.; Cristea, Alexandra I.; Carvalho, Leandro S. G.; 2023-12-28T21:06:21Z; oai:journals-sol.sbc.org.br:article/3959; Revista; https://journals-sol.sbc.org.br/index.php/rbie; ONG; https://journals-sol.sbc.org.br/index.php/rbie/oai; publicacoes@sbc.org.br; opendoar:2023-12-28T21:06:21; Revista Brasileira de Informática na Educação - Sociedade Brasileira de Computação (SBC); false |
dc.title.none.fl_str_mv |
Deep learning for early performance prediction of introductory programming students: a comparative and explanatory study |
title |
Deep learning for early performance prediction of introductory programming students: a comparative and explanatory study |
spellingShingle |
Deep learning for early performance prediction of introductory programming students: a comparative and explanatory study; Pereira, Filipe Dwan; Online judges; Deep Learning; CS1; introductory programming; prediction |
title_short |
Deep learning for early performance prediction of introductory programming students: a comparative and explanatory study |
title_full |
Deep learning for early performance prediction of introductory programming students: a comparative and explanatory study |
title_fullStr |
Deep learning for early performance prediction of introductory programming students: a comparative and explanatory study |
title_full_unstemmed |
Deep learning for early performance prediction of introductory programming students: a comparative and explanatory study |
title_sort |
Deep learning for early performance prediction of introductory programming students: a comparative and explanatory study |
author |
Pereira, Filipe Dwan |
author_facet |
Pereira, Filipe Dwan; Fonseca, Samuel C.; Oliveira, Elaine H. T.; Oliveira, David B. F.; Cristea, Alexandra I.; Carvalho, Leandro S. G. |
author_role |
author |
author2 |
Fonseca, Samuel C.; Oliveira, Elaine H. T.; Oliveira, David B. F.; Cristea, Alexandra I.; Carvalho, Leandro S. G. |
author2_role |
author author author author author |
dc.contributor.author.fl_str_mv |
Pereira, Filipe Dwan; Fonseca, Samuel C.; Oliveira, Elaine H. T.; Oliveira, David B. F.; Cristea, Alexandra I.; Carvalho, Leandro S. G. |
dc.subject.por.fl_str_mv |
Online judges; Deep Learning; CS1; introductory programming; prediction |
topic |
Online judges; Deep Learning; CS1; introductory programming; prediction |
description |
Introductory programming can be difficult for many students, and these courses have high failure and dropout rates. One potential way to tackle this problem is to predict student performance at an early stage, since this facilitates human-AI collaboration towards prescriptive analytics, in which instructors and monitors are told how to intervene and support students, and early intervention is crucial. However, the literature states that there is not yet a reliable predictor of programming students’ performance, since even large-scale analyses of multiple features have yielded only limited predictive power. Deep Learning (DL) can provide high-quality results for large amounts of data and complex problems. We therefore employed DL for early prediction of students’ performance, using data collected in the first two weeks of introductory programming courses offered to a total of 2058 students over 6 semesters (a longitudinal study). We compared our results with the state of the art, an Evolutionary Algorithm (EA) that automatically creates and optimises machine learning pipelines. Our DL model achieved an average accuracy of 82.5%, which is statistically superior to the model constructed and optimised by the EA (p-value << 0.05 even with Bonferroni correction). We also adapted the DL model into a stacking ensemble for continuous prediction. The resulting regression model explained ~62% of the final grade variance. Finally, we provide results on the interpretation of our regression model to understand the leading factors of success and failure in introductory programming. |
publishDate |
2020 |
dc.date.none.fl_str_mv |
2020-10-12 |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/article; info:eu-repo/semantics/publishedVersion; Peer-reviewed article |
format |
article |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
https://journals-sol.sbc.org.br/index.php/rbie/article/view/3959; 10.5753/rbie.2020.28.0.723 |
url |
https://journals-sol.sbc.org.br/index.php/rbie/article/view/3959 |
identifier_str_mv |
10.5753/rbie.2020.28.0.723 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.relation.none.fl_str_mv |
https://journals-sol.sbc.org.br/index.php/rbie/article/view/3959/2462 |
dc.rights.driver.fl_str_mv |
https://creativecommons.org/licenses/by-nc-nd/4.0; info:eu-repo/semantics/openAccess |
rights_invalid_str_mv |
https://creativecommons.org/licenses/by-nc-nd/4.0 |
eu_rights_str_mv |
openAccess |
dc.format.none.fl_str_mv |
application/pdf |
dc.publisher.none.fl_str_mv |
Sociedade Brasileira de Computação |
publisher.none.fl_str_mv |
Sociedade Brasileira de Computação |
dc.source.none.fl_str_mv |
Revista Brasileña de Informática en la Educación; Vol. 28 (2020); 723-748 / Revista Brasileira de Informática na Educação; Vol. 28 (2020); 723-748 / Brazilian Journal of Computers in Education; Vol. 28 (2020); 723-748 / 2317-6121 / 1414-5685 / reponame:Revista Brasileira de Informática na Educação / instname:Sociedade Brasileira de Computação (SBC) / instacron:SBC |
instname_str |
Sociedade Brasileira de Computação (SBC) |
instacron_str |
SBC |
institution |
SBC |
reponame_str |
Revista Brasileira de Informática na Educação |
collection |
Revista Brasileira de Informática na Educação |
repository.name.fl_str_mv |
Revista Brasileira de Informática na Educação - Sociedade Brasileira de Computação (SBC) |
repository.mail.fl_str_mv |
publicacoes@sbc.org.br |
_version_ |
1832111043536486400 |