Deep learning for early performance prediction of introductory programming students: a comparative and explanatory study

Pereira, Filipe Dwan; Fonseca, Samuel C.; Oliveira, Elaine H. T.; Oliveira, David B. F.; Cristea, Alexandra I.; Carvalho, Leandro S. G.

Bibliographic details
Main author:	Pereira, Filipe Dwan
Publication date:	2020
Other authors:	Fonseca, Samuel C.; Oliveira, Elaine H. T.; Oliveira, David B. F.; Cristea, Alexandra I.; Carvalho, Leandro S. G.
Document type:	Article
Language:	eng
Source title:	Revista Brasileira de Informática na Educação
Full text:	https://journals-sol.sbc.org.br/index.php/rbie/article/view/3959
Abstract:	Introductory programming may be complex for many students. Moreover, there is a high failure and dropout rate in these courses. A potential way to tackle this problem is to predict student performance at an early stage, as it facilitates human-AI collaboration towards prescriptive analytics, where the instructors/monitors are told how to intervene and support students, and where early intervention is crucial. However, the literature states that there is no reliable predictor yet for programming students’ performance, since even large-scale analyses of multiple features have resulted in only limited predictive power. Deep Learning (DL) can provide high-quality results for huge amounts of data and complex problems. We therefore employed DL for early prediction of students’ performance using data collected in the very first two weeks of introductory programming courses offered to a total of 2058 students during 6 semesters (longitudinal study). We compared our results with the state of the art, an Evolutionary Algorithm (EA) that automatically creates and optimises machine learning pipelines. Our DL model achieved an average accuracy of 82.5%, which is statistically superior to the model constructed and optimised by the EA (p-value << 0.05 even with Bonferroni correction). In addition, we adapted the DL model in a stacking ensemble for continuous prediction purposes. As a result, our regression model explained ~62% of the final grade variance. In closing, we also provide results on the interpretation of our regression model to understand the leading factors of success and failure in introductory programming.

Item metadata

id	SBC-6_59e1fba1decf80ddbf49561161a7e205
oai_identifier_str	oai:journals-sol.sbc.org.br:article/3959
network_acronym_str	SBC-6
network_name_str	Revista Brasileira de Informática na Educação
repository_id_str
dc.title.none.fl_str_mv	Deep learning for early performance prediction of introductory programming students: a comparative and explanatory study
title	Deep learning for early performance prediction of introductory programming students: a comparative and explanatory study
spellingShingle	Deep learning for early performance prediction of introductory programming students: a comparative and explanatory study; Pereira, Filipe Dwan; Online judges; Deep Learning; CS1; introductory programming; prediction
title_short	Deep learning for early performance prediction of introductory programming students: a comparative and explanatory study
title_full	Deep learning for early performance prediction of introductory programming students: a comparative and explanatory study
title_fullStr	Deep learning for early performance prediction of introductory programming students: a comparative and explanatory study
title_full_unstemmed	Deep learning for early performance prediction of introductory programming students: a comparative and explanatory study
title_sort	Deep learning for early performance prediction of introductory programming students: a comparative and explanatory study
author	Pereira, Filipe Dwan
author_facet	Pereira, Filipe Dwan; Fonseca, Samuel C.; Oliveira, Elaine H. T.; Oliveira, David B. F.; Cristea, Alexandra I.; Carvalho, Leandro S. G.
author_role	author
author2	Fonseca, Samuel C.; Oliveira, Elaine H. T.; Oliveira, David B. F.; Cristea, Alexandra I.; Carvalho, Leandro S. G.
author2_role	author author author author author
dc.contributor.author.fl_str_mv	Pereira, Filipe Dwan; Fonseca, Samuel C.; Oliveira, Elaine H. T.; Oliveira, David B. F.; Cristea, Alexandra I.; Carvalho, Leandro S. G.
dc.subject.por.fl_str_mv	Online judges; Deep Learning; CS1; introductory programming; prediction
topic	Online judges; Deep Learning; CS1; introductory programming; prediction
publishDate	2020
dc.date.none.fl_str_mv	2020-10-12
dc.type.driver.fl_str_mv	info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion Peer-reviewed article
format	article
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	https://journals-sol.sbc.org.br/index.php/rbie/article/view/3959 10.5753/rbie.2020.28.0.723
url	https://journals-sol.sbc.org.br/index.php/rbie/article/view/3959
identifier_str_mv	10.5753/rbie.2020.28.0.723
dc.language.iso.fl_str_mv	eng
language	eng
dc.relation.none.fl_str_mv	https://journals-sol.sbc.org.br/index.php/rbie/article/view/3959/2462
dc.rights.driver.fl_str_mv	https://creativecommons.org/licenses/by-nc-nd/4.0 info:eu-repo/semantics/openAccess
rights_invalid_str_mv	https://creativecommons.org/licenses/by-nc-nd/4.0
eu_rights_str_mv	openAccess
dc.format.none.fl_str_mv	application/pdf
dc.publisher.none.fl_str_mv	Sociedade Brasileira de Computação
publisher.none.fl_str_mv	Sociedade Brasileira de Computação
dc.source.none.fl_str_mv	Revista Brasileña de Informática en la Educación; Vol. 28 (2020); 723-748 Revista Brasileira de Informática na Educação; Vol. 28 (2020); 723-748 Brazilian Journal of Computers in Education; Vol. 28 (2020); 723-748 2317-6121 1414-5685 reponame:Revista Brasileira de Informática na Educação instname:Sociedade Brasileira de Computação (SBC) instacron:SBC
instname_str	Sociedade Brasileira de Computação (SBC)
instacron_str	SBC
institution	SBC
reponame_str	Revista Brasileira de Informática na Educação
collection	Revista Brasileira de Informática na Educação
repository.name.fl_str_mv	Revista Brasileira de Informática na Educação - Sociedade Brasileira de Computação (SBC)
repository.mail.fl_str_mv	publicacoes@sbc.org.br
_version_	1832111043536486400
