Using data mining to predict secondary school student performance

Bibliographic Details
Main Author: Cortez, Paulo
Publication Date: 2008
Other Authors: Silva, Alice Maria Gonçalves
Language: eng
Source: Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
Download full: http://hdl.handle.net/1822/8024
Summary: Although the educational level of the Portuguese population has improved in the last decades, the statistics keep Portugal at Europe’s tail end due to its high student failure rates. In particular, lack of success in the core classes of Mathematics and the Portuguese language is extremely serious. On the other hand, the fields of Business Intelligence (BI)/Data Mining (DM), which aim at extracting high-level knowledge from raw data, offer interesting automated tools that can aid the education domain. The present work intends to approach student achievement in secondary education using BI/DM techniques. Recent real-world data (e.g. student grades, demographic, social and school-related features) was collected by using school reports and questionnaires. The two core classes (i.e. Mathematics and Portuguese) were modeled under binary/five-level classification and regression tasks. Also, four DM models (i.e. Decision Trees, Random Forest, Neural Networks and Support Vector Machines) and three input selections (e.g. with and without previous grades) were tested. The results show that a good predictive accuracy can be achieved, provided that the first and/or second school period grades are available. Although student achievement is highly influenced by past evaluations, an explanatory analysis has shown that there are also other relevant features (e.g. number of absences, parents’ job and education, alcohol consumption). As a direct outcome of this research, more efficient student prediction tools can be developed, improving the quality of education and enhancing school resource management.
id RCAP_ac56b4f70e448e77dee99d74f937194c
oai_identifier_str oai:repositorium.sdum.uminho.pt:1822/8024
network_acronym_str RCAP
network_name_str Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository_id_str https://opendoar.ac.uk/repository/7160
dc.title.none.fl_str_mv Using data mining to predict secondary school student performance
title Using data mining to predict secondary school student performance
author Cortez, Paulo
author_facet Cortez, Paulo
Silva, Alice Maria Gonçalves
author_role author
author2 Silva, Alice Maria Gonçalves
author2_role author
dc.contributor.none.fl_str_mv Universidade do Minho
dc.contributor.author.fl_str_mv Cortez, Paulo
Silva, Alice Maria Gonçalves
dc.subject.por.fl_str_mv Business intelligence in education
Classification and regression
Decision trees
Random forest
Social Sciences
Science & Technology
publishDate 2008
dc.date.none.fl_str_mv 2008-04
2008-04-01T00:00:00Z
dc.type.driver.fl_str_mv conference paper
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/1822/8024
url http://hdl.handle.net/1822/8024
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv BRITO, A. ; TEIXEIRA, J., eds. lit. – “Proceedings of 5th Annual Future Business Technology Conference, Porto, 2008”. [S.l. : EUROSIS, 2008]. ISBN 978-9077381-39-7. p. 5-12.
978-9077381-39-7
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv EUROSIS-ETI
publisher.none.fl_str_mv EUROSIS-ETI
dc.source.none.fl_str_mv reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron:RCAAP
instname_str FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron_str RCAAP
institution RCAAP
reponame_str Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
collection Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository.name.fl_str_mv Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
repository.mail.fl_str_mv info@rcaap.pt