Multiple imputation in big identifiable data for educational research: An example from the Brazilian education assessment system

Bibliographic Details
Main Author: Ferrão, Maria Eugénia
Publication Date: 2020
Other Authors: Prata, Paula; Alves, Maria Teresa G.
Format: Article
Language: eng
Source: Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
Download full: http://hdl.handle.net/10400.6/10484
Summary: Almost all quantitative studies in educational assessment, evaluation and educational research are based on incomplete data sets, which have been a problem for years without a single solution. The use of big identifiable data poses new challenges in dealing with missing values. In the first part of this paper, we present the state of the art of the topic in the Brazilian education scientific literature, and how researchers have dealt with missing data since the turn of the century. Next, we use open access software to analyze real-world data, the 2017 Prova Brasil, for several federation units to document how the naïve assumption of missing completely at random may substantially affect statistical conclusions, researcher interpretations, and subsequent implications for policy and practice. We conclude with straightforward suggestions for any education researcher on applying R routines to test the hypothesis that data are missing completely at random and, if the null hypothesis is rejected, on how to implement multiple imputation, which appears to be one of the most appropriate methods for handling missing data.
id RCAP_e13b79e2396f1d3fac07a2456c2d3609
oai_identifier_str oai:ubibliorum.ubi.pt:10400.6/10484
network_acronym_str RCAP
network_name_str Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository_id_str https://opendoar.ac.uk/repository/7160
author Ferrão, Maria Eugénia
author_facet Ferrão, Maria Eugénia
Prata, Paula
Alves, Maria Teresa G.
author_role author
author2 Prata, Paula
Alves, Maria Teresa G.
author2_role author
author
dc.contributor.none.fl_str_mv uBibliorum
dc.contributor.author.fl_str_mv Ferrão, Maria Eugénia
Prata, Paula
Alves, Maria Teresa G.
dc.subject.por.fl_str_mv Prova Brasil
Missing data
R
Multiple imputation
publishDate 2020
dc.date.none.fl_str_mv 2020-10-26T09:41:00Z
2020
2020-01-01T00:00:00Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10400.6/10484
url http://hdl.handle.net/10400.6/10484
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv 10.1590/s0104-40362020002802346
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv Scielo
publisher.none.fl_str_mv Scielo
dc.source.none.fl_str_mv reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron:RCAAP
instname_str FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron_str RCAAP
institution RCAAP
reponame_str Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
collection Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository.name.fl_str_mv Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
repository.mail.fl_str_mv info@rcaap.pt
_version_ 1833601005086834688