Explainable models for automated essay scoring in the presence of biased scoring

Bibliographic details
Year of defense: 2019
Main author: Evelin Carvalho Freire de Amorim
Advisor: Not informed by the institution
Defense committee: Not informed by the institution
Document type: Thesis
Access type: Open access
Language: eng
Defending institution: Universidade Federal de Minas Gerais
Brazil
ICX - DEPARTAMENTO DE CIÊNCIA DA COMPUTAÇÃO
Programa de Pós-Graduação em Ciência da Computação
UFMG
Graduate program: Not informed by the institution
Department: Not informed by the institution
Country: Not informed by the institution
Keywords in Portuguese:
Access link: http://hdl.handle.net/1843/33535
Abstract: Written essays are a common way to select candidates for universities; therefore, students need to write as many essays as possible. Because of this, several methods for Automatic Essay Scoring (AES) have been proposed for the English language. Such methods should explain the score assigned to an essay so that the student can use the feedback to improve his or her writing skills. For this reason, many existing AES proposals employ handcrafted features instead of continuous vector representations. With handcrafted features, it is easier for the system to give feedback to a student. Handcrafted features are also helpful for scrutinizing the score assigned by an AES system, and even the scores of human evaluators. This kind of investigation is useful to identify whether the features related to a writing skill are being considered during the assessment, which is essential if we desire fairer evaluations. In this work, we present an AES methodology that scores essays according to five aspects, or skills, using handcrafted features and classical machine learning algorithms. In addition, we perform experiments to analyze which features influence which aspects in two different datasets evaluated by two distinct human evaluators. The performance on each aspect is explained by the feature analysis. We also explore the efficacy of AES models in the presence of biased data. Finally, we analyze the evaluator's comments about the essays using a Portuguese lexicon of biased words assembled by Cançado et al. [2019]. Several experiments demonstrate the explainability of our models and show that our proposed approach enhances the efficacy of AES models. The explainability results clearly show that some features are particularly important for certain aspects while being unimportant for others. We also show that bias affects the efficacy of the classifiers and that removing biased ratings from the dataset improves model accuracy.
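To make the general idea of the abstract concrete, the sketch below shows, under purely illustrative assumptions, how handcrafted features can be paired with a classical, interpretable classifier for a single aspect and how the fitted coefficients expose which features drive the score. The feature set, the toy essays, the aspect, and the choice of logistic regression are hypothetical and are not the actual pipeline or features used in the thesis.

```python
# Minimal sketch (assumptions, not the thesis's method): handcrafted features
# fed to a classical, interpretable classifier; coefficients are inspected
# to explain which features matter for one aspect score.
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def handcrafted_features(essay: str) -> list[float]:
    """Toy surface features: length, lexical diversity, average sentence length."""
    words = essay.split()
    sentences = [s for s in essay.split(".") if s.strip()]
    n_words = len(words)
    return [
        float(n_words),                                         # essay length
        len(set(w.lower() for w in words)) / max(n_words, 1),   # type-token ratio
        n_words / max(len(sentences), 1),                       # avg. sentence length
    ]


# Hypothetical training data: essays with a binary score for one aspect.
essays = ["First toy essay. It has two short sentences.", "Short essay."]
scores = [1, 0]

model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit([handcrafted_features(e) for e in essays], scores)

# Explainability: each coefficient links one handcrafted feature to the aspect score.
print(model.named_steps["logisticregression"].coef_)
```

In this kind of setup, the same feature extraction can be reused to train one classifier per aspect, and comparing the learned weights across aspects gives the sort of per-aspect feature analysis the abstract describes.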