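A Multi-Level Approach Based on Convolutional Neural Networks for Reducing Algorithmic Bias in Facial Landmark Localization (original title: Uma Abordagem Multi-Nível Baseada em Redes Neurais Convolucionais para Redução do Viés Algorítmico na Localização de Pontos Faciais)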

Bibliographic details
Year of defense: 2023
Main author: FREITAS, Ricardo Teles
Advisor: AIRES, Kelson Rômulo Teixeira
Examination committee: AIRES, Kelson Rômulo Teixeira; PAIVA, Anselmo Cardoso de; GOMES, Herman Martins; CLUA, Esteban Walter Gonzalez; BRAZ JÚNIOR, Geraldo
Document type: Thesis
Access type: Open access
Language: por
Defense institution: Universidade Federal do Maranhão
Graduate program: PROGRAMA DE PÓS-GRADUAÇÃO DOUTORADO EM CIÊNCIA DA COMPUTAÇÃO
Department: DEPARTAMENTO DE INFORMÁTICA/CCET
Country: Brazil
Keywords in Portuguese:
Keywords in English:
CNPq knowledge area:
Access link: https://tedebc.ufma.br/jspui/handle/tede/5108
Abstract: Facial landmark localization is a challenging task in computer vision, and its results are used in many high-level facial applications. For this task, convolutional neural network models can achieve performance levels close to those of human annotation. The main criterion adopted to evaluate these models is the average point-to-point distance, computed over the whole dataset, between the expert annotation and the predicted value. However, recent studies have shed light on a problem that remains obscure in these evaluations: facial analysis models show significant performance differences when applied to different demographic groups. This characterizes an algorithmic bias that may lead to discrimination against, or favoritism toward, one group over another. This work proposes a face alignment approach that aims to reduce the performance gaps among demographically distinct groups in facial landmark localization. The approach relies on a multi-level strategy based on convolutional neural networks for face modeling. The top level comprises a coordinate regression model for facial subunit detection. The bottom level uses the top-level responses to model the landmark coordinates of each subunit. The models were trained on the unbalanced datasets Helen, LFPW, AFW, and 300W and applied to the Toronto Neuroface dataset, which is balanced among ALS (amyotrophic lateral sclerosis) patients, post-stroke patients, and a control group.
The comparison with state-of-the-art face alignment models revealed two significant advances. First, applying the bottom-level models under ideal facial subunit detection conditions significantly reduced the performance gap among all groups and also significantly reduced the overall error. Second, applying the multi-level approach reduced the performance gap between the ALS and control groups to an insignificant level while keeping overall performance comparable to the state of the art. The approach proved capable of mitigating the algorithmic bias present in predictive models trained on an unbalanced dataset.
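The evaluation criterion described in the abstract (average point-to-point distance between annotated and predicted landmarks) and the notion of a performance gap between groups can be illustrated with a minimal sketch. This is not the thesis implementation: the function names, the array layout (faces × landmarks × 2), the toy coordinates, and the definition of the gap as the largest difference between group means are all assumptions made for illustration, and no normalization (for example by face size) is applied.

```python
import numpy as np


def per_face_error(pred, gt):
    """Mean Euclidean distance between predicted and annotated landmarks.

    Both inputs have shape (n_faces, n_landmarks, 2); the result has
    shape (n_faces,). The layout is an assumption of this sketch.
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    # Distance per landmark, then average over the landmarks of each face.
    return np.linalg.norm(pred - gt, axis=-1).mean(axis=-1)


def group_means(errors, groups):
    """Average the per-face errors separately for each group label."""
    errors = np.asarray(errors, dtype=float)
    groups = np.asarray(groups)
    return {str(g): float(errors[groups == g].mean()) for g in np.unique(groups)}


def performance_gap(means):
    """Largest difference between group means (one possible gap definition)."""
    values = list(means.values())
    return max(values) - min(values)


# Toy data, invented for illustration: 4 faces with 3 landmarks each.
gt = np.zeros((4, 3, 2))
pred = gt.copy()
pred[0, :, 0] = 1.0  # every landmark of face 0 is off by 1 unit
pred[1, :, 0] = 3.0  # every landmark of face 1 is off by 3 units
pred[2, :, 1] = 1.0
pred[3, :, 1] = 1.0
groups = ["ALS", "ALS", "control", "control"]

errors = per_face_error(pred, gt)
means = group_means(errors, groups)
gap = performance_gap(means)
print(means, gap)
```

Averaging over the whole dataset would report a single number here, whereas the per-group means expose the difference between groups that the abstract identifies as algorithmic bias.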