FERAtt: new architecture learning for facial expression characterization

Bibliographic details
Year of defense: 2019
Main author: FERNÁNDEZ, Pedro Diamel Marrero
Advisor: REN, Tsang Ing
Defense committee: Not informed by the institution
Document type: Doctoral thesis
Access type: Open access
Language: Portuguese
Defending institution: Universidade Federal de Pernambuco
Graduate program: Programa de Pós-Graduação em Ciência da Computação
Department: Not informed by the institution
Country: Brazil
Keywords in Portuguese:
Access link: https://repositorio.ufpe.br/handle/123456789/36907
Abstract: Affective computing is a branch of artificial intelligence responsible for the development of equipment and systems capable of interpreting, recognizing, and processing human emotions. The automatic understanding of human behavior is of great interest since it allows the creation of new human-machine interfaces. Among these behaviors, facial expressions are the most convenient because of the wide range of emotions they can transmit. The human face conveys a large part of our emotional behavior: we use facial expressions to demonstrate our emotional states and to communicate during our interactions, and we express and read emotions through facial expressions without effort. However, the automatic understanding of facial expressions remains an unsolved task from the computational point of view, especially in the presence of highly variable expressions, artifacts, and poses. Currently, obtaining a semantic representation of expressions is a challenge for the affective computing community. This work advances the field of facial expression recognition by providing new tools for the representation and analysis of expressions in static images. First, we present an analysis of feature extraction methods and of classifier combination methods based on sparse representation applied to the facial expression recognition problem, and we propose a multi-classifier system based on trainable combination rules for this problem. Second, we present a study of the main deep neural network architectures applied to this problem; a comparative analysis allows us to determine the best deep learning models for facial expression classification. Third, we propose a new supervised and semi-supervised representation approach based on metric learning, which allows us to obtain the semantic representations of facial expressions evaluated in this work. We propose a new loss function that generates Gaussian structures in the embedded space of facial expressions. Lastly, we propose FERAtt, a new end-to-end network architecture for facial expression recognition with an attention model. The FERAtt network focuses attention on the human face and uses a Gaussian space representation for expression recognition. We devise this architecture from two fundamental, complementary components: (1) facial image correction and attention, and (2) facial expression representation and classification.
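
The idea of a loss that generates Gaussian structures in the embedding space can be illustrated with a minimal sketch. The following PyTorch-style example is an assumption-laden illustration, not the thesis' exact formulation: the class name GaussianStructuredLoss, the sigma parameter, and the 0.1 weighting of the pull term are hypothetical choices. It shows one common way to obtain class-wise Gaussian clusters: classify embeddings by their distance to learnable per-class centers and pull each sample toward its own center.

    # Hypothetical sketch of a Gaussian-structured embedding loss.
    # Names, weights, and the isotropic-sigma assumption are illustrative,
    # not the loss function proposed in the thesis.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class GaussianStructuredLoss(nn.Module):
        """Classifies embeddings by negative squared distance to learnable
        per-class centers, so each expression class tends to form an
        isotropic Gaussian-like cluster in the embedded space."""

        def __init__(self, num_classes: int, embed_dim: int, sigma: float = 1.0):
            super().__init__()
            # One learnable center (mean) per expression class.
            self.centers = nn.Parameter(torch.randn(num_classes, embed_dim))
            self.sigma = sigma

        def forward(self, embeddings: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
            # Squared Euclidean distance from each embedding to every center: (B, C).
            dists = torch.cdist(embeddings, self.centers).pow(2)
            # -d^2 / (2*sigma^2) acts as an unnormalized Gaussian log-likelihood,
            # so classification becomes a softmax over distances.
            logits = -dists / (2.0 * self.sigma ** 2)
            ce = F.cross_entropy(logits, labels)
            # Explicit pull of each sample toward its own class center.
            pull = dists.gather(1, labels.unsqueeze(1)).mean()
            return ce + 0.1 * pull

    # Minimal usage example with random data (7 basic expression classes).
    if __name__ == "__main__":
        torch.manual_seed(0)
        loss_fn = GaussianStructuredLoss(num_classes=7, embed_dim=64)
        z = torch.randn(32, 64)            # embeddings from some backbone
        y = torch.randint(0, 7, (32,))     # expression labels
        print(loss_fn(z, y).item())

This only mirrors, at a high level, the role of a center-based Gaussian loss in metric learning; the actual loss function, its parameters, and its combination with the attention and classification branches of FERAtt are defined in the thesis itself.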