A study on the relations of facial motion patterns with speech acoustics and speaker identity

Bibliographic details
Year of defense: 2008
Main author: Ketia Soares Moreira
Advisor: Not informed by the institution
Defense committee: Not informed by the institution
Document type: Thesis
Access type: Open access
Language: Portuguese (por)
Defense institution: Universidade Federal de Minas Gerais (UFMG)
Graduate program: Not informed by the institution
Department: Not informed by the institution
Country: Not informed by the institution
Keywords in Portuguese:
Link de acesso: http://hdl.handle.net/1843/BUOS-8CVHCU
Abstract: The study of the coupling between facial motion and speech acoustics is important for understanding the speech production process. Moreover, the relation of facial motion to speaker identity during speech is important for biometric identification. The objective of this work is to evaluate facial motion in order to: (i) evaluate the variability of parameters related to facial motion during speech production; (ii) verify whether such parameters are context dependent or independent; and (iii) evaluate to what degree facial motion is an individual characteristic. During speech, the geometry of the vocal tract determines its resonant frequencies (formants) and strongly influences the facial motion that occurs simultaneously. As a result, speech acoustic patterns and facial motion are coupled. The relation between regions of the face can be efficiently modeled by means of Principal Component Analysis, whereas the coupling between speech acoustics and facial motion can be modeled by facial motion components aligned with LSP parameters extracted from the speech acoustics. One of the objectives of this study is to evaluate how that alignment varies with time. The results show that only the first facial motion component is stable during speech, independently of the speech content, and concentrates up to 55% of the facial motion variance. For the first acoustically aligned facial motion component, this stability is lower. However, greater stability is observed when the LSP parameters used to represent the speech acoustics are ordered according to their vocal tract cavity affiliation, rather than simply placed in increasing order. In the study of facial motion patterns applied to person identification, the first eigenvector of the facial motion covariance matrix is used, as it exhibits speaker-specific information.
Accordingly, tests using an MLP neural network were carried out on the task of person identification based on the eigenvector associated with the largest eigenvalue of the facial motion covariance matrix. An identification rate of 86.7% was attained, indicating that facial motion information alone is not sufficient for person identification. Nevertheless, this information can be combined with other sources, such as static images or the speaker's voice, to improve the robustness of the identification process, especially under adverse conditions.
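The identification scheme described in the abstract can be sketched in a few lines: extract the eigenvector of the facial-motion covariance matrix associated with the largest eigenvalue, use it as a per-speaker feature, and match new utterances against enrolled templates. The sketch below uses synthetic stand-in data (a random per-speaker "motion basis" shaping 12 hypothetical facial-marker coordinates) and a simple cosine-similarity match in place of the thesis's MLP classifier, purely to keep the example self-contained; all dimensions and names are illustrative assumptions, not the thesis's actual setup.

```python
import numpy as np

rng = np.random.default_rng(0)

def leading_eigenvector(frames):
    """First eigenvector (largest eigenvalue) of the facial-motion
    covariance matrix, used here as the speaker feature."""
    centered = frames - frames.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    _, eigvecs = np.linalg.eigh(cov)             # eigh: covariance is symmetric
    v = eigvecs[:, -1]                           # eigh sorts eigenvalues ascending
    return v * np.sign(v[np.argmax(np.abs(v))])  # resolve eigenvector sign ambiguity

# Hypothetical data: each speaker has a fixed "motion basis" that shapes how
# 12 facial-marker coordinates covary during speech (stand-in for motion capture).
n_speakers, n_frames, n_dims = 4, 400, 12
bases = [rng.normal(size=(n_dims, n_dims)) for _ in range(n_speakers)]

def utterance(spk):
    """One synthetic utterance: frames drawn through the speaker's basis."""
    return rng.normal(size=(n_frames, n_dims)) @ bases[spk]

# Enrollment: one eigenvector template per speaker.
templates = np.array([leading_eigenvector(utterance(s)) for s in range(n_speakers)])

def identify(frames):
    """Match a new utterance's eigenvector to the closest enrolled template
    (the thesis trains an MLP on this feature; cosine similarity keeps the
    sketch small and dependency-free)."""
    v = leading_eigenvector(frames)
    return int(np.argmax(np.abs(templates @ v)))

hits = sum(identify(utterance(s)) == s
           for s in range(n_speakers) for _ in range(10))
rate = hits / (n_speakers * 10)
print(f"identification rate on synthetic data: {rate:.0%}")
```

On synthetic data with a large eigengap the match is easy; the thesis's 86.7% figure on real facial motion reflects the much weaker speaker specificity of genuine articulatory movement.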