Análise comparativa entre os métodos HMM e GMM-UBM na busca pelo α-ótimo dos locutores crianças utilizando a técnica VTLN

Bibliographic details
Year of defense: 2014
Main author: Martins, Ramon Mayor
Advisor: Ynoguti, Carlos Alberto
Examination committee: Ynoguti, Carlos Alberto; Guimarães, Dayan Adionel; Minami, Mário
Document type: Dissertation
Access type: Open access
Language: por
Degree-granting institution: Instituto Nacional de Telecomunicações
Graduate program: Mestrado em Engenharia de Telecomunicações
Department: Instituto Nacional de Telecomunicações
Country: Brazil
Keywords in Portuguese:
CNPq knowledge area:
Access link: http://tede.inatel.br:8080/tede/handle/tede/23
Abstract: This work seeks to reduce the high error rate of speech recognition systems that are trained on adult speakers and tested on child speakers. To that end, we propose the GMM-UBM method as an alternative to the HMM method for finding the optimal warping factor (α-optimal) for child speakers when speaker normalization is applied. The normalization technique adopted was VTLN (vocal tract length normalization), which normalizes the vocal tract of different child speakers by warping the frequencies of the mel filterbank. The evaluation also sought the number of mixtures that gives the best system performance. The error rate of the system trained on adults and tested on children fell from 4.95% to 1.88% when VTLN used the α-optimal values found by HMM, and to 1.92% when it used those found by GMM. VTLN with α-optimal values found by the GMM-UBM method thus performed similarly to VTLN with HMM in the experiments. The experiments also indicated that GMM-UBM is the more suitable choice because it is simpler to implement and has a lower computational cost, making it an alternative to HMM for applying VTLN in speech recognition systems for child speakers.
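The abstract does not spell out how the warping factor is searched, so the following is only a minimal sketch of one common approach: a maximum-likelihood grid search in which each candidate α warps the speaker's data and the warped data is scored against a model trained on adults. Everything concrete here is an assumption made for illustration: the one-dimensional "formant" frames, the linear warp `alpha * x` (a stand-in for mel filterbank warping and feature recomputation), the two-component mixture playing the role of the UBM, the grid 0.80 to 1.20, and the function names.

```python
import math


def gmm_avg_loglik(frames, weights, means, variances):
    """Average per-frame log-likelihood under a 1-D Gaussian mixture."""
    total = 0.0
    for x in frames:
        # Per-component log(weight * Gaussian density)
        logs = [
            math.log(w) - 0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
            for w, m, v in zip(weights, means, variances)
        ]
        # log-sum-exp over components for numerical stability
        peak = max(logs)
        total += peak + math.log(sum(math.exp(l - peak) for l in logs))
    return total / len(frames)


def find_alpha_optimal(frames, ubm, alphas):
    """Grid search: return the alpha whose warped frames score highest on the UBM."""
    best_alpha, best_score = None, -math.inf
    for alpha in alphas:
        # Toy linear warp (assumption); a real system would warp the mel
        # filterbank and recompute the features for each alpha.
        warped = [alpha * x for x in frames]
        score = gmm_avg_loglik(warped, *ubm)
        if score > best_score:
            best_alpha, best_score = alpha, score
    return best_alpha


# Hypothetical "adult" model: weights, means (Hz), variances (Hz^2).
ubm = ([0.5, 0.5], [500.0, 1500.0], [100.0 ** 2, 100.0 ** 2])

# Hypothetical child frames: the adult resonances shifted up by 1 / 0.85.
child_frames = [500.0 / 0.85, 1500.0 / 0.85] * 10

# Candidate warping factors 0.80, 0.81, ..., 1.20 (assumed range and step).
alphas = [round(0.80 + 0.01 * i, 2) for i in range(41)]

print(find_alpha_optimal(child_frames, ubm, alphas))
```

In this toy setup the search needs only one likelihood evaluation per candidate α and no decoding pass, which loosely illustrates the kind of simplicity and lower computational cost the abstract attributes to the GMM-UBM approach.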