Medida da relação harmônico/ruído em vozes disfônicas pelo processamento digital de imagens espectrográficas
Ano de defesa: | 2009 |
---|---|
Autor(a) principal: | |
Orientador(a): | |
Banca de defesa: | |
Tipo de documento: | Dissertação |
Tipo de acesso: | Acesso aberto |
Idioma: | por |
Instituição de defesa: |
Universidade Federal de Minas Gerais
UFMG |
Programa de Pós-Graduação: |
Não Informado pela instituição
|
Departamento: |
Não Informado pela instituição
|
País: |
Não Informado pela instituição
|
Palavras-chave em Português: | |
Link de acesso: | http://hdl.handle.net/1843/BUOS-8CZGJ3 |
Resumo: | This work presents the S2NR, Spectrographic Signal-to-Noise Ratio, a signal-to-noise ratio measurement obtained from the processing of vowel spectrograms by using adaptations of fingerprint image enhancement algorithms. In order to validate the S2NR method, a test bench was set to generate synthetic vowels with controlled values of fundamental frequency, amplitude, additive white noise, and cycle-to-cycle perturbations in the waveform amplitude (shimmer) and phonatory period (jitter). For comparison purposes, vowels were synthesized with known signal-to-noise ratio values. Next, the signal-to-noise ratio was measured with the S2NR algorithm and a method based on time domain periodicity analysis. In most of the synthetic voices, the S2NR exhibited a behavior more robust to jitter and shimmer perturbations than the time based algorithm, having also a reduced sensitivity to the vowel type. Both male and female fundamental frequencies were tested with /a/, /i/, and /u/ vocal tract shapes. Initially, jitter and shimmer were assessed independently, the simulated perturbation values varying from inexistent to extreme conditions in the human voice (0% to 3% for jitter, and 0% to 30% for shimmer). With jitter and Fo = 120 Hz , the measured S2NR estimates deviated from the reference values by 2.1 dB, 11.5 dB, and 2.9 dB for /a/, /i/ and /u/ respectively. With shimmer, these differences were 2.5 dB, 4.4 dB, and 3.6 dB. Subsequently both perturbations were varied simultaneously within the same ranges, no performance degradation occurring other than those observed with separated perturbations. Finally, the S2NR algorithm was tested with real, dysphonic, and predominantly breathy voices. Results showed a consistent relation between S2NR values and perceptual ratings of breathiness. Additionally, the potential application of the S2NR algorithm in running speech was explored. |