Characterization and classification of voice signals by combining sustained vowels: a study based on the Haar wavelet transform

Bibliographic details
Defense year: 2021
Main author: Oliveira, Brígida Farias Cardoso
Advisor: Not provided by the institution
Examination committee: Not provided by the institution
Document type: Dissertation
Access type: Open access
Language: por
Defense institution: Not provided by the institution
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese:
Access link: http://www.repositorio.ufc.br/handle/riufc/59968
Abstract: This work investigates the use of single and combined sustained vowels to characterize normal and abnormal voice signals from Haar wavelet decomposition coefficients, considering signals from male speakers, female speakers, and both together. We also apply the Kruskal-Wallis statistical test to analyze the pairs of variables formed by the vowels (single or combined) and the wavelet decomposition levels. The output of this test identifies which level leads to the best classification accuracy. We selected the lowest decomposition level because it is associated with the lowest computational cost. Although recent studies have shown that single sustained vowels allow accurate voice characterization, the literature lacks evidence on combining them. The proposed methodology for characterizing and classifying normal and abnormal voices uses the energy of the detail and approximation coefficients obtained from the Haar wavelet decomposition. We use the single vowels /a/, /i/, and /u/ and their combination to identify the scenario that yields the best classification of the voice signals. We conducted experiments on two public datasets: the Advanced Voice Function Assessment Database (AVFAD), from Portugal, and the Saarbrücken Voice Database (SVD), from Germany. We analyzed wavelet decomposition levels from 4 to 18 in three scenarios: female speakers, male speakers, and both genders. Our results revealed that the wavelet coefficients extracted from the combination of vowels improved the signal description and identified subtle features of pathological voices. We also showed that the Haar wavelet-based features extracted from combined vowels achieved accurate voice classification with fewer decomposition levels.
This approach enabled accuracy improvements of at least 2.61% and 15.61% for the AVFAD and SVD datasets, respectively, regardless of the biological gender.
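
The feature extraction described in the abstract (energy of the Haar detail and approximation coefficients, computed per vowel and then combined) can be illustrated with a minimal sketch. This is not the author's implementation: the function names `haar_step`, `haar_energy_features`, and `combined_vowel_features` are hypothetical, the orthonormal Haar filter and the padding of odd-length signals are assumptions, and combining vowels by concatenating their feature vectors is only one plausible reading of "combination", which the abstract does not specify.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    if len(signal) % 2:
        # Assumption: pad odd-length input by repeating the last sample.
        signal = signal + [signal[-1]]
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_energy_features(signal, levels):
    """Detail energy at each level, followed by the final approximation energy."""
    approx = list(signal)
    features = []
    for _ in range(levels):
        if len(approx) < 2:
            break  # nothing left to decompose
        approx, detail = haar_step(approx)
        features.append(sum(d * d for d in detail))
    features.append(sum(a * a for a in approx))
    return features


def combined_vowel_features(vowel_signals, levels):
    """Assumed combination rule: concatenate the per-vowel feature vectors."""
    combined = []
    for sig in vowel_signals:
        combined.extend(haar_energy_features(sig, levels))
    return combined
```

For example, `combined_vowel_features([a_signal, i_signal, u_signal], levels=4)` would return one vector holding the level-wise energies of the three vowels, which could then be passed to a classifier. Because the orthonormal Haar transform preserves energy, the features of an unpadded signal sum to the signal's total energy, which gives a quick sanity check.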