Analysis of objective measures and auditory perception of vocal breathiness

Bibliographic details
Year of defense: 2018
Main author: Joao Pedro Hallack Sansao
Advisor: Not provided by the institution
Examination committee: Not provided by the institution
Document type: Thesis
Access type: Open access
Language: por
Defending institution: Universidade Federal de Minas Gerais
UFMG
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese:
Access link: http://hdl.handle.net/1843/BUOS-B63JZV
Abstract: Voice quality is a vast field that spans speech science and technology, telecommunications, phonetics and speech therapy. Voice quality analysis aims to complement vocal fold visualization exams (such as video-laryngoscopy) in the investigation of dysphonic voices. In general, dysphonia results from changes affecting vocal fold anatomy due to lesions (nodules, polyps, etc.) and phonatory behavior due to abnormal tension in the laryngeal muscles. Clinical voice analysis is usually done in two ways: through instrumental measures of the acoustic signal, for example cycle-to-cycle perturbations of the fundamental frequency, or through auditory-perceptual scales. In the GRBAS scale, which is the most commonly used, the perceptual attributes are general (holistic) impression (G), roughness (R), breathiness (B), asthenia (A) and strain (S). Such attributes are subjective and, although they are widely used clinically in the evaluation of voice problems, there are still difficulties in using them consistently. In this work, we chose the perceptual attribute of breathiness, a characteristic perceived due to turbulent flow generated in the glottis or due to excessive air escape through a slit. Different degrees of severity are found in patients, ranging from mild disturbance to extreme cases. The study of vocal breathiness was performed on two fronts: instrumental measurements and auditory perception. For the instrumental measures, a platform was developed to test various methods. This method comparison consists of (i) evaluating synthetic voice samples with known signal-to-noise ratios, jitter and shimmer; and (ii) evaluating predominantly breathy real voice samples rated on a 7-point perceptual scale.
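The idea of a synthetic sample with known signal-to-noise ratio, jitter and shimmer can be illustrated with a minimal sketch. This is not the platform described in the thesis: the function names, the harmonic-sum waveform, the uniform perturbation model and all default values are assumptions chosen only to show how the three quantities can be controlled independently.

```python
import math
import random


def synth_breathy_vowel(f0=120.0, fs=16000, duration=0.5, snr_db=10.0,
                        jitter=0.01, shimmer=0.05, n_harmonics=10, seed=0):
    """Return (signal, harmonic_part, noise_part) as lists of floats.

    Hypothetical model: each glottal cycle is a sum of harmonics whose
    period (jitter) and amplitude (shimmer) are perturbed uniformly.
    """
    rng = random.Random(seed)
    n_samples = int(fs * duration)

    harmonic = []
    while len(harmonic) < n_samples:
        # Jitter perturbs the cycle period, shimmer the cycle amplitude.
        period = (1.0 / f0) * (1.0 + jitter * rng.uniform(-1.0, 1.0))
        amp = 1.0 + shimmer * rng.uniform(-1.0, 1.0)
        n_cycle = max(2, round(period * fs))
        for n in range(n_cycle):
            phase = 2.0 * math.pi * n / n_cycle
            harmonic.append(
                amp * sum(math.sin(k * phase) / k
                          for k in range(1, n_harmonics + 1)))
    harmonic = harmonic[:n_samples]

    # White Gaussian noise stands in for glottal turbulence noise.
    noise = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]

    # Scale the noise so that harmonic power / noise power matches snr_db.
    p_harm = sum(x * x for x in harmonic) / n_samples
    p_noise = sum(x * x for x in noise) / n_samples
    gain = math.sqrt(p_harm / (p_noise * 10.0 ** (snr_db / 10.0)))
    noise = [gain * x for x in noise]

    signal = [h + n for h, n in zip(harmonic, noise)]
    return signal, harmonic, noise


def measured_snr_db(harmonic, noise):
    """Signal-to-noise ratio in dB from the separate components."""
    p_harm = sum(x * x for x in harmonic)
    p_noise = sum(x * x for x in noise)
    return 10.0 * math.log10(p_harm / p_noise)
```

Because the harmonic and noise components are returned separately, the true signal-to-noise ratio of each sample is known by construction and can serve as ground truth when comparing estimators.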
Methods known from the literature, such as CPPS (smoothed cepstral peak prominence), SFR (spectral flatness of the residue signal) and HNR (harmonic-to-noise ratio), were tested together with S2NR (spectrographic signal-to-noise ratio), a method developed in this work. The study determined that the methods with the greatest correlation between acoustic (objective) and perceptual (subjective) measurements are CPPS and S2NR. It was also observed that S2NR is more robust to frequency and amplitude perturbations than CPPS and is, by this criterion, a better measure of breathiness. Regarding perception, the initial step was a psychophysical characterization of breathiness, relating variations in the physical dimension (glottal noise level) to the perceptual scale. Another result was the just noticeable differences associated with variations in noise intensity. The next objective of this work was to establish a comparative classification method for voice quality that would require minimal training and prior knowledge from raters and would yield high intra-rater and inter-rater agreement. This method is based on searching for elements in a binary tree. The sequence of comparisons follows the structure of the tree, continuing until differences can no longer be distinguished or the last level is reached. To do so, reference samples (anchors) are chosen and compared with the samples under analysis. The choice of anchors and of the search tree depth is based on breathiness psychophysics. The relative nature of the human ear is thus exploited, in contrast to current methods of perceptual evaluation, in which the ear is treated as absolute. In the developed method, 3 or 7 anchors were chosen, with increasing signal-to-noise ratio, requiring 2 or 3 different comparisons for the evaluation of a single voice sample.
In the experiments, using synthetic vowels and breathy human voices, the comparative evaluation presented high inter-rater reliability, even with inexperienced evaluators.
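The anchor-based comparison described in the abstract can be sketched as a binary search over anchors ordered by increasing signal-to-noise ratio. This is an illustration under stated assumptions, not the thesis's implementation: the function names, the return format, the simulated rater, the anchor values in dB and the 1 dB just noticeable difference are all hypothetical.

```python
def classify_by_anchors(n_anchors, compare):
    """Locate a sample among anchors sorted by increasing SNR.

    compare(i) is the rater's judgment against anchor i:
      +1 if the sample sounds less noisy (higher SNR) than the anchor,
      -1 if it sounds noisier, 0 if no difference is perceived.
    Returns (anchors_below, matched_anchor, comparisons), where
    matched_anchor is None unless the rater reported no difference.
    """
    lo, hi = 0, n_anchors - 1
    comparisons = 0
    while lo <= hi:
        mid = (lo + hi) // 2          # root of the current subtree
        comparisons += 1
        verdict = compare(mid)
        if verdict == 0:              # stop: indistinguishable from anchor
            return mid, mid, comparisons
        if verdict < 0:
            hi = mid - 1              # descend to the noisier half
        else:
            lo = mid + 1              # descend to the cleaner half
    return lo, None, comparisons      # last level of the tree reached


def make_simulated_rater(sample_snr_db, anchor_snrs_db, jnd_db=1.0):
    """Hypothetical ideal rater with a fixed just noticeable difference."""
    def compare(i):
        diff = sample_snr_db - anchor_snrs_db[i]
        if abs(diff) < jnd_db:
            return 0
        return 1 if diff > 0 else -1
    return compare


# Hypothetical anchor sets (values in dB are illustrative only).
anchors_7 = [0.0, 5.0, 10.0, 15.0, 20.0, 25.0, 30.0]
anchors_3 = [5.0, 15.0, 25.0]

result_7 = classify_by_anchors(7, make_simulated_rater(17.0, anchors_7))
result_3 = classify_by_anchors(3, make_simulated_rater(30.0, anchors_3))
```

With 7 anchors the search needs at most 3 comparisons and with 3 anchors at most 2, which matches the counts given in the abstract. The search ends earlier when the rater cannot tell the sample from an anchor.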