Voice command recognition in Brazilian Portuguese in noisy environments using a throat microphone

Bibliographic details
Year of defense: 2019
Main author: Ribeiro, Fábio Cisne
Advisor: Not provided by the institution
Defense committee: Not provided by the institution
Document type: Thesis
Access type: Open access
Language: por
Defending institution: Not provided by the institution
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese:
Access link: http://www.repositorio.ufc.br/handle/riufc/40251
Abstract: The main objective of this thesis is to develop a speaker-independent system for recognizing isolated-word voice commands in noisy environments, with emphasis on the throat microphone, a speech acquisition sensor that is more robust in this type of environment. The technology studied is presented as an integrated hardware and software device that allows speech to be used to operate technological equipment. We therefore investigated which techniques are best suited to the proposed voice processing. No other database of voice commands captured with a throat microphone in Portuguese was found in the surveyed literature, so we created a database of isolated voice commands with utterances captured from 150 people (men and women). All voice samples were captured in Brazilian Portuguese and consist of the digits “0” through “9” and the words “Ok” and “Cancel”. Two filters were used to remove the captured noise: a Least Mean Squares filter in the time domain and the Wavelet Transform in the frequency domain. Together they removed the noise captured by the laryngophone. The best feature extractor tested was Perceptual Linear Prediction, and its best configuration uses a coefficient order of 9 or 10. For classification, a voting committee composed of three classifiers, Multilayer Perceptron (MLP), Binary Multilayer Perceptron (BMLP), and Self-Organizing Maps (SOM), was used to recognize the voice command. The results show that the throat microphone is robust in noisy environments, reaching a hit rate of 96.6% in our voice command recognition system. It was observed that low-intensity vowels and the fricatives present in the words “3” and “7” in Portuguese confuse the classifier.
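
The voting committee mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the votes of the three classifiers are combined or how disagreements are resolved, so everything below is an assumption: the function name `committee_vote` is hypothetical, the combination rule is a simple plurality vote, and ties are broken in favor of the classifier listed first. It is not the thesis implementation.

```python
from collections import Counter
from typing import Sequence

# The 12 command labels named in the abstract.
COMMANDS = [str(d) for d in range(10)] + ["Ok", "Cancel"]


def committee_vote(predictions: Sequence[str]) -> str:
    """Return the label that receives the most votes.

    Assumption (not stated in the abstract): when several labels tie
    for the most votes, the label predicted by the earliest classifier
    in the list wins.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    counts = Counter(predictions)
    max_votes = max(counts.values())
    # Scan in classifier order so that ties resolve deterministically.
    for label in predictions:
        if counts[label] == max_votes:
            return label
    raise AssertionError("unreachable: some label always has max_votes")


# Hypothetical outputs of the three classifiers (MLP, BMLP, SOM)
# for one utterance; these values are made up for illustration.
decision = committee_vote(["3", "7", "3"])
```

With the made-up predictions above, two of the three classifiers agree on “3”, so that label is returned. If all three disagree, this sketch falls back to the first classifier's prediction.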