Multimodal silent speech interface based on video, depth, surface electromyography and ultrasonic Doppler: Data collection and first recognition results

Freitas, J.; Teixeira, A.; Dias, M. S.

Bibliographic Details
Main Author:	Freitas, J.
Publication Date:	2013
Other Authors:	Teixeira, A., Dias, M. S.
Language:	eng
Source:	Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
Download full:	http://hdl.handle.net/10071/29220
Summary:	Silent Speech Interfaces (SSIs) use data from the speech production process, such as visual information about face movements. However, relying on a single modality limits the amount of available information. In this study we begin to explore the use of multiple input modalities in order to obtain a more complete representation of the speech production model. We selected four non-invasive modalities (visual data from video and depth, surface electromyography, and ultrasonic Doppler) and created a system that explores the synchronous combination of all four, or of a subset of them, in a multimodal SSI. This paper describes the system design, data collection and first word recognition results. Because the first corpora acquired for this SSI are necessarily small, we classify with an example-based recognition approach that uses Dynamic Time Warping followed by a weighted k-Nearest Neighbor classifier. The first classification results, obtained with different vocabularies (digits, a small set of commands related to Ambient Assisted Living, and minimal nasal pairs), show that word recognition can benefit from a multimodal approach.
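The classification approach named in the summary (Dynamic Time Warping followed by a weighted k-Nearest Neighbor vote) can be illustrated with a minimal sketch. This is not the authors' implementation: the Euclidean local cost, the inverse-distance vote weighting, the value of `k`, and the names `dtw_distance` and `classify` are all assumptions made for illustration, and the record does not say how features from the different modalities were extracted or combined.

```python
import math
from collections import defaultdict


def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of feature vectors.

    Assumption: Euclidean distance is used as the local frame-to-frame cost.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


def classify(query, examples, k=3, eps=1e-9):
    """Label a query sequence by a distance-weighted vote of its k nearest examples.

    examples: list of (label, sequence) pairs.
    Assumption: each neighbor votes with weight 1 / (DTW distance + eps).
    """
    scored = sorted((dtw_distance(query, seq), label) for label, seq in examples)
    votes = defaultdict(float)
    for dist, label in scored[:k]:
        votes[label] += 1.0 / (dist + eps)
    return max(votes, key=votes.get)


# Toy one-dimensional "recordings"; real feature frames would be multi-dimensional.
examples = [
    ("one", [[0.0], [1.0], [2.0]]),
    ("one", [[0.0], [1.0], [1.0], [2.0]]),
    ("two", [[5.0], [4.0], [3.0]]),
]
predicted = classify([[0.0], [0.0], [1.0], [2.0]], examples, k=3)
```

An example-based scheme like this needs no model training, which is one reason it suits small corpora: every recorded utterance serves directly as a template, and DTW absorbs differences in speaking rate between the query and each template.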

Item metadata

id	RCAP_55e46d9bd1ff0a64de089543f41236c9
oai_identifier_str	oai:repositorio.iscte-iul.pt:10071/29220
network_acronym_str	RCAP
network_name_str	Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository_id_str	https://opendoar.ac.uk/repository/7160
dc.title.none.fl_str_mv	Multimodal silent speech interface based on video, depth, surface electromyography and ultrasonic Doppler: Data collection and first recognition results
author	Freitas, J.
author_facet	Freitas, J.; Teixeira, A.; Dias, M. S.
author_role	author
author2	Teixeira, A.; Dias, M. S.
author2_role	author author
dc.contributor.author.fl_str_mv	Freitas, J.; Teixeira, A.; Dias, M. S.
dc.subject.por.fl_str_mv	Silent speech interfaces; Multimodal; Video and depth information; Surface electromyography; Ultrasonic doppler sensing
topic	Silent speech interfaces; Multimodal; Video and depth information; Surface electromyography; Ultrasonic doppler sensing
description	Silent Speech Interfaces (SSIs) use data from the speech production process, such as visual information about face movements. However, relying on a single modality limits the amount of available information. In this study we begin to explore the use of multiple input modalities in order to obtain a more complete representation of the speech production model. We selected four non-invasive modalities (visual data from video and depth, surface electromyography, and ultrasonic Doppler) and created a system that explores the synchronous combination of all four, or of a subset of them, in a multimodal SSI. This paper describes the system design, data collection and first word recognition results. Because the first corpora acquired for this SSI are necessarily small, we classify with an example-based recognition approach that uses Dynamic Time Warping followed by a weighted k-Nearest Neighbor classifier. The first classification results, obtained with different vocabularies (digits, a small set of commands related to Ambient Assisted Living, and minimal nasal pairs), show that word recognition can benefit from a multimodal approach.
publishDate	2013
dc.date.none.fl_str_mv	2013-01-01T00:00:00Z 2013 2023-08-30T14:09:41Z 2023-08-30T15:06:56Z
dc.type.driver.fl_str_mv	conference object
dc.type.status.fl_str_mv	info:eu-repo/semantics/publishedVersion
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	http://hdl.handle.net/10071/29220
url	http://hdl.handle.net/10071/29220
dc.language.iso.fl_str_mv	eng
language	eng
dc.relation.none.fl_str_mv	2308-457X
dc.rights.driver.fl_str_mv	info:eu-repo/semantics/openAccess
eu_rights_str_mv	openAccess
dc.format.none.fl_str_mv	application/pdf
dc.publisher.none.fl_str_mv	International Speech and Communication Association
publisher.none.fl_str_mv	International Speech and Communication Association
dc.source.none.fl_str_mv	reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia instacron:RCAAP
instname_str	FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron_str	RCAAP
institution	RCAAP
reponame_str	Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
collection	Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository.name.fl_str_mv	Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
repository.mail.fl_str_mv	info@rcaap.pt
_version_	1833597357542866944

Similar Items