Gesture recognition for interaction tasks in immersive reality

Bibliographic Details
Main Author: Santos, José Luís Picão
Publication Date: 2024
Format: Master thesis
Language: eng
Source: Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
Download full: http://hdl.handle.net/10400.26/54137
Summary: With continuous advancements in immersive technology, enhancing human-computer interaction (HCI) with virtual environments has become a relevant topic. Commercial virtual reality (VR) systems are equipped with handheld controllers, which limit the naturalness of user interactions. Immersive technologies benefit from the senses of presence, immersion, and embodiment to augment the user experience. This project addresses these aspects by developing a computer vision-based dynamic gesture recognition system with full-body pose tracking integrated into a virtual reality experience. Unlike most existing applications, which use only body tracking, our system combines body and hand tracking and includes a gesture recognition module. To demonstrate the effectiveness of the approach, we propose a 3D immersive virtual environment in which the user engages in a game of "rock-paper-scissors" against the system, aiming to outperform it. The project uses the MediaPipe framework as the body pose tracking mechanism to extract the user's joint coordinates. The obtained data is processed by a long short-term memory (LSTM) deep neural network (DNN) to classify dynamic gestures. The Unity 3D game engine is used to render the avatar and to present the immersive experience to the user. To validate the developed system, experimental tasks were conducted with eight participants, each of whom played the game five times. The system was validated quantitatively by measuring online classification accuracy, while the subjective sense of realism, presence, involvement, and system usability was evaluated using the iGroup Presence Questionnaire (IPQ). Results show that tracking both body and hands is feasible, with potential applications ranging from rehabilitation to sports.
Regarding the integration of pose tracking for controlling the avatar in the 3D immersive VR environment, the results indicate that the experience has a positive impact on users in terms of realism and presence, as evidenced by a 73.44% score on the IPQ. However, users reported a mismatch between their own movements and the avatar's during the experience. The performance of the gesture recognition model did not match the accuracy achieved during the offline validation and testing phases. This lack of generalisation is attributed to the limited number of training samples and the low variability of gestures, as the dataset included only a single individual. Overall, the biggest challenge of the project was integrating the pose data to control the avatar. Nevertheless, the results demonstrate the feasibility of combining computer vision-based gesture interaction and full-body tracking in a VR experience.
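The thesis itself does not reproduce its network code in this record. As an illustration of the general technique it describes (an LSTM classifying sequences of MediaPipe-style landmark vectors into the three "rock-paper-scissors" classes), here is a minimal NumPy sketch. All names, the hidden size, the feature layout (33 body-pose landmarks plus 21 landmarks per hand, each with x, y, z), and the random weights are assumptions for illustration, not the author's actual model:

```python
import numpy as np

# Assumed per-frame feature layout: 33 body + 21 left-hand + 21 right-hand
# landmarks, each contributing (x, y, z) coordinates.
N_FEATURES = (33 + 21 + 21) * 3   # 225 values per frame
N_CLASSES = 3                     # rock, paper, scissors

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_cell(x, h, c, W, U, b):
    """One LSTM time step: gates computed from input x and previous state (h, c)."""
    H = h.size
    z = W @ x + U @ h + b          # all four gate pre-activations, stacked
    i = sigmoid(z[0:H])            # input gate
    f = sigmoid(z[H:2*H])          # forget gate
    o = sigmoid(z[2*H:3*H])        # output gate
    g = np.tanh(z[3*H:4*H])        # candidate cell state
    c = f * c + i * g
    h = o * np.tanh(c)
    return h, c

def classify_sequence(frames, params):
    """Run the LSTM over a (T, N_FEATURES) landmark sequence and return
    softmax class probabilities computed from the final hidden state."""
    W, U, b, W_out, b_out = params
    H = U.shape[1]
    h, c = np.zeros(H), np.zeros(H)
    for x in frames:
        h, c = lstm_cell(x, h, c, W, U, b)
    logits = W_out @ h + b_out
    e = np.exp(logits - logits.max())
    return e / e.sum()

# Random (untrained) weights, just to exercise the forward pass.
rng = np.random.default_rng(0)
H = 32
params = (rng.normal(0, 0.1, (4 * H, N_FEATURES)),
          rng.normal(0, 0.1, (4 * H, H)),
          np.zeros(4 * H),
          rng.normal(0, 0.1, (N_CLASSES, H)),
          np.zeros(N_CLASSES))

seq = rng.normal(size=(30, N_FEATURES))   # 30 frames of synthetic landmarks
probs = classify_sequence(seq, params)
```

In a real system the weights would come from training on recorded gesture sequences, and the frames from MediaPipe's per-frame landmark output rather than random noise.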
id RCAP_a26fe6b4cd7f83e5eb4a012d91f1f39e
oai_identifier_str oai:comum.rcaap.pt:10400.26/54137
network_acronym_str RCAP
network_name_str Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository_id_str https://opendoar.ac.uk/repository/7160
dc.title.none.fl_str_mv Gesture recognition for interaction tasks in immersive reality
title Gesture recognition for interaction tasks in immersive reality
author Santos, José Luís Picão
author_role author
dc.contributor.none.fl_str_mv Pires, Gabriel Pereira
Almeida, Luís Agnelo de
Repositório Comum
dc.contributor.author.fl_str_mv Santos, José Luís Picão
dc.subject.por.fl_str_mv Computer vision
Pose tracking
Skeleton tracking
Gesture classification
Immersive reality
publishDate 2024
dc.date.none.fl_str_mv 2024
2024-01-01T00:00:00Z
2025-02-01T12:01:28Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/masterThesis
format masterThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10400.26/54137
urn:tid:203880854
url http://hdl.handle.net/10400.26/54137
identifier_str_mv urn:tid:203880854
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.source.none.fl_str_mv reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron:RCAAP
instname_str FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron_str RCAAP
institution RCAAP
reponame_str Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
collection Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository.name.fl_str_mv Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
repository.mail.fl_str_mv info@rcaap.pt
_version_ 1833602844333178880