Bibliographic details
Year of defense: 2021
Main author: Alves, Adson Nogueira
Advisor: Not informed by the institution
Defense committee: Not informed by the institution
Document type: Dissertation
Access type: Open access
Language: eng
Defending institution: Universidade Estadual Paulista (Unesp)
Graduate program: Not informed by the institution
Department: Not informed by the institution
Country: Not informed by the institution
Keywords in Portuguese:
Access link: http://hdl.handle.net/11449/214035
Abstract: Unmanned Aerial Vehicles (UAVs) have received increasing attention in recent years, mainly due to their breadth of application in complex and costly activities such as surveillance, agriculture, and entertainment. This market and academic interest has highlighted new challenges for the platform. Among them is the complexity of navigating unknown environments, caused by the randomness of the agents' positions and movement dynamics. Accordingly, new learning techniques have been proposed for these and other tasks in recent years. In particular, model-free algorithms based on autonomous exploration and learning have stood out in this domain. This is the case of Reinforcement Learning (RL). RL seeks appropriate behavior for the robot through trial and error, mapping input states directly to actuator commands; any predefined control structure therefore becomes unnecessary. The present work investigates UAV navigation using a state-of-the-art off-policy method based on Deep Learning (DL), the Soft Actor-Critic (SAC). Our approach employs visual information from the environment together with multiple onboard sensors, and uses an Autoencoder (AE) to reduce the dimensionality of the visual data collected in the environment. We developed our work in the CoppeliaSim simulator, which offers a high degree of fidelity to the real world. In this setting, we investigated the aircraft's state representation and the resulting navigation in environments with and without obstacles, both fixed and mobile. The results showed that the learned policy was able to perform low-level control of the UAV in all analyzed scenarios, and the learned policies exhibited good generalization capabilities. However, as the complexity of the environment increased, policies learned in less complex environments required further training before they could be reused.
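The following is a minimal, hypothetical PyTorch sketch (not taken from the thesis) of the pipeline the abstract describes: the encoder half of an autoencoder compresses a camera frame into a low-dimensional latent vector, which is concatenated with onboard sensor readings and fed to a SAC-style stochastic actor that outputs bounded low-level actuator commands. All layer sizes, dimensions, and names here are assumptions made for illustration.

```python
# Hypothetical sketch of AE-based state compression feeding a SAC actor.
# Dimensions (64x64 RGB frame, 32-dim latent, 9 sensor values, 4 rotor
# commands) are illustrative assumptions, not the thesis configuration.
import torch
import torch.nn as nn

class ConvEncoder(nn.Module):
    """Encoder half of an autoencoder: 64x64 RGB frame -> small latent."""
    def __init__(self, latent_dim=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2), nn.ReLU(),    # 64 -> 31
            nn.Conv2d(32, 64, 4, stride=2), nn.ReLU(),   # 31 -> 14
            nn.Conv2d(64, 128, 4, stride=2), nn.ReLU(),  # 14 -> 6
            nn.Flatten(),
            nn.Linear(128 * 6 * 6, latent_dim),
        )

    def forward(self, img):
        return self.net(img)

class GaussianPolicy(nn.Module):
    """SAC-style actor: state -> squashed Gaussian over actuator commands."""
    def __init__(self, state_dim, action_dim=4, hidden=256):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.mu = nn.Linear(hidden, action_dim)
        self.log_std = nn.Linear(hidden, action_dim)

    def forward(self, state):
        h = self.body(state)
        std = self.log_std(h).clamp(-5, 2).exp()
        dist = torch.distributions.Normal(self.mu(h), std)
        return torch.tanh(dist.rsample())  # bounded low-level commands

# Fuse the AE latent with onboard sensor readings (e.g. IMU, velocities).
encoder = ConvEncoder(latent_dim=32)
policy = GaussianPolicy(state_dim=32 + 9)      # 9 sensor values assumed
frame = torch.rand(1, 3, 64, 64)               # camera image from simulator
sensors = torch.rand(1, 9)                     # onboard sensor vector
state = torch.cat([encoder(frame), sensors], dim=1)
action = policy(state)                         # low-level UAV command
```

One design choice consistent with the AE framing in the abstract is to pretrain the encoder with a reconstruction loss and keep it fixed during policy learning, so the RL state stays low-dimensional and the SAC updates remain cheap; whether the thesis trains the encoder jointly or separately is not stated here.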