Object detection and pose estimation from natural features for augmented reality in complex scenes

Bibliographic details
Year of defense: 2016
Main author: SIMOES, Francisco Paulo Magalhaes
Advisor: TEICHRIEB, Veronica
Defense committee: Not provided by the institution
Document type: Thesis
Access type: Open access
Language: eng
Institution of defense: Universidade Federal de Pernambuco
Graduate program: Programa de Pos Graduacao em Ciencia da Computacao
Department: Not provided by the institution
Country: Brazil
Keywords in Portuguese:
Access link: https://repositorio.ufpe.br/handle/123456789/22417
Abstract: Aligning virtual elements with real-world scenes (known as detection and tracking) by relying on features that are naturally present in the scene is one of the most important challenges in Augmented Reality (AR). In complex scenes such as industrial scenarios, the problem becomes harder because of the lack of features and models, high specularity, and other factors. Motivated by these problems, this PhD thesis addresses the question “How to improve object detection and pose estimation from natural features for AR when dealing with complex scenes problems?”. To answer this question, we first need to ask ourselves “What are the challenges that we face when developing a new tracker for real world scenarios?”. We begin to answer these questions by developing a complete tracking system that tackles some characteristics typically found in industrial scenarios. This system was validated in a tracking competition organized by ISMAR, the most important AR conference in the world. During the contest, two problems complementary to tracking were also discussed: calibration, the procedure that places the virtual information in the same coordinate system as the real world, and 3D reconstruction, which is responsible for creating the 3D models of the scene used for tracking. Because many trackers need a pre-acquired model of the target objects, the quality of the generated geometric model influences the tracker, as observed in the tracking contest. Sometimes these models are available, but in other cases acquiring them requires great effort (manual modeling) or cost (laser scanning). For this reason, we decided to analyze how difficult it currently is to automatically recover 3D geometry from complex 3D scenes using only video. In our case, we considered an electrical substation as the complex 3D scene.
Based on the knowledge acquired from the previous experiments, we decided to first tackle the problem of improving tracking for scenes where recent RGB-D sensors can be used during model generation and tracking. We developed a technique called DARP (Depth Assisted Rectification of Patches), which can improve matching by using features rectified according to the patch normals. We analyzed this new technique on different synthetic and real scenes and improved the results over traditional texture-based approaches such as ORB, DAFT, or SIFT. Since model generation is a difficult problem in complex scenes, our second proposed tracking approach does not depend on these geometric models and aims to track textured or textureless objects. We applied a supervised learning technique called Gradient Boosting Trees (GBTs) to solve tracking as a linear regression problem. We developed this technique by using image gradients and analyzing their relationship with the tracking parameters. We also proposed an improvement over GBTs that combines them with traditional tracking approaches, such as intensity-based or edge-based features, which turns their piecewise constant function into a more robust piecewise linear function. With the new approach, it was possible to track textureless objects such as a black and white map.
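
The remark that GBTs yield a piecewise constant function can be illustrated with a toy example. The following is a minimal sketch, not the implementation used in the thesis: it fits gradient-boosted depth-1 trees (stumps) under squared loss to a hypothetical one-dimensional mapping from a single image-gradient feature to a single tracking parameter. The data, the function names (`fit_stump`, `fit_gbt`, `predict`, `mse`), and the hyperparameters are all assumptions made for illustration.

```python
# Toy gradient boosting with decision stumps (squared loss, 1-D input).
# Hypothetical illustration only; names and data are not from the thesis.

def fit_stump(xs, residuals):
    """Find the threshold split that best fits the residuals.

    Returns (threshold, left_value, right_value), where each value is the
    mean residual on that side of the threshold.
    """
    best = None
    order = sorted(set(xs))
    for lo, hi in zip(order, order[1:]):
        t = (lo + hi) / 2.0  # candidate threshold between two samples
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lv = sum(left) / len(left)
        rv = sum(right) / len(right)
        err = (sum((r - lv) ** 2 for r in left)
               + sum((r - rv) ** 2 for r in right))
        if best is None or err < best[0]:
            best = (err, t, lv, rv)
    return best[1:]


def fit_gbt(xs, ys, n_rounds=50, learning_rate=0.3):
    """Fit an additive model: mean prediction plus shrunken stumps."""
    base = sum(ys) / len(ys)
    preds = [base] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        # For squared loss, the negative gradient is the plain residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        t, lv, rv = fit_stump(xs, residuals)
        stumps.append((t, lv, rv))
        preds = [p + learning_rate * (lv if x <= t else rv)
                 for x, p in zip(xs, preds)]
    return base, stumps, learning_rate


def predict(model, x):
    """Evaluate the boosted model at x; the result is piecewise constant."""
    base, stumps, lr = model
    return base + lr * sum(lv if x <= t else rv for t, lv, rv in stumps)


def mse(model, xs, ys):
    """Mean squared error of the model on the given samples."""
    return sum((predict(model, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical training data: feature value -> tracking parameter.
xs = [i / 10 for i in range(21)]
ys = [2 * x + 1 for x in xs]

weak_model = fit_gbt(xs, ys, n_rounds=1)
model = fit_gbt(xs, ys, n_rounds=50)
```

Because every stump splits at a threshold between two training samples, any two inputs that fall between the same pair of neighbouring samples, for example `predict(model, 0.01)` and `predict(model, 0.04)`, receive exactly the same output. This is the step-like behaviour that the abstract contrasts with a piecewise linear function.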