Bibliographic details
Year of defense: |
2018 |
Main author: |
Mamani, Gladys Marleny Hilasaca |
Advisor: |
Not provided by the institution |
Defense committee: |
Not provided by the institution |
Document type: |
Thesis
|
Access type: |
Open access |
Language: |
eng |
Defending institution: |
Biblioteca Digitais de Teses e Dissertações da USP
|
Graduate program: |
Not provided by the institution
|
Department: |
Not provided by the institution
|
Country: |
Not provided by the institution
|
Keywords in Portuguese: |
|
Access link: |
https://www.teses.usp.br/teses/disponiveis/55/55134/tde-01042019-140258/
|
Abstract: |
The goal of Dimensionality Reduction is to transform data from a high-dimensional space into a visual space while preserving the relationships that exist in the original space. This abstract representation of complex data enables the exploration of data similarities, but it makes analysis and interpretation challenging when the visual representation does not match the user's expectations. One possible way to model these different understandings is to describe an object with different features, because each feature encodes the object's characteristics in its own way. In this thesis, we propose a visual approach that helps users combine different features so that the result best matches their point of view regarding similarity. Our approach is a two-step strategy. It starts from a small sample of the features, where users can easily test different feature combinations and check the resulting similarity relationships in real time. Once a combination that matches the user's expectations is defined, it is propagated to the whole dataset through an affine transformation. A traditional way to visualize data similarities is via scatter plots; however, they suffer from overlap issues. Overlapping hides data distributions and makes the relationships among data instances difficult to observe, which hampers data exploration. In this work, we present a technique called Distance-preserving Grid (DGrid) to tackle this issue. DGrid employs a binary space partitioning process in combination with Dimensionality Reduction output to create orthogonal regular grid layouts. DGrid ensures that instances do not overlap because each data instance is assigned to exactly one grid cell. Our results show that DGrid is as precise as the existing state-of-the-art techniques based on grid representations, while requiring only a fraction of the running time and computational resources.
Despite its simplicity, the quality of the produced layouts and its running times make DGrid a very attractive method for large datasets. |
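The grid idea described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: it assumes the input is a list of 2D points already produced by some Dimensionality Reduction step, and the function name `grid_layout`, the rule for choosing the split axis, and the handling of grids with more cells than points are all assumptions made for illustration. The sketch recursively bisects the points (sorted along one axis) together with the block of grid cells, so that each point ends up in its own cell.

```python
def grid_layout(points, rows, cols):
    """Assign each 2D point to a distinct (row, col) cell by recursive bisection.

    Hypothetical sketch: `points` is a list of (x, y) pairs, assumed to come
    from a dimensionality reduction step. Returns one cell per point, in
    input order.
    """
    if len(points) > rows * cols:
        raise ValueError("grid has fewer cells than points")
    cells = {}

    def split(idx, r0, c0, r, c):
        # idx: indices of points to place in the block of r x c cells
        # whose first cell is (r0, c0). Invariant: len(idx) <= r * c.
        if not idx:
            return
        if r == 1 and c == 1:
            cells[idx[0]] = (r0, c0)
            return
        if r > c:
            # More rows than columns: cut the block horizontally, order by y.
            idx = sorted(idx, key=lambda i: points[i][1])
            top = r // 2
            k = min(len(idx), top * c)  # points that fit in the first half
            split(idx[:k], r0, c0, top, c)
            split(idx[k:], r0 + top, c0, r - top, c)
        else:
            # Otherwise cut the block vertically, order by x.
            idx = sorted(idx, key=lambda i: points[i][0])
            left = c // 2
            k = min(len(idx), left * r)
            split(idx[:k], r0, c0, r, left)
            split(idx[k:], r0, c0 + left, r, c - left)

    split(list(range(len(points))), 0, 0, rows, cols)
    return [cells[i] for i in range(len(points))]


# Four points near the corners of the unit square map to the four cells
# of a 2 x 2 grid (row index grows with y, column index grows with x).
layout = grid_layout([(0.1, 0.1), (0.9, 0.1), (0.1, 0.9), (0.9, 0.9)], 2, 2)
print(layout)
```

Because every recursive call hands each half of the cell block no more points than it has cells, two points can never share a cell, even when their projected coordinates coincide.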