Block-based and structure-based techniques for large-scale graph processing and visualization

Detalhes bibliográficos
Ano de defesa: 2015
Autor(a) principal: Colmenares, Hugo Armando Gualdron
Orientador(a): Não Informado pela instituição
Banca de defesa: Não Informado pela instituição
Tipo de documento: Dissertação
Tipo de acesso: Acesso aberto
Idioma: eng
Instituição de defesa: Biblioteca Digitais de Teses e Dissertações da USP
Programa de Pós-Graduação: Não Informado pela instituição
Departamento: Não Informado pela instituição
País: Não Informado pela instituição
Palavras-chave em Português:
Link de acesso: http://www.teses.usp.br/teses/disponiveis/55/55134/tde-23032016-145752/
Resumo: Data analysis techniques can be useful in decision-making processes, when patterns of interest can indicate trends in specific domains. Such trends might support evaluation, definition of alternatives, or prediction of events. Currently, datasets have increased in size and complexity, posing challenges to modern hardware resources. In the case of large datasets that can be represented as graphs, issues of visualization and scalable processing are of current concern. Distributed frameworks are commonly used to deal with this data, but the deployment and the management of computational clusters can be complex, demanding technical and financial resources that can be prohibitive in several scenarios. Therefore, it is desirable to design efficient techniques for processing and visualization of large scale graphs that optimize hardware resources in a single computational node. In this course of action, we developed a visualization technique named StructMatrix to find interesting insights on real-life graphs. In addition, we proposed a graph processing framework M-Flash that used a novel, bimodal block processing strategy (BBP) to boost computation speed by minimizing I/O cost. Our results show that our visualization technique allows an efficient and interactive exploration of big graphs and our framework MFlash significantly outperformed all state-of-the-art approaches based on secondary memory. Our contributions have been validated in peer-review events demonstrating the potential of our finding in fostering the analytical possibilities related to large-graph data domains.