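Exploring Big Data Parallelism in the Processing of Remote Sensing Image Time Series (original title: Explorando paralelismo em big data no processamento de séries temporais de imagens de sensoriamento remoto)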

Bibliographic details
Year of defense: 2019
Main author: Oliveira, Sávio Salvarino Teles de
Advisor: Martins, Wellington Santos
Examining committee: Martins, Wellington Santos; Costa, Fábio Moreira; Carvalho, Sérgio Teixeira de; Silva, Nilton Correia da; Davis Júnior, Clodoveu Augusto
Document type: Thesis
Access type: Open access
Language: Portuguese
Defense institution: Universidade Federal de Goiás
Graduate program: Programa de Pós-graduação em Ciência da Computação em Rede UFG/UFMS (INF)
Department: Instituto de Informática - INF (RG)
Country: Brazil
Access link: http://repositorio.bc.ufg.br/tede/handle/tede/10020
Abstract: The surface of planet Earth is changing at an unprecedented rate, and land use and land cover classification using remote sensing time series is now essential for identifying these changes. The TWDTW (Time-Weighted Dynamic Time Warping) algorithm stands out in this task, but its quadratic complexity and high computational cost make it difficult to use with Big Data. In this thesis, we tackle these problems by exploiting parallelism at both the vertical (multicore/manycore) and horizontal (cluster, i.e., distributed system) levels, in an integrated way, for high performance. In the vertical dimension, we propose one parallel algorithm (P-INDEX) for calculating remote sensing indices and another (P-TWDTW) for calculating the similarity between time series. P-INDEX achieved a speedup of up to 9 times over the sequential algorithm when processing all images, while P-TWDTW was up to 12 times faster than its centralized C++ version and 246 times faster than the original TWDTW implementation in R. In addition to enabling the fast calculation of a more sophisticated similarity measure, P-TWDTW also contributed to the generation of meta-characteristics for more robust machine learning methods. This increased the accuracy of time series classification from 78% using TWDTW with KNN to almost 94% using the meta-characteristics obtained from P-TWDTW with SVM. In the horizontal dimension, we propose a distributed platform (BigSensing) that enables efficient handling of large volumes of remote sensing data. The platform includes a smart query engine that chooses, in real time, the best system to filter and retrieve data according to the spatial and temporal constraints of the query, reducing response time by nearly 22% compared with SciDB.
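To make the quadratic cost mentioned in the abstract concrete, the following is a minimal sequential sketch of a time-weighted DTW distance in Python. It is not the thesis's P-TWDTW and not the R implementation: the function names, the logistic penalty parameters (`alpha`, `beta`), and the use of plain day numbers are assumptions made for illustration, and the sketch aligns the two series end to end rather than searching for subsequence matches. The nested loops fill an (n+1) x (m+1) table, which is where the O(n·m) cost comes from.

```python
import math


def logistic_time_weight(day_gap, alpha=0.1, beta=50.0):
    """Logistic penalty in (0, 1) that grows with the gap in days.

    alpha (steepness) and beta (midpoint, in days) are illustrative values.
    """
    return 1.0 / (1.0 + math.exp(-alpha * (day_gap - beta)))


def twdtw_distance(pattern, pattern_days, series, series_days,
                   alpha=0.1, beta=50.0):
    """Time-weighted DTW distance between two 1-D series (hypothetical helper).

    The local cost is the absolute value difference plus a logistic
    penalty on the difference between the observation days.
    """
    n, m = len(pattern), len(series)
    inf = float("inf")
    # acc[i][j] holds the cheapest cost of aligning the first i pattern
    # points with the first j series points.
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):          # n * m cell updates in total
        for j in range(1, m + 1):
            gap = abs(pattern_days[i - 1] - series_days[j - 1])
            cost = (abs(pattern[i - 1] - series[j - 1])
                    + logistic_time_weight(gap, alpha, beta))
            acc[i][j] = cost + min(acc[i - 1][j],      # advance pattern only
                                   acc[i][j - 1],      # advance series only
                                   acc[i - 1][j - 1])  # advance both
    return acc[n][m]


# Same vegetation-index values, observed on the same days and 100 days later.
values = [0.2, 0.6, 0.3]
days = [0, 16, 32]
same_dates = twdtw_distance(values, days, values, days)
shifted_dates = twdtw_distance(values, days, values, [d + 100 for d in days])
```

In this sketch the time penalty makes `shifted_dates` larger than `same_dates` even though the values are identical, which is the behavior that separates a time-weighted measure from plain DTW. Each pixel's series is compared independently of the others, so a computation of this shape lends itself to the multicore and manycore parallelism the abstract describes.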