Um arcabouço baseado em componentes para computação paralela de larga escala sobre grafos

Bibliographic Details
Main Author: Rezende, Cenez Araújo de
Publication Date: 2017
Format: Doctoral thesis
Language: por
Source: Repositório Institucional da Universidade Federal do Ceará (UFC)
Download full: http://www.repositorio.ufc.br/handle/riufc/26086
Summary: Faced with the increasing growth of data production to be processed by computer systems, a result of the current technological context and emerging applications of both industrial and scientific interest, researchers and companies have been looking for solutions to leverage large-scale data processing and analysis capacity. In addition to the large volume, many of these data must be processed by high-complexity algorithms, highlighting the inherent difficulties of problems in large graphs (BigGraph), often used to model information from large databases. Although with limitations in graph processing, the MapReduce model has motivated the construction of several high-performance frameworks, in order to meet the demand for efficient large-scale general purpose systems. Such a context has led to the proposal of more specialized solutions, such as Pregel and GAS (Gather, Apply, Scatter), as well as MapReduce extensions to deal with graph processing. However, frameworks that implement these models still have limitations, such as multi-platform constraints and general propose programming models for graphs. In this work, we show how component-oriented parallel programming can deal with MapReduce and conventional Pregel constraints. For that, we have employed HPC shelf, a component-based cloud computing platform for HPC services. On top of this platform, we introduce Gust, a flexible, extensible and adaptable BigGraph framework based on MapReduce. Besides the gains in software architecture, due to the use of a component-oriented approach, we have obtained competitive performance results compared to the state-of-the-art through an experimental study, using estatistical methods to increase confidence.
id UFC-7_8e767e74f4a7fea89e717c269a1b37cf
oai_identifier_str oai:repositorio.ufc.br:riufc/26086
network_acronym_str UFC-7
network_name_str Repositório Institucional da Universidade Federal do Ceará (UFC)
repository_id_str
spelling Rezende, Cenez Araújo deCarvalho Junior, Francisco Heron de2017-09-26T12:43:31Z2017-09-26T12:43:31Z2017REZENDE, Cenez Araújo de. Um arcabouço baseado em componentes para computação paralela de larga escala sobre grafos. 2017. 175 f. Tese (Doutorado em Ciência da Computação) - Universidade Federal do Ceará, Fortaleza, 2017.http://www.repositorio.ufc.br/handle/riufc/26086Faced with the increasing growth of data production to be processed by computer systems, a result of the current technological context and emerging applications of both industrial and scientific interest, researchers and companies have been looking for solutions to leverage large-scale data processing and analysis capacity. In addition to the large volume, many of these data must be processed by high-complexity algorithms, highlighting the inherent difficulties of problems in large graphs (BigGraph), often used to model information from large databases. Although with limitations in graph processing, the MapReduce model has motivated the construction of several high-performance frameworks, in order to meet the demand for efficient large-scale general purpose systems. Such a context has led to the proposal of more specialized solutions, such as Pregel and GAS (Gather, Apply, Scatter), as well as MapReduce extensions to deal with graph processing. However, frameworks that implement these models still have limitations, such as multi-platform constraints and general propose programming models for graphs. In this work, we show how component-oriented parallel programming can deal with MapReduce and conventional Pregel constraints. For that, we have employed HPC shelf, a component-based cloud computing platform for HPC services. On top of this platform, we introduce Gust, a flexible, extensible and adaptable BigGraph framework based on MapReduce. Besides the gains in software architecture, due to the use of a component-oriented approach, we have obtained competitive performance results compared to the state-of-the-art through an experimental study, using estatistical methods to increase confidence.Diante do progressivo crescimento da produção de dados a serem processados por sistemas de computação, produto do contexto tecnológico vigente e de aplicações emergentes tanto de interesse industrial quanto científico, têm-se buscado soluções para alavancar a capacidade de processamento e análise de dados em larga escala. Além de volumosos, estão propícios a serem processados por algoritmos de alta complexidade, destacando as dificuldades inerentes a problemas em grafos grandes (BigGraph), frequentemente usados para modelar informações de grandes bases de dados. O modelo MapReduce, embora com limitações nesse domínio, abriu o caminho para a construção de vários arcabouços de alto desempenho, buscando atender à demanda por eficiente processamento de larga escala com propósito geral. Isso motivou o surgimento de soluções mais especializadas, voltadas a grafos, tais como os modelos Pregel e GAS (Gather, Apply, Scatter), bem como extensões do próprio MapReduce. Contudo, arcabouços que implementam esses modelos possuem ainda limitações, como restrições a multiplataformas e modelos mais gerais de programação. Neste trabalho, mostramos como a programação paralela orientada a componentes pode lidar com as limitações MapReduce e de modelos convencionais Pregel. Isso é feito usando a HPC Shelf, uma plataforma de computação em nuvem baseada em componentes para serviços HPC. Visando essa plataforma, apresentamos o Gust, um arcabouço BigGraph flexível, extensível e adaptável baseado em MapReduce. Através de estudo experimental, os resultados têm sido competitivos com o estado da arte, tanto em desempenho com na engenharia de software paralelo, com base em interesses funcionais e não funcionais.Programação paralelaEngenharia de software baseada em componentesProcessamento de grafosComputação de alto desempenhoNuvem computacionalUm arcabouço baseado em componentes para computação paralela de larga escala sobre grafosA component-oriented framework for large-scale parallel processing of big graphsinfo:eu-repo/semantics/publishedVersioninfo:eu-repo/semantics/doctoralThesisporreponame:Repositório Institucional da Universidade Federal do Ceará (UFC)instname:Universidade Federal do Ceará (UFC)instacron:UFCinfo:eu-repo/semantics/openAccessLICENSElicense.txtlicense.txttext/plain; charset=utf-81748http://repositorio.ufc.br/bitstream/riufc/26086/2/license.txt8a4605be74aa9ea9d79846c1fba20a33MD52ORIGINAL2017_tese_carezende.pdf2017_tese_carezende.pdfapplication/pdf2364641http://repositorio.ufc.br/bitstream/riufc/26086/3/2017_tese_carezende.pdf9b3282376b757bebf60c1ded80c52aa2MD53riufc/260862019-11-11 19:07:14.187oai:repositorio.ufc.br:riufc/26086Tk9URTogUExBQ0UgWU9VUiBPV04gTElDRU5TRSBIRVJFClRoaXMgc2FtcGxlIGxpY2Vuc2UgaXMgcHJvdmlkZWQgZm9yIGluZm9ybWF0aW9uYWwgcHVycG9zZXMgb25seS4KCk5PTi1FWENMVVNJVkUgRElTVFJJQlVUSU9OIExJQ0VOU0UKCkJ5IHNpZ25pbmcgYW5kIHN1Ym1pdHRpbmcgdGhpcyBsaWNlbnNlLCB5b3UgKHRoZSBhdXRob3Iocykgb3IgY29weXJpZ2h0Cm93bmVyKSBncmFudHMgdG8gRFNwYWNlIFVuaXZlcnNpdHkgKERTVSkgdGhlIG5vbi1leGNsdXNpdmUgcmlnaHQgdG8gcmVwcm9kdWNlLAp0cmFuc2xhdGUgKGFzIGRlZmluZWQgYmVsb3cpLCBhbmQvb3IgZGlzdHJpYnV0ZSB5b3VyIHN1Ym1pc3Npb24gKGluY2x1ZGluZwp0aGUgYWJzdHJhY3QpIHdvcmxkd2lkZSBpbiBwcmludCBhbmQgZWxlY3Ryb25pYyBmb3JtYXQgYW5kIGluIGFueSBtZWRpdW0sCmluY2x1ZGluZyBidXQgbm90IGxpbWl0ZWQgdG8gYXVkaW8gb3IgdmlkZW8uCgpZb3UgYWdyZWUgdGhhdCBEU1UgbWF5LCB3aXRob3V0IGNoYW5naW5nIHRoZSBjb250ZW50LCB0cmFuc2xhdGUgdGhlCnN1Ym1pc3Npb24gdG8gYW55IG1lZGl1bSBvciBmb3JtYXQgZm9yIHRoZSBwdXJwb3NlIG9mIHByZXNlcnZhdGlvbi4KCllvdSBhbHNvIGFncmVlIHRoYXQgRFNVIG1heSBrZWVwIG1vcmUgdGhhbiBvbmUgY29weSBvZiB0aGlzIHN1Ym1pc3Npb24gZm9yCnB1cnBvc2VzIG9mIHNlY3VyaXR5LCBiYWNrLXVwIGFuZCBwcmVzZXJ2YXRpb24uCgpZb3UgcmVwcmVzZW50IHRoYXQgdGhlIHN1Ym1pc3Npb24gaXMgeW91ciBvcmlnaW5hbCB3b3JrLCBhbmQgdGhhdCB5b3UgaGF2ZQp0aGUgcmlnaHQgdG8gZ3JhbnQgdGhlIHJpZ2h0cyBjb250YWluZWQgaW4gdGhpcyBsaWNlbnNlLiBZb3UgYWxzbyByZXByZXNlbnQKdGhhdCB5b3VyIHN1Ym1pc3Npb24gZG9lcyBub3QsIHRvIHRoZSBiZXN0IG9mIHlvdXIga25vd2xlZGdlLCBpbmZyaW5nZSB1cG9uCmFueW9uZSdzIGNvcHlyaWdodC4KCklmIHRoZSBzdWJtaXNzaW9uIGNvbnRhaW5zIG1hdGVyaWFsIGZvciB3aGljaCB5b3UgZG8gbm90IGhvbGQgY29weXJpZ2h0LAp5b3UgcmVwcmVzZW50IHRoYXQgeW91IGhhdmUgb2J0YWluZWQgdGhlIHVucmVzdHJpY3RlZCBwZXJtaXNzaW9uIG9mIHRoZQpjb3B5cmlnaHQgb3duZXIgdG8gZ3JhbnQgRFNVIHRoZSByaWdodHMgcmVxdWlyZWQgYnkgdGhpcyBsaWNlbnNlLCBhbmQgdGhhdApzdWNoIHRoaXJkLXBhcnR5IG93bmVkIG1hdGVyaWFsIGlzIGNsZWFybHkgaWRlbnRpZmllZCBhbmQgYWNrbm93bGVkZ2VkCndpdGhpbiB0aGUgdGV4dCBvciBjb250ZW50IG9mIHRoZSBzdWJtaXNzaW9uLgoKSUYgVEhFIFNVQk1JU1NJT04gSVMgQkFTRUQgVVBPTiBXT1JLIFRIQVQgSEFTIEJFRU4gU1BPTlNPUkVEIE9SIFNVUFBPUlRFRApCWSBBTiBBR0VOQ1kgT1IgT1JHQU5JWkFUSU9OIE9USEVSIFRIQU4gRFNVLCBZT1UgUkVQUkVTRU5UIFRIQVQgWU9VIEhBVkUKRlVMRklMTEVEIEFOWSBSSUdIVCBPRiBSRVZJRVcgT1IgT1RIRVIgT0JMSUdBVElPTlMgUkVRVUlSRUQgQlkgU1VDSApDT05UUkFDVCBPUiBBR1JFRU1FTlQuCgpEU1Ugd2lsbCBjbGVhcmx5IGlkZW50aWZ5IHlvdXIgbmFtZShzKSBhcyB0aGUgYXV0aG9yKHMpIG9yIG93bmVyKHMpIG9mIHRoZQpzdWJtaXNzaW9uLCBhbmQgd2lsbCBub3QgbWFrZSBhbnkgYWx0ZXJhdGlvbiwgb3RoZXIgdGhhbiBhcyBhbGxvd2VkIGJ5IHRoaXMKbGljZW5zZSwgdG8geW91ciBzdWJtaXNzaW9uLgo=Repositório InstitucionalPUBhttp://www.repositorio.ufc.br/ri-oai/requestbu@ufc.br || repositorio@ufc.bropendoar:2019-11-11T22:07:14Repositório Institucional da Universidade Federal do Ceará (UFC) - Universidade Federal do Ceará (UFC)false
dc.title.pt_BR.fl_str_mv Um arcabouço baseado em componentes para computação paralela de larga escala sobre grafos
dc.title.en.pt_BR.fl_str_mv A component-oriented framework for large-scale parallel processing of big graphs
title Um arcabouço baseado em componentes para computação paralela de larga escala sobre grafos
spellingShingle Um arcabouço baseado em componentes para computação paralela de larga escala sobre grafos
Rezende, Cenez Araújo de
Programação paralela
Engenharia de software baseada em componentes
Processamento de grafos
Computação de alto desempenho
Nuvem computacional
title_short Um arcabouço baseado em componentes para computação paralela de larga escala sobre grafos
title_full Um arcabouço baseado em componentes para computação paralela de larga escala sobre grafos
title_fullStr Um arcabouço baseado em componentes para computação paralela de larga escala sobre grafos
title_full_unstemmed Um arcabouço baseado em componentes para computação paralela de larga escala sobre grafos
title_sort Um arcabouço baseado em componentes para computação paralela de larga escala sobre grafos
author Rezende, Cenez Araújo de
author_facet Rezende, Cenez Araújo de
author_role author
dc.contributor.author.fl_str_mv Rezende, Cenez Araújo de
dc.contributor.advisor1.fl_str_mv Carvalho Junior, Francisco Heron de
contributor_str_mv Carvalho Junior, Francisco Heron de
dc.subject.por.fl_str_mv Programação paralela
Engenharia de software baseada em componentes
Processamento de grafos
Computação de alto desempenho
Nuvem computacional
topic Programação paralela
Engenharia de software baseada em componentes
Processamento de grafos
Computação de alto desempenho
Nuvem computacional
description Faced with the increasing growth of data production to be processed by computer systems, a result of the current technological context and emerging applications of both industrial and scientific interest, researchers and companies have been looking for solutions to leverage large-scale data processing and analysis capacity. In addition to the large volume, many of these data must be processed by high-complexity algorithms, highlighting the inherent difficulties of problems in large graphs (BigGraph), often used to model information from large databases. Although with limitations in graph processing, the MapReduce model has motivated the construction of several high-performance frameworks, in order to meet the demand for efficient large-scale general purpose systems. Such a context has led to the proposal of more specialized solutions, such as Pregel and GAS (Gather, Apply, Scatter), as well as MapReduce extensions to deal with graph processing. However, frameworks that implement these models still have limitations, such as multi-platform constraints and general propose programming models for graphs. In this work, we show how component-oriented parallel programming can deal with MapReduce and conventional Pregel constraints. For that, we have employed HPC shelf, a component-based cloud computing platform for HPC services. On top of this platform, we introduce Gust, a flexible, extensible and adaptable BigGraph framework based on MapReduce. Besides the gains in software architecture, due to the use of a component-oriented approach, we have obtained competitive performance results compared to the state-of-the-art through an experimental study, using estatistical methods to increase confidence.
publishDate 2017
dc.date.accessioned.fl_str_mv 2017-09-26T12:43:31Z
dc.date.available.fl_str_mv 2017-09-26T12:43:31Z
dc.date.issued.fl_str_mv 2017
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/doctoralThesis
format doctoralThesis
status_str publishedVersion
dc.identifier.citation.fl_str_mv REZENDE, Cenez Araújo de. Um arcabouço baseado em componentes para computação paralela de larga escala sobre grafos. 2017. 175 f. Tese (Doutorado em Ciência da Computação) - Universidade Federal do Ceará, Fortaleza, 2017.
dc.identifier.uri.fl_str_mv http://www.repositorio.ufc.br/handle/riufc/26086
identifier_str_mv REZENDE, Cenez Araújo de. Um arcabouço baseado em componentes para computação paralela de larga escala sobre grafos. 2017. 175 f. Tese (Doutorado em Ciência da Computação) - Universidade Federal do Ceará, Fortaleza, 2017.
url http://www.repositorio.ufc.br/handle/riufc/26086
dc.language.iso.fl_str_mv por
language por
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.source.none.fl_str_mv reponame:Repositório Institucional da Universidade Federal do Ceará (UFC)
instname:Universidade Federal do Ceará (UFC)
instacron:UFC
instname_str Universidade Federal do Ceará (UFC)
instacron_str UFC
institution UFC
reponame_str Repositório Institucional da Universidade Federal do Ceará (UFC)
collection Repositório Institucional da Universidade Federal do Ceará (UFC)
bitstream.url.fl_str_mv http://repositorio.ufc.br/bitstream/riufc/26086/2/license.txt
http://repositorio.ufc.br/bitstream/riufc/26086/3/2017_tese_carezende.pdf
bitstream.checksum.fl_str_mv 8a4605be74aa9ea9d79846c1fba20a33
9b3282376b757bebf60c1ded80c52aa2
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
repository.name.fl_str_mv Repositório Institucional da Universidade Federal do Ceará (UFC) - Universidade Federal do Ceará (UFC)
repository.mail.fl_str_mv bu@ufc.br || repositorio@ufc.br
_version_ 1847792362334453760