Detalhes bibliográficos
Ano de defesa: |
2020 |
Autor(a) principal: |
Gimenes, Gabriel Perri |
Orientador(a): |
Não Informado pela instituição |
Banca de defesa: |
Não Informado pela instituição |
Tipo de documento: |
Tese
|
Tipo de acesso: |
Acesso aberto |
Idioma: |
eng |
Instituição de defesa: |
Biblioteca Digitais de Teses e Dissertações da USP
|
Programa de Pós-Graduação: |
Não Informado pela instituição
|
Departamento: |
Não Informado pela instituição
|
País: |
Não Informado pela instituição
|
Palavras-chave em Português: |
|
Link de acesso: |
https://www.teses.usp.br/teses/disponiveis/55/55134/tde-08062020-100828/
|
Resumo: |
Since the birth of web 2.0, users no longer just consume but are now active creators of content that is going to be consumed by other users. This new dynamic took data generation to a whole new scale, called planetary-scale or web-scale. Often, this data represents relationships between its elements, such as in social networks, recommendation systems, online boards, email networks, scientific citation networks, and others. Analyzing how information flows and how nodes influence each other in several of such networks is a widely regarded problem; While Belief Propagation, which is the fundamental algorithm for these types of inference, is widely used, it historically lacked convergence guarantees for real-world networks. However, even though recently alternative methods such as LinBP solve the convergence problems of the original algorithm, its scalability when dealing with large-scale problems remains a challenge. Also, several of the works proposed to solve this issue, do so by relying on specific infrastructures such as supercomputers and computational clusters. Motivated by these challenges we propose a new algorithm, called VCBP, that aims to provide a scalable framework for belief propagation on largescale problems, such as when graphs do not fit the main memory. We do so by combining stateof- the-art asynchronous vertex-centric parallel processing with state-of-the-art belief propagation algorithm. Our algorithm maintains the same accuracy rate while achieving performance orders of magnitude higher than former LinBPs implementation. Due to the asynchronous nature of our algorithm, VCBP demands fewer iterations before convergence than any previous algorithm. Additionally, we analyze our algorithm in the task of node classification, achieving significant results over real-world datasets. Our findings indicate that there is unexplored potential in todays widely available modern hardware, specifically concerning parallelism, sparking a shift towards a more cost-efficient and ubiquitous data mining scenario. |