Streaming, distributed, and asynchronous amortized inference

Bibliographic Details
Main Author: Henrique, Tiago da Silva
Publication Date: 2024
Format: Doctoral thesis
Language: English
Source: Repositório Institucional do FGV (FGV Repositório Digital)
Download full text: https://hdl.handle.net/10438/36338
Summary: We address the problem of sampling from an unnormalized distribution defined on a compositional space, i.e., a continuous or discrete set whose elements can be sequentially constructed from an initial state through the application of simple actions. This definition accommodates the space of (directed acyclic) graphs, natural language sentences of bounded size, and Euclidean n-spaces, among others, and is at the core of many applications in (Bayesian) statistics and machine learning. In particular, we focus on Generative Flow Networks (GFlowNets), a family of amortized samplers that cast sampling as finding a flow assignment in a flow network such that the total flow reaching a sink node equals that node's unnormalized probability. Despite their remarkable success in drug discovery, structure learning, and natural language processing, important questions regarding the scalability, generalization, and limitations of these models remain largely underexplored in the literature. In view of this, this thesis contributes both methodological and theoretical advances toward better usability and understanding of GFlowNets. From a computational perspective, we design novel algorithms for the non-localized training of GFlowNets. This enables learning these models in a streaming and distributed fashion, which is crucial for managing ever-increasing data sizes and exploiting the architecture of modern computer clusters. The central idea of our methods is to break up the flow assignment problem into easier subproblems solved by separately trained GFlowNets; once trained, these models are aggregated by a global GFlowNet. To do so efficiently, we also revisit the relationship between GFlowNets and variational inference and devise low-variance estimators for the gradients of their learning objectives, achieving faster training convergence. Overall, our experiments show that our non-localized procedures often lead to better approximations in a shorter time relative to a centralized, monolithic GFlowNet. Additionally, we demonstrate that the models corresponding to the global minimizers of the proposed surrogate learning objectives sample in proportion to the unnormalized target. This fact raises the questions of when a GFlowNet can reach such a global minimum and how close a trained model is to it. Towards answering them, we first present a family of discrete distributions that cannot be approximated by a GFlowNet whose flow functions are parameterized by 1-WL graph neural networks. Then, we develop a computationally amenable metric to probe the distributional accuracy of GFlowNets. Finally, as GFlowNets rely exclusively on a subgraph of the (potentially huge) flow network to learn a flow assignment, we argue that generalization plays a critical role in their success and derive the first non-vacuous (PAC-Bayesian) statistical guarantees for these models.
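As background for the flow formulation mentioned in the summary, the two standard balance conditions from the GFlowNet literature can be stated explicitly (this is general background notation, not a construction specific to the thesis). Writing F for a flow on a pointed DAG with source s_0, terminal states x, and reward R(x) > 0:

    \sum_{u : (u \to s)} F(u \to s) = \sum_{w : (s \to w)} F(s \to w)    for every interior state s (flow matching),
    \sum_{u : (u \to x)} F(u \to x) = R(x)                               at every terminal state x (reward matching).

Sampling each transition with probability P_F(s' | s) = F(s \to s') / \sum_{w} F(s \to w) then yields terminal states with probability R(x)/Z, where Z = \sum_{x'} R(x'). The trajectory-balance rewriting of the same condition, for a complete trajectory \tau = (s_0 \to s_1 \to \dots \to s_n = x), is

    Z_\theta \prod_{t=0}^{n-1} P_F(s_{t+1} \mid s_t; \theta) = R(x) \prod_{t=0}^{n-1} P_B(s_t \mid s_{t+1}; \theta),

and training minimizes the squared difference of the logarithms of the two sides over sampled trajectories. A minimal sketch of this objective in code follows, on a toy space of fixed-length bitstrings built one bit at a time; the state graph there is a tree, so each terminal state has a unique trajectory and the backward term drops out of the loss. Every name, reward, and hyperparameter below is illustrative and not taken from the thesis:

# Illustrative sketch (not the thesis's code): training a GFlowNet with the
# trajectory-balance (TB) objective on fixed-length bitstrings.
import torch
import torch.nn as nn

HORIZON = 4  # build strings of 4 bits; the 16 bitstrings are the terminal states

class TBGFlowNet(nn.Module):
    def __init__(self, horizon: int):
        super().__init__()
        self.horizon = horizon
        # log Z is a learned scalar; the policy maps a (padded) partial string
        # to logits over the two actions "append 0" / "append 1".
        self.log_z = nn.Parameter(torch.zeros(1))
        self.policy = nn.Sequential(
            nn.Linear(horizon, 32), nn.ReLU(), nn.Linear(32, 2)
        )

    def forward(self, batch_size: int, reward_fn):
        state = torch.zeros(batch_size, self.horizon)  # 0 = unset position
        log_pf = torch.zeros(batch_size)               # running sum of log P_F
        for t in range(self.horizon):
            logits = self.policy(state)
            dist = torch.distributions.Categorical(logits=logits)
            action = dist.sample()
            log_pf = log_pf + dist.log_prob(action)
            state = state.clone()
            state[:, t] = action.float() * 2 - 1       # encode the bit as +/-1
        # Bits are appended in a fixed order, so each terminal state has a
        # unique trajectory, log P_B = 0, and TB reduces to:
        log_r = torch.log(reward_fn(state))
        return (self.log_z + log_pf - log_r).pow(2).mean()

# Usage: a toy reward proportional to the number of 1-bits, plus a constant.
model = TBGFlowNet(HORIZON)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
reward = lambda s: (s > 0).float().sum(dim=1) + 0.1
for step in range(200):
    opt.zero_grad()
    loss = model(batch_size=64, reward_fn=reward)
    loss.backward()
    opt.step()

On general DAG-structured spaces, where a terminal state can be reached along several trajectories, a parameterized (or fixed uniform) backward policy supplies the P_B factor in the loss.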
id FGV_bdfd9cda58ebf7412c761a209a7e40f3
oai_identifier_str oai:repositorio.fgv.br:10438/36338
network_acronym_str FGV
network_name_str Repositório Institucional do FGV (FGV Repositório Digital)
repository_id_str 3974
spelling Henrique, Tiago da Silva; Escolas::EMAp; Cozman, Fabio Gagliardi; Laber, Eduardo Sany; Oliveira, Roberto Imbuzeiro; Mesquita, Diego; 2025-01-14T13:08:22Z; 2025-01-14T13:08:22Z; 2024-12-20; https://hdl.handle.net/10438/36338
The works in this thesis were funded by the Fundação Carlos Chagas Filho de Amparo à Pesquisa do Estado do Rio de Janeiro FAPERJ (SEI-260003/000709/2023), the São Paulo Research Foundation FAPESP (2023/00815-6), and the Conselho Nacional de Desenvolvimento Científico e Tecnológico CNPq (404336/2023-0).
dc.title.eng.fl_str_mv Streaming, distributed, and asynchronous amortized inference
title Streaming, distributed, and asynchronous amortized inference
spellingShingle Streaming, distributed, and asynchronous amortized inference
Henrique, Tiago da Silva
Inference
Distributed
GFlowNets
Inferência bayesiana
Aprendizado profundo geométrico
Métodos distribuídos
Matemática
title_short Streaming, distributed, and asynchronous amortized inference
title_full Streaming, distributed, and asynchronous amortized inference
title_fullStr Streaming, distributed, and asynchronous amortized inference
title_full_unstemmed Streaming, distributed, and asynchronous amortized inference
title_sort Streaming, distributed, and asynchronous amortized inference
author Henrique, Tiago da Silva
author_facet Henrique, Tiago da Silva
author_role author
dc.contributor.unidadefgv.por.fl_str_mv Escolas::EMAp
dc.contributor.member.none.fl_str_mv Cozman, Fabio Gagliardi
Laber, Eduardo Sany
Oliveira, Roberto Imbuzeiro
dc.contributor.author.fl_str_mv Henrique, Tiago da Silva
dc.contributor.advisor1.fl_str_mv Mesquita, Diego
contributor_str_mv Mesquita, Diego
dc.subject.eng.fl_str_mv Inference
Distributed
GFlowNets
topic Inference
Distributed
GFlowNets
Inferência bayesiana
Aprendizado profundo geométrico
Métodos distribuídos
Matemática
dc.subject.por.fl_str_mv Inferência bayesiana
Aprendizado profundo geométrico
Métodos distribuídos
dc.subject.area.por.fl_str_mv Matemática
description We address the problem of sampling from an unnormalized distribution defined on a compositional space, i.e., a continuous or discrete set whose elements can be sequentially constructed from an initial state through the application of simple actions. This definition accommodates the space of (directed acyclic) graphs, natural language sentences of bounded size, and Euclidean n-spaces, among others, and is at the core of many applications in (Bayesian) statistics and machine learning. In particular, we focus on Generative Flow Networks (GFlowNets), a family of amortized samplers that cast sampling as finding a flow assignment in a flow network such that the total flow reaching a sink node equals that node's unnormalized probability. Despite their remarkable success in drug discovery, structure learning, and natural language processing, important questions regarding the scalability, generalization, and limitations of these models remain largely underexplored in the literature. In view of this, this thesis contributes both methodological and theoretical advances toward better usability and understanding of GFlowNets. From a computational perspective, we design novel algorithms for the non-localized training of GFlowNets. This enables learning these models in a streaming and distributed fashion, which is crucial for managing ever-increasing data sizes and exploiting the architecture of modern computer clusters. The central idea of our methods is to break up the flow assignment problem into easier subproblems solved by separately trained GFlowNets; once trained, these models are aggregated by a global GFlowNet. To do so efficiently, we also revisit the relationship between GFlowNets and variational inference and devise low-variance estimators for the gradients of their learning objectives, achieving faster training convergence. Overall, our experiments show that our non-localized procedures often lead to better approximations in a shorter time relative to a centralized, monolithic GFlowNet. Additionally, we demonstrate that the models corresponding to the global minimizers of the proposed surrogate learning objectives sample in proportion to the unnormalized target. This fact raises the questions of when a GFlowNet can reach such a global minimum and how close a trained model is to it. Towards answering them, we first present a family of discrete distributions that cannot be approximated by a GFlowNet whose flow functions are parameterized by 1-WL graph neural networks. Then, we develop a computationally amenable metric to probe the distributional accuracy of GFlowNets. Finally, as GFlowNets rely exclusively on a subgraph of the (potentially huge) flow network to learn a flow assignment, we argue that generalization plays a critical role in their success and derive the first non-vacuous (PAC-Bayesian) statistical guarantees for these models.
publishDate 2024
dc.date.issued.fl_str_mv 2024-12-20
dc.date.accessioned.fl_str_mv 2025-01-14T13:08:22Z
dc.date.available.fl_str_mv 2025-01-14T13:08:22Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/doctoralThesis
format doctoralThesis
status_str publishedVersion
dc.identifier.uri.fl_str_mv https://hdl.handle.net/10438/36338
url https://hdl.handle.net/10438/36338
dc.language.iso.fl_str_mv eng
language eng
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.source.none.fl_str_mv reponame:Repositório Institucional do FGV (FGV Repositório Digital)
instname:Fundação Getulio Vargas (FGV)
instacron:FGV
instname_str Fundação Getulio Vargas (FGV)
instacron_str FGV
institution FGV
reponame_str Repositório Institucional do FGV (FGV Repositório Digital)
collection Repositório Institucional do FGV (FGV Repositório Digital)
bitstream.url.fl_str_mv https://repositorio.fgv.br/bitstreams/ce9d924a-9554-4e37-bff3-d8f4d6dbb833/download
https://repositorio.fgv.br/bitstreams/387e2a1d-f8c0-412e-9e7e-d931dc3a4836/download
https://repositorio.fgv.br/bitstreams/cd1e23ec-4cac-410f-830f-3050b7db6529/download
https://repositorio.fgv.br/bitstreams/5e65dc3a-2398-404f-af40-9b146a6a1de4/download
bitstream.checksum.fl_str_mv 2df1490eba01b7f081207541b7f31960
2a4b67231f701c416a809246e7a10077
8556190c4bb53076603b0489a36026e8
c5b53c3991c54830946900b533a13dab
bitstream.checksumAlgorithm.fl_str_mv MD5
MD5
MD5
MD5
repository.name.fl_str_mv Repositório Institucional do FGV (FGV Repositório Digital) - Fundação Getulio Vargas (FGV)
repository.mail.fl_str_mv
_version_ 1827846403105226752