Streaming, distributed, and asynchronous amortized inference
Main Author: | Henrique, Tiago da Silva |
---|---|
Publication Date: | 2024 |
Format: | Doctoral thesis |
Language: | eng |
Source: | Repositório Institucional do FGV (FGV Repositório Digital) |
Download full: | https://hdl.handle.net/10438/36338 |
Summary: | We address the problem of sampling from an unnormalized distribution defined on a compositional space, i.e., a continuous or discrete set whose elements can be sequentially constructed from an initial state through the application of simple actions. This definition accommodates the space of (directed acyclic) graphs, natural language sentences of bounded size, and Euclidean n-spaces, among others, and is at the core of many applications in (Bayesian) statistics and machine learning. In particular, we focus on Generative Flow Networks (GFlowNets), a family of amortized samplers that cast the problem of sampling as finding a flow assignment in a flow network such that the total flow reaching a sink node equals that node's unnormalized probability. Despite their remarkable success in drug discovery, structure learning, and natural language processing, important questions regarding the scalability, generalization, and limitations of these models remain largely underexplored in the literature. In view of this, this thesis contributes both methodological and theoretical advances toward the better usability and understanding of GFlowNets. From a computational perspective, we design novel algorithms for the non-localized training of GFlowNets. This enables learning these models in a streaming and distributed fashion, which is crucial for managing ever-increasing data sizes and for exploiting the architecture of modern computer clusters. The central idea of our methods is to break the flow-assignment problem into easier subproblems solved by separately trained GFlowNets. Once trained, these models are aggregated by a global GFlowNet. To do so efficiently, we also revisit the relationship between GFlowNets and variational inference and devise low-variance estimators for the gradients of their learning objectives to achieve faster training convergence. Overall, our experiments show that our non-localized procedures often lead to better approximations in less time relative to a centralized, monolithic GFlowNet. Additionally, we demonstrate that the models corresponding to the global minimizers of the proposed surrogate learning objectives sample in proportion to the unnormalized target. This fact raises the questions of when a GFlowNet can reach such a global minimum and how close a trained model is to it. Toward answering them, we first present a family of discrete distributions that cannot be approximated by a GFlowNet when the flow functions are parameterized by 1-WL graph neural networks. Then, we develop a computationally amenable metric to probe the distributional accuracy of GFlowNets. Finally, as GFlowNets rely exclusively on a subgraph of the (potentially huge) flow network to learn a flow assignment, we argue that generalization plays a critical role in their success and derive the first non-vacuous (PAC-Bayesian) statistical guarantees for these models. |
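The flow-assignment condition in the summary can be made concrete with a short worked equation. This is a sketch in the notation conventional in the GFlowNet literature, not taken from the record itself: $F$ denotes edge flow, $R$ the unnormalized target, $s_f$ the sink, and $Z$ the total flow. For every interior state $s$, inflow must equal outflow, and the flow absorbed at the sink from each terminal object $x$ must equal its unnormalized probability:

$$
\sum_{s':\,s'\to s} F(s'\to s) \;=\; \sum_{s'':\,s\to s''} F(s\to s''),
\qquad
F(x\to s_f) = R(x).
$$

A policy that follows each outgoing edge with probability proportional to its flow then samples $x$ with probability $R(x)/Z$, where $Z$ is the flow leaving the initial state.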
id |
FGV_bdfd9cda58ebf7412c761a209a7e40f3 |
---|---|
oai_identifier_str |
oai:repositorio.fgv.br:10438/36338 |
network_acronym_str |
FGV |
network_name_str |
Repositório Institucional do FGV (FGV Repositório Digital) |
repository_id_str |
3974 |
spelling |
Author: Henrique, Tiago da Silva
Unit: Escolas::EMAp
Committee: Cozman, Fabio Gagliardi; Laber, Eduardo Sany; Oliveira, Roberto Imbuzeiro
Advisor: Mesquita, Diego
Accessioned: 2025-01-14T13:08:22Z | Available: 2025-01-14T13:08:22Z | Issued: 2024-12-20
URI: https://hdl.handle.net/10438/36338
Abstract: provided in English and in Portuguese (the Portuguese text is a direct translation); see the Summary above for the full English text.
Funding: The works in this thesis were funded by the Fundação Carlos Chagas Filho de Amparo à Pesquisa do Estado do Rio de Janeiro, FAPERJ (SEI-260003/000709/2023), the São Paulo Research Foundation, FAPESP (2023/00815-6), and the Conselho Nacional de Desenvolvimento Científico e Tecnológico, CNPq (404336/2023-0).
Language: eng
Subjects: Inference; Distributed; GFlowNets; Inferência bayesiana; Aprendizado profundo geométrico; Métodos distribuídos; Matemática
Title: Streaming, distributed, and asynchronous amortized inference
Type: info:eu-repo/semantics/publishedVersion | info:eu-repo/semantics/doctoralThesis | info:eu-repo/semantics/openAccess
Source: reponame: Repositório Institucional do FGV (FGV Repositório Digital) | instname: Fundação Getulio Vargas (FGV) | instacron: FGV
Bitstreams:
- ORIGINAL: _DSc_thesis__Streaming__Distributed__and_Asynchronous_Generative_Flow_Networks (2).pdf (application/pdf, 5448562 bytes), https://repositorio.fgv.br/bitstreams/ce9d924a-9554-4e37-bff3-d8f4d6dbb833/download, MD5 2df1490eba01b7f081207541b7f31960
- LICENSE: license.txt (text/plain; charset=utf-8, 5112 bytes), https://repositorio.fgv.br/bitstreams/387e2a1d-f8c0-412e-9e7e-d931dc3a4836/download, MD5 2a4b67231f701c416a809246e7a10077
- TEXT: extracted text (text/plain, 102528 bytes), https://repositorio.fgv.br/bitstreams/cd1e23ec-4cac-410f-830f-3050b7db6529/download, MD5 8556190c4bb53076603b0489a36026e8
- THUMBNAIL: generated thumbnail (image/jpeg, 2761 bytes), https://repositorio.fgv.br/bitstreams/5e65dc3a-2398-404f-af40-9b146a6a1de4/download, MD5 c5b53c3991c54830946900b533a13dab
Record: 10438/36338, open access, oai:repositorio.fgv.br:10438/36338, last updated 2025-01-14T17:00:44, OAI-PMH endpoint http://bibliotecadigital.fgv.br/dspace-oai/request, opendoar:3974, Repositório Institucional do FGV (FGV Repositório Digital) - Fundação Getulio Vargas (FGV)
dc.title.eng.fl_str_mv |
Streaming, distributed, and asynchronous amortized inference |
title |
Streaming, distributed, and asynchronous amortized inference |
spellingShingle |
Streaming, distributed, and asynchronous amortized inference; Henrique, Tiago da Silva; Inference; Distributed; GFlowNets; Inferência bayesiana; Aprendizado profundo geométrico; Métodos distribuídos; Matemática
title_short |
Streaming, distributed, and asynchronous amortized inference |
title_full |
Streaming, distributed, and asynchronous amortized inference |
title_fullStr |
Streaming, distributed, and asynchronous amortized inference |
title_full_unstemmed |
Streaming, distributed, and asynchronous amortized inference |
title_sort |
Streaming, distributed, and asynchronous amortized inference |
author |
Henrique, Tiago da Silva |
author_facet |
Henrique, Tiago da Silva |
author_role |
author |
dc.contributor.unidadefgv.por.fl_str_mv |
Escolas::EMAp |
dc.contributor.member.none.fl_str_mv |
Cozman, Fabio Gagliardi; Laber, Eduardo Sany; Oliveira, Roberto Imbuzeiro
dc.contributor.author.fl_str_mv |
Henrique, Tiago da Silva |
dc.contributor.advisor1.fl_str_mv |
Mesquita, Diego |
contributor_str_mv |
Mesquita, Diego |
dc.subject.eng.fl_str_mv |
Inference; Distributed; GFlowNets
topic |
Inference; Distributed; GFlowNets; Inferência bayesiana; Aprendizado profundo geométrico; Métodos distribuídos; Matemática
dc.subject.por.fl_str_mv |
Inferência bayesiana; Aprendizado profundo geométrico; Métodos distribuídos
dc.subject.area.por.fl_str_mv |
Matemática |
description |
We address the problem of sampling from an unnormalized distribution defined on a compositional space, i.e., a continuous or discrete set whose elements can be sequentially constructed from an initial state through the application of simple actions. This definition accommodates the space of (directed acyclic) graphs, natural language sentences of bounded size, and Euclidean n-spaces, among others, and is at the core of many applications in (Bayesian) statistics and machine learning. In particular, we focus on Generative Flow Networks (GFlowNets), a family of amortized samplers that cast the problem of sampling as finding a flow assignment in a flow network such that the total flow reaching a sink node equals that node's unnormalized probability. Despite their remarkable success in drug discovery, structure learning, and natural language processing, important questions regarding the scalability, generalization, and limitations of these models remain largely underexplored in the literature. In view of this, this thesis contributes both methodological and theoretical advances toward the better usability and understanding of GFlowNets. From a computational perspective, we design novel algorithms for the non-localized training of GFlowNets. This enables learning these models in a streaming and distributed fashion, which is crucial for managing ever-increasing data sizes and for exploiting the architecture of modern computer clusters. The central idea of our methods is to break the flow-assignment problem into easier subproblems solved by separately trained GFlowNets. Once trained, these models are aggregated by a global GFlowNet. To do so efficiently, we also revisit the relationship between GFlowNets and variational inference and devise low-variance estimators for the gradients of their learning objectives to achieve faster training convergence. Overall, our experiments show that our non-localized procedures often lead to better approximations in less time relative to a centralized, monolithic GFlowNet. Additionally, we demonstrate that the models corresponding to the global minimizers of the proposed surrogate learning objectives sample in proportion to the unnormalized target. This fact raises the questions of when a GFlowNet can reach such a global minimum and how close a trained model is to it. Toward answering them, we first present a family of discrete distributions that cannot be approximated by a GFlowNet when the flow functions are parameterized by 1-WL graph neural networks. Then, we develop a computationally amenable metric to probe the distributional accuracy of GFlowNets. Finally, as GFlowNets rely exclusively on a subgraph of the (potentially huge) flow network to learn a flow assignment, we argue that generalization plays a critical role in their success and derive the first non-vacuous (PAC-Bayesian) statistical guarantees for these models. |
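A concrete instance may clarify what a surrogate learning objective whose global minimizers sample in proportion to the unnormalized target looks like. The sketch below shows the trajectory-balance loss, a standard GFlowNet objective with this property; it illustrates the general idea only, not the specific objectives proposed in the thesis, and every name in it (`log_Z`, `log_pf_sum`, `log_pb_sum`, `log_reward`) is hypothetical.

```python
import torch

def trajectory_balance_loss(
    log_Z: torch.Tensor,       # learned scalar: log of the total flow (log partition function)
    log_pf_sum: torch.Tensor,  # sum of log P_F(s_{t+1} | s_t) along each trajectory, shape (batch,)
    log_pb_sum: torch.Tensor,  # sum of log P_B(s_t | s_{t+1}) along each trajectory, shape (batch,)
    log_reward: torch.Tensor,  # log R(x) at each trajectory's terminal state, shape (batch,)
) -> torch.Tensor:
    """Trajectory-balance objective for a GFlowNet.

    At the global minimum, log Z + sum log P_F = log R(x) + sum log P_B holds
    for every complete trajectory, which implies the forward policy samples
    terminal states x with probability R(x) / Z.
    """
    residual = log_Z + log_pf_sum - log_reward - log_pb_sum
    return (residual ** 2).mean()
```

Minimizing the squared residual jointly over the policy parameters and `log_Z` drives every sampled trajectory toward balance; the gradient-variance reduction mentioned in the abstract concerns estimators for the gradients of objectives of this general kind.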
publishDate |
2024 |
dc.date.issued.fl_str_mv |
2024-12-20 |
dc.date.accessioned.fl_str_mv |
2025-01-14T13:08:22Z |
dc.date.available.fl_str_mv |
2025-01-14T13:08:22Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/doctoralThesis |
format |
doctoralThesis |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
https://hdl.handle.net/10438/36338 |
url |
https://hdl.handle.net/10438/36338 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.source.none.fl_str_mv |
reponame: Repositório Institucional do FGV (FGV Repositório Digital)
instname: Fundação Getulio Vargas (FGV)
instacron: FGV
instname_str |
Fundação Getulio Vargas (FGV) |
instacron_str |
FGV |
institution |
FGV |
reponame_str |
Repositório Institucional do FGV (FGV Repositório Digital) |
collection |
Repositório Institucional do FGV (FGV Repositório Digital) |
bitstream.url.fl_str_mv |
https://repositorio.fgv.br/bitstreams/ce9d924a-9554-4e37-bff3-d8f4d6dbb833/download
https://repositorio.fgv.br/bitstreams/387e2a1d-f8c0-412e-9e7e-d931dc3a4836/download
https://repositorio.fgv.br/bitstreams/cd1e23ec-4cac-410f-830f-3050b7db6529/download
https://repositorio.fgv.br/bitstreams/5e65dc3a-2398-404f-af40-9b146a6a1de4/download
bitstream.checksum.fl_str_mv |
2df1490eba01b7f081207541b7f31960
2a4b67231f701c416a809246e7a10077
8556190c4bb53076603b0489a36026e8
c5b53c3991c54830946900b533a13dab
bitstream.checksumAlgorithm.fl_str_mv |
MD5 MD5 MD5 MD5 |
repository.name.fl_str_mv |
Repositório Institucional do FGV (FGV Repositório Digital) - Fundação Getulio Vargas (FGV) |
repository.mail.fl_str_mv |
|
_version_ |
1827846403105226752 |