Streaming, distributed, and asynchronous amortized inference
Main Author: | Henrique, Tiago da Silva |
---|---|
Publication Date: | 2024 |
Format: | Doctoral thesis |
Language: | eng |
Source: | Repositório Institucional do FGV (FGV Repositório Digital) |
Download full: | https://hdl.handle.net/10438/36338 |
Summary: | We address the problem of sampling from an unnormalized distribution defined on a compositional space, i.e., a continuous or discrete set whose elements can be sequentially constructed from an initial state through the application of simple actions. This definition accommodates the space of (directed acyclic) graphs, natural language sentences of bounded size, and Euclidean n-spaces, among others, and is at the core of many applications in (Bayesian) statistics and machine learning. In particular, we focus on Generative Flow Networks (GFlowNets), a family of amortized samplers that cast the problem of sampling as finding a flow assignment in a flow network such that the total flow reaching a sink node equals that node's unnormalized probability. Despite their remarkable success in drug discovery, structure learning, and natural language processing, important questions regarding the scalability, generalization, and limitations of these models remain largely underexplored in the literature. In view of this, this thesis contributes both methodological and theoretical advances toward the better usability and understanding of GFlowNets. From a computational perspective, we design novel algorithms for the non-localized training of GFlowNets. This enables learning these models in a streaming and distributed fashion, which is crucial for managing ever-increasing data sizes and for exploiting the architecture of modern computer clusters. The central idea of our methods is to break the flow-assignment problem into easier subproblems solved by separately trained GFlowNets. Once trained, these models are aggregated by a global GFlowNet. To do so efficiently, we also revisit the relationship between GFlowNets and variational inference and devise low-variance estimators for the gradients of their learning objectives to achieve faster training convergence. Overall, our experiments show that our non-localized procedures often lead to better approximations in less time relative to a centralized, monolithic GFlowNet. Additionally, we demonstrate that the models corresponding to the global minimizers of the proposed surrogate learning objectives sample in proportion to the unnormalized target. This fact raises the questions of when a GFlowNet can reach such a global minimum and how close a trained model is to it. Toward answering them, we first present a family of discrete distributions that cannot be approximated by a GFlowNet when the flow functions are parameterized by 1-WL graph neural networks. Then, we develop a computationally amenable metric to probe the distributional accuracy of GFlowNets. Finally, as GFlowNets rely exclusively on a subgraph of the (potentially huge) flow network to learn a flow assignment, we argue that generalization plays a critical role in their success and derive the first non-vacuous (PAC-Bayesian) statistical guarantees for these models. |
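The flow-assignment condition in the summary can be made concrete with a short worked equation. This is a sketch in the notation conventional in the GFlowNet literature, not taken from the record itself: $F$ denotes edge flow, $R$ the unnormalized target, $s_f$ the sink, and $Z$ the total flow. For every interior state $s$, inflow must equal outflow, and the flow absorbed at the sink from each terminal object $x$ must equal its unnormalized probability:

$$
\sum_{s':\,s'\to s} F(s'\to s) \;=\; \sum_{s'':\,s\to s''} F(s\to s''),
\qquad
F(x\to s_f) = R(x).
$$

A policy that follows each outgoing edge with probability proportional to its flow then samples $x$ with probability $R(x)/Z$, where $Z$ is the flow leaving the initial state.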
id |
FGV_bdfd9cda58ebf7412c761a209a7e40f3 |
---|---|
oai_identifier_str |
oai:repositorio.fgv.br:10438/36338 |
network_acronym_str |
FGV |
network_name_str |
Repositório Institucional do FGV (FGV Repositório Digital) |
repository_id_str |
3974 |
spelling |
Author: Henrique, Tiago da Silva
Unit: Escolas::EMAp
Committee: Cozman, Fabio Gagliardi; Laber, Eduardo Sany; Oliveira, Roberto Imbuzeiro
Advisor: Mesquita, Diego
Accessioned: 2025-01-14T13:08:22Z | Available: 2025-01-14T13:08:22Z | Issued: 2024-12-20
URI: https://hdl.handle.net/10438/36338
Abstract: provided in English and in Portuguese (the Portuguese text is a direct translation); see the Summary above for the full English text.
Funding: The works in this thesis were funded by the Fundação Carlos Chagas Filho de Amparo à Pesquisa do Estado do Rio de Janeiro, FAPERJ (SEI-260003/000709/2023), the São Paulo Research Foundation, FAPESP (2023/00815-6), and the Conselho Nacional de Desenvolvimento Científico e Tecnológico, CNPq (404336/2023-0).
Language: eng
Subjects: Inference; Distributed; GFlowNets; Inferência bayesiana; Aprendizado profundo geométrico; Métodos distribuídos; Matemática
Title: Streaming, distributed, and asynchronous amortized inference
Type: info:eu-repo/semantics/publishedVersion | info:eu-repo/semantics/doctoralThesis | info:eu-repo/semantics/openAccess
Source: reponame: Repositório Institucional do FGV (FGV Repositório Digital) | instname: Fundação Getulio Vargas (FGV) | instacron: FGV
Bitstreams:
- ORIGINAL: _DSc_thesis__Streaming__Distributed__and_Asynchronous_Generative_Flow_Networks (2).pdf (application/pdf, 5448562 bytes), https://repositorio.fgv.br/bitstreams/ce9d924a-9554-4e37-bff3-d8f4d6dbb833/download, MD5 2df1490eba01b7f081207541b7f31960
- LICENSE: license.txt (text/plain; charset=utf-8, 5112 bytes), https://repositorio.fgv.br/bitstreams/387e2a1d-f8c0-412e-9e7e-d931dc3a4836/download, MD5 2a4b67231f701c416a809246e7a10077
- TEXT: extracted text (text/plain, 102528 bytes), https://repositorio.fgv.br/bitstreams/cd1e23ec-4cac-410f-830f-3050b7db6529/download, MD5 8556190c4bb53076603b0489a36026e8
- THUMBNAIL: generated thumbnail (image/jpeg, 2761 bytes), https://repositorio.fgv.br/bitstreams/5e65dc3a-2398-404f-af40-9b146a6a1de4/download, MD5 c5b53c3991c54830946900b533a13dab
Record: 10438/36338, open access, oai:repositorio.fgv.br:10438/36338, last updated 2025-01-14T17:00:44, OAI-PMH endpoint http://bibliotecadigital.fgv.br/dspace-oai/request, opendoar:3974, Repositório Institucional do FGV (FGV Repositório Digital) - Fundação Getulio Vargas (FGV)
dc.title.eng.fl_str_mv |
Streaming, distributed, and asynchronous amortized inference |
title |
Streaming, distributed, and asynchronous amortized inference |
spellingShingle |
Streaming, distributed, and asynchronous amortized inference; Henrique, Tiago da Silva; Inference; Distributed; GFlowNets; Inferência bayesiana; Aprendizado profundo geométrico; Métodos distribuídos; Matemática
title_short |
Streaming, distributed, and asynchronous amortized inference |
title_full |
Streaming, distributed, and asynchronous amortized inference |
title_fullStr |
Streaming, distributed, and asynchronous amortized inference |
title_full_unstemmed |
Streaming, distributed, and asynchronous amortized inference |
title_sort |
Streaming, distributed, and asynchronous amortized inference |
author |
Henrique, Tiago da Silva |
author_facet |
Henrique, Tiago da Silva |
author_role |
author |
dc.contributor.unidadefgv.por.fl_str_mv |
Escolas::EMAp |
dc.contributor.member.none.fl_str_mv |
Cozman, Fabio Gagliardi; Laber, Eduardo Sany; Oliveira, Roberto Imbuzeiro
dc.contributor.author.fl_str_mv |
Henrique, Tiago da Silva |
dc.contributor.advisor1.fl_str_mv |
Mesquita, Diego |
contributor_str_mv |
Mesquita, Diego |
dc.subject.eng.fl_str_mv |
Inference; Distributed; GFlowNets
topic |
Inference; Distributed; GFlowNets; Inferência bayesiana; Aprendizado profundo geométrico; Métodos distribuídos; Matemática
dc.subject.por.fl_str_mv |
Inferência bayesiana; Aprendizado profundo geométrico; Métodos distribuídos
dc.subject.area.por.fl_str_mv |
Matemática |
description |
We address the problem of sampling from an unnormalized distribution defined on a compositional space, i.e., a continuous or discrete set whose elements can be sequentially constructed from an initial state through the application of simple actions. This definition accommodates the space of (directed acyclic) graphs, natural language sentences of bounded size, and Euclidean n-spaces, among others, and is at the core of many applications in (Bayesian) statistics and machine learning. In particular, we focus on Generative Flow Networks (GFlowNets), a family of amortized samplers that cast the problem of sampling as finding a flow assignment in a flow network such that the total flow reaching a sink node equals that node's unnormalized probability. Despite their remarkable success in drug discovery, structure learning, and natural language processing, important questions regarding the scalability, generalization, and limitations of these models remain largely underexplored in the literature. In view of this, this thesis contributes both methodological and theoretical advances toward the better usability and understanding of GFlowNets. From a computational perspective, we design novel algorithms for the non-localized training of GFlowNets. This enables learning these models in a streaming and distributed fashion, which is crucial for managing ever-increasing data sizes and for exploiting the architecture of modern computer clusters. The central idea of our methods is to break the flow-assignment problem into easier subproblems solved by separately trained GFlowNets. Once trained, these models are aggregated by a global GFlowNet. To do so efficiently, we also revisit the relationship between GFlowNets and variational inference and devise low-variance estimators for the gradients of their learning objectives to achieve faster training convergence. Overall, our experiments show that our non-localized procedures often lead to better approximations in less time relative to a centralized, monolithic GFlowNet. Additionally, we demonstrate that the models corresponding to the global minimizers of the proposed surrogate learning objectives sample in proportion to the unnormalized target. This fact raises the questions of when a GFlowNet can reach such a global minimum and how close a trained model is to it. Toward answering them, we first present a family of discrete distributions that cannot be approximated by a GFlowNet when the flow functions are parameterized by 1-WL graph neural networks. Then, we develop a computationally amenable metric to probe the distributional accuracy of GFlowNets. Finally, as GFlowNets rely exclusively on a subgraph of the (potentially huge) flow network to learn a flow assignment, we argue that generalization plays a critical role in their success and derive the first non-vacuous (PAC-Bayesian) statistical guarantees for these models. |
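A concrete instance may clarify what a surrogate learning objective whose global minimizers sample in proportion to the unnormalized target looks like. The sketch below shows the trajectory-balance loss, a standard GFlowNet objective with this property; it illustrates the general idea only, not the specific objectives proposed in the thesis, and every name in it (`log_Z`, `log_pf_sum`, `log_pb_sum`, `log_reward`) is hypothetical.

```python
import torch

def trajectory_balance_loss(
    log_Z: torch.Tensor,       # learned scalar: log of the total flow (log partition function)
    log_pf_sum: torch.Tensor,  # sum of log P_F(s_{t+1} | s_t) along each trajectory, shape (batch,)
    log_pb_sum: torch.Tensor,  # sum of log P_B(s_t | s_{t+1}) along each trajectory, shape (batch,)
    log_reward: torch.Tensor,  # log R(x) at each trajectory's terminal state, shape (batch,)
) -> torch.Tensor:
    """Trajectory-balance objective for a GFlowNet.

    At the global minimum, log Z + sum log P_F = log R(x) + sum log P_B holds
    for every complete trajectory, which implies the forward policy samples
    terminal states x with probability R(x) / Z.
    """
    residual = log_Z + log_pf_sum - log_reward - log_pb_sum
    return (residual ** 2).mean()
```

Minimizing the squared residual jointly over the policy parameters and `log_Z` drives every sampled trajectory toward balance; the gradient-variance reduction mentioned in the abstract concerns estimators for the gradients of objectives of this general kind.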
publishDate |
2024 |
dc.date.issued.fl_str_mv |
2024-12-20 |
dc.date.accessioned.fl_str_mv |
2025-01-14T13:08:22Z |
dc.date.available.fl_str_mv |
2025-01-14T13:08:22Z |
dc.type.status.fl_str_mv |
info:eu-repo/semantics/publishedVersion |
dc.type.driver.fl_str_mv |
info:eu-repo/semantics/doctoralThesis |
format |
doctoralThesis |
status_str |
publishedVersion |
dc.identifier.uri.fl_str_mv |
https://hdl.handle.net/10438/36338 |
url |
https://hdl.handle.net/10438/36338 |
dc.language.iso.fl_str_mv |
eng |
language |
eng |
dc.rights.driver.fl_str_mv |
info:eu-repo/semantics/openAccess |
eu_rights_str_mv |
openAccess |
dc.source.none.fl_str_mv |
reponame: Repositório Institucional do FGV (FGV Repositório Digital)
instname: Fundação Getulio Vargas (FGV)
instacron: FGV
instname_str |
Fundação Getulio Vargas (FGV) |
instacron_str |
FGV |
institution |
FGV |
reponame_str |
Repositório Institucional do FGV (FGV Repositório Digital) |
collection |
Repositório Institucional do FGV (FGV Repositório Digital) |
bitstream.url.fl_str_mv |
https://repositorio.fgv.br/bitstreams/ce9d924a-9554-4e37-bff3-d8f4d6dbb833/download
https://repositorio.fgv.br/bitstreams/387e2a1d-f8c0-412e-9e7e-d931dc3a4836/download
https://repositorio.fgv.br/bitstreams/cd1e23ec-4cac-410f-830f-3050b7db6529/download
https://repositorio.fgv.br/bitstreams/5e65dc3a-2398-404f-af40-9b146a6a1de4/download
bitstream.checksum.fl_str_mv |
2df1490eba01b7f081207541b7f31960
2a4b67231f701c416a809246e7a10077
8556190c4bb53076603b0489a36026e8
c5b53c3991c54830946900b533a13dab
bitstream.checksumAlgorithm.fl_str_mv |
MD5 MD5 MD5 MD5 |
repository.name.fl_str_mv |
Repositório Institucional do FGV (FGV Repositório Digital) - Fundação Getulio Vargas (FGV) |
repository.mail.fl_str_mv |
|
_version_ |
1827846403105226752 |