Bayesian network quantization method and structural learning

Bibliographic details
Year of defense: 2024
Main author: Ribeiro, Rafael Rodrigues Mendes
Advisor: Not informed by the institution
Defense committee: Not informed by the institution
Document type: Thesis
Access type: Open access
Language: eng
Defending institution: Biblioteca Digital de Teses e Dissertações da USP
Graduate program: Not informed by the institution
Department: Not informed by the institution
Country: Not informed by the institution
Keywords in Portuguese:
Access link: https://www.teses.usp.br/teses/disponiveis/18/18153/tde-08032024-101119/
Abstract: Bayesian Networks (BNs) are versatile models for capturing complex relationships and are widely applied across diverse fields. This study focuses on BNs with discrete variables. Modeling quality depends on an adequate volume of data, especially for constructing conditional probability tables (CPTs), and the amount of data required varies with the chosen BN Directed Acyclic Graph (DAG). Structural learning of a BN is an NP-hard problem with a super-exponential DAG search space. This thesis investigates multi-objective optimization in BN structural learning (BNSL) to balance conflicting criteria, using Pareto sets and multi-objective Genetic Algorithms (GAs). To perform BNSL, a parallel GA with automatic parameter adjustment is developed, called Adaptive Genetic Algorithm with Varying Population Size (AGAVaPS). The proposed algorithm is thoroughly tested on several applications as well as on BNSL. AGAVaPS proves well suited to BNSL, outperforming Hill Climbing and Tabu Search on some of the measured metrics. The study also explores the impact of data quantization on the BNSL search space and introduces a quantization method called CPT Limit-Based Quantization (CLBQ), which balances model quality, data fidelity, and structure score. The effectiveness of this method is tested, and its applicability to score-based BNSL is investigated. CLBQ is found to be a good quantization algorithm, choosing quantizations with low mean squared error that model the variables' distributions well, and it is also suitable for use in BNSL.
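To make the trade-off named in the abstract concrete, here is a minimal, hypothetical sketch (not the thesis's CLBQ method): a continuous variable is discretized into k equal-width bins, and the discretization is scored by the mean squared error between the original values and their bin centers. Finer quantization lowers this MSE but enlarges the CPTs, which in turn demand more data; the function names and the equal-width scheme are illustrative assumptions, not taken from the thesis.

```python
def equal_width_quantize(values, k):
    """Assign each value to one of k equal-width bins; return labels and bin centers.

    Illustrative only: CLBQ in the thesis chooses quantizations by balancing
    model quality, data fidelity, and structure score, not by this rule.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    centers = [lo + (i + 0.5) * width for i in range(k)]
    # Clamp the maximum value into the last bin.
    labels = [min(int((v - lo) / width), k - 1) for v in values]
    return labels, centers


def quantization_mse(values, labels, centers):
    """Mean squared error of representing each value by its bin center."""
    return sum((v - centers[b]) ** 2 for v, b in zip(values, labels)) / len(values)


data = [0.0, 1.0, 2.0, 3.0]
for k in (2, 4):
    labels, centers = equal_width_quantize(data, k)
    print(k, quantization_mse(data, labels, centers))
```

Doubling k shrinks the reconstruction MSE but doubles the number of states each CPT row must cover, which is exactly the tension a limit-based method must manage.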