Bibliographic details
Year of defense: |
2023 |
Main author: |
Santos, Talysson Manoel de Oliveira |
Advisor: |
Not provided by the institution |
Defense committee: |
Not provided by the institution |
Document type: |
Thesis
|
Access type: |
Open access |
Language: |
eng |
Defending institution: |
Biblioteca Digital de Teses e Dissertações da USP
|
Graduate program: |
Not provided by the institution
|
Department: |
Not provided by the institution
|
Country: |
Not provided by the institution
|
Keywords in Portuguese: |
|
Access link: |
https://www.teses.usp.br/teses/disponiveis/18/18153/tde-20092023-105645/
|
Abstract: |
Knowledge discovery in time series datasets is a subject of great interest and importance in academia and industry. For this purpose, a set of theories and computational tools has been proposed and used to extract useful information from time series to assist decision-making in different areas. Among the possibilities, a Bayesian network is a probabilistic graphical model that represents a set of random variables and their conditional statistical dependencies via a directed acyclic graph (DAG). This doctoral research proposes a methodology for dealing with time series based on evolving discrete Dynamic Bayesian Networks (EDBN), which uses an analytical threshold to select directed edges by their occurrence frequency as new datasets are collected. In this proposal, each time a new dataset is collected, the algorithm learns the structure of a DBN using a score metric and the hill-climbing method, and then applies the analytical threshold to select the directed edges between nodes by their occurrence frequency. The developed method smoothly converges to a robust model and constantly adapts to the arrival of new data, obtaining more reliable network models. The discrete model was chosen because it is a non-parametric approach that can suit different data behaviours without manual modifications, i.e., it is totally data-driven. The proposal was evaluated on real time series datasets in two contexts that have received a lot of attention from researchers in recent years: data imputation and forecasting of CO2 emissions during energy generation. Evaluated against widely used imputation methods, the proposed approach proved capable of handling data imputation in time series datasets for data missing completely at random and for data missing not at random. In the context of CO2 emissions forecasting in multi-source power generation systems, real datasets from Belgium, Germany, Portugal, and Spain were used.
The proposed approach proved capable of forecasting CO2 emissions in the systems evaluated in this study. Compared against a traditional DBN that does not evolve its structure over time, the proposal was superior, which highlights a contribution in performance improvement. The proposed method also performed better than other traditional methods. Moreover, the model is computationally efficient, making the proposal a good option for embedding in online applications that deal with time series. |
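The edge-selection idea described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the structure-learning step (score metric plus hill climbing) is replaced here by hand-written, hypothetical edge lists, and a fixed fraction stands in for the analytical threshold, whose formula the abstract does not give. The class name `EdgeFrequencySelector` and the variable names are likewise assumptions made for illustration.

```python
from collections import Counter


class EdgeFrequencySelector:
    """Tracks how often each directed edge is learned across incoming datasets."""

    def __init__(self, threshold=0.5):
        # ASSUMPTION: a fixed fraction stands in for the analytical threshold,
        # which is not specified in the abstract.
        self.threshold = threshold
        self.counts = Counter()
        self.n_batches = 0

    def update(self, learned_edges):
        """Register the edge set learned from one newly collected dataset."""
        self.n_batches += 1
        # set() ensures an edge is counted at most once per dataset.
        self.counts.update(set(learned_edges))

    def selected_edges(self):
        """Return the edges whose occurrence frequency reaches the threshold."""
        if self.n_batches == 0:
            return set()
        return {
            edge
            for edge, count in self.counts.items()
            if count / self.n_batches >= self.threshold
        }


# HYPOTHETICAL edge sets, standing in for the output of a structure learner
# run on three successive datasets.
batches = [
    [("load_t-1", "co2_t"), ("wind_t-1", "co2_t")],
    [("load_t-1", "co2_t")],
    [("load_t-1", "co2_t"), ("solar_t-1", "co2_t")],
]

selector = EdgeFrequencySelector(threshold=0.5)
for edges in batches:
    selector.update(edges)

kept = selector.selected_edges()
print(sorted(kept))
```

In this toy run only the edge that appears in all three datasets survives, while edges seen once out of three fall below the assumed 0.5 cutoff. A real pipeline would feed `update` with the edges returned by a structure learner on each new dataset.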