Algorithms for variable selection in non-homogeneous hidden Markov models

Bibliographic details
Year of defense: 2024
Main author: Sabillón Lee, Gustavo Alexis
Advisor: Zuanetti, Daiane
Defense committee: Not provided by the institution
Document type: Thesis
Access type: Open access
Language: eng
Defense institution: Universidade Federal de São Carlos
São Carlos campus
Graduate program: Programa Interinstitucional de Pós-Graduação em Estatística - PIPGEs
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese:
Keywords in English:
CNPq knowledge area:
Access link: https://repositorio.ufscar.br/handle/20.500.14289/21170
Abstract: Non-homogeneous hidden Markov models are a statistical paradigm in which a sequence of non-observable states generates a sequence of observable values. Transitions between the non-observable states are controlled by transition coefficients and covariates. Because variable selection has hardly been explored for this model, the central purpose of this thesis is to propose variable selection methods that improve the predictive performance of the model. We propose two versions of the LASSO for the non-homogeneous hidden Markov model: the Global LASSO and the Individual LASSO. The proposed methods are tested in a simulation study to analyze their performance under controlled conditions. The evaluation metrics are the mean squared prediction error, the prediction accuracy for the non-observable sequence, and the coefficient shrinkage efficiency. Regarding the mean squared prediction error, the proposals consistently show better predictive performance than ARIMA and penalized linear regression. They show very good performance when predicting the non-observable state sequence that generates the observable values. In terms of coefficient shrinkage efficiency, the proposals show excellent performance in all simulation scenarios when selecting variables via coefficient shrinkage. This gain in predictive performance, together with the ability to perform variable selection, makes the proposed methods an interesting option to apply with the model. Finally, the methods are applied to characterize and predict the rainfall regime in the city of São Carlos, Brazil, displaying good performance both in predicting rainfall quantities in the region and in selecting relevant covariates for the model.