Algorithms for variable selection in non-homogeneous hidden Markov models
Year of defense: | 2024 |
---|---|
Main author: | |
Advisor: | |
Defense committee: | |
Document type: | Thesis |
Access type: | Open access |
Language: | eng |
Defending institution: | Universidade Federal de São Carlos, Câmpus São Carlos |
Graduate program: | Programa Interinstitucional de Pós-Graduação em Estatística - PIPGEs |
Department: | Not provided by the institution |
Country: | Not provided by the institution |
Keywords in Portuguese: | |
Keywords in English: | |
CNPq knowledge area: | |
Access link: | https://repositorio.ufscar.br/handle/20.500.14289/21170 |
Abstract: | Non-homogeneous hidden Markov models are a statistical paradigm in which a sequence of non-observable states generates a sequence of observable values. Transitions between the non-observable states are controlled by transition coefficients and covariates. Because variable selection has hardly been explored for this model, the central purpose of this thesis is to propose variable selection methods that improve the model's predictive performance. We propose two versions of the LASSO for the non-homogeneous hidden Markov model: the Global LASSO and the Individual LASSO. The proposed methods are tested in a simulation study to analyze their performance under controlled conditions. The evaluation metrics are the mean squared prediction error, the prediction accuracy for the non-observable state sequence, and coefficient shrinkage efficiency. In terms of mean squared prediction error, the proposals consistently show better predictive performance than ARIMA and penalized linear regression. They also perform very well when predicting the non-observable state sequence that generates the observable values. In terms of coefficient shrinkage efficiency, the proposals perform excellently in all simulation scenarios when selecting variables via coefficient shrinkage. This gain in predictive performance, together with the ability to perform variable selection, makes the proposed methods an attractive option for use with the model. Finally, the methods are applied to characterize and predict the rainfall regime in the city of São Carlos, Brazil, showing good performance both in predicting rainfall quantities in the region and in selecting relevant covariates for the model. |
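
The abstract states that transitions between hidden states are controlled by transition coefficients and covariates, and that selection works through coefficient shrinkage. The sketch below illustrates one common way to express that idea: a multinomial logit (softmax) link from covariates to transition probabilities, plus a generic L1 penalty. This is an assumption for illustration only. The abstract does not give the thesis's exact parametrization, nor how the Global LASSO and Individual LASSO apply the penalty, and the names `transition_matrix`, `l1_penalty`, `beta`, `x` and `lam` are hypothetical.

```python
import numpy as np


def transition_matrix(beta, x):
    """Covariate-dependent transition matrix for a single time step.

    beta: array of shape (K, K, p); beta[i, j] holds the (hypothetical)
          coefficients driving the transition from state i to state j.
    x:    array of shape (p,), the covariates observed at this time step.
    Assumes a multinomial logit link, which is a common choice but is not
    confirmed by the abstract.
    """
    logits = beta @ x                             # shape (K, K)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)  # rows sum to 1


def l1_penalty(beta, lam):
    """Generic LASSO penalty: lam times the sum of absolute coefficients."""
    return lam * np.abs(beta).sum()


rng = np.random.default_rng(0)
K, p = 2, 3                                       # 2 hidden states, 3 covariates
beta = rng.normal(size=(K, K, p))
beta[:, 0, :] = 0.0                               # reference destination state
x = np.array([1.0, 0.5, -1.2])

P = transition_matrix(beta, x)                    # one row per origin state
penalty = l1_penalty(beta, lam=0.1)
```

Because the transition matrix is recomputed from `x` at every time step, a coefficient shrunk exactly to zero removes the corresponding covariate's influence on that transition, which is how an L1 penalty can act as a variable selector in this setting.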