Small and time-efficient distribution-free predictive regions

Bibliographic details
Year of defense: 2023
Main author: Reis, Victor Candido
Advisor: Izbicki, Rafael
Defense committee: Not provided by the institution
Document type: Master's thesis
Access type: Open access
Language: eng
Defending institution: Universidade Federal de São Carlos
Câmpus São Carlos
Graduate program: Programa Interinstitucional de Pós-Graduação em Estatística - PIPGEs
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in English:
CNPq knowledge area:
Access link: https://repositorio.ufscar.br/handle/ufscar/18077
Abstract: Predicting a target variable (response) is the main objective of many studies and investigations. In such scenarios, there are usually other variables, known as covariates, that are more readily available and can assist in the prediction process. Regression and classification methods exploit the statistical associations among the available variables to model the variable of interest. In such modeling, there is significant emphasis on estimating regions that describe the fluctuations of the response, which makes it possible to quantify the uncertainty of point estimates. Conformal prediction methods (VOVK; GAMMERMAN; SHAFER, 2005) are a class of methods that aim to provide regions with general shapes and high-probability guarantees while assuming only exchangeability, an assumption weaker than that of independent and identically distributed data. This allows for extensive use in a variety of applications. New methodologies have been developed to improve the theoretical properties and applicability of the original ideas, with a practical perspective on execution and computational cost. Motivated by this context, this work aims to enrich the class of conformal prediction methods, with a particular focus on regression problems, and proposes a new method that makes better use of the available information, allows greater generality in the shape of the regions, and is more efficient in terms of computational cost. The proposed method was compared with previous works in simulation studies and achieved competitive results.