Evaluation of the lasso and alternative methods in logistic regression models (Avaliação do lasso e métodos alternativos em modelos de regressão logística)

Bibliographic details
Year of defense: 2021
Main author: Alcântara Junior, Gilberto Pereira de
Advisor: Pereira, Gustavo Henrique de Araujo
Defense committee: Not provided by the institution
Document type: Dissertation
Access type: Open access
Language: Portuguese
Degree-granting institution: Universidade Federal de São Carlos
São Carlos campus
Graduate program: Programa Interinstitucional de Pós-Graduação em Estatística - PIPGEs
Department: Not provided by the institution
Country: Not provided by the institution
Access link: https://repositorio.ufscar.br/handle/20.500.14289/14052
Abstract: Logistic regression has long been an important tool not only in statistics but also in fields such as economics, biology, and medicine. In many of these fields it is common to encounter high-dimensional problems, in which the number of candidate covariates exceeds the sample size. Classical estimation methods run into difficulties in high dimensions. One way to address this is penalized estimation, such as the lasso proposed by Tibshirani (1996). Although many studies have applied the lasso to the logistic regression model, none presents a comprehensive simulation study of the method's predictive performance based on a traditional performance measure. Nor are there studies in the literature comparing the performance of other possible combinations built on the lasso, such as using the lasso to select covariates and then estimating by maximum likelihood, or selecting covariates by stepwise and then estimating by the lasso. This work presents an extensive simulation study, under several scenarios, designed to examine and compare the performance of the lasso and three other combined techniques in the logistic regression model. Several example applications in which the logistic model can be used were also analyzed. The results of both the simulations and the applications show that, in terms of predictive power, the lasso either stood out or performed similarly to the other methods in all scenarios presented. As for how closely the fitted model matches the true one, none of the methods considered stands out in all scenarios and across all aspects analyzed.