Influential diagnostics for location parameter within GAMLSS

Bibliographic details
Year of defense: 2021
Main author: SILVA, Lucas Araújo da
Advisor: DE BASTIANI, Fernanda
Examining committee: Not provided by the institution
Document type: Master's thesis
Access type: Embargoed access
Language: eng
Degree-granting institution: Universidade Federal de Pernambuco
Graduate program: Programa de Pos Graduacao em Estatistica
Department: Not provided by the institution
Country: Brazil
Keywords in Portuguese:
Access link: https://repositorio.ufpe.br/handle/123456789/40169
Abstract: Modelling the functional relationship between a response variable and a set of explanatory variables is at the core of regression problems in statistics, and many models have been proposed for this purpose. More recently, generalized additive models for location, scale and shape (GAMLSS) have gained attention because they generalize other popular models, such as the linear model, generalized linear models, semiparametric models and generalized additive models, and allow any parametric distribution to be used for the response variable. In addition, all distribution parameters can be modelled with linear, non-linear or smoothing functions of the explanatory variables. Various influence diagnostic tools have been proposed in the literature; this work presents some of them and proposes techniques to detect possibly influential observations in the GAMLSS model class. Several measures of influence are considered for simulated data and applications: the generalized Cook distance, the likelihood distance, the adjusted Peña measure, differences in the generalized Akaike information criterion, and the Kim measure. Algorithms are also proposed to obtain reference values for these measures using the bootstrap, adapting to the other measures the procedure suggested by Kim, Park and Kim (2002). The study is limited to situations where the location parameter (in general the mean) of the response variable is modelled, with or without additive smoothing terms. When smoothing terms are present, univariate penalized splines are used as the smoother, since the Peña and Kim measures require the smoothing matrix, which varies with the smoothed covariate and the smoother in question. The simulation studies consider several scenarios with a number of relevant distributions, both continuous and discrete, and several sample sizes. An analysis of real data illustrates the proposed methodology.
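To make the case-deletion idea behind two of the measures named in the abstract more concrete, the following is a minimal sketch under strong simplifying assumptions. It uses a Gaussian linear model (the simplest special case of GAMLSS, with only the mean modelled) rather than a fitted GAMLSS, and it takes the 95% quantile of the per-sample maximum under a parametric bootstrap as the reference value. That reference rule, the simulated data, and all function names (`fit_ols`, `loglik`, `case_deletion_measures`, `bootstrap_reference`) are illustrative assumptions, not the algorithms of the dissertation or of Kim, Park and Kim (2002).

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares fit; returns coefficients and the ML estimate of sigma^2."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return beta, resid @ resid / len(y)


def loglik(X, y, beta, sigma2):
    """Gaussian log-likelihood of the sample (X, y) at (beta, sigma2)."""
    resid = y - X @ beta
    return -0.5 * len(y) * np.log(2 * np.pi * sigma2) - resid @ resid / (2 * sigma2)


def case_deletion_measures(X, y):
    """Cook's distance and likelihood distance, refitting without each case."""
    n, p = X.shape
    beta, sigma2 = fit_ols(X, y)
    ll_full = loglik(X, y, beta, sigma2)
    s2 = sigma2 * n / (n - p)  # unbiased variance estimate used by Cook's distance
    xtx = X.T @ X
    cook, ld = np.empty(n), np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        beta_i, sigma2_i = fit_ols(X[keep], y[keep])
        diff = beta - beta_i
        cook[i] = diff @ xtx @ diff / (p * s2)
        # Full-sample log-likelihood evaluated at the case-deleted estimates.
        ld[i] = 2.0 * (ll_full - loglik(X, y, beta_i, sigma2_i))
    return cook, ld


def bootstrap_reference(X, y, n_boot=200, level=0.95, seed=0):
    """Parametric-bootstrap reference values: quantile of the per-sample maximum."""
    rng = np.random.default_rng(seed)
    beta, sigma2 = fit_ols(X, y)
    max_cook, max_ld = np.empty(n_boot), np.empty(n_boot)
    for b in range(n_boot):
        # Simulate a response from the fitted model (assumed rule, see lead-in).
        y_star = X @ beta + rng.normal(scale=np.sqrt(sigma2), size=len(y))
        cook_b, ld_b = case_deletion_measures(X, y_star)
        max_cook[b], max_ld[b] = cook_b.max(), ld_b.max()
    return np.quantile(max_cook, level), np.quantile(max_ld, level)


# Simulated data (hypothetical) with one contaminated response at index 0.
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 1.0, size=40)
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x + rng.normal(scale=0.3, size=40)
y[0] += 10.0

cook, ld = case_deletion_measures(X, y)
ref_cook, ref_ld = bootstrap_reference(X, y)
print("most influential case:", int(np.argmax(cook)))
print("cases above the Cook reference:", np.flatnonzero(cook > ref_cook))
print("cases above the LD reference:", np.flatnonzero(ld > ref_ld))
```

Refitting the model once per deleted case is the most direct way to express both measures, at the cost of n fits per sample. For a model with smoothing terms, the fit and the log-likelihood would have to be replaced by the corresponding GAMLSS quantities, which this sketch does not attempt.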