An approach based on empirical similarity for the Kaplan-Meier estimator

Bibliographic details
Year of defense: 2024
Main author: Beneyto, Isabel de Castro
Advisor: Not provided by the institution
Defense committee: Not provided by the institution
Document type: Dissertation
Access type: Open access
Language: por
Defending institution: Not provided by the institution
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese:
Access link: http://repositorio.ufc.br/handle/riufc/79198
Abstract: The Kaplan-Meier estimator is widely used to estimate the survival curve of a population while accounting for censored data. Although common in survival studies, especially in clinical and epidemiological research, the estimator does not directly take covariates into account in its predictions. In this work, we propose an adaptation of the Kaplan-Meier estimator, called the similarity-based Kaplan-Meier estimator. The adaptation adds a similarity quantifier to the estimator's standard formula, which allows weights to be assigned to the covariates. These weights are estimated from the data by empirical maximum likelihood, using a predefined similarity function. We demonstrate the use of the method to predict conditional survival curves through simulations in different settings and scenarios. We analyzed the estimated weights, the variability of the estimated failure times (measured by their standard deviation), and the estimator's performance on two evaluation metrics: the concordance index (CI) and the Brier score (BS). The simulations used two forms of the similarity function: exponential (EX) and fractional (FR). We evaluated the effect of normalizing the estimated weights, of choosing the mean or the median time to estimate the failure time from the survival curve, and of using different distance measures in the similarity function. To assess the estimator's performance in different contexts, we estimated the weights on data samples of varying size and censoring rate. Finally, we compared the proposed estimator directly with several statistical and machine learning methods that are standard references in survival analysis. The results indicate that the estimator performs competitively and consistently.
Compared with statistical methods, the proposed estimator stands out because it assumes neither a specific distribution nor proportional hazards. Compared with machine learning algorithms, it offers a simpler interpretation of the estimated parameters and avoids the overfitting often associated with overly complex models.
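The abstract does not state the estimator's formula, so the following is only a minimal sketch of how a similarity-weighted Kaplan-Meier curve might be computed. It rests on several assumptions that do not come from the record: the weighted squared Euclidean distance, the exponential form exp(-d) and the fractional form 1/(1 + d) for the similarity function, and all function and parameter names (`similarity_km`, `weights`, and so on) are hypothetical. The covariate weights are taken as given here, whereas the dissertation estimates them by empirical maximum likelihood.

```python
import math


def weighted_distance(x, z, weights):
    # Assumed distance: weighted squared Euclidean distance between covariate vectors.
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, z))


def exponential_similarity(x, z, weights):
    # Assumed exponential (EX) form: s = exp(-d).
    return math.exp(-weighted_distance(x, z, weights))


def fractional_similarity(x, z, weights):
    # Assumed fractional (FR) form: s = 1 / (1 + d).
    return 1.0 / (1.0 + weighted_distance(x, z, weights))


def similarity_km(x_new, covariates, times, events, weights,
                  similarity=exponential_similarity):
    """Return [(t, S(t | x_new))] at each distinct observed failure time.

    Each subject counts in the at-risk and failure totals in proportion to
    its similarity to x_new, instead of counting as 1 as in the standard
    Kaplan-Meier product.
    """
    s = [similarity(x_new, x_i, weights) for x_i in covariates]
    failure_times = sorted({t for t, e in zip(times, events) if e})
    surv = 1.0
    curve = []
    for t in failure_times:
        at_risk = sum(s_i for s_i, t_i in zip(s, times) if t_i >= t)
        failed = sum(s_i for s_i, t_i, e_i in zip(s, times, events)
                     if t_i == t and e_i)
        if at_risk > 0:  # guard against numerical underflow of the similarities
            surv *= 1.0 - failed / at_risk
        curve.append((t, surv))
    return curve


# Toy data: one covariate, four subjects, the second one censored.
covariates = [[0.0], [1.0], [2.0], [3.0]]
times = [1, 2, 3, 4]
events = [1, 0, 1, 1]

# With all weights equal to zero every similarity is 1, so the sketch
# reduces to the standard Kaplan-Meier estimator.
baseline = similarity_km([0.0], covariates, times, events, weights=[0.0])

# With a positive weight, subjects closer to x_new dominate the curve.
conditional = similarity_km([0.0], covariates, times, events, weights=[1.0],
                            similarity=fractional_similarity)
```

Setting every weight to zero is a convenient sanity check under these assumptions, since the conditional curve should then match the unweighted Kaplan-Meier curve for any `x_new`.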