Does updating the critical value affect the performance of the Data Snooping procedure?

Bibliographic details
Year of defense: 2022
Main author: Bonimani, Maria Luisa Silva
Advisor: Not provided by the institution
Examination committee: Not provided by the institution
Document type: Master's thesis
Access type: Open access
Language: por
Institution of defense: Universidade Federal de Uberlândia
Brazil
Programa de Pós-graduação em Agricultura e Informações Geoespaciais
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese:
Access link: https://repositorio.ufu.br/handle/123456789/34337
http://doi.org/10.14393/ufu.di.2022.149
Abstract: In the age of Big Data, detecting outliers in a data set has become one of the most important activities. In Geodesy, Data Snooping is the most widely used procedure for identifying outliers. To control the type I error rate, that is, the rate of false positives, critical values must be obtained using the Monte Carlo method. However, studies so far have been conducted without updating the critical value during the iterative Data Snooping process, even though effective control of the type I error rate requires the critical value to be updated every time an observation is identified as an outlier and removed from the data set. Here we investigate whether updating the critical value interferes with the performance of the Data Snooping procedure, and we calculate the critical value using the Monte Carlo, Artificial Neural Network and Šidák correction methods. For this experiment, we considered a closed leveling network with a maximum correlation between residuals of 41.46%. For significance levels less than or equal to 10% (α' ≤ 10%), the updated critical values do not differ significantly from the non-updated ones: the maximum differences are ΔK_SBPNN = 0.0389 (α = 0.001), ΔK_sid = 0.0507 (α = 0.001) and ΔK_MC = 0.0256 (α = 0.1) for the case of 1 exclusion, and ΔK_SBPNN = 0.1023 (α = 0.001), ΔK_sid = 0.1353 (α = 0.001) and ΔK_MC = 0.0773 (α = 0.001) for the case of 2 exclusions. Updating the critical value also does not cause significant differences in the rates of correct outlier identification, with a maximum ΔP_CI < 0.5%. The experiments therefore showed that updating the critical value has no significant effect on the performance of Data Snooping for significance levels less than or equal to 10% (α' ≤ 10%).
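As a rough illustration of two of the approaches named in the abstract, the sketch below computes a critical value for the maximum absolute test statistic with the Šidák correction and with a simple Monte Carlo simulation, then recomputes it after one and two exclusions. It is not the thesis's implementation. It assumes independent, standard normal test statistics, whereas the network studied has correlated residuals (up to 41.46%), and the 20 observations, the 5% significance level, and all function names are hypothetical choices made for this example.

```python
import random
from statistics import NormalDist


def sidak_critical_value(alpha_family, n_tests):
    """Two-sided critical value for n_tests tests under the Sidak correction.

    Assumption: the n_tests statistics are independent standard normals.
    """
    # Per-test significance level that yields the requested family-wise level.
    alpha_single = 1.0 - (1.0 - alpha_family) ** (1.0 / n_tests)
    return NormalDist().inv_cdf(1.0 - alpha_single / 2.0)


def monte_carlo_critical_value(alpha_family, n_tests, n_trials=20000, seed=0):
    """Empirical (1 - alpha_family) quantile of max |w| over n_tests statistics.

    Assumption: uncorrelated statistics; a real network would need samples
    drawn with the covariance structure of its residuals.
    """
    rng = random.Random(seed)
    maxima = sorted(
        max(abs(rng.gauss(0.0, 1.0)) for _ in range(n_tests))
        for _ in range(n_trials)
    )
    index = min(n_trials - 1, int((1.0 - alpha_family) * n_trials))
    return maxima[index]


# Hypothetical case: 20 observations, then 1 and 2 exclusions.
for n_obs in (20, 19, 18):
    k_sid = sidak_critical_value(0.05, n_obs)
    k_mc = monte_carlo_critical_value(0.05, n_obs)
    print(f"n = {n_obs}: Sidak k = {k_sid:.4f}, Monte Carlo k = {k_mc:.4f}")
```

Under these simplifying assumptions the critical value changes only slightly when the number of tested observations drops by one or two, which is the kind of difference the ΔK values in the abstract quantify for the actual network.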