Resampling in neural networks with application to spatial data (Reamostragem em redes neurais com aplicação a dados espaciais)
Year of defense: | 2021 |
---|---|
Lead author: | |
Advisor: | |
Examination committee: | |
Document type: | Dissertation |
Access type: | Open access |
Language: | por |
Defending institution: |
Universidade Federal de Uberlândia
Brazil Programa de Pós-graduação em Agricultura e Informações Geoespaciais |
Graduate program: |
Not provided by the institution
|
Department: |
Not provided by the institution
|
Country: |
Not provided by the institution
|
Keywords in Portuguese: | |
Access link: | https://repositorio.ufu.br/handle/123456789/33033 http://doi.org/10.14393/ufu.di.2021.507 |
Abstract: | In the development of artificial neural networks (ANNs), the available dataset is divided into three subsets: training, validation and testing. However, an important problem arises here: how can we trust the prediction provided by a single ANN? Because of the randomness inherent in the ANN itself (architecture, initialization and training procedure), there is usually no better choice. To capture this intrinsic randomness of the ANN, we present an approach based on the Jackknife method of statistical resampling. The classic Jackknife consists of removing one observation from the available dataset of size n and using the remaining n − 1 samples in the estimation process. This process is repeated for each individual observation, so that in the end there are n estimates from different samples. In the case of neural networks, each individual observation is selected in turn to compose the test set, while the rest of the sample is used for network training, so the number of neural networks equals the size of the available dataset. We extend this idea by replicating the procedure a certain number of times. Because of the random character of the neural network, the predictions vary for the same sampling point, and consequently we can describe the distribution of each individual prediction. The proposed method therefore provides interval predictions instead of the traditional point prediction. The proposed method was applied and tested using hydrogen potential (pH), exchangeable calcium (Ca²⁺) and phosphorus concentration (P) data obtained through the analysis of 118 georeferenced soil points. The results showed that a 60% reduction in the available dataset yields accuracy comparable to that of the full dataset, so a higher field sampling cost would not be necessary.
The resampling method spatially characterizes the points of greater and lesser accuracy and uncertainty. In external validation, i.e., on data that did not take part in the resampling, we observed that the success rate is higher with interval prediction than with average prediction. Although we restrict ourselves to the neural network model, the proposed method can also be extended to other modern statistical tools, such as Kriging and Least Squares Collocation. |
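
The replicated Jackknife procedure described in the abstract can be illustrated with a minimal sketch. It is not the dissertation's implementation: the tiny one-hidden-layer network, its hyperparameters, the synthetic 1-D data and the quantile rule used to build the interval are all assumptions chosen for brevity, and the names `train_tiny_mlp`, `replicated_jackknife` and `prediction_interval` are hypothetical. The sketch only shows the structure of the idea: hold out each point, train on the remaining n − 1, predict the held-out point, and repeat so that each point accumulates a distribution of predictions.

```python
import math
import random
import statistics


def train_tiny_mlp(xs, ys, rng, hidden=4, epochs=200, lr=0.05):
    """Train a one-hidden-layer tanh network on 1-D inputs with plain SGD.

    Illustrative stand-in for "an ANN". Random initialisation and sample
    order come from `rng`, which is the source of run-to-run variability.
    Returns a predict(x) function.
    """
    w1 = [rng.uniform(-1.0, 1.0) for _ in range(hidden)]
    b1 = [rng.uniform(-1.0, 1.0) for _ in range(hidden)]
    w2 = [rng.uniform(-1.0, 1.0) for _ in range(hidden)]
    b2 = 0.0
    order = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            h = [math.tanh(w1[j] * xs[i] + b1[j]) for j in range(hidden)]
            err = sum(w2[j] * h[j] for j in range(hidden)) + b2 - ys[i]
            for j in range(hidden):
                # Gradient of 0.5 * err**2 through the tanh unit,
                # computed before w2[j] is updated.
                grad_h = err * w2[j] * (1.0 - h[j] ** 2)
                w2[j] -= lr * err * h[j]
                w1[j] -= lr * grad_h * xs[i]
                b1[j] -= lr * grad_h
            b2 -= lr * err

    def predict(x):
        return b2 + sum(w2[j] * math.tanh(w1[j] * x + b1[j]) for j in range(hidden))

    return predict


def replicated_jackknife(xs, ys, replicates=5, seed=0):
    """Leave-one-out predictions, repeated `replicates` times.

    Returns a list with one entry per observation; entry i holds the
    `replicates` predictions made for point i by networks that never saw it.
    """
    rng = random.Random(seed)
    n = len(xs)
    preds = [[] for _ in range(n)]
    for _ in range(replicates):
        for i in range(n):
            train_x = xs[:i] + xs[i + 1:]  # the n - 1 remaining samples
            train_y = ys[:i] + ys[i + 1:]
            model = train_tiny_mlp(train_x, train_y, rng)
            preds[i].append(model(xs[i]))
    return preds


def prediction_interval(values, coverage=0.9):
    """Mean and an empirical quantile interval (assumed rule, widened to
    the enclosing order statistics)."""
    ordered = sorted(values)
    k = len(ordered)
    tail = (1.0 - coverage) / 2.0
    lo = ordered[math.floor(tail * (k - 1))]
    hi = ordered[math.ceil((1.0 - tail) * (k - 1))]
    return statistics.fmean(ordered), lo, hi


if __name__ == "__main__":
    # Synthetic 1-D data, purely illustrative (not the soil dataset).
    data_rng = random.Random(42)
    xs = [i / 9 for i in range(10)]
    ys = [math.sin(3.0 * x) + data_rng.gauss(0.0, 0.05) for x in xs]
    for x, y, p in zip(xs, ys, replicated_jackknife(xs, ys)):
        mean, lo, hi = prediction_interval(p)
        print(f"x={x:.2f} observed={y:+.3f} mean={mean:+.3f} interval=[{lo:+.3f}, {hi:+.3f}]")
```

With n observations and R replicates the sketch trains n × R networks, which is why it uses a deliberately small model. In practice the number of replicates, the network architecture and the way the interval is derived from the per-point distribution would follow the choices made in the dissertation itself.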