Aplicação de algoritmos de aprendizagem de máquina na identificação de registros espúrios no Cadastro Ambiental Rural

Detalhes bibliográficos
Ano de defesa: 2022
Autor(a) principal: Borges, Fernando Elias de Melo
Orientador(a): Não Informado pela instituição
Banca de defesa: Não Informado pela instituição
Tipo de documento: Dissertação
Tipo de acesso: Acesso aberto
Idioma: por
Instituição de defesa: Universidade Federal de Lavras
Programa de Pós-Graduação em Engenharia de Sistemas e Automação
UFLA
brasil
Departamento de Engenharia
Programa de Pós-Graduação: Não Informado pela instituição
Departamento: Não Informado pela instituição
País: Não Informado pela instituição
Palavras-chave em Português:
Link de acesso: http://repositorio.ufla.br/jspui/handle/1/55058
Resumo: The Rural Environmental Registry (CAR) is a mandatory electronic public registry for all rural properties in the Brazilian territory, integrating environmental information from the properties, helping with the environmental monitoring and contributing to actions to combat deforestation. However, a large number of registrations are made erroneously, generating inconsistent data, leading these to be cancelled and/or to request rectifications for the correct completion of the registration. Performing these analyses, identifying the incorrectly completed registries (spurious) manually, has a great cost, given the need for specialized labor, requiring a large amount of time, due to the immense amount of rural properties in Brazil. In this context, this work aims to provide a smart machine learning-based system that allows to check and classify CAR records into spurious and non- spurious (or cancelled and approved) registries in a fast and effective way. To do this, methodologies involving the entire pipeline of an application involving data science and machine learning have been applied. From pre-processing, with attribute cleaning and selection, followed by training and validation of the classifiers, and finally the use of interpretable machine learning algorithms with the goal of evaluating how each attribute impacted the decision making by the classifiers. Six classification models were applied and their results evaluated according to each preprocessing format, and a classifier interpretation model was used to compare the internal interpretations of models that have interpretability. The predictive results show classification performance rates above 90% for all evaluation measures used in the validation set, and the interpretations listed the variables that most influence automatic classification. Thus, the method proved to be viable for application in a real scenario applied to the Rural Environmental Registry.