Application of machine learning algorithms to the identification of spurious records in the Cadastro Ambiental Rural (Rural Environmental Registry)
Year of defense: | 2022 |
---|---|
Lead author: | |
Advisor: | |
Examination committee: | |
Document type: | Dissertation |
Access type: | Open access |
Language: | por |
Defending institution: | Universidade Federal de Lavras, Programa de Pós-Graduação em Engenharia de Sistemas e Automação, UFLA, Brasil, Departamento de Engenharia |
Graduate program: | Not provided by the institution |
Department: | Not provided by the institution |
Country: | Not provided by the institution |
Keywords in Portuguese: | |
Access link: | http://repositorio.ufla.br/jspui/handle/1/55058 |
Abstract: | The Rural Environmental Registry (CAR) is a mandatory electronic public registry for all rural properties in Brazil. It integrates environmental information about the properties, supports environmental monitoring, and contributes to actions against deforestation. However, many registrations are filled out incorrectly, producing inconsistent data that leads to their cancellation or to requests for rectification. Identifying these incorrectly completed (spurious) registrations manually is costly: it requires specialized labor and a large amount of time, given the immense number of rural properties in Brazil. In this context, this work aims to provide a machine learning-based system that checks and classifies CAR records as spurious or non-spurious (that is, cancelled or approved) quickly and effectively. To this end, methodologies covering the entire pipeline of a data science and machine learning application were applied: preprocessing, with attribute cleaning and selection; training and validation of the classifiers; and finally the use of interpretable machine learning algorithms to evaluate how each attribute affected the classifiers' decisions. Six classification models were applied and their results evaluated for each preprocessing format, and a classifier interpretation model was used to compare the internal interpretations of the models that have interpretability. The predictive results show classification performance above 90% for all evaluation measures used on the validation set, and the interpretations listed the variables that most influence automatic classification. Thus, the method proved viable for application in a real scenario involving the Rural Environmental Registry. |
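
The abstract reports performance above 90% for all evaluation measures on the validation set but does not name those measures in this record. As a minimal sketch, assuming they are the common binary measures (accuracy, precision, recall, F1) and treating "spurious" as the positive class, the computation could look like the following. The function name `evaluate`, the `BinaryMetrics` type, and the toy labels are hypothetical and are not taken from the dissertation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def evaluate(y_true, y_pred, positive="spurious"):
    """Compute common binary classification measures for one positive label.

    Assumption: these four measures stand in for the unspecified
    "evaluation measures" mentioned in the abstract.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one record is required")

    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts with `positive` as the class of interest.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Fall back to 0.0 when a denominator is zero (no predicted or no true positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = (2 * precision * recall / denom) if denom else 0.0
    return BinaryMetrics(accuracy, precision, recall, f1)


# Hypothetical toy labels for illustration only, not CAR data.
y_true = ["spurious", "spurious", "approved", "approved", "spurious", "approved"]
y_pred = ["spurious", "approved", "approved", "approved", "spurious", "spurious"]
metrics = evaluate(y_true, y_pred)
print(metrics)
```

Reporting several measures rather than accuracy alone matters when cancelled and approved registrations are unevenly represented, since a classifier can reach high accuracy while missing most records of the rarer class.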