Predictive modeling of soybean productivity using machine learning

Bibliographic details
Year of defense: 2024
Main author: Pereira, Roney Peterson
Advisor: Guedes, Luciana Pagliosa Carvalho
Examination committee: Guedes, Luciana Pagliosa Carvalho; Opazo, Miguel Angel Uribe; Brun, André Luiz; Dalposso, Gustavo Henrique; Ló, Thiago Berticelli
Document type: Thesis
Access type: Open access
Language: Portuguese
Defending institution: Universidade Estadual do Oeste do Paraná
Cascavel
Graduate program: Programa de Pós-Graduação em Engenharia Agrícola
Department: Centro de Ciências Exatas e Tecnológicas
Country: Brazil
Keywords in Portuguese:
CNPq knowledge area:
Access link: https://tede.unioeste.br/handle/tede/7468
Abstract: This thesis explores the application of machine learning algorithms to predict soybean productivity in a commercial area in the state of Paraná, Brazil. Soybean productivity is influenced by factors such as climatic conditions and the chemical attributes of the soil. Historical productivity data, meteorological information, and soil chemical analyses were used to develop predictive models. In the first article, five machine learning algorithms were tested: Random Forest, Gradient Boosting Machine (GBM), Support Vector Regression (SVR), K-Nearest Neighbors (KNN), and Multi-Layer Perceptron (MLP). Random Forest achieved the best performance, with a root mean square error (RMSE) of 0.446 and a coefficient of determination (R²) of 0.824. The most influential variables were iron content (Fe), potassium content (K), and precipitation in October and in February. In the second article, the analysis was expanded to include meteorological variables. Pearson's linear correlation, the variance inflation factor (VIF), and the Boruta method were applied during preprocessing to remove multicollinearity. Random Forest, Extra Trees Regressor, CatBoost, AdaBoost, and LightGBM were then tuned and compared. After refinement, the Random Forest model stood out, with an RMSE of 0.407 and an R² of 0.837. SHAP (SHapley Additive exPlanations) analysis revealed that meteorological variables have a significant impact on soybean productivity, while the influence of the remaining chemical attributes varies. Finally, in both articles, Post-Plots were constructed to compare actual and predicted productivity.
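The two metrics reported in the abstract, RMSE and R², can be illustrated with a minimal Python sketch. It uses the standard textbook definitions and is not the thesis code; the function names `rmse` and `r2` and the yield values are hypothetical and were chosen only for illustration.

```python
import math


def rmse(actual, predicted):
    """Root mean square error: square root of the mean squared residual."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r2(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all observed values are equal")
    return 1 - ss_res / ss_tot


# Hypothetical observed and predicted yields (NOT data from the thesis).
observed = [3.0, 3.5, 4.0, 4.5]
predicted = [3.1, 3.4, 4.2, 4.3]

print("RMSE:", round(rmse(observed, predicted), 3))
print("R2:", round(r2(observed, predicted), 3))
```

A lower RMSE and an R² closer to 1 both indicate predictions nearer to the observed values, which is the sense in which the abstract ranks Random Forest above the other algorithms.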