Machine learning for classifying soybean populations for industrial variables based on agronomic traits

Bibliographic details
Year of defense: 2022
Main author: Maik Oliveira Silva
Advisor: Larissa Pereira Ribeiro Teodoro
Defense committee: Not provided by the institution
Document type: Master's dissertation
Access type: Open access
Language: Portuguese (por)
Defending institution: Fundação Universidade Federal de Mato Grosso do Sul
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Brazil
Keywords in Portuguese:
Access link: https://repositorio.ufms.br/handle/123456789/4485
Abstract: Soybean is an important food source in human nutrition because of its high protein quality. A major current challenge for breeding programs is to increase grain yield and protein content while at least maintaining oil content. It is therefore important to classify genotypes by oil or protein content with a high percentage of correct classification. One promising approach for classifying variables and/or complex data sets is machine learning (ML). The objective was to classify groups of soybean genotypes according to industrial variables, based on agronomic traits, using ML techniques. The experiment was conducted in the 2019/2020 crop season at the Federal University of Mato Grosso do Sul, in Chapadão do Sul, and at the State University of Mato Grosso do Sul, in Aquidauana. A randomized block design with two replications and 206 F2 soybean populations was used. The agronomic traits evaluated were: days to maturity (DM), first pod insertion height (AIV, cm), plant height (AP, cm), number of branches (NR), main stem diameter (DHP, cm), hundred-grain mass (MCG, g) and grain yield (PROD, kg ha⁻¹). The models tested were: support vector machine (SVM), artificial neural networks (ANN), the decision tree models J48 and REPTree (RT), and random forest (RF). The ML techniques produced accurate models for classifying variables that are more complex and time-consuming to obtain, such as oil and protein content in soybean, from agronomic traits that are easier to measure. RF was the best-performing technique and can contribute to soybean breeding programs by classifying genotypes for industrial traits such as oil and protein content.
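As a rough illustration of the workflow the abstract describes (predicting a class for an industrial trait from the seven agronomic traits and reporting the percentage of correct classification), the sketch below trains a small ensemble of bootstrapped decision stumps with random feature subsets. It is a simplified stand-in for a random forest and is not the dissertation's implementation: the data are synthetic, and the "high"/"low" protein labels, the rule that generates them, the train/test split and all function names are assumptions made only for this example.

```python
import random
from collections import Counter

# Trait abbreviations taken from the abstract; the values below are synthetic.
FEATURES = ["DM", "AIV", "AP", "NR", "DHP", "MCG", "PROD"]


def make_synthetic_data(n, rng):
    """Hypothetical data: random trait values and a made-up protein class."""
    rows, labels = [], []
    for _ in range(n):
        row = [rng.gauss(0.0, 1.0) for _ in FEATURES]
        # Made-up rule for illustration only: class depends on MCG and PROD plus noise.
        score = row[5] - row[6] + rng.gauss(0.0, 0.5)
        rows.append(row)
        labels.append("high" if score > 0 else "low")
    return rows, labels


def fit_stump(rows, labels, feature_ids):
    """Pick the (feature, threshold) split with the fewest training errors."""
    best = None
    for f in feature_ids:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0]
            errors = sum(y != left_label for y in left)
            errors += sum(y != right_label for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, left_label, right_label)
    if best is None:  # degenerate sample: fall back to the majority class
        majority = Counter(labels).most_common(1)[0][0]
        return (feature_ids[0], float("inf"), majority, majority)
    return best[1:]


def fit_forest(rows, labels, n_trees, n_sub_features, rng):
    """Train each stump on a bootstrap sample and a random subset of traits."""
    forest = []
    n = len(rows)
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        feats = rng.sample(range(len(FEATURES)), n_sub_features)
        sample_rows = [rows[i] for i in idx]
        sample_labels = [labels[i] for i in idx]
        forest.append(fit_stump(sample_rows, sample_labels, feats))
    return forest


def predict(forest, row):
    """Majority vote over all stumps."""
    votes = Counter(left if row[f] <= t else right for f, t, left, right in forest)
    return votes.most_common(1)[0][0]


def percent_correct(forest, rows, labels):
    """Percentage of rows whose predicted class matches the true class."""
    hits = sum(predict(forest, r) == y for r, y in zip(rows, labels))
    return 100.0 * hits / len(rows)


rng = random.Random(0)  # fixed seed so the run is repeatable
X, y = make_synthetic_data(200, rng)
X_train, y_train, X_test, y_test = X[:150], y[:150], X[150:], y[150:]
forest = fit_forest(X_train, y_train, n_trees=25, n_sub_features=3, rng=rng)
accuracy = percent_correct(forest, X_test, y_test)
print(f"Correct classification on held-out rows: {accuracy:.1f}%")
```

A full random forest grows deeper trees and resamples candidate features at every split; the stumps here keep the example short. With real data, the synthetic rows would be replaced by measured trait values and the labels by classes derived from measured oil or protein content.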