Novel Bayesian networks for genomic prediction of developmental traits in biomass sorghum

Bibliographic details
Year of defense: 2019
Main author: Santos, Jhonathan Pedroso Rigal dos
Advisor: Not reported by the institution
Defense committee: Not reported by the institution
Document type: Thesis
Access type: Open access
Language: eng
Defending institution: Biblioteca Digitais de Teses e Dissertações da USP
Graduate program: Not reported by the institution
Department: Not reported by the institution
Country: Not reported by the institution
Keywords in Portuguese:
Access link: http://www.teses.usp.br/teses/disponiveis/11/11137/tde-12092019-153123/
Abstract: Sorghum (Sorghum bicolor L. Moench spp.) is a bioenergy crop with several appealing biological features that plant breeding can exploit to increase the efficiency of bioenergy production. The ability to connect the influence of quantitative trait loci over time and between traits makes Bayesian networks a powerful probabilistic framework for designing novel genomic prediction models. In this study, we phenotyped a diverse panel of 869 sorghum lines in four environments (2 locations in 2 years), measuring plant height biweekly from 30 to 120 days after planting (DAP) and dry biomass at the end of the season. Genotyping-by-sequencing was performed, resulting in the scoring of 100,435 biallelic SNP markers. We developed and evaluated several genomic prediction models: Bayesian Network (BN), Pleiotropic Bayesian Network (PBN), and Dynamic Bayesian Network (DBN). The assumptions for BN, PBN, and DBN were independence, dependence between traits, and dependence between time points, respectively. For benchmarking, we used multivariate GBLUP models that considered either only the time points for plant height (MTi-GBLUP) or both the time points for plant height and dry biomass (MTr-GBLUP), each modeling unstructured variance-covariance matrices for genetic effects and residuals. Coincidence indices (CI) were computed to assess the success of selecting for dry biomass using plant height measurements, and a coincidence index based on lines (CIL) was computed from the posterior draws of the Bayesian networks to understand genetic plasticity over time. In the 5-fold cross-validation scheme, prediction accuracies ranged from 0.48 (PBN) to 0.51 (MTr-GBLUP) for dry biomass and from 0.47 (DBN-DAP120) to 0.74 (MTi-GBLUP-DAP60) for plant height.
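The abstract does not give the formula for its coincidence indices. As a rough illustration only, the sketch below assumes one simple overlap-based definition: the share of lines in the top fraction for an early measurement that also appear in the top fraction for the target trait. The function name `coincidence_index`, the `fraction` parameter, and the toy values are hypothetical and may differ from the index computed in the thesis.

```python
def coincidence_index(early_scores, target_scores, fraction=0.2):
    """Share of top-ranked lines that two traits have in common.

    ASSUMPTION: a plain top-fraction overlap. The thesis may use a
    different (for example chance-corrected) definition.
    Both arguments map a line identifier to a predicted or observed value.
    """
    if early_scores.keys() != target_scores.keys():
        raise ValueError("both traits must be scored on the same lines")
    # Number of lines kept by selection (at least one).
    k = max(1, round(len(early_scores) * fraction))
    top_early = set(sorted(early_scores, key=early_scores.get, reverse=True)[:k])
    top_target = set(sorted(target_scores, key=target_scores.get, reverse=True)[:k])
    return len(top_early & top_target) / k


# Hypothetical toy values: plant height at 45 DAP vs. end-of-season dry biomass.
height_45dap = {"A": 5.0, "B": 4.0, "C": 3.0, "D": 2.0, "E": 1.0}
dry_biomass = {"A": 10.0, "B": 1.0, "C": 9.0, "D": 2.0, "E": 3.0}
ci = coincidence_index(height_45dap, dry_biomass, fraction=0.4)
```

With these toy values, two lines are selected per trait and one of them (line A) is shared, so half of the early selections coincide with the target selections.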
The forward-chaining cross-validation showed substantially higher prediction accuracies for the DBN model, with r ranging from 0.6 (training on slice 30:45 to predict 120 DAP) to 0.94 (training on slice 30:90 to predict 105 DAP), than for the BN and PBN, and accuracies similar to those of the multivariate GBLUP models. Both the CI and CIL indices showed that the ranking of promising inbred lines changed minimally after 45 DAP for plant height. These results suggest that 45 DAP is an optimal developmental stage for imposing the two-level indirect selection framework, in which plant height at 45 DAP (secondary trait) is used to select indirectly both for plant height at the end of the season (first-level target trait) and for dry biomass (second-level target trait). With the advance of robotic technologies for field-based phenotyping, the development of novel approaches such as the two-level indirect selection framework will be imperative to boost genetic gain per unit of time.
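The forward-chaining scheme described in the abstract can be sketched as a generator of training slices and later time points to predict. This is an illustrative sketch, not the thesis code: the 15-day grid (including 75 DAP) is inferred from the time points the abstract names, and the function name `forward_chaining_splits` and the `min_train` parameter are hypothetical.

```python
# ASSUMPTION: measurements every 15 days from 30 to 120 DAP, inferred from
# the time points named in the abstract (30, 45, 60, 90, 105, 120).
DAPS = list(range(30, 121, 15))  # [30, 45, 60, 75, 90, 105, 120]


def forward_chaining_splits(time_points, min_train=2):
    """Yield (train_slice, future_points) pairs in chronological order.

    Each training slice starts at the first time point and grows by one
    measurement per split; every later time point is a prediction target.
    """
    for end in range(min_train, len(time_points)):
        yield time_points[:end], time_points[end:]


splits = list(forward_chaining_splits(DAPS))
for train, future in splits:
    # Label slices the way the abstract does, e.g. "30:45".
    print(f"train on slice {train[0]}:{train[-1]} -> predict DAP {future}")
```

The first split corresponds to the abstract's slice 30:45, whose targets include 120 DAP, and the split trained on slice 30:90 has 105 DAP among its targets. Only past measurements ever enter a training slice, which is what distinguishes this scheme from the 5-fold cross-validation.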