Characterization of polymorphic sites and repetitive sequences, and establishment of a core collection of caiaué [Elaeis oleifera (Kunth) Cortés]

Bibliographic details
Year of defense: 2015
Main author: Ferreira Filho, Jaire Alves
Advisor: Not provided by the institution
Defense committee: Not provided by the institution
Document type: Master's dissertation
Access type: Open access
Language: Portuguese (por)
Defending institution: Programa de Pós-graduação em Biotecnologia Vegetal, UFLA, Brasil
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese:
Access link: http://repositorio.ufla.br/jspui/handle/1/10285
Abstract: The objectives of this study were to characterize polymorphic sites and repetitive sequences and to establish a core collection for American oil palm (Elaeis oleifera). The draft genome used in this study was sequenced at 130X coverage on the Illumina HiSeq 2000 and was compared, using the NUCmer software, with the publicly available E. oleifera draft and with the publicly available E. guineensis genome. An in silico search was carried out to identify tandem repeats and transposable elements in this draft genome. A set of sequences generated on the DArTseq platform for E. oleifera genotypes was mapped against the public E. guineensis genome using BWA, and the SAMtools package was used to identify SNPs. The gene models of date palm (Phoenix dactylifera) were mapped onto the American oil palm genome. To design the core collections, we used the diversity-maximization (M) strategy with 500 SNP loci obtained by genotyping by sequencing. Of the draft analyzed, 68.24% and 72.83% aligned against the E. oleifera and E. guineensis genomes, respectively. A total of 328,879 tandem repeat loci and 618,284 transposable element loci were identified. It was possible to characterize 17,412 PAVs, 2,370 SNPs, and 25,203 gene models with a single position in the genome. Core collection models were obtained with 37, 55, 109, 127, 138, 276, 26, and 16 individuals. Because it maintained the best fit of the validated parameters with the smallest number of accessions, the model with 109 individuals (20% of the entire collection) was chosen as the ideal one for establishing the E. oleifera core collection. The E. oleifera draft generated by Embrapa sampled much of the genomes to which it was compared, and so represents a large part of this highly complex genome at an affordable sequencing cost. More than half (55%) of the draft consists of repetitive sequences, especially retrotransposons. The identification of these repeat-rich regions will help adjust the strategy for further sequencing of this genome. The set of mapped PAV/SNP markers provides substantially uniform coverage of the E. guineensis genome and its gene regions. The core collection model generated in this study will support a more efficient strategy for conserving American oil palm germplasm.
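
The abstract names the diversity-maximization (M) strategy but does not say which software or selection criterion was used to build the core collection models. As a rough illustration only, the sketch below greedily picks accessions so that each addition contributes the most SNP alleles not yet represented. The function names, the toy genotypes, and the greedy heuristic itself are assumptions made for this example, not the procedure used in the dissertation.

```python
def allele_set(genotypes, members):
    """Return the set of (locus index, allele) pairs seen in the given accessions."""
    seen = set()
    for acc in members:
        for locus, calls in enumerate(genotypes[acc]):
            for allele in calls:
                seen.add((locus, allele))
    return seen


def greedy_core(genotypes, size):
    """Pick up to `size` accessions, each time adding the one with the most new alleles.

    This is a simple greedy heuristic (an assumption for illustration); it does not
    guarantee the globally optimal subset.
    """
    remaining = sorted(genotypes)  # sorted so that ties are broken reproducibly
    core, covered = [], set()
    while remaining and len(core) < size:
        best = max(remaining, key=lambda a: len(allele_set(genotypes, [a]) - covered))
        core.append(best)
        covered |= allele_set(genotypes, [best])
        remaining.remove(best)
    return core, covered


# Hypothetical toy data: four accessions, three biallelic SNP loci, diploid calls.
toy = {
    "acc1": [("A", "A"), ("C", "C"), ("G", "G")],
    "acc2": [("A", "T"), ("C", "C"), ("G", "G")],
    "acc3": [("A", "A"), ("C", "G"), ("G", "T")],
    "acc4": [("T", "T"), ("C", "C"), ("G", "G")],
}

core, covered = greedy_core(toy, 2)
total = allele_set(toy, toy)
print(core, f"{len(covered)}/{len(total)} alleles retained")
```

With real data, the same idea would be run at several target sizes (the abstract reports models from 16 to 276 individuals), and the retained allelic richness would then be compared against the full collection before settling on one size.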