Sequenciamento e caracterização parcial do genoma de cagaiteira (Eugenia dysenterica DC.)

Bibliographic details
Year of defense: 2016
Main author: Ribeiro, Stela Barros
Advisor: Coelho, Alexandre Siqueira Guedes
Examining committee: Zucchi, Maria Imaculada; Soares, Thannya Nascimento; Coelho, Alexandre Siqueira Guedes
Document type: Master's thesis
Access type: Open access
Language: Portuguese
Defense institution: Universidade Federal de Goiás
Graduate program: Programa de Pós-graduação em Genética e Melhoramento de Plantas (EAEA)
Department: Escola de Agronomia e Engenharia de Alimentos - EAEA (RG)
Country: Brazil
Keywords in Portuguese:
Keywords in English:
CNPq knowledge area:
Access link: http://repositorio.bc.ufg.br/tede/handle/tede/6232
Abstract: Advances in genomic analysis technologies, chiefly next-generation sequencing (NGS) platforms, have made it possible to obtain large amounts of DNA sequence data. Combining NGS data with state-of-the-art computational tools gives access to genome-wide information for many organisms through whole or partial genome assembly and structural and functional characterization. The cagaiteira tree (Eugenia dysenterica DC.) is a native Cerrado species with potential for use in crop production systems, because its fruits, leaves, and bark can all be exploited. It is also ecologically important as a food source for local fauna. Despite previous efforts, little is known about the organization and genetic structure of the cagaiteira tree. Previous studies relied on a small number of molecular markers, applied to studies of mating systems and of the effects of microevolutionary events in populations. In this study we obtained an assembly and a partial characterization of the E. dysenterica genome with respect to the number, structure, and function of genes and repetitive DNA. We obtained DNA sequences for five individuals from different populations using the Illumina MiSeq sequencing platform. Quality control was performed with FastQC and Trimmomatic. We assembled the reads using dipSPAdes and used blastn and SAMtools to verify assembly quality. We used RepeatMasker, RepeatModeler, and QDD to identify and characterize the repetitive DNA content. For gene prediction and annotation we used AUGUSTUS and Blast2GO. The raw DNA sequences amounted to 8.64 Gb, distributed across 63,017,960 reads. After trimming of low-quality bases, this decreased to 5.63 Gb in 59,415,168 reads. After filtering out organellar DNA and contigs shorter than 500 bp, we assembled 130,243 contigs, representing 56.7% (~250 Mb) of the estimated E. dysenterica genome size (~442 Mb). About 35.3% of the assembled genome comprised repetitive regions, of which 27.1% are transposable elements (mostly LTR retrotransposons). We identified 55,491 microsatellite regions, including 46,701 mononucleotide and 8,403 dinucleotide repeats. The T/A motif was the most common, followed by A/T and GA/TC. We predicted 60,171 gene fragments and 228,510 transcripts. We observed a gene density of 1 gene per 7.3 kb and an average of 3.8 transcripts per gene. This study makes the cagaiteira tree the first native Cerrado plant species whose genome has been widely sampled and characterized using NGS data.
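
The summary statistics quoted in the abstract can be re-derived from its own figures. The sketch below is only an arithmetic check, not part of the thesis pipeline: the constant and function names are illustrative assumptions, and the ~250 Mb and ~442 Mb values are the rounded figures given above.

```python
# Figures copied from the abstract; all names here are illustrative.
RAW_BASES = 8.64e9          # raw sequence data, in bases
RAW_READS = 63_017_960
TRIMMED_BASES = 5.63e9      # after low-quality trimming
TRIMMED_READS = 59_415_168
ASSEMBLED_BP = 250e6        # approximate assembled length (~250 Mb)
GENOME_BP = 442e6           # estimated genome size (~442 Mb)
GENES = 60_171              # predicted gene fragments
TRANSCRIPTS = 228_510


def summarize():
    """Return simple ratios derived from the figures reported in the abstract."""
    return {
        "mean_raw_read_bp": RAW_BASES / RAW_READS,
        "mean_trimmed_read_bp": TRIMMED_BASES / TRIMMED_READS,
        "reads_retained": TRIMMED_READS / RAW_READS,
        "assembled_fraction": ASSEMBLED_BP / GENOME_BP,
        "transcripts_per_gene": TRANSCRIPTS / GENES,
        # Gene spacing under two possible denominators.
        "kb_per_gene_estimated_genome": GENOME_BP / GENES / 1e3,
        "kb_per_gene_assembled": ASSEMBLED_BP / GENES / 1e3,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:.3f}")
```

With these inputs, the ratio of transcripts to genes comes out at about 3.8, matching the abstract. The reported density of 1 gene per 7.3 kb is reproduced when the estimated genome size is used as the denominator; dividing the rounded assembled length instead gives roughly 4.2 kb per gene. The abstract does not state which denominator was used, so this reading is an inference from the arithmetic.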