Applying next generation sequence data in animal breeding
Year of defense: | 2020 |
---|---|
Lead author: | |
Advisor: | |
Examination committee: | |
Document type: | Thesis |
Access type: | Open access |
Language: | eng |
Defending institution: | Universidade Federal de Viçosa, Zootecnia |
Graduate program: | Not provided by the institution |
Department: | Not provided by the institution |
Country: | Not provided by the institution |
Keywords in Portuguese: | |
Access link: | https://locus.ufv.br//handle/123456789/29393 |
Abstract: | Animal breeding has traditionally relied on phenotypic data from the best animals of one generation to produce the next. However, the heritability of traits of interest has a genetic basis that can be explained by DNA. The basic model of quantitative genetics states that the phenotype is the sum of the genotype (the genetic composition of the individual) and the environment. Because different genotypes can perform better or worse depending on the environment in which they are placed, an interaction term between genotype and environment had to be included. The development of Next Generation Sequencing (NGS) has advanced the understanding of complex traits in animal breeding: Whole Genome Sequencing (WGS) of domestic animals has made it possible to better understand how causal and structural variants influence the phenotype. NGS makes it possible to quantify the mRNA expressed by a gene that influences a given trait (transcriptome, RNA-Seq) and to discover epigenetic changes (ChIP-Seq), and thus to better understand the molecular mechanisms that modulate the trait and tissue in question. It also makes it possible to increase the density of Single Nucleotide Polymorphism (SNP) markers with WGS and so discover more significant and more accurate genomic associations. The advent of NGS created a need for computational power and for new tools capable of analyzing and storing the volume of data generated. Software that performs quality control, maps sequences against the reference genome and quantifies transcripts (in the case of RNA-Seq) has therefore become necessary, and this type of analysis can be difficult for those unfamiliar with computer systems. For this reason, a user-friendly pipeline was developed to analyze RNA-Seq data, carrying out quality control, mapping and sequence counting.
BAQCOM (Bioinformatics Analysis of Quality Control and Mapping) has been shown to be efficient and fast. RNA-Seq data were then used to characterize the global gene expression profile involved in Bacterial Chondronecrosis with Osteomyelitis (BCO) in chickens, a condition that develops in the growth plates of the femur and tibia and is followed by infection with opportunistic bacteria. Using tibia samples from six commercial broilers (three affected by BCO and three healthy), 192 differentially expressed genes (FDR < 0.05) were found (63 upregulated and 129 downregulated). Twenty-six genes and seven transcription factors were found to be downregulated, explaining BCO in the tibia; it was concluded that BCO in chickens may be caused by low expression of genes related to bone growth and that bacterial proliferation seems to be a secondary process. Another application of NGS is the fine mapping of QTL (Quantitative Trait Loci) regions. A QTL region on chromosome 5 associated with backfat thickness was found in four commercial pig lines (a Large White-based synthetic line, Pietrain, Landrace and Large White). Backfat is important for feed efficiency and meat quality, as well as being an energy store. Although Genome Wide Association Studies (GWAS) have been widely used to find associations with phenotypes, the associated region still needs to be narrowed if the goal is to find causal mutations. For this reason, the aim was to narrow this QTL region using WGS, RNA-Seq and ChIP-Seq data. Starting from the most significant GWAS SNP (leadSNP, SSC5:66103958), haplotypes were phased and haplotypes of 41 SNPs (20 <leadSNP> 20) were selected. The most frequent haplotype was selected across the four breeds, which made it possible to identify a cross-breed region of 5 SNPs (2 <leadSNP> 2). Fine mapping was carried out on this region: WGS was used first, and three candidate variants were identified (SSC5:66097445, SSC5:66099282 and SSC5:66103958).
In this region (SSC5:66097445-66103958), the epigenetic marks H3K27me3 and H3K4me3 were found using ChIP-Seq data from pig alveolar macrophages, indicating regulation of gene expression during prenatal development. Based on this result, RNA-Seq data from embryos and fetuses were used and showed high expression of the CCND2 gene, which is related to the development and differentiation of adipose tissue, making it a strong candidate causal gene for backfat. It can also be concluded that this region may contain regulatory elements involved in the expression of CCND2 during embryonic development, and that epigenetic effects during embryonic life can affect production traits in adult life. This thesis used sequencing, genotyping, phenotype, transcriptome and ChIP-Seq data, integrating them to narrow down regions of interest in the genome and to propose tools to improve animal breeding in the future. Keywords: Genomics. Whole Genome Sequencing. Transcriptome. ChIP-Sequencing. Phenotype. |
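
The haplotype-window step described in the abstract (a lead SNP with a fixed number of flanking SNPs on each side, followed by selection of the most frequent haplotype) can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names `window_indices` and `most_frequent_haplotype`, the toy positions and the toy allele strings are all hypothetical, and phasing is assumed to have been done beforehand by a separate tool.

```python
from collections import Counter


def window_indices(positions, lead_pos, flank):
    """Return the indices of the lead SNP plus `flank` SNPs on each side.

    `positions` is an ordered list of SNP positions (hypothetical input).
    """
    lead_idx = positions.index(lead_pos)
    start = lead_idx - flank
    end = lead_idx + flank + 1
    if start < 0 or end > len(positions):
        raise ValueError("not enough SNPs flanking the lead SNP")
    return list(range(start, end))


def most_frequent_haplotype(haplotypes, indices):
    """Return (allele string, count) for the most common haplotype in the window.

    `haplotypes` is a list of already-phased allele strings, one character
    per SNP, in the same order as `positions` (toy representation).
    """
    counts = Counter("".join(h[i] for i in indices) for h in haplotypes)
    return counts.most_common(1)[0]


# Toy data: 7 SNPs, lead SNP at position 400, 2 flanking SNPs per side.
toy_positions = [100, 200, 300, 400, 500, 600, 700]
toy_haplotypes = ["ACGTACG", "ACGTACG", "TCGTACA", "ACGAACG"]

toy_window = window_indices(toy_positions, lead_pos=400, flank=2)
toy_best = most_frequent_haplotype(toy_haplotypes, toy_window)
```

With `flank=20` the same window function yields the 41-SNP haplotypes mentioned in the abstract (20 + lead SNP + 20), and with `flank=2` the 5-SNP region.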