The evolution of biological complexity in Eukarya: biological functions and protein domains associated with the number of distinct cell types

Bibliographic details
Year of defense: 2019
Main author: Dalbert Benjamim da Costa
Advisor: Not provided by the institution
Examination committee: Not provided by the institution
Document type: Dissertation
Access type: Open access
Language: por
Defense institution: Universidade Federal de Minas Gerais
Brazil
ICB - INSTITUTO DE CIÊNCIAS BIOLOGICAS
Programa de Pós-Graduação em Genética
UFMG
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese:
Access link: http://hdl.handle.net/1843/33945
Abstract: During the course of biological evolution, organisms with different degrees of complexity have arisen. For practical purposes, the number of distinct cell types is commonly used as a proxy for biological complexity. Also during the course of evolution, new proteins emerged in Eukarya through de novo gene evolution, gene duplication followed by divergence and, in several cases, functional domain shuffling. We used a statistical comparative genomics approach to study the evolution of biological complexity in Eukarya by searching for biological functions (represented as the frequencies of protein domains and gene functions encoded in a wide range of eukaryotic genomes) associated with the number of cell types. We selected 41 high-quality, non-redundant eukaryotic proteomes based on gene repertoire completeness as estimated by BUSCO, and annotated each proteome with InterProScan to identify protein domains (Pfam) and biological functions (Gene Ontology, GO, terms). We computed two classes of association metrics between the frequency of each Pfam/GO term and the number of cell types. One class consists of the traditional Spearman correlation, while the other is corrected to account for the common ancestry relationships among species, which would otherwise bias the analysis. For each linear model we computed p-values and applied multiple-hypothesis correction (the Benjamini-Hochberg, BH, method) to account for the multiple-comparison problem. We considered as positive the models with corrected p-values smaller than 0.05, which resulted in 256 Pfam domains and 304 GO terms significantly associated with biological complexity. Among these sets we found several domains that play important roles in extracellular matrix processes, cell-cell interaction, transcription factors, hormones, regulatory processes and key factors for cell differentiation and body development processes.
Taken together, our approach highlights important biological processes associated with the increase of complexity in Eukarya, suggesting their importance for the establishment of extant biological complexity.
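
The uncorrected part of the analysis described in the abstract (Spearman correlation between a term's frequency and the number of cell types, followed by BH correction) can be illustrated with a minimal sketch. It is not the dissertation's code. The function names, the toy counts and the use of a permutation test to obtain p-values are all assumptions made for illustration; the abstract reports p-values from linear models, and the phylogeny-corrected metric is not covered here.

```python
import random
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den if den else 0.0


def spearman_rho(x, y):
    """Spearman correlation, computed as the Pearson correlation of ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p-value for Spearman's rho (assumed stand-in)."""
    rng = random.Random(seed)
    observed = abs(spearman_rho(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(spearman_rho(x, shuffled)) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up toy data: 8 hypothetical species, 3 hypothetical annotation terms.
cell_types = [1, 3, 7, 12, 20, 55, 120, 200]
term_counts = {
    "term_A": [2, 4, 9, 15, 22, 40, 80, 150],  # rises with cell types
    "term_B": [30, 28, 31, 29, 30, 32, 28, 31],  # roughly flat
    "term_C": [5, 1, 8, 3, 9, 2, 7, 4],  # no clear trend
}

names = list(term_counts)
raw_p = [permutation_p_value(cell_types, term_counts[n]) for n in names]
adj_p = benjamini_hochberg(raw_p)
for name, p_adj in zip(names, adj_p):
    rho = spearman_rho(cell_types, term_counts[name])
    print(f"{name}: rho={rho:.2f}, BH-adjusted p={p_adj:.4f}")
```

In this sketch a term would be called positive when its BH-adjusted p-value falls below 0.05, mirroring the threshold stated in the abstract.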