Ancestry and co-regulation of human protein-coding genes (original title: Ancestralidade e co-regulação de genes codificadores de proteínas humanas)

Bibliographic details
Year of defense: 2017
Main author: Kátia de Paiva Lopes
Advisor: Not provided by the institution
Examination committee: Not provided by the institution
Document type: Thesis
Access type: Open access
Language: por
Defending institution: Universidade Federal de Minas Gerais
UFMG
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese:
Access link: http://hdl.handle.net/1843/BUOS-APTQ3E
Abstract: Several technologies have been developed for transcriptomic analysis and quantification, including those based on clone sequencing, such as ESTs (Expressed Sequence Tags); on hybridization, such as microarrays; and on next-generation (NGS) deep sequencing, such as RNA-seq. To study the origin of genes expressed in different tissues and organs, we analyzed data obtained from these three approaches. Data from UniGene, Gene Expression Omnibus (GEO) and Human Protein Atlas (HPA) were compiled into eight local databases. Next, using the orthologous groups of human genes given by the UniRef Enriched KEGG Orthology (UEKO) and Orthologous Matrix (OMA) databases, we estimated gene ages with the Lowest Common Ancestor (LCA) algorithm. We were thus able to determine the time of appearance of tissue-expressed genes, with the aim of depicting the evolution of human organs. The global analysis of the organism revealed eight distinct hallmarks along the timescale (i.e., eight major steps) and showed that housekeeping (HK) genes are more ancient than tissue-enriched (TE) genes. Functional enrichment analysis found coherent groups of terms and annotations assigned to the genes placed at each evolutionary stage. Next, a coexpression analysis was performed by calculating the pairwise Spearman correlation of all genes across 116 samples from HPA and keeping as positive gene pairs only those with a correlation coefficient of at least 0.85. The result was a robust network of 2,298 proteins and 20,005 interactions. In this network, the MCODE algorithm from Cytoscape revealed 11 major subnetworks that were clearly enriched in groups or modules of highly coexpressed proteins, which tended to include proteins of the same evolutionary age.
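The coexpression step described above (pairwise Spearman correlation, keeping pairs at or above a threshold) can be sketched as follows. This is a minimal illustration rather than the thesis pipeline: the gene names, the five-sample toy matrix and the function names are hypothetical, and treating 0.85 as an inclusive lower bound is an assumption.

```python
from itertools import combinations
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # constant profile: correlation is undefined, treated as 0 here
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def coexpression_edges(expression, threshold=0.85):
    """Return (gene1, gene2, rho) for every gene pair with rho >= threshold."""
    edges = []
    for g1, g2 in combinations(sorted(expression), 2):
        rho = spearman(expression[g1], expression[g2])
        if rho >= threshold:
            edges.append((g1, g2, rho))
    return edges


# Toy expression matrix (hypothetical genes, five samples each).
toy_expression = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "GENE_B": [2.0, 4.1, 6.3, 8.0, 9.9],  # increases together with GENE_A
    "GENE_C": [5.0, 4.0, 3.0, 2.0, 1.0],  # decreases as GENE_A increases
    "GENE_D": [1.0, 3.0, 2.0, 5.0, 4.0],  # only partly follows GENE_A
}
edges = coexpression_edges(toy_expression)
```

In this toy matrix only the GENE_A/GENE_B pair passes the threshold. A list of edges like this one is the kind of input a network tool such as Cytoscape would take for module detection.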
Finally, to analyze tissue-specific (TS) genes, we used three different strategies: (1) tissue clustering; (2) tissue classification according to phenotypic categories; and (3) the eight tissues common to the four databases used in this step: HPA (32 tissues), IBM (16), FANTOM (56) and GTEx (53). Our results showed that, when all expressed genes are used, the analysis lacks a tissue-specific signature and approaches the distribution of appearance times of the entire gene repertoire. Thus, to distinguish the origins of the organs, we examined the time of appearance of only tissue-specific genes or of genes within distinct groups, such as elevated genes. The approach that yielded the highest concordance of results ordered the tissues by gene age as follows: brain first, then heart, kidney, colon, ovary, prostate, lung and testis.
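The idea of dating genes by the Lowest Common Ancestor of their orthologous group, and then ordering tissues by the ages of their tissue-specific genes, can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the simplified lineages, the gene and tissue names, the ortholog groups, and the use of the median age rank to order tissues. None of it reproduces the data or the exact procedure of the thesis.

```python
from statistics import median

# Hypothetical, heavily simplified lineages, listed from the oldest clade
# to the youngest. A real analysis would use a complete taxonomy.
SPECIES_LINEAGE = {
    "human": ["cellular organisms", "Eukaryota", "Metazoa", "Vertebrata",
              "Mammalia", "Primates", "Homo sapiens"],
    "mouse": ["cellular organisms", "Eukaryota", "Metazoa", "Vertebrata",
              "Mammalia", "Rodentia", "Mus musculus"],
    "zebrafish": ["cellular organisms", "Eukaryota", "Metazoa", "Vertebrata",
                  "Actinopterygii", "Danio rerio"],
    "fly": ["cellular organisms", "Eukaryota", "Metazoa", "Arthropoda",
            "Drosophila melanogaster"],
    "yeast": ["cellular organisms", "Eukaryota", "Fungi",
              "Saccharomyces cerevisiae"],
}
HUMAN_LINEAGE = SPECIES_LINEAGE["human"]


def lowest_common_ancestor(species):
    """Return the deepest clade shared by the lineages of all given species."""
    lineages = [SPECIES_LINEAGE[s] for s in species]
    lca = None
    for clades in zip(*lineages):  # walk the lineages level by level
        if len(set(clades)) != 1:
            break
        lca = clades[0]
    return lca


def gene_age_rank(ortholog_species):
    """Position of the LCA clade in the human lineage (0 = oldest clade)."""
    return HUMAN_LINEAGE.index(lowest_common_ancestor(ortholog_species))


# Toy ortholog groups: species in which each hypothetical gene has an ortholog.
ORTHOLOG_GROUPS = {
    "GENE_1": ["human", "mouse", "zebrafish", "fly"],  # LCA: Metazoa
    "GENE_2": ["human", "mouse", "zebrafish"],         # LCA: Vertebrata
    "GENE_3": ["human", "mouse"],                      # LCA: Mammalia
    "GENE_4": ["human"],                               # LCA: Homo sapiens
    "GENE_5": ["human", "yeast"],                      # LCA: Eukaryota
}

# Toy tissue-specific gene sets for hypothetical tissues.
TISSUE_SPECIFIC = {
    "tissue_X": ["GENE_1", "GENE_5", "GENE_2"],
    "tissue_Y": ["GENE_3", "GENE_4"],
    "tissue_Z": ["GENE_2", "GENE_3"],
}


def order_tissues_by_age(tissue_genes):
    """Sort tissues from oldest to youngest by median gene age rank."""
    return sorted(
        tissue_genes,
        key=lambda t: median(
            gene_age_rank(ORTHOLOG_GROUPS[g]) for g in tissue_genes[t]
        ),
    )


tissue_order = order_tissues_by_age(TISSUE_SPECIFIC)
```

With these toy inputs, tissue_X comes out as the oldest and tissue_Y as the youngest. The median is only one possible summary; comparing full age distributions per tissue would be an equally reasonable choice.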