Development of the tools SeedServer, for clustering homologous protein sequences, and U-MAGE, for propagating functional ontology

Bibliographic details
Year of defense: 2013
Main author: Rafael Lucas Muniz Guedes
Advisor: Not provided by the institution
Defense committee: Not provided by the institution
Document type: Thesis
Access type: Open access
Language: por
Defense institution: Universidade Federal de Minas Gerais
UFMG
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese:
Access link: http://hdl.handle.net/1843/BUOS-9BNJT2
Abstract: As sequencing technologies advance, organizing secondary databases, in which existing knowledge is structured according to biological information, becomes an important contribution. Grouping homologous genes and assigning functional ontology terms from already known genes are ways to speed up the analysis of newly sequenced data. We report the development of SeedServer and U-MAGE and their applications. SeedServer resulted from integrating databases with a program capable of clustering homologous proteins, including sequences derived from incomplete genomes, together with validation methods and comparison of secondary structures. Groups of homologous sequences are generated for sequences chosen by the user through a web interface, where the user can also download the grouped sequences, obtain taxonomy reports, and estimate the origin of the gene in question by determining the lowest common ancestor. After being tested and evaluated, SeedServer was used in a study of amino acid heterotrophy: forming groups of homologous enzymes from the biosynthetic pathways of essential amino acids revealed a scenario called the Great Genomic Deletion in different groups of eukaryotes and prokaryotes. This event may have been followed by the loss of the capacity to assimilate nitrogen, an essential component in the formation of amino acids. Phylogenetic studies showed a higher mutation rate among the enzymes remaining in incomplete pathways than among those from complete pathways. Additionally, to improve the quality of functional annotation of protein sequences, we created U-MAGE (UniRef50 Matrices for Annotation of Gene Ontology Entries), which propagates functional ontology terms based on the coverage between sequences within a UniRef50 cluster, organized in matrices. U-MAGE demonstrated a significant qualitative improvement in the functional ontology annotation of various organisms.
Both tools, SeedServer and U-MAGE, help accelerate the propagation of information from known proteins, a challenge that current bioinformatics must meet in the face of the intense production of new sequences.