Automation tool for taxonomic classification in the family Geminiviridae

Bibliographic details
Year of defense: 2018
Main author: Gomes, Ruither Arthur Loch
Advisor: Not provided by the institution
Defense committee: Not provided by the institution
Document type: Dissertation
Access type: Open access
Language: eng
Defending institution: Universidade Federal de Viçosa
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese:
Access link: http://www.locus.ufv.br/handle/123456789/25202
Abstract: Pathogenic microorganisms have the potential to cause serious problems for humankind. Their precise taxonomic classification is an important step toward understanding and combating the diseases they cause. Several technologies were created to make it easier to classify microorganisms, and the emergence of high-throughput sequencing technologies has greatly accelerated this process. However, this led to another problem: extremely large volumes of genetic sequence information are generated, making the bioinformatic analysis of sequences a time-consuming process. In the case of viruses classified in the family Geminiviridae, this problem is compounded by the large number of new sequences deposited in public databases. Geminiviruses are responsible for large production losses in economically important crops worldwide, which makes them the focus of much research and leads to the constant discovery of new viruses. Although there are several ways of performing the taxonomic classification of microorganisms, the percentage of identity obtained from the alignment between individuals has been increasingly applied. For viruses with small genomes, percent identities obtained from pairwise alignments have been used for decades, and several algorithms have already been created for this purpose. However, none of the algorithms developed to date carries out the classification of the virus itself, leaving the researcher to decide the taxonomic classification one virus at a time. Here we present a tool that carries out the classification of viruses in the family Geminiviridae. The tool can acquire sequences as they are added to public databases or receive sequences supplied by the user. It then filters the added sequences to eliminate those already classified and parses the remaining sequences based on their percentage of pairwise identity with classified viruses. It also updates the values of the taxonomic demarcation thresholds used to classify species and strains. Using this tool, it was possible to analyze all viruses added to public databases from January 2017 until July 2018. A total of 27 new species were identified. We also suggest revised demarcation thresholds for the genera Becurtovirus, Capulavirus, Curtovirus, Grablovirus and Mastrevirus.
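As an illustration only, the identity-based classification step described in the abstract might be sketched as follows. This is not the dissertation's implementation: the function names (`pairwise_identity`, `classify`), the decision labels, and the rule that the best-matching classified virus decides the outcome are all assumptions made for this example, and the threshold values in the usage lines are placeholders rather than values taken from the dissertation.

```python
from typing import Dict, Optional, Tuple


def pairwise_identity(a: str, b: str) -> float:
    """Percent identity between two sequences that are ALREADY pairwise aligned.

    Assumption: columns where either sequence has a gap ('-') are ignored.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def classify(
    identities: Dict[str, Tuple[str, float]],
    species_threshold: float,
    strain_threshold: float,
) -> Tuple[Optional[str], str]:
    """Assign a query to a species or flag it as new (hypothetical rule).

    identities maps a reference isolate name to (species name, percent
    identity between the query and that reference).
    """
    if not identities:
        return None, "putative new species"
    # The classified virus with the highest identity decides the outcome.
    species, best = max(identities.values(), key=lambda item: item[1])
    if best >= strain_threshold:
        return species, "known strain"
    if best >= species_threshold:
        return species, "putative new strain"
    return None, "putative new species"


# Usage with placeholder thresholds (illustrative values only).
hits = {"isolate_1": ("Species A", 92.5), "isolate_2": ("Species B", 78.0)}
result = classify(hits, species_threshold=91.0, strain_threshold=94.0)
```

Under these assumptions, a query whose best identity falls between the two thresholds is assigned to an existing species as a possible new strain, while a query below the species threshold for every classified virus is flagged as a candidate new species.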