Develop a pipeline to call SNPs from whole genome sequences of Toxoplasma gondii and integrate with conventional genotyping methods

Bibliographic details
Year of defense: 2019
Main author: Castro, Bruno Bello Pede
Advisor: Not provided by the institution
Examination committee: Not provided by the institution
Document type: Thesis
Access type: Open access
Language: eng
Defending institution: Biblioteca Digitais de Teses e Dissertações da USP
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese:
Access link: https://www.teses.usp.br/teses/disponiveis/10/10134/tde-28112019-114932/
Abstract: Toxoplasmosis is a parasitic disease caused by Toxoplasma gondii, an intracellular parasite related to Plasmodium falciparum, the agent that causes malaria in humans. Toxoplasma gondii infects all warm-blooded vertebrates, including mammals and birds. Recent advances in DNA sequencing technologies have made it possible to obtain whole genome sequences and use them to genotype any organism, including T. gondii. PCR-RFLP and MLST have been the most common methods for genotyping and identifying T. gondii, and an invaluable database has been generated with these methods over the last decades. However, conventional PCR-RFLP and MLST data cannot be easily integrated with whole genome sequence typing. The objective of this work is to develop a pipeline that maps reads from whole genome sequencing to identify SNPs (Single Nucleotide Polymorphisms) and integrates the results with PCR-RFLP and MLST data. In this work, we used sequencing data from a total of 62 T. gondii isolates from various locations around the world. From these sequences, improved data for phylogenetic analysis were generated with the SplitsTree4 software, and population genetics data were generated with the FastStructure tool. In addition, other tools that work in conjunction with the pipeline were developed, making it possible to extract the genomic sequences of the 10 PCR-RFLP markers and the eight introns for MLST that have been used in the literature for genetic analysis of T. gondii. To make these tools available to the research community, we integrated all the software and the instruction sets used in the Perl scripts into a virtual machine, making it possible to perform bioinformatics tasks on any personal computer, regardless of its operating system. For this, we used the multiplatform virtualization software VirtualBox. Implementing these tools will facilitate molecular genetics and population genetics studies of T. gondii. These tools can be easily modified to work with other organisms as needed.
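
As an illustration of how a marker sequence extracted from whole genome data could be compared with a conventional PCR-RFLP result, the following minimal Python sketch performs an in silico restriction digest and reports the fragment lengths. It is not the thesis pipeline, which is described as a set of Perl scripts. The function names `digest` and `rflp_pattern` and both toy sequences are hypothetical and were made up for this example; EcoRI (recognition site GAATTC, cutting after the first G) is used only because its site is well known, not because it is one of the T. gondii markers.

```python
def digest(sequence, site, cut_offset):
    """Cut `sequence` at every occurrence of `site` and return fragment lengths.

    `cut_offset` is the number of bases from the start of the site to the cut
    position on the top strand (1 for EcoRI, G^AATTC). Only the given strand is
    searched, which is sufficient for palindromic recognition sites.
    """
    sequence = sequence.upper()
    site = site.upper()

    # Collect the cut position for every occurrence of the site, left to right.
    cut_positions = []
    start = sequence.find(site)
    while start != -1:
        cut_positions.append(start + cut_offset)
        start = sequence.find(site, start + 1)

    # Turn cut positions into fragment lengths.
    fragments = []
    previous = 0
    for cut in cut_positions:
        fragments.append(cut - previous)
        previous = cut
    fragments.append(len(sequence) - previous)
    return fragments


def rflp_pattern(sequence, site, cut_offset):
    """Return fragment lengths sorted from longest to shortest, as read off a gel."""
    return tuple(sorted(digest(sequence, site, cut_offset), reverse=True))


# Hypothetical toy alleles (not real T. gondii marker sequences).
# The second allele carries a single SNP (A -> G) that destroys the EcoRI site.
allele_with_site = "ATGCGAATTCGGTACCTTAG"
allele_with_snp = "ATGCGAGTTCGGTACCTTAG"

print(rflp_pattern(allele_with_site, "GAATTC", 1))
print(rflp_pattern(allele_with_snp, "GAATTC", 1))
```

In this toy case the first allele is cut into two fragments while the second stays uncut, so one SNP in the recognition site changes the banding pattern. This is the sense in which SNP calls from whole genome data can be mapped back to a PCR-RFLP type.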