ScreenVar - a biclustering-based methodology for evaluating structural variants

Bibliographic details
Year of defense: 2017
Main author: NASCIMENTO JÚNIOR, Francisco do
Advisor: GUIMARÃES, Katia Silva
Examination committee: Not provided by the institution
Document type: Thesis
Access type: Open access
Language: eng
Institution of defense: Universidade Federal de Pernambuco
Graduate program: Programa de Pos Graduacao em Ciencia da Computacao
Department: Not provided by the institution
Country: Brazil
Keywords in Portuguese:
Access link: https://repositorio.ufpe.br/handle/123456789/25375
Abstract: The importance of structural variants as a source of phenotypic variation has grown in recent years. At the same time, the number of tools that detect structural variants using Next-Generation Sequencing (NGS) has increased considerably, following the dramatic drop in the cost of sequencing over the last ten years. Properly evaluating the detected structural variants has therefore become a prominent concern because of the uncertainty surrounding such alterations, with important implications for researchers and clinicians who scrutinize the human genome. These trends have raised interest in careful procedures for assessing the output of variant calling tools. Here, we characterize the technical details of structural variant detection that can affect the accuracy of detection methods, and we discuss the most important caveats of the tool evaluation process. This study emphasizes common assumptions, a variety of possible limitations, and valuable insights drawn from state-of-the-art CNV (Copy Number Variation) detection tools. Among these points, a frequently mentioned and extremely important one is the lack of a gold standard for structural variants and its impact on the evaluation of existing detection tools. Next, this document describes a biclustering-based methodology that screens a collection of structural variants and provides a set of reliable events that is supported by different studies, based on a defined equivalence criterion. Finally, we carry out experiments with the proposed methodology using the Database of Genomic Variants (DGV) as input data. We found relevant groups of equivalent variants across different studies. In summary, this thesis shows that there is an alternative approach to the open problem of the lack of a gold standard for evaluating structural variants.
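
The idea of screening variants for support across studies can be illustrated with a minimal sketch. This is not the biclustering procedure of the thesis, and the abstract does not state which equivalence criterion is used. The sketch assumes a reciprocal-overlap criterion with a 50% threshold, half-open coordinates, and a simple pairwise grouping; the names `Variant`, `reciprocal_overlap`, `group_equivalent`, and `reliable_groups` are hypothetical and chosen only for illustration.

```python
from collections import namedtuple

# Hypothetical record for one reported variant; coordinates are half-open [start, end).
Variant = namedtuple("Variant", "study chrom start end kind")


def reciprocal_overlap(a, b):
    """Shared length divided by the longer variant's length (0.0 if not comparable)."""
    if a.chrom != b.chrom or a.kind != b.kind:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start)
    if shared <= 0:
        return 0.0
    return shared / max(a.end - a.start, b.end - b.start)


def group_equivalent(variants, threshold=0.5):
    """Group variants linked by pairwise equivalence (assumed criterion), via union-find."""
    parent = list(range(len(variants)))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(variants)):
        for j in range(i + 1, len(variants)):
            if reciprocal_overlap(variants[i], variants[j]) >= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i, variant in enumerate(variants):
        groups.setdefault(find(i), []).append(variant)
    return list(groups.values())


def reliable_groups(variants, min_studies=2, threshold=0.5):
    """Keep only groups whose members come from at least `min_studies` distinct studies."""
    return [
        group
        for group in group_equivalent(variants, threshold)
        if len({v.study for v in group}) >= min_studies
    ]
```

Given variants from several studies, `reliable_groups(variants)` returns only the groups reported by at least two of them. The quadratic pairwise comparison is acceptable for a small illustration; a collection the size of DGV would need an interval index or a sort-and-sweep pass instead.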