Efficient Bayesian methods for mixture models with genetic applications

Bibliographic details
Year of defense: 2016
Main author: Zuanetti, Daiane Aparecida
Advisor: Milan, Luis Aparecido
Examining committee: Not provided by the institution
Document type: Thesis
Access type: Open access
Language: Portuguese
Institution of defense: Universidade Federal de São Carlos
São Carlos Campus
Graduate program: Programa Interinstitucional de Pós-Graduação em Estatística - PIPGEs
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in English:
CNPq knowledge area:
Access link: https://repositorio.ufscar.br/handle/20.500.14289/8426
Abstract: We propose Bayesian methods for selecting and estimating different types of mixture models that are widely used in Genetics and Molecular Biology. Specifically, we propose data-driven selection and estimation methods for a generalized mixture model, which accommodates the usual (independent) and the first-order (dependent) models in one framework, and for QTL (quantitative trait locus) mapping models for independent and pedigree data. For clustering genes through a mixture model, we propose three nonparametric Bayesian methods: a marginal nested Dirichlet process (NDP), which is able to cluster distributions; a predictive recursion clustering scheme (PRC); and a subset nonparametric Bayesian (SNOB) clustering algorithm for clustering big data. We analyze and compare the performance of the proposed methods and of traditional selection, estimation and clustering procedures on simulated and real data sets. The proposed methods are more flexible, improve the convergence of the algorithms and provide more accurate estimates in many situations. In addition, we propose methods for predicting nonobservable QTL genotypes and missing parents and improve the Mendelian probability of inheritance of nonfounder genotypes using conditional independence structures. We also suggest applying diagnostic measures to check the goodness of fit of QTL mapping models.