Development of ensemble classifiers and of a system for identifying HIV-1 resistance to antiretrovirals
Defense year: | 2018 |
---|---|
Main author: | |
Advisor: | |
Defense committee: | |
Document type: | Thesis |
Access type: | Open access |
Language: | por |
Defense institution: |
Universidade Federal do Rio de Janeiro
Brasil Instituto Alberto Luiz Coimbra de Pós-Graduação e Pesquisa de Engenharia Programa de Pós-Graduação em Engenharia Biomédica UFRJ |
Graduate program: |
Not informed by the institution
|
Department: |
Not informed by the institution
|
Country: |
Not informed by the institution
|
Keywords in Portuguese: | |
Access link: | http://hdl.handle.net/11422/12052 |
Abstract: | Many genotypic interpretation algorithms have been developed to detect HIV resistance to antiretrovirals (ARVs). However, these systems disagree in their classifications and therefore generate different predictions of the therapeutic response. In clinical practice, genotypic assays are performed by Sanger sequencing, a technique with limited sensitivity that detects only HIV variants present in more than 15-20% of the viral population. Newer DNA sequencing techniques, such as next-generation sequencing (NGS), have been used in HIV genotypic resistance assays. These techniques can identify HIV-1 drug resistance mutations present at frequencies too low to be detected by current HIV-1 genotyping. This study aimed to develop ensemble classifiers from interpretation algorithms and to implement an integrated environment capable of identifying HIV-1 resistance mutations and levels of susceptibility to ARVs from raw NGS data. Three strategies were used to develop the ensemble classifiers: majority voting (MV), selection of the best genotypic interpretation system (MS), and stacking, with naïve Bayes (NB) and k-NN as meta-classifiers. In general, NB and MS obtained the best results, with NB showing statistically superior performance to at least one of the other three strategies for four drugs. The integrated environment, called SIRA-HIV, was implemented in the R language. The system performs a complete evaluation of the NGS data, providing the user with a list of the amino acids found in the analyzed regions and their frequencies, as well as the HIV-1 resistance classification to ARVs according to two cut-offs. |
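
The majority voting (MV) strategy named in the abstract can be illustrated with a minimal sketch. This is not the thesis code: the function name `majority_vote`, the labels `S`/`I`/`R` (susceptible, intermediate, resistant), the system names, and the tie-break rule are all assumptions made for illustration, and Python is used here although SIRA-HIV itself was implemented in R.

```python
from collections import Counter


def majority_vote(predictions, tie_break="R"):
    """Return the label predicted by the most systems.

    Assumption: on a tie, prefer `tie_break` if it is among the tied
    labels, otherwise fall back to the alphabetically first tied label.
    """
    counts = Counter(predictions)
    top = max(counts.values())
    leaders = sorted(label for label, n in counts.items() if n == top)
    if len(leaders) == 1:
        return leaders[0]
    return tie_break if tie_break in leaders else leaders[0]


# Hypothetical calls from three interpretation systems for one
# sequence and one drug (names and labels are illustrative only).
calls = {"system_a": "R", "system_b": "S", "system_c": "R"}
result = majority_vote(list(calls.values()))
print(result)
```

Resolving ties toward the resistant label is only one possible convention; the abstract does not say how ties were handled.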