Constructive machine learning and hierarchical multi-label classification applied to molecule generation

Bibliographic details
Year of defense: 2023
Main author: Silva, Rodney
Advisor: Cerri, Ricardo
Defense committee: Not provided by the institution
Document type: Master's thesis
Access type: Open access
Language: por
Defending institution: Universidade Federal de São Carlos
Câmpus São Carlos
Graduate program: Programa de Pós-Graduação em Ciência da Computação - PPGCC
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese:
CNPq knowledge area:
Access link: https://repositorio.ufscar.br/handle/20.500.14289/19103
Abstract: One of the goals of Medicinal Chemistry is to discover new molecules with drug-like characteristics, which is challenging because the search space is discrete, unstructured, and enormous. In recent years, computation has been used as an auxiliary tool in chemical research, and Machine Learning is one of the fields of computer science that has gained visibility and been applied across many areas of knowledge. Machine Learning can be divided into several areas of study, two of which are addressed in this research: Constructive Machine Learning and Hierarchical Multi-label Classification. This work explores how Constructive Machine Learning can learn the intrinsic rules of molecule databases and generate instances with characteristics similar to those of the molecules they contain. The Constructive Machine Learning methods chosen for the study fall into two types: those that use the SMILES molecular representation and those that represent molecules as graphs. Given the different possibilities for evaluating methods and generated molecules, this work proposes the use of hierarchical classification in the evaluation process. A hierarchical classifier previously trained on molecule datasets classifies the generated molecules into a taxonomy, so that their relevance to existing taxonomies can be verified. This work also proposes a measure of dissimilarity between two groups of molecules, the hierarchical distance, which takes into account the taxonomy of the molecules in each group to determine the dissimilarity between them.