Classification of capsule endoscopy exam images using transformers

Bibliographic details
Year of defense: 2023
Main author: LIMA, Daniel Lopes Soares
Advisor: PAIVA, Anselmo Cardoso de
Examination committee: PAIVA, Anselmo Cardoso de; CUNHA, António Manuel Trigueiros da Silva; QUINTANILHA, Darlan Bruno Pontes; SILVA, Augusto Marques Ferreira da
Document type: Master's thesis (Dissertação)
Access type: Open access
Language: Portuguese
Defending institution: Universidade Federal do Maranhão
Graduate program: PROGRAMA DE PÓS-GRADUAÇÃO EM CIÊNCIA DA COMPUTAÇÃO/CCET
Department: DEPARTAMENTO DE INFORMÁTICA/CCET
Country: Brazil
Keywords in Portuguese:
Keywords in English:
CNPq knowledge area:
Access link: https://tedebc.ufma.br/jspui/handle/tede/4635
Abstract: Inflammatory bowel diseases have a high incidence in the population and are among the leading causes of hospitalization. Videos obtained through endoscopic capsules are essential for evaluating anomalies in the gastrointestinal tract; however, because they can last up to 10 hours, their analysis demands great attention from the medical specialist. Machine learning techniques have been successfully applied to computer-aided diagnosis systems since the 1990s, and Convolutional Neural Networks (CNNs) have become very successful for pattern recognition in images. CNNs use convolutions to extract features from the analyzed data, operating on a fixed-size window and thus having difficulty capturing pixel-level relationships across the spatial and temporal domains. In contrast, Transformers use attention mechanisms, in which data is structured in a vector space that can aggregate information from adjacent data to determine meaning in a given context. This work proposes a computational method for analyzing images extracted from videos obtained by endoscopic capsules, using a transformer-based model that aids in the diagnosis of gastrointestinal tract abnormalities. The proposed methodology was applied to 41,511 wireless capsule endoscopy (WCE) images from the Kvasir-Capsule dataset. In the experiments performed for the 11-class classification task, the best results were achieved by the DeiT model, with average rates of 99.75% accuracy, 98.17% precision, 98.31% sensitivity, and 98.06% F1-score.
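The attention mechanism that the abstract contrasts with convolutions can be illustrated with a minimal scaled dot-product self-attention sketch in NumPy. This is an illustrative example of the general technique used by transformer models such as DeiT, not code from the dissertation; the shapes and values are made up for demonstration:

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    """Compute softmax(Q K^T / sqrt(d)) V, the core of transformer attention."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)  # pairwise similarity between tokens
    # numerically stable softmax over the key dimension
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

# Hypothetical input: 4 image-patch embeddings of dimension 8
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))

# Self-attention: each patch attends to every patch, so the output for one
# patch aggregates information from all others, weighted by similarity
out, attn = scaled_dot_product_attention(x, x, x)
print(out.shape)          # same shape as the input embeddings
print(attn.sum(axis=-1))  # each row of attention weights sums to 1
```

Unlike a convolution, which only mixes values inside a fixed-size window, every output row here is a weighted combination of all input positions, which is what lets transformers capture long-range relationships between image patches.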