Bibliographic details

Year of defense: 2023
Main author: LIMA, Daniel Lopes Soares
Advisor: PAIVA, Anselmo Cardoso de
Defense committee: PAIVA, Anselmo Cardoso de; CUNHA, António Manuel Trigueiros da Silva; QUINTANILHA, Darlan Bruno Pontes; SILVA, Augusto Marques Ferreira da
Document type: Dissertation (master's)
Access type: Open access
Language: Portuguese (por)
Defending institution: Universidade Federal do Maranhão
Graduate program: PROGRAMA DE PÓS-GRADUAÇÃO EM CIÊNCIA DA COMPUTAÇÃO/CCET
Department: DEPARTAMENTO DE INFORMÁTICA/CCET
Country: Brazil
Keywords in Portuguese:
Keywords in English:
CNPq knowledge area:
Access link: https://tedebc.ufma.br/jspui/handle/tede/4635

Abstract:
Inflammatory bowel diseases have a high incidence in the population and are among the leading causes of hospitalization. Videos obtained through endoscopic capsules are essential for evaluating anomalies in the gastrointestinal tract; however, because they can last up to 10 hours, their analysis demands great attention from the medical specialist. Machine learning techniques have been successfully applied to the development of computer-aided diagnostic systems since the 1990s, and Convolutional Neural Networks (CNNs) have become highly successful for pattern recognition in images. CNNs use convolutions to extract features from the analyzed data, operating in a fixed-size window, and therefore have difficulty capturing pixel-level relationships across the spatial and temporal domains. In contrast, Transformers use attention mechanisms, in which data is structured in a vector space that can aggregate information from adjacent data to determine meaning in a given context. This work proposes a computational method for analyzing images extracted from videos obtained by endoscopic capsules, using a transformer-based model to aid the diagnosis of gastrointestinal tract abnormalities. The proposed methodology was applied to 41,511 WCE images from the Kvasir-Capsule dataset. In the experiments performed for the 11-class classification task, the best results were achieved by the DeiT model, which registered average rates of 99.75% accuracy, 98.17% precision, 98.31% sensitivity, and 98.06% F1-score.
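The record does not include implementation details, but as an illustration of the kind of transformer-based classifier described in the abstract, the following is a minimal sketch of fine-tuning a pretrained DeiT model for an 11-class image classification task using PyTorch and the timm library. The model variant, preprocessing, dataset folder layout, and all hyperparameters are assumptions for illustration only; they are not taken from the dissertation.

# Minimal sketch: fine-tuning a DeiT model for 11-class WCE image classification.
# Library choice (timm), model variant, folder paths, and all hyperparameters are
# assumptions for illustration; they are not taken from the dissertation.
import timm
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

NUM_CLASSES = 11  # number of classes reported in the abstract

# Standard ImageNet-style preprocessing; DeiT models expect 224x224 inputs.
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# Hypothetical folder layout: one subdirectory per class of WCE frames.
train_set = datasets.ImageFolder("data/kvasir_capsule/train", transform=preprocess)
train_loader = DataLoader(train_set, batch_size=32, shuffle=True, num_workers=4)

# Load a pretrained DeiT backbone and replace its head for 11 classes.
model = timm.create_model("deit_base_patch16_224", pretrained=True,
                          num_classes=NUM_CLASSES)
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

model.train()
for epoch in range(10):  # epoch count is arbitrary in this sketch
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()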