Convnets in the characterization, retrieval, and ranking of cells

Bibliographic details
Year of defense: 2018
Main author: Araújo, Flávio Henrique Duarte de
Advisor: Not provided by the institution
Defense committee: Not provided by the institution
Document type: Thesis
Access type: Open access
Language: por
Defense institution: Not provided by the institution
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese:
Access link: http://www.repositorio.ufc.br/handle/riufc/34765
Abstract: The goal of this work is to investigate and develop computational methods that assist cytopathologists in the analysis of Pap smear slides. First, we developed algorithms to identify abnormal cervical cells and to rank images according to the probability that each image field contains abnormalities. To support the development and evaluation of these algorithms, we created a database of real cell images from the conventional Pap test. Using this database, we trained a Convolutional Neural Network (CNN, or Convnet) to identify abnormal cell regions. A post-processing step based on mathematical morphology then removed false positives. After this step, we calculated the average area of the segmented regions in each image and ranked the images in decreasing order of average area. The images with the highest values were tagged as having a high probability of containing abnormal cells. Experiments confirmed that the proposed method was more accurate (MAP=0.936) and faster (about 4.75 seconds per image) than other algorithms in the literature. As an essential part of this research, we also developed a Content-Based Image Retrieval (CBIR) system that allows cytopathologists to compare new exams with similar existing exams. In this system, we used CNNs for feature extraction, principal component analysis to reduce the signature dimensionality, and locality-sensitive hashing to search for similar images. We also created an application with an intuitive graphical user interface, named pyCBIR. Because we observed that this application also worked well on images whose characteristics differ from those of cervical cells, we evaluated its accuracy and processing time on databases of images from microscopy, microtomography, atomic diffraction patterns, and materials photographs.
According to this evaluation, we recommend pyCBIR as potentially applicable to large volumes of data, such as exam images from the Brazilian health system, because it works with millions of samples. The application can also be used with databases of hundreds or thousands of images, and even with unlabeled databases, since it supports transfer learning.
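
The ranking step described in the abstract (average area of the segmented regions per image, sorted in decreasing order) can be illustrated with a minimal sketch. This is not the thesis code: the function names, the toy masks, and the `min_area` filter are assumptions made for illustration. In particular, the small-region filter is only a simple stand-in for the morphological post-processing, whose exact operations the abstract does not specify, and the binary masks are assumed to come from some segmentation model.

```python
from collections import deque


def region_areas(mask):
    """Return the pixel area of each 4-connected foreground region in a binary mask."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    areas = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                # Breadth-first flood fill over one connected region.
                seen[r][c] = True
                queue, area = deque([(r, c)]), 0
                while queue:
                    y, x = queue.popleft()
                    area += 1
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                areas.append(area)
    return areas


def mean_region_area(mask, min_area=2):
    """Average area of the regions that survive a small-region filter.

    min_area is a hypothetical parameter: dropping tiny regions is a crude
    substitute for the morphological false-positive removal in the abstract.
    """
    areas = [a for a in region_areas(mask) if a >= min_area]
    return sum(areas) / len(areas) if areas else 0.0


def rank_images(masks):
    """Return image names sorted by decreasing average region area."""
    scores = {name: mean_region_area(m) for name, m in masks.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy segmentation masks (1 = pixel predicted as abnormal); purely illustrative.
masks = {
    "a": [[1, 1, 0], [1, 1, 0], [0, 0, 0]],  # one region of area 4
    "b": [[1, 0, 0], [0, 0, 0], [0, 0, 1]],  # two isolated pixels, filtered out
    "c": [[1, 1, 0], [0, 0, 0], [0, 1, 1]],  # two regions of area 2
}
ranking = rank_images(masks)
print(ranking)
```

Images at the top of the returned list would be the ones flagged for review first. A real pipeline would operate on full-resolution masks and would likely use an image-processing library for labeling and morphology rather than a pure-Python flood fill.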