Bibliographic details
Year of defense: 2023
Main author: Ivamoto, Victor Soares
Advisor: Not informed by the institution
Examining committee: Not informed by the institution
Document type: Dissertation
Access type: Open access
Language: English (eng)
Defending institution: Biblioteca Digital de Teses e Dissertações da USP
Graduate program: Not informed by the institution
Department: Not informed by the institution
Country: Not informed by the institution
Keywords in Portuguese: (none provided)
Access link: https://www.teses.usp.br/teses/disponiveis/100/100131/tde-06022024-184049/
Abstract:
Face image inpainting is a challenging computer vision problem with several practical uses and is employed in many image preprocessing applications. In recent years, deep learning has achieved major breakthroughs in image inpainting, and the impressive results obtained by Generative Adversarial Networks in image processing have drawn growing attention from the scientific community to facial inpainting. Recent architectural developments include two-stage networks that follow a coarse-to-fine approach and use landmarks, semantic segmentation maps, and edge maps to guide the inpainting process. In addition, improved convolutions enlarge the receptive field and filter the values passed to the next layer, while attention layers create relationships between local and distant information. The objective of this project is to evaluate and compare the efficacy of the baseline models identified in the literature on face datasets with three occlusion types. To this end, a comparative study was performed among the baseline models to identify the advantages and disadvantages of each model on three types of facial occlusion. A literature review gathered the baseline models, face datasets, occlusions, and evaluation metrics. The baseline models were DF1, DF2, EdgeConnect, GLCI, GMCNN, PConv, and SIIGM. The datasets were CelebA and CelebA-HQ. The occlusions were a square mask centered in the image, an irregular-shape mask, and a facial mask (MTF). The evaluation metrics were PSNR, SSIM, and LPIPS. The comparative study consisted of two experiments: one training the baseline models from scratch and the other using pretrained models. Both experiments followed the same testing procedure. GMCNN achieved the best overall results with the square mask, DF2 was the best model with the irregular mask, and both models performed best with the MTF mask. PConv performed poorly in all experiments, except with the irregular mask when using the pretrained model.
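The abstract names PSNR among the evaluation metrics used to compare the inpainting models. As an illustration only (this code is not taken from the thesis), PSNR can be computed directly from the mean squared error between the reference image and the restored image; the toy images below are hypothetical stand-ins for real inpainting outputs:

```python
import numpy as np

def psnr(reference, restored, max_value=255.0):
    """Peak Signal-to-Noise Ratio in dB between two images (higher is better)."""
    mse = np.mean((reference.astype(np.float64) - restored.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)

# Toy example: an 8-bit image and a copy perturbed by +/-1 intensity level.
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
noise = rng.integers(-1, 2, size=img.shape)
noisy = np.clip(img.astype(np.int16) + noise, 0, 255).astype(np.uint8)
print(f"PSNR: {psnr(img, noisy):.1f} dB")
```

SSIM and LPIPS, the other two metrics in the study, compare local structure and learned perceptual features respectively rather than raw pixel error, which is why the thesis reports all three.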