Implementation of faster R-CNN applied to the datasets COCO and PASCAL VOC

Detalhes bibliográficos
Ano de defesa: 2019
Autor(a) principal: Pinto, Pedro de Carvalho Cayres
Orientador(a): Não Informado pela instituição
Banca de defesa: Não Informado pela instituição
Tipo de documento: Dissertação
Tipo de acesso: Acesso aberto
Idioma: eng
Instituição de defesa: Universidade Federal do Rio de Janeiro
Brasil
Instituto Alberto Luiz Coimbra de Pós-Graduação e Pesquisa de Engenharia
Programa de Pós-Graduação em Engenharia Elétrica
UFRJ
Programa de Pós-Graduação: Não Informado pela instituição
Departamento: Não Informado pela instituição
País: Não Informado pela instituição
Palavras-chave em Português:
Link de acesso: http://hdl.handle.net/11422/21714
Resumo: This dissertation presents implementations of two object detection systems, Faster R-CNN and Faster R-CNN with FPN, based on convolutional neural net- works. There is a brief introduction to machine learning, followed by an explanation of the image classification task, where the VGG-16 and ResNet-101 architectures are presented, as well as detailed explanations of the object detection task and the meth- ods that led to the development of Faster R-CNN. Next, the implementation of the algorithms is discussed thoroughly, specifying the parameters and the framework used to build the networks, and mentioning differences with the original. Then, three experiments are performed, using the COCO and PASCAL VOC datasets for training and testing, and the results, on the mean average precision (mAP) metric, are compared with the original counterparts of the methods. The obtained results are discussed and some considerations are made about the inference time of the im- plementations. Finally, detection examples of the most accurate implementation are presented. The FPN detector achieved 38.1% mAP@[.5, .95] and 61.1% mAP@0.5 on the COCO test-dev set (a more recent model, RetinaNet with ResNeXt-101-FPN, achieves 40.8% mAP@[.5, .95] and 61.1% mAP@0.5 on the COCO test-dev set). The code is available at: https://gitlab.com/pedrocayres/faster_rcnn_pytorch.