Classification of public administration documents using artificial intelligence

Bibliographic details
Year of defense: 2024
Main author: Carvalho, Rogerio Rodrigues
Advisor: Costa, Ronaldo Martins da
Defense committee: Costa, Ronaldo Martins da; Souza, Rodrigo Gonçalves de; Silva, Nádia Félix Felipe da
Document type: Dissertation
Access type: Open access
Language: Portuguese (por)
Defending institution: Universidade Federal de Goiás
Graduate program: Programa de Pós-graduação em Ciência da Computação (INF)
Department: Instituto de Informática - INF (RMG)
Country: Brazil
Keywords in Portuguese:
Keywords in English:
CNPq knowledge area:
Access link: http://repositorio.bc.ufg.br/tede/handle/tede/13362
Abstract: Public organizations face difficulties in classifying, and promoting the transparency of, the numerous documents produced in the course of their activities. Correct classification is critical to prevent public access to sensitive information and to protect individuals and organizations from malicious use. This work proposes two approaches to classifying sensitive documents, using state-of-the-art artificial intelligence techniques and best practices found in the literature: a conventional method, which uses artificial intelligence techniques and regular expressions to analyze the textual content of documents, and an alternative method, which employs content-based image retrieval (CBIR) to classify documents when text extraction is not viable. Using real data from the Electronic Information System (SEI) of the Federal University of Goiás (UFG), the results showed that applying regular expressions as a preliminary check can improve the computational efficiency of the classification process, although the gain in classification precision was modest. The conventional method proved effective, with the BERT model standing out with an accuracy of 94%. The alternative method, in turn, offered a viable solution for challenging scenarios, showing promising results with an accuracy of 87% in classifying public documents.