Experimental evaluation of a classifier to support fraud detection in public procurement

Bibliographic details
Year of defense: 2022
Lead author: Fontes, Raphael Silva
Advisor: Rodrigues Júnior, Methanias Colaço
Examination committee: Not provided by the institution
Document type: Master's thesis
Access type: Open access
Language: por
Defending institution: Not provided by the institution
Graduate program: Graduate Program in Computer Science
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese:
Keywords in English:
CNPq knowledge area:
Access link: https://ri.ufs.br/jspui/handle/riufs/15098
Abstract: Context: The United Nations (UN) describes corruption as an insidious plague with a wide range of corrosive effects on societies. In practice, corruption takes many forms, from small payments to speed up the granting of licenses to large-scale fraud in bidding processes in different areas of the country. In health, for example, spending on medicines involves a significant volume of resources, about R$ 18 billion in 2018, potentially exposed to conduct harmful to the public purse. In fuel, another area of major impact, persistent debtors, those who fail to pay the tax due, were responsible for R$ 14 billion in tax evasion in 2020. To help combat these problems, Electronic Invoices (NF-es) issued for the purchase of these products need to be classified and subtotaled automatically, based on the products' unique identification codes and descriptions. However, suppliers do not always register the codes correctly. Moreover, the product description, if used as an alternative to the code, is a free-text field with no uniform format. Finally, some products have a hierarchical classification in their descriptions, which is important for complete identification.
Objective: To build and evaluate the effectiveness of a classifier of invoices for fuels and medicines, based on mining the unstructured text of these invoices, in the context of purchases made by public bodies in the states of Sergipe and Rio Grande do Norte and analyzed by the State and Federal Prosecution Offices (MPE; MPF), the Special Action Group to Combat Organized Crime (GAECO) and the State Finance Departments.
Method: After the development and initial parameterization of the classifier, two controlled experiments were carried out with NF-es held by the prosecution offices (MPs), respecting the fiscal secrecy of those involved.
Results: Considering statistical significance, the classifier was able to identify medicine descriptions and their hierarchical subclasses, with the following average results: accuracy of 99.81%, precision of 100%, recall (sensitivity) of 99.64% and F1-measure of 99.82%. For fuels, the classifier reached an accuracy of 100% and an F1-measure of 100%.
Conclusion: The results show that it is feasible to automate the classification of fuels and medicines, enabling investigations. For medicines, it was also possible to extract the hierarchical subclasses of the descriptions, namely: active ingredient, dosage, pharmaceutical form and quantity.
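As a reading aid for the metrics above, the following minimal sketch computes the F1-measure as the harmonic mean of precision and recall, using the average values reported for medicines. The function name `f1_score` is a hypothetical helper introduced here for illustration, not code from the thesis. Since the abstract reports averages across experiments, the thesis may have averaged per-run F1 values rather than deriving F1 from the averaged precision and recall; the sketch only checks that the reported figures are mutually consistent.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        # Conventional fallback: avoid division by zero when both are zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Average values reported in the abstract for medicines (as fractions).
precision = 1.0     # 100%
recall = 0.9964     # 99.64%

f1 = f1_score(precision, recall)
print(f"F1 = {f1:.2%}")  # prints "F1 = 99.82%"
```

The computed value matches the 99.82% F1-measure stated in the results.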