Agrupamento e categorização de documentos jurídicos (Clustering and categorization of legal documents)

Bibliographic details
Year of defense: 2011
Main author: Furquim, Luis Otávio de Colla
Advisor: Lima, Vera Lúcia Strube de
Examination committee: Not provided by the institution
Document type: Dissertation
Access type: Open access
Language: por
Institution of defense: Pontifícia Universidade Católica do Rio Grande do Sul
Graduate program: Programa de Pós-Graduação em Ciência da Computação
Department: Faculdade de Informática
Country: BR
Keywords in Portuguese:
CNPq knowledge area:
Access link: http://tede2.pucrs.br/tede2/handle/tede/5181
Abstract: In this work we study the use of machine learning (clustering and classification) for searching judicial decisions within electronic legal proceedings. We discuss and develop alternatives for clustering precedents, automatically generating classes that are then used to categorize new documents when a user attaches them to an electronic legal proceeding. A modified version of the TClus algorithm, by Aggarwal, Gates and Yu, was selected as the use example: we propose removing its document and cluster discarding features and adding a cluster division feature. We introduce a new paradigm, "bag of terms and law references", in place of "bag of words": attributes are generated using two thesauri, from the Brazilian Federal Senate and the Brazilian Federal Justice, to detect legal terms, and regular expressions to detect law references. In our use example, we build a corpus of precedents from the 4th Region's Federal Court. The clustering results were evaluated with the Relative Hardness Measure and the p-Measure, and these results were then tested with Wilcoxon's Signed-ranks Test and the Count of Wins and Losses Test to determine their significance. The categorization results were evaluated by human specialists. The analysis and discussion of these results cover comparisons of true/false positives against document similarity with the centroid, the number of documents in the clusters, the number and type of attributes in the centroids, and cluster cohesion. We also discuss attribute generation and its implications for the classification results.
Contributions of this work: we confirmed that machine learning techniques can be used in judicial decision search; we developed an evolution of the TClus algorithm by removing its document and group discarding features and creating a group division feature; and we proposed a new paradigm called bag of terms and law references. These were evaluated through a prototype of the proposed process in a use case, with automatic evaluation in the clustering phase and human specialist evaluation in the categorization phase.
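The "bag of terms and law references" idea described in the abstract can be illustrated with a minimal sketch. This is not the dissertation's implementation: the term list `LEGAL_TERMS` is a hypothetical stand-in for the two thesauri, the pattern `LAW_REF` is an assumed and simplified regular expression for references such as "Lei 8.112/90", and the function name `extract_attributes` is invented for illustration.

```python
import re
from collections import Counter

# Hypothetical stand-in for thesaurus entries; the real thesauri from the
# Brazilian Federal Senate and Federal Justice are not reproduced here.
LEGAL_TERMS = ["habeas corpus", "mandado de segurança", "prescrição"]

# Assumed, simplified pattern for law references such as
# "Lei 8.112/90" or "Lei nº 9.099/1995".
LAW_REF = re.compile(
    r"\blei\s+(?:n[ºo°]\.?\s*)?(\d+(?:\.\d{3})*)/(\d{2,4})",
    re.IGNORECASE,
)


def extract_attributes(text):
    """Return a Counter of legal terms and law references found in text."""
    lowered = text.lower()
    counts = Counter()

    # Count whole-phrase occurrences of each known legal term.
    for term in LEGAL_TERMS:
        pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
        hits = len(re.findall(pattern, lowered))
        if hits:
            counts["term:" + term] = hits

    # Normalize each law reference to "law:<number>/<year>",
    # dropping the thousands separator in the law number.
    for number, year in LAW_REF.findall(lowered):
        counts["law:" + number.replace(".", "") + "/" + year] += 1

    return counts
```

Calling `extract_attributes` on the text of a decision yields a sparse attribute vector keyed by legal terms and normalized law references, so ordinary words that appear in neither source contribute nothing. Under this sketch's assumptions, such vectors would be the input to the clustering step.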