Categorização de documentos a partir de suas citações: um método baseado em redes neurais artificiais [Categorizing documents from their citations: a method based on artificial neural networks]
Year of defense: | 2012 |
---|---|
Main author: | |
Advisor: | |
Defense committee: | |
Document type: | Thesis |
Access type: | Open access |
Language: | por |
Defending institution: | Universidade Federal de Minas Gerais (UFMG) |
Graduate program: | Not provided by the institution |
Department: | Not provided by the institution |
Country: | Not provided by the institution |
Keywords in Portuguese: | |
Access link: | http://hdl.handle.net/1843/ECID-92APE4 |
Abstract: | The automatic organization of large collections of documents becomes more important as the amount of information available in digital form grows. This study contributes to this issue by evaluating the use of Artificial Neural Networks to automatically categorize documents through the analysis of the references they cite. The method developed here generates clusters of documents based on bibliometric concepts. The publications were categorized using citations as the main input, on the premise that common citations indicate relationships among documents. Artificial Neural Networks are typically used to solve problems of function approximation, prediction, classification, categorization and optimization. Many of the experiments reported in the literature describe the use of Self-Organizing Maps (SOM networks) to organize documents for information retrieval. SOM networks were used in this work to categorize documents in a test database. In this categorization process, the semantic relationships among documents were defined not by the identification of terms in common, but by the presence of common references and their years of publication. After the method was validated with a prototype, a database was created containing the references cited in 200 articles published in the journal IEEE Transactions on Neural Networks between 2001 and 2010. The publications were categorized by the Artificial Neural Networks and presented in groups organized by their common citations. The results of three experiments showed that the Artificial Neural Networks successfully identified clusters of authors and texts through their cited references. Analysis of the texts of the publications in each cluster, formed by the automatic categorization of the documents, showed evidence of semantic relationships between the documents. Such clusters can be useful for identifying groups of researchers working in related fields, for identifying research trends in specific domains of knowledge, or for developing or reformulating queries in the information retrieval process. |
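
The idea described in the abstract, grouping documents by the references they share with a Self-Organizing Map, can be illustrated with a small sketch. This is not the thesis's implementation: the abstract does not give the network size, learning schedule, or input encoding, so everything below is an assumption. The document and reference names (`doc_a`, `ref_1`, ...) are hypothetical toy data, the map is a one-dimensional row of two units, and the publication years of the references, which the thesis also uses, are left out for brevity.

```python
import math
import random

# Hypothetical toy data: each document maps to the set of references it cites.
CITATIONS = {
    "doc_a": {"ref_1", "ref_2", "ref_3"},
    "doc_b": {"ref_1", "ref_2"},
    "doc_c": {"ref_1", "ref_2", "ref_3"},
    "doc_d": {"ref_4", "ref_5", "ref_6"},
    "doc_e": {"ref_4", "ref_5"},
    "doc_f": {"ref_4", "ref_5", "ref_6"},
}


def citation_vectors(citations):
    """Encode each document as a binary vector over all cited references."""
    refs = sorted(set().union(*citations.values()))
    return {doc: [1.0 if ref in cited else 0.0 for ref in refs]
            for doc, cited in citations.items()}


def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def best_unit(weights, x):
    """Index of the map unit whose weight vector is closest to x."""
    return min(range(len(weights)), key=lambda i: sq_dist(weights[i], x))


def train_som(samples, n_units=2, epochs=200, lr0=0.5, sigma0=1.0, seed=0):
    """Train a 1-D SOM; learning rate and neighbourhood width decay linearly."""
    rng = random.Random(seed)
    dim = len(samples[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)
        sigma = max(sigma0 * (1.0 - frac), 0.01)
        # Present the samples in a fresh random order each epoch.
        for x in rng.sample(samples, len(samples)):
            bmu = best_unit(weights, x)
            for i, w in enumerate(weights):
                # Gaussian neighbourhood: units near the winner move more.
                h = math.exp(-((i - bmu) ** 2) / (2.0 * sigma ** 2))
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
    return weights


def cluster_documents(citations, **kwargs):
    """Map every document to the index of its best-matching SOM unit."""
    vectors = citation_vectors(citations)
    weights = train_som(list(vectors.values()), **kwargs)
    return {doc: best_unit(weights, vec) for doc, vec in vectors.items()}


clusters = cluster_documents(CITATIONS)
for doc, unit in sorted(clusters.items()):
    print(doc, "-> unit", unit)
```

Documents that land on the same unit are read as one cluster. On real data the map would normally have many more units arranged in a two-dimensional grid, and the sparse citation vectors would likely need weighting or dimensionality reduction before training.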