SIRILICO - A Proposal for an Information Retrieval System Based on Theories of Computational Linguistics and Ontology
Defense year: | 2005 |
---|---|
Main author: | |
Advisor: | |
Examination committee: | |
Document type: | Thesis |
Access type: | Open access |
Language: | Portuguese (por) |
Defending institution: | Universidade Federal de Minas Gerais (UFMG) |
Graduate program: | Not provided by the institution |
Department: | Not provided by the institution |
Country: | Not provided by the institution |
Keywords in Portuguese: | |
Access link: | http://hdl.handle.net/1843/EARM-7HBND8 |
Abstract: | This work presents studies on the management of electronic documents using a cognitive approach. We propose the automatic generation of indexes for electronic texts written in Brazilian Portuguese, using linguistic theories, theories of computational linguistics, and ontology. The technique used to create the index is based mainly on the theory of Propositional Analysis proposed by Frederiksen (1975): syntactic labels are extracted for the words that compose the documents, semantic labels for those words are generated from them, and a lightweight ontology is then generated automatically. Throughout this work we suggest several contributions to improve the performance of Information Retrieval Systems, using techniques that allow the words of the indexed texts to be placed in context. These contributions include the optimization of syntactic parsers and the automatic generation of lightweight ontologies. Initially, a corpus was created: a small collection of electronic documents about Information Science, written in Brazilian Portuguese and available on the Web. This collection was used to test the prototype. The prototype, named SiRILiCO (Information Retrieval System based on Computational Linguistics Theories and Ontology), was used in a first experiment and later in a further experiment to verify and validate the hypothesis that it is possible to develop and implement an Information Retrieval System based entirely on linguistic theories, theories of computational linguistics, and ontology. The precision and recall results of the SiRILiCO experiments are compared with the results obtained using a vector model. The analysis of the results suggests that the hypothesis is not only feasible but also very promising. |