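Recuperação da informação através de busca comparada em domínio específico, baseado em expressões multipalavras [Information retrieval through compared search in a specific domain, based on multiword expressions]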
Year of defense: | 2013 |
---|---|
Main author: | |
Advisor: | |
Defense committee: | |
Document type: | Thesis |
Access type: | Open access |
Language: | por |
Defense institution: | Universidade Federal de Minas Gerais (UFMG) |
Graduate program: | Not provided by the institution |
Department: | Not provided by the institution |
Country: | Not provided by the institution |
Keywords in Portuguese: | |
Access link: | http://hdl.handle.net/1843/BUOS-97XFVY |
Abstract: | Searches in databases are normally performed using keywords that the user provides to identify documents. This study proposes an additional alternative that can be added to Information Retrieval Systems (IRS) to assist the user in the information search process. This alternative enables an automated search based on a document that the user supplies as a reference. In this context, the object of study was the extraction of Multiword Expressions (MWE) from the document to serve as search descriptors in a specific corpus. The MWE are obtained by a proposed deterministic method that considers the characteristics of the document's physical structure. Its results are compared with those obtained from thirteen different statistical association measures produced by the Ngram Statistics Package (NSP), which treats the text as a bag of words. The results show that the proposed method provides a better semantic representation of the document, with qualitative gains in the extracted MWE, and that it contributes positively to the results of the compared search. Based on these experiments, we proposed and implemented a prototype of a compared search tool, and we present the results obtained with its use. |