Query expansion over biomedical terminologies: a comparison of knowledge representation artifacts for Information Retrieval

Bibliographic details
Year of defense: 2020
Main author: Eduardo Ribeiro Felipe
Advisor: Not provided by the institution
Examination committee: Not provided by the institution
Document type: Thesis
Access type: Open access
Language: por
Defending institution: Universidade Federal de Minas Gerais
Brazil
ECI - ESCOLA DE CIENCIA DA INFORMAÇÃO
Programa de Pós-Graduação em Gestão e Organização do Conhecimento
UFMG
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese:
Access link: http://hdl.handle.net/1843/34313
https://orcid.org/0000-0003-1690-2044
Abstract: Query expansion is a technique that extends the representational capacity of the original query by adding related terms, in order to increase the syntactic correspondence between the document and the query. The technique can be applied to controlled vocabularies of all types. This thesis uses clinical terminologies to study the possibilities of query expansion in the Information Retrieval (IR) of scientific articles. The general objective is to provide a comparison between knowledge representation artifacts for information retrieval. Although certain terminologies may belong to the same domain of knowledge, their features are organized according to different models. MeSH uses traditional Knowledge Organization structures, reflecting its origin in Librarianship, whereas SNOMED CT uses formal constructs, namely ontological axioms, to define terms and relationships. Although much of current practice and literature points to IR based on statistical techniques as the best solution, there are also indications that justify the use of specialized terminologies. This perception led the present work to seek evidence of such possibilities through a case study comparing two medical terminologies in the retrieval of scientific articles. Preliminary questions included whether the use of a terminology could increase document recall, and how much the results would differ when different terminologies from the same domain are applied to the same data. To answer these and other questions, software was built to run queries and collect the qualitative results from the two vocabularies already mentioned. Methodologically, the work addresses, through a case study, the capture and structuring of biomedical terminologies, the acquisition and pre-processing of medical scientific articles, and the design of an algorithm capable of running queries built from terms common to both terminologies.
In terms of results, the findings point to greater recall for the MeSH terminology. The comparative analysis made it possible to identify the main influences on this result: a) the number of words per term, b) the syntactic representation and c) the possibilities of terminological structuring. These principles support suggestions of good practices - in the context of IR - for the scientific community that develops and maintains such artifacts. As additional contributions, beyond the software developed, the discussions are relevant to Information Science (IS), in a context where the publication of scientific articles has increased significantly and terminologies - artifacts developed within IS - can provide a differentiated model for information retrieval.
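As a rough illustration of the technique the abstract describes, the following minimal Python sketch shows how adding related terms from a terminology can raise recall. It is not the software developed in the thesis: the toy terminology, the documents, the relevance judgments, and the function names `expand_query`, `retrieve`, and `recall` are all hypothetical, and the entries are not taken from MeSH or SNOMED CT.

```python
import re

# Hypothetical toy terminology: preferred term -> related terms (synonyms).
# Illustrative only; these entries are not taken from MeSH or SNOMED CT.
TOY_TERMINOLOGY = {
    "myocardial infarction": ["heart attack", "cardiac infarction"],
}


def expand_query(query, terminology):
    """Return the normalized query plus any related terms listed for it."""
    key = query.strip().lower()
    return [key] + [term.lower() for term in terminology.get(key, [])]


def retrieve(terms, documents):
    """Return ids of documents that contain at least one term as a whole phrase."""
    patterns = [re.compile(r"\b" + re.escape(term) + r"\b") for term in terms]
    return {
        doc_id
        for doc_id, text in documents.items()
        if any(pattern.search(text.lower()) for pattern in patterns)
    }


def recall(retrieved, relevant):
    """Fraction of the relevant documents that were retrieved."""
    return len(retrieved & relevant) / len(relevant) if relevant else 0.0


# Made-up documents and relevance judgments for the example query.
DOCS = {
    1: "Outcomes after myocardial infarction in elderly patients.",
    2: "Risk factors for heart attack among smokers.",
    3: "A survey of indexing practices in libraries.",
}
RELEVANT = {1, 2}

plain = retrieve(["myocardial infarction"], DOCS)
expanded = retrieve(expand_query("Myocardial Infarction", TOY_TERMINOLOGY), DOCS)

print("recall without expansion:", recall(plain, RELEVANT))
print("recall with expansion:", recall(expanded, RELEVANT))
```

In this toy setting the unexpanded query matches only the document that uses the preferred term, while the expanded query also matches the document that uses a synonym. Matching is purely syntactic (whole-phrase string matching), which is why properties such as the number of words per term can affect how often a term matches.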