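A study on the relevance of lexical patterns for text interpretation through information extraction (original title: Um estudo sobre a relevância dos padrões lexicais para a interpretação de textos por meio da extração de informação)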

Bibliographic details
Year of defense: 2006
Main author: Porfirio, Lucielen
Advisor: Bidarra, Jorge
Examining committee: Benites, Sonia Aparecida Lopes; Sella, Aparecida Feola
Document type: Dissertation
Access type: Open access
Language: por
Defending institution: Universidade Estadual do Oeste do Paraná
Graduate program: Programa de Pós-Graduação "Stricto Sensu" em Letras
Department: Linguagem e Sociedade
Country: BR
Keywords in Portuguese:
Keywords in English:
CNPq knowledge area:
Access link: http://tede.unioeste.br:8080/tede/handle/tede/2324
Abstract: Text interpretation is a complex process that depends not only on linguistic aspects but also on cognitive and extralinguistic ones. To interpret a text, a reader must first be able to decode the language and form mental representations of the message the text conveys. To do so, the reader necessarily has to make hypotheses and inferences and activate prior knowledge, whether linguistic or extralinguistic. In addition, the reader must locate the main ideas of the text, which are expressed in the lexical items and in the relations among them. It is therefore reasonable to assume that identifying individual terms in a text and analyzing their actual function in it are both important elements of text interpretation. Several methods can be used to work on text interpretation. Among the most common are answering questions (oral or written) about the content of the text and, more recently, Information Extraction (IE). IE is a method that consists, fundamentally, of identifying and extracting relevant linguistic aspects (lexical, syntactic and conceptual-semantic) for different purposes, such as summarization, categorization and text interpretation. By locating keywords and linguistic structures, the method aims to identify and extract the most important information, which, taken together, may allow the reader to understand the subject under discussion more easily. Assuming that the interactions among lexical items are among the most important elements in text interpretation, the goal of this study is to discuss how the reader could better explore these relations as an aid to interpreting a text. For the analysis, three keywords were tracked in a research corpus in the domain of gastroenterology: intestine, cause and Helicobacter pylori.
Based on the lexical patterns of collocation, colligation and semantic prosody, these words were investigated to observe how the linguistic relations of each one could reveal meanings and help in the interpretation process. As a result, we found that observing the lexical patterns made it possible to extract information about the subject of the texts, as well as important aspects discussed in them, such as diseases and their causes, effects and treatments, even without access to the whole texts.
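The collocation pattern mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the procedure used in the dissertation, which the abstract does not detail: the function name `collocates`, the window size, the simple lowercase tokenizer and the toy sentences are all assumptions made for illustration only. The sketch counts the words that occur within a few tokens of a single-word keyword.

```python
import re
from collections import Counter


def collocates(text, keyword, window=3):
    """Count words found within `window` tokens of each occurrence of `keyword`.

    Hypothetical helper: handles single-word keywords only and ignores
    sentence boundaries, so a window may span two sentences.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != keyword:
            continue
        left = tokens[max(0, i - window):i]      # words before the keyword
        right = tokens[i + 1:i + 1 + window]     # words after the keyword
        counts.update(left + right)
    return counts


# Toy sentences invented for this example, not taken from the research corpus.
sample = (
    "Infection can cause gastritis. "
    "Some drugs cause ulcers in the stomach. "
    "Stress does not cause ulcers directly."
)

result = collocates(sample, "cause", window=2)
print(result.most_common(3))
```

A multi-word keyword such as Helicobacter pylori would need the tokens to be matched as a sequence, and colligation or semantic prosody would need grammatical tagging and manual interpretation, which this sketch does not attempt.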