Mining comparative opinions in Portuguese (original title: Mineração de opiniões comparativas em português)

Bibliographic details
Year of defense: 2021
Main author: Daniel Pimentel Kansaon
Advisor: Not provided by the institution
Examination committee: Not provided by the institution
Document type: Master's thesis
Access type: Open access
Language: por (Portuguese)
Defending institution: Universidade Federal de Minas Gerais
Brazil
ICX - DEPARTAMENTO DE CIÊNCIA DA COMPUTAÇÃO
Programa de Pós-Graduação em Ciência da Computação
UFMG
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese:
Access link: http://hdl.handle.net/1843/52746
Abstract: The steady expansion of e-commerce, recently accelerated by the coronavirus pandemic, has led to a sharp increase in online shopping by increasingly demanding customers, who look for comments and reviews on the Web to support their purchase decisions. Part of the opinions found in these reviews are comparisons, which contrast aspects and express a preference for one object over others, allowing companies, for example, to learn how customers compare their products with those of competitors. However, this information is neglected by traditional sentiment analysis techniques, which are not applicable to comparisons because comparisons do not directly express a positive or negative sentiment. Despite efforts for the English language, almost no studies have developed solutions for analyzing comparisons in Portuguese. This work presents one of the first studies on comparative opinions in Portuguese and makes two main contributions. First, it proposes a hierarchical approach for detecting comparisons: an initial binary step separates regular opinions from comparative ones, and a second step categorizes the comparative sentences into five groups of opinions: (1) Non-Comparative; (2) Non-Equal Gradable; (3) Equative; (4) Superlative; and (5) Non-Gradable. The results are promising, reaching a Macro-F1 of 87% and an AUC of 0.94 for the binary step, and a Macro-F1 of 61% for the multiclass classification. Second, it proposes an algorithm to detect the entity expressed as preferred in comparative sentences, reaching a Macro-F1 of 94% for Superlative and almost 84% for Non-Equal Gradable opinions.
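
The two-stage routing described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names and the keyword rules below are hypothetical stand-ins for trained classifiers, and only the class labels come from the abstract. The point it shows is the control flow, where the multiclass step runs only on sentences that the binary step accepts as comparative.

```python
from typing import Callable

# Class labels as listed in the abstract.
NON_COMPARATIVE = "Non-Comparative"
NON_EQUAL_GRADABLE = "Non-Equal Gradable"
EQUATIVE = "Equative"
SUPERLATIVE = "Superlative"
NON_GRADABLE = "Non-Gradable"


def hierarchical_classify(
    sentence: str,
    is_comparative: Callable[[str], bool],
    comparison_type: Callable[[str], str],
) -> str:
    """Stage 1 filters out regular opinions; stage 2 labels the rest."""
    if not is_comparative(sentence):
        return NON_COMPARATIVE
    # Stage 2 may still answer Non-Comparative if stage 1 let a
    # regular opinion through, since it is one of the five classes.
    return comparison_type(sentence)


# Hypothetical keyword rules, used here only so the sketch runs
# without a trained model. They are an assumption, not the method
# evaluated in the thesis.
def toy_is_comparative(sentence: str) -> bool:
    text = f" {sentence.lower()} "
    cues = (" o melhor ", " o pior ", " melhor que ", " pior que ",
            " quanto ", " diferente ")
    return any(cue in text for cue in cues)


def toy_comparison_type(sentence: str) -> str:
    text = f" {sentence.lower()} "
    if " o melhor " in text or " o pior " in text:
        return SUPERLATIVE
    if " tão " in text and " quanto " in text:
        return EQUATIVE
    if " melhor que " in text or " pior que " in text:
        return NON_EQUAL_GRADABLE
    return NON_GRADABLE


def classify(sentence: str) -> str:
    """Convenience wrapper that wires the two toy stages together."""
    return hierarchical_classify(
        sentence, toy_is_comparative, toy_comparison_type
    )
```

In practice each stage would be a model trained on labeled Portuguese sentences; because both stages are passed in as plain callables, either one can be swapped without changing the routing logic.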