SumOpinions: automatic summarization of opinions about tourist attractions

Bibliographic details
Year of defense: 2018
Main author: Freires Junior, João Holanda
Advisor: Not provided by the institution
Examination committee: Not provided by the institution
Document type: Master's thesis
Access type: Open access
Language: Portuguese (por)
Defending institution: Not provided by the institution
Graduate program: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
Access link: http://www.repositorio.ufc.br/handle/riufc/52200
Abstract: Online travel platforms (e.g., TripAdvisor) have become very popular in recent years because they give easy access to a wide range of opinions drawn from users' previous experiences of tourist places, accommodations, restaurants, services, and so on. Travelers can use this information to plan trips that better match their preferences. However, given the large volume of user opinions, reading and selecting the relevant ones is time-consuming and tiring, which makes it impractical to do manually. In this context, this work proposes a new method for summarizing opinions that detects relevant topics about tourist places, with two goals: to reduce the number of opinions that must be read and to broaden the coverage of relevant issues in the summary. To validate the proposed approach, we collected data from TripAdvisor and applied topic modeling algorithms, natural language processing techniques, machine learning, text similarity, and sentiment analysis to build a summary of the opinions posted by users. The approach was compared experimentally with a state-of-the-art multi-document summarization method. The results were evaluated on three criteria: topic coverage, summary redundancy, and reading difficulty. On topic coverage, the summaries addressed considerably more topics related to the tourist places than those produced by the competing algorithm. On redundancy, the generated summaries showed low redundancy. On reading difficulty, the results were also satisfactory, since the summaries were not difficult to read.