ToMAS: Sumarização abstrativa multinível baseada em tópicos usando LLMs
Ano de defesa: | 2025 |
---|---|
Autor(a) principal: | |
Orientador(a): | |
Banca de defesa: | |
Tipo de documento: | Dissertação |
Tipo de acesso: | Acesso aberto |
Idioma: | por |
Instituição de defesa: |
Universidade Federal de São Carlos
Câmpus São Carlos |
Programa de Pós-Graduação: |
Programa de Pós-Graduação em Ciência da Computação - PPGCC
|
Departamento: |
Não Informado pela instituição
|
País: |
Não Informado pela instituição
|
Palavras-chave em Português: | |
Palavras-chave em Inglês: | |
Área do conhecimento CNPq: | |
Link de acesso: | https://repositorio.ufscar.br/handle/20.500.14289/21352 |
Resumo: | Recently, political discussions in Brazil have gained prominence, becoming one of the most debated topics on social media. Factors such as the diversity of themes, the occurrence of events that attract public attention, and the constant increase in the volume of messages have made it challenging to identify in a clear, concise, and objective manner the main topics of the posts. In this context, this study proposes a new automated method that explores the use of automatic topic generation and multi-level summarization techniques using large-scale language models. The proposed method was evaluated to generate summaries from tweets about Brazilian politics collected during the 2022 presidential elections. In addition to the multi-level summarization method, a new summary evaluation method was also proposed, based on the divide and conquer strategy for applying the BERTScore automatic evaluation measure to lengthy texts, which also incorporates the generated sentence size as a weight in the evaluation score. Qualitative and quantitative analyses indicate that the combination of these techniques was able to successfully extract and summarize the main topics, demonstrating great potential to be a useful informative tool in the assessment of different opinions, issues, and topics publicly discussed. |