Bibliographic details
Year of defense: |
2016 |
Main author: |
Cunha, Holisson Soares da
 |
Advisor: |
Ruiz, Duncan Dubugras Alcoba
 |
Defense committee: |
Not provided by the institution |
Document type: |
Master's dissertation
|
Access type: |
Open access |
Language: |
por |
Degree-granting institution: |
Pontifícia Universidade Católica do Rio Grande do Sul
|
Graduate program: |
Programa de Pós-Graduação em Ciência da Computação
|
Department: |
Escola Politécnica
|
Country: |
Brazil
|
Keywords in Portuguese: |
|
Keywords in English: |
|
CNPq knowledge area: |
|
Access link: |
https://tede2.pucrs.br/tede2/handle/tede/10483
|
Abstract: |
Every day, millions of users share messages on Twitter, producing a huge volume of opinionated content on topics of interest to society. Beyond its volume, Twitter behaves as a data stream: new messages arrive in real time, at high speed, and with a nonstationary distribution. Because of these characteristics, recent research in Sentiment Analysis has treated Twitter as an online classification task, subject to time and memory constraints and to the need to adapt to changes in the data distribution. This phenomenon, called concept drift, arises from changes in the distribution that generates new data within the stream and directly affects the algorithm's ability to generalize. Sentiment Analysis also introduces a special kind of challenge, called feature drift: new relevant attributes appear along the stream and known attributes may become irrelevant, which suggests the use of a dynamic feature space. To address these challenges, this work proposes SENTIMENTSTREAM, a dynamic ensemble classifier that incrementally processes and analyzes new instances along the stream. Specialized for Twitter data, SENTIMENTSTREAM has two main components: (i) a concept drift detector, able to detect and react efficiently to abrupt changes in the data distribution, and (ii) a feature drift detector, which uses an automatic strategy to monitor and identify potential changes in the attribute space. Experiments with real Twitter data indicate that SENTIMENTSTREAM is effective both for tweet classification and for handling potential changes in the data distribution. |
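
Illustrative sketch (not part of the original record): the abstract does not say which detection technique SENTIMENTSTREAM uses, so the Python code below only illustrates the general idea of reacting to an abrupt concept drift by monitoring a classifier's online error rate. It is a simplified detector in the style of the Drift Detection Method (DDM), swapped in here as an assumption. The names `ErrorRateDriftDetector` and `simulate`, the thresholds, and the synthetic error stream are all hypothetical.

```python
import math


class ErrorRateDriftDetector:
    """Simplified DDM-style detector over a stream of 0/1 prediction errors.

    Assumption: this is a generic illustration, not the detector from the thesis.
    """

    def __init__(self, min_samples=30, drift_factor=3.0):
        self.min_samples = min_samples    # instances to observe before testing
        self.drift_factor = drift_factor  # deviations above the best level
        self.reset()

    def reset(self):
        self.n = 0
        self.errors = 0
        self.best_p = float("inf")  # error rate at the lowest p + s seen so far
        self.best_s = float("inf")  # its standard deviation

    def update(self, error):
        """Feed 1 for a misclassified instance, 0 otherwise; return True on drift."""
        self.n += 1
        self.errors += error
        p = self.errors / self.n               # running error rate
        s = math.sqrt(p * (1.0 - p) / self.n)  # binomial standard deviation
        if self.n < self.min_samples:
            return False
        if p + s < self.best_p + self.best_s:
            self.best_p, self.best_s = p, s
        if p + s > self.best_p + self.drift_factor * self.best_s:
            self.reset()  # react to the drift by forgetting old statistics
            return True
        return False


def simulate(change_at=200, length=400):
    """Synthetic stream: 1 error in 10 before `change_at`, 1 in 2 afterwards."""
    detector = ErrorRateDriftDetector()
    detections = []
    for t in range(length):
        period = 10 if t < change_at else 2
        error = 1 if t % period == 0 else 0
        if detector.update(error):
            detections.append(t)
    return detections


if __name__ == "__main__":
    print(simulate())
```

In this synthetic run the detector should stay silent while the error rate is stable and flag a drift shortly after instance 200. In a streaming classifier, a detection would typically trigger retraining or replacement of the affected model. A feature drift detector would additionally have to track how relevant each attribute remains, which this sketch does not attempt.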