Towards Cyberbullying Detection: Building, Benchmarking and Longitudinal Analysis of Aggressiveness and Conflicts/Attacks Datasets from Twitter

Bibliographic details
Main author: Ferreira, Paula
Publication date: 2024
Other authors: Salgado Pereira, Nádia; Rosa, Hugo; Oliveira, Sofia; Coheur, Luísa; Francisco, Sofia; Souza, Sidclay B.; Ribeiro, Ricardo; Carvalho, João P.; Paulino, Paula; Trancoso, Isabel; Veiga Simão, Ana
Document type: Article
Language: eng
Source title: Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
Full text: http://hdl.handle.net/10400.5/98143
Abstract: Offense and hate speech are sources of online conflict that have become common on social media and, as such, their study is a growing topic of research in machine learning and natural language processing. This article presents two Portuguese-language offense-related datasets that deepen the study of the subject: an Aggressiveness dataset and a Conflicts/Attacks dataset. While the former is similar to other offense-detection datasets, the latter is novel because it uses the history of the interaction between users. Several studies were carried out to construct and analyze the data in the datasets. The first study gathered expressions of verbal aggression witnessed by adolescents to guide data extraction for the datasets. The second study extracted data from Twitter (in Portuguese) that matched the most frequent expressions, words, and sentences identified in the first study. The third study consisted of the development of the Aggressiveness dataset, the Conflicts/Attacks dataset, and classification models. The fourth study examined whether online aggression and conflicts/attacks revealed any trend changes over time in a sample of 86 adolescents. It also investigated whether the number of tweets sent over a period of 273 days was related to online aggression and conflicts/attacks. Lastly, we analyzed the percentage of participants who took part in the aggressions and/or conflicts/attacks.
id RCAP_a9a404824c7772257ca7d7a873769a3a
oai_identifier_str oai:repositorio.ulisboa.pt:10400.5/98143
network_acronym_str RCAP
network_name_str Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
repository_id_str https://opendoar.ac.uk/repository/7160
dc.contributor.none.fl_str_mv Repositório da Universidade de Lisboa
dc.contributor.author.fl_str_mv Ferreira, Paula
Salgado Pereira, Nádia
Rosa, Hugo
Oliveira, Sofia
Coheur, Luísa
Francisco, Sofia
Souza, Sidclay B.
Ribeiro, Ricardo
Carvalho, João P.
Paulino, Paula
Trancoso, Isabel
Veiga Simão, Ana
dc.subject.por.fl_str_mv Aggression
Offense
Hate speech
Social networks
Natural language processing
Dataset
publishDate 2024
dc.date.none.fl_str_mv 2024
2024-01-01T00:00:00Z
2025-02-06T09:42:22Z
dc.type.status.fl_str_mv info:eu-repo/semantics/publishedVersion
dc.type.driver.fl_str_mv info:eu-repo/semantics/article
format article
status_str publishedVersion
dc.identifier.uri.fl_str_mv http://hdl.handle.net/10400.5/98143
url http://hdl.handle.net/10400.5/98143
dc.language.iso.fl_str_mv eng
language eng
dc.relation.none.fl_str_mv Ferreira, P., Pereira, N., Rosa, H., Oliveira, S., Coheur, L., Francisco, S., Souza, S., Ribeiro, R., Carvalho, J. P., Paulino, P., Trancoso, I., & Veiga-Simão, A. M. (2024). Towards cyberbullying detection: Building, benchmarking and longitudinal analysis of aggressiveness and conflicts/attacks datasets from Twitter. IEEE Transactions on Affective Computing. https://doi.org/10.1109/TAFFC.2024.3518587
10.1109/TAFFC.2024.3518587
1949-3045
dc.rights.driver.fl_str_mv info:eu-repo/semantics/openAccess
eu_rights_str_mv openAccess
dc.format.none.fl_str_mv application/pdf
dc.publisher.none.fl_str_mv IEEE
publisher.none.fl_str_mv IEEE
dc.source.none.fl_str_mv reponame:Repositórios Científicos de Acesso Aberto de Portugal (RCAAP)
instname:FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
instacron:RCAAP
repository.name.fl_str_mv Repositórios Científicos de Acesso Aberto de Portugal (RCAAP) - FCCN, serviços digitais da FCT – Fundação para a Ciência e a Tecnologia
repository.mail.fl_str_mv info@rcaap.pt