A guide on extracting and tidying tweets with R

Adams, Julia Bahia; Chiarelli, Carlos Augusto Jardim

Bibliographic details
Main author:	Adams, Julia Bahia
Publication date:	2021
Other authors:	Chiarelli, Carlos Augusto Jardim
Document type:	Article
Language:	eng
Source title:	Cadernos de Linguística
Full text:	https://cadernos.abralin.org/index.php/cadernos/article/view/410
Abstract:	Social media platforms represent a deep resource for academic research and a wide range of untapped possibilities for linguists (D'ARCY; YOUNG, 2012). This rapidly developing field presents various ethical issues and unique challenges regarding methods to retrieve and analyze data. This tutorial provides a straightforward guide to harvesting and tidying Twitter data, focused mainly on the text of Tweets, using the R programming language (R CORE TEAM, 2020) via Twitter's APIs. The R code was developed in Adams (2020), based on the rtweet package (KEARNEY, 2018), and resulted in a working script for corpus compilation. In this tutorial, we discuss limitations, problems, and solutions in our framework for conducting ethical research on this social networking site. Our ethical concerns go beyond what we "agree to" in terms of use and privacy policies; that is, we argue that their content does not cover all the concerns researchers need to attend to. Additionally, our aim is to show that using Twitter as a data source does not require advanced computational skills.

Item metadata

id	ABRALIN_383b26ebe3e3a86dfe39ccd5f30f912a
oai_identifier_str	oai:ojs3.cadernos.abralin.org:article/410
network_acronym_str	ABRALIN
network_name_str	Cadernos de Linguística
repository_id_str
dc.title.none.fl_str_mv	A guide on extracting and tidying tweets with R; Um guia para extração e manipulação de tweets com R
title	A guide on extracting and tidying tweets with R
spellingShingle	A guide on extracting and tidying tweets with R; Adams, Julia Bahia; Metodologia de coleta de dados; Rede social; Ética em pesquisa; Data collection methods; Social media; Research ethics
title_short	A guide on extracting and tidying tweets with R
title_full	A guide on extracting and tidying tweets with R
title_fullStr	A guide on extracting and tidying tweets with R
title_full_unstemmed	A guide on extracting and tidying tweets with R
title_sort	A guide on extracting and tidying tweets with R
author	Adams, Julia Bahia
author_facet	Adams, Julia Bahia; Chiarelli, Carlos Augusto Jardim
author_role	author
author2	Chiarelli, Carlos Augusto Jardim
author2_role	author
dc.contributor.author.fl_str_mv	Adams, Julia Bahia; Chiarelli, Carlos Augusto Jardim
dc.subject.por.fl_str_mv	Metodologia de coleta de dados; Rede social; Ética em pesquisa; Data collection methods; Social media; Research ethics
topic	Metodologia de coleta de dados; Rede social; Ética em pesquisa; Data collection methods; Social media; Research ethics
publishDate	2021
dc.date.none.fl_str_mv	2021-12-03
dc.type.driver.fl_str_mv	info:eu-repo/semantics/article info:eu-repo/semantics/publishedVersion
format	article
status_str	publishedVersion
dc.identifier.uri.fl_str_mv	https://cadernos.abralin.org/index.php/cadernos/article/view/410 10.25189/2675-4916.2021.v2.n4.id410
url	https://cadernos.abralin.org/index.php/cadernos/article/view/410
identifier_str_mv	10.25189/2675-4916.2021.v2.n4.id410
dc.language.iso.fl_str_mv	eng
language	eng
dc.relation.none.fl_str_mv	https://cadernos.abralin.org/index.php/cadernos/article/view/410/592 https://cadernos.abralin.org/index.php/cadernos/article/view/410/621
dc.rights.driver.fl_str_mv	Copyright (c) 2021 Julia Bahia Adams, Carlos Augusto Jardim Chiarelli info:eu-repo/semantics/openAccess
rights_invalid_str_mv	Copyright (c) 2021 Julia Bahia Adams, Carlos Augusto Jardim Chiarelli
eu_rights_str_mv	openAccess
dc.format.none.fl_str_mv	text/xml application/pdf
dc.publisher.none.fl_str_mv	Abralin
publisher.none.fl_str_mv	Abralin
dc.source.none.fl_str_mv	Cadernos de Linguística; Vol. 2 No. 4 (2021); e410 Cadernos de Linguística; Vol. 2 Núm. 4 (2021); e410 Cadernos de Linguística; v. 2 n. 4 (2021); e410 2675-4916 reponame:Cadernos de Linguística instname:Associação Brasileira de Linguística (ABRALIN) instacron:ABRALIN
instname_str	Associação Brasileira de Linguística (ABRALIN)
instacron_str	ABRALIN
institution	ABRALIN
reponame_str	Cadernos de Linguística
collection	Cadernos de Linguística
repository.name.fl_str_mv	Cadernos de Linguística - Associação Brasileira de Linguística (ABRALIN)
repository.mail.fl_str_mv	abralin@abralin.org || cadlin@abralin.org
_version_	1836103765376106496
