Measuring human values through natural language processing (Medindo valores humanos por meio de processamento de linguagem natural)
Year of defense: | 2019 |
---|---|
Main author: | |
Advisor: | |
Defense committee: | |
Document type: | Thesis |
Access type: | Embargoed access |
Language: | por |
Defending institution: |
Universidade Federal da Paraíba
Brazil, Social Psychology, Programa de Pós-Graduação em Psicologia Social, UFPB |
Graduate program: |
Not provided by the institution
|
Department: |
Not provided by the institution
|
Country: |
Not provided by the institution
|
Keywords in Portuguese: | |
Access link: | https://repositorio.ufpb.br/jspui/handle/123456789/20026 |
Abstract: | The study of human values plays a central role in the field of Social Psychology. Human values are defined as abstract characteristics that serve as guiding principles in the lives of individuals. Since the second half of the last century, several theoretical models have been proposed to identify how human values are organized. Among these is the Functionalist Theory of Human Values, which assumes that the structure of human values is defined by two main functions: guiding behavior and cognitively expressing the needs of individuals. The measurement of human values has relied almost exclusively on self-report measures. However, recent technological advances have enabled analysis strategies that make it possible to extract relevant psychological characteristics from quantitative data derived from textual databases, a field known as natural language processing. The present thesis tests the hypothesis that natural language processing is adequate for measuring human values from lexical indicators (words). The thesis is divided into three articles. The first is a theoretical paper that sought to identify the main aspects of the nature of human values that influence their measurement. The second paper used a closed-vocabulary strategy to analyze 33,941 speeches by federal deputies in the Brazilian Legislative Chamber between 2011 and 2014. Here, human values were measured with a predefined vocabulary of words chosen through a judge-based selection process. To develop this vocabulary, an initial set of 100,886 words was reduced to a final list of 24 lexical indicators, four for each subfunction. The results of the second paper showed that the lexical indicators of each subfunction had a higher co-occurrence index with indicators of the same evaluative subfunction than with those of other subfunctions, t(17) = 4.12, p = 0.001.
In addition, the mean test-retest correlation of the evaluative subfunctions over the intervals between 2011 and 2014 was 0.70, indicating the temporal stability of the proposed vocabulary. Finally, multilevel regression analyses showed that gender and party ideology have an effect on the prevalence of lexical value indicators in deputies' speeches. The aim of the third paper was to investigate which language features are most related to different types of basic values, based on the functionalist theory of values. For this purpose, both Linguistic Inquiry and Word Count and Open Vocabulary Differential Language Analysis approaches were used to analyze 1,110,080 tweets from 1,883 participants (80.4% female), who answered the 18 items of the basic values questionnaire. The results showed that each evaluative subfunction had positive associations with language features that support its face validity and point to relationships with behavior previously reported in the literature. Among the negative relationships, language suggestive of negative affect, emotional instability, and personal distress predominated for almost all evaluative subfunctions. The findings suggest that the language of Twitter can be used to characterize the values of individuals. The present thesis is expected to contribute to the measurement of human values via textual data, to complement measures derived from self-report, and to allow the analysis of the natural language databases available to researchers in large volume (e.g., text messages from social media). |
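The closed-vocabulary strategy described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the category names (`subfunction_a`, `subfunction_b`), the example words, and the helper names `tokenize` and `score_text` are hypothetical placeholders, since the abstract does not list the 24 lexical indicators. The sketch assumes a text's score for a subfunction is simply the share of its tokens that appear in that subfunction's word list.

```python
import re
from collections import Counter

# Hypothetical vocabulary: the real 24 indicators are not given in the abstract.
VOCABULARY = {
    "subfunction_a": {"family", "tradition"},
    "subfunction_b": {"success", "power"},
}


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens (Unicode letters only)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def score_text(text, vocabulary=VOCABULARY):
    """Return, per category, the proportion of tokens found in its word list."""
    tokens = tokenize(text)
    total = len(tokens)
    if total == 0:
        # No tokens: every category gets a score of zero.
        return {name: 0.0 for name in vocabulary}
    counts = Counter(tokens)
    return {
        name: sum(counts[word] for word in words) / total
        for name, words in vocabulary.items()
    }


if __name__ == "__main__":
    print(score_text("Family and tradition bring success"))
```

Normalizing by the token count keeps scores comparable across texts of different lengths, which matters when speeches or tweets vary widely in size.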