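Unsupervised web crawler for the acquisition, enrichment, and prediction of social network user data by means of computational intelligence methods (original title: Rastreador web não supervisionado para aquisição, enriquecimento e predição de dados de usuários de redes sociais por intermédio de métodos de inteligência computacional)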

Vieira Sobrinho, José Luís

Bibliographic details
Year of defense:	2019
Main author:	Vieira Sobrinho, José Luís
Advisor:	Cruz Júnior, Gélson da
Examination committee:	Cruz Júnior, Gélson da; Soares, Fabrizzio Alphonsus Alves de Melo Nunes; Calixto, Wesley Pacheco
Document type:	Master's dissertation
Access type:	Open access
Language:	Portuguese (por)
Institution of defense:	Universidade Federal de Goiás
Graduate program:	Programa de Pós-graduação em Engenharia Elétrica e da Computação (EMC)
Department:	Escola de Engenharia Elétrica, Mecânica e de Computação - EMC (RG)
Country:	Brazil
Keywords in Portuguese:	Inteligência computacional; Rede neural artificial; Rede social; Rastreador; Perceptron de múltiplas camadas; Crawler
Keywords in English:	Soft computing; Artificial neural network; Social network; Multilayer perceptron
CNPq knowledge area:	ENGENHARIAS::ENGENHARIA ELETRICA
Access link:	http://repositorio.bc.ufg.br/tede/handle/tede/9722
Abstract:	Companies are investing heavily in analytical tools to better understand people, customers, and their interests. Social networks, for example, are inexhaustible sources of data about the daily lives of their users. From these networks it is possible to infer tastes, interests, and affinities that perhaps only those who really know the users could identify. As attention to the issue has grown, large players such as Facebook and Instagram have taken steps to protect and enhance the privacy of their customers' data. Even with these measures, however, profiles that are public within these social networks can still be scanned by third parties without the users' consent and without needing access to the network. This task is usually performed by mechanisms called crawlers, which scan Internet pages for data that are then processed and used as raw material to determine user information such as gender, age, location, and interests. The objective of this project is to show how data from an Instagram user can be extracted, normalized, enriched, and stored in a persistence layer using an unsupervised web crawler. The collected information builds a rich database, which is the starting point for analyzing trends and patterns and even for predicting the engagement of future posts. Once the results are available, presenting them in a practical and useful way is a goal as important as the others. Methodologically, the project is divided into the following implementations, mostly developed in Node.js: a web crawler, responsible for fetching data from Instagram pages; a module that extracts, normalizes, enriches, and stores the information in a non-relational database; a neural module, which feeds the collected data into a multilayer perceptron (a traditional implementation of artificial neural networks) to predict the popularity of a given profile's publications; and a visualization module, where all collected data and their analyses are presented in a clear, useful form.
The results are positive and confirm the interest that this topic has aroused. A significant amount of information can be collected, and the neural network predicts the selected users' popularity satisfactorily. Moreover, the report that displays the information is useful and intuitive. In conclusion, the project shows that the proposed objectives can be met in a simple yet sophisticated way.
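
As a rough illustration of the normalization step and the neural module described in the abstract, the following Node.js sketch min-max scales a few post fields and runs them through the forward pass of a one-hidden-layer perceptron. It is not the dissertation's code: the post fields (`likes`, `comments`, `hashtags`), the function names, the network size, and the weights are all hypothetical placeholders, and training is omitted.

```javascript
// Hypothetical post records; these field names are assumptions, not the thesis's schema.
const posts = [
  { likes: 120, comments: 8, hashtags: 3 },
  { likes: 450, comments: 30, hashtags: 10 },
  { likes: 60, comments: 2, hashtags: 0 },
];

// Min-max scale a list of numbers into the range [0, 1].
function minMaxScale(values) {
  const min = Math.min(...values);
  const max = Math.max(...values);
  const range = max - min;
  // A constant column carries no information; map it to 0 to avoid dividing by zero.
  return values.map((v) => (range === 0 ? 0 : (v - min) / range));
}

// Build the feature matrix: one row per record, one scaled column per field.
function buildFeatures(records, fields) {
  const columns = fields.map((f) => minMaxScale(records.map((r) => r[f])));
  return records.map((_, i) => columns.map((col) => col[i]));
}

const sigmoid = (x) => 1 / (1 + Math.exp(-x));

// Forward pass of a perceptron with one hidden layer and a single output.
// weights.hidden holds one row of input weights per hidden unit;
// weights.output holds one weight per hidden unit.
function mlpForward(input, weights) {
  const hidden = weights.hidden.map((row, j) =>
    sigmoid(row.reduce((sum, w, i) => sum + w * input[i], weights.hiddenBias[j]))
  );
  return sigmoid(
    hidden.reduce((sum, h, j) => sum + weights.output[j] * h, weights.outputBias)
  );
}

// Placeholder (untrained) weights for 2 inputs and 2 hidden units.
const exampleWeights = {
  hidden: [
    [0.4, -0.2],
    [0.1, 0.3],
  ],
  hiddenBias: [0, 0],
  output: [0.5, -0.5],
  outputBias: 0,
};

// Score each post from its scaled comment and hashtag counts.
const features = buildFeatures(posts, ["comments", "hashtags"]);
const scores = features.map((row) => mlpForward(row, exampleWeights));
console.log(scores);
```

Each score lies strictly between 0 and 1 and could be read as a normalized popularity estimate once the weights have been trained, for example against the scaled `likes` column.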
