Inferring the home location of social network users from public data (original title: Inferência da localização de residência de usuários de redes sociais a partir de dados públicos)

Bibliographic details
Year of defence: 2013
Main author: Tatiana Pontes Soares Rocha
Advisor: Not provided by the institution
Examination committee: Not provided by the institution
Document type: Master's dissertation
Access type: Open access
Language: Portuguese (por)
Defending institution: Universidade Federal de Minas Gerais (UFMG)
Graduate programme: Not provided by the institution
Department: Not provided by the institution
Country: Not provided by the institution
Keywords in Portuguese:
Access link: http://hdl.handle.net/1843/ESBF-97GJWS
Abstract: Increasing access to social media, combined with the ease of use of sharing services, has fostered the voluntary generation of a large amount of personal data in these environments. The shared information, which ranges from photos of everyday life to professional associations, can be exploited for various purposes. While these data give users opportunities to strengthen their ties in social networks, they also favour the development of personalised mechanisms and more efficient recommendation strategies. However, the same data can also be manipulated to promote malicious and unwanted viral marketing or to access sensitive information about users. Privacy breaches frequently occur because people are unaware of, or careless about, what they make publicly available. With the rise of location-based services, geographic information adds a further dimension to these data and makes the discussion about privacy even more pressing, since such data can endanger the physical safety of users by allowing them to be tracked. In this dissertation, we explore one of the most popular location-based social networks, Foursquare, to investigate how its members use public system resources (specifically the attributes associated with geographic information). Our characterisation of human behaviour in Foursquare is based on a study of about 13 million users and aims to assess the potential of the system's geographic attributes to act as sources of information leakage. In this context, we propose several inference models that attempt to reveal users' home locations from their publicly available geographic data. Although the models are generic and can produce inferences at various scales, we focus on finer-grained inferences at the city and geographic coordinate levels, which, if successful, pose greater risks to individual privacy.
Our experimental evaluation indicates that the proposed models can easily infer the city where users live with an accuracy of about 78% within a radius of 50 kilometres. At an even finer scale, we correctly infer the coordinates of users' homes with approximately 60% accuracy within a 5-kilometre radius.
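The abstract reports accuracy "within a radius" without stating how it is computed. One plausible reading is sketched below in Python, as an illustration only. It assumes that an inference counts as correct when the great-circle (haversine) distance between the predicted and true home coordinates does not exceed the radius. The function names `haversine_km` and `accuracy_within_radius` are hypothetical and are not taken from the dissertation.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def accuracy_within_radius(predicted, actual, radius_km):
    """Fraction of users whose predicted home is within radius_km of the true home.

    predicted and actual are equal-length lists of (lat, lon) pairs in degrees.
    Assumption: this hit criterion is our reading of the abstract, not a
    definition confirmed by the source.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not predicted:
        return 0.0
    hits = sum(
        1
        for (plat, plon), (alat, alon) in zip(predicted, actual)
        if haversine_km(plat, plon, alat, alon) <= radius_km
    )
    return hits / len(predicted)
```

Under this reading, the two reported figures would correspond to calls such as `accuracy_within_radius(pred, true, 50)` for the city-level result and `accuracy_within_radius(pred, true, 5)` for the coordinate-level result, where `pred` and `true` are placeholder lists of coordinates.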