Anonymity and studies of social networks
By sylvie | March 30, 2009
I keep mumbling on about how it’s almost impossible to have a truly anonymous reviewing process in the age of Google. If someone has previously published on the subject of the paper I’m supposed to be reviewing, I’m probably going to come across their work when I do a literature search for other pertinent papers that the authors should cite.
These guys aren’t looking at this issue though. They’re looking at something a lot more delicate: are anonymised data from social networks truly anonymous?
Operators of online social networks are increasingly sharing potentially sensitive information about users and their relationships with advertisers, application developers, and data-mining researchers. Privacy is typically protected by anonymization, i.e., removing names, addresses, etc.
We present a framework for analyzing privacy and anonymity in social networks and develop a new re-identification algorithm targeting anonymized social-network graphs. To demonstrate its effectiveness on real-world networks, we show that a third of the users who can be verified to have accounts on both Twitter, a popular microblogging service, and Flickr, an online photo-sharing site, can be re-identified in the anonymous Twitter graph with only a 12% error rate.
Our de-anonymization algorithm is based purely on the network topology, does not require creation of a large number of dummy “sybil” nodes, is robust to noise and all existing defenses, and works even when the overlap between the target network and the adversary’s auxiliary information is small.
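To make the idea of topology-only re-identification a bit more concrete, here is a toy sketch in Python. It is not the authors’ algorithm, just a simplified illustration of the general approach: start from a few known matches between the anonymised graph and a second, public graph, then repeatedly match further nodes by counting how many of their already-matched neighbours agree. The function name `propagate`, the `min_margin` parameter and the tiny example graphs are all made up for illustration.

```python
def propagate(anon, aux, seeds, min_margin=1):
    """Greedily extend a seed mapping from anonymised nodes to auxiliary nodes.

    anon, aux: dicts mapping each node to the set of its neighbours.
    seeds: dict of anonymised node -> auxiliary node, assumed known in advance.
    min_margin: hypothetical threshold; a match is accepted only if the best
    candidate beats the runner-up by at least this many shared neighbours.
    """
    mapping = dict(seeds)
    changed = True
    while changed:
        changed = False
        used = set(mapping.values())
        for u in sorted(anon):
            if u in mapping:
                continue
            # Score each unused auxiliary node by how many of u's
            # already-mapped neighbours land on one of its neighbours.
            scores = {}
            for n in anon[u]:
                if n in mapping:
                    for cand in aux[mapping[n]]:
                        if cand not in used:
                            scores[cand] = scores.get(cand, 0) + 1
            if not scores:
                continue
            ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
            best, best_score = ranked[0]
            runner_up = ranked[1][1] if len(ranked) > 1 else 0
            # Reject ambiguous matches rather than guess.
            if best_score - runner_up >= min_margin:
                mapping[u] = best
                used.add(best)
                changed = True
    return mapping


# Made-up example: the same five-person network, once with numeric
# "anonymised" IDs and once with public names.
anon = {1: {2, 3, 4}, 2: {1, 3}, 3: {1, 2, 5}, 4: {1}, 5: {3}}
aux = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b", "e"},
    "d": {"a"},
    "e": {"c"},
}
seeds = {1: "a", 3: "c"}  # two identities assumed already known

result = propagate(anon, aux, seeds)
print(result)
```

With only two known identities, the remaining three nodes in this toy network are recovered from the link structure alone, with no names or profile data involved. Real networks are far noisier and the two graphs only partly overlap, which is exactly the harder case the paper claims to handle.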
This is an important ethical issue for all social network researchers. Are we accidentally exposing users’ information to the world in a way that is potentially detrimental to them?
Topics: Social Software, Ethics