Even Anonymous Users Can Be Identified With Only Two Pieces Of Data From Social Media Apps


If you think you’re evading the constantly tracking eye of the Internet by using throw-away email addresses and obscure screen names to register your social media accounts and other apps, you’re probably wrong. A new study demonstrates how simple it can be to correctly identify someone using otherwise anonymous data.

Columbia University’s Data Science Center and Google teamed up for a recently published study, in which researchers took a large data set from social media apps and stripped it of any names or other indicators that explicitly identify the user.

“Almost every interaction with technology creates digital traces, from the cell tower used to route mobile calls to the vendor recording a credit card transaction; from the photographs we take, to the ‘status updates’ we post online,” reads the study. “The idea that these traces can all be merged and connected is both fascinating and unsettling.”

The researchers in this study only used geolocation data — the information collected when you tag an Instagram photo with “Bob’s Bar” or post a Facebook update from a concert that your friends couldn’t get tickets to.

Previous studies of anonymized data have shown that, for example, it takes as few as four credit card purchases to accurately identify the shopper. But this new report claims that you may only need location data from two social media apps to figure out with a high degree of confidence who an otherwise anonymous user is.

For the study, researchers cooked up an algorithm to compare geotagged Tweets with photos posted on Instagram or check-ins via Foursquare, with the intention of seeing if this data was sufficient to correctly identify users. They did something similar comparing location information for credit card purchases to cell tower pings.

It’s not as simple as looking at a Foursquare check-in and a Tweet and immediately knowing who the person is. Researchers had to account for the imbalance in data sets, since many people post to Twitter more frequently than they post photos on Instagram or use Foursquare. They also had to account for the nature of the data being shared: a Tweet or a Foursquare check-in is more likely to involve something happening right at the time the data is shared, while an Instagram photo might be uploaded hours or even days after it’s taken.
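To make the idea concrete, here is a minimal sketch of how location traces from two services could be compared. It is not the algorithm from the study, which the article does not spell out. It simply counts how often two accounts show up in the same rough place during the same hour. The function names, the grid size (`cell_deg`), and the time window (`window_s`) are all assumptions made up for illustration.

```python
from collections import defaultdict
from math import floor


def bucket(lat, lon, ts, cell_deg=0.01, window_s=3600):
    """Map a (lat, lon, unix timestamp) point to a coarse space-time cell.

    cell_deg and window_s are illustrative defaults, not values from the study.
    """
    return (floor(lat / cell_deg), floor(lon / cell_deg), int(ts // window_s))


def link_accounts(service_a, service_b, cell_deg=0.01, window_s=3600):
    """Guess, for each account in service_a, the most similar account in service_b.

    Both arguments map an account id to a list of (lat, lon, unix_ts) tuples.
    Returns {account_a: (best_account_b, number_of_shared_cells)}.
    """
    # Index service B: which accounts were seen in each space-time cell?
    index = defaultdict(set)
    for acct_b, points in service_b.items():
        for lat, lon, ts in points:
            index[bucket(lat, lon, ts, cell_deg, window_s)].add(acct_b)

    matches = {}
    for acct_a, points in service_a.items():
        # Deduplicate cells so a burst of posts from one spot counts once.
        cells = {bucket(lat, lon, ts, cell_deg, window_s) for lat, lon, ts in points}
        scores = defaultdict(int)
        for cell in cells:
            for acct_b in index.get(cell, ()):
                scores[acct_b] += 1
        if scores:
            best = max(scores, key=scores.get)
            matches[acct_a] = (best, scores[best])
    return matches
```

Counting each cell only once per account is a crude nod to the imbalance problem described above, since a heavy Twitter user would otherwise swamp the score. A fixed one-hour window also would not handle Instagram photos uploaded days later; a real approach would need a wider or weighted time window, and points that fall just across a cell boundary are missed by this simple grid.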

“Many people choose not to identify themselves online,” explains study author Augustin Chaintreau. “If I now tell you that your location data makes you recognizable across all of your accounts, how does that change your behavior? This is a question we now have to answer.”

Chaintreau gives BuzzFeed News a real-world example of how location data could be used to identify an anonymous user.

“[O]n LinkedIn you are likely to use your real name … but maybe you are also using Tinder or some or other application which you would not want linked back to your real name,” he explains. “Using the data in what you have posted, those accounts could be linked, even if in one of them — say Tinder— you believed you were operating in ghost mode.”

The study found that comparing the credit card purchases with cell tower pings provided an even more precise method of identifying an anonymous user.

Adds co-author Chris Riederer, “People are now sharing their location on a growing number of apps, often without realizing it… Companies no longer have to be very sophisticated to access this data and use it for their own purposes.”
