Facebook profiles have become the de facto identities of people across the internet. This is thanks, in large part, to Login With Facebook, the social network's universal login API, which allows users to carry their profile information to other apps and websites. You've probably used it to log in to services like Spotify, Airbnb, and Tinder. But sometimes, especially on lesser-known websites, using Facebook's universal login feature may carry security risks, according to new research from Princeton University published Wednesday.
In a study that has yet to be peer-reviewed, published on Freedom To Tinker, a site hosted by Princeton's Center for Information Technology Policy, three researchers document how third-party tracking scripts can scoop up information from Facebook's login API without users knowing. The tracking scripts documented by Steven Englehardt, Gunes Acar, and Arvind Narayanan represent a small slice of the invisible tracking ecosystem that follows users around the web largely without their knowledge.
“We never thought this was possible. It was really surprising,” says Acar. "This is tapping into a social API, which you are not expected to—but this sounds a bit beyond the line."
The researchers found that sometimes when users grant permission for a website to access their Facebook profile, third-party trackers embedded on the site get that data, too. That can include a user's name, email address, age, birthday, and other information, depending on what the original site requested access to. The study found that this particular breed of tracking script is present on 434 of the web's top one million websites, though not all of them are necessarily querying Facebook data from the API; the researchers only confirmed that such a script was present.
Most of the scripts the researchers examined grab a user ID that is unique to that website, as well as the person's name and email. But the problem is, using Facebook's API, you could easily link that unique ID to someone's Facebook profile. For example, a tracker might have registered that Visitor 1 went to a webpage, but with Facebook Login, they could connect that person to their public social media profile. That information can be used to track users across other websites and devices.
After the Princeton researchers published their findings, Facebook said it would suspend the ability to link these site-specific user IDs to Facebook profiles.
“Scraping Facebook user data is in direct violation of our policies. While we are investigating this issue, we have taken immediate action by suspending the ability to link unique user IDs for specific applications to individual Facebook profile pages, and are working to institute additional authentication and rate limiting for Facebook Login profile picture requests,” a Facebook spokesperson said in a statement.
The Princeton researchers identified seven different scripts that are capable of pulling information from Facebook's login API, one of which they couldn't link to a specific company. The remaining scripts were created by marketing and fraud-prevention companies, including Lytics, ProPS, Tealium, Forter, and OnAudience, the last of which stopped collecting information from Facebook's login API following the publication of another third-party tracking study conducted by one of the same researchers in December. In a statement, OnAudience stressed that the platform that had this capability, behavioralengine.com, no longer exists, and that its current platform uses different technology for collecting data. ProPS did not immediately respond to a request for comment.
Adam Corey, the CMO of Tealium, as well as James McDermott, the CEO of Lytics, explain that the Princeton researchers' findings are not as simple as they may appear, in part because the internet's tracking ecosystem is so complicated. First, it's important to understand what these companies, and others like them, actually do. They create software and tracking tools, which websites pay for, that let those sites find out information about their customers. In other words, a site might buy a tracking product from one of these companies, and then use it to suck information out of Facebook's API. But that is not usually how a company intended its tools to be used.