The Facebook/Cambridge Analytica scandal is booming. Phrases like “data breach”, “betrayal of trust” and “selling personal data” are thrown around frivolously. But as with every scandal, there’s little substance.
Most of the time is dedicated to explaining what microtargeting is, what data mining is, what data protection is, and how evil Cambridge Analytica is (and honestly, their CEO bragging about the dirty tricks they offer is really low).
But very little is said about what actually happened and what the underlying problem is.
First, a few misconceptions:
- “Facebook sold our data”: no, Facebook doesn’t sell data. This is not their business model. Their business model is collecting data and using it to give advertisers a better way to target ads. That may be called “selling data”, but they don’t give the actual data to the advertisers.
- “Cambridge Analytica stole our data” or “Facebook leaked our data”: no, the data was public. It was accessible through an API, so they could collect it automatically. Even without an API they could have written a scraper and obtained our data from our public pages; it would have been more tedious and time-consuming, but not impossible. Facebook did give anonymized data to Kogan for research purposes, but you can’t get to the individual there. You can, of course, see statistical trends and dependencies.
- “Cambridge Analytica invented this new extremely powerful microtargeting”: no, microtargeting existed before, and has been used in campaigns in the US before, by both Democrats and Republicans. Cambridge Analytica used new psychometric research that has potential, but we don’t know for sure whether it was successful or not. CA uses it to promote themselves, but it might be just marketing.
What happened is way more prosaic: Facebook was grossly negligent in the design of their API, and that made it easy to obtain data about millions of people. On top of that, add a researcher who broke the terms of service and provided the data to a third party. And on top of that, add a third party (CA) with very little integrity.
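To get a feel for how an over-permissive API reaches so many people, here is a toy model in Python. It assumes, as was widely reported about the old API, that an app authorized by one user could also read data about that user's friends. The random graph, the numbers and the `exposed_profiles` helper are all made up for illustration; this is a sketch of the amplification effect, not of Facebook's actual API.

```python
import random


def build_friend_graph(n_users, friends_per_user, seed=42):
    """Build a random, symmetric friendship graph (a made-up stand-in for a social network)."""
    rng = random.Random(seed)
    friends = {user: set() for user in range(n_users)}
    for user in range(n_users):
        for friend in rng.sample(range(n_users), friends_per_user):
            if friend != user:
                friends[user].add(friend)
                friends[friend].add(user)
    return friends


def exposed_profiles(friends, consenting_users):
    """Profiles an app can read if every consenting user also exposes their friends (assumption)."""
    exposed = set(consenting_users)
    for user in consenting_users:
        exposed |= friends[user]
    return exposed


graph = build_friend_graph(n_users=10_000, friends_per_user=20)
consenting = set(range(100))  # only 1% of users install the hypothetical app
exposed = exposed_profiles(graph, consenting)
print(f"{len(consenting)} users consented, {len(exposed)} profiles exposed")
```

Even in this small model, the number of exposed profiles is many times the number of people who actually agreed to anything, which is the core of the design problem.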
Facebook fixed the flaw in their API some years ago, so the same thing cannot happen again. Grilling Mark Zuckerberg now about their API design from several years ago is like your mother finding out you smoked in high school and telling you how bad that is. But you haven’t smoked for years! The whole thing must feel awkward for Facebook, as they solved that particular problem years ago. Of course, fixing it doesn’t automatically delete all the previously collected data.
Some people say “well, if you put something online, then why are you surprised it is used”. And that’s largely true about the personal data that we fill in. Obviously, API or not, our public profile can be seen by humans and machines alike (although I agree that a machine can’t scrape “friends only” details, whereas the API provided that). But that’s the personal data that we’ve put there. There’s also personal data that was collected from us: all our behaviour on Facebook and in other places where Facebook has “eyes”. But guess what: that data was NOT part of the Cambridge Analytica “leak” (I prefer to call it a “dataset”). Business Insider has published a list of what the old API provided. It looks like a lot, but there’s a lot more data that Facebook has about us.
Don’t get me wrong: Facebook has been a very bad data guardian over the years, and allowing unrestricted access to that data is a dumb thing to do, especially given that it doesn’t actually help them with their business model.
But the scandal makes us miss a larger point, again, as with any scandal. And that is: what data is collected about us without our knowledge? You clicking the like button (or any of the other emotions) or writing a comment in a Facebook group is just the tip of the iceberg. How about every page on every website that you visit?
Yes, Facebook has spread its tracking capabilities all over the web: through the “like” plugin, through its “pixel”, through a number of other plugins that website owners install on their websites. Through tracking cookies, Facebook knows your every step (unless you use Firefox, which has an option to block tracking).
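The mechanism behind this kind of tracking is simple enough to simulate in a few lines of Python. The sketch below is a simplification under my own assumptions: the `Tracker` and `Browser` classes, the site names and the blocking switch are hypothetical, and real browsers and plugins are more involved. The idea is only that a widget embedded on many sites sees the same cookie on each of them, so the server behind it can stitch the visits together.

```python
import uuid
from collections import defaultdict


class Tracker:
    """Hypothetical third-party server behind a widget embedded on many sites."""

    def __init__(self):
        self.history = defaultdict(list)  # cookie id -> pages where the widget was loaded

    def serve_widget(self, cookie, embedding_page):
        """Called every time a page that embeds the widget is loaded."""
        if cookie is None:
            cookie = uuid.uuid4().hex  # first contact: hand out a fresh identifier
        self.history[cookie].append(embedding_page)
        return cookie


class Browser:
    """Toy browser that either keeps or refuses the tracker's cookie."""

    def __init__(self, block_third_party_cookies=False):
        self.block = block_third_party_cookies
        self.tracker_cookie = None

    def visit(self, page, tracker):
        sent = None if self.block else self.tracker_cookie
        received = tracker.serve_widget(sent, page)
        if not self.block:
            self.tracker_cookie = received


pages = ["news.example/politics", "shop.example/shoes", "health.example/symptoms"]
tracker = Tracker()
normal, blocking = Browser(), Browser(block_third_party_cookies=True)
for page in pages:
    normal.visit(page, tracker)
    blocking.visit(page, tracker)

for cookie, seen in tracker.history.items():
    print(cookie[:8], seen)
```

The normal browser ends up with one identifier tied to all three pages, while the blocking browser looks like three unrelated one-off visitors.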
This data hasn’t leaked. It’s currently used to target ads at you by Facebook itself. Forget about their stupidity with the lax API permissions. That’s fixed now. What isn’t fixed is the fact that we are profiled by Facebook itself. And even if we aren’t now, we will be. What Cambridge Analytica did will probably be available in the near future as a Facebook-native ad interface. And they will have way more data than Cambridge Analytica.
What to do? Regulate? The European Union has already done that: the GDPR is introducing explicit consent for all personal data that we provide AND that is collected from us. But it has created more uncertainty about how exactly that will play out implementation-wise, as I’ve described before. Will we have a slightly more verbose, but still useless “cookie warning”? Will we have website owners fined for Facebook’s practices? At least Facebook will have to tell us what data they have about us. All of it.
There’s an upcoming ePrivacy regulation that will change how tracking cookies work. But there are already other methods of cookie-less tracking, like canvas fingerprinting. Will the regulation be sufficient?
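For those wondering how tracking can work without cookies at all, here is a rough Python sketch of the fingerprinting idea. In a real browser, a script draws text on a hidden canvas and hashes the resulting pixels, which differ subtly between devices. Python has no browser canvas, so `simulated_canvas_pixels` below is a hypothetical stand-in that fakes those differences from made-up device traits; only the hashing step resembles the real thing.

```python
import hashlib


def simulated_canvas_pixels(device):
    """Stand-in for drawing on a hidden canvas and reading the pixels back.

    Real rendering varies with GPU, driver and installed fonts; here that is
    faked by deriving the bytes from made-up device traits (an assumption).
    """
    traits = "|".join(f"{key}={device[key]}" for key in sorted(device))
    return traits.encode("utf-8")


def fingerprint(device):
    """Hash the rendered bytes into a short, stable identifier."""
    return hashlib.sha256(simulated_canvas_pixels(device)).hexdigest()[:16]


laptop = {"gpu": "vendor-a", "fonts": "set-1", "os": "linux"}
phone = {"gpu": "vendor-b", "fonts": "set-2", "os": "android"}

first_visit = fingerprint(laptop)
later_visit = fingerprint(laptop)  # cookies cleared in between; nothing was stored client-side
print(first_visit == later_visit, first_visit != fingerprint(phone))
```

Because the identifier is recomputed from the device on every visit, clearing cookies does nothing against it, which is why cookie rules alone may not be enough.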
Okay, let’s say every browser follows Firefox into blocking all tracking attempts, and let’s say they successfully block canvas fingerprinting as well as any other clever workarounds. Also, let’s say we become more conscious about our data (and here’s where scandals like this are actually useful) and we tweak our privacy settings. Can there be free internet services without targeted ads? Can Facebook (and Google, for that matter) exist without the ability to profile us and efficiently target us with ads? Can they support their infrastructure and expensive engineers without much of the advertisers’ money?
Probably yes, though with much less profit and far fewer options to expand and potentially innovate. Maybe that’s the direction we should be taking?
But regardless, here are some takeaways:
- we are constantly tracked and our data is used to target ads at us — with or without Cambridge Analytica;
- Facebook deserves the criticism, but I’d say it should be about their gross negligence rather than malicious intent in this particular instance;
- if we are susceptible to the messaging they throw at us, then ads can influence (decisively or not) political outcomes;
- communication platforms have always played a political role (mandatory Godwin’s law — the radio was instrumental to Hitler’s rise to power), and Facebook and Google will have to realize that;
- personal data regulations can help set a standard for data protection and let users exercise their rights as data owners, but they can also generate uncertainty for businesses;
- use Firefox :)