What is fake news and how to fight it

“Fake news” has recently become a buzzword. It’s a label that gets attached to any news that someone doesn’t like or agree with, including by the US president.

But what is “fake news”? Thanks to its widespread use, and thanks to Trump, it has started to mean “any news that isn’t entirely correct”. News that has bias, or where journalists have cherry-picked their examples, is now called “fake news”.

It isn’t. It is what it has always been: biased reporting. Fox has a huge Republican bias. The New York Times has a subtle (at times not so subtle) Democratic bias. The Daily Mirror has a Labour bias, The Sun has a Tory bias, the Bulgarian press has a pro-whoever-is-in-power bias, and so on. These biases manifest themselves in many ways, but they are not fake news. Even the “unconfirmed” reports that appear only in some shady outlet, because reputable outlets had no second source to confirm them, aren’t fake news; they are just unconfirmed news.

Fake news, on the other hand, is entirely made-up news: news about events that either didn’t occur at all, or that are loosely tied to something that did occur but otherwise have nothing rooted in reality.

In Eastern Europe, typical examples of fake news are stories with titles like “Incest now legal in Europe”, “Russia has demonstrated its world-class undefeatable weapon”, “Aliens have landed near city X”, “European politician X called Orthodox Christianity an archenemy of Europe”, “A German family is leaving for Russia, because the state forced their children to be gay”, and so on.

A recent publication outlined the “alternative media” landscape in the US, pointing out many of the practices that I’ve already seen in Bulgaria and other EU countries: using multiple websites that cross-post content with little or no modification, and amplifying it through social media.

There are hundreds, even thousands, of these websites. A colleague of mine calls them “mushroom sites”, as they appear to grow in number pretty fast. Some of them get shut down, and new ones appear. Most of them can be linked to one another because they post identical content at identical times. Some of the websites also share metadata (hosting companies, domain registration dates, etc.). In other words, they can be traced to a single operation, or a few operations, aimed at disseminating fake news.
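The "identical content" link can be illustrated with a small sketch. This is only an assumption about how such sites might be grouped, not the method used in any of the analyses mentioned here: it hashes each article after normalizing case and whitespace, and reports sites that published the same text. The function names and the example domains are hypothetical, and exact hashing will miss articles that were reworded even slightly.

```python
import hashlib
from collections import defaultdict


def normalize(text):
    """Lowercase the text and collapse whitespace so trivial edits don't matter."""
    return " ".join(text.lower().split())


def cluster_by_content(posts):
    """Group sites that published the same (normalized) article text.

    posts: iterable of (site, article_text) pairs.
    Returns a list of sets; each set holds sites sharing one identical article.
    """
    sites_by_hash = defaultdict(set)
    for site, text in posts:
        digest = hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()
        sites_by_hash[digest].add(site)
    # Only articles that appear on more than one site indicate a link.
    return [sites for sites in sites_by_hash.values() if len(sites) > 1]


if __name__ == "__main__":
    # Hypothetical data for illustration only.
    posts = [
        ("site-a.example", "Aliens have landed near city X"),
        ("site-b.example", "aliens  have landed near   city x"),
        ("site-c.example", "A completely different story"),
    ]
    for group in cluster_by_content(posts):
        print(sorted(group))
```

A fuller version would probably also compare publication times and the shared metadata, and use fuzzy matching rather than exact hashes.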

But having news on some obscure websites is not enough; it has to be spread through social media. Two tactics seem to be used there. One is using fake profiles to post the content in popular Facebook groups. The other is getting the fake news sponsored and targeted through the Facebook ad system. On Twitter, as the publication above shows, it’s about hashtags instead of groups. Basically, the content goes wherever many unsuspecting people will see it.

Recently the Wall Street Journal cited a leaked document, according to which Russia used fake news to sway the Bulgarian election. Having seen the number of fake news websites and their spread in social media, I think it’s very likely they did that. Whether it managed to sway the election or not is a different topic, but it isn’t a coincidence that much of the fake news in Eastern Europe is pro-Russian and anti-European.

In Western Europe and the US the fake news is about different topics, but the methods are practically the same. When you see the data, it’s almost obvious that this is a well-planned, well-executed operation to spread disinformation.

Now that we know what fake news is and how it operates and proliferates, what do we do to stop it?

My first suggestion is to introduce journalist identities. Every article must be signed by a journalist. (In rare cases it can be signed by the editor in order to protect a particular journalist.) Anonymous news is not a good idea. Anonymous sources? Sure. Platforms for leaks (like WikiLeaks)? Sure. But in the end there must be an actual journalist who writes and publishes the material.

Yes, I know there was the Google authorship project, and that it’s dead now. But it was just additional metadata, not true identity.

There needs to be a more rigorous standard for confirming identity, e.g. through a network of journalist associations that can issue credentials/keys with which each article can be digitally signed. It shouldn’t be just about putting a name in the metadata.
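To make the idea of a signed article a bit more concrete, here is a minimal sketch. It is not a real signature scheme: Python's standard library has no public-key signatures, so the sketch uses an HMAC with a shared secret as a stand-in, which means anyone able to verify a tag could also forge one. An actual standard would presumably use asymmetric keys and issuer certificates, much like TLS. The function names, the key, and the author ID are all hypothetical.

```python
import hashlib
import hmac


def sign_article(key: bytes, author_id: str, body: str) -> str:
    """Return a hex tag binding the author ID to the exact article text.

    Stand-in only: HMAC uses a shared secret, unlike a real digital signature.
    """
    message = author_id.encode("utf-8") + b"\n" + body.encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify_article(key: bytes, author_id: str, body: str, tag: str) -> bool:
    """Check that the tag matches this author and this exact text."""
    expected = sign_article(key, author_id, body)
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected, tag)


if __name__ == "__main__":
    key = b"hypothetical-key-issued-by-an-association"
    tag = sign_article(key, "journalist-42", "Original article text")
    print(verify_article(key, "journalist-42", "Original article text", tag))
    print(verify_article(key, "journalist-42", "Edited article text", tag))
```

The point of the sketch is only that the tag covers both the author identity and the content, so changing either one invalidates it.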

Yes, this sounds like a lot of bureaucracy. And possibly limiting free speech? Not so much, actually. There can be a standard set of components provided to journalist associations, even multi-tenant software-as-a-service that handles the identity bits. It is going to be basically like purchasing an SSL certificate, except that it won’t be for website identities, but for journalist identities. And it’s not limiting free speech: as much as I would like Alex Jones to shut up with his fake news, at least there’s a real person there. And he can prove it, so he’s free to publish, and to be judged based on his publications.

What would be the benefit of that? How would that stop the dissemination of fake news? Once there’s a standard for journalist identity, Google, Facebook and Twitter can implement safeguards. For example, Google can significantly lower the ranking of anonymous publications, making them practically impossible to find. Facebook and Twitter can block such publications from being advertised, warn users when they are shared, and reduce their likelihood of appearing in news feeds (or in hashtag results, in the case of Twitter).

There has been a lot of research on detecting fake news using machine learning (e.g. neural networks). It is promising, but I think it’s going to become whack-a-mole at some point: fake news authors will learn how to make their content appear legitimate to the models, which would then have to be constantly retrained. I think the identity part will be a better long-term solution.

Will there be fraud? Of course. People will get registered as journalists by fake associations, and will then be listed as authors across many fake news sites. But that is very easy for both Google and Facebook to detect: one author, or authors from the same identity issuer, appearing across a chain of websites can again be used as a signal to reduce the search score. The details will certainly be tricky, but I think this is far more likely to get things right than any other approach.
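A rough sketch of that detection signal, under the assumption that each article carries a verified identity (an author ID or an issuer ID): count the distinct sites on which each identity appears, and flag those above a threshold. The function name, the threshold, and the sample data are hypothetical. A real system would need to treat large legitimate associations and syndicated authors differently, which is where the tricky details come in.

```python
from collections import defaultdict


def identities_on_many_sites(pairs, max_sites=3):
    """Flag identities that appear on more than max_sites distinct sites.

    pairs: iterable of (identity, site); identity may be an author or an issuer.
    Returns a dict mapping each flagged identity to its sorted list of sites.
    """
    sites_by_identity = defaultdict(set)
    for identity, site in pairs:
        sites_by_identity[identity].add(site)
    return {
        identity: sorted(sites)
        for identity, sites in sites_by_identity.items()
        if len(sites) > max_sites
    }


if __name__ == "__main__":
    # Hypothetical data: one author ID reused across a chain of sites.
    pairs = [("author-1", f"site-{i}.example") for i in range(5)]
    pairs += [("author-2", "daily.example"), ("author-2", "weekly.example")]
    print(identities_on_many_sites(pairs))
```

The same function can be run once on (author, site) pairs and once on (issuer, site) pairs, likely with a much higher threshold for issuers.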

But isn’t that approach too harsh? Aren’t we introducing a lot of complexity to solve a minor problem? Sadly, the problem isn’t minor at all. Fake news sways elections, so “fake news” is power. And in the long run, it leaves “muddled thinking” in societies, which are no longer able to tell truth from made-up, agenda-driven news.

That won’t solve the bias problem, and more subtle disinformation techniques would still be possible. Nor will it solve the problem of the dying business model of journalism. But these are all complex, intertwined issues, and we should start somewhere rather than watch society dissolve in toxic false information.


