Could data science help us fight back against the COVID ‘infodemic’?

By Gary Flood, October 23, 2020
Volunteer data scientists say a combination of graph and machine learning is uncovering just how organised spreaders of online medical disinformation may be - and how open the social media giants are to helping them

(Image: fake news and misinformation, by Gerd Altmann from Pixabay)

Earlier this month, YouTube said it would remove videos containing misinformation about COVID-19 vaccines and would expand its current rules against falsehoods and conspiracy theories about the pandemic. It also revealed it has removed over 200,000 videos containing dangerous or misleading COVID-19 information since early February. No wonder the World Health Organisation says the world isn't just fighting a pandemic, but an ‘infodemic’ as well. As The New York Times recently put it, we are facing "the mass distortion of truth and overwhelming waves of speech from extremists that smear and distract".

The problem, allege citizen data scientists, is that the infodemic isn't just crazy people talking to each other online, which in 2020 is basically business as usual. It is much better organised than that, and by some pretty sinister players. Actually, it may be even worse: it's a trash fire constantly being fanned by social media, whose algorithms keep offering to connect a ‘Plandemic’ believer or anti-masker with more and more sources of medical disinformation, all to drive ‘engagement’.

A strong accusation like this needs a strong set of facts to back it up. But Project Domino, an open source, volunteer grass-roots campaign whose results are being shared on GitHub, says it has used graph technology and machine learning to get just such proof.

The initiative brings together a number of advanced technologies, all based on GPU-driven cloud computing: the RAPIDS open source data science libraries, large-scale knowledge graphs from Neo4j, graph neural networks (Stanford's GraphSAGE), and automated threat intelligence detection and response pipelines.

QAnon, MAGA and conspiracy theorists 

All this is being co-ordinated by the leaders of two US software firms: Graphistry, a Berkeley Parallel Computing Lab spinout, and Disaster Tech, a disaster management specialist. They are working with, it's claimed, volunteers from digital crime investigations, research, social networks, data startups and public policy, all putting in after-hours time to help. Graphistry's founder and CEO, Leo Meyerovich, explains that the tools he and his collaborators are using stem from research on GPU-accelerated computing:

We do a lot of work in investigative tech and powering both advanced analysts in areas like security fraud, anti-money laundering, even figuring out outages. But it's actually pretty general technology, which lets us look at connections and data.

What kind of connections and data? Mainly medical misinformation, disinformation and digital crime interventions, which, as Meyerovich puts it, means "going down the rabbit hole". His example is an anti-vaxxer from Australia called Sherri Tenpenny. She has around 50,000-60,000 followers on Twitter and an equally big presence on Facebook, plus an unknown number of newsletter subscribers. Her Twitter profile combines the words ‘Nazis’ and ‘state abuses’, Meyerovich notes, which gives some flavour of her messaging. Data tools show, he goes on, that her followers include the public anti-vaxx and natural medicine figures, and medically-minded people, you might expect.

No surprise, so far. But then what are major public conservative figures doing following her, like Michelle Malkin? Or the President of the US? 

We're seeing these national conservative public figures on the right, but we're seeing a mixture of QAnon, MAGA (Make America Great Again) and Trump supporters, but also other conspiracy theorists in what seem very different fields, like UFOs. What we're seeing is the networks crossing over with each other, and that's basically to support influence.

Meyerovich acknowledges that this is done mainly to game the Twitter algorithm. A Fox News reporter can legitimately have 40,000 followers, but if they are also following 30,000 people, that probably points to inauthentic, automated following back. This is where Project Domino's next concern kicks in: that social media platforms are more than happy to see these high levels of interaction, as that's what drives their business model.

That's prohibited by the Twitter terms of service. But they get clicks, and so welcome to the anti-vax misinformation network. For example, Judy Mikovits, the source of the infamous ‘Plandemic’ video that claims the coronavirus crisis is a hoax, is the number one recommendation you get off Twitter as soon as you go to this webpage to learn about the pandemic.
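The follow-back pattern Meyerovich describes can be sketched as a simple ratio check. This is only an illustration, not Project Domino's actual method: the `Account` class, the function name and both thresholds (`min_followers`, `ratio_threshold`) are assumptions made up for this example, and a real investigation would need far more signals than one ratio.

```python
from dataclasses import dataclass


@dataclass
class Account:
    """Hypothetical minimal account record (not a real Twitter API object)."""
    handle: str
    followers: int
    following: int


def looks_like_auto_follow_back(account, min_followers=10_000, ratio_threshold=0.7):
    """Flag large accounts that follow an unusually high share of their audience.

    Both thresholds are illustrative assumptions, not values from the project.
    """
    if account.followers < min_followers:
        return False  # small accounts often follow back by hand
    return account.following / account.followers >= ratio_threshold


accounts = [
    # The article's example: 40,000 followers, following 30,000 people.
    Account("reporter_a", followers=40_000, following=30_000),
    # A comparable account that follows only a few hundred people.
    Account("reporter_b", followers=40_000, following=900),
]

flagged = [a.handle for a in accounts if looks_like_auto_follow_back(a)]
print(flagged)
```

With these made-up numbers only the first account is flagged, since it follows three quarters as many people as follow it.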

To explore these problems further, the team looked at over 100m COVID-related tweets to try to find organised behaviour, applying the Uniform Manifold Approximation and Projection (UMAP) dimensionality-reduction technique. What this work found, he claims, is 5,000 highly active accounts across multiple misinformation topics, a pattern that suggests bots programmed to talk about the same things and controlled by the same person:

If a Tweet talks about COVID it's very likely they are also talking about QAnon. We're actually seeing a lot of QAnon influence across these different organised online groups.
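To make the idea of "accounts talking about the same thing" concrete, here is a toy sketch. It does not use UMAP or the project's GPU pipeline: it swaps in a much simpler stand-in, Jaccard similarity over topic sets, and works per account rather than per tweet. The account names, topics and the 0.8 threshold are all invented for illustration.

```python
from itertools import combinations

# Hypothetical toy data: account handle -> set of topics seen in its tweets.
account_topics = {
    "acct_1": {"covid", "qanon", "5g"},
    "acct_2": {"covid", "qanon", "5g"},
    "acct_3": {"covid", "vaccines"},
    "acct_4": {"ufo", "qanon"},
}


def jaccard(a, b):
    """Overlap between two topic sets: 0.0 (disjoint) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_pairs(topics_by_account, threshold=0.8):
    """Return account pairs whose topic sets overlap at or above the threshold."""
    return [
        (x, y)
        for x, y in combinations(sorted(topics_by_account), 2)
        if jaccard(topics_by_account[x], topics_by_account[y]) >= threshold
    ]


def share_also_mentioning(topics_by_account, given, other):
    """Fraction of accounts discussing `given` that also discuss `other`."""
    base = [t for t in topics_by_account.values() if given in t]
    return sum(other in t for t in base) / len(base) if base else 0.0


pairs = similar_pairs(account_topics)
covid_qanon_share = share_also_mentioning(account_topics, "covid", "qanon")
print(pairs)
print(covid_qanon_share)
```

In this invented data, two accounts share an identical topic mix, and two of the three accounts that discuss COVID also discuss QAnon. At the scale of 100m tweets a pairwise comparison like this would be impractical, which is one reason a projection technique such as UMAP is attractive.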

Sean Griffin, founder and Chief Executive Officer of Disaster Tech, the other main company supporting Domino, says he saw the danger of what Facebook and Twitter can help spread while working in the White House (he served in both the Obama administration and the early days of the Trump administration).

Since they're ad tech, effectively, they've done an incredibly poor job at downtrending or policing or governing or regulating or self-regulating on getting rid of the misinformation and disinformation that is pushing junk science. And unfortunately, a lot of folks have been fooled, and it's leading to injury and illness and infection and in some cases death, where you have husbands and wives dying thinking it's a hoax.

So to us, it's not about a left or right thing. This is a medical health emergency, and we need to do something.

We're asking tech to be a better neighbour

What, though, is Project Domino's ultimate message to the tech community? Apart from being open to more collaborators helping out in their spare time, for Griffin it has to be about the IT industry taking some responsibility here:

Tech companies can help prevent deaths by stopping misinformation. They have a duty as a member of a community; tech companies need to have a moral compass here, and take responsibility.

We're not asking them to dismantle their entire business model, by the way - we're asking them to be a better neighbour. Get out of the boardroom and get onto the street and see what's happening, which is that COVID misinformation is killing people, and it's time to take action.

If you're interested in finding out more, Meyerovich has this presentation online, while the main Project Domino website is here
