Amid COVID-19 and the Movement to End Racial Injustice, It’s Clear the Machines Have a Lot More to Learn

The breakdowns happened fast when COVID-19 hit.

Automated trading algorithms, filtering news stories for lightning-fast investment ideas, started to have fits because all the news was suddenly bad. Amazon, processing a huge influx of orders for basic household supplies and replacement retail shopping while simultaneously hit with logistics chain disruptions, found that the product recommendations being served up to customers were wildly out of whack.

AI isn’t good at reading the room, so auto-generated marketing language about “going viral” is as likely to pop up now as before, though it certainly hits consumers in a very different context. Similarly, it seems unthinkable now that Walmart had a longstanding practice of locking up personal care products seen as multicultural because an algorithm suggested they had a greater probability of being shoplifted, but the fact is the company only changed that policy very recently, and in response to the persistent protests over racial injustice.

That kind of discrimination wasn’t an overnight breakdown, but rather a long-standing pattern of bias in certain types of AI and machine learning approaches. That it has come to the forefront again at the same time as the COVID-19 breakdowns may only be fitting, though.

It’s a reminder that the tools are only as good as the scientists who develop them, and that data itself may not be as unbiased as we would like to believe.

But the tidal wave of issues ripped open – first by the COVID-19 pandemic and then by the wave of Black Lives Matter protests unleashed after the murder of George Floyd in Minneapolis – may be exactly what the data science industry needs to finally take these lapses seriously and develop real and lasting solutions.

As COVID-19 Prompts Spontaneous Behavioral Changes, AI is Left Spinning in Place

With a global pandemic, the very fabric of the data that machine learning algorithms rely on has changed and may continue to change in ways that make developing accurate measurements and predictions extremely challenging.

Basic approaches that worked just fine before the pandemic hit have been called into question:

  • A/B testing is polluted by the fact that public behaviors shift too rapidly to provide a consistent basis for testing
  • Anomaly detection is going completely bananas as every spike stands out against historical data
  • Data source pollution and diminution is rampant as historical collection mechanisms (retail sales data, security camera footage, Bluetooth tracking beacons) shut down with the businesses that were generating them
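To see why anomaly detection falls apart, consider a minimal sketch of a common baseline detector: flag any point more than a few standard deviations from the historical mean. The function name and the order-count numbers below are purely illustrative, but they show how a pandemic-scale demand shock makes nearly every new data point read as an “anomaly” against pre-pandemic history:

```python
import statistics

def zscore_anomalies(history, recent, threshold=3.0):
    """Flag recent points more than `threshold` standard deviations
    from the historical mean -- a simple baseline anomaly detector."""
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    return [x for x in recent if abs(x - mean) / stdev > threshold]

# Stable pre-pandemic daily order counts: mean 100, low variance.
history = [100, 102, 98, 101, 99, 103, 97, 100, 102, 98]

# Pandemic-era demand shock: almost every new point now trips
# the detector, because the old baseline no longer describes reality.
recent = [240, 310, 280, 95, 330]
print(zscore_anomalies(history, recent))  # → [240, 310, 280, 330]
```

The detector isn’t wrong so much as its notion of “normal” is obsolete; until the baseline is retrained on post-shift data, every spike stands out.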

Even where more data is being generated, as in e-commerce applications, it’s hard for data scientists to trust it as external factors change and influence behavior.

While “new normal” is a popular phrase these days, what data scientists are finding is that there is no normal at all.

AI Bias Failures Have Been Offensive at Best… and at Worst, Tragic

Offensive AI flubs are nothing new. Microsoft’s experimental Tay chatbot was exploited by trolls who turned its learning algorithms into a tool for racist and misogynistic conversation within 24 hours of release. And Amazon’s Rekognition facial recognition system famously matched 28 members of Congress against a mugshot database, falsely identifying them as people who had been arrested.

Biases baked into some corporate and governmental algorithms cross over from offensive to dangerous.

All of these issues reveal deep racial bias, and they have prompted a lot of soul searching in the data science community as it goes to work addressing the glaring flaws.

And in reflecting on the discrimination that the AI revealed, it becomes clear that this isn’t something that can simply be corrected with a one-time adjustment to an algorithm, and then forgotten about.

Most of The Hard Work is Ahead for Data Scientists Fixing Broken Algorithms

That means there isn’t going to be a single formula for fixing AI and machine learning approaches. This fact only stresses the importance of having well-educated and thoughtful data scientists to deal with the problems.

Many of the models currently being used by governments to predict the spread of COVID-19 are enhanced with machine-learning algorithms and big data gathering. Looking forward, researchers at Lawrence Berkeley National Laboratory, for instance, are using advanced ML techniques to attempt to predict seasonality in COVID-19 spread. And AI systems are comparing virus data against current drug catalogs faster than any human researchers could, looking for previously developed compounds that may offer effective treatments for the disease… one search, conducted in only three days, identified baricitinib, an already-approved drug that may effectively block the cytokine storms that cause so much damage in COVID cases.

The role of big data in racial justice may be equally important. The effort to identify where and how police abuses and discrimination occur has been increasingly driven by data. A lot of it has simply been the fact that incidents of abuse that would otherwise go undocumented are being captured on cell phone video. But technology is also playing a role in investigating systematic bias. For example, the Washington Post painstakingly assembled data on the number of people shot and killed by police, which has laid open the extent of police shootings in the United States while revealing how disproportionately the victims have been Black and Latinx.

Taken together, COVID-19 and racial injustice in the U.S. have ripped the cover off the fragility of most machine learning algorithms. Data scientists will have to go back to the drawing board to develop more robust, and equitable, systems. Machine learning isn’t going anywhere, though, and the tool may actually prove integral to fighting disparities, in policing and healthcare, and society as a whole.