AI inevitability - can we separate bias from AI innovation?

By Neil Raden October 15, 2020
Summary:
AI evangelists pay lip service to solving AI bias - perhaps through better algorithms or other computational means. But is this viable? Is bias in AI inevitable?

We've been led to believe that A.I. is going to solve all of our problems - economically, socially, environmentally. It stretches credulity that it can do that when all it does is find patterns in numbers. But what it is capable of - in that limited role - is dangerous.

Nevertheless, A.I.'s inevitability, predicted by industry, academics, and analysts, goes unquestioned. Bias, exclusion, and disinformation are social problems that cannot be addressed with pattern-matching and curve fitting, and cannot be dealt with satisfactorily by technology. For the inevitability narrative, that is a strength, not a weakness.

The false promise of "solving" bias computationally obscures the larger issue: bias is pervasive. But there is an industry-wide strategic switcheroo lurking here, a beguiling diversion. Instead of being a problem, bias becomes part of the broad picture of A.I. innovation. It's like Hunter S. Thompson's definition of Gonzo Journalism: start a fire and report on it.

Beyond Defense and Intelligence operations, whose scale is unknown, most A.I. today is used to pierce your privacy veil and provide a "360 degree" view of you to sell you things. When it deals with data about human beings, A.I. has a big problem, as everyone knows: bias. A.I. Ethics and notions of "fairness" are now the full employment act of swarms of thinkers, writers, academics, consultants, and self-declared "ethicists," many of whom have no credentials, no experience in A.I., and no clue how to solve the bias problem.

The A.I. industry purports to attack the bias problem computationally. It can't be done. A.I. engineers' imagination for ethics, fairness, and the law is hopelessly limited because bias isn't a technology problem; it's cultural, subject to interpretation, and insidious. Bias creeps into algorithms, into models of algorithms, into the data and semantics and labels used by A.I., and even into how the models are deployed. But if you step back a little, you see that the technology companies' hand-wringing over bias obscures a much more significant problem with A.I. With all the hype, what mostly goes unnoticed is the two-headed monster of A.I. - inevitability and A.I. bias. All sorts of undesirable societal issues, like bias, job losses, environmental issues, housing, economic disparity, and others, become part of A.I. innovation's promise.

I've seen up close narrow increases in "fairness" celebrated as progress, such as loosening the credit approval algorithm for borrowers who fall short of 100% of the criteria. But buried in the rationalization is the fact that the requirements were biased and discriminatory in the first place. This isn't fairness. These tepid efforts are baby steps to address long-standing discrimination.

Everyone has heard of the iconic cases of egregious A.I. bias: Amazon's hiring algorithm, Facebook's ad server, and the COMPAS judicial system. Why hasn't there been more publicity about a system sold to healthcare organizations by Optum, the $100 billion unit of UnitedHealth Group? It predicts which patients will benefit from extra medical care, and dramatically underestimates the sickest black patients' health needs, amplifying long-standing racial disparities in medicine. The model's flaw was that it used the cost of care as the significant predictor of risk. Black patients historically were less able to pay, so the system ran an infinite feedback loop, denying black patients a higher quality of care. Having made this discovery, a team from UC Berkeley worked with Optum to find variables other than cost to assign the expected risk scores, reducing bias by 84%. The problem was solved in this one instance, but it is still endemic in similar tools employed by public and private institutions that provide healthcare to 200 million Americans.
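To make the cost-as-proxy flaw concrete, here is a minimal sketch in Python on synthetic data. Everything in it is an assumption made for illustration: the two groups "A" and "B", the 0.75 spending factor, and the threshold of 60 are invented, and none of it reproduces the Optum model or the Berkeley team's analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patient:
    group: str    # "A" or "B": hypothetical patient groups
    need: int     # true health need on a synthetic 1..100 scale
    cost: float   # observed spending, used as the proxy label


# Assumption for illustration: at the same level of need, group B
# generates only 75% of group A's spending because of access barriers.
SPENDING_FACTOR = {"A": 1.0, "B": 0.75}


def make_patients():
    """Both groups get an identical distribution of true need."""
    return [
        Patient(group, need, need * SPENDING_FACTOR[group])
        for group in ("A", "B")
        for need in range(1, 101)
    ]


def flag_for_extra_care(patients, score, threshold):
    """Count, per group, the patients whose score reaches the threshold."""
    counts = {"A": 0, "B": 0}
    for p in patients:
        if score(p) >= threshold:
            counts[p.group] += 1
    return counts


patients = make_patients()
by_need = flag_for_extra_care(patients, lambda p: p.need, 60)
by_cost = flag_for_extra_care(patients, lambda p: p.cost, 60)

print("flagged when scoring on need:", by_need)
print("flagged when scoring on cost:", by_cost)
```

In this toy setup the two groups are equally sick, yet scoring on cost flags 41 patients in group A and only 21 in group B. A group B patient has to reach a need of 80 to be flagged, where a group A patient qualifies at 60.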

When insurers price coverage on drivers' economic status, they invariably do so in a manner that disproportionately targets African Americans with higher prices. Companies have not asked for a customer's race for decades. They may or may not be aware that they can infer it from data they collect (the so-called proxies or latent values), but their AI-driven pricing tools are well aware of it. There is no reason to believe that they are deliberately supporting systemic racism, but they'd have to be blind actuaries not to notice it. Still, the results are undeniable - government-required auto insurance is consistently more expensive for black Americans and the working poor in general. A state insurance regulator told me that, a few years ago, auto underwriters used only a handful of data points to price a policy, but enough to determine an applicant's race. Today, they have access to hundreds of data points on each individual and AI-driven pricing tools to leverage them. As she explained, auto insurance is mandatory and expensive, and is essentially a regressive tax on the working poor. When she receives a rate filing from an insurance company, she sees her first question as "Is it fair?"
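How a proxy works can be shown with a small Python sketch. The data is entirely made up: the two ZIP codes, the groups "X" and "Y", the head counts, and the premiums are hypothetical and are not drawn from any insurer or rate filing. The point is only that a pricing rule that never sees group membership can still track it.

```python
from collections import Counter

# Synthetic applicants as (zip_code, group) pairs. The residential
# split between the two ZIP codes is an assumption for illustration.
applicants = (
    [("11111", "X")] * 90 + [("11111", "Y")] * 10
    + [("22222", "X")] * 15 + [("22222", "Y")] * 85
)

# Hypothetical premiums set purely by ZIP code; group is never an input.
PREMIUM_BY_ZIP = {"11111": 100.0, "22222": 140.0}


def majority_group_by_zip(records):
    """Guess each ZIP code's group as the most common group living there."""
    counts = {}
    for zip_code, group in records:
        counts.setdefault(zip_code, Counter())[group] += 1
    return {z: c.most_common(1)[0][0] for z, c in counts.items()}


def inference_accuracy(records):
    """Share of applicants whose group is guessed correctly from ZIP alone."""
    guess = majority_group_by_zip(records)
    hits = sum(1 for zip_code, group in records if guess[zip_code] == group)
    return hits / len(records)


def average_premium(records, group):
    """Mean premium paid by one group under ZIP-only pricing."""
    premiums = [PREMIUM_BY_ZIP[z] for z, g in records if g == group]
    return sum(premiums) / len(premiums)


accuracy = inference_accuracy(applicants)
premium_x = average_premium(applicants, "X")
premium_y = average_premium(applicants, "Y")

print(f"group guessed from ZIP alone: {accuracy:.1%} correct")
print(f"average premium, group X: {premium_x:.2f}")
print(f"average premium, group Y: {premium_y:.2f}")
```

With these invented numbers, ZIP code alone identifies an applicant's group 87.5% of the time, and group Y pays about 135.79 on average against 105.71 for group X, although group membership was never asked for. With hundreds of data points instead of one, the inference only gets easier.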

The FICO score is a real boat anchor for the working poor. They don't have low FICO scores because they're deadbeats or dishonest or reckless drivers. They have a low FICO score because they struggle with finances for a plethora of reasons: low wages, employment instability, housing instability, high prices for substandard food. But their score follows them around and causes high auto insurance costs, high interest rates, and difficulty finding career mobility. No, it's not fair.

Why do we care about ethics in A.I.? For reasons like the ones above. A.I. can harm people at scale, and by the time anyone notices, the damage is already done. The first ones to notice are usually those with the least agency to do anything about it.

Part of the problem is a lack of understanding of how machine learning works. It's not a magic eight-ball. The model has to be "trained" by processing data that has been labeled (this record is a picture of a horse). Data is biased. Labeling introduces more bias. Once trained, the model has found specific patterns, which it applies to unlabeled data to make predictions. In a typical scenario, the algorithm uses gradient descent (or ascent, depending on the model) to converge on an optimum of a cost function. But what happens when it doesn't? As in Jurassic Park, the algorithm finds a way. It will deviate from the selected features and apply "latent values" to get to a convergent solution. Sometimes, these solutions look perfectly reasonable, but they are utterly wrong. Amateurish development, an aching desire to push something out, and insidious problems with the data can cause a significant mess.
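For readers who have not seen it, gradient descent on a cost function looks roughly like the Python sketch below. It is a deliberately tiny example under stated assumptions: a straight-line model, a mean-squared-error cost, and made-up labeled data generated from y = 2x + 1. The function name `fit_line` and the learning rate and step count are choices made for this illustration, not anyone's production method.

```python
def mean_squared_error(w, b, xs, ys):
    """Cost function: average squared gap between prediction and label."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = w * x + b by repeatedly stepping downhill on the cost."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error w.r.t. w and b.
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Synthetic "labeled" training data: inputs xs with labels ys = 2x + 1.
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]

w, b = fit_line(xs, ys)
print(f"learned w={w:.4f}, b={b:.4f}, cost={mean_squared_error(w, b, xs, ys):.2e}")
```

The loop recovers a slope close to 2 and an intercept close to 1 because the labels say so. The cost measures nothing except agreement with the labels, so if the labels encode a bias, the procedure converges on that bias just as faithfully.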

My take

So, if bias can't be solved computationally, what's the solution? Managers at companies where I've taught A.I. Ethics have noticed that participants have a solid grasp of the concepts but continue to operate in ethically compromised ways. The reasons are apparent. There is a massive asymmetry between adverse effects in the social context and the economic benefits of deploying inferencing systems. These forces militate against asking the question: We can build these systems, but should we?

Frank Lloyd Wright once quipped that "Route 66 is a giant chute through which everything in the middle of the country is falling to southern California." Today, the chute is the one-way transfer of our data to the elephant tech companies, which apply automated, predictive solutions to everything we do, personally, collectively, and politically, and it supports a malicious monopoly. Any A.I. system that affects people's lives must be subject to protest, account, and redress.

I'm afraid we need legislation to disincentivize data hoarding, including carefully defined bans, levies, mandated data sharing, and community benefits policies, all backed up by enforcement. Smarter data policies would reenergize competition and innovation, both of which have unquestionably slowed with the tech giants' concentrated market power. The most significant opportunities will flow to those who act most boldly.

The second great opportunity is to wrestle with fundamental existential questions and build robust processes to resolve them. Which systems deserve to be made? Which problems most need to be tackled? Who is best placed to build them? And who decides? We need genuine accountability mechanisms, external to companies and accessible to populations.

Image credit - Hands holding a sheet with question mark in front of closed doors and stormy clouds © StunningArt - shutterstock