The topic of bias in AI is one that’s had a lot of airtime at diginomica and beyond. Bad practices involving facial recognition tech or automated candidate selection in recruitment are among the best-known examples, while this summer in the UK, pupils awaiting their vital A-Level exam results found themselves on the wrong end of an algorithm that marked them down, in many cases ruining their chances of university entry.
The recent (excellent) BlackTechFest conference took on some of the questions around bias in AI in a lively panel discussion that inevitably left more questions than answers in its wake, but provided food for thought. Opening the debate, Dr Djamila Amimer, CEO of AI management consultancy Mind Senses Global, began by attempting a definition of algorithmic bias:
Algorithmic bias is defined as systematic and repeatable errors by computer systems that can cause and derive unfair outcomes, such as giving privilege to one group over another... Most often, data is given as the first and primary factor behind algorithmic bias... Is it really only data that is behind algorithmic bias, or do we have other factors that contribute to it?
The answer to that last rhetorical question is, of course, yes, a point picked up by Shakir Mohamed, Research Scientist with DeepMind, the AI research laboratory founded in 2010 and acquired by Google in 2014:
I really like that definition of systematic and reproducible error. In those two words you can unpack the different components of where bias is coming in. The first, for example, will be living in a society that has a set of biases already. That bias is going to be reflected in the mindset, in the thinking, in the way that people are approaching their work unless we are very careful. Data itself has a very important role in systematic bias, but bias comes in in many areas. It is in the way we are measuring. It is in what we are even considering worthwhile measuring. Sometimes we don't have a measurement, so we fill in what's missing instead. All of these can be sources of bias.
Then there is a third source of bias, which is in the actual technical computer algorithmic system itself. We make certain choices when we are deciding what variables to use. We are compressing the model, we are making choices as to how we are building the model, using one approach versus another. Those choices themselves can introduce bias. You have all these different factors combining with each other, and then what you get is effectively an artificial division, a system of difference is created which empowers some people and disempowers others. We do need to be careful. I think the question of bias is a very deep one, very multi-faceted, and I think it's important that we remember the multi-faceted nature that it has.
Mirroring the real world
Bias in algorithms mirrors the real world, suggested Katrina Ffrench, CEO of StopWatch, a coalition of legal experts, academics, citizens and civil liberties campaigners who work together to address what they define as “excess and disproportionate stop and search” and to promote best practice to ensure “fair, effective policing for all”:
I think we need to kind of zoom back into how these algorithms come about. If there’s bias in society already and the status quo is unequal and then you produce mechanisms or use tools in the same fashion, you're likely to exacerbate it.
Ffrench cited as a case in point the Gangs Matrix, a database set up by the London Metropolitan Police following civil disturbance and rioting that took place back in 2011:
What basically happened was that the police decided that they needed to identify who was at risk of criminality, specifically serious violence, so they put together this database. The main issue that we found with the database is that it was definitely discriminatory. It used a very rudimentary Excel spreadsheet, into which officers would put scores for the harm or risk that they calculated individuals to pose.
Research by Amnesty International found that 80% of the people listed on the Gangs Matrix were 12-24 year olds, 78% were black and 99% were male - and 35% of people logged had never actually committed a violent offence. The police called the database a risk management tool to prevent crime and shared its data with other official agencies. This resulted, according to Ffrench, in people being denied driving licences, university places and employment, and in one instance, a child being taken into care.
The Information Commissioner’s Office eventually ruled that the Met Police were in breach of data protection laws, but by that time a lot of damage had been done, said Ffrench:
It just felt wholly disproportionate. What the police were doing was using AI, using policing tech, to justify discriminatory policing and then most people in civil society, the young people impacted, had no understanding of it, it was incredibly difficult to challenge…you have human rights and those were breached and that's where I'm really fearful for AI and tech and the lack of transparency and the impact it can have on people's lives. Without information, [people] have no idea what they're subjected to.
What to do?
So far, so depressingly familiar. But the panel then turned their attentions to what might be done to redress the balance. The temptation with examples of algorithmic bias, as in the case of the UK exams scandal, is to slam on the brakes when they are exposed and execute a policy u-turn, a practice that doesn’t tackle the underlying problem. This infuriates Mind Senses’ Amimer:
I get really frustrated when I hear about an AI tool or an algorithm that has been shelved or just ditched because it was biased. I understand that if there is bias, obviously the algorithm shouldn't be in use in the first place. But where I get frustrated is: surely someone, somewhere, could have done something about it? Is the answer always to ditch? Don't we have the power to address and to fix the bias, so we have a bias-free algorithm or bias-free AI?
DeepMind’s Mohamed was of a similar view that u-turns are not the answer:
The way we're going to address this particular kind of problem is going to need to be at every level. It's going to be at the technical level, at the organizational level, at the regulatory level, at the societal and grassroots level. I really think the first thing we need to do is build a very broad coalition of people, coalitions between people like me who are technical designers and expert people who are on the ground who understand and see the distress [bias can cause].
He pointed to the push back against facial recognition as a case in point:
Over the last five years or so we've seen that kind of coalition from amazing women in their fields, black women who saw this distress, wrote papers to expose the issue, and then spent five years building those coalitions. Every company now has decided we're not going to be involved in facial recognition. Cities and states themselves have decided to ban facial recognition. So the first solution - and maybe the hardest work - is to do that kind of broad coalition.
AI for good
And it’s important to remember that AI and algorithms can be used to good effect, said Naomi Kellman, Senior Manager for Schools and Universities at Rare Recruitment, a specialist diversity recruitment company which aims to help employers build workforces that better reflect diversity in society:
We have built what we could also call an algorithm in the form of the Contextual Recruitment System. Originally, top employers in certain sectors tended to look for a certain type of grade profile and also certain types of work experience. That appears color blind, but it's not, because we know some people have more access to good education and good opportunities. What we were able to do is build a system that looks at people's achievement in context.
So it looks at the school you went to and says, 'What does A,A,B look like in your school? Is that what everyone gets or is that the best grade anyone's got for the past few years?'. We can highlight to employers when somebody has actually outperformed in a school situation that maybe doesn't tend to produce good grades. We also collect data on people's socio-economic status - if they've been eligible for free school meals, if they grew up in the care system or if they came to the country as a refugee, all things that we know have an impact on people's chances of achieving academically - and we can put things in context.
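Rare has not published how its system is implemented, but the comparison Kellman describes - judging a candidate's grades against what their school typically produces, rather than against a fixed bar - can be sketched roughly as follows. The function names, data shapes and the "outperformed" rule here are illustrative assumptions, not Rare's actual method; the grade points are the standard UCAS tariff values for A-Levels.

```python
# Illustrative sketch of contextual grade assessment. Field names and the
# decision rule are hypothetical; only the UCAS tariff points are real.

GRADE_POINTS = {"A*": 56, "A": 48, "B": 40, "C": 32, "D": 24, "E": 16}

def grade_score(grades):
    """Convert a list of A-Level grades into a single UCAS points total."""
    return sum(GRADE_POINTS[g] for g in grades)

def in_context(candidate_grades, school_top_results):
    """Compare a candidate's grades against the best results their school
    has produced in recent years, instead of a fixed 'three A's' bar.

    school_top_results: one list of grades per recent year, each being the
    school's best result that year (an assumed data shape).
    """
    candidate = grade_score(candidate_grades)
    school_best = max(grade_score(g) for g in school_top_results)
    return {
        "candidate_points": candidate,
        "school_best_points": school_best,
        # Flag candidates who matched or beat the best their school
        # typically achieves - the 'outperformed' signal Kellman describes.
        "outperformed_school": candidate >= school_best,
    }
```

On this toy rule, A,A,B from a school whose best recent results were B,B,C and A,B,B would be flagged as outperformance, whereas the same grades from a school that routinely produces A*,A*,A would not - the same "data point" reads differently in context.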
This is encouraging businesses to take a wider perspective on recruiting talent, she said:
The organizations that use it now see that they interview a much broader group of people, because instead of having a very basic algorithm that says ‘three A's or you're out’, they now use all of this data to say, ‘Actually this person has high potential’, because we're looking at more data points and that means more people get hired from a wider range of backgrounds. Students are coming to see it being used in graduate recruitment and also at university level. Universities now do contextualization and they're looking to expect that from employers. So I think it's about thinking about how we can use data to broaden opportunities for people and to put things into context.
Context is certainly key. What Kellman and her organization are talking about is a very worthy goal, but a long-term one that will require a lot of changed perspectives from employers in the tech space, some of whom still have lamentable track records on the diversity front. As StopWatch’s Ffrench noted:
I think it's about diversity and representation. That's about tech companies doing more to recruit and to retain and to promote black professionals...until we're in those spaces, we're gonna find that these things keep replicating themselves.