
Britain's Horizon scandal - what enterprises and policymakers can learn about hallucinating IT

George Lawton, January 9, 2024
Summary:
The scale of the Horizon system's hallucinations holds important implications and lessons for all enterprises.


Hallucinating AI has been a hot topic over the last year, but the sorry tale of the UK Post Office's Horizon IT system highlights the tragic impact that hallucinating IT systems and organizations can have as well.

If you live outside of the UK, you may be forgiven for not appreciating the fallout of the Horizon IT scandal, its life-changing impact on hundreds, if not thousands, of honest, hardworking people, and the ferocity of the public outrage that has been on show in recent days. 

Computer Weekly broke news about the problems with Horizon as far back as 2009 - its timeline on the matter is worth a read if you want to get a deep dive into the problem - but what has triggered the current furore is a four-part TV drama, Mr Bates vs. The Post Office, that has just aired, portraying the trials and tribulations of one of the victims.

It all started in 1999, when a new Post Office IT system developed by Fujitsu, called Horizon, began hallucinating accounting details that made it appear that local sub-postmasters, who were independent contractors rather than employees, were padding their accounts. Post Office executives decided they had stumbled across graft on an unprecedented scale, and over 700 people were convicted of false accounting, theft, and fraud. 

The fallout was tragic: many upstanding pillars of their communities were bankrupted or jailed, and some were driven to suicide or addiction. One such victim was Alan Bates, a sub-postmaster in North Wales, who first contacted Computer Weekly in 2004, arguing that dodgy software was the reason his accounts had not added up since at least 2000. Post Office officials insisted that his case was unique and that no one else was having similar issues. But Bates's doubts and persistence eventually led him to discover dozens of others with the same problem. 

Over the years, the afflicted eventually reached out to UK Members of Parliament, who offered some assistance and helped launch an official investigation. But the final report was consigned to the dustbin the day before it was due to be officially released. Some of the afflicted did manage to prevail in court years later and were awarded damages, most of which have not been paid out. 

Only now is the scandal capturing the attention of the wider public, the UK government, and prosecutors, who are investigating the postal executives and some Fujitsu employees for potential criminal wrongdoing.

Amid widespread demands that former Post Office CEO Paula Vennells should be stripped of her CBE honour, awarded to her for services to the Post Office, UK Prime Minister Rishi Sunak eventually called the scandal “an appalling miscarriage of justice” on Sunday. 

My take

The inquiry continues, with Post Office and Fujitsu execs due to give evidence, so we must be careful in what we say at this point. 

What can be said with confidence is that the TV series managed to personify the victim experience in a way that is instantly understandable to the public, regulators, and prosecutors, something that traditional quality assurance processes fail to capture. 

I am not just being fanciful here, because I think there are practical ways to consider the adverse impacts of new tech rollouts, whether it's generative AI, traditional IT, or new business services. The latest trend in user experience design is to create user personas to characterize the experiences and journeys consumers and employees might have with a new service. Maybe, as part of the quality assurance process, it's time to start thinking about victim personas as well. These would capture the potential adverse experiences, life-altering events, and victim journeys of frustrated, overworked, and confused users, delivery drivers, warehouse workers, employees, and citizens. 
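To make the idea concrete, here is a minimal sketch of what a victim persona might look like as a quality assurance artifact. Everything here is hypothetical and illustrative; the class name, fields, and the sub-postmaster example are my own invention, not an established methodology.

```python
from dataclasses import dataclass, field


@dataclass
class VictimPersona:
    """A hypothetical QA artifact: who could be harmed if the system hallucinates?"""
    role: str                     # e.g. "sub-postmaster", "warehouse worker"
    touchpoints: list[str]        # where this person interacts with the system
    failure_modes: list[str]      # plausible system errors they could face
    worst_case_harms: list[str]   # life-altering outcomes to test against
    recourse: list[str] = field(default_factory=list)  # appeal/audit paths open to them


def review_checklist(persona: VictimPersona) -> list[str]:
    """Turn a persona into concrete review questions for a rollout."""
    questions = []
    for mode in persona.failure_modes:
        questions.append(
            f"If the system is wrong via '{mode}', how does a {persona.role} "
            f"detect it, and what recourse do they have?"
        )
    if not persona.recourse:
        questions.append(
            f"No recourse path is defined for a {persona.role}; who audits disputed outputs?"
        )
    return questions


# Illustrative persona loosely inspired by the Horizon case:
sub_postmaster = VictimPersona(
    role="sub-postmaster",
    touchpoints=["daily balancing", "branch accounting"],
    failure_modes=["phantom shortfall in the nightly reconciliation"],
    worst_case_harms=["prosecution for theft", "bankruptcy"],
)

for question in review_checklist(sub_postmaster):
    print(question)
```

The point of the sketch is that a persona with no defined recourse path automatically surfaces a review question, which is precisely the gap the Horizon victims fell into.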

And here is where I think generative AI and digital twins could help. I have written quite a bit about generative AI’s propensity to hallucinate and some ways to reduce this. But hallucination is not restricted to the shiny new thing rolled out by a questionable Silicon Valley company. People, organizations, politicians, governments, and IT systems all have a propensity to hallucinate. It's just that we don’t typically call it that unless the people are high on drugs or it's ChatGPT.

When governments deliberately hallucinate, as with weapons of mass destruction, we may call it Realpolitik. When Silicon Valley founders hallucinate, they may be hailed as innovators if things turn out like Steve Jobs’s iPhone, or branded frauds if they go the way of Theranos or FTX. 

Over the last month, I exchanged dozens of emails with a vendor whose website promised an ergonomic chair with “forward tilt,” an important feature for me. The vendor initially would not let me return it, since it was custom-made by a third-party supplier. After another half-dozen emails with the third-party supplier, I discovered that the chair in question was configured with “forward slide” but not “forward tilt.” 

Once I presented this new information, the vendor finally agreed to take the chair back. I wondered who or what had hallucinated that detail, which was the source of so much frustration for both me and the poor agent fielding my request. The point is that hallucinations are not such a rare occurrence, and smart organizations have workarounds and risk management processes in place to mitigate their impact. 

I recently wrote about SRI’s work on mitigating hallucinations by probing a system with questions at different levels of abstraction, such as retrieving facts, explaining concepts, or drawing a connection between related data. What if we could extend this kind of framework to ask these kinds of tools to imagine victim personas and their experiences of new services, models, and processes as well?
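As a rough illustration of what probing at multiple levels of abstraction might look like, here is a toy sketch. The three probe levels simply paraphrase the description above (retrieving facts, explaining concepts, drawing connections); the function names, templates, and the stand-in system are my own assumptions and do not come from SRI's actual framework.

```python
from typing import Callable, Dict

# Probe templates at increasing levels of abstraction; purely illustrative.
PROBE_LEVELS: Dict[str, str] = {
    "fact": "What exact figure does the system report for {topic}?",
    "concept": "Explain how {topic} is computed.",
    "connection": "How does {topic} relate to the upstream records it is derived from?",
}


def probe(ask: Callable[[str], str], topic: str) -> Dict[str, str]:
    # Ask about the same topic at each level of abstraction, collecting the
    # answers so a reviewer (or a checker model) can look for contradictions.
    return {level: ask(template.format(topic=topic))
            for level, template in PROBE_LEVELS.items()}


# Toy stand-in for a system under test; a real harness would call a model
# or an IT system's query interface here.
def toy_system(question: str) -> str:
    return f"[answer to: {question}]"


answers = probe(toy_system, "the branch shortfall")
for level, answer in answers.items():
    print(level, "->", answer)
```

The same scaffolding could, in principle, be pointed at victim-persona questions: instead of asking a system to retrieve a figure, ask it to imagine how a given role would experience that figure being wrong.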

This is where digital twins come in, since they help connect the dots across different systems, tools, and processes to contextualize the meaning of data for various use cases. In the case of the Horizon scandal, the rollout was not just a new, more efficient IT system, which it may well have been. It was also a life-changing fraud imposed on hundreds of honest subcontractors.

Another travesty in the Horizon scandal is that the accused were only provided a spreadsheet of the results of the faulty system, which differed from the accounting systems and processes the postmasters were familiar with and trusted. At no time in any of the proceedings were the accused given the opportunity to investigate the underlying algorithms and software code for fault. 

GDPR provisions on automated decision-making, which include some coverage of algorithmic transparency, might have helped prevent this miscarriage of justice. But these provisions only came into effect in 2018, long after the scandal had unfolded. Still, there is ongoing debate about what algorithmic transparency means in practice. We can only hope that the current outrage fanned by the Horizon scandal will guide policies that prevent other hallucinations from having such a damaging impact in the future. 
