The International Organization for Standardization (ISO) has recently published ISO/IEC 42001:2023, an AI management system standard. The work was informed by AI experts from more than eighty organizations in thirty-nine countries. It joins over 300 other standards cataloged in the UK’s AI Standards Hub that are being developed or have been published by prominent standards organizations.
With so many different AI standards, it's fair to ask why we need one more. In this case, the new standard provides a framework for deciding which standards, processes, and tools organizations need to apply, depending on their context.
The 42001 standard provides high-level guidance on relevant AI-specific risks and connects these to the specific steps and processes required to mitigate them. Management system standards differ from technical standards in that they spell out the repeatable steps organizations can implement to improve operations and reduce risks. 42001 joins other management system standards that focus on quality management, information security, environmental management, and guidelines for auditing management systems.
In the case of AI, this promises to improve the quality, security, traceability, transparency, and reliability of AI applications. It centers on a plan-do-check-act approach that has been successful in other domains. It provides a high-level framework that helps organizations understand the essential elements they must consider in their responsible AI efforts. The high-level standard is bolstered by numerous other supplemental standards relating to specific issues like bias, transparency, security, and data management.
The standard itself does not prescribe any specific tools, but it does describe the kinds of processes organizations need to consider to mitigate risks. It also informs a framework for training and certifying IT auditors, who can then sign off that enterprises, vendors, and government organizations have implemented sufficient technical, administrative, and cultural processes to mitigate AI risks.
I spoke with Dr Matilda Rhode, AI and Cyber Security sector lead at the British Standards Institution (BSI), about what the standard means for enterprises. She recently joined BSI after leading cyber innovation at Airbus, when she decided she wanted to make a difference in building more trustworthy AI. She explains:
I was working for a large organization before, where I was looking at how we can use AI to our benefit in cybersecurity. And there, you always have to think about the malicious actors and the risks of any new technology you put in place. And I realized other colleagues were looking at the opportunities around AI outside of cybersecurity, like computer vision applications and things like that, which also have their own risks. And I started supporting other teams to try and onboard these risks.
And you realize what a challenge it was for them. Because the people who are working at the cutting edge of data science have had many years of computer science training, but it's like any new technology we see come in. Sometimes, the risks aren't the highest priority when you're trying to get a new product to market. And in that case, those teams really need support and guidance on what is the best practice when it's a new field. So that's what got me really excited about it was, seeing that there was a real need for calling out for this stuff. And over the past few months, I've been to loads of events on AI, and I don't even have to stand up and say, ‘Hey, guys, what about standards,’ because companies just stand up and say, ‘we need standards.’ So it's been an easy job.
Learning from security
At a high level, AI risk management can take advantage of many of the best practices learned in managing cyber security risks. However, a few idiosyncrasies of AI software set it apart from other types of software and need special attention. Rhode says:
I think one of the main things with AI technologies is, you're not explicitly programming. You're showing the algorithm what to do instead of telling it what to do. And that means that there can be unexpected behaviors. As a result, doing verification and validation and versioning of software is much harder because the bulk of stuff you have to track is huge.
For example, Large Language Models (LLMs) are often trained on data drawn from much of the public Internet, making it harder to identify the root cause of bias, misinformation, and hallucinations. AI also introduces new socio-technical risks that can arise in conversational interfaces and affect the people who interact with them or are otherwise impacted.
The security industry has also developed numerous collaboration organizations, such as the SANS Institute and OWASP, that help apply best practices to various aspects of security. In the case of AI, Rhode said adjacent standards and collaborations will similarly be required for the different aspects of AI risk management, particularly as it is applied to specific industries like healthcare or finance. These include issues around algorithmic impact assessment and mitigating bias.
ISO is working on a few of these. It has already published standards on bias mitigation for AI systems and on robustness and functional safety. Later this year, it plans to publish another on algorithmic assessment that looks at the impacts of AI models on different populations. Rhode recommends organizations check out the AI Standards Hub to help inform their own programs. She explains:
We're also involved in the AI Standards Hub program because we recognize how complex this space is. It is funded by the UK Government, along with the UK AI Research Institute and our National Metrology Institute, which is like the UK version of NIST. We have a whole kind of searchable database of the standards out there that allows people to see what's going on. It’s a very nice interface and has a community side to it. People can track the progress of standards, like if you want to know when it's had a revision or being published.
Another aspect of the standard lies in building a culture and processes for continuous learning. On the culture side, it’s important to recognize that at an organizational level, progress requires a team effort. While leadership may start with some high-level goals, people on the front lines often see many new problems first. Similar to data literacy efforts, enterprises also need to think about cultivating AI literacy so that people know what kinds of issues to look for and to ensure they have a common language for discussing and prioritizing issues that arise.
On the process side, it’s also important to consider that AI systems and their results continuously evolve. For example, a new AI model that works perfectly on the day it is deployed can lose accuracy over time in response to shifting behaviors, new trends, or other issues. Additionally, many AI models employ reinforcement learning that can often help improve the performance of models in some metrics over time but also inadvertently reduce other qualities that may not have been considered at deployment.
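To make the drift problem concrete, here is a minimal Python sketch of one way a team might watch for it. Nothing here comes from ISO/IEC 42001 itself: the class name AccuracyMonitor, the window size, and the tolerance threshold are all hypothetical choices for illustration, and the sketch assumes that ground-truth labels eventually become available for at least some predictions.

```python
from collections import deque


class AccuracyMonitor:
    """Hypothetical helper: tracks accuracy over the most recent labeled predictions."""

    def __init__(self, baseline_accuracy, window=100, tolerance=0.05):
        # baseline_accuracy: accuracy measured at deployment time (assumed known)
        # window: how many recent outcomes to keep
        # tolerance: how far below baseline we allow before flagging drift
        self.baseline = baseline_accuracy
        self.tolerance = tolerance
        self.outcomes = deque(maxlen=window)  # oldest outcomes drop off automatically

    def record(self, prediction, actual):
        """Store whether a single prediction matched the later-observed label."""
        self.outcomes.append(prediction == actual)

    def rolling_accuracy(self):
        """Return accuracy over the current window, or None if nothing is recorded."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def has_drifted(self):
        """Flag drift when rolling accuracy falls below baseline minus tolerance."""
        accuracy = self.rolling_accuracy()
        return accuracy is not None and accuracy < self.baseline - self.tolerance


# Example usage with made-up predictions and labels
monitor = AccuracyMonitor(baseline_accuracy=0.95, window=4, tolerance=0.05)
for prediction, actual in [(1, 1), (0, 0), (1, 0), (0, 1)]:
    monitor.record(prediction, actual)
print(monitor.rolling_accuracy(), monitor.has_drifted())
```

A check like this only covers one metric. The point in the paragraph above about gains in some metrics masking losses in others suggests tracking several such monitors side by side.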
AI metrics are also a work in progress. For example, what is the best way to quantify bias, algorithmic impact, and hallucinations in an easily measurable, comparable way? BSI is working with the National Physical Laboratory, which serves as the UK's previously mentioned National Metrology Institute (NMI), to help define and curate a set of meaningful metrics that might guide AI risk management efforts.
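As an illustration of what a simple, comparable bias metric can look like, the sketch below computes the demographic parity difference: the gap between the highest and lowest rate of positive predictions across groups. This is one commonly discussed candidate, not a metric that the article's sources or ISO/IEC 42001 endorse, and the function names and sample data are hypothetical.

```python
def selection_rates(predictions, groups):
    """Return the share of positive (1) predictions for each group label."""
    totals, positives = {}, {}
    for prediction, group in zip(predictions, groups):
        totals[group] = totals.get(group, 0) + 1
        positives[group] = positives.get(group, 0) + (1 if prediction == 1 else 0)
    return {group: positives[group] / totals[group] for group in totals}


def demographic_parity_difference(predictions, groups):
    """Gap between the highest and lowest group selection rates (0 means parity)."""
    rates = selection_rates(predictions, groups)
    return max(rates.values()) - min(rates.values())


# Example usage with made-up data: group "a" is approved twice as often as group "b"
predictions = [1, 1, 0, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
print(selection_rates(predictions, groups))
print(demographic_parity_difference(predictions, groups))
```

A single number like this is easy to compare across models, but it says nothing about why the gap exists or whether it is justified, which is part of why metrics remain a work in progress.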
Enterprises need to think about AI risk management as a process of building trust, not only in the AI tools themselves but also in the culture for implementing them and the measures for tracking progress or setbacks. Rhode says:
I think the key value from it is really around building trust and confidence in the system that you are developing whether you're developing systems that you're going to use internally, or that you're going to use to support customers. And again, if you are looking to procure AI, you can check for certification against this. There are so many considerations to look for in a reliable system which span from applying your traditional IT security to looking at these kinds of more socio-technical risks. That is very difficult for an organization or to put that pressure on a set of developers to make sure you've just made the system secure and reliable. It's a very interdisciplinary topic. So, I think the standard can give organizations confidence that they've considered a really wide range of key risks that's been developed by experts in the field who've worked with the technology, and they've seen what's important in this case.
Ongoing outrage in the UK over the Horizon IT scandal was stirred up again by a recent documentary. The catalyst was the implementation of an inaccurate accounting system, but the problem cascaded into a horrific personal tragedy afflicting thousands due to poor management and oversight practices. A stream of former and current executives and IT staff have been called to testify publicly about the incident, while the popular press has been sharing the horrible stories of the afflicted.
I asked Rhode if the new AI risk management standard might avert similar tragedies in the future. She explains:
I see this management standard as emphasizing the business aspects more than the technical aspects, and that the ones that kind of hang off it are the ones that go more deeply into the technical side of things. And there's a huge emphasis on testing and validation and putting tools around so that these tools aren't kind of just hallucinating and trying to come up with answers where they don't actually have a lot of data to back up, but they still produce something because that's what they're programmed to do.
And there's a big, educational piece that needs to go along with the best practice so that the best practice is kind of out there. But the big challenge organizations have is actually doing the implementation, and that's from awareness at the leadership level through to implementation at the technical level. And I think, you know, that some of the risks that we've seen in the news, you know, could have been mitigated by following these kind of best practice guides.