Assessing the NIST AI Risk Management Framework
- NIST's AI Risk Management Framework gives organizations a lens for managing AI risks - but mentions of AI ethics take a back seat. Is this a problem, or a better approach?
The NIST (National Institute of Standards and Technology) AI RMF (Risk Management Framework) is a set of high-level, voluntary guidelines and recommendations that organizations can follow to assess and manage risks stemming from the use of AI. The framework is still under development; NIST has posted the latest iterations. It is not statute, regulation or law.
The AI RMF aims to support organizations of all sizes to “prevent, detect, mitigate and manage AI risks.” It is intended for any organization developing, commissioning or deploying AI systems and is designed to be non-prescriptive, as well as industry and use-case agnostic.
Just one observation before describing the framework, a tally of how often certain words appear in the document:
- Ethics - 0
- Ethical - 4
- Ethicist - 0
- Philosophy - 0
- Govern - 66
- Risk - 268
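Tallies like the one above are simple to reproduce. A minimal sketch (the sample text and exact counting method are illustrative; substring matching means "Govern" also counts words like "governance," which is likely how the totals above were produced):

```python
def term_counts(text: str, terms: list[str]) -> dict[str, int]:
    """Count case-insensitive substring occurrences of each term."""
    lowered = text.lower()
    return {t: lowered.count(t.lower()) for t in terms}

# Illustrative usage; the RMF draft PDF would first need to be
# converted to plain text.
sample = "Risk management: govern risk, measure risk."
print(term_counts(sample, ["Ethics", "Govern", "Risk"]))
# {'Ethics': 0, 'Govern': 1, 'Risk': 3}
```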
Granted, this framework is not presented as just another of the endless high-level statements of AI ethics and principles. It is meant as a guide for organizations to manage and moderate the risk of this technology. It does feel, though, that the desired qualities of AI (trustworthy and responsible, valid, reliable, safe, secure and resilient, explainable and interpretable, and privacy-protected) come across as merely a means to an end, with risk management as the real focus.
The AI RMF “Core”
There are four core elements of the AI RMF: Map, Measure, Manage and Govern. Each is broken down into subcategories with players and activities assigned. There is too much detail to review in an article, but the essence is:
- Map: Context is recognized, and risks related to context are identified
- Measure: Identified risks are assessed, analyzed and tracked
- Manage: Risks are prioritized and acted upon based on the projected impact
- Govern: A culture of risk management is cultivated and present
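The Map, Measure and Manage functions can be made concrete as a simple risk register. This is purely an illustrative sketch; the field names and priority formula are my assumptions, not NIST terminology:

```python
from dataclasses import dataclass, field

@dataclass
class RiskEntry:
    # Map: the context of use and the risk identified within it
    context: str
    risk: str
    # Measure: assessed probability and severity (scale is an assumption)
    likelihood: float = 0.0
    impact: float = 0.0
    # Manage: mitigations chosen for prioritized risks
    mitigations: list[str] = field(default_factory=list)

    def priority(self) -> float:
        # Manage: prioritize by projected impact
        return self.likelihood * self.impact

register = [
    RiskEntry("loan scoring", "disparate impact on protected groups",
              likelihood=0.4, impact=0.9,
              mitigations=["bias audit", "human review of denials"]),
    RiskEntry("chat assistant", "leakage of training data",
              likelihood=0.1, impact=0.7),
]

# Act on the highest-priority risks first
for entry in sorted(register, key=RiskEntry.priority, reverse=True):
    print(entry.risk, round(entry.priority(), 2))
```

The Govern function is the surrounding culture and process, which is why it does not reduce to a field on the record.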
One of the more striking sections of the report is how thoroughly it debunks the human-in-the-loop (HITL) approach. HITL is a last resort, a poorly conceived mechanism for assigning blame when AI applications run rampant. It doesn't work. I wrote about it in 2011, and the NIST report gives it a good shake too:
> One common strategy for managing risks in high-risk settings is the use of a human “in-the-loop” (HITL). Unclear expectations about how the HITL can provide oversight for systems and imprecise governance structures for their configurations are two points for consideration in AI risk management. Identifying which actor should provide which oversight task can be imprecise, with responsibility often falling on the human experts involved in AI-based decision-making tasks.
>
> In many settings such experts provide their insights about particular domain knowledge, and are not necessarily able to perform intended oversight or governance functions for AI systems they played no role in developing. It isn’t just HITLs; any AI actor, regardless of oversight role, carries their own cognitive biases into the design, development, deployment, and use of AI systems. Biases can be induced by AI actors across the AI lifecycle via assumptions, expectations, and decisions during modeling tasks. These challenges are exacerbated by AI system opacity and the resulting lack of interpretability. The degree to which humans are empowered and incentivized to challenge AI system suggestions must be understood.
What’s missing from the framework?
Deliberate or not, the report develops no solid framework for the following:
- Lack of alignment with business objectives
- Poor data quality
- Lack of collaboration between teams
- The skills gap
- Failure to anticipate how AI interacts with other systems
Some of these things are mentioned, but without much specificity. Launching an AI project is different from other IT projects. Operational applications apply consistent rules and decision trees predictably and can be observed and understood. Despite their anthropomorphized names, machine learning and deep learning applications do not deal with neuroscience; they are based on probability and discovered patterns. Their behavior is both emergent over time and unpredictable.
False positives and false negatives are common because the “logic” isn’t apparent. The massive number of iterations can lead to unanticipated bias and other untrustworthy results that affect ROI and reputation. Techniques for a successful implementation in production are unlike those for developing the project, and unlike those for operational systems. The danger of an inferencing system producing incorrect, inconsistent and even dangerous results in production, unnoticed until it’s too late, can be avoided.
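The false-positive/false-negative trade-off described above can be made concrete with a toy classification threshold (all scores and labels here are invented for illustration):

```python
# Toy example: model scores and true labels. Moving the decision
# threshold trades false positives for false negatives; neither
# error type is visible from the model's "logic" alone.
scores = [0.95, 0.80, 0.65, 0.55, 0.40, 0.30, 0.20, 0.10]
labels = [1,    1,    0,    1,    0,    1,    0,    0]

def confusion(threshold: float) -> tuple[int, int]:
    """Return (false positives, false negatives) at a given threshold."""
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    return fp, fn

for t in (0.25, 0.50, 0.75):
    fp, fn = confusion(t)
    print(f"threshold={t}: false positives={fp}, false negatives={fn}")
# threshold=0.25: false positives=2, false negatives=0
# threshold=0.5: false positives=1, false negatives=1
# threshold=0.75: false positives=0, false negatives=2
```

No threshold eliminates both error types; the choice is a business and risk decision, which is exactly why such systems need ongoing monitoring in production.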
Another thing the NIST AI RMF missed is what I call subsequential bias: looking over the horizon at the secondary and tertiary unintended effects of your model. As your model operates, no matter how thoroughly you scrubbed out the unethical aspects, its results can create an environment for unethical secondary and tertiary effects. I also didn't see anything about adversarial perturbation or immutability. Still, I like the framework. I'm weary of high-level proclamations and principles; this is what organizations need.
It’s refreshing that this document doesn’t dwell on the usual AI ethics issues. A few years ago, I was conducting AI ethics workshops, and it gradually became evident that the attendees had never been exposed to the discursive nature of ethics. They understood the subject matter and even enjoyed the exercises dealing with ethical dilemmas, but most did not make the connection to apply it to their work in specific terms. NIST has done the industry a genuine service by creating a working framework that organizations can use and modify to fit their particular circumstances.