Robot empowerment - a viable alternative to Asimov's three laws of robotics?
- Human-robot interaction is upon us - we're in dire need of a framework that makes sense. Asimov's three laws of robotics are one model, but are they applicable to today's robots? An alternative based on robot "empowerment" is worth a close look.
It's become increasingly clear that human-robot interaction is expanding rapidly, and it's time to think about how this will work.
This article is going to be a little technical and involves some math, so I'll start with the conclusions:
- Growth in human-robot interactions is happening faster than we thought. These are not devices that possess advanced AI. They are everyday devices meant to work with humans, e.g., Automated Transportation, ATMs, Education, Surveillance and Safety, Cooking, Medicine, Home Maintenance, and Mining Equipment.
- A robot does not possess the same reality as humans.
- Therefore it is unlikely that robots and humans will ever share a language.
- Asimov's Laws cannot be programmed into a robot.
- Asimov's Laws include terms like "Human" and "Harm," which are so semantically ambiguous that a robot cannot understand them.
- And it is impossible to program every possible action and consequence into a robot.
- What is needed is something similar to Asimov's Three Laws, with embellishments, that is generic enough that it is universally applicable to robots and indifferent to their specific morphology.
So many things are problematic about AI today, particularly its abuse in social media, disinformation, bias, and just poorly made applications, that thinking about living with robots seems a bit remote. I hadn't given the issue a lot of thought until I came across a paper, Empowerment As Replacement for the Three Laws of Robotics. It's a provocative title for dense, academic writing, which I'll try to summarize.
The authors focused on the rapidly expanding use of robots interacting with humans today, not the fantastical vision of super-intelligent robots and how to control them. Their solution, surprisingly, is not control but empowerment. It reminds me of the Zen saying, "If your cow is unruly, give it a bigger pasture."
There is a great deal of mostly uninformed chatter about the dangers looming from robots with human-level or even super-human intelligence, and about how they can be kept from taking over from humans, if they can be at all.
In Rethinking AI Ethics, I proposed that Asimov's Three Laws were incomplete, inconsistent, and not realistic. Just to review, the laws were introduced in his 1942 short story Runaround and collected in I, Robot in 1950:
A robot may not injure a human being or, through inaction, allow a human being to come to harm.
A robot must obey the orders given by human beings except where such orders would conflict with the First Law.
A robot must protect its existence as long as such protection does not conflict with the First or Second Law.
The laws rely on concepts, such as "human being" and "harm," that require far more advanced cognitive ability than current and near-term robots possess. If you think about the Three Laws, how would you incorporate them semantically and cognitively into a robot? How would we explain "harm"? If you recall, in the 2004 film I, Robot, rogue robots could interpret the words "human" and "harm" as they wished. Because they observed humans harming other humans, they concluded they needed to kill humans to protect humans (a conclusion that humans often reach themselves).
As everyday robots become more ubiquitous, it's essential to understand they aren't people (even though we consistently anthropomorphize AI with words like train, learn, decide, neural, and names like Ross-IBM, Kensho, Zo-Microsoft, Amelia, Sofia, Alexa). The problem with incorporating the Three Laws into current-day robots is the difficult semantic comprehension of rules expressed in natural language. Even familiar and straightforward concepts, such as "harm," cannot be naively related to the robot's perspective. What that means is that robots have a radically different perspective from humans and an utterly different reality. It's doubtful that robots and humans could ever have even an approximation of a common language. If it is this challenging to build robots that understand what constitutes harm, how could the robot avoid inflicting it?
Asimov's Laws cannot be programmed into a robot. The authors of this paper agree, but not for the usual epistemological reasons. Instead, their approach does not try to establish an explicit, verbalized understanding of human language and conventions in the robot.
The empowerment concept
Empowerment endows the robot with the resources to react to a wide variety of different situations and types of robotic embodiment. This entire program is based on a mathematical model, not rules or code, and makes generous use of Metric Spaces, a branch of Topology. I studied Metric Spaces in the distant past, so I'll include this section in the event someone else has too. The authors claim this idea is mathematically developed, so it seems reasonable to include. A formal definition is (not necessary to read for this paper):
Metric space, in mathematics, especially Topology, an abstract set with a distance function, called a metric, that specifies a nonnegative distance between any two of its points in such a way that the following properties hold: (1) the distance from the first point to the second equals zero if and only if the points are the same, (2) the distance from the first point to the second equals the distance from the second to the first, and (3) the sum of the distance from the first point to the second and the distance from the second point to a third exceeds or equals the distance from the first to the third. The last of these properties is called the triangle inequality. The French mathematician Maurice Fréchet initiated the study of metric spaces in 1905. From: Metric space | mathematics | Britannica.
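To make the three properties concrete, here is a minimal Python sketch that tests them on a finite sample of points. The function names (`is_metric`, `euclidean`, `squared_euclidean`) and the sample points are my own illustrations, not anything from the paper. Checking a finite sample can only expose a violation; it cannot prove that a function is a metric everywhere.

```python
import math
from itertools import product

def is_metric(points, dist, tol=1e-12):
    """Check the three metric axioms on a finite sample of points."""
    for x, y in product(points, repeat=2):
        d = dist(x, y)
        if d < 0:
            return False  # distances must be nonnegative
        # (1) distance is zero if and only if the points are the same
        if (d <= tol) != (x == y):
            return False
        # (2) symmetry: d(x, y) == d(y, x)
        if abs(d - dist(y, x)) > tol:
            return False
    for x, y, z in product(points, repeat=3):
        # (3) triangle inequality: d(x, y) + d(y, z) >= d(x, z)
        if dist(x, y) + dist(y, z) < dist(x, z) - tol:
            return False
    return True

def euclidean(p, q):
    # Ordinary straight-line distance
    return math.dist(p, q)

def squared_euclidean(p, q):
    # Looks like a distance, but violates the triangle inequality
    return sum((a - b) ** 2 for a, b in zip(p, q))

# Hypothetical sample points, chosen only for illustration
sample = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (0.0, 3.0)]
euclid_ok = is_metric(sample, euclidean)           # passes all three checks
squared_ok = is_metric(sample, squared_euclidean)  # fails: 1 + 1 < 4
```

The squared distance fails on the three collinear points, which shows that the triangle inequality is the property that does the real work.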
It is surprising how many papers and books have been written about Metric Spaces based on such simple principles.
The whole concept of Empowerment is a formal mathematical model that has not been implemented in actual robots. This is a theory. But the model can be used generically to compute concepts such as self-preservation, protection of the human partner, and responding to human action. In that way, it can approximate Asimov's Three Laws of Robotics operationally without the need for a language. Some guidelines are added for the robot to effect actions based on the current set of factors and the robot's morphology. Metaphorically, it is the same as starting your Lincoln with a Toyota key fob without separate instructions. The authors propose three such policies:
First, "robot initiative": Because the principles are generic, the robot can apply them to novel situations, using new goals derived from recent cases.
Second, "breaking action equivalence": What if different actions all produce equivalent outcomes? What facilities does the robot have to choose one of several options, and can the robot optimize once it can ensure that the primary task is satisfied? (This is not a new concept. It is often referred to as Hierarchical Constraint Resolution: when there are multiple solutions that fit the criteria, just choose one.)
Finally, "safety around robots": The easy answer is the "kill switch," i.e., the drastic step of shutting down the robot. But if the robot is carrying out a vital function, where it is maintaining safety or preventing damage, the robot must act rather than stop acting. There is a clear need for generic, situation-aware guidelines that can inform and generate robot behavior.
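The second policy, breaking action equivalence, can be sketched in a few lines of Python. This is one plausible reading of the idea, not the authors' algorithm. The action names, the `reaches_goal` table, and the `options_left` scores (a stand-in for an empowerment-like secondary measure) are all hypothetical values made up for illustration.

```python
def choose_action(actions, satisfies_primary, secondary_score):
    """Keep actions that satisfy the primary task, then pick the best by a secondary score."""
    feasible = [a for a in actions if satisfies_primary(a)]
    if not feasible:
        return None  # no action satisfies the primary task
    return max(feasible, key=secondary_score)

# Hypothetical toy data: two paths both reach the goal, waiting does not.
actions = ["left_path", "right_path", "wait"]
reaches_goal = {"left_path": True, "right_path": True, "wait": False}
# Made-up secondary scores standing in for "how many options stay open".
options_left = {"left_path": 1.2, "right_path": 2.5, "wait": 3.0}

chosen = choose_action(
    actions,
    satisfies_primary=lambda a: reaches_goal[a],
    secondary_score=lambda a: options_left[a],
)
```

Note that "wait" has the highest secondary score but is never considered, because the primary task acts as a hard filter before any optimization happens.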
The authors propose to meet this need for generic guidelines with a formal, non-language-based method that captures the underlying properties of robots as tools. (Here I disagree. As a mathematician myself, I see Metric Spaces, and Topology in general, as a language. But I suppose their point is that it isn't Natural Language.)
Instead of employing language, they apply the information-theoretic measure of Empowerment. This is not an original robotics idea; it goes back to a 2008 paper, Keep Your Options Open: An Information-Based Driving Principle for Sensorimotor Systems, and to concepts from many papers on potential and causal information flow in general. The authors describe it as a "heuristic to produce characteristic behavioral phenomenologies which can be interpreted as corresponding to the Three Laws in certain, crucial aspects."
Sometimes, I think the language they use is more difficult to understand than the science. How do robots exhibit "characteristic behavioral phenomenologies?" I feel like they slipped for a moment into the anthropomorphizing syndrome.
Pressing on, the ability to cope with different, quite disparate sensorimotor configurations is desirable for the definition of general behavioral guidelines for robots. This means not defining them separately for every robot or changing them manually every time the robot's morphology changes.
Here's the math.
The authors claim they have worked this out mathematically, including a glimpse of how it applies to multi-step actions. It looks like this:
$$\mathfrak{E}(r) := C(A_t \to S_{t+1} \mid r) \equiv \max_{p(a_t \mid r)} I(S_{t+1}; A_t \mid r)$$
The simple explanation is that many actions can influence a state in the future, not only the next step but also later outcomes, say, at $t+3$, so the distribution of results is generated starting at time $t$.
The equation describes a "probabilistic communication channel," where the agent transmits information about its actions ($A_{t+1}$, $A_{t+2}$, $A_{t+3}$) through a channel and evaluates how it affects the outcome at time $t+3$ ($S_{t+3}$).
There's a maximization in here too, $\max_{p(a_t \mid r)} I(S_{t+1}; A_t \mid r)$, which is the maximal influence of an agent's actions on its future sensor states (its potential effect). That can be modeled as a Shannon channel capacity (another well-understood model from information theory): the largest amount of information about the agent's actions that can arrive at its sensors, which bounds what the agent may have changed by the end of its 3-step action sequence.
Another way to look at this and simplify it is that Empowerment is the channel capacity between the agent's actuators A in a sequence of time steps t and its sensors S at a later point in time.
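For readers who prefer code to notation, here is a minimal sketch of that channel-capacity view for a single time step. It uses the Blahut-Arimoto iteration, a standard way to compute the capacity of a discrete channel, which I am assuming here for illustration; the paper may compute it differently. The function names and the three toy "robots" are hypothetical. Each row of `p_s_given_a` is the distribution over next sensor states for one action, in one fixed world state r.

```python
import math

def _kl_per_action(p_s_given_a, p_a):
    """D(p(s|a) || p(s)) in nats for every action a, given the action distribution p_a."""
    n_s = len(p_s_given_a[0])
    p_s = [sum(pa * row[s] for pa, row in zip(p_a, p_s_given_a)) for s in range(n_s)]
    return [sum(p * math.log(p / p_s[s]) for s, p in enumerate(row) if p > 0)
            for row in p_s_given_a]

def empowerment_bits(p_s_given_a, iters=200):
    """Blahut-Arimoto estimate of max over p(a) of I(S; A), in bits."""
    n_a = len(p_s_given_a)
    p_a = [1.0 / n_a] * n_a  # start from a uniform action distribution
    for _ in range(iters):
        kl = _kl_per_action(p_s_given_a, p_a)
        # Shift probability toward actions whose outcomes are most distinguishable
        weights = [pa * math.exp(k) for pa, k in zip(p_a, kl)]
        total = sum(weights)
        p_a = [w / total for w in weights]
    kl = _kl_per_action(p_s_given_a, p_a)
    # Mutual information is the p(a)-weighted average of the per-action divergences
    return sum(pa * k for pa, k in zip(p_a, kl)) / math.log(2)

# Hypothetical robot in open space: four moves lead to four distinct sensor states.
free_robot = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
# Hypothetical boxed-in robot: every move leaves it in the same sensor state.
stuck_robot = [[1, 0, 0, 0]] * 4
# Hypothetical noisy robot: two moves, each with a 10% chance of the other outcome.
noisy_robot = [[0.9, 0.1], [0.1, 0.9]]

free_bits = empowerment_bits(free_robot)    # about 2 bits: log2 of 4 reachable states
stuck_bits = empowerment_bits(stuck_robot)  # 0 bits: actions change nothing
noisy_bits = empowerment_bits(noisy_robot)  # between 0 and 1 bit
```

The point of the toy cases is the ordering: the more reliably the robot's actions lead to distinct sensor states, the higher the number, and a robot that cannot change anything scores zero.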
A more visual way to depict this is with a Bayes Network:
Image via Frontiers in Robotics and AI
In the diagram, S represents the robot sensor, while Sh denotes the human sensor. The robot actuator is A, and, correspondingly, Ah is the human actuator. The rest of the system is represented by R. The index or subscript t denotes the time at which the variable is evaluated. The black arrows are the causal connections of the Bayesian network.
The dotted and dashed lines denote the three types of causal information flows of the three kinds of Empowerment. The direction of the potential causal flow for 3-step robot empowerment is seen as the red dotted arrows. The blue dotted arrows show human Empowerment. Lastly, human-to-robot transfer empowerment is seen as the dashed purple line.
I have this overwhelming feeling that they've pulled together a tidy model, but I'm not convinced that it meets the requirements. They've composed a reasonably simple framework, but when the authors added the need for additional "guidelines," I had the impression that the simplicity and plasticity of the approach could disintegrate. However, they have raised the issue that we need to control robots now, not in the future, and that the naïve hope of doing it with language is probably unrealistic.
Without some evidentiary data that this concept can work, I can offer no opinion, but I take from this article that academics in robotics are beginning to think about managing robots while giving them a measure of autonomy. Today, we have Natural Language Processing, and we're allowed to speak to a device and have it talk back. NLP programs can write credible prose and compose music. In some ways, the interactions are remarkable, but in no way does the machine have any idea what you or it is saying.
From the article "OpenAI's new language generator GPT-3 is shockingly good - and completely mindless":
OpenAI's GPT-3 is the largest language model ever created and can generate amazing human-like text on demand but won't bring us closer to true intelligence.
It is entirely contrived by training in a specific domain. This is what the authors mean when they say robots exist in a different reality from humans, and we need to turn our attention to efficient ways to manage robots instead of wishing they would just understand what we want.