
AI pioneers call for guardrails on ‘autonomous AI’

George Lawton, October 26, 2023
AI pioneers Yoshua Bengio and Geoffrey Hinton, along with 23 other academic researchers, have called for ‘urgent governance measures’ to protect society against the danger of ‘Autonomous AI.’ They argue national institutions need strong technical expertise and the authority to act quickly.


Yoshua Bengio and Geoffrey Hinton are sometimes called the “Godfathers of AI” for their steadfast belief in, and ongoing work to prove, that neural networks had a future despite pushback from others in the early ’90s. Now they, along with 23 other academic researchers, are calling for urgent action in an open letter.

Given the stakes, they are calling on major tech companies and public funders to allocate at least one-third of their AI R&D budgets to ensure safety and ethical use. They also want regulators to have more oversight and control of autonomous AI systems. 

The letter opens by elaborating on many risks and ethical concerns that researchers have been raising for some time. Powerful AI could amplify social injustice, erode social stability, amplify misinformation, and enable large-scale criminal or terrorist activities.

But perhaps the most important new point is their argument that these risks could soon be amplified by autonomous AI that can plan, act in the world, and pursue goals. Whether or not you believe that artificial general intelligence is just around the corner, autonomous systems are here today. So far, however, they have been relatively limited in scope, covering tasks like driving cars, robotic process automation, and self-healing or self-reconfiguring IT systems. Enterprises have been actively pursuing these for years through concepts like hyper-automation, which aims to automate the creation of automations.

Now, the researchers argue, autonomous agents are quickly growing in capabilities to pursue more nebulous goals on their own. For example, ChatGPT was quickly adapted to browse the web, design chemistry experiments, use software tools, and contract with humans to defeat CAPTCHAs. It’s bad enough when ChatGPT hallucinates an inaccurate answer to a question, but what happens when its successors are collaborating to gain human trust, acquire money, and influence decision-makers?

And it’s not like a bad AI even needs to suddenly wake up. Enterprises are actively handing over power to automated systems to reduce costs, replace workers, automate processes, and pursue new opportunities.

Different kinds of autonomy

Autonomy in and of itself is not an entirely new concept. Enterprises have embedded some basic kinds of autonomy in IT systems for decades, such as automated fail-over mechanisms programmed to respond to an outage. These control systems have been relatively constrained in their ability to act. The new risks point to an entirely new kind of autonomy that is still in the early stages of development. 

For lack of a better framework, it seems helpful to consider three scopes of autonomy: guidance, scalability, and orchestration.

Autonomous guidance pertains to the ability of a system to navigate an environment or manage IT or operational technology on its own. While traditional script-based automation may break in response to novel circumstances, autonomous guidance systems can adapt and respond accordingly. Here, we have systems like self-driving cars, self-healing IT systems, and adaptive security systems.

Autonomous scalability pertains to the ability to learn from and automate existing manual processes. For example, tools like process mining and task capture can automatically interpret existing processes. They can then jump-start the creation of new automations, using robotic process automation and low-code/no-code tools to code, test, and deploy applications and automations at scale.

Autonomous orchestration and collaboration refers to the ability to collaborate with humans, other AI agents, cloud services, financial services, social media networks, and physical devices to pursue broader goals. 

We are already seeing early versions of autonomous orchestration and collaboration in warehouse management systems that optimize fleets of robots and in traffic management systems for unmanned aircraft. But these currently operate within a well-defined scope.

The danger these researchers point to is what happens when autonomous orchestration and collaboration agents break out of the confines of a warehouse or a drone management system. They suggest some scary possibilities:

To advance undesirable goals, future autonomous AI systems could use undesirable strategies—learned from humans or developed independently—as a means to an end. AI systems could gain human trust, acquire financial resources, influence key decision-makers, and form coalitions with human actors and other AI systems. To avoid human intervention, they could copy their algorithms across global server networks like computer worms. AI assistants are already co-writing a large share of computer code worldwide; future AI systems could insert and then exploit security vulnerabilities to control the computer systems behind our communication, media, banking, supply-chains, militaries, and governments. In open conflict, AI systems could threaten with or use autonomous or biological weapons. AI having access to such technology would merely continue existing trends to automate military activity, biological research, and AI development itself. If AI systems pursued such strategies with sufficient skill, it would be difficult for humans to intervene.

A big concern is that business leaders and government regulators may wilfully invite in these more capable and autonomous agents with the best of intentions. Companies, governments, and militaries may cut back on human verification to gain an edge and reduce costs.

My take

AI does not need to be generally intelligent to scale outside of our control. The issue here is the capacity for autonomous orchestration and collaboration. Hackers are already doing primitive versions of this with innovations to develop more resilient strains of malware. And shortly after the release of ChatGPT, tech-bros began bragging about how a new service called AgentGPT could automate more of their lives outside the control of OpenAI management.

Mustafa Suleyman, a co-founder of DeepMind, recently suggested the term artificial capable intelligence to characterize these more autonomously capable forms of AI. In his recent book, The Coming Wave, he argues that the Turing Test, in which an AI must convincingly pass as human, was passed with the rise of new large language model chatbots.

His modern Turing test would involve giving an AI agent a hundred thousand dollars with instructions to turn it into a million on the Amazon marketplace. He says:

We’ll come to robots later, but the truth is that for a vast range of tasks in the world economy today, all you need is access to a computer; most of global GDP is mediated in some way through screen-based interfaces amenable to an AI. The challenge is in advancing what AI developers call hierarchical planning, stitching multiple goals and subgoals and capabilities into a seamless process toward a singular end. Once this is achieved, it adds up to a highly capable AI, plugged into a business or organization and all its local history and needs, that can lobby, sell, manufacture, hire, plan—everything a company can do, only with a small team of human AI managers who oversee, double-check, implement, and co-CEO with the AI.

While these types of autonomous agents may sound too good to be true, they may sort of work for some people who get lucky. Just as a few Bitcoin winners can convince millions of others to part with their money, houses, and good credit, so might a few autonomous AI billionaires convince a lot of others to follow them down the rabbit hole. The scary thing is that this swarm of autonomous AI agents could also spin out of control.

As the Managing AI Risks authors write:

Without sufficient caution, we may irreversibly lose control of autonomous AI systems, rendering human intervention ineffective. Large-scale cybercrime, social manipulation, and other highlighted harms could then escalate rapidly. This unchecked AI advancement could culminate in a large-scale loss of life and the biosphere, and the marginalization or even extinction of humanity.

Up until now, most discussion about responsible AI has simply been an extension of best practices we have already been pursuing with data science, analytics, algorithmic decision-making, and copyright. But here, these researchers are adding a genuinely new and important distinction that enterprises, regulators, researchers, and citizens all need to take seriously. 
