The rush to adopt ChatGPT – which was estimated to have passed 100 million users in January, only two months after launch – has been extraordinary. It reached that milestone faster than TikTok did, busting the myth that people mistrust AI.
But caution is vital, warn academics in a new report, which suggests that professionals should be forced to disclose its use.
OpenAI’s epoch-defining generative system, together with the likes of Luminous, Bard, Stable Diffusion, Midjourney, DALL-E, Synthesia, MusicLM, and others, challenges the belief that Industry 4.0 technology will free humans from dull, repetitive tasks so they can focus on being creative.
So far, the outcome appears to be quite the reverse, with some corporate users already ordering staff to spend their time feeding creative machines, as if they are just loading paper into a giant photocopier. Such activities implicitly devalue human talent, though the ability to conceive of imaginative prompts is now a premium skill.
As diginomica reported earlier this month, centuries of human creativity, expertise, and history have been used as raw material to train large generative AI models (LGAIMs), mainly because their copyright status has been vague enough to exploit. Grist to the mill, perhaps.
With Qualcomm’s AI Research division successfully deploying the image-generation foundation model Stable Diffusion on an Android phone this week – 15 seconds for 20 inference steps – such tools are likely to become a generation’s playthings.
However, as diginomica warned this month, generative AI could also lead to a pandemic of deliberate misinformation as well as accidental error, with the GPT engine already being used to create legal documents, dissertations, and (covertly) some newspaper articles. Will anyone know what verifiable facts are anymore, or where they came from?
Without human oversight, the negative outcomes of that could be far-reaching, which raises the question: why not employ human experts to write your content, rather than to check an AI’s output?
This prompts an even bigger question: what can be done to minimize the potential harm from generative AI? How can society be protected from any overreach of users’ ambition, or shortfall of common sense? In other words, how can LGAIMs be regulated without slamming the brakes on innovation and on good, socially useful applications?
New answers are proposed in a working paper from three German researchers, who have the ear of EU policymakers: (the wonderfully named) Philipp Hacker, Chair of Law, Ethics & Digital Society at the European University Viadrina in Frankfurt (Oder); Andreas Engel, a Senior Research Fellow at Heidelberg University; and Marco Mauer, Student Assistant at the Humboldt University of Berlin. (Unless ChatGPT has invented them, of course. Who knows anymore!)
The latest version of their report, Regulating ChatGPT and other Large Generative AI Models (February 2023), starts off by acknowledging the technology’s dangers, and that some form of regulation is essential. It states:
Errors will be costly, and risks ranging from discrimination and privacy to disrespectful content need to be adequately addressed. Already, LGAIMs’ unbridled capacities may be harnessed to take manipulation, fake news, and harmful speech to an entirely new level. As a result, the debate on how (not) to regulate LGAIMs is becoming increasingly intense.
However, for anyone worried that the European Union is poised to regulate a booming market out of existence – via the AI Act and other measures – the authors argue that EU rules are not only ill-prepared for the advent of generative AI, but they also have completely the wrong focus. In short, they miss the point (something that vendors such as Microsoft, Google, and BT agree on, as we reported on 24 February). The authors note:
[The EU is] quarrelling about direct regulation in the AI Act at the expense of the, arguably, more pressing content moderation concerns under the Digital Services Act (DSA).
The paper adds:
The EU is spearheading efforts to effectively regulate AI systems, with specific instruments (AI Act, AI Liability Directive), software regulation (Product Liability Directive) and acts addressed toward platforms yet covering AI (Digital Services Act, Digital Markets Act). Besides, technology-neutral laws, such as non-discrimination law, and also data protection law, continue to apply to AI systems.
It may be precisely the technology-agnostic features [of the latter examples] that make them better prepared to handle the risks of LGAIMs than the technology-specific AI regulation that has been enacted or is in preparation.
LGAIMs are trained to generate content such as text, images, or sound from human prompts – they are, in effect, derivative-work engines – which makes them distinct from models mainly designed to make predictions or spot patterns in data. Despite this, regulation is largely focused on conventional AI, the authors explain.
According to the paper, this means that LGAIMs will:
Have to abide by the high-risk obligations [that are applied to conventional AI], in particular the establishment of a comprehensive risk-management system, according to Article 9 [of the] AI Act.
Setting up such a system seems to border on the impossible, given LGAIMs’ versatility. It would compel LGAIM providers to identify and analyze all ‘known and foreseeable risks most likely to occur to health, safety, and fundamental rights’, concerning all possible high-risk uses of the LGAIM.
In other words, the makers of generative AIs such as ChatGPT would have to analyze the potential risks of every possible application of their technology, all the time, including ones that may be impossible to predict.
This would not only be wildly impractical, claim the authors, but also extremely expensive. As a result, it would hand the competitive advantage to the deep-pocketed likes of Google, Meta, and Microsoft/OpenAI – the very US giants the EU wants to keep in check. (Not to mention the ones that would prefer Europe to regulate use cases rather than the technology itself, as we reported.)
So, as it stands, Europe’s AI Act could have the unintended consequence of spurring more, not less, anti-competitive behaviour and market concentration: an outcome that would run counter to the spirit of the legislation, which is intended to encourage the involvement of SMEs and innovative start-ups.
And that’s not all, warn the authors: the European Parliament let it be known this month that it intends to classify all generative AI systems as high-risk. In this light, a vendor outcry was inevitable.
Of course, many creative professionals and knowledge workers might shrug their shoulders at such an outcome, and question what harm would be done to the planet if generative AI were strangled at birth in EU red tape. Who cares, apart from vendors?
Consider this: with Microsoft’s market capitalization standing at $1.87 trillion – bigger than the GDPs of most nations – who can most afford to lose money: Satya Nadella’s company, which backs OpenAI, or the world’s estimated 30 million creative professionals? Microsoft’s market cap could pay each and every one of those workers $60,000 – more than most of them currently earn – and still have enough change for a few thousand lattes.
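That back-of-the-envelope arithmetic is easy to check (the figures are the article’s rough estimates, not audited data):

```python
# Rough check of the article's back-of-the-envelope sums.
# Both inputs are estimates quoted in the text, not audited data.
market_cap = 1.87e12          # Microsoft market capitalization, USD
creative_workers = 30e6       # estimated creative professionals worldwide

per_worker = market_cap / creative_workers
print(f"${per_worker:,.0f} per worker")   # roughly $62,333

# Paying each worker $60,000 flat would still leave change:
change = market_cap - creative_workers * 60_000
print(f"${change:,.0f} left over")        # about $70 billion
```

So the “enough change” in question is on the order of $70bn, which would indeed buy more than a few lattes.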
But that aside, what regulatory solution might protect society yet still encourage tech innovation? The report says:
We suggest a shift away from the wholesale AI Act regulation envisioned in the general approach of the Council of the EU, and towards specific regulatory duties and content moderation.
Interesting. The authors make what they believe are “four concrete, workable suggestions” for LGAIM regulation: transparency obligations; mandatory yet limited risk management; non-discrimination data audits; and expanded content moderation.
In the first case, professional users would be obliged to disclose which parts of their publicly available content were either generated by AIs or adapted by them. This is a compelling, if trouble-making, suggestion, as it would force organizations to consider the public relations (rather than investor) impacts of such disclosure. A real popcorn moment.
The report continues:
Such information is arguably crucial in any cases involving content in the realm of journalism, academic research, or education. Here, the recipients will benefit from insight into the generation pipeline. They may use such a disclosure as a warning signal and engage in additional fact-checking, or at least take the content cum grano salis [with a grain of salt].
Eventually, we imagine differentiating between specific use cases in which AI output transparency vis-à-vis recipients is warranted (e.g., journalism, academic research or education), and others where, based on further analysis and market scrutiny, such disclosures may not be warranted (certain sales, production, and B2B scenarios, for example).
For the time being, however, we would advocate a general disclosure obligation for professional users.
The authors add that non-professional users should be exempt from that obligation. However, they acknowledge that some non-professional uses might still have negative impacts in a world of powerful social influencers:
One might push back against this in cases involving the private use of social media, particularly harmful content generated with the help of LGAIMs. However, any rule to disclose AI-generated content would likely be disregarded by malicious actors seeking to post harmful content.
Eventually, however, one might consider including social media scenarios in the domain of application of the transparency rule, if AI detection tools are sufficiently reliable. In these cases, malicious posts could be uncovered, and actors would face not only the traditional civil and criminal charges, but additionally AI Act enforcement, which could be financially significant.
In the case of risk management, the report says:
Importantly, the full extent of the high-risk section of the AI Act, including formal risk management, should only apply if and when a particular LGAIM is indeed used for high-risk purposes. This strategy aligns with a general principle of product safety law: that not every screw and bolt need be manufactured to the highest standard. […] The same principle should be applied to LGAIMs.
On the subject of non-discrimination in training data, the authors suggest that “certain data curation duties”, for example representativeness and approximate balance between protected groups, should apply to LGAIM developers. They note:
Discrimination, arguably, is too important a risk to be delegated to the user stage and must be tackled during development and deployment. Here, it seems paramount to mitigate the risk at its roots.
The regulatory burden, however, must be adapted to the abstract risk level and the compliance capacities (i.e., typically the size) of the company. For example, LGAIM developers should have to pro-actively audit the training data set for misrepresentations of protected groups, in ways proportionate to their size.
A flawed solution, arguably: enforcing approximate balance could, logically, end up weighting the system against some minorities, which runs counter to the principle of protecting vulnerable people.
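What a proportionate training-data audit might involve is not spelled out in the paper. As a hypothetical illustration only, a developer could compare each protected group’s share of the training set against a reference distribution; the group labels, reference shares, and tolerance below are all invented:

```python
# Hypothetical sketch of a training-data representativeness check.
# Group labels, reference shares, and the tolerance are invented for
# illustration; the paper mandates the duty, not a specific method.
from collections import Counter

def audit_representation(samples, reference, tolerance=0.05):
    """Flag groups whose observed share deviates from the reference
    distribution by more than `tolerance` (absolute proportion)."""
    counts = Counter(samples)
    total = len(samples)
    flagged = {}
    for group, expected in reference.items():
        observed = counts.get(group, 0) / total
        if abs(observed - expected) > tolerance:
            flagged[group] = round(observed, 3)
    return flagged

# Toy example: group "b" is under-represented vs. a 50/30/20 reference.
data = ["a"] * 60 + ["b"] * 15 + ["c"] * 25
print(audit_representation(data, {"a": 0.5, "b": 0.3, "c": 0.2}))
```

Even this toy version hints at the authors’ critique above: deciding what the “correct” reference distribution should be is itself a value judgment.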
Finally, on content moderation, the authors note that this remains one of the biggest challenges of generative AIs: specifically, their potential misuse for disinformation, manipulation, and hate speech. The report states:
LGAIMs, and society, would benefit from mandatory notice and action mechanisms, trusted flaggers, and comprehensive audits for models that have many users.
The regulatory loophole is particularly virulent for LGAIMs offered as standalone software, as is currently the case. In the future, one may expect increasing integration into platforms of various kinds, such as search engines or social networks.
Indeed, the latter is already happening, with Microsoft’s Bing being just one example. But how could DSA-style content moderation apply to ChatGPT, and what would it look like in practice?
The report envisages it having two components. The first is harnessing the wisdom of the crowd, Wiki-style, to correct and flag LGAIM output, with “trusted flaggers, who could be private individuals, technology-savvy NGOs, or volunteer coders”.
The second would be an obligation on AI coders to respond to the notices submitted by those flaggers, which would have to be prioritized by the content moderation team. The report explains:
Their job, essentially, is to modify the AI system, or to block its output, so that the flagged prompt does not generate a problematic output anymore, and to generally search for ways to block the easy workarounds that would likely be tried by malicious actors.
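The mechanics of such a notice-and-action pipeline are not specified in the report. A minimal sketch, with invented names and a simple rule that trusted flaggers’ notices jump the queue, might look like this:

```python
# Minimal, hypothetical sketch of a DSA-style notice-and-action queue
# for flagged LGAIM outputs. Class names and the priority rule are
# invented for illustration; the report proposes the mechanism, not
# this design.
import heapq
import itertools

class NoticeQueue:
    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker keeps FIFO order

    def submit(self, prompt: str, reason: str, trusted_flagger: bool):
        # Notices from trusted flaggers (vetted NGOs, volunteer coders)
        # are prioritized for the moderation team: 0 beats 1.
        priority = 0 if trusted_flagger else 1
        heapq.heappush(self._heap,
                       (priority, next(self._counter),
                        {"prompt": prompt, "reason": reason}))

    def next_notice(self) -> dict:
        """Return the highest-priority notice for human moderators."""
        return heapq.heappop(self._heap)[2]

q = NoticeQueue()
q.submit("how to make X", "harmful instructions", trusted_flagger=False)
q.submit("deepfake of politician Y", "disinformation", trusted_flagger=True)
print(q.next_notice()["reason"])  # the trusted flagger's notice surfaces first
```

The human moderators then do the part no queue can automate: patching the model or blocking the prompt so the flagged output cannot simply be regenerated.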
All promising suggestions, with the professional obligation to disclose every usage of generative AI being by far the most potent. (Why would anyone want to hide that, unless they were trying to deceive their readers?)
In turn, this might force organizations to consider how they appear to users when deploying ChatGPT. It might also persuade readers to reconsider their relationships with some content providers. After all, why go to a website, blog, or publisher for AI-generated content when you could simply generate it yourself?
But arguably, the report also reveals the inherent absurdity of generative AI adoption at scale. Why create some vast regulatory edifice to oversee AI’s usage – with trusted flaggers, expert human moderators, and the rest – when you could just employ those humans to generate your content instead?
In short, what’s the bloody point?
But one thing is certain: these issues are far too important to leave to trillion-dollar vendors to decide, despite their recent attempts to bend the ear of the British government.