AI truth is "user assembly required"
- With all the focus on what generative AI is adding, we're ignoring what it's taking away, argues Salesforce's Peter Coffee.
We commonly hear the phrase “assembly required” when something is delivered to us in pieces that we’ll have to put together. What about the opposite? What about “disassembly unnoticed”? What happens when our experience has only been with pre-assembled parts that we did not even realize were separable? When these things get taken apart, will we learn to notice what’s now missing?
Specifically, the growing conversation around generative AI is mostly focused on what it can add to our toolkits (or what it might replace in our labor force); far less noticed is what it is disassembling — and removing — from what used to be built in and taken for granted.
What led me to this line of thinking was April’s launch event of a dense and massive book (447 pages in a reading-glasses-required font), the Routledge Handbook of Risk Management and the Law – for which I wrote Chapter 20, entitled “Value and Risk in the Fourth Informational Revolution”. (I was not paid for that contribution beyond a courtesy copy of the book, and I get no royalties from its sales, so this is not a self-serving plug.) What triggered the questions above, and the comments that follow below, was a question candidly raised that day – whether the time scale of putting ideas between hard covers can possibly keep up with today’s pace of change, especially in all things having to do with information and computation.
I argued at the time that it is not only still possible for a book to capture and illuminate timeless truths; it is actually more necessary than ever to consider, and document, the many different components of information, and the risks and the rules that attend their interactions, when things that used to be integrated are becoming unfamiliarly independent.
Consider the relationship between information and trust. When the only way to learn something was by being told about it, by another human being, the source was inherently linked with the content. The desire for a source as close as possible to the truth is literally part of our language: since at least 1896, the date of the first known published use of the phrase, we’ve figuratively said “I heard it straight from the horse’s mouth” – because people betting on horse races wanted to get their information from a source as close as possible to the horse.
Once we invented writing, and then moved on to mechanized printing, we lost that integral connection between “what’s said” and “says who?”: it became possible for something to be said, and to be shared at scale, with no way to qualify it by the position or the probity of the source. We started to have to ask, “What’s their agenda?” – and note that the root of “agenda” is not merely, “things to be done,” but actually a Latin word meaning “things that ought to be done.” We had to go looking for the intention, and not just the objective meaning, of what we were being told – because knowledge of who was telling us no longer came reliably pre-assembled.
If we couldn’t inherently know “who says so?”, we could at least hope to know “why should we believe this?” – and when most of what got written down was records of trade and industry, the shared motive all around was accuracy. Mesopotamian cuneiform, “the world’s first abstract writing” (observes the frequently-worth-reading Tim Harford), wasn’t used for poetry or correspondence, but to create the world’s first accounts – and soon afterwards, to make contracts. Adds Harford, “Writing wasn't a gift from the gods. It was a tool that was developed for a very clear reason: to run an economy.” (I guess this makes cuneiform the lineal ancestor of COBOL, which may turn out to last a comparably long time.)
Over time, though, people started to use writing for sharing of opinions and fictions, as well as to record transactions and to document future commitments. Furthering an agenda became, not an error or a distortion, but a principal use of the technology – and we therefore needed an entire apparatus of qualification. We invented auditors and editors; when the volume of writings grew to global scale, we invented search engines that attempted to weigh relevance and credibility of writings with algorithms reflecting people’s evaluations. Google PageRank would be among the examples – but note well that the final consumer of the information that’s served up by a search engine still gets to see a “says who?”, providing at least an opportunity for critical filtering.
What then of generative AI? One of my colleagues here at Salesforce said of these models a week ago, correctly, that “their job is to generate from a prediction, and there is no concept of truth.” This isn’t a bug that can be fixed: it’s an inherent either-or between two different and incompatible goals for these systems. As noted by Stephen Wolfram, “there’s an ultimate tradeoff between capability and trainability: the more you want a system to make ‘true use’ of its computational capabilities, the more it’s going to show computational irreducibility, and the less it’s going to be trainable. And the more it’s fundamentally trainable, the less it’s going to be able to do sophisticated computation.” There is no “says who?”; there is no fitness function that rewards improvement of accuracy; there is no explicit visibility, let alone editability, of the components that have been mashed up into the output that we receive from a prompt.
Detecting plagiarism, recognizing hallucination, and complying with basic requirements of GDPR and similar strictures are going to be obstacles (perhaps even roadblocks) for GPT models built on public data – as opposed to models that are constructed to assemble the data of an enterprise into a conversational knowledge base, which may turn out to be the far more valuable role for this technology. The use of GPT to answer the Lew Platt question, “What does our company know?”, seems likely to be more valuable than the answer to “What does ‘the world’ believe?”
In the meantime, though, we have professional obligations to understand the components of information; to recognize and optimize the risks arising from assembling, applying, and relying on that information; and to be aware of, and responsibly compliant with, the relevant (and rapidly changing) laws. Further, an increasingly information-aware marketplace will be watching us, and judging us against rising expectations and (perhaps unconscious) assumptions, concerning the content and source and motivation of what that market hears and sees.
These things, we still have time (and actually increasing need) to put between hard covers. And open the books. And think about what we read.