Lead story - from model drift to UI replacement controversies, generative AI debates roil on
As companies grapple with the right/safe ways to proceed with generative AI, the diginomica team pushed for clarity. I intentionally stirred the pot with Can generative AI displace the enterprise UI? An unprompted debate with provocative implications:
Wouldn't that turn enterprise software vendors on their heads as well? Do you want your AI assistant to tell you, 'I can't complete that task, because you lack the necessary product licenses for this cloud or that cloud'? If AI is that disruptive to UIs, then won't enterprise software product categories (and their accompanying licenses) be part of the casualties?
I question whether vendors can pick and choose where to monetize generative AI - without confronting blowback to their own business models and approaches. For now, there are other pressing issues to consider, such as consumer Large Language Model drift. George parsed the results of a recent Stanford study in ChatGPT - gauging the emerging risks of consumer-facing model drift:
They found that GPT-4 accuracy on math problems dropped from 97.6% to 2.4%, while GPT-3.5 went up from 7.4% to 86.8%. The percentage of directly executable code for GPT-4 dropped from 52.0% to 10.0% and for GPT-3.5 from 22.0% to 2%, while they both also wrote slightly longer code. Visual reasoning scores went up slightly, with GPT-4 going from 24.6% to 27.4% and GPT-3.5 going from 10.3% to 12.2%.
As George points out, model drift is not a new tech problem. But it does cause particular issues for LLMs, which simply cannot be debugged like traditional software. He cautions:
Hopefully, few enterprises are relying on ChatGPT to write code or solve math problems. There are plenty of other tools purpose-built to solve that problem.
Yes, well, math is not a strength of any generative AI system - that's where combining with other forms of AI, better suited for computation, comes in (something I noted in my article). Coding is another matter. I believe generative AI will prove quite useful for enterprise coding, though it will not be the only tool in the low-code/no-code toolkit.
These types of LLM issues are not insurmountable problems, especially when we get to LLMs purpose-built for enterprises. But for now, during this 'maturity waiting period,' enterprises might be tempted to jump-start with third-party generative AI solutions. Those available now are typically based on some form of OpenAI's technology, or open source models that could be prone to drift issues as well. Test carefully out there - and exercise vigilance against Shadow AI freelancing, e.g. employees pulling generated code from such sources.
Move-fast-and-break-things might work for some startups; it's a terrible method for enterprise AI. Those aiming to get it done responsibly should check George's The five 'E's needed to put together a pragmatic AI strategy. For now, I conclude:
Enterprise AI is all about proper design for proper use cases, rather than indulging in AI overreach, e.g. "Can I fire my content marketing team?" I expect reinforcement learning will close those gaps further, but the very reason we are talking about the need for iterative model improvement via user input shows us why "AI is our UI" is - at best - a longer term consideration.
Diginomica picks - my top stories on diginomica this week
- Generative AI, meet the contact center - the latest silver bullet to kill off the traditional operating model? - Barb applies the topic-du-jour to a classic enterprise cost/pain point, the call center: "Milanovic and Ringman agreed that the first part of leveraging generative AI will be to support agents, but it will also shift to the front-end customer experience as confidence grows."
- Not where they want it to be - e-commerce tensions between Ocado and Marks & Spencer increasingly open - Stuart on a joint venture that hasn't (yet) panned out.
- Forestry & Land Scotland grows new digital infrastructure and cuts out legacy dead wood - Mark Chillingworth with a nifty use case: "The (Nutanix) NC2 platform has bought Forestry & Land Scotland time, as it doesn't need to re-architect applications in a hurry."
Vendor analysis, diginomica style. Here are my three top choices from our vendor coverage:
- Cloud revenue growth misses for SAP in Q2, but the future is bright with AI, according to CEO Klein - Stuart finds SAP at an earnings crossroads: "The noise about the AI-driven future opportunities wasn’t enough to keep investors happy in the short term as the share price took a pummelling. Klein talked a lot about the potential margins on embedding AI throughout the SAP portfolio. If that starts to deliver later in the year, investor reaction may be more favorable."
- IBM CEO Arvind Krishna - how Red Hat experience will help build a multi-billion dollar generative AI consulting business - Another firm betting big on AI. Meanwhile, the Red Hat acquisition provides good short-term news, and points towards consulting futures. Stuart quotes IBM's CEO: "Our path is clear - in the same way we have built a consulting practice around Red Hat's hybrid cloud platform that is now measured in the billions of dollars, we will do the same with AI."
- "Nobody's well prepared, with very few exceptions" - Salesforce's Tim Christophersen on new sustainability regulations coming down the track - Madeline with a notable take on sustainability, perhaps the overlooked enterprise storyline amidst the AI fervor. She quotes Christophersen: "Salesforce has some of the best people in the world on our sustainability team, on our compliance team, and it even makes our head spin, the alphabet soup, the constantly emerging standards, the difference between regional standards. We have the EU Corporate Sustainability Reporting Directive, we have SEC, Australia, [and] the UK is no longer part of EU. It's mind blowing."
A few more vendor picks, without the quotables:
- Celonis becomes a ‘profit center’ for AmerCareRoyal as it mines process data - but education is key - Derek
- Qualtrics to invest $500 million in AI as it launches new XM platform - Derek
- Leading the crowd to generate wisdom - Legal and General Investment Management's data strategy in action (Mark Samuels - Cloudera use case)
Jon's grab bag - Gary looks at the prospects for sustainable urban living in Does the key to liveable cities in a hotter world lie in geospatial intelligence, powered by Machine Learning? Brian delves into examples of next-gen ESG tooling in The tech to sustain – JLL and Carbon Pathfinder.
Neil examines quantum computing advancements in Is photonic quantum computing the answer to commercial quantum use? Maybe. Chris does the same for UK FinTech in Digital payments - don’t blame us when things go wrong, says industry. Finally, Brian's headline might be humorous, Meet today’s brazen and more frequently fraudulent jobseekers! Are you ready to deal with them? But the shifting terrain of employment fraud isn't a joke either...
Best of the enterprise web
My top eight
- Red Hat saved IBM’s bacon this quarter - Ron Miller with another angle on IBM's earnings: "On the bright side, software revenues were up 7.2% in IBM’s most recent quarter to $6.6 billion with Red Hat leading the way up 11%, making the $34 billion purchase in 2018 look better with each passing quarter." Also see: HfS Research's deeper IBM-Apptio analysis: IBM’s acquisition of Apptio can shine if IBM Software and IBM Consulting work together to deliver cost-managed innovation at speed.
- A Possible Formula For Measuring Buying Decision Confidence - Gartner's Hank Barnes riffs on a new approach to buyer confidence. As I said to Barnes on Twitter: "What is interesting here: buyer confidence is not static, not tied to a static brand reputation or something a vendor can internally control (only influence). Big implications on community, peer reviews, quality prospect interactions etc."
- Google and Bing AI Bots Hallucinate AMD 9950X3D, Nvidia RTX 5090 Ti, Other Future Tech - Who knew Tom's Hardware would be the go-to source for why generative AI consumer search engines are either a flawed or wacky idea?
- Navigating The Snails Trail: Moving to Outside-in Planning Processes – Lora Cecere with another rethink-your-assumptions supply chain post. As reader Clive Boulton said on Twitter: "After 50+ years of the four-walls-of-MRP planning inside-out; outside-in planning is overdue."
- How to Measure Digital Transformation Results and Value Creation - Eric Kimberling of Third Stage Consulting proposes some fresh - and welcome - project metrics. How about minimizing disruption? "Many organizations fail to consider the potential costs and risks associated with being unable to perform essential operations when implementing new systems."
- Ifeoma Ajunwa on the Quantified Worker - podcast from the Sunday Show - I wish I could say we weren't heading down the "quantified worker" path, but to avoid it, changes are needed. Example: "AI" is far too culture-bound right now to "evaluate" video interviews of diverse applicants with different accents etc.
- Hollywood on Strike - Ben Thompson brought his A-game to this analysis of the tech upheavals underpinning the Hollywood strike.
Brian referred me to this savvy deconstructor of corporate job descriptions:
"fast-paced also means understaffed with a high turnover rate - also toxic. Avoid!" -> LOLZ
— Jon Reed (@jonerp) July 19, 2023
More edgy enterprise satire, this time on the return-to-office face-off:
that final line drives it home: "next on the agenda... we will be doing another round of layoffs..."
but yeah, you can choose not to come into the office we're "hybrid" lol https://t.co/jjwGx91AoK
— Jon Reed (@jonerp) July 18, 2023
Turns out AI is already pretty good at the companionship thing... As for "healthy," that's not for me to decide. However, a bit of snark seems fair game:
Uncharted territory: do AI girlfriend apps promote unhealthy expectations for human relationships? https://t.co/tvd5pIZgtD
"Control it all the way you want to"
-> no I'd say that's perfectly healthy LMFAO
— Jon Reed (@jonerp) July 22, 2023
Oh, and a couple weeks ago, I won email inbox buzzword bingo:
The NewgenONE low code platform helps enterprises simplify complex, content-driven processes, and caters to their evolving needs while bridging silos, building an integrated system, and delivering personalized experiences. It’s backed with a cloud-native, multipersona AI/ML data science platform, enhanced document classification and extraction capabilities, integrated process and Robotic Process Automation capabilities, and strengthened DevOps for easy application deployment/update.
Turn in your cards everyone, I won this round... See you next time.
If you find an #ensw piece that qualifies for hits and misses - in a good or bad way - let me know in the comments as Clive (almost) always does. Most Enterprise hits and misses articles are selected from my curated @jonerp newsfeed.