Lead story - As 2023 winds down, enterprise LLMs heat up
The last two weeks required me to bear down on the Sam Altman/OpenAI Game of Thrones. The downside? Losing focus on the more interesting story, heading into 2024: can enterprise LLMs address some of the limitations of the big AI vendors?
By that I mean: both the so-called "responsibility" of the AI approach, and the accuracy/business relevance of the results? As I wrote in my first gen AI live use case:
What I'm really after is precision: enterprise customers should know what AI's limits and potentials are, without the braggadocio and exaggerations. These tools are powerful enough.
Neil went after that type of precision in his latest piece, Large Language Models versus human intelligence - where do we stand now? As he points out, the scientific LLM debate is far from over:
Some studies and anecdotes about LLMs' capabilities for generalization and abstraction demonstrate an uncanny ability to solve problems or deal with situations quite different from those exposed in their training data. Other studies highlight behavior for “hallucinating” answers to queries and their vulnerability to adversarial attacks belie a poor grasp of the natural world, especially grasping the subtlety and ambiguity of the user’s prompts.
Speaking of hallucinations, my inbox is loaded with PR happy talk about "no hallucination" claims. But one conversation ended up differently: my first generative AI use case (Can enterprise LLMs achieve results without hallucinating? How LOOP Insurance is changing customer service with a gen AI bot).
The AI vendor in question? Quiq, a customer service AI. Their customer? LOOP - a very different kind of insurance company. Upon request, I ordinarily allow a vendor to sit in when I interview one of their customers. But in this case, not getting to talk to the customer on my own would have been a deal breaker. After all, Quiq's PR verbiage asserted "without hallucinations." I needed to ask the customer: is this true?
I did indeed get the real deal from the customer - and kicked tires on the bot myself (in this story, I also explain my thinking about what I learned). Bottom line: live projects matter.
For AI-changes-everything enthusiasts, this type of gen AI use case probably just isn't sexy enough. But live projects with results are not to be taken lightly (in the ten years since I first wrote about blockchain, not one vendor has stepped up with a live production customer for me to write about).
This is shaping up as an evolution of customer service (notice I did not say revolution), but success will not be pre-ordained because of generative AI. Each project will have to earn LOOP's type of success, via careful attention to bot design and data input. It's a disciplined new option for a business result, not magical technology powder to sprinkle on flawed data.
That leaves room for plenty of debate - and fresh use cases - in 2024.
Diginomica picks - my top stories on diginomica this week
- Cyber Week is over for another year - Adobe and Salesforce provide post-match assessment of how it went for retailers - Holiday shopping binge phase one is done - what does Stuart make of the numbers?
Both firms agree that, given the current macro-economic climate, it’s been a strong performance online. As per Vivek Pandya, Lead Analyst, Adobe Digital Insights:
'The 2023 holiday shopping season began with a lot of uncertainty, as consumers shifted their spending to services, while dealing with rising costs across different facets of their lives. The record online spending across Cyber Week however, shows the impact that discounts can have on consumer demand, especially with quality products that drove a lot of impulse shopping.'
In sum: a semi-upbeat vibe, but with a discount-oriented recessionary undercurrent. Let's see how this plays out: as Stuart disclosed, he was amongst those who hadn't started shopping yet.
- Why Prolific’s human-centric platform may have real implications for AI - Chris examines a vendor with a different AI angle: "If 2023 has taught us anything, it is the bewildering speed with which organizations have adopted generative AIs and LLMs – albeit often via individuals’ use of shadow IT. More, many are trusting AI to provide authoritative information. Has Prolific suffered from this generational shift towards machine intelligence – or perceived machine intelligence?"
Vendor analysis, diginomica style - making sense of a flurry of upbeat earnings reports. Start with Salesforce and Workday:
- "Maybe I'd like to forget exactly how crazy that year got" - CEO Marc Benioff looks back on 2023 as Salesforce delivers strong Q3 - Stuart: "The firm exits 2023 having ridden out top level exec changes, fended off activist investors, implemented painful job cuts and delivered on an operating model more focused on profits and margins. Yesterday’s strong Q3 numbers are another indicator that the ongoing transformation is paying off."
- Workday rises as Q3 revenues grow 16.7% on back of strong customer momentum - Stuart cites Workday's "clear sense of momentum." But why? He quotes co-CEO Carl Eschenbach, "with some key themes coming across from customers":
First, talent continues to be a top C-Suite priority... Second, leaders are continuing to consolidate their technology footprint on a true platform to realize total cost of ownership benefits, while also accelerating their operations... Finally, AI, and in particular, generative AI, is becoming a business imperative. As a trusted partner and a market leader with over 65 million users under contract we can uniquely drive efficiencies and improve the employee experience.
More upbeat earnings reports - other enterprise vendors also logged positive earnings news:
- PagerDuty shares rise on solid Q3 2024 earnings - Derek
- Subscription fatigue? Not for Zuora as revenue rises and losses are slashed - Stuart
- Samsara hits $1 billion ARR milestone and raises full-year guidance - Derek
- UiPath revenue soars as automation interest fuels customer conversations - Stuart
What to make of the upbeat news across these vendors? There isn't one underlying reason - nor should we minimize the "macro-economic headwinds" many of these vendors still cited. But if there is good news here, it's that customers are determined not just to be operationally efficient and headcount-obsessed, but also to spend in areas of demonstrated impact: modernization/automation, industry/vertically-tailored software, and new business models.
Can we credit gen AI with these strong numbers? I'd say only a little bit. Customers expect vendors to invest in gen AI, but the customer ROI of these projects is far from demonstrated. I suspect some of the more impactful use cases haven't even been identified.
UKISUG Connect SAP user conference coverage - Derek filed a couple of notable stories from Birmingham: UKISUG urges clarity from SAP over access to S/4HANA innovation for on-premise and hosted customers, and an S/4 use case: European manufacturer McBride set to clean up its data and processes with SAP S/4HANA. Also see my pre-conference interview: "That's a mindset change" - talking about the clean core, AI, and the state of SAP with UKISUG Chair Paul Cooper.
A few more vendor picks, without the quotables:
- Workday Rising EMEA - Mondelez International tests out generative AI for employee self-service - Phil
- Pure Storage making the shift from CapEx to subscription revenue as demand for Evergreen//One grows - Derek
- How Blue Yonder built an always-on LinkedIn ABM program using 6sense - Barb
Jon's grab bag - Mark Samuels documented a worthwhile AI use case in How Munch Museum is using AI to give its audiences new access to the history of art. Neil pressed into a mission-critical AI issue in AI ethics - is there a necessity for the US to field Lethal Autonomous Weapon Systems (LAWS)? Em examined the flaws in carbon offsets in The end of the offset? Navigating carbon offset projects in a market under scrutiny.
Madeline tackled a concerning skills problem in Why 60% of UK workers do not want to learn new digital skills. Martin posed an unexpected question in Is it time to get rid of CIOs? The rise of AI prompts organizational and hierarchical questions. Finally, George posted a lovely and informative tribute for someone who is gone too soon: Final thoughts - in memory of Simon Mark Hughes, hallucination research pioneer:
I hope that Simon’s contribution goes on to make the world a better place. I felt fortunate to have one of the last conversations about his important work to make AI a little more honest.
Best of the enterprise web
My top five
- Here's what AWS revealed about its generative AI strategy at re:Invent 2023 - I couldn't find any standout analysis of AWS re:Invent (could you?). This InfoWorld piece was the most interesting one I could find for you.
- Can AI Code Assistants Really Teach Junior Developers to Code? Ask a Redditor - RedMonk's Kate Holterhoff advances the conversation with a substantive post on the pros/cons of gen AI for junior developers.
- 'Return to Office' declared dead - Return-to-Office may not be dead, but it's fair to say the "watercooler mandate" hasn't gone as expected. This Register piece cites recent data that documents the conflicting views of employee and employer: "Unless the goal of return to the office mandates is actually to drive workers to quit in order to avoid layoffs and severance pay – as has been alleged in some cases – it's hard to see why corporate managers would reject remote work when that brings greater access to talent, reduced turnover, lower property cost obligations, and greater productivity." Flexible work has been framed as a perk for employees but a risk for employers trying to rationalize expensive/legacy real estate footprints. What I find interesting about this article is how the issues are shifted. Example: remote work also has economic advantages to employers, e.g. creating more downward wage pressure, and sourcing talent in high-cost-of-living areas. We've been debating this for years, but this issue hasn't really shaken out yet.
- Good old-fashioned AI remains viable in spite of the rise of LLMs - As Ron Miller points out, a narrow media obsession with gen AI doesn't line up with companies' AI investments. Miller also punches some holes in the ludicrous notion I've seen lately that data scientists won't be needed once LLMs hit prime time.
- Simple Hacking Technique Can Extract ChatGPT Training Data - Dark Reading reports on another vulnerability in how ChatGPT functions. However, Dark Reading couldn't duplicate the issue; neither could 404 Media, which writes: "404 Media attempted to replicate the attack on ChatGPT but was unsuccessful: “Repeating ‘poem’ request denied,” a summary of the request says. The Deepmind researchers wrote that it informed OpenAI of the vulnerability on August 30 and that the company patched it out." The lesson here is less about one particular vulnerability than about the overall question of how training data in LLMs can be protected from clever prompt attacks.
Just when I thought Microsoft was going to leave Windows 10 alone (finally):
3 reasons why Copilot coming to Windows 10 is a terrible idea https://t.co/ubozNYNvQ7
-> I can think of ten more reasons....
— Jon Reed (@jonerp) December 4, 2023
More AI goofiness:
Why write books in a world full of AI answers? https://t.co/r36MmL03ES
“Would you write a business book today even though in a few years all information will be created and discoverable using AI?”
-> no surprise this goofy sensational question surfaced on LinkedIn first
— Jon Reed (@jonerp) November 26, 2023
I'll leave it there for now, but we'll pick up where I left off tomorrow - when the annual enterprise un-predictions, co-authored by Brian Sommer and yours truly, make their satirical return.
If you find an #ensw piece that qualifies for hits and misses - in a good or bad way - let me know in the comments as Clive (almost) always does. Most Enterprise hits and misses articles are selected from my curated @jonerpnewsfeed.