Enterprise hits and misses - generative AI captures headlines with model drift and UI debates, but are enterprises sleeping on ESG?

Jon Reed, July 24, 2023
As the generative AI news cycle holds center stage, are enterprises falling behind on their ESG response? The future of work is quantifiable, but is that a good thing? Work satire hits home while employment fraud shifts. In your whiffs, I claim this round of buzzword bingo.

Lead story - from model drift to UI replacement controversies, generative AI debates roil on

As companies grapple with the right/safe ways to proceed with generative AI, the diginomica team pushed for clarity. I intentionally stirred the pot with Can generative AI displace the enterprise UI? An unprompted debate with provocative implications:

Wouldn't that turn enterprise software vendors on their heads as well? Do you want your AI assistant to tell you, 'I can't complete that task, because you lack the necessary product licenses for this cloud or that cloud'? If AI is that disruptive to UIs, then won't enterprise software product categories (and their accompanying licenses) be part of the casualties?

I question whether vendors can pick and choose where to monetize generative AI - without confronting blowback to their own business models and approaches. For now, there are other pressing issues to consider, such as consumer Large Language Model drift. George parsed the results of a recent Stanford study in ChatGPT - gauging the emerging risks of consumer-facing model drift:

They found that GPT-4 accuracy on math problems dropped from 97.6% to 2.4%, while GPT-3.5 went up from 7.4% to 86.8%. The percentage of directly executable code for GPT-4 dropped from 52.0% to 10.0% and for GPT-3.5 from 22.0% to 2%, while they both also wrote slightly longer code. Visual reasoning scores went up slightly, with GPT-4 going from 24.6% to 27.4% and GPT-3.5 going from 10.3% to 12.2%.

As George points out, model drift is not a new tech problem. But it does cause particular issues for LLMs, which simply cannot be debugged like traditional software. He cautions:

Hopefully, few enterprises are relying on ChatGPT to write code or solve math problems. There are plenty of other tools purpose-built to solve that problem.

Yes, well, math is not a strength of any generative AI system - that's where combining with other forms of AI, better suited for computation, comes in (something I noted in my article). Coding is another matter. I believe generative AI will prove quite useful for enterprise coding, though it will not be the only tool in the low-code/no-code toolkit.

These types of LLM issues are not insurmountable problems, especially when we get to LLMs purpose-built for enterprises. But for now, during this 'maturity waiting period,' enterprises might be tempted to jump-start with third-party generative AI solutions. Those available now are typically based on some form of OpenAI's technology, or on open source models that could be prone to drift issues as well. Test carefully out there - and exercise vigilance against Shadow AI freelancing, e.g. employees pulling generated code from such sources.
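For those wondering what "test carefully" might look like in practice, here is a minimal sketch, not a method from George's article or the Stanford study. It takes the before-and-after percentages quoted above and flags any score that moved by more than a chosen tolerance. The `drift_report` function and the five-point `DRIFT_TOLERANCE` are my own hypothetical choices for illustration; a real check would score your own prompts against each model version.

```python
# Accuracy figures (percent) as quoted from the Stanford study: (earlier, later).
SCORES = {
    ("GPT-4", "math"): (97.6, 2.4),
    ("GPT-3.5", "math"): (7.4, 86.8),
    ("GPT-4", "executable code"): (52.0, 10.0),
    ("GPT-3.5", "executable code"): (22.0, 2.0),
    ("GPT-4", "visual reasoning"): (24.6, 27.4),
    ("GPT-3.5", "visual reasoning"): (10.3, 12.2),
}

# Hypothetical tolerance in percentage points; an assumption, not a standard.
DRIFT_TOLERANCE = 5.0


def drift_report(scores, tolerance=DRIFT_TOLERANCE):
    """Return (model, task, change) for each score that moved more than tolerance."""
    flagged = []
    for (model, task), (before, after) in scores.items():
        change = round(after - before, 1)  # change in percentage points
        # Movement in either direction counts: a big jump up is drift too.
        if abs(change) > tolerance:
            flagged.append((model, task, change))
    return flagged


if __name__ == "__main__":
    for model, task, change in drift_report(SCORES):
        print(f"{model} {task}: {change:+.1f} points")
```

With these figures, the math and executable code scores get flagged for both models, while the small visual reasoning gains stay under the tolerance. Note that the GPT-3.5 math improvement is flagged as well: output you did not plan for is still a change your downstream processes have to absorb.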

Move-fast-and-break-things might work for some startups; it's a terrible method for enterprise AI. Those aiming to get it done responsibly should check George's The five 'E's needed to put together a pragmatic AI strategy. For now, I conclude:

Enterprise AI is all about proper design for proper use cases, rather than indulging in AI overreach, e.g. "Can I fire my content marketing team?" I expect reinforcement learning will close those gaps further, but the very reason we are talking about the need for iterative model improvement via user input shows us why "AI is our UI" is - at best - a longer term consideration.

Diginomica picks - my top stories on diginomica this week

Vendor analysis, diginomica style. Here are my three top choices from our vendor coverage:

A few more vendor picks, without the quotables:

Jon's grab bag - Gary looks at the prospects for sustainable urban living in Does the key to liveable cities in a hotter world lie in geospatial intelligence, powered by Machine Learning? Brian delves into examples of next-gen ESG tooling in The tech to sustain – JLL and Carbon Pathfinder.

Neil examines quantum computing advancements in Is phototonic quantum computing the answer to commercial quantum use? Maybe. Chris does the same for UK FinTech in Digital payments - don’t blame us when things go wrong, says industry. Finally, Brian's headline might be humorous, Meet today’s brazen and more frequently fraudulent jobseekers! Are you ready to deal with them? But the shifting terrain of employment fraud isn't a joke either...

Best of the enterprise web

My top eight


Brian referred me to this savvy deconstructor of corporate job descriptions:

More edgy enterprise satire, this time on the return-to-office face-off:

Turns out AI is already pretty good at the companionship thing... As for "healthy," that's not for me to decide. However, a bit of snark seems fair game:

Oh, and a couple weeks ago, I won email inbox buzzword bingo:

The NewgenONE low code platform helps enterprises simplify complex, content-driven processes, and caters to their evolving needs while bridging silos, building an integrated system, and delivering personalized experiences. It’s backed with a cloud-native, multipersona AI/ML data science platform, enhanced document classification and extraction capabilities, integrated process and Robotic Process Automation capabilities, and strengthened DevOps for easy application deployment/update.

Turn in your cards everyone, I won this round... See you next time.

If you find an #ensw piece that qualifies for hits and misses - in a good or bad way - let me know in the comments as Clive (almost) always does. Most Enterprise hits and misses articles are selected from my curated @jonerpnewsfeed.
