Generative AI built on a proprietary LLM is the way to go — if you know where to look

Raju Vegesna, November 30, 2023
Decisions, decisions - Raju Vegesna of Zoho makes the case for businesses making an informed decision about AI - and why the shiniest LLM in the box isn't necessarily the right choice.

(AI ethics concept image © Parradee Kietsirikul)

The generative AI craze began when ChatGPT rose seemingly overnight, dominating the internet and igniting a media frenzy. The technology was built on OpenAI's proprietary large language model (LLM), and other companies were quickly left with a choice: use ChatGPT, or start building their own version, including an LLM at its base. Not that other options didn't exist; they just weren't as developed for out-of-the-box use as they are today.

The introduction of open-source LLMs complicated this choice. Now, companies could skip straight to the generative AI portion of the build if they desired, as the most resource-intensive part of the process could be completed in minutes.

In a flurry, vendors are releasing generative AI products built on proprietary LLMs. Few have used open-source LLMs despite the dozens available, though some higher-profile companies are getting into that game. Most notably, Meta's openly released LLaMA models have inspired reproductions such as OpenLLaMA, and OpenAI is reportedly working on its own open-source model, codenamed G3PO.

With so many choices on the table, AI is no longer a shiny new toy. Moving forward, it's a must-have for any mid-market software vendor hoping to pull a meaningful number of customers away from bigger players. Now is the time for these companies to decide how they want to proceed: build or buy generative AI, the basis of which can be open source or proprietary. The decision is complicated by the fact that each option carries a salient price tag, whether in time, trust, competitive advantage, or finances, and by the fact that niche, affordable, industry-specific LLMs are on the way but have not yet fully matured.

Unless they employ a dedicated data science team, companies likely lack the information they need to make an informed choice about generative AI. The last thing they'd want is to buy whatever the shiniest new toy might be before it's properly vetted or broken in. It's also worth considering how a vendor implements its AI: it may be with full transparency and an eye towards growth, or it may be a quick, patchwork solution with no ability to meet the ever-changing needs of a company. Here's more on how companies can proceed, and why going the proprietary route may be the choice with the fewest downsides:

Open-source LLMs

GitHub is overflowing with open-source LLMs at the moment, with more being added seemingly every day. These products are reliable, affordable, accessible, and transparent. They will continue to find support within the open source community as well, whose members are usually quick to update the code when necessary.

Because open-source LLMs are publicly available, their inner workings can be monitored and regulated by government agencies. International governance is essential and, frankly, long overdue, but it has no jurisdiction over the internal affairs of private companies. As consumer education improves, those concerned about the misuse of AI technology will likely opt for a vendor offering generative AI built on open-source LLMs.

Or, that would certainly be the case if regulations weren't so scattershot. There are far too many inconsistencies when, outside of the European Union and a handful of US states, governance is conspicuously absent. Plus, there's a good chance developments in AI will soon outpace any legislative body's ability to respond; by the time troublesome aspects of open-source LLMs have been reined in, new, more pressing issues may have arisen. Consider the latest executive order issued by the Biden administration, which largely serves to establish investigative committees for research into AI's implications, findings that will then be communicated to companies as 'guidance' rather than mandates.
Even if more comprehensive and actionable governance were to arise, there's no guarantee consumers would be aware of its existence. According to Generative AI Adoption in the Workplace, an August 2023 study that polled 4,540 workers across the world, 35% of US respondents don't know of any regulations or guidelines governing the use of AI.

Still, while there are certainly downsides to running software on publicly available platforms, such as the opportunity for bad-faith actors to interfere, open-source LLMs have means of protecting themselves that step in where governance can't reach. They are trained on a variety of public and private data that improves the software's ability to detect issues, even new ones, before they arise. Updates are free, after all, and pushed regularly. It's also more likely that intrepid programmers will contribute functionality beyond what any individual company, working in its own silo, might build, improving access for all users.

Proprietary LLMs

For many larger vendors, ceding any level of control to the open source community is a bridge too far. Those concerns can be eliminated by building an LLM from scratch and obscuring its processes and finer points. Software juggernauts with secrets — what could go wrong?

The real problem is that consumers themselves are the ones on the hook. Legislation can keep open-source LLMs in check because their models grow from publicly available data, but it won't have the reach to regulate models whose growth depends on privately collected data, especially as the technology balloons across the industry. Whenever a customer uses a proprietary LLM, their search history, in-app behavior, and identifying details can be logged in service of further educating the AI. That quid pro quo isn't obvious to users, which means best practices have to be created and enforced by the vendors themselves: a questionable proposition, at best.

Companies can mitigate distrust by providing consumer education opportunities rather than hoping customers get up to speed on their own, a process that will likely have them favoring AI built on open-source LLMs simply because more information is available. When a tech company messes up, customers are the ones who suffer most, so it behooves them to be fully on board before committing to a particular vendor and its LLM. This reality feels backwards, as customer behavior should be guiding governance rather than the other way around, but all companies can do at this point is equip customers to move forward with confidence.

Once customers have bought into an ethical vendor's LLM, they can rest easy behind extra layers of protection. Within a unified system running on an internal LLM trained on first-party proprietary data, security updates can roll out across the ecosystem instantly, and suspicious activity can be logged and escalated to the proper team member for validation. For a vendor, the price and resource requirements of building an LLM are far smaller than what it stands to lose, in both finances and customer trust, from an issue with an open-source LLM it has chosen.

Which to choose

Mid-market enterprises interested in generative AI find themselves pulled in a few directions: build or buy their generative AI, either of which can rest on an open-source LLM or a proprietary one, or simply work with vendors who have incorporated the technology into their stack natively. Ultimately, the ideal choice boils down to a company's short-term versus long-term goals. Paying for generative AI out of the box enables companies to join the fray quickly, while developing AI on their own, regardless of which LLM underpins it, requires more time but stands to pay larger, longer-lasting dividends.

Ultimately, there are simply too many unknowns, and far too wide a gap in regulations, to risk constructing generative AI from scratch; by the time it becomes useful, it will largely be outdated. Companies should instead pay attention to everything a vendor does outside of AI. Do they act responsibly and with the utmost transparency? Do they have a history of siding with consumers on hot-button topics like privacy? How quickly does the company release updates to its software, and how stable are those releases, generally?

It's also worth looking at how the AI will be offered. If the technology is integrated into a vendor's tech stack from the beginning, its inner workings will be more effectively protected behind extra layers of security, reducing customer risk. Sometimes the technology is entirely the vendor's own; other times, as with Zoho's partnership with OpenAI, the vendor is more focused on honing existing technology for its particular ecosystem. Either way, advances in the tech can be pushed across the system instantly, ensuring that whatever generative AI produces is the most tailored result possible at any given moment and eliminating the risk of wasted time implementing something outdated. Past customer success stories and use cases are an effective way of scoping out a potential vendor's customer-centric approach to AI.

The eyes of the tech world are fixed on generative AI, but the only responsible way forward is to take stock of the past.
