How do we protect data privacy while innovating on data? Learning from Zoho's approach to AI development

By Jon Reed, October 5, 2022
All too often, data privacy conflicts with AI advancements. Is there a better way? And how do we avoid headline-making chatbot bias meltdowns? Let's check in with Zoho, as they press further into AI with an extreme stance on customer data privacy.


Attention software vendors: enterprises count on you for AI leadership, skills, and direction. But are software vendors delivering? By "AI leadership," I mean:

  • What is your view on AI futures? Are you looking to take humans out of the loop? To fully automate decisions? How will customers supervise such systems?
  • Are you infusing AI into your products? How do you deal with the problem of bias? How are you training your data sets? Do customers have to pay extra licensing fees for access to AI functionality?
  • Are you providing "intelligence" back to customers in the form of next best actions and aggregated benchmarking data? How are you addressing data privacy?

If you want bold/outspoken answers to these questions, talk to Zoho. In "Privacy is not a feature" - how Zoho's approach to workplace privacy impacts AI development, I wrote about how Zoho's uncompromising approach to data privacy compelled them to build their own AI tooling. But preserving customer data privacy is one heck of a rabbit hole.

Zoho has taken this much further than building AI tools. Example: how do you train AI systems on smaller data sets, without compromising customer data?

How Zoho turned into an AI development shop

Time to dig in, via my first-ever video chat with Ramprakash Ramamoorthy, Director - AI Research at Zoho Corporation. Ramamoorthy has been with Zoho for 11 years. His first internal project at Zoho? An "AI experiment," where they performed statistical analysis on Zoho's help desk product, now called Zoho Desk. In 2015, Zoho built out a separate unit called Zoho Labs:

Don't get carried away by the name "labs," we are not a group of academic researchers. We make sure to pick out common problems from across Zoho - and then we try and solve them.

AI development began in earnest. Data privacy and system performance were non-negotiable. As Ramamoorthy told me:

The first problem we took up was databases. A database is the foundation for an enterprise software company, so we contribute a lot to open source. We have replaced all commercial database vendors in Zoho with our own version of a [Postgres] database.

In their research, Zoho Labs pursued the three main divisions of applied AI: statistical/machine learning, computer vision, and NLP (natural language processing). Over time, those "experiments" became part of Zoho's apps. Ramamoorthy:

In statistical machine learning, we started off with use cases like anomaly detection, forecasting, finding out frequent patterns, finding out rare patterns - all things related to users and entities in the system.

Next up? Computer vision:

Then we ventured into computer vision, where we have many use cases going on, including receipt digitization. That is where we started using OCR. We have a tool called Zoho Expense, where you can expense receipts to your company.

We started automating receipts. Today, for any document under the sun, we are able to digitize it. Now the goal is to ensure we can extract relevant information from any document.

No surprise: NLP is a major Zoho investment. Ramamoorthy:

The natural language processing team is one of the bigger AI teams in Zoho. We have use cases like translation, where we can support 90-plus language paths... When all of that comes to your enterprise editor, that is where the value add is. You cannot really put down a number to it, but we see the usage graphs going up for this particular use case.

How AI and customer privacy intersect

What if Zoho runs into a customer privacy issue with web-based AI tools? They build their own. Yes, that includes grammar and spell check:

In a similar fashion, we also have our grammar error correction/detection engine. We first debuted it in Zoho Writer, our word processor. Now it is underneath all of the editors in the Zoho product stack. So wherever you type, we're able to understand your pattern of writing using AI. Then we're able to offer quantitative and qualitative suggestions, identify errors, and fix them.

Zoho takes a stance that too many enterprises overlook: even the most basic consumer tools (like spell check) can expose sensitive customer information.

Today, a lot of people use consumer software for translating sensitive enterprise information - tools that are technically built for the consumer. 

Adding AI to Zoho's apps is a start - but Ramamoorthy now sees a bigger payoff: integrating "intelligent" features across apps.

I think it all comes together in the way we integrate it. For example, today in Zoho Desk, when you have a customer support ticket coming in, we can identify what that ticket talks about. Depending on the past data, we're able to assign it to the right agent. And then we also present beautiful charts, and that helps identify anomalies. That helps to do capacity planning and whatnot, and the reports for managers.

Productizing AI 

And how does this help service teams?

Let's say you're a support manager, and you're going to read ten customer tickets a day, then it should be from the angriest of your customers. Now, we will identify the sentiment and see how the sentiment changes through the lifecycle of a ticket. Did a customer come in happy and then leave angry? Or did they come in angry, and then leave happy?

Essentially, all of these data points are in your system. They help the customer-facing faction to deliver better. Using techniques like forecasting and anomaly detection, we can better predict deals, deal-winning chances, the best time to contact leads, and so on in our CRM platform. The way all of these integrate together is one big value that Zoho adds to the table because of our wide and broad product suite. Again, with bundles like Zoho One, our AI engine is able to access data that is not in silos. 

It's able to more effectively give insight, especially in BI, with products like Zoho Analytics, where you have features like Ask Zia, where you can just ask the system in natural language - and it can give you results on how things are.
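To make the ticket sentiment idea concrete, here is a minimal Python sketch of tracking how sentiment changes across a ticket's lifecycle and surfacing the angriest tickets first. It is an illustration only, not Zoho's implementation: the `Ticket` class, the per-message scores in the range -1.0 to 1.0, and the zero threshold are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Ticket:
    ticket_id: str
    # Hypothetical per-message sentiment scores in [-1.0, 1.0], oldest first.
    # In a real system these would come from a sentiment model.
    sentiment_scores: list[float]


def trajectory(ticket: Ticket, threshold: float = 0.0) -> str:
    """Label how sentiment moved from the first message to the last."""
    start = "happy" if ticket.sentiment_scores[0] >= threshold else "angry"
    end = "happy" if ticket.sentiment_scores[-1] >= threshold else "angry"
    return f"{start}->{end}"


def angriest_first(tickets: list[Ticket], limit: int = 10) -> list[Ticket]:
    """Return up to `limit` tickets, most negative latest score first."""
    return sorted(tickets, key=lambda t: t.sentiment_scores[-1])[:limit]


# Made-up sample data for illustration.
tickets = [
    Ticket("T-1", [0.6, 0.1, -0.7]),   # came in happy, left angry
    Ticket("T-2", [-0.8, -0.2, 0.5]),  # came in angry, left happy
    Ticket("T-3", [-0.4, -0.9]),       # angry throughout
]

for ticket in angriest_first(tickets):
    print(ticket.ticket_id, trajectory(ticket))
```

A support manager with time for only ten tickets a day would read from the top of that list. The design choice worth noting is that the ranking uses the latest score, while the trajectory label compares the first and last messages.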

Avoiding chatbot misadventures - the problem of bias

When I talked with Ramamoorthy, Meta was garnering another round of unflattering headlines for its problematic chatbot Blender. So what is Ramamoorthy's take on avoiding bias in AI? One way Zoho does this: keep the use cases focused.

From a service perspective, we use very focused bots, bots that can help get data out of your analytics system, bots that can help you understand your helpdesk and the overall stats, and so on. So it's very, very niche, very focused. It's not like a general-purpose conversational bot, like the Facebooks and the Googles of the world.

Another key to avoiding bias? Treat data like code:

One way we make sure our bots stay without any racial bias, and stay within line, is to make sure we treat data like code. Things like bias, things like fairness in AI, we make sure our data is well vetted. A lot of times people ask, 'How could there be a bias in a helpdesk?' But wherever we are automating our decisions, we make sure the data we feed to it is treated just like code. We have version control systems; we have code reviews, the same way we have data reviews, and so on. There is not much unintended behavior in our bot.

Zoho advocates keeping humans in the loop:

We strongly believe in the 'human in the loop,' where there is always human supervision of all these bot interactions. We have added explainability to our bots. Whenever the bot is taking a low-confidence decision, automatically, a human is involved in the loop, and the human can always take over from there. So since these are very, very specific-purpose bots, at Zoho, we haven't faced such stories. But at the same time, we take the utmost caution to ensure the data fed into training the bots is always a proper, bias-free representation.
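The low-confidence handoff Ramamoorthy describes can be sketched in a few lines of Python. This is a hedged illustration, not Zoho's code: the `BotDecision` class, the `route` function, and the 0.8 cutoff are hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class BotDecision:
    action: str
    confidence: float  # model confidence in [0.0, 1.0]
    explanation: str   # why the bot chose this action, shown to the reviewer


# Assumed cutoff for this sketch; a real value would be tuned per use case.
CONFIDENCE_THRESHOLD = 0.8


def route(decision: BotDecision, threshold: float = CONFIDENCE_THRESHOLD) -> str:
    """Send low-confidence decisions to a human; let the bot handle the rest."""
    if decision.confidence < threshold:
        return "human"
    return "bot"


# Made-up decisions for illustration.
uncertain = BotDecision("issue_refund", 0.55, "ticket mentions a double charge")
confident = BotDecision("send_reset_link", 0.93, "ticket matches password reset")

print(route(uncertain))  # handed to a human reviewer
print(route(confident))  # handled by the bot
```

Keeping the explanation alongside the decision is what lets the human reviewer take over with context rather than starting from scratch.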

My take - the practical AI holy grail is next-best-actions

I'm all for taking decisive steps to ensure data privacy. Zoho takes this to an appealing extreme, one that is a very good fit for their development culture. That stance obviously imposes challenges on Zoho as well. Most vendors wouldn't want to build their own alternative to Google Analytics, or build their own web browser. Zoho has done both, and a lot more beyond that.

The risk for Zoho, of course, is getting their development spread too thin with that ambitious scope. Customers will obviously be evaluating Zoho's in-house apps carefully with that in mind. But there is a corresponding strength: diverse apps that play well together - on the same data model. When it comes to applying AI across apps, and achieving the integrated benefits Ramamoorthy talks about, a consistent data model is a big differentiator. This AI approach adds a different layer of value to consider, for offerings like Zoho One.

In the past months, I've run into a couple of vendors that insist they can build effective/accurate AI tools with smaller (training) data sets. That's a strong assertion that needs proof points. Machine learning thrives on vast data sets that account for the kinds of anomalies we encounter amidst the complex variables of real life.

However, Zoho Labs has refined a sensible variation. As Ramamoorthy explained, they typically start a new AI project by training on large, publicly available data sets. Then they fine-tune with Zoho's own data. As customers use the tool, they refine further. That approach reduces the "small data sets" limitation significantly.
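For readers who want to see the shape of that "train on public data, then fine-tune" pattern, here is a toy Python sketch. It is not Zoho's method: it swaps in a one-variable linear model trained with plain stochastic gradient descent, and the data sets, function names, and learning rate are all invented for illustration. The point is only that the second training stage starts from the weights learned in the first, so a small data set adjusts a model rather than building one from scratch.

```python
def sgd_fit(data, weight=0.0, bias=0.0, lr=0.05, epochs=200):
    """Fit y ~ weight * x + bias with plain stochastic gradient descent."""
    for _ in range(epochs):
        for x, y in data:
            error = (weight * x + bias) - y
            weight -= lr * error * x
            bias -= lr * error
    return weight, bias


def mse(data, weight, bias):
    """Mean squared error of the model on a data set."""
    return sum(((weight * x + bias) - y) ** 2 for x, y in data) / len(data)


# Stage 1: a larger, made-up "public" data set following y = 2x.
public_data = [(i / 10, 2.0 * (i / 10)) for i in range(-20, 21)]

# Stage 2: a small, made-up "in-house" data set following y = 2x + 1.
in_house_data = [(-1.0, -1.0), (0.0, 1.0), (1.0, 3.0)]

# Pretrain from scratch on the public data.
pretrained = sgd_fit(public_data)
error_before = mse(in_house_data, *pretrained)

# Fine-tune: continue training from the pretrained weights on in-house data.
finetuned = sgd_fit(in_house_data, weight=pretrained[0], bias=pretrained[1],
                    epochs=100)
error_after = mse(in_house_data, *finetuned)

print(f"in-house error before fine-tuning: {error_before:.4f}")
print(f"in-house error after fine-tuning:  {error_after:.4f}")
```

The third stage Ramamoorthy mentions, refining as customers use the tool, would amount to calling the same fitting step again with newly collected examples.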

I am in favor of a humans-in-the-loop AI design approach. However, that's a complicated discussion. In some cases, a human remains in the loop because the AI isn't accurate enough. That isn't desirable. Example: I used a machine transcript for this article. However, I had to re-listen to some segments for accuracy. That's too much human-in-the-loop. McDonald's wants to use AI for ordering at all drive-throughs, and take the human out of that loop. However, their AI ordering isn't yet accurate enough for that threshold (AI accuracy requirements vary by use case).

On the problem of bias, I agree with Ramamoorthy: an "ask me anything" chatbot poses a more difficult bias problem than focused AI tooling. There isn't a comfortable answer to the problem of bias, however. I see it as a discipline you must commit to, on every technical and cultural level. Ideally, your AI design makes humans more accountable, by identifying biased patterns (example: biases in hiring) - and your humans also catch (and correct) the weaknesses in your AI output.

To me, the holy grail of practical enterprise AI is: role-based, embedded next best actions for every employee (gradually, some of the most repetitive actions, like approvals at a certain threshold, could be automated as the system learns). That feels like a 3 to 5 year ambition from here. But, at least for some roles, Ramamoorthy believes Zoho is much closer than that:

We are at least a year away from achieving that point where [the AI] can contextually recommend things. We do have a lot of small steps that are available right now. But we are strongly moving in that direction.

For example, our CRM has a module that tells you the best time to contact, a module that helps you save all actions together - you know, frequently done actions together as macros... The ideal next step would be [Zia as a sales coach], where right now, you help your salesperson do better. You can look at all conversations where the lead has been converted to a customer successfully. Maybe there are some mistakes where you lost a deal, and then a prompt to make sure you don't do that mistake again.

Ramamoorthy doubled down on the data silo problem:

In Zoho One, where data is not in silos, the AI engine is able to look at all data that is available in the system, and provide the next best set of actions. I think we are in the last mile of the first marathon. Hopefully in the next one year, we should be achieving that.

Count me in for a test drive - and a diginomica update.
