TrailblazerDX 2024 - Data Cloud and the 'datablazers' taking a data-first approach to building apps

Phil Wainewright, March 8, 2024
Under the covers, new features in Data Cloud are key to enabling the AI capabilities that Salesforce unveiled during this week's TrailblazerDX conference, and a new data-first approach to building apps

Muralidhar Krishnaprasad speaks at TrailblazerDX 2024 (@philww)

Since the dawn of computing, developers have built software by deciding on the business logic first, and then deciding how that shapes the data. But at this year's TrailblazerDX conference, Salesforce has been talking about an approach that turns that on its head. The argument is that there's now so much data available across the enterprise, from so many different sources, that it makes more sense for application builders to start from the data and then match it up to the business logic, with AI automatically assembling the data and actions required to fulfil the user's intent at runtime.

While the Einstein Copilot AI assistant and accompanying Einstein 1 Studio toolset have stolen the limelight at the conference, the crucial component alongside them that enables this new approach is Data Cloud, which includes a number of new capabilities in its spring release, also unveiled this week. There's been a lot for attendees to get their heads around, with Marc Benioff, CEO of Salesforce, declaring in a tweet on the eve of the conference:

Let’s make every Trailblazer a Datablazer!

Data Cloud has been in the works for several years, transcending its origins as a Customer Data Platform (CDP) to fulfil a much broader purpose, connecting data across the enterprise and making it available to every application, function, workflow and analytics tool that runs on the Salesforce platform. It was conceived before anyone realized how rapidly generative AI was going to sweep into use, but its ability to bring together data from various sources into a trusted environment has proved to be a perfect complement to the remarkably flexible automation that these AI tools offer.

Recent AI advances have made it much easier to build and try out far more sophisticated queries and automations, and the purpose of the new Data Cloud features is to make it easier to surface the data that fuels those outcomes. Whereas in the past a user might have gone to several different apps and reports to find the information they needed to initiate an action, or in many cases couldn't get all the information they needed to make the best decision, these new capabilities bring all the information together in one place where it can be acted on, either by the user or, in some cases, automatically by the AI.

One of the key aspects of getting accurate, reliable outcomes from generative AI is to build detailed prompts that provide all of the context for the requested query or action. This is part of an approach known as Retrieval Augmented Generation (RAG) and these new Data Cloud tools effectively make it possible to build very detailed RAG into the prompts that instruct the AI — but without having to know anything about RAG. In some cases, prompts can be created automatically from changes in the data, which then trigger pre-built automated workflows.
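The RAG pattern the article describes can be sketched in a few lines: retrieve the records relevant to a request, then inject them into the prompt as grounding context before it reaches the model. This is a minimal illustration of the general technique, not the Data Cloud or Prompt Builder API; all function and field names here are invented.

```python
# Minimal RAG sketch: retrieve relevant records, then ground the
# prompt with them so the model answers from trusted data.

def retrieve_context(records, query_terms):
    """Return records that mention any of the query terms."""
    hits = []
    for record in records:
        text = record["text"].lower()
        if any(term.lower() in text for term in query_terms):
            hits.append(record)
    return hits

def build_prompt(question, context_records):
    """Assemble a grounded prompt: context first, question last."""
    context = "\n".join(f"- {r['text']}" for r in context_records)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )

# Toy records standing in for CRM data.
records = [
    {"id": 1, "text": "Customer lifetime value for Acme is $48,000"},
    {"id": 2, "text": "Acme opened a support case about billing"},
    {"id": 3, "text": "Globex renewed their contract in January"},
]
hits = retrieve_context(records, ["Acme"])
prompt = build_prompt("What do we know about Acme?", hits)
```

The point of the pattern is that the user never sees the retrieval step: the prompt arrives at the model already carrying the context it needs.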

New features in Data Cloud

Whether you're constructing prompts or simply creating a user dashboard or workflow, the new capabilities make it easier to collate the required data and bring it into a Salesforce object, wherever it originates. Two important capabilities are now generally available in Data Cloud. The first is the ability to copy individual fields, along with calculated insights, which provide pre-calculated metrics such as a customer's lifetime value. The second is related lists, which add relevant data sets, often including real-time information, such as engagement data showing which products or campaigns a contact has been interacting with most recently. Muralidhar Krishnaprasad, EVP Software Engineering, who has led the creation of Data Cloud, explains:

Traditionally, what happened was, if you had data outside, if you had to make your contact get that lifetime value, or whatever else it is, you had to do all this reverse ETL stuff, a whole thing in itself. You had to worry about how fast can I push data? How soon can I push data? All of that, setting it up, managing it, all of that is gone now. You don't need to worry about it.

First of all, all data in Data Cloud, you can directly use it in Lightning. It's available for you. Or if you want it to be part of your transaction object, a few clicks, we will take care of it, automatically. Even more powerful is this related list concept, where you can actually take any data, maybe in Snowflake, maybe we have ingested it, and now you can relate that to your contact object.

Two further capabilities have now been introduced in pilot, which he describes as "quite revolutionary." First is the advent of real-time data graphs. A data graph is impressive in itself — instead of having to create SQL queries and manual data joins, it allows a developer to define a set of relationships between data points. This is particularly useful for AI scenarios, where for example a data graph can be called up to provide an answer to a user, and it will have all the necessary fields. Adding real-time capability now means that the data presented can be updated in milliseconds, allowing for real-time contextualization of responses at scale.
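The idea of a data graph can be illustrated with a small sketch: relationships are declared once, and a query then walks them to assemble a complete, denormalized record, with no hand-written SQL joins. This is a hypothetical illustration of the concept, not the Data Cloud data-graph API; every name below is invented.

```python
# A data graph declares relationships once; reads then traverse
# them instead of hand-writing SQL joins.

contacts = {"C1": {"name": "Dana", "account": "A1"}}
accounts = {"A1": {"name": "Acme", "tier": "Gold"}}
engagements = {"C1": [{"product": "Widget", "clicks": 12}]}

# The "graph": a root object, plus related objects and the
# foreign keys that link them, plus list-valued relations.
DATA_GRAPH = {
    "root": contacts,
    "related": {
        "account": (accounts, "account"),   # contact.account -> accounts key
    },
    "lists": {
        "engagement": engagements,          # keyed by contact id
    },
}

def materialize(graph, contact_id):
    """Assemble one denormalized record by walking the graph."""
    record = dict(graph["root"][contact_id])
    for name, (table, foreign_key) in graph["related"].items():
        record[name] = table[record[foreign_key]]
    for name, table in graph["lists"].items():
        record[name] = table.get(contact_id, [])
    return record

view = materialize(DATA_GRAPH, "C1")
```

Because the relationships are declared rather than queried ad hoc, the same graph can serve a dashboard, a flow, or an AI prompt; the real-time variant described above keeps the materialized view fresh as the underlying tables change.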

The second innovation is the addition of a vector database to Data Cloud, which brings the ability to do semantic search across unstructured data such as documents and transcripts. This allows generative AI to look at blocks of text such as service tickets, customer reviews and chat logs, to find information or detect patterns that were previously hidden. Because the search is semantic, it doesn't need an exact keyword match to identify when different words are being used to describe the same things. He comments:

If you look at 25 years of CRM, or 30 years of CRM, and even databases, it's all structured, meaning somebody's entering something into a form. That's what's powering all companies. But if you really step back and think, 80% of the data we do, it's recordings, PDFs, transcripts, emails and so on.

The best we've been able to do thus far is search. But search is textual. And search is yet another separate system on the side. It's never been integrated into the business layer. What we have done now is to put a semantic index on it, which means it's not a textual comparison. It's a semantic comparison. Like, for example, I might be complaining about fees for a particular thing. You can now ask it not necessarily about fees, you can actually contextually say, is somebody asking me about trouble with their card, and you will find it, even though there is no textual [match].
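The semantic comparison he describes boils down to a nearest-neighbour lookup over embedding vectors. In production a language model produces the embeddings; in the toy sketch below, tiny hand-made vectors stand in for them so the example stays self-contained. None of this reflects the actual Data Cloud vector database interface.

```python
# Toy semantic search: rank indexed texts by vector similarity,
# so a query matches on meaning rather than shared keywords.
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Pretend each axis captures a topic: [fees, card-trouble, shipping]
index = [
    ("Why was I charged an annual fee?", [0.9, 0.3, 0.0]),
    ("My card keeps getting declined",   [0.2, 0.9, 0.0]),
    ("Package arrived two weeks late",   [0.0, 0.0, 1.0]),
]

def semantic_search(query_vec, index, top_k=1):
    """Return the top_k indexed texts closest to the query vector."""
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]),
                    reverse=True)
    return [text for text, _ in ranked[:top_k]]

# "trouble with their card" shares no keyword with "declined",
# but their vectors are close, so the right ticket surfaces anyway.
query = [0.3, 0.9, 0.0]
```

The keyword search he contrasts this with would miss the declined-card ticket entirely; the vector comparison finds it because the two phrasings land near each other in embedding space.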

This opens up a new set of automation possibilities using triggered flows in Data Cloud, which from this week can be tested and debugged before activation. A triggered flow sets an automated process in motion when a data point changes or when calculated insight conditions are met. He explains:

Every table in Data Cloud can generate a change feed. Let's say you did an ML prediction. That's also a table, that's also a row, that can generate a change feed, which means you can actually trigger a flow on it. So if the ML prediction says your lifetime value changed, or the prediction of you attriting is high, you can actually trigger a flow. Or your real-time event is coming in, somebody's clicking on the website, you're getting errors, could trigger a flow. It is so broad that pretty much everything you do, the flow can now work on it. This is very powerful.
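The change-feed pattern he describes can be sketched as a table that emits an event on every write, with flows registered against conditions that fire when an event matches. This is a generic event-trigger sketch under invented names, not the Salesforce Flow API.

```python
# Change-feed sketch: every upsert notifies registered triggers,
# and a flow runs when its condition matches the changed row.

class Table:
    def __init__(self, name):
        self.name = name
        self.rows = {}
        self.triggers = []  # (condition, flow) pairs

    def on_change(self, condition, flow):
        """Register a flow to run when a changed row matches condition."""
        self.triggers.append((condition, flow))

    def upsert(self, key, row):
        """Write a row, then replay it through every trigger."""
        self.rows[key] = row
        for condition, flow in self.triggers:
            if condition(row):
                flow(key, row)

fired = []
predictions = Table("ml_predictions")

# Trigger a retention flow whenever churn risk crosses a threshold,
# mirroring the "prediction of attrition is high" example above.
predictions.on_change(
    condition=lambda row: row["churn_risk"] > 0.8,
    flow=lambda key, row: fired.append(("retention_flow", key)),
)

predictions.upsert("C1", {"churn_risk": 0.4})  # below threshold: no flow
predictions.upsert("C1", {"churn_risk": 0.9})  # above threshold: flow fires
```

The breadth he points to comes from the fact that predictions, real-time events, and feedback are all just tables, so the same trigger mechanism covers all of them.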

From copilot to autopilot

New possibilities now open up when combining the vector database capability with triggered flows. He goes on:

I can do a semantic query, join it back to my case, which can join it back to contact, I can now quickly find out who is talking about fees in my credit card and plot it in Tableau, as an example, plot it by region and say, who's complaining about what? This was not possible before ...

If maybe too many people are negatively posting about something, you can now trigger a flow on it. Attach a case to it. And then you can even figure out who should the case go to, based on how they've solved previous things. You can automate a lot of that stuff.

I think, to me, the low-hanging fruit for that is so high. A new case comes in today, people still rely on a human to say, 'Oh, these two things look the same, it's probably a dupe'. Now we can automate it. As somebody is entering a new case, I can tell exactly, 'Hey, this looks exactly like the other case. Don't create it.'
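The duplicate-case check he sketches amounts to comparing an incoming case against open cases and flagging near-matches before creation. A real system would compare embeddings as in the semantic search example; here token-set overlap (Jaccard similarity) stands in to keep the sketch self-contained, and all the names and thresholds are invented.

```python
# Duplicate-case sketch: flag an incoming case as a likely dupe
# when it is too similar to an already-open case.

def jaccard(a, b):
    """Similarity between two texts as token-set overlap (0 to 1)."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)

def find_duplicate(new_case, open_cases, threshold=0.6):
    """Return the id of a likely duplicate open case, or None."""
    for case_id, text in open_cases.items():
        if jaccard(new_case, text) >= threshold:
            return case_id
    return None

open_cases = {
    "00123": "login page shows error 500 after password reset",
    "00124": "invoice PDF missing line items for March",
}
dup = find_duplicate(
    "login page shows error 500 after password reset today", open_cases)
```

With an embedding-based similarity in place of Jaccard, the same structure would also catch duplicates that describe the same problem in entirely different words.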

Ultimately, this moves the gameplan from copilot to autopilot, he says, as it becomes possible to automate more and more processes rather than keeping a human in the loop. He goes on:

This blending of structured and unstructured together to power all your experiences — automation, analytics, AI, and Lightning and everything else — I think will be a game changer, because that is what will truly lead us to autopilot ... You don't have to go ask for it. It can do it for you.

For those concerned about the implications of automatically triggering all kinds of actions, he points out that alerts based on changing data can also be used to monitor outcomes and fine-tune the system. He says:

Feedback is a critical thing. We don't talk about it much. But maybe we should. We track all the feedback back in Data Cloud. That's another table, which means everything we talked about, all the stuff you see, you can now apply on feedback, too. Which means you can automate if there are negative feedbacks, if the toxicity is too high, you can create a dashboard on it and then alert it. So you have all of that power as well.

Human at the helm

Salesforce has been talking about a new concept as a successor to the notion of keeping a human in the loop, which it calls 'human at the helm'. Rather than requiring users to check every action proposed by AI before it is executed, this proposes that humans take more of an overview role. Paula Goldman, Chief Ethical and Humane Use Officer at Salesforce, explains:

The human at the helm is the idea that, as AI gets more and more powerful, it's not just creating content, it's taking an action on our behalf ... we need more powerful controls. We need audit trails. We need Einstein 1 Studio where we're actually setting the parameters of actions in advance, we're able to assess whether they're successful, we know the right data is going into there. It's the move from just having humans review every single thing to more powerful control.

At the end of the day, what does all this mean for the Trailblazers, those who have been building applications and must now start to move towards a data-first, AI-driven approach to automation? Rahul Auradkar, EVP & GM, Automation and Integration, says:

In the past, it was forms over data type business logic, coding, if you may. There is a lot of sophistication in there, all the way from the CSS layer, all the way down to the data layer. There's a lot of sophistication in that whole stack.

But that stack presumes that you don't know, a priori, about all the customers' touchpoints, their engagement channels, their engagement modalities. It presumes that you know something about the user, but we have squarely not accounted for the fact that we have enough data about the users, we have squarely not accounted for the fact that we already know the user behaviour based on data that we have, either whether it's past historical data or engagement data that we're bringing.

Imagine a world in which you start building on that data, where you drive analytics on the data that drive automations, and that driven automations lead to redefined business processes, and those redefined business processes drive more productivity and more value to their customers. That's what we're looking at, where the users and developers start looking at the data and the AI processes first, in addition to the business logic, that they're really going after.

Nevertheless, it's important not to get carried away. There are a lot of low-hanging use cases that can deliver value simply by connecting more data into the contact record, he says, and even more when you start to work with very simple unstructured data sources. He elaborates:

A case is essentially a structured piece of data, you've got rows and columns, you've got a structured piece of data. But there are certain cells within those rows and columns that are text, where [the agent] has described a lot about this issue. And it could be a reference from that cell to another knowledge article that exists. So you're looking at the structured data, you're looking at the unstructured data, and you're combining the two of them, the power of it. It's a simple use case. You don't even need to go think about voice transcripts, video transcripts, and email data, all of those. They're all awesome. But we could just take a look at what exists today with all the textual data and the knowledge references that textual data has for customer service ... those are simple things that you could get started right now.

Admins have a role to play too in figuring out what data to bring in and helping to reconcile the different data schemas that, for example, describe a customer as a contact, a subscriber or a profile, according to which system is holding the data. A reference model that harmonizes all these different schemas hides this complexity from downstream applications and flows. Krishnaprasad sums up:

To me a Datablazer will play a big role in first identifying the data, but really more importantly, creating that common model for the enterprise. And then the third part they're going to play a big role is really in reimagining a lot of the business processes and automation to start leveraging this.
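The "common model" role he describes is schema harmonization: each source system calls the same person a contact, a subscriber, or a profile, and a mapping layer translates all of them into one reference shape so downstream flows see a single model. The sketch below is a hypothetical illustration of that idea; the source names and field mappings are invented, not Data Cloud's actual harmonization configuration.

```python
# Schema harmonization sketch: per-source field maps translate
# differently-shaped records into one common model.

FIELD_MAPS = {
    "crm":       {"contact_name": "name", "contact_email": "email"},
    "marketing": {"subscriber":   "name", "sub_email":     "email"},
    "web":       {"display_name": "name", "email_addr":    "email"},
}

def harmonize(source, record):
    """Translate a source-specific record into the common model."""
    mapping = FIELD_MAPS[source]
    common = {}
    for source_field, common_field in mapping.items():
        if source_field in record:
            common[common_field] = record[source_field]
    return common

# Two systems, two schemas, one harmonized shape.
a = harmonize("crm", {"contact_name": "Dana", "contact_email": "d@x.co"})
b = harmonize("marketing", {"subscriber": "Dana", "sub_email": "d@x.co"})
```

Once every source resolves to the same shape, the flows, prompts, and dashboards built on top never need to know which system a given record came from.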

My take

I've written in the past about the transition to more composable, Tierless Architecture for enterprise IT, which breaks apart the traditional application silos and makes the data and functions available for combination together in much more flexible, cross-functional patterns. All of the above accelerates this process and taps AI to add powerful new automation capabilities. The takeaway for application builders is that their profession is becoming industrialized at massive scale, and they will have to become adept at building and continuously tuning reusable functionality rather than one-off projects. But first, there's a big job to be done in making sure that the right data is in place, that the functionality is well documented, and that the application builders understand their new roles and what skills they need to focus on to play their part. The next few years are going to be a time of massive change and relearning.
