Confluent CEO Jay Kreps recently described his vision for the company: for buyers to use its commercial offering of Apache Kafka as a ‘data streaming platform’, meaning connecting, processing, governing and sharing data streams in real time. Confluent’s pitch is that companies relying on data stored in databases, using batch processing to gain insights and develop applications, are out of step with a world that has moved to always-on, real-time data environments. And it’s Confluent’s job, as it sees it, to make these real-time data streams more usable for the enterprise.
To this end, the company today made a number of announcements at its annual user event in San Jose, California, that aim to make it easier for buyers to work with real-time data streams, with a particular focus on allowing them to more easily develop applications using streaming data. This includes real-time AI applications - a move that I’m sure will be welcomed by customers, given the hype and focus around Large Language Models (LLMs) and generative AI.
It goes without saying that for companies to use AI effectively, they need data that is trustworthy, governed appropriately and served to the right people at the right time. Equally, buyers are unlikely to limit their AI deployments to a single vendor or development organization, and will need to source models and tooling from wherever suits their needs at any given time.
As such, Confluent today announced a range of integration partners that aim to let customers work with AI, using the Confluent platform as an exchange layer that provides the necessary ‘safety net’ to ensure the data is trustworthy and secure. Kreps sees this as sitting very much within the company’s wheelhouse - in fact, more a case of ‘business as usual’. He told diginomica:
I think what makes it a little bit easier for us is the value proposition for those AI use cases for Confluent is actually not that different from the value proposition elsewhere: which is real time data flow.
And so people are already looking at us to do this. We were kind of already a de facto standard for that. As you're looking at these new AI use cases, you need a whole set of real time data that will plug into that.
And so, in that sense it's not like a new product for us, or some new thing that has a new way of taking it to market, it’s really just kind of showing the architecture of what our customers are already doing. And it's an extension of what we were doing.
The partner integrations are new, however, and these aim to support customers that want to connect their data flows into these AI systems. The partnerships include:
Technology Partners - Confluent is partnering with MongoDB, Pinecone, Rockset, Weaviate, and Zilliz to provide real-time contextual data from anywhere for their vector databases. Vector databases are especially important in the world of AI, as they store, index and search the high-dimensional embeddings that AI technologies like LLMs rely on to retrieve relevant context.
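To make the vector database idea concrete, here is a minimal, self-contained Python sketch of what such a system does at its core: store embeddings and return the closest match by cosine similarity. The class name, document IDs and three-dimensional vectors are invented for illustration; real vector databases like those named above use far higher-dimensional embeddings and approximate nearest-neighbour indexes.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

class ToyVectorIndex:
    """Tiny in-memory stand-in for a vector database:
    stores (id, embedding) pairs and ranks them against a query vector."""

    def __init__(self):
        self.items = []  # list of (doc_id, vector)

    def upsert(self, doc_id, vector):
        self.items.append((doc_id, vector))

    def query(self, vector, top_k=1):
        # Rank all stored vectors by similarity to the query vector.
        ranked = sorted(self.items,
                        key=lambda item: cosine_similarity(item[1], vector),
                        reverse=True)
        return [doc_id for doc_id, _ in ranked[:top_k]]

index = ToyVectorIndex()
index.upsert("refund-policy", [0.9, 0.1, 0.0])
index.upsert("shipping-times", [0.1, 0.8, 0.1])
print(index.query([0.85, 0.15, 0.0]))  # -> ['refund-policy']
```

In an LLM application, the query vector would be the embedding of a user's question, and the retrieved documents would be fed into the model's prompt - which is why keeping those embeddings fresh with real-time data streams matters.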
Public Cloud Partners - Confluent is also building on its agreements with Google Cloud and Microsoft Azure to develop integrations, proof of concepts (POCs), and go-to-market efforts specifically around AI. For example, Confluent plans to use Google Cloud’s generative AI capabilities, with the aim of improving business insights and operational efficiencies for retail and financial services customers. And, with Azure OpenAI and the Azure Data Platform, Confluent is planning to create a Copilot Solution Template that allows AI assistants to perform business transactions and provide real-time updates.
Services Partners - Finally, Confluent is launching POC-ready architectures with Allata and iLink that cover Confluent’s technology and cloud partners to offer tailored solutions for vertical use cases.
The key for Kreps, when thinking about how customers may want to use AI in the future, is that it’s the flow of data that’s important. What Confluent should provide is the choice of technology providers, so that buyers have the ability to use what suits them best at any given point in time. He added:
I would say one of the benefits that we bring is that we act as a connected layer between different things. It's actually very convenient for companies, the fact that there's nine vector databases in a space that are changing very rapidly - we act as an interchange across all of those, so you don't have to get every bet right.
As long as you have the right flow of data, you can actually support however this evolves, whether you’re using OpenAI today or you're using Google tomorrow…that stack can evolve.
Flink is here
Confluent recently acquired Immerok, which has developed a cloud-native, fully managed Apache Flink service - Flink having seen growth in usage along similar lines to Apache Kafka. Kreps said earlier this year that Flink would form a central part of Confluent’s strategy going forward, given its powerful processing model that generalizes both batch and stream processing. The key for Confluent is that Flink's stream processing capabilities allow it to monetize the application development that happens around data streams.
The thinking is that with Kafka, Confluent only had one piece of the puzzle. For buyers to fully utilize data streams, they need to combine, develop and reshape those streams with other data from across the organization. However, self-managing Flink can be operationally complex and resource heavy.
Today, Confluent announced its Apache Flink on Confluent Cloud service, which aims to allow teams to create high-quality, reusable data streams that can be delivered anywhere in real time. With the service, Confluent says that teams will be able to:
Filter, join and enrich data streams with Flink
Enable high-performance and efficient stream processing at scale
Experience Kafka and Flink as a unified platform (a key point for customers already using both separately).
The key thing to understand is that Flink aims to make data streams more usable - providing new opportunities for buyers in terms of what their stream data can do for them. Explaining how organizations should be thinking about working with Flink, especially within the context of their Kafka usage, Kreps said:
By analogy to the world of data at rest, the lowest layer in the traditional data stack is your file systems, right? Everybody has that and you’ve got a bunch of files. But really, the thing that unlocked the data for use in applications was databases, which just make it really easy to work with data. It's kind of a back-end for most enterprise applications.
So now, if you think about streaming data, that fundamental layer is Kafka - that's the hub that everything plugs into that has the streams. But how do I build an application on the streams?
I can do it by just working with the raw stream data, in my code from scratch, but anything we can do to make it easier - that's what Flink does.
It makes it really easy to build these scalable, correct applications. You can do it in SQL, in Java, or in Python. It’s a framework that makes that easy in the same way I think databases did for stored data. For us, we feel it's equally important - as important as databases were for stored data.
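The filter, join and enrich operations described above can be sketched in plain Python to show the pattern a stream processor applies continuously to each arriving event. This is an illustration of the concept only, not the Flink API; the record shapes, field names and threshold are invented for the example.

```python
# A static lookup table standing in for a customer reference stream/table.
customers = {"c1": {"tier": "gold"}, "c2": {"tier": "basic"}}

# Incoming order events, standing in for records arriving on a stream.
orders = [
    {"order_id": 1, "customer_id": "c1", "amount": 120.0},
    {"order_id": 2, "customer_id": "c2", "amount": 15.0},
    {"order_id": 3, "customer_id": "c1", "amount": 300.0},
]

def process(event):
    # Filter: keep only orders at or above a threshold.
    if event["amount"] < 100.0:
        return None
    # Join: look up the customer record keyed by customer_id.
    customer = customers.get(event["customer_id"], {})
    # Enrich: attach the customer's tier to the outgoing event.
    return {**event, "tier": customer.get("tier", "unknown")}

enriched = [out for e in orders if (out := process(e)) is not None]
print(enriched)
```

In a real Flink job, the same filter-join-enrich logic would be expressed declaratively (for example in Flink SQL) and run continuously over unbounded streams, with Flink handling state, scaling and fault tolerance.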
It’s been an interesting first day at Confluent’s Current conference, where we’ve also had the opportunity to hear from a range of customers (more on that tomorrow). The key takeaway for me is that Confluent is focusing on the usability of data streams. I’ve been following the company for a few years and whilst the benefits of Apache Kafka are clear, often what I’ve observed is that it requires a heavy operational and skills investment to get it right. Confluent introduced a greater level of simplicity for customers with its Confluent Cloud offering, abstracting some of the management requirements away from enterprises. And with Flink it aims to do the same, making the processing and development of data streams easier (all within a unified platform).

The AI piece is to be expected, but I must say it was refreshing to see a vendor not announce a whole host of LLM and generative AI use cases to the market, but instead focus on the foundational (ahem) elements of making sure the data is accurate, secure and governed well - whilst providing access to the AI tooling customers want and need. We will be providing a number of customer stories over the coming days, so stay tuned for those.