New Relic expands support for LLM monitoring as it gears up for Observability 3.0

Phil Wainewright, June 28, 2024
Summary:
We catch up with New Relic's GVP of Product to discuss the impact of AI and how the vendor is adapting to growing demand with its concept of Observability 3.0


As enterprises seek to harness generative AI, rapid adoption of the underlying Large Language Models (LLMs) creates a new demand for observability tools that can monitor their performance. Akshay Bhargava, GVP of Product at New Relic, tells me how enterprises are using its observability tools to track a range of different parameters as they test different LLMs:

They're not just [measuring] how quick is it? They also want to say, how much does it cost? How accurate is it? How likely is it for bias, hallucination? ... They're looking for observability to provide them all those attributes so that they can make an informed decision.
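To make that comparison concrete, here is a minimal Python sketch of how per-call measurements might be rolled up per model. It is not New Relic's implementation or API. The `LLMCall` record, the `summarize` helper, the model names and the per-token prices are all hypothetical, and the hallucination flag is assumed to come from a separate evaluation step.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass
class LLMCall:
    """One recorded LLM request (hypothetical record shape)."""
    model: str
    latency_s: float
    prompt_tokens: int
    completion_tokens: int
    flagged_hallucination: bool  # assumed output of a separate evaluation step


# Hypothetical prices in dollars per 1,000 tokens; real prices vary by vendor.
PRICE_PER_1K_TOKENS = {"model-a": 0.5, "model-b": 2.0}


def summarize(calls):
    """Aggregate latency, cost and hallucination rate for each model."""
    by_model = defaultdict(list)
    for call in calls:
        by_model[call.model].append(call)

    summary = {}
    for model, group in by_model.items():
        total_tokens = sum(c.prompt_tokens + c.completion_tokens for c in group)
        summary[model] = {
            "calls": len(group),
            "avg_latency_s": mean(c.latency_s for c in group),
            "cost_usd": total_tokens / 1000 * PRICE_PER_1K_TOKENS[model],
            "hallucination_rate": sum(c.flagged_hallucination for c in group) / len(group),
        }
    return summary


calls = [
    LLMCall("model-a", 0.5, 500, 500, False),
    LLMCall("model-a", 1.5, 1000, 1000, True),
    LLMCall("model-b", 1.0, 250, 250, False),
]
report = summarize(calls)
```

Putting speed, cost and quality side by side in one report is what lets a team weigh a faster but less reliable model against a slower, more accurate one.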

It's no surprise, therefore, that New Relic has been working closely with high-end chipmaker Nvidia to bring observability to AI workloads running on powerful Nvidia GPUs. This week it extended that partnership, introducing an integration with Nvidia NIM, a microservices container that is ready-made to run LLM inference natively on Nvidia GPUs. Inference is the process an LLM uses to produce answers, and a common enterprise use case for the NIM is to easily deploy LLMs that will respond to queries on a specific enterprise dataset, using Retrieval-Augmented Generation (RAG) to enrich the prompts with relevant data and limit the risk of inaccurate answers.

The addition of AI capabilities marks a new phase for observability, which first became widely recognized as an IT term just eight years ago. New Relic is calling this new phase 'Observability 3.0'. It defines '1.0' as the era of server-based Application Performance Management (APM), while '2.0' brought APM together with infrastructure monitoring and logging into a single package of cloud-based capabilities. Now '3.0' adds intelligence, with the main characteristics being the addition of AI capabilities and workflows, more support for open-source standards, and a consumption-based pricing model, says Bhargava.

Three factors driving observability

This coincides with a growing demand for observability, which he says is driven by three factors, including the rise of AI. Most important is the increased dependency across the software stack, as architectures become more interconnected. He gives an example:

Let's say you're a database developer, and there's a problem. The chances are, [when] the customers experience a problem, but you have no idea [where it is]. Is it in the database? Or is it upstream, downstream from where you are? That's a big problem to figure out. Without getting hundreds of engineers from all these different teams together, how are you going to pinpoint it?

This is where observability solutions like New Relic come in, because it can tell you immediately, 'Yes, there's a problem in the database, but it's starting upstream or downstream.' Then you can immediately say, 'Okay, is it at the app level? Or is it down at the infrastructure level, and in the serverless layer? Which team is responsible for that? And I know how to contact them and work through it...'

Linked to this is the growing cost of downtime, with every aspect of business operations becoming dependent on digital technology, and thus vulnerable to unexpected failures. He goes on:

Most companies take 30+ minutes to get to an understanding and even a workaround for the issue. So that time to fix — it's inevitable to have these incidents, and the time to fix is too long. Companies want to shorten that. That's real business impact. How do you reduce that time? How do you become more effective? Observability solutions are the key to it.

Speeding up response times was the motivation for a new integration with Atlassian Jira unveiled last month. This automatically brings insights and observability data from New Relic into a new 'Incidents' tab in Jira, where engineering teams can immediately act on the information and agree next steps, and carry out post-incident reviews once the issue has been resolved.

The advent of AI reinforces both these trends, adding new dependencies and increasing the reliance on digital automation. He says:

If you're building technologies with AI, observability becomes even more key. Because AI is equally, if not more complex. It's adding much more complexity — all the Gen AI, the LLMs, the prompt response, bias, all of these things, the hallucinations. How do you monitor and manage that? That's also a part of observability. Companies that learn to manage that effectively, observe that effectively, they can innovate on AI faster, and they're going to get a competitive edge.

AI is changing observability

In addition to providing capabilities for AI monitoring, New Relic has also developed its own LLMs to bring new features to its AIOps offering. It also has an AI assistant that helps New Relic users build dashboards, queries and code. He explains:

We're using AI and AIOps to streamline and make certain experiences of the product much more efficient and better. A good example of this is, if you come into New Relic today, and you look at your errors inbox, there's all these different errors that come in, maybe hundreds or thousands of errors. How do you make sense of that? We have used AI and AIOps to help summarize and distil those down into themes and categories, so it's easier for a user to come in and understand what are the categories of errors. We're doing a very similar thing with alerts...

A big part of how we're using AIOps is to improve the customer experience in the product, and essentially use AI to reduce the noise and create the right signals and consolidate information in various different parts of the experience.
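The idea of collapsing a flood of errors into a handful of categories can be illustrated with a much simpler, rule-based stand-in. The Python sketch below is not how New Relic does it, which Bhargava describes as AI-driven. It merely masks the variable parts of each message so that similar errors fall into the same bucket, and the `fingerprint` and `group_errors` helpers and the sample messages are hypothetical.

```python
import re
from collections import Counter


def fingerprint(message: str) -> str:
    """Mask variable parts of an error message so similar errors match."""
    masked = re.sub(r"0x[0-9a-fA-F]+", "<hex>", message)  # memory addresses, ids
    masked = re.sub(r"\d+", "<num>", masked)              # counts, durations, ports
    return masked


def group_errors(messages):
    """Return (fingerprint, count) pairs, most frequent category first."""
    return Counter(fingerprint(m) for m in messages).most_common()


messages = [
    "Timeout after 30s calling service 12",
    "Timeout after 45s calling service 7",
    "Null pointer at 0x7ffe12",
]
groups = group_errors(messages)
```

Here three raw errors reduce to two categories, which is the same noise-reduction goal at a far smaller scale.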

Increased use of intelligent assistants is also changing the business model as observability moves into this third phase, reinforcing the case for usage-based pricing. He explains:

Let's say today, we charge based on users. It's a very common subscription-based pricing model. In an AI world, where you are potentially reducing the number of users that you need, but you just need one AI bot, how are you going to charge for that? Usage starts to become more of a driver for a lot of use cases with AI in the mix as well.

The shift to usage-based pricing isn't an easy one to make, but New Relic is already well advanced down this path. He argues this positions the company well for growing adoption as enterprises step up their investments in observability. He sums up:

All this compounding of things, from the current way that apps are built and deployed, the rising cost of outages, plus the advent of AI, all of them culminating to more and more need for every single company, every single development team, to use observability. That's what we're seeing happening in the marketplace.

My take

The journey from its APM origins into a full-fledged observability platform has seen New Relic go through quite a few changes, including new private equity owners late last year, followed by the appointment of Ashan Willy as its new CEO six months ago. The transition to usage-based pricing was particularly challenging for a publicly quoted company, but as Bhargava points out, it provides a better foundation for accurately allocating costs when usage is increasingly automated.

AI brings new technology challenges, too, not least the training and deployment of LLMs that can power those automations within New Relic's own product set. More recently, the vendor has expanded into monitoring LLMs for customers, going beyond simply monitoring uptime and response rates to also report on quality factors such as inference cost, bias and hallucination rates. Once again, observability is expanding to bring in even more data points for analysis, reflecting the increasing complexity and importance of digital infrastructure in today's enterprises.
