New Relic recruits AI to look for needles in DevOps haystacks

Phil Wainewright, September 13, 2017
New Relic unveils new AI-powered capabilities at FutureStack New York to help locate needle-like issues in DevOps haystacks

New Relic's Lew Cirne & Gannett's Erik Bursch in FutureStack keynote

What's the role of artificial intelligence in application performance monitoring? New Relic CEO Lew Cirne is certain of one thing — it's not so that people can have conversations with their war room dashboards. As he told attendees at the vendor's annual conference in New York yesterday:

I haven't met a customer that wants to talk to their monitoring system.

Instead, New Relic is using machine learning to sift through the mass of performance data being collected, help its customers find the "needles in the haystack" that are causing problems, and fix them faster:

I think it's analogous to augmented reality in that the human is going to have technology help them know where to spend their time and attention. If it's a broken piece of code, we're going to identify what the broken piece of code is, who owns it, and recommend that you go fix it. The actual fix probably is a human writing software, in that case.

AI-powered analysis

The conference kicked off with several announcements of new and future products powered by AI, most notably the general availability of Radar, which New Relic previewed at earlier events under the codename Project Seymour. Radar analyzes the collected data in real time, identifies patterns and potential issues, and suggests how to resolve them. It presents this information to users in a personalized feed based on their role and interests, and continually learns from user engagement and actions to improve relevance.

Other announcements use AI to filter data and provide more relevant alerts and insights. NRQL Baseline Alerting lets users define dynamic alert thresholds that trigger when the AI detects anomalous behavior. Error Profiles automatically highlight the attributes in an error condition that differ from historical values, which helps users understand the cause of an error more rapidly and see where to focus their attention to resolve it.
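To make the idea of a dynamic threshold concrete, here is a minimal sketch of baseline-style alerting. It is not New Relic's algorithm, which the company did not detail. It assumes a simple trailing-window mean and standard deviation, and the function name, window size and tolerance are all hypothetical choices for illustration.

```python
from statistics import mean, stdev


def baseline_alerts(values, window=5, tolerance=3.0):
    """Return the indices of values that stray from a trailing baseline.

    Illustrative assumption: the baseline is the mean of the previous
    `window` samples, and a value is anomalous when it sits more than
    `tolerance` standard deviations away from that mean.
    """
    alerts = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        baseline = mean(history)
        spread = stdev(history)
        if spread == 0:
            # A perfectly flat history: any change at all is a deviation.
            if values[i] != baseline:
                alerts.append(i)
            continue
        if abs(values[i] - baseline) > tolerance * spread:
            alerts.append(i)
    return alerts


# Hypothetical response times in milliseconds, with one obvious spike.
response_ms = [100, 102, 98, 101, 99, 100, 250, 101]
flagged = baseline_alerts(response_ms)
print(flagged)
```

Because the threshold moves with recent history rather than being fixed, the same code flags a spike on a quiet service and tolerates wide swings on a noisy one. That is the practical appeal of a baseline over a static limit.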

Distributed tracing

The company also previewed a new capability to trace and isolate errors in interconnected systems, which uses the vendor-neutral OpenTracing framework for distributed tracing. Applications are increasingly dependent on distributed systems to perform operations, and it can be frustratingly difficult for a developer to identify the precise location of a problem when there are so many different levels of dependency. The new distributed tracing tool is able to traverse all the connections and drill down multiple levels to precisely identify where a problem is occurring.
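The drill-down can be pictured as a walk over a tree of spans, one per service call. The sketch below is a hedged illustration only. It uses neither the OpenTracing API nor New Relic's tooling. The `Span` class, its fields and the service names are invented for the example, and it assumes that an error in a downstream call marks every caller above it as failed too.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    """Hypothetical record of one service call within a trace."""
    service: str
    error: bool = False
    children: list = field(default_factory=list)


def find_error_origin(span, path=()):
    """Follow errored spans downward and return the path to the deepest one.

    Returns a tuple of service names, or None if this span did not fail.
    """
    if not span.error:
        return None
    here = path + (span.service,)
    for child in span.children:
        deeper = find_error_origin(child, here)
        if deeper is not None:
            return deeper
    # No failing child, so the fault originates in this span.
    return here


# Invented trace: the failure surfaces at the frontend but starts in payments.
trace = Span("frontend", error=True, children=[
    Span("checkout", error=True, children=[
        Span("inventory"),
        Span("payments", error=True, children=[Span("card-gateway")]),
    ]),
])
origin = find_error_origin(trace)
print(" -> ".join(origin))
```

In this made-up trace the walk ends at the payments service, several levels below where the error first became visible to the user.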

The move into interconnectivity solutions is an interesting shot across the bows of Cisco, which bought APM rival AppDynamics earlier this year. Cirne says the open standard approach is an important differentiation for New Relic:

We believe that what we've done with distributed tracing today is, we've introduced the best cross-service tracing available. What we've done that Cisco doesn't do, and nobody else does in that direct competitive space, is we've embraced this open standard.

We decided, let's let our customers use an open API and then their source code isn't locked to a monitoring vendor.

How USA Today survived election night

The conference opening keynote also included an appearance by Erik Bursch, vice president of platform as a service at news publisher Gannett, which publishes USA Today as well as many regional newspapers across the US. He spoke about covering the presidential election results in November last year, for which his team rolled out a production-level Kubernetes platform to support the rapid deployments of new code they knew they would need that night.

Timeliness of reporting on election night is crucial, as any lag in the precinct results coming in will quickly send visitors to other sites. During that night, Bursch's team made 200 separate code deployments, a mixture of bug fixes, data fixes and product changes that adjusted how the news was presented.

That ability to stay on top of a constant stream of new code deployments is typical of the kind of agility that software-led businesses are having to achieve, and it's what fuels the expansion of New Relic's market. In a fast-moving DevOps environment that's constantly delivering new code, having instant insight into what's happening and whether things are going wrong is crucial. As Cirne explains:

It's not that testing doesn't have its place, but you can never predict how software's going to behave at Internet scale, until it's at Internet scale. It's complex.

New Relic's mission is to help its customers move as fast as they can, at scale, as he told attendees:

We are the catalyst that helps you all move fast with confidence with your most critical web and mobile projects.

My take

Modern software development methodologies are at the heart of a new class of ultra-nimble enterprises that constantly adjust their web and mobile applications to adapt to the needs of customers. Today, these organizations are still a minority, but they increasingly set the pace for mainstream business, and there will come a time when every business behaves this way.

New Relic has been doing a good job of giving such companies the tools they need to stay on top of this rapid pace of change, and the investment in AI will help them cope with even more rapid change in the future.

We'll have more from the New York event in the coming days, including an interview with Lew Cirne and some notable customer stories.
