System availability is increasingly essential to protecting a company’s revenue and ensuring positive experiences for employees and customers. As services become more digital, and the ecosystem of platforms and tools to support this grows increasingly complex, the human capability to manage these environments is being stretched.
This is the problem that PagerDuty seeks to solve with its Operations Cloud and through the use of AI and automation. Having started as an incident response tool favored by developers, PagerDuty has expanded to help customers more broadly understand, and manage, their web of interconnected systems. The aim is that, instead of losing out on income and suffering reputational consequences for prolonged periods when things go wrong, organizations can understand problems quickly and perhaps even move to a model of proactive management.
PagerDuty has mostly been on a steady upward swing in recent years, with CEO Jenn Tejada saying in the company’s public earnings that she is focused on achieving $1 billion in annual revenue. This is being reflected in the vendor’s shift to a predominantly enterprise customer base, which suggests its proposition of solving for complexity is resonating.
I got the chance to sit down with Tejada in London this week, where she explained how PagerDuty’s Operations Cloud is hitting the right mark with buyers. She said:
When we went public we transitioned from being an on-call tool for, largely, tech companies and tech startups, to being a modern incident response platform for the Fortune 500. That was a pretty big transition that I’m not sure a lot of the market thought we’d pull off. But now 80% of our revenue comes from enterprise and mid-market.
A lot of that is market pull, because this is something that every company needs, because every company is going through some sort of digital transformation. The problem with that is that the entire technology ecosystem behind that is still laden with legacy systems, technology debt, and operations debt.
The way companies operate largely still follows the industrial era command and control design of the military, which relies heavily on something going up the chain and then direction coming down the chain. A sequential decision making system. Tickets, queuing. The digital world doesn’t work that way, it works in bytes and milliseconds.
Tejada argues that the challenge for a lot of organizations is that they have often ‘digitized’ the “pointy front end of their business”, but the rest of the company’s operation is still ripe for significant transformation. Buyers often still operate as if they rely solely on physical bricks and mortar, in a command and control world, but this doesn’t suit the demands of the ‘always-on’ employee and consumer. Tejada said:
That’s a chasm that is wider than most people estimate.
The Operations Cloud helps our customers modernize the way they work, by not only addressing the technology debt in their own ecosystem, but by being able to handle and manage, through automation and AI, the unstructured, time sensitive, but high value and mission critical work.
Increasingly you’re taking a problem or an opportunity that would take days to identify, diagnose, to get the right people in the room, to be able to do that in seconds.
Disruption costs money - and that disruption is the responsibility of the whole organization, according to Tejada. Operations teams have often sat in a metaphorical - or literal - basement and have suffered the wrath of leaders when things go wrong. However, their role is getting increasingly elevated as the smooth operational running of a company becomes a board-level priority. Tejada gave the example of an e-commerce company that recently told her that it lost $4 million in an hour during an incident. You multiply a few of those incidents over the period of a year and you quickly understand how resolving these issues is a business priority.
Being available manifests itself in trust. You count on that. When it fails it can have a big impact. I think there's a much straighter line now between the reliability and the resiliency of the entire technology ecosystem - and how it delivers the revenue generating experience.
I think the way to think about it is that every employee in an organization is inextricably linked through the digital customer experience. It is heavily reliant on an increasingly complex web or ecosystem of applications and services, many of which that the company no longer has control over - public cloud services, public databases, private services.
And the thing we know as fact is that technology proliferation is outpacing human capability to manage it. You need technology to not only understand that, but to automate the effort to operate it more effectively. That’s where the big opportunity is.
A cautious approach to generative AI
Artificial intelligence (AI) has been central to PagerDuty’s proposition for a long time. It has had its own foundational model in operation for a decade already. However, as may be expected, the company is considering - cautiously - how generative AI could play a role in not only improving the operational resilience of an organization, largely by speeding up processes, but also by broadening access to the PagerDuty platform to a wider range of people.
Tejada said that PagerDuty will use a mix of its own models and publicly available ones to build out solutions. It has already announced some generative AI use cases, including the following:
Users will be able to generate post mortems more quickly using generative AI. Instead of a team having to manually try and capture information across dozens of systems and observability tools, that information already sits within PagerDuty, and so reports can quickly be generated. What happened? Who did what? How was the cause identified? How was the problem solved? Instead of relying on what people remember happening, generative AI can be used to build a report from the source data.
PagerDuty will also create executive-ready status updates. For example, during a major incident, there are a number of stakeholders in the business who need to know what’s happening (CEO, risk management, legal, sales, PR, etc). However, what usually happens is that someone in the know is pulled away from fixing the problem to inform these various teams. Through the use of generative AI, PagerDuty can watch what’s going on in the incident and draft a status update, which is then approved by someone to ensure accuracy. Tejada said that “the experts still have to sit on top of the generative AI to validate it, because we don't want our AI hallucinating anymore than the next person”.
During our conversation it was clear that Tejada is excited about the opportunity that generative AI holds for not only PagerDuty, but also for all sectors and economies more broadly. That being said, the CEO is acutely aware that the vendor’s whole proposition rests on being trustworthy enough to run a company’s operations effectively. As such, unlike some other vendors in the market, she sees PagerDuty’s approach as more of a ‘slow and steady wins the race’ one, rather than ‘first past the post’. Tejada said:
I do think that you will see us go slow to go fast. I think it's important to get a lot of feedback from users, to test.
I think this is the thing that I think a lot of teams and people and media are underestimating. You need guardrails, you need policies, you need some level of regulation, to make sure it's safe for its consumers. One of the things that has long been a competitive advantage for PagerDuty is the fidelity of the platform, the security of the platform, and the resiliency of the platform at scale.
So I'm not going to turn around and willy nilly and offer some generative AI feature that's going to produce garbage and put that trust at risk. Because as we know, trust is earned in droplets and lost in buckets.
Since trust has been so core to what we do, we are maybe more thoughtful than a company that has a heavier level of churn or shorter customer lifetime.
But, in particular, Tejada is excited about the opportunity for generative AI to become a consumable interface for a broader set of users in the enterprise - opening up access to the platform to different parts of the organization. She said:
If you think about it, today you kind of need to be a developer or at least a reasonably technical person. You’ve got to understand the PagerDuty interface, you’ve got to understand what it means to be a responder to leverage some of the information that a rich incident record can provide you. But if you can engage through a prompt, it's a lot simpler, right?
However, generative AI doesn’t mean that more code won’t be created or that more software won’t need to be managed. She added:
I think one of the misnomers, around generative AI, is that we're not going to need as many software developers. And we know that in every step change new technology has created capacity for developers - cloud, distributed compute, microservices - we just built more and more interesting software.
So I think what you're going to see is that generative AI is going to create more efficiency for software developers, but it means that we'll produce more software. And that means more complexity, which means more requirements for PagerDuty. So that's why I feel pretty bullish about TAM expansion as a result.
However, the advent of generative AI doesn’t come without its concerns, and Tejada is keen to see governments respond quickly to implement standardized regulations so that people and organizations know what’s safe and responsible. As diginomica has noted previously, and as Tejada agrees, education is key. She said:
I mean, I do worry that we need to educate our governments, we need to educate enterprises, we need to educate in the classroom, around the dangers of AI and the challenges associated with it. And I think that this could be the one example where global standards could really enable us to move forward faster.
If you think about data - data residency, data security, data privacy - the amount of time and money wasted to manage all the different jurisdictional requirements for data management….if we could just settle on a global standard, we could pour all of that energy into innovation.
There will be unethical parties that will use generative AI to their benefit. And we're going to have to think about: how do we protect against that? How do we manage it? How do we hold people accountable when they use it inappropriately? What does that look like?
A really interesting and thoughtful conversation with Tejada. It was refreshing to have a CEO talk about the complexities and risks of generative AI, recognizing that perhaps caution is a competitive advantage when it is still so early in the game. Simple use cases to test the waters, ones that play in PagerDuty’s wheelhouse of reducing time to resolution for incidents, make sense. It is indeed true that organizations are facing increased digital complexity and are going to need more automated, sophisticated ways of responding to operational challenges. We will be following PagerDuty closely and I’ve been told I can have access to some more customers to understand the practicalities of how it works in action - more to follow.