Software developers in the Twittersphere are still debating how to define observability. An even more important question all software teams and companies have to address is, how do you know when you've achieved observability? And is observability even something you "achieve" and complete? Or is it something you practice every day?
In the days of mainframes and static operations, systems had few known failure modes, so monitoring tools were an effective way to visualize and troubleshoot failures. Fast-forward to today, and the complexity we've created in the name of speed and scale forces you to adjust how you monitor these systems. It's no longer enough to have the rear-view understanding of "known unknowns" that traditional monitoring provides through metrics, dashboards, and alerts (i.e., alert me when my server's CPU hits a specific threshold).
Because change, not stability, is the norm in distributed environments, you need to flexibly query the "unknown unknowns" of these dynamic systems. Monitoring alone isn't enough: you need to be able to find answers to questions you couldn't predict when your system was set up. In short, you need observability.
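That contrast can be sketched in a few lines of Python. The event fields, helper names, and sample data below are purely illustrative assumptions, not any vendor's API: the point is that a threshold alert answers one question fixed at setup time, while querying wide, high-dimensional events lets you ask questions no one anticipated.

```python
# Hypothetical "wide events": each captures many dimensions of one request.
events = [
    {"service": "checkout", "region": "eu-west", "version": "v2.3.1",
     "duration_ms": 1840, "status": 500},
    {"service": "checkout", "region": "us-east", "version": "v2.3.0",
     "duration_ms": 95, "status": 200},
    {"service": "search", "region": "eu-west", "version": "v1.9.4",
     "duration_ms": 120, "status": 200},
]

# Known unknown: a question decided when the monitor was set up.
def cpu_alert(cpu_percent, threshold=90):
    """Fire when CPU crosses a pre-chosen threshold."""
    return cpu_percent > threshold

# Unknown unknown: an ad-hoc question asked after the fact,
# e.g. "are errors concentrated in one version in one region?"
def query(events, **filters):
    """Return events matching every supplied dimension/value pair."""
    return [e for e in events
            if all(e.get(k) == v for k, v in filters.items())]

failing = query(events, service="checkout", region="eu-west", status=500)
```

Here `failing` narrows three thousand-odd dimensions of "what could go wrong" down to one suspect deployment, a question the threshold alert was never designed to answer.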
What you gain from observability
Observability lets you see how all of your applications and their underlying services and systems relate, so you can understand dependencies across organizational boundaries and troubleshoot and solve problems faster. Observability gives you context and helps you understand why an issue has occurred.
In a reality where your software's health directly affects the health of your customers' digital experiences and your business, observability gives you the confidence and visibility required to:
- Minimize the time to understand how systems are behaving
- Understand how system and code-level changes impact the business
- Reduce the time to surface, investigate, and resolve a problem's root cause
Most organizations seem to have gotten the message. In recent research conducted by New Relic, three-quarters of respondents (75%) agreed or strongly agreed that their "organization has a real-time view of how all systems are performing and interacting on a single platform (i.e., an observability platform)." Any reasonable person would interpret that to mean that 75% are practicing observability, right?
So why did other data indicate this is not what they're actually doing? For instance:
- Only 8% of total respondents rated their ability to know why systems and software aren't working as "very good." Knowing "why," not just "what," went wrong is a hallmark of observability.
- Three-quarters are unhappy with the time it takes to detect and fix software and systems issues (and point to an overly complex IT environment as the key factor).
- Just 4% of firms have, to a great extent, integrated their software and systems performance data with end-user browser and mobile performance data. So they have blind spots: they're unable to see the entire landscape or understand dependencies.
- The majority of firms use more than 10 tools to instrument their IT systems and, on average, have instrumented less than half of those systems. Ten tools means nine too many screens to switch between, and nine too many silos to manage.
Are you faking it?
It seems many companies claim to have observability, but their practices show otherwise: they have no outcomes that validate its presence. They may not realize it, but they're faking it.
So what does true observability look like? And is it something you achieve or something you practice? In my view, it's the latter, because change is constant. Software updates are pushed into production multiple times a day; depending on the company, daily deployments can number four, 50, or even thousands.
And gaining observability over all the interrelated and interdependent processes, systems, and applications requires ongoing vigilance.
The respondents who performed strongly across all the software excellence markers in the research offer a clue to what true observability looks like, especially when you compare their results with those of the bottom 25%, who performed poorly. Here's how to tell whether or not you're faking it.
So, there you have it: these are the clear indicators of whether or not you're practicing observability. And if you're like the leaders in the study, your business is benefiting, because the leaders outperform other firms when it comes to software and report better performance across various metrics, including financial ones.
To read more about the research and its findings, see Deeper Than Digital: Why and how More Perfect Software drives business success.