Stopping your autonomous cloud getting a head-cold with Dynatrace

By Martin Banks, February 24, 2019
Dynatrace has refined a tool it first demo’d last year to the point where it can track the process steps – and identify any problems – of anything from a single customer’s failed interaction with a web-based e-commerce app through to a complete business management environment running over distributed edge services that include heavy data analytics and IoT systems monitoring.

Customer experience is important, no doubt about that, but it also risks ending up a unidirectional process. Vendors take their best guess at what users need from an application or service, based on what those users say to sales staff (if they ever meet them), post on social media sites or submit in response to occasional customer satisfaction surveys. But maybe the real trick is to acquire a more direct and deeper understanding of what users actually do with those applications and services, especially when they don’t perform as expected.

And if such an approach can be made to work, there could be scope to take it a good deal further than just an individual customer’s problems. The same techniques could be applied to complete business processes, end-to-end across the enterprise.

This is what Dynatrace was pitching a year ago when it gave a first airing of Session Replay at its annual conference. Joining the company’s product roster following the acquisition of Qumram, this tool could take a log file of an application’s interaction with an individual user and track through that interaction a step at a time. In this way it became possible to tell what actions the user took, and what the application’s responses and actions were as a consequence.
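To make the step-at-a-time idea concrete, here is a minimal Python sketch of replaying a recorded session. It is only an illustration of the general technique: the `Event` record layout, the `replay` and `first_failure` helpers and the sample session are all assumptions made up for this example, not Dynatrace's actual data format or API.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional, Tuple


@dataclass
class Event:
    # Hypothetical record layout; not Dynatrace's actual session format.
    seq: int          # position of the event within the session
    actor: str        # "user" or "app"
    detail: str       # what the user did, or how the app responded
    ok: bool = True   # False when the app reported a failure


def replay(events: List[Event]) -> Iterator[Tuple[Event, Optional[Event]]]:
    """Yield each user action with the app response that followed it, in order."""
    pending: Optional[Event] = None
    for event in sorted(events, key=lambda e: e.seq):
        if event.actor == "user":
            if pending is not None:
                yield pending, None      # earlier action never got a response
            pending = event
        elif pending is not None:
            yield pending, event         # pair the action with its response
            pending = None
        # App events with no preceding user action are ignored in this sketch.
    if pending is not None:
        yield pending, None


def first_failure(events: List[Event]) -> Optional[Tuple[Event, Optional[Event]]]:
    """Return the first step that failed or went unanswered, if any."""
    for action, response in replay(events):
        if response is None or not response.ok:
            return action, response
    return None


# Made-up session for illustration only.
session = [
    Event(1, "user", "open product page"),
    Event(2, "app", "page rendered"),
    Event(3, "user", "click 'Buy now'"),
    Event(4, "app", "payment form shown"),
    Event(5, "user", "submit payment form"),
    Event(6, "app", "card number rejected", ok=False),
]

for action, response in replay(session):
    outcome = response.detail if response else "no response"
    print(f"{action.seq}: {action.detail} -> {outcome}")
```

Calling `first_failure(session)` on this sample returns the payment submission and its rejected response, which is the point at which a help desk agent or developer would start looking.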

There were immediate advantages for improving customer experience to be seen. If a customer emailed or called in with a service complaint, for example a failure in an attempt to buy a product online, tracking through every customer action could readily identify a spot of ‘finger trouble’, which in turn could lead to a response such as ‘ahh, I can see what you did wrong, so you need to remember the steps are X, Y and then Z. But while I have been talking to you I have made the necessary changes and your purchase has now gone through and should be with you tomorrow’.

The result is one satisfied, and hopefully impressed, customer, or at least that should be the goal.

Going wider, deeper

But there are wider, and deeper, implications to such an approach that Dynatrace sees becoming available. Taking the above example, imagine a different result: the customer’s actions were not in error, but instead hit on a bug in the application code. The impact on the customer experience is the same, but the impact on the business could run deeper, and be felt more consistently.

Indeed, the ability to replay whole business processes slowly, identify problem areas and re-design them so that they work more smoothly, more consistently, faster and with fewer (hopefully zero) errors not only creates a better customer experience, but also improves the efficiency of the process itself, with all the potential benefits of cost savings and more reliable operations. And ‘business process’ need not mean just one specific task such as checking customer credit card details. With the addition of AI tools, it can grow to cover all aspects of business management.

To that end, Dynatrace not only recently launched the production version of Session Replay, but included it as part of an upgraded version of Davis, its AI-based operations management toolset. It includes fully automatic deployment and continuous auto-discovery, which allows it to fit in with fast-changing, dynamic environments rather than being just a point solution. Problems can therefore be tracked, end-to-end, through an entire cloud stack. This, the company claims, makes it particularly useful for exposing problems and errors in DevOps environments.

It is also an integral component of the company’s Digital Experience Management (DEM) solution, working together with Real User Monitoring and Synthetic Monitoring tools. Here, the Davis AI engine is exploited to help users understand where problems lie and who is affected.

Davis, in turn, has been upgraded to ingest custom metrics, data and events from third-party solutions such as CI/CD and ITSM tools, including F5, IBM DataPower, Citrix NetScaler, ServiceNow, Puppet, and Chef. It also has new algorithms for detecting performance variations without the need for thresholds or baselines. Better grouping of disparate alerts and single root cause determination now come from full stack, high fidelity data analysis.

It is also now possible to use Davis as the driver of auto-remediation workflows and self-healing tools using its range of problem identification, impact analysis and root cause analysis tools.

Getting edgy

The ability to scale the capabilities of Davis and Session Replay comes at a time when the need to virtualize and distribute compute capabilities around corporate networks – increasingly virtualizing the data center – is starting to gain momentum.

Their availability should, therefore, give users the tools needed to check the accuracy and efficacy of business processes running across these complex distributed environments, as well as playing a major role in identifying exactly where a process is either broken or open to performance improvement or optimization.

Bernd Greifeneder, SVP and CTO, explained that this capability stretched to coverage of IoT devices, giving users the ability to examine, end-to-end, business processes and operations that mix IoT device input, such as production system sensors, with business management applications such as ERP and CRM systems.

“We wanted to launch Session Replay exactly with that kind of scalability in mind, for those high volume end users. Making this ready for such large enterprise workloads is what took us a year till now. It is clearly the end user experience that is the important factor these days, it is no longer just about e-commerce. That is a key trajectory that our customers have. They need to become agile, in order to be quicker with their online services, quicker in order to have better functionality, and deliver a better user experience.”

Greifeneder sees far wider application of Session Replay in its released version than when it first appeared last year. Then it seemed primarily targeted at those tasked with applications bug-hunting, and it will still certainly have an important role to play there. But he also sees it being a big part of the applications development process, with user experience teams, biz dev ops teams, and business analysts joining its user community.

The ability to analyse specific user groups and how they operate will allow businesses to filter and segment them, then identify what changes may be needed to help improve applications usability, which is a key aspect of improving end user experience. And he did observe that while the obvious target is the end user consumer, the concept of user experience extends right across businesses, including their own staff.

“This is what we are calling Behavioural Intelligence, and is a trajectory we are following, built on what Session Replay gives us.”

He indicated that this has not yet been released as a complete package, but acknowledged that it will go much further than analysing consumer behaviour and their interactions with web applications. It would then make extensive use of the updated Davis AI system to provide very precise control and management of increasingly autonomous cloud environments – which themselves will become the only practical way of running large, distributed, hybrid cloud-based services. These will have to be autonomous and highly automated, and will require equally autonomous and automated health protection and healing services if they are to survive in operation.

My Take

When I saw that first demo of Session Replay a year ago I could see how it might empower Level One help desk staff to really help customers and, regardless of where a process had screwed up, successfully unscrew it and create a happy customer. That role is still there, but the new goal of building a body of behavioural intelligence about customers, staff, and the systems themselves could give businesses a tool with which to hone their business processes and really polish up their reputation with users. And as cloud environments get bigger and more amorphous, the idea that humans will be able to ‘hand-manage’ them becomes more laughable. These systems will need to be autonomous in management decision-making and automatic in execution of those decisions. And when things do go wrong, having something that can trace the wrongness, identify it and manage its rectification will be essential. Doing THAT by hand in the cloud will be a killer for any business.