Thames Water is a private utility provider of water and waste management services for Greater London, also covering areas including Surrey and Kent. Whilst residents probably don’t think about Thames Water much beyond turning on the tap or paying their bills, it’s unsurprising that the company’s operations management is incredibly complex.
However, Thames Water is exercising greater control over its systems and engineers, with the aim not only of getting a grip on costs in a new multi-cloud world, but also of improving the experience for employees and customers. Less downtime and improved performance are front of mind.
This has been enabled by a decision to go from being massively outsourced to a number of well-known companies - including Accenture, IBM and Deloitte - to being largely insourced. Part of this plan was to bring the company’s ServiceNow Service Desk instance in-house, out of the hands of Accenture. The outsourcing agreement meant less flexibility, a lack of innovation and reduced insight into the running of the company, according to Philip Taphouse, Senior IT Operations Management (ITOM) Architect at Thames Water.
Thames Water had initially contracted IBM to move the utility provider to an in-house instance of ServiceNow, but that implementation layered an IBM configuration on top of the out-of-the-box platform. Taphouse joined halfway through the project and quickly recognised that Thames Water would get less value out of the IBM config than from a vanilla ServiceNow implementation, where it could take advantage of regular upgrades. He said:
The implementation of our instance onto the IBM programme wasn’t going very well, unfortunately. We dragged it over the line because contractually we had to get off the old Accenture instance. But very quickly afterwards we decided that we were going to go with a brand new, fresh build of ServiceNow, which my team led and implemented.
Once the decision was made to make a clean break, the turnaround was relatively rapid. Thames Water was on a brand new Kingston build (one of the ServiceNow releases). It had fresh ITSM, no custom workflows, no custom forms, a new catalogue, and “everything was running nicely”, according to Taphouse.
One thing led to another
Once live with the ServiceNow platform for ITSM, Thames Water recognised that it could extend this further and use the SaaS product to implement a command centre for technology operations. Taphouse explained:
The command centre is that age-old CTO vision of having one screen to look at, to see what’s going on in their universe. The end game was all about automating and orchestrating stuff.
Our aim is to shift everything into the cloud where possible - that will either end up in Azure or a hybrid cloud. The only reason something would stay in hybrid is if it runs a pre-2016 operating system, because you can’t put anything pre-2016 in Azure.
Thames Water decided to go all in with ITOM from ServiceNow, which gave it discovery, operational intelligence, cloud management, orchestration and service mapping. It also integrated the platform with AppDynamics, which, coupled together, meant that the command centre could get insight into the code and database operations too. Taphouse said:
You can then start to see in one place what your infrastructure looks like compared to your application, and where the actual error is, binding specific events down through multiple layers. So rather than just saying there’s a database on this server, you can say there is a database error on the database instance which runs on that server, with this database queue. You’ll be able to look at it all across multiple applications - they’ll be able to see the problem, jump on it and clear it.
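The idea of binding an event through the service map can be sketched in a few lines. This is purely illustrative - the CI names, the topology and the functions below are invented for the example and are not ServiceNow code - but it shows how knowing what each configuration item runs on lets a database error be traced down to its host and up to the applications it serves:

```python
# Hypothetical sketch of tracing an event through a service map. Each
# configuration item (CI) records what it runs on; CI names and the
# topology are invented for illustration only.

RUNS_ON = {                      # child CI -> the CI it runs on
    "billing-app": "db-instance-1",
    "crm-app": "db-instance-1",
    "db-instance-1": "server-42",
}

def layers_below(ci):
    """Walk down the dependency chain from a CI to the physical layer."""
    chain = [ci]
    while chain[-1] in RUNS_ON:
        chain.append(RUNS_ON[chain[-1]])
    return chain

def impacted_apps(ci):
    """Find top-level CIs (nothing runs on them) that depend on `ci`."""
    dependents = [c for c, parent in RUNS_ON.items() if parent == ci]
    if not dependents:
        return [ci]
    apps = []
    for d in dependents:
        apps.extend(impacted_apps(d))
    return apps

# A database error event raised against db-instance-1:
print(layers_below("db-instance-1"))   # down to the server it runs on
print(impacted_apps("db-instance-1"))  # applications affected above it
```

With a graph like this, the same database event yields both the specific layer at fault and the applications whose users will feel it, rather than a flat "database on this server" alert.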
Future plans also include increased automation. Taphouse said that initially human intervention will be required, to “get comfort levels up”, but that more automated decision making will be introduced. He said:
So, for example, if I did a server restart on the same server five times successfully, then you are allowed to do that without approvals. So if we can prove that a script runs successfully five or six times, then we will say, right we want to automate that.
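That kind of trust policy is simple to express. The sketch below is a hypothetical illustration - the threshold of five runs comes from the quote, but the function names and structure are invented, not ServiceNow APIs:

```python
# Illustrative sketch of the "prove it five times" automation policy
# described above. Names and structure are hypothetical, not ServiceNow code.

APPROVAL_THRESHOLD = 5  # consecutive supervised successes before auto-run

def record_run(history, succeeded):
    """Append the outcome of a supervised run to the script's history."""
    return history + [succeeded]

def auto_approved(history):
    """A script may run unattended once its last N runs all succeeded."""
    recent = history[-APPROVAL_THRESHOLD:]
    return len(recent) == APPROVAL_THRESHOLD and all(recent)

# Example: a server-restart script supervised over several runs.
runs = []
for outcome in [True, True, False, True, True, True, True, True]:
    runs = record_run(runs, outcome)

print(auto_approved(runs))  # last five runs all succeeded
```

The key design point is that a single failure resets the clock: only an unbroken recent streak of supervised successes earns the script the right to run without approval.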
Taphouse explained how this will impact Thames Water engineers, who typically had to react to problems once they happened, rather than proactively recognising an issue ahead of time. He said:
So if you’ve got sewage flowing down the street, that’s going to guide our engineers to where they are and what they’re doing when they get there. If that goes down, we’ve got 20,000 people sat there twiddling their thumbs. That’s not a great scenario.
So with AppDynamics doing all of the APM and server insight stuff, we are able to collect millions of metrics about what’s actually going on with that application - CPU threads, services that are running, memory usage, etc. So we can start to see if there are any abnormalities. We pump all of that information straight into ServiceNow, which, combined with the machine learning stuff, starts to understand what normal looks like. Then, based on stuff that’s happened before, or based on predictive algorithms, we can say ‘if this continues to happen, in the next 20 minutes this application will go down’. At that point we will send a notification out letting the engineers know they’ve got half an hour to act. We can get the right people there.
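The prediction Taphouse describes can be approximated very crudely with a trend line: fit the recent samples of a metric and estimate when it will cross a failure threshold. This is a minimal sketch under invented assumptions - real operational intelligence uses learned baselines, not a two-point slope, and the metric, window and threshold below are made up:

```python
# Hypothetical sketch of trend-based early warning: extrapolate a metric
# linearly and estimate when a failure threshold is crossed. Thresholds,
# windows and metric names are illustrative only.

def minutes_until_breach(samples, threshold):
    """samples: (minute, value) pairs, oldest first.
    Returns estimated minutes from the last sample until `threshold`
    is reached, 0.0 if already breached, or None if flat/falling."""
    t0, v0 = samples[0]
    t1, v1 = samples[-1]
    if v1 >= threshold:
        return 0.0                       # already breached
    slope = (v1 - v0) / (t1 - t0)        # units per minute
    if slope <= 0:
        return None                      # no predicted breach
    return (threshold - v1) / slope

# Memory usage (%) sampled over the last 15 minutes
memory = [(0, 70.0), (5, 76.0), (10, 82.0), (15, 88.0)]
eta = minutes_until_breach(memory, threshold=100.0)
if eta is not None and eta <= 20:
    print(f"Warning: predicted breach in ~{eta:.0f} minutes")
```

For the sample data the metric is climbing at 1.2% a minute, so the breach lands inside the 20-minute window and a warning fires - exactly the "you've got half an hour to act" notification from the quote.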
Equally, the improved uptime and better management of operations are likely to be felt by employees and customers alike. He added:
I think the benefits will be three-headed. We will have internal time saving. We will have customer experience from an internal employee experience - if we don’t lose payroll applications, fix it before it happens, etc. Our engineers can also continually move, so that should help our end customers as well - getting them there on time, on site, knowing what they’re doing.
A cloudy future
Thames Water has some very interesting plans for how it intends to use the ServiceNow platform in the future. For example, it is hoping that it can feed pipeline sensor data into AppDynamics and ServiceNow, which opens up clear opportunities for improved maintenance down the line. Taphouse said that he’d like to “open the floodgates and let everything through”.
However, what’s also interesting is that Taphouse eyes the ServiceNow platform for creating a brokerage system for a multi-cloud environment. He explained:
The next evolution, and we are working on a proof of concept, is a true cloud brokerage system. Yes we signed that big deal with Azure, but there’s no reason we can’t leverage AWS or Google Cloud as well.
If you look at Azure the storage costs are huge. So if you want a low compute powered server with four terabytes of disk, actually AWS is probably your best bet because of cost. Based on those options, we would automatically choose the best place to put it, keeping tabs on cost in real time. You can de-provision and then re-provision, all of it done through ServiceNow.
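The brokerage logic amounts to a cost comparison across providers for a given workload shape. The sketch below illustrates the idea only - the per-unit prices are made up for the example, and a real broker would pull live pricing from each provider's billing APIs rather than a hard-coded table:

```python
# Illustrative cloud-brokerage cost comparison. Prices are invented for
# the example; a real broker would use live pricing from each provider.

PRICE_TABLE = {
    # provider: (monthly cost per vCPU, monthly cost per GB of disk)
    "azure": (9.00, 0.15),
    "aws":   (9.50, 0.08),
    "gcp":   (9.20, 0.10),
}

def monthly_cost(provider, vcpus, disk_gb):
    per_vcpu, per_gb = PRICE_TABLE[provider]
    return vcpus * per_vcpu + disk_gb * per_gb

def cheapest(vcpus, disk_gb):
    """Pick the provider with the lowest estimated monthly cost."""
    return min(PRICE_TABLE, key=lambda p: monthly_cost(p, vcpus, disk_gb))

# Low-compute server with 4 TB of disk: storage cost dominates,
# so the cheaper per-GB rate wins.
print(cheapest(vcpus=2, disk_gb=4096))
```

With these illustrative prices the 4 TB, low-compute case lands on the provider with the cheapest storage, while a compute-heavy, small-disk workload lands elsewhere - mirroring Taphouse's point that the best home for a server depends on its shape, and can be re-evaluated and re-provisioned over time.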
And Taphouse warned other technology buyers that this advanced in-house management of cloud and technology is essential - or else things could get out of hand. He said:
Every CIO or CTO, if they’re not worried about cloud sprawl, they should be. How are you going to control that? That’s another massive area where we have leveraged the ServiceNow platform. The only place you will be able to go in Thames for a virtual machine will be the ServiceNow portal. If you’re not doing that kind of thing, be prepared to go and find five or six AWS/Azure engineers, quickly, which will cost you a lot of money.