The cost of IT outages: from the sell side
- Summary:
- IT outages can cost end user organisations dear in both money and reputation. So when failure occurs, is it fair to blame the supplier for not living up to a vision of 99.999% uptime?
In part one of this short series of articles, we looked at IT outages from the buy side perspective. In this second piece, the sell side expresses its view courtesy of three IT suppliers at a recent roundtable discussion hosted by global law firm Pinsent Masons.
The Sell Side Perspective
For Simon Acott, director at Exponential-E, which works with some of the UK’s leading enterprises including Fidessa, the Interquest Group, Channel 4, Fulham Football Club and the London Capital Group, risk mitigation - cited by end user organisations as a key factor - is all about thinking ahead:
"A lot of this is about planning. You need to understand how you want to scale and why are you scaling? You've got to look at this and ask 'what's the impact of moving from one infrastructure to another?'.
"If you don't plan something properly you're going to end up missing something. Servers don't sit there and say 'I'm not having a particularly good day'. They do what you ask them to do."
Acott adds that it's important that the buy side has realistic expectations of the sell side and the uptime commitments they can reasonably expect:
"You can achieve 100% uptime possibly, but financially can you afford to achieve that? If I said to you that the IT system will not fail for four or five years, the reality is that I'm going to be wrong. It will fail.
"Achieving 100% uptime is either not going to happen or it's going to be staggeringly expensive. It's finding the balance between the two.
"What do you really need to achieve and how much money are you prepared to spend on that, based on the relative risk of what happens if you don't? How far are you prepared to go to find the balance between risk and reward?"
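To put the uptime percentages in context, the arithmetic behind them is simple. The sketch below is not from the roundtable: it assumes a 365-day year, and the function name is purely illustrative. It converts an availability target into the downtime it permits each year, which shows why each extra "nine" is so much harder to buy.

```python
# Illustrative sketch only: converts an availability target into the
# downtime it allows per year. Assumes a 365-day year (no leap years).
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the maximum minutes of downtime per year for a given uptime target."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.99, 99.999):
        minutes = allowed_downtime_minutes(target)
        print(f"{target}% uptime allows about {minutes:.1f} minutes of downtime a year")
```

On these assumptions, 99.999% uptime leaves a little over five minutes of downtime a year, while 99% leaves more than three and a half days.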
Risk averse?
Damian Saunders, director of the cloud platforms & networking group at Citrix, cautions that while risk mitigation is essential, end user organisations must not become risk averse:
"In the past, IT was seen as a back-office function which was very much mission focussed. Today, it’s a business critical mechanism that has the power to drive business success and enable innovation, but only if viewed as so by senior management.
"Many businesses follow a risk-averse policy to embracing technology trends, e.g., the public sector, but in doing so may inadvertently be leaving themselves exposed to risk.
"Being a tech-laggard may seem risk-averse but it leaves the business blind to assessing how they manage their technology, and any new solutions which may provide better protection for their business against IT outages.
"A business can be caught out by the speed of change which occurs in the technology industry, leaving them exposed to risk. Having the right management matrix in place could better protect businesses against an unforeseen IT outage.
"The presence of an IT executive in the boardroom is critical to this."
He adds that planning for failure is critical to the prevention or mitigation of such failures and that this needs to be an ongoing process:
"By assuming failure, and planning for it, a business can put more controls in place around change management, knowing in advance where its high risk pain points are, and isolating them.
"If a business knows how quickly it can get systems back up and is able to quickly determine how an outage happened, the impact of an outage can be mitigated.
"It’s important to note that creating a build library to kick in in the event of an outage is not a one-time event. It needs to be reviewed and evaluated constantly to ensure failovers - server VM, data center, ISP, as appropriate - are up to date, and business pain points are current."
Disaster recovery
Having an up-to-date recovery plan is essential, agrees David Bickerton, vice-president EMEA for the ITO Portfolio at HP Enterprise Services:
"Time is a critical factor in an outage. As soon as something ceases operation, there will be a backlog of processes that will mount up. The longer the outage, the more difficult it becomes to recover and untangle these processes; the situation becomes exponentially worse.
"To give an example in the banking world – an individual has numerous incomings and outgoings attached to their personal bank account – salary, direct debits, mortgage payments, etc. In the event of an outage these payments are delayed, which has potentially severe knock-on effects for all involved.
"It is critical for a business to document its systems and recovery processes. Having a recovery team in place and ensuring strong internal knowledge of processes will tie directly to how much impact an IT outage will have."
Bickerton suggests that the conversations between buy and sell side are still held back by financial considerations rather than long term strategy:
"Suppliers are getting smarter about risk and are now having grown-up conversations with their customers about the subject. It seems the sticking point is the internal cost pressure placed on customer organisations to do more, with less.
"Often, CIOs report into the Finance Director, who tends to focus more on the price tag of an IT solution, rather than the strategic benefit it could bring to the company – such as planning for an outage. Economics is often regarded above all else - until something goes wrong."
But he adds:
"I don't like the notion of blame. It's the wrong kind of conversation. The CIO has a really interesting challenge, especially in the economic world we live in today when there is forever that cost pressure.
"I think that the CIO's challenge is to make sure that accountability is discharged all the way through the supply chain, probably with a greater degree of accuracy than has been done in the past."
On the subject of service level agreements, Bickerton advises rethinking their role:
"SLAs have traditionally been used as a control framework and you probably find more of a penal framework described. You've got to come back to this: what are you trying to achieve? I don't think any supplier wants to sign up to SLAs it can't achieve.
"But I think there is a tension there between what you are trying to design for and how you want to control it. In my 25 years in IT, I've seen a lot of discussions around SLAs and quite often adult supervision goes a long way towards people actually understanding what risk they're trying to transfer and what they're trying to manage."
This is where the assistance of external legal advisors can be essential, he concludes:
"A big way legal can help the client community is in understanding the transfer of risk. It's really important for any client to understand what risk they are transferring. If the pressure is cost alone, that's the wrong reason."
In the final part of this series, we explore the legal and contractual aspects of IT outages.