Dude where's my data? Mitigating against the cloud application nightmare scenario
- Summary:
- Brian Sommer provides answers to the cloud application nightmare scenario that almost no vendor wants to discuss.
Background
When you subscribe to cloud application software services, you will hear a lot about the security of your information and how the vendor can recover it should something happen to their primary data center or the hard drive containing your information. Most cloud application vendors have all kinds of redundancy built in to cover minor-to-catastrophic events that are mostly under their control or part of the cloud provisioner’s service mix. I don’t think it’s enough.
It’s great that they have failover centers in case a colossal earthquake, fire, hurricane, etc. takes down their primary data center. Many firms promise you’ll be back up and running in 2-24 hours with a maximum data loss of fewer than 2 hours of information entry. That’s a good thing.
And some vendors can help you in the event that some current employee fat-fingers something and accidentally deletes your customer master file. They may charge you an arm and a leg for this service (and it could take two weeks to fix!), but it can be done.
But most vendors I’ve reviewed offer little more. This is worrisome.
Current remedies
What do vendors recommend your firm do? Most vendors suggest that you write scripts or custom programs that create flat-file (CSV or some other extract) dumps of your data every day or every few hours. Your firm has to write these programs because few vendors seem to offer them.
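To make the "write your own extract" advice concrete, here is a minimal sketch of such a dump script. It assumes the records have already been pulled out of the application as dictionaries; how you fetch them (an export API, a report, a vendor connector) differs by vendor and is not shown. The names `dump_to_csv` and `extract_filename` are hypothetical, chosen only for illustration.

```python
import csv
import io
from datetime import datetime, timezone


def dump_to_csv(records, fieldnames, out):
    """Write an iterable of dict records to a CSV stream and return the row count.

    Fields not listed in `fieldnames` are ignored; missing fields are left blank.
    """
    writer = csv.DictWriter(out, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    count = 0
    for record in records:
        writer.writerow(record)
        count += 1
    return count


def extract_filename(entity, now=None):
    """Build a timestamped file name so each scheduled run keeps its own copy."""
    now = now or datetime.now(timezone.utc)
    return f"{entity}_{now:%Y%m%dT%H%M%SZ}.csv"
```

A scheduler (cron or similar) would call this every day or every few hours, opening a file named by `extract_filename("customers")` and passing it as `out`. Note what the sketch does not do: it produces a copy of the data, but says nothing about how that copy gets back into the application.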
That seems impractical…
Some vendors suggest you get software from a third party that will back up your data for you. What I’ve deduced is that these third-party solutions do the backup just fine. It’s getting the data back into the application that’s a mystery.
For those of you who grew up dealing with database backup and recovery, you already know about concepts like rolling back updates, sync/check points, rolling forward updates, etc. You’re probably NOT going to see this in the cloud world.
That seems odd…
Then, you find out that many products don’t come with a recovery program either. You have to reprocess all your data again using the same programs that handled the data originally. You can’t rebuild or repopulate a database. In fact, you will probably have no idea what the underlying structure of the persistent storage database looks like. You quickly discover that your ‘backups’ need to look like online transactions and be reprocessed as such.
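What "backups that look like transactions" means in practice can be sketched as a replay loop. This is an illustration only: the log format (one JSON object per line with `seq`, `type`, `key` and `data` fields) and the two transaction types are assumptions, and a plain dictionary stands in for the application, which in real life would be reached through the vendor's normal entry programs or API.

```python
import json


def apply_transaction(state, txn):
    """Apply one recorded transaction to an in-memory stand-in for the application."""
    kind, key = txn["type"], txn["key"]
    if kind == "upsert":
        state[key] = txn["data"]
    elif kind == "delete":
        state.pop(key, None)
    else:
        raise ValueError(f"unknown transaction type: {kind}")


def replay(log_lines, up_to_seq=None):
    """Rebuild state by reprocessing logged transactions in sequence order.

    If `up_to_seq` is given, stop after that sequence number, for example
    just before a known bad deletion.
    """
    txns = sorted(
        (json.loads(line) for line in log_lines if line.strip()),
        key=lambda t: t["seq"],
    )
    state = {}
    for txn in txns:
        if up_to_seq is not None and txn["seq"] > up_to_seq:
            break
        apply_transaction(state, txn)
    return state
```

The cost of this approach grows with the length of the log: every transaction ever recorded has to be pushed through again, which is why replay alone is a poor substitute for a proper database restore.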
This situation gets downright ridiculous if your firm is some monster entity with scores of transactions processed every single day. What if there’s a major recovery event needed ten years down the road? Are you supposed to reprocess ten years of transactions? Why can’t you restore your databases? Some vendors will do that on an ad-hoc basis but shouldn’t that be part of their offering? Shouldn’t that be part of the peace of mind assurance they offer?
If not then that's wrong…
Ironically, the utilities that the vendors have to recover their customers’ databases are rarely available for customers to use themselves. While I get that customers shouldn’t need database-level access to the cloud solution, how can a customer recover quickly without these utilities? Likewise, how can a customer rapidly create a new instantiation of their data without these tools?
That is impractical too…
Getting practical
Let’s take this out a couple of realistic steps. What’s a customer supposed to do when, for example, the vendor suddenly goes bankrupt or out of business? This week, two web services I use announced they are calling it quits. One gave me 12 days’ notice to do something different. The other service gave me none.
In the on-premises world, customers could at least demand a copy of all source code and, failing that, could stipulate that an escrow service maintain a copy of the software for the customer should the firm fail. This meant that, should the vendor disappear, the software and data remained in the customer’s data center. While upgrades were no longer forthcoming, there was no interruption to the business. Plus, there would be time to seek another solution.
When a cloud application software vendor fails, the customer is screwed. They have no copy of the code upon which to fall back. Worse, they have no fast way to port the information to another vendor’s solution. The business is disrupted – possibly in a fatal way.
Now, you might argue that this scenario is far-fetched or exceedingly rare. Maybe, but what executive would assume such risk without a contingency plan?
An escrow solution is needed and now!
How about this scenario? A cyber terrorist organization or state-sponsored cyber thug takes your provider down. When a vendor’s site is taken down, or your data is compromised, your firm will want to be operational again fast. Where else can the vendor’s software operate other than the vendor’s primary and failover sites? The grim answer in many cases: nowhere else.
Software vendors need to create distinctly different (and somewhat disconnected) sites that are like failover sites of last resort. These should be independent of the vendor. How well vendors can support this seems to vary – check it out.
In another example, your vendor’s solution uses in-memory database technology as its primary processing/storage mechanism. The vendor may use a persistent storage database to hold older or slower-moving data, or a backup copy of the data. If you want to make a copy of your firm’s data in this environment, understand that it might be a huge data file and that some latency will exist between the data in use (in-memory) and the archival information on the persistent drives.
While none of this should be a major technical issue, how’s your IT shop going to know that it is backing up the right data and where it should be getting this data? This is especially tricky if the vendor has no utility to give customers for this backup activity.
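One rough way for an IT shop to check that a backup holds the right data is to fingerprint both the source records and the backup and compare the results. The sketch below assumes you can read the same set of records from both sides at a consistent point in time, which, given the latency described above, may itself require vendor help. The function names and the `id` key field are hypothetical.

```python
import hashlib
import json


def fingerprint(records, key_field="id"):
    """Return (row count, SHA-256 hex digest) over records in a canonical order.

    Records are sorted by key and serialized with sorted field names, so the
    result does not depend on the order in which rows or fields arrive.
    """
    digest = hashlib.sha256()
    count = 0
    for record in sorted(records, key=lambda r: str(r[key_field])):
        digest.update(json.dumps(record, sort_keys=True).encode("utf-8"))
        digest.update(b"\n")
        count += 1
    return count, digest.hexdigest()


def backup_matches(source_records, backup_records, key_field="id"):
    """True when both sides have the same row count and the same content digest."""
    return fingerprint(source_records, key_field) == fingerprint(backup_records, key_field)
```

A mismatch does not say which side is wrong. It only tells you that the backup and the live data diverged, which is the cue to investigate before you need the backup in anger.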
And, suppose you’ve got to restore or re-populate one of these in-memory databases. How long would that take and how do you do it? These are questions that need more practical answers.
Finally, in the nightmare scenario setup, you have a malicious or disgruntled (ex-)employee who goes online and deletes large numbers of records, sub-ledgers, master files, etc. How do you recover all of this? The stock answer from several vendors is to call the vendor as soon as you detect the problem. That might work, and the vendor might charge a lot (or take their time resolving it). One vendor challenged me, asking who would ever let a person like that get Administrator-level privileges. I see it happen often enough to know it needs protection.
Counsel
Before you sign that next cloud application subscription deal, do the following:
- Understand what the vendor provides and, more importantly, what they think you’re on the hook for personally.
- Determine if you’ll need to acquire specialized backup and recovery tools. Backup-only tools solve just a small piece of the problem. If you can copy the data, how do you get it back into the system, and how do you know which records you need to recover within that file?
- Get contractual assurances as to how you’ll access both your data and the application software in the event of a catastrophic event. Hopefully, you trust your cloud application provider. It’s the miscreant hacker and the disgruntled or careless worker that you need to guard against.
- Develop some potential backup and recovery scenarios to see how well and how elegantly the vendor can address each. Be sure to consider how much data you’ll have in the system over time and how many new data types (e.g., image files, scanned documents, sensor data, etc.) will also need backing up and recovering.
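For the question of which records in a backup file actually need recovering, a simple keyed comparison is one place to start. The sketch below assumes both the backup extract and the current data can be loaded as lists of dictionaries sharing a unique key field; `records_to_recover` and the `id` field are hypothetical names used for illustration.

```python
def records_to_recover(backup, current, key_field="id"):
    """Compare a backup extract with current data, keyed by `key_field`.

    Returns (missing, changed): backup records that no longer exist in the
    current data, and backup records whose current contents differ.
    """
    current_by_key = {record[key_field]: record for record in current}
    missing, changed = [], []
    for record in backup:
        live = current_by_key.get(record[key_field])
        if live is None:
            missing.append(record)
        elif live != record:
            changed.append(record)
    return missing, changed
```

The `changed` list needs human review: a record that differs from the backup may have been tampered with, or it may simply have been edited legitimately after the backup was taken.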
Image credit: Industry Leaders