As most of our readers apparently already know, the three keys to preventing incidents like last month’s big Amazon Web Services outage from bringing down your customers’ workloads are redundancy, redundancy, and redundancy.
By Rich Freeman
If you or your customers make use of Amazon Web Services, the world’s most popular public cloud, you’re probably still recuperating from the misery you endured a couple of weeks ago when a random administrator’s classic “fat finger” error brought down the S3 storage service—hard—for over 4 agonizing hours.
It was the kind of painful, high-profile incident that plays right into the firmly held conviction of cloud skeptics (and there are still a few of them out there) that solutions like S3 simply can’t be trusted with important workloads.
And maybe they have a point. After all, while multi-hour AWS outages are rare, the SLA for S3 only guarantees 99.9 percent uptime. Which is to say it’s perfectly acceptable from Amazon’s point of view for the service to go down a little over 43 minutes in a 30-day month.
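For anyone who wants to check that 43-minute figure, or run the same math against a different SLA, here is a minimal Python sketch. The function name and the 30-day month are our own assumptions for illustration; an actual SLA defines its own measurement period.

```python
MINUTES_PER_DAY = 24 * 60


def allowed_downtime_minutes(uptime_percent: float, days: int = 30) -> float:
    """Minutes of downtime an uptime guarantee permits over `days` days.

    Assumes the guarantee is measured over the whole period (30 days by
    default); this is an illustrative simplification, not SLA wording.
    """
    downtime_fraction = (100 - uptime_percent) / 100
    return downtime_fraction * days * MINUTES_PER_DAY


if __name__ == "__main__":
    # Compare a few common uptime guarantees over a 30-day month.
    for sla in (99.0, 99.9, 99.99):
        minutes = allowed_downtime_minutes(sla)
        print(f"{sla}% uptime allows about {minutes:.1f} minutes of downtime")
```

At 99.9 percent the result is 43.2 minutes, which is where “a little over 43 minutes” comes from.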
How many customers will put up with that kind of downtime? We have no idea here at ChannelPro, but figured our readers would. So we asked them (or those of them who offer cloud solutions, at any rate) in our latest reader poll, and here’s what they told us:
Now to be fair, just because Amazon and many other public cloud providers guarantee no better than 99.9 percent uptime doesn’t mean their services routinely go down for 43 minutes a month all at once. That’s a cumulative figure. But if an online solution were to go down for 43 straight minutes, it looks like about one-fourth of you believe most or all of your customers could handle it. Twice as many of you, however, think that less than half of your clients would put up with that kind of downtime, and just under 19 percent of you think that none of your customers would.
So no more cloud for most of you, right? Not so fast. As many hosters, vendors, and distributors have pointed out lately, most victims of the big S3 crash are at least as responsible for the downtime they suffered as Amazon’s clumsy (and, one fears, possibly unemployed) tech. That’s because anyone with a well-crafted cloud redundancy strategy can avoid the harmful effects of most cloud service interruptions.
Generally speaking, that means distributing workloads across availability zones, data centers, and/or cloud vendors, which sounds easy enough. But how many of you do it? Only one way to know for sure, so we asked that too, and learned the following:
Good news! The vast majority of participants in our survey keep their cloud workloads available by avoiding a single point of failure. Unfortunately, however, 28 percent of them don’t. And if you don’t either, now’s the time to do something about it. 99.9 percent availability might be good enough for a big public cloud vendor, but a lot of your customers, it seems, hold you to a higher standard.
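To put rough numbers on why redundancy gets you past 99.9 percent, here is a back-of-the-envelope Python sketch. It rests on a big assumption that we want to flag loudly: the redundant copies fail independently of one another and failover is instant. Real outages are often correlated, and switching over takes time, so treat the result as a best case rather than a promise. The function name is ours, chosen for illustration.

```python
SECONDS_PER_30_DAYS = 30 * 24 * 60 * 60


def combined_availability(availabilities):
    """Chance that at least one redundant copy is up.

    ASSUMPTION: copies fail independently and failover is instantaneous.
    Correlated failures or slow failover make the real number lower.
    """
    probability_all_down = 1.0
    for availability in availabilities:
        probability_all_down *= 1 - availability
    return 1 - probability_all_down


if __name__ == "__main__":
    single = 0.999  # one copy at 99.9 percent uptime
    pair = combined_availability([single, single])  # two independent copies

    for label, availability in (("one copy", single), ("two copies", pair)):
        downtime_seconds = (1 - availability) * SECONDS_PER_30_DAYS
        print(f"{label}: {availability:.6f} availability, "
              f"about {downtime_seconds:.1f} seconds down per 30 days")
```

Under those assumptions, two copies at 99.9 percent each work out to 99.9999 percent combined, or roughly 2.6 seconds of downtime in a 30-day month instead of 43 minutes.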