Disaster recovery and the cloud should be a match made in heaven. Take a function that enterprises love to hate and address it with an outsourced, efficient cloud service that makes it easier and less expensive to reach recoverable nirvana, and presto - instant success. Well, not so fast.
During the past 12 to 18 months, analysts say the disaster recovery as a service (DRaaS or RaaS) market has exploded, with legacy DR vendors pivoting to offer cloud-based services and a crop of new pure-play cloud providers leveraging automation and shared resource efficiencies to break into the game.
But with a nascent technology in a critical industry, analysts say enterprises should tread carefully. Some question whether cloud-based DR systems have proven they work at scale and wonder if they've yet earned their stripes in helping companies recover from major disasters. Plus, there is a whole new crop of challenges and technical specifications that need to be considered when implementing a cloud-based DR strategy compared to traditional methods. All that adds up to a big caution sign for the enterprise.
DR has gone through an evolution during the past few decades. Collocation and managed service offerings began replacing tape-based DR solutions in earnest more than a decade and a half ago, says Forrester DR analyst Rachel Dines. And just in the past few years cloud-based services have emerged. But are cloud services really new, or just existing managed DR services redefined using the latest buzzwords?
DRaaS is fundamentally different from managed hosted DR options in two key ways, Dines says: automation and multi-tenancy. DRaaS is almost fully automated, she says, requiring little human intervention to kick off the recovery process, provision the virtual machines and spin up the necessary applications from the correct storage bins. This is in contrast to a managed service DR plan in which there may be a checklist of technical steps a system admin has to perform. That automation, she says, can significantly improve the speed of recovery and reduce the personnel needed to execute a recovery plan after a disaster.
The second difference is the multi-tenancy aspect. When providers can support multiple customers from the same server racks, it creates efficiencies, which lead to generally lower prices compared to managed DR services on dedicated infrastructure for individual customers.
DRaaS began rolling out with four or five providers who retooled their systems for cloud-based offerings a few years ago, says John Morency, Gartner's DR analyst. Now the market has ballooned to more than 100 vendors, ranging from startup pure-play DRaaS vendors to legacy DR heavyweights.
Morency estimates DRaaS is a $425 million to $450 million industry, still only a fraction of the overall $3.5 billion DR market. But Dines characterizes DRaaS as "a pretty big game changer that's shaking up the industry. It's a paradigm shift offering better recovery objectives for the same or less as traditional or outsourced models."
The arrival of cloud services, she says, is coinciding with organizations looking to update or renew their DR practices to keep pace with their adoption of virtualization technologies.
The market has become an eclectic mix. Traditional DR vendors such as IBM, HP and SunGard are each now offering cloud-based DR, while some other big players have taken to buying up or partnering with specialists. Dell, for example, teamed with Nirvanix, and CenturyLink scooped up Savvis to get into the cloud and DR market. Then there are the pure-play vendors and startups that have focused their business on DRaaS, including Geminare, Bluelock, Doyenx, eVault, nScaled and Hosting.com, among others.
The new cloud recovery model, however, brings new challenges. One of the biggest is bandwidth, especially for customers that have highly transactional apps, Dines says. DRaaS involves placing copies of applications and virtual machine images in the provider's cloud. When a disaster strikes, those apps and VMs are automatically brought out of storage and spun up. This allows users to avoid paying for reserved instances of VMs, or for the dedicated infrastructure of a managed service model. The problem is those apps and VM images must be constantly updated. "You need to keep your DR site up to date, and that could mean moving large amounts of data daily," Dines says.
Apps with high rates of change, anything above 20% daily, may be uploading a lot of data to the cloud each day, she says, which could strain bandwidth. Because of this, Dines says the most common apps supported by DRaaS tend to be less complex ones that can be easily booted cold from a VM, and especially ones that already run on virtual machines. CRM, ERP and HCM apps may fit the bill, but highly transactional databases may not.
To deal with the bandwidth issue, many providers team with a WAN optimization firm, or use forms of caching to send only selective updates to the cloud, Dines says.
Richard Cocchiara, IBM's CTO for business continuity and resiliency services (BCRS) - one of the legacy vendors that is pivoting to offer cloud-based DR services - says the company works with customers to provide estimates of how much data upload/download will be needed as part of the vetting and customer acquisition process.
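As a rough illustration of the kind of estimate involved, the sketch below turns a daily change rate into a daily upload volume and the average throughput needed to move it. The function names, the 2,000 GB application size and the replication windows are hypothetical, chosen only for the example; the 20% figure is the threshold Dines cites. It is not any vendor's sizing method, and it ignores compression, deduplication, WAN optimization and protocol overhead.

```python
def daily_upload_gb(dataset_gb: float, daily_change_rate: float) -> float:
    """Return the data (in GB) that changes, and must be replicated, each day."""
    if not 0.0 <= daily_change_rate <= 1.0:
        raise ValueError("daily_change_rate must be a fraction between 0 and 1")
    return dataset_gb * daily_change_rate


def required_mbps(upload_gb: float, window_hours: float = 24.0) -> float:
    """Return the average sustained throughput (megabits per second) needed
    to upload `upload_gb` within `window_hours`. Uses decimal units
    (1 GB = 8e9 bits, 1 Mbps = 1e6 bits per second)."""
    if window_hours <= 0:
        raise ValueError("window_hours must be positive")
    bits = upload_gb * 8e9
    seconds = window_hours * 3600
    return bits / seconds / 1e6


# Hypothetical example: a 2,000 GB application changing 20% per day.
changed = daily_upload_gb(2000, 0.20)
print(f"Daily upload: {changed:.0f} GB")
print(f"Spread over 24 hours: {required_mbps(changed, 24):.2f} Mbps")
print(f"Confined to an 8-hour window: {required_mbps(changed, 8):.2f} Mbps")
```

Under these assumptions the app generates about 400 GB of changes a day, which works out to roughly 37 Mbps if replication runs around the clock and about 111 Mbps if it is squeezed into eight hours.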
Another concern in the cloud is providers overprovisioning their facilities. In a DRaaS model, providers typically support customers in a multi-tenant environment. While a provider may be able to accommodate a disaster that befalls one customer, can it adequately support multiple customers if the disaster is regional? Morency says DRaaS providers just haven't proven at scale that they can handle a major 9-11 or Katrina-like disaster. It's one thing to restore 15 to 20 VMs during a test; it's another to have hundreds of customers all declaring disaster at the same time and expecting a two-hour recovery window, he says. "That can be tight for a single organization, let alone a provider serving hundreds of customers at once," Morency says.
Some companies want priority access to ensure their systems are restored as fast as possible, but Morency says many providers instead restore customers on a first-come, first-served basis. That's how Cocchiara says IBM runs its DRaaS, but he notes that the company has invested significantly in analytics to predict when disasters are likely to happen. The company prepares its data centers for such events by shutting down testing of equipment and focusing server space completely on restoration services.
Enterprises also have to consider what data and applications they're willing to have run in the cloud if they do declare a disaster. Some organizations may have compliance issues and need to ensure their data is encrypted when it's stored in the cloud, for example, Morency points out.
All these issues add up to enterprises not being quite ready to fully jump on board, Morency says. "The penetration into the larger enterprise market has been slow at best," he says. "They have their own approaches, their own methodology [to DR], and being able to port a lot of that [to the cloud], from a test and procedure point of view, is going to be less than straightforward for a lot of organizations."
That's resulted in many enterprises taking baby steps to DRaaS, Dines says, moving to the cloud highly virtualized environments of medium criticality that do not yet have robust DR plans in place. The speed derived from automation, combined with the efficiencies of a multi-tenant environment, can yield attractive cost savings for customers compared to managed hosting or collocation DR options, which is driving interest, she says. As the industry matures and advancements are made, trust will develop and customers will become more comfortable using DRaaS for more mission-critical apps, both Dines and Morency predict.
Cocchiara, the IBM rep, says the company is already seeing customers embrace the cloud. "To me, this is an exciting development and new opportunity for the industry," he says. "Before, when we had customers who said I'd love to use your service but I can't because I need to be recovered in hours, not days, now we can say we can recover them in minutes." The increased automation of DRaaS compared to managed service DR plans makes this possible, but with that comes a whole set of new issues as well, such as bandwidth, compliance and scaling. Cocchiara says that despite those issues, there has been enterprise adoption of DRaaS. Not everyone is ready to jump on board with the cloud though, he admits, which is why companies like IBM and legacy DR providers still offer the managed service and collocation options.
Network World staff writer Brandon Butler covers cloud computing and social collaboration. He can be found on Twitter at @BButlerNWW.