Adoption of cloud computing has proceeded at a moderate pace despite a plethora of cloud service providers offering a wide range of cloud computing services. One of the primary reasons is concern over loss of data in the cloud, where cloud service customers have no control over where their data is stored or how it is secured. Nevertheless, there are measures that significantly reduce, if not rule out, the possibility of data loss in the cloud. The onus of taking these measures lies with both the cloud service provider and the cloud service user.
In the sections below, we review aspects of cloud data loss and data unavailability, citing some real-life cloud outages observed in recent years. As we will discuss shortly, the approach taken to features such as data backup, redundancy and protection may itself lead to data loss or data unavailability.
Cloud Storage – Data unavailability
Cloud storage is the concept of hosting networked storage at a service provider’s data center(s) and making it accessible from anywhere over the internet through web-based interfaces (APIs) on a pay-per-use model. Storage in the cloud removes the overhead of managing storage infrastructure in-house. However, there is an inherent risk that data hosted in the cloud becomes unavailable, even at the best service levels. This unavailability may be transient, where the data is temporarily inaccessible, or permanent, where the data is lost completely because of a severe outage at the cloud storage provider’s premises.
Two kinds of issues cause data loss or data unavailability. The first is natural calamities such as lightning, storms or earthquakes, which cause power failures, external network connectivity failures and the like, and can damage cloud data centers and the data they hold. The second is human error in configuration or maintenance operations, or unhandled failure scenarios in the automated scripts that are meant to support failure recovery. In either case, the bottom line is the impact on the customer’s business, minor or major, depending on the kind of cloud storage service being used.
Cloud service outages
Almost all major cloud storage vendors have reported outages in the recent past. Most of them originated at the cloud service providers’ premises, were transient with no loss of data, and were caused by configuration errors or hardware failures (for example, disruptions to e-mail services provided by Google, Microsoft and Yahoo, and downtime of online services provided by AWS, Intuit and Salesforce.com). However, there are instances of outages causing permanent loss of user data (refer to the related InfoWorld article on Microsoft losing Sidekick users’ personal data). Such occurrences raise questions about the viability of cloud computing. Nevertheless, in most cases, implementing best practices exhaustively across the cloud service providers’ premises, together with periodic audits to ensure compliance, will drastically reduce the likelihood of outages due to configuration errors.
A recent outage at an AWS data center in Europe was traced to a power failure: the backup power supply did not start in time because of a technical failure, which disrupted services. A large number of storage volumes had to be recovered from offsite storage, so restoring services took considerable time. The lesson is that the components meant to provide high availability can themselves fail and cause data unavailability. A cloud storage site should be equipped with additional backup power to prevent failure in case of a power outage.
Cloud based backup
A popular use case of cloud storage is data backup. It has cost advantages: it relieves the user of infrastructure maintenance overhead and allows storage to scale elastically with requirements, without procuring any hardware or software. However, there are limitations that need to be addressed. When the data to be stored in the cloud is of the order of hundreds of gigabytes, the organization (or even the individual user, for that matter) needs to reconsider the decision to use cloud storage. Internet service providers often limit bandwidth or cap total data usage. In such a situation, restoring that much data over a limited-bandwidth internet connection can take so long that it defeats the whole purpose of cloud backup. Hence the size of the data and the restore time are important factors to take into account before using cloud-based backup services.
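The restore-time concern can be checked with simple arithmetic before committing to a cloud backup service. The sketch below is only an illustration: the function name restore_hours, the efficiency factor (the assumed share of the nominal link speed that is actually usable) and the sample figures are assumptions, not values from any provider.

```python
def restore_hours(data_gb: float, link_mbps: float, efficiency: float = 0.8) -> float:
    """Estimate the hours needed to download data_gb gigabytes over a link.

    data_gb    -- backup size in decimal gigabytes (1 GB = 10**9 bytes)
    link_mbps  -- nominal download speed in megabits per second
    efficiency -- assumed usable fraction of the nominal speed (hypothetical default)
    """
    if data_gb < 0 or link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("size must be >= 0, speed > 0 and efficiency in (0, 1]")
    total_bits = data_gb * 8 * 10**9                  # gigabytes to bits
    usable_bits_per_second = link_mbps * 10**6 * efficiency
    return total_bits / usable_bits_per_second / 3600  # seconds to hours


if __name__ == "__main__":
    # Illustrative figures only: 500 GB restored over a 10 Mbps connection.
    print(f"{restore_hours(500, 10):.1f} hours")
```

With these illustrative figures the estimate is roughly 139 hours, close to six days, and that is before any usage cap imposed by the internet service provider is taken into account.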
To make good use of the cloud storage model, different approaches can be considered. First, one can keep a copy of the data on local disk before pushing it to cloud storage for backup. This ensures that the backed-up data is readily available on local servers for quicker restores when needed. Second, in addition to using cloud storage from one vendor, it is advisable to replicate the backup data to a second cloud storage vendor, to cover the case of data loss at the first vendor, however remote that possibility may be. These options drive costs up, but it is worth the effort to have a readily available data recovery option.
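The two approaches above can be sketched together as follows. This is a minimal illustration, not a real vendor integration: backup, directory_uploader and the vendor names are hypothetical, and each "vendor" is simulated by a local directory where a real setup would call the vendor's own upload API.

```python
import shutil
from pathlib import Path
from typing import Callable, Dict

# An uploader takes the path of a file and sends it to one storage vendor.
Uploader = Callable[[Path], None]


def directory_uploader(target_dir: Path) -> Uploader:
    """Hypothetical stand-in for a vendor client: 'uploads' by copying into a directory."""
    def upload(file_path: Path) -> None:
        target_dir.mkdir(parents=True, exist_ok=True)
        shutil.copy2(file_path, target_dir / file_path.name)
    return upload


def backup(source: Path, local_dir: Path, uploaders: Dict[str, Uploader]) -> Dict[str, bool]:
    """Keep a local copy first, then replicate it to every configured vendor.

    Returns a mapping of vendor name to True (upload succeeded) or False (it failed).
    """
    local_dir.mkdir(parents=True, exist_ok=True)
    local_copy = local_dir / source.name
    shutil.copy2(source, local_copy)  # local copy for quick restores

    results: Dict[str, bool] = {}
    for vendor, upload in uploaders.items():
        try:
            upload(local_copy)
            results[vendor] = True
        except Exception:
            # Broad catch on purpose: one vendor failing must not stop the others.
            results[vendor] = False
    return results
```

A caller would pass something like {"vendor_a": directory_uploader(Path("a")), "vendor_b": directory_uploader(Path("b"))} and inspect the returned mapping; any vendor marked False is a candidate for a retry.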
Cloud storage services are typically offered with multiple levels of redundancy at different price points. Based on the type and criticality of the data being stored in the cloud, organizations need to decide on the level of redundancy they need. Higher redundancy comes at a premium, while reduced redundancy generally means lower cost but a greater risk of data loss. Hence organizations should strike a balance between cost and risk.
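One simple way to reason about this balance is to compare the expected yearly cost of each redundancy level: the storage price plus the probability of loss multiplied by what a loss would cost the business. The sketch below assumes that model; the tier names, prices and loss probabilities are invented for illustration and do not describe any real service.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Tier:
    name: str
    price_per_gb_year: float        # hypothetical storage price
    annual_loss_probability: float  # hypothetical chance of losing the data in a year


def expected_annual_cost(tier: Tier, data_gb: float, loss_cost: float) -> float:
    """Storage price plus the risk-weighted cost of losing the data."""
    return tier.price_per_gb_year * data_gb + tier.annual_loss_probability * loss_cost


def cheapest_tier(tiers: Iterable[Tier], data_gb: float, loss_cost: float) -> Tier:
    """Pick the tier with the lowest expected annual cost."""
    return min(tiers, key=lambda t: expected_annual_cost(t, data_gb, loss_cost))


# Invented example tiers, for illustration only.
TIERS = [
    Tier("standard", price_per_gb_year=0.30, annual_loss_probability=1e-6),
    Tier("reduced", price_per_gb_year=0.20, annual_loss_probability=1e-3),
]

if __name__ == "__main__":
    for loss_cost in (10_000, 1_000_000):
        choice = cheapest_tier(TIERS, data_gb=1000, loss_cost=loss_cost)
        print(f"loss cost {loss_cost}: choose {choice.name}")
```

Under these assumed numbers, data that is cheap to lose favors the reduced tier, while data that is expensive to lose favors the standard tier, which is the cost versus risk balance described above.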