Cloud Storage Basics

This article was published by ComputorEdge, issue #2810, 2010-03-05, as a feature article, both in their PDF edition (on pages 11-12) and on their website.

Even though computer hardware has grown steadily more sophisticated, the essential process is the same now as it was decades ago, when computer operators ("computors") sat at teletype consoles and manually entered all the data, which would then be stored on punch cards and, later, magnetic tape. In every computing era, the basics are the same: data is entered into the system, saved on long-term storage devices, loaded into memory to be manipulated by programs, and the results then saved back out to storage.

Despite amazing strides in miniaturization and speed, each step of the process is just as vulnerable today as it was in the early years. Any piece of hardware in the chain can be replaced if it should fail. More importantly, the data itself may be irreplaceable, particularly if it contains creative work, such as ideas for future products, or a novel that has taken an author many years to write. If this data is not backed up, then the failure of a single hard drive can instantly wipe away a tremendous amount of work and cause untold grief for the individual who worked so diligently to create the original material but gave no thought to the mortality of hard drives and other storage media.

Even if computer owners are aware of the dangers of data loss, that awareness does them no good unless they follow through by implementing a backup strategy. This is but one reason why the computer industry has sought ways to minimize the odds of data loss by its customers. The latest major development along these lines is "cloud computing".

Not Just for Eggheads

To the average non-technical person, the phrase "cloud computing" may evoke images of angels lounging on billowy clouds, their cherubic fingers dancing on laptop keyboards instead of golden harp strings. Or, people might imagine the phrase to describe some absent-minded computer science professor, with his head in the clouds, lost in thought (or at least lost in the halls of a university building). But in reality, the "cloud" consists of all external computational resources that your PC can access and utilize — whether for storing data or performing calculations upon it. For all practical purposes, a cloud computing resource resides on the Internet.

An excellent example is any one of the Web-based data storage services, such as Amazon S3 (Simple Storage Service) and Mozy. They make it possible for you to back up your critical data, in encrypted form to keep it private, so that if your computer does experience a total hard drive failure, you won't lose all of your personal information. The exception, of course, is any change made since the last time you saved your files into the cloud.
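
To see which files would be at risk in that situation, one could list everything modified since the last backup. The following is only a minimal sketch in Python, using the standard library alone; the function name files_changed_since and the idea of keeping the last backup time as a plain timestamp are assumptions made for illustration, not features of any particular storage service.

```python
import os
import tempfile
import time
from pathlib import Path


def files_changed_since(root, last_backup_time):
    """Return files under root modified after last_backup_time.

    last_backup_time is a hypothetical value (seconds since the epoch)
    that the user is assumed to have recorded at the last backup.
    """
    changed = []
    for path in Path(root).rglob("*"):
        if path.is_file() and path.stat().st_mtime > last_backup_time:
            changed.append(path)
    return sorted(changed)


def demo():
    """Create two files, one older and one newer than the backup time."""
    with tempfile.TemporaryDirectory() as tmp:
        old = Path(tmp) / "old.txt"
        new = Path(tmp) / "new.txt"
        old.write_text("saved before the backup")
        new.write_text("edited after the backup")
        backup_time = time.time()
        # Force the modification times to fall on either side of the backup.
        os.utime(old, (backup_time - 3600, backup_time - 3600))
        os.utime(new, (backup_time + 3600, backup_time + 3600))
        return [p.name for p in files_changed_since(tmp, backup_time)]


if __name__ == "__main__":
    print(demo())
```

Any file that such a check reports is one that exists only on the local hard drive, and would be lost if that drive failed before the next backup.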

One reason the term "cloud computing" resonates with so many tech-savvy people is that it reflects the fact that the data storage and computation are being done "somewhere" offsite, and yet are easily accessible, as close as the nearest Internet connection. Fans of cloud resources do not know, or even care, where the third-party servers and processors are located, how the servers' hard drives are backed up, or who is constantly monitoring these processes to make sure that everything is running smoothly. All users care about is that their information is kept safe and available whenever needed.

Beware the Thunderclouds

Innumerable computer industry pundits are proclaiming cloud computing as a truly beneficial revolution in data and information technology. As evidence, they point to the quantifiable advantages provided by cloud computing to organizations of all sizes, which are able to lease data storage and computational capacity from Internet-based providers, thus significantly reducing their required capital investments in computer equipment, software, and support staff. In addition, they may no longer lose money on purchasing licenses for aging programs (and, in some cases, aging programmers!).

Critics respond that any reliance upon third-party vendors for mission-critical data storage and processing puts an organization at nontrivial risk, should any one of those providers fail, even just temporarily. The horror stories of such failures began not long after the introduction of cloud data storage, and the list grows with each passing year. For example, in August of 2008, The Linkup, a Web-based storage service formerly known as MediaMax, lost up to 45 percent of its customers' data, forcing the service to shut down. Imagine the impact this had upon organizations and individuals relying upon that service. Even the biggest names can fail in a similar manner. The homepage of XDrive states that the service is now closed, but suggests an alternative, Box, presumably for those people who would be willing to go through that process again, with a similar possible outcome.

Proponents of cloud computing may respond that no one should rely completely on any external provider, and that one should always make and protect internal backups. Yet if this is the case, then what is the purpose of paying someone else to perform that same role, particularly if they may be less reliable? Those defenders might reply that The Linkup was a fairly small enterprise, and therefore does not represent the whole cloud storage sector. However, even the most well-known names are not immune to major problems. In July of 2008, Amazon S3 was subject to at least eight hours of downtime, as well as increased error rates, in the United States and Europe. In fact, that was not the first such incident, but rather a repeat of the crisis that occurred in February of that same year, when the service was unavailable for about three hours, bringing down with it all the Web-based applications of other organizations that depended upon S3. These included both Tumblr and Twitter, which used S3 for storing and serving various graphics files.

There are other worrisome issues with the overall cloud paradigm, including the privacy and protection of sensitive data. For example, cloud-based medical records services, such as Google Health and Microsoft HealthVault, are intended to store huge amounts of personal health information on the Internet, a public network vulnerable to hacking attempts. The potential benefits to consumers are unclear, and may be nonexistent, while the risks of security abuse and breaches are equally unknown and could be alarmingly high. It can be argued that it would simply be too easy for someone inside such a service organization, or a subsidiary, or even a foreign outsource firm, to break into any medical records stored on Web servers.

In addition, defenders of cloud computing must admit that there is always a temptation for any organization or individual possessing that data to offer it for sale to legitimate companies, such as marketers and insurance companies, or, worse still, to criminals who could use it for all sorts of nefarious aims. Also, disclosure does not have to be intentional to be damaging. In fact, it has already happened: In April of 2008, the largest US health insurer at the time, WellPoint, admitted that it had inadvertently released the records of potentially 130,000 customers.

Granted, cloud data storage and computing may have wonderful potential, but it is even clearer that there are far too many ways that cloud-based systems can fail, technically and operationally, with dire consequences. You can utilize these Web services as a secondary backup, but it is not advisable to entrust your business or personal information entirely to the cloud, or you may just get rained on.

Copyright © 2010 Michael J. Ross. All rights reserved.