Launched in 2006, Amazon Web Services’ Simple Storage Service (AWS S3) has become the most successful cloud storage service by a wide margin. Industry analyst Gartner recently stated that AWS has ten times the capacity of its 14 nearest competitors combined. Though it started as "Storage for the Internet" and was created to make web-scale computing easier for developers, it has more recently become a center of gravity for packaged applications as well. There are now hundreds of well-known consumer and packaged applications that work with AWS S3.
In addition to building an ecosystem of packaged applications, AWS S3 validated that HTTP is the most flexible storage protocol, and that object storage is the most scalable architecture. AWS S3’s API uses HTTP both to authenticate application requests and to transport the data. HTTP is ubiquitous and works just as well over the Internet as on private networks (such as the Ethernet inside an enterprise). And the S3 API itself, while rather heavyweight, has proven useful and flexible enough to evolve continuously and satisfy an ever-growing set of application and data requirements.
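To make that concrete, here is a minimal sketch of how a single HTTP request can carry both the credentials and the data. It is deliberately simplified and is not AWS's actual signing scheme (real S3 requests use AWS Signature Version 4, which canonicalizes and signs considerably more of the request). The endpoint, bucket, keys, the `SKETCH-HMAC-SHA256` label, and the `build_put_request` helper are all hypothetical.

```python
import hashlib
import hmac
from datetime import datetime, timezone


def build_put_request(endpoint, bucket, key, body, access_key, secret_key, now=None):
    """Return (method, url, headers, body) for an S3-style object upload.

    Simplified illustration only: this is NOT AWS Signature Version 4.
    """
    now = now or datetime.now(timezone.utc)
    timestamp = now.strftime("%Y%m%dT%H%M%SZ")
    path = f"/{bucket}/{key}"
    payload_hash = hashlib.sha256(body).hexdigest()

    # The client signs a short description of the request with its secret key.
    string_to_sign = "\n".join(["PUT", path, timestamp, payload_hash])
    signature = hmac.new(
        secret_key.encode(), string_to_sign.encode(), hashlib.sha256
    ).hexdigest()

    # Authentication travels in ordinary HTTP headers, next to the payload.
    headers = {
        "x-amz-date": timestamp,
        "x-amz-content-sha256": payload_hash,
        "Content-Length": str(len(body)),
        "Authorization": (
            f"SKETCH-HMAC-SHA256 Credential={access_key}, Signature={signature}"
        ),
    }
    return "PUT", endpoint.rstrip("/") + path, headers, body


method, url, headers, body = build_put_request(
    "https://storage.example.internal",  # hypothetical endpoint
    "reports",
    "q1.csv",
    b"a,b\n1,2\n",
    access_key="AKIDEXAMPLE",
    secret_key="not-a-real-secret",
)
print(method, url)
for name, value in headers.items():
    print(f"{name}: {value}")
```

Because nothing here depends on anything beyond HTTP, the same request shape works whether the endpoint is on the public Internet or inside a private network.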
Object storage is a very scalable way to manage data, as proven by the many trillions of individual objects that AWS S3 is managing as a single geographically distributed storage environment (other large clouds, like Microsoft Azure, also use object storage architectures). AWS S3 also validates the linear scalability and parallelism of object storage, processing millions of S3 transactions per second.
Lastly, object storage has the benefit of natively supporting descriptive metadata, which can then be used to execute functions en masse. One example is a policy that replicates certain types of data based on their descriptive metadata. Files and file systems don’t natively support such descriptive metadata.
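A metadata-driven policy of this kind can be sketched in a few lines. The example below is an illustration only, not any product's policy engine: the `StoredObject` type, the metadata keys, and the `select_for_replication` function are hypothetical names chosen for the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    key: str
    size_bytes: int
    # Descriptive metadata travels with the object itself.
    metadata: dict = field(default_factory=dict)


def select_for_replication(objects, required_metadata):
    """Return the objects whose metadata contains every required key/value pair."""
    return [
        obj
        for obj in objects
        if all(obj.metadata.get(k) == v for k, v in required_metadata.items())
    ]


objects = [
    StoredObject("scans/patient-001.dcm", 52_000_000, {"type": "medical-image", "retention": "7y"}),
    StoredObject("logs/2016-03-01.gz", 1_200_000, {"type": "log"}),
    StoredObject("scans/patient-002.dcm", 48_500_000, {"type": "medical-image", "retention": "7y"}),
]

# Hypothetical policy: replicate everything tagged as a medical image.
policy = {"type": "medical-image"}
to_replicate = select_for_replication(objects, policy)
print([obj.key for obj in to_replicate])
```

The policy never inspects file names or directory layout; it acts on every object carrying the matching metadata, which is what makes en masse operations straightforward.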
As AWS S3 steadily built momentum in the market, several standards bodies and open source communities tried to promote an open "standard" object API, but failed. SNIA’s Cloud Data Management Interface (CDMI) never gained traction and OpenStack’s Swift still has a relatively small user base. Proprietary APIs from EMC’s Centera and Atmos, and HDS’ HCP, also could not gain critical mass.
After ten years of the cloud, we’re left with a useful and common, but proprietary, de facto cloud storage standard in the AWS S3 API. And we’ve proven that object storage, when implemented properly, is the most scalable and arguably most flexible storage architecture.
How do these facts translate into "hybrid cloud"? While many enterprises have adopted the cloud for some of their applications and data, many of them still need to keep the bulk of their data on-premises for performance, control, compliance, or cost reasons. And now that they’ve gotten a taste, they want the same high availability, anywhere accessibility, deployment agility, and scalability of cloud storage on their own premises. Internal data growth is one key driver.
As everything has been digitized, and data collection and analytics have become ever more important, IT must plan for storage capacity to grow tenfold or more, and over a much shorter period than in the past. In many cases, the need for storage is even greater for external, customer-facing applications. As traditional financial institutions, manufacturers, and retailers compete on a global scale, and with upstart cloud-based competitors, their IT must match the cloud in both usability and computing (and usually storage) scale.
Object storage software and appliances, many of which modeled their designs after AWS S3, have satisfied many of the availability and scalability requirements of enterprises, but the interface was always a roadblock. For years, many enterprises could not justify developing their applications against a specific vendor API, or lacked the skills to do so. The ubiquity of the AWS S3 API, the validation of key technical attributes (e.g. scalability, application support), and the ecosystem of pre-integrated packaged applications changed that equation.
Businesses can now write an application once and store their data either on-premises or in an S3-compatible cloud. They can do the same with a wide range of packaged applications for backup, archiving, content management, and more. With hybrid choices, customers have finer-grained control of their expenditure (capital or ongoing), risk, and deployment models. With object storage products that also incorporate the scalable routing of the largest public clouds, customers benefit from higher reliability, which improves SLAs and reduces maintenance costs.
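The "write once" idea can be illustrated with a small sketch in which the application logic stays identical and only the storage endpoint changes. All names here are assumptions made for the example: the endpoints, bucket names, the `StorageTarget` type, and the `plan_backup` function are hypothetical, and the sketch only plans requests rather than sending them.

```python
from dataclasses import dataclass
from urllib.parse import quote


@dataclass(frozen=True)
class StorageTarget:
    name: str
    endpoint: str  # base URL of any S3-compatible service
    bucket: str

    def object_url(self, key):
        # Path-style addressing: <endpoint>/<bucket>/<key>
        return f"{self.endpoint.rstrip('/')}/{self.bucket}/{quote(key)}"


def plan_backup(target, filenames):
    """Application logic: identical no matter where the data ends up."""
    return [("PUT", target.object_url(f"backups/{name}")) for name in filenames]


# Hypothetical targets; only the configuration differs.
ON_PREMISES = StorageTarget("on-premises", "https://objects.datacenter.example", "corp-backups")
PUBLIC_CLOUD = StorageTarget("public-cloud", "https://s3.cloud.example", "corp-backups")

for target in (ON_PREMISES, PUBLIC_CLOUD):
    for method, url in plan_backup(target, ["db-monday.dump", "db-tuesday.dump"]):
        print(target.name, method, url)
```

With an SDK such as boto3, the equivalent switch is usually the `endpoint_url` argument passed when creating the S3 client, with the rest of the application code left unchanged.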
With truly software-defined object storage products, customers benefit from complete hardware choice, today and over time, improving long-term TCO. Some object storage products also support file protocols natively, bringing even existing applications into the hybrid cloud.
There are still significant differences between object storage products, but in general, in moving to hybrid cloud by leveraging the S3 API and object storage, enterprises can finally experience some of the benefits of a storage architecture and protocols designed for today’s digital business, one that AWS has ridden to great success.