March 3, 2017

A typo by someone with their head in the cloud: Massive Amazon S3 outage blamed on human error

Incorrect command erased more Amazon servers than intended.

By CBR Staff Writer

Amazon Web Services has pointed the finger of blame at human error following the colossal Amazon S3 cloud outage which hit earlier this week.

The company said in a blog that an incorrect command led to the removal of a larger set of servers than intended. That ‘removal’ downed a huge part of the web – CBR included.


An outage in the company’s Simple Storage Service, or Amazon S3, hampered its clients’ operations for more than three and a half hours.

The AWS S3 outage hit its Northern Virginia data centre in the early hours of 28 February, taking websites including Slack, Docker and Soundcloud offline.

Amazon Web Services said in a blog: “The Amazon Simple Storage Service (S3) team was debugging an issue causing the S3 billing system to progress more slowly than expected.

“At 9:37AM PST, an authorized S3 team member using an established playbook executed a command which was intended to remove a small number of servers for one of the S3 subsystems that is used by the S3 billing process.”

The company said that the process of restarting the services and running the required safety checks to validate the integrity of the metadata took longer than expected.


It said: “The servers that were inadvertently removed supported two other S3 subsystems. One of these subsystems, the index subsystem, manages the metadata and location information of all S3 objects in the region. This subsystem is necessary to serve all GET, LIST, PUT, and DELETE requests.”

AWS’ S3 storage system is used by more than half of the company’s customers for cloud storage. The system stores three to four million pieces of data, according to expert estimates.

Last year, some of the services offered by Microsoft’s cloud service Azure were hit by a two-hour outage.

AWS was not immune to outages in 2016 either: in September it suffered significant error rates, which affected Netflix, Tinder and Wink.
