A fortnight after “third-party routing issues” knocked 80 IBM Cloud data centres globally and most services offline for well over three hours, Big Blue faced sweeping IBM Cloud problems again overnight.
Users were left unable to log in to IBM’s Cloud Console in Dallas, Sydney, Tokyo, Frankfurt, Washington DC and London for several hours (that particular problem struck at 11:04pm and lasted until 1:19am, a status page shows).
Those logged in, meanwhile, found themselves unable to provision virtual private cloud (VPC) services, set up new Kubernetes workloads or operate many other IBM Cloud services, including Watson services.
The company has yet to explain the cause of the outage.
The IBM Cloud issues do not appear, superficially at least, to have caused significant pain for customers: a comparable issue at one of the hyperscalers would typically trigger a flood of complaints on social media, yet IBM’s public support channels appear to have been hit with few @s of complaint.
This may be down to usage: IBM accounts for just 1.8% of the IaaS public cloud market according to Gartner, but has ambitions to build up its presence, particularly in financial services. (It is going to want to start demonstrating greater resilience if it is to win customers over at scale.)
The company’s earlier June 9 outage, which lasted for over three hours and took out the bulk of IBM Cloud services, was blamed on an external network provider that “flooded the IBM Cloud network with incorrect routing”.
This, IBM said, resulted in “severe congestion of traffic […] impacting IBM Cloud services and our data centers. Mitigation steps have been taken to prevent a reoccurrence. Root cause analysis has not identified any data loss or cybersecurity issues.”
The company has yet to provide more detail or name the third party.