Infrastructure stretched by “larger than expected volumes of requests” was once again behind a short Microsoft Azure outage in West Europe this week, the company reports in a status update. The admission comes after an overloaded Redis cache also triggered a 17-hour multi-factor authentication outage in November.
Thursday's update follows customer reports of service management issues between 15:20 on 27 Mar 2019 and 17:30 UTC on 28 Mar 2019, with some customers receiving failure notifications when trying to deploy Azure resources hosted in the West Europe region.
Microsoft described the preliminary root cause as a backend component for managing regional network resources that “experienced a larger than expected volume of requests which caused the service to reach its operational threshold.” The company did not specify whether this was a Redis cache again.
“This introduced automatic throttling manifesting in service management request failures. In addition, an automatic retry logic caused incoming requests to continue to process causing further issues.” (Does someone need to upgrade their infrastructure?)
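Microsoft has not published details of the component or its retry logic, so the following is only a toy model of the general effect the statement describes: throttled requests that are automatically re-sent add to the incoming load and can prolong an overload. The function name `offered_load`, the tick-based model, and all the numbers are illustrative assumptions, not Azure behaviour.

```python
def offered_load(arrivals, capacity, retry):
    """Return the number of requests hitting a service on each tick.

    Hypothetical model: the service completes up to `capacity` requests
    per tick and throttles the rest. If `retry` is True, every throttled
    request is re-sent on the next tick; otherwise it is dropped.
    """
    carried = 0  # throttled requests waiting to be re-sent
    load = []
    for new in arrivals:
        total = new + carried              # fresh requests plus retries
        load.append(total)
        rejected = max(0, total - capacity)
        carried = rejected if retry else 0
    return load


# Made-up traffic: one spike above a capacity of 100 requests per tick.
arrivals = [80, 80, 150, 80, 80]

print(offered_load(arrivals, capacity=100, retry=False))  # [80, 80, 150, 80, 80]
print(offered_load(arrivals, capacity=100, retry=True))   # [80, 80, 150, 130, 110]
```

In this sketch a single tick of excess traffic leaves the service over its threshold for three ticks once retries are switched on, which is the kind of knock-on effect the status update attributes to the retry logic.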
Meanwhile, a five-hour issue with Azure Data Lake Storage and/or Data Lake Analytics the following day was caused by a “recent deployment task [that] contained a configuration change error, preventing requests from completing.”
This has since been rolled back. Microsoft said it will publish a full root cause analysis within days. “Engineers are also continuing to monitor to ensure all customers are now fully mitigated.”