A Microsoft Azure outage on July 30 was triggered by a distributed denial of service cyberattack, the tech giant has confirmed.
It comes after users started complaining yesterday that they couldn’t access several Microsoft services, including Azure and Microsoft 365 products such as Office and Outlook.
The incident, which lasted about eight hours according to Microsoft’s own timeline, took place less than two weeks after a CrowdStrike update caused Microsoft Windows machines to crash. Companies affected by the new outage include U.K. bank NatWest, according to the BBC.
What Happened At Microsoft?
The incident started at approximately 11:45 UTC and was resolved at 19:43 UTC, according to Microsoft’s Azure status history page. The company said a “subset of customers may have experienced issues connecting to a subset of Microsoft services globally.”
Impacted services included Azure App Services, Application Insights, Azure IoT Central, Azure Log Search Alerts, Azure Policy, as well as the Azure portal itself and “a subset of Microsoft 365 and Microsoft Purview services.”
Microsoft says the “initial trigger event” was a DDoS attack, which sees adversaries flood services with traffic in order to bring them to a standstill.
Microsoft describes an “unexpected usage spike” which resulted in Azure Front Door and Azure Content Delivery Network components “performing below acceptable thresholds, leading to intermittent errors, timeout and latency spikes.”
Most firms have protection in place to prevent DDoS attacks from having an impact. The initial attack activated Microsoft’s DDoS protection mechanisms, but an error in the implementation of defenses “amplified the impact of the attack rather than mitigating it,” Microsoft admits.
It appears that the outage was caused by a DDoS attack, despite the fact that Microsoft had protections in place, says Sean Wright, head of application security at Featurespace. “Similarly to the CrowdStrike issue a few weeks ago, it appears that an error occurred in the software that was used to protect against DDoS attacks,” Wright says.
This highlights the importance of testing software thoroughly, he says.
What’s Next?
The CrowdStrike incident had already—and unfairly—created bad optics for Microsoft, so the timing of this new outage is unlucky. Microsoft knows this and has communicated clearly throughout the outage, saying it will publish a Preliminary Post Incident Review within approximately 72 hours, to share more details on what happened and how it responded.
To get notified when the review is published and to stay informed about future Azure service issues, Microsoft advises customers to configure and maintain Azure Service Health alerts.
For now, it looks like Microsoft services are back up and running.