What Happened on 19 July - The BSOD/CrowdStrike Incident - What Can We Learn from This
Updated: Oct 7
What is BSOD?
Were you and your colleagues met with the Blue Screen of Death last Friday? The Blue Screen of Death (BSOD) is a critical error screen displayed by Microsoft Windows when the operating system encounters a severe issue that forces it to crash. It’s officially known as a “Stop error.” When you see the BSOD, it means your computer has reached a critical condition and needs to reboot.
In older versions of Windows, BSODs appeared on dark blue screens with detailed text, but in recent versions like Windows 11 and 10, they can appear on screens with different background colours. If you encounter a BSOD, diagnosing and resolving the issue promptly is essential!
The 2024 BSOD/CrowdStrike incident was caused by a faulty software update distributed by the American cybersecurity company CrowdStrike. So what happened?
Here are the key details:
Date: On July 19, 2024, CrowdStrike released an update for its flagship security product, Falcon Sensor.
Impact: The update caused approximately 8.5 million Microsoft Windows systems worldwide to crash with a BSOD and left them unable to restart properly.
Disruption: The outage affected various sectors, including airlines, airports, banks, hospitals, manufacturing, stock markets, and more.
Financial Damage: The estimated worldwide financial damage was at least US$10 billion.
Fix: CrowdStrike released a fix, but many affected machines still had to be repaired manually, so outages lingered.
Even tech giants can face unexpected failures, which underlines the importance of robust software testing and contingency plans!
Learning Points
The CrowdStrike incident offers valuable lessons for both the tech industry and the general public. Here are some key takeaways:
Contingency Plans: Small businesses should have contingency plans in place for failures of cybersecurity services. Any lapse can leave them vulnerable to attacks.
Operational Impact: The incident demonstrates the operational impact of IT systems unexpectedly going offline, even for short periods. Being prepared for such scenarios is crucial for business continuity.
Test Updates: Companies may now consider adding a staging step to their update management policies to test updates in an isolated environment before deploying them live.
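For readers who manage their own machines, the staging idea above can be sketched in a few lines of Python. This is a minimal illustration and not how CrowdStrike or any particular update tool works: the stage names, host names, and the `apply_update` and `is_healthy` callbacks are all hypothetical placeholders that you would replace with your own tooling.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Stage:
    name: str          # hypothetical stage label, e.g. "staging"
    hosts: List[str]   # machines that receive the update in this stage


def staged_rollout(
    stages: List[Stage],
    apply_update: Callable[[str], None],
    is_healthy: Callable[[str], bool],
) -> List[str]:
    """Apply an update one stage at a time and stop at the first unhealthy stage.

    Returns the names of the stages that completed successfully.
    """
    completed = []
    for stage in stages:
        for host in stage.hosts:
            apply_update(host)
        # Only move on if every host in this stage still passes its health check.
        if not all(is_healthy(host) for host in stage.hosts):
            break
        completed.append(stage.name)
    return completed


if __name__ == "__main__":
    # Assumed example data: one isolated test PC, one pilot PC, then everyone else.
    stages = [
        Stage("staging", ["test-pc-1"]),
        Stage("pilot", ["office-pc-1"]),
        Stage("everyone", ["office-pc-2", "office-pc-3"]),
    ]
    broken = {"office-pc-1"}  # pretend the update breaks this machine
    updated = []
    result = staged_rollout(stages, updated.append, lambda h: h not in broken)
    print(result)   # stages that finished cleanly
    print(updated)  # hosts that actually received the update
```

In this made-up run the rollout halts at the pilot stage, so the machines in the "everyone" stage never receive the faulty update. The point of the design is simply that a problem surfaces on a handful of machines first instead of on all of them at once.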
That being said, we should always be positive and helpful where we can. While we recognise that this incident affected many, one positive outcome was the sharing of useful updates on social media platforms such as X and Threads. Journalist and tech pioneer Dave Troy was quick to take to X, suggesting that the issue was "easily fixable" but "just a mess".