From The Verge: CrowdStrike has published a post-incident review (PIR) of the buggy update that took down 8.5 million Windows machines last week. The detailed post blames a bug in test software for not properly validating the content update that was pushed out to millions of machines on Friday. CrowdStrike is promising to test its content updates more thoroughly, improve its error handling, and implement a staggered deployment to avoid a repeat of this disaster.
CrowdStrike’s Falcon software is used by businesses around the world to help protect against malware and security breaches on millions of Windows machines. On Friday, CrowdStrike issued a content configuration update for its software that was supposed to “gather telemetry on possible novel threat techniques.” These updates are delivered regularly, but this particular configuration update caused Windows to crash.
CrowdStrike typically issues configuration updates in two different ways. There’s what’s called Sensor Content that directly updates CrowdStrike’s own Falcon sensor that runs at the kernel level in Windows, and separately there is Rapid Response Content that updates how that sensor behaves to detect malware. A tiny 40KB Rapid Response Content file caused Friday’s issue.
Updates to the actual sensor don’t come from the cloud, and typically include AI and machine learning models that will allow CrowdStrike to improve its detection capabilities over the long term. Some of these capabilities include something called Template Types, which is code that enables new detection and is configured by the type of separate Rapid Response Content that was delivered on Friday.