To Fix CrowdStrike Blue Screen of Death Simply Reboot 15 Straight Times, Microsoft Says (404media.co) 173
Microsoft has a suggested solution for individual customers affected by what may turn out to be the largest IT outage that has ever happened: Just reboot it a lot. From a report: Customers can delete a specific file called C00000291*.sys, which is seemingly tied to the bug, Microsoft said in a status update published Friday. But in some cases, people can't even get to a spot where they can delete that file. In an update posted Friday morning, Microsoft told users that they should simply reboot Virtual Machines (VMs) experiencing a BSoD over and over again until they can fix the issue.
[...] "We have received reports of successful recovery from some customers attempting multiple Virtual Machine restart operations on affected Virtual Machines," Microsoft told users. "We have received feedback from customers that several reboots (as many as 15 have been reported) may be required, but overall feedback is that reboots are an effective troubleshooting step at this stage."
Beautiful (Score:3, Funny)
I mean.. this joke writes itself
Re:Beautiful (Score:5, Funny)
The next Microsoft Certification:
MCRE == Microsoft Certified Reboot Engineer
The test is a bitch on your fingers physically....constantly hitting that button.
Re: (Score:2)
The test is a bitch on your fingers physically....constantly hitting that button.
That's covered in MCRAE. Microsoft Certified Reboot Automation Engineer.
Rumor has it that it involves one of those drinking bird toys.
Re: (Score:3)
Re: Beautiful (Score:2)
Re: Beautiful (Score:3)
If it makes you feel any better I had to bring my machine in to the office to have it fixed. Rebooting many more than 15 times didn't do it.
Took about 3 minutes.
Re: (Score:2)
Thinking is hard.
So, how does this work exactly? (Score:5, Interesting)
I'm curious how this "solution" works. Does Windows determine on it's own what system file is causing the system to repeatedly BSOD on startup and removes it on it's own? If so, that's kinda clever.
Re: So, how does this work exactly? (Score:5, Funny)
Now if only we could get rid of extra apostrophes on their own!
Re: (Score:3)
Because that ruins everything to the point that the post is indecipherable to you.
Re: (Score:3, Insightful)
You know what's another anachronism? Vowels. Y'd hv t b stpd nt t b bl t rd ths. Dsn't mn tht y wnt t.
For the rest of the people reading this who might be tempted to buy in to dropping capitals, you can come up with dumbass reasons for everything. You know what capitals buy you? Word shapes. Visual cues of sentence breaks. And highlights of the important parts of sentences (proper nouns). It's also a way to identify visually the difference between words used as a collective proper noun, like book/mov
Re:So, how does this work exactly? (Score:4, Interesting)
Re: (Score:3)
no idea but sounds like a timing issue in the boot sequence, maybe some variable hardware latency that's just right to prevent the bug from manifesting.
those are some very nasty bugs to track and reproduce, but it will be always faulty design/coding that introduces dependence on such variables.
Re: (Score:2)
It's probably some kind of automatic snapshot feature. It detects a number of failed boots and decides to roll back. More like a last resort.
Re: (Score:2)
Addressing the home edition, I've applied updates that gave me BSOD, and the computer self-boots about three times and then says, "There's a problem. Wait until I fix it."
Re:So, how does this work exactly? (Score:5, Interesting)
Reminds me of high school days. A friend got a job maintaining a PDP-8 that controlled CNC machining equipment at a small factory.
Sometimes the PDP-8 would get wedged. The problem is that turning the power off and on didn't help because the wedged state was preserved in the magnetic core memory.
So he would rapidly cycle the power until a power glitch invalidated some of the memory, then the computer would crash and could be rebooted.
Re:So, how does this work exactly? (Score:5, Funny)
Years ago, I worked on a CAD system that would occasionally hang. The solution was to get up from the seat, run on the spot until you developed a static charge, then touch the chassis of the computer. Usually, this worked.
Re: (Score:2)
Re: (Score:2)
The PDP-8 was slightly before my time; I started on a PDP-11/23, but we had a PDP-8 as a control computer for a large testing station. You're right, state was preserved on power outage, which was one of the advantages of core memory, and one that worked against you in cases like that. As I recall (I never did this, just saw it done) one solution was to power off, pull a bank of core memory, and power it back on.
Fun fact, when we switched to the PDP-11, (shortly before I started) we had trouble convincing the mana
Re: (Score:2)
Re: (Score:2)
Re: So, how does this work exactly? (Score:4, Funny)
A long time ago, I was working as tech support for an ISP. This was the era of USB DSL modems. Everything in the driver installer and in the configuration of the TCP/IP stack was subject to race conditions.
There were problems where the solution was to reinstall the drivers in a loop until the race conditions resolved in the correct way. I remember doing that with a customer on the phone 8 times. The guy was losing it after the 4th time. But on the 8th it worked.
Crappy software is crappy. And it is all closed source, so you can't fix it yourself.
Re: (Score:2)
> Crappy software is crappy. And it is all closed source, so you can't fix it yourself.
This belongs on a plaque.
Have you tried turning it off and on again? (Score:4, Funny)
and again. and again. and again ...
Re: (Score:2)
This is huge (Score:5, Insightful)
Thousands of cancelled flights and massive financial disruptions.
Will anyone learn anything?
Unlikely.
Re:This is huge (Score:4, Interesting)
Re: (Score:2)
Re: (Score:2)
Which is precisely why I turn off all automatic updates in Firefox, and when I used to run Windows at home, same thing. No updates unless I specifically allow them.
Now if only Firefox would roll back to a time when the option to never install an update was available, so they'd stop harassing me that there is an update ready to download. How difficult is it to not tell someone something?
Re:This is huge (Score:4, Insightful)
Yes, in other words, updates should be curated. Companies that don't do this suffer the consequences.
it seems like crowdstrike is autoupdate by default (Score:2)
it seems like crowdstrike is autoupdate by default and maybe little to no control for IT to edit that setting?
Re:This is huge (Score:5, Interesting)
Or set up a proper canary test environment that you deploy updates to first, which has been best-practice for at least 10 years now.
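A minimal sketch of the canary idea in the comment above, in Python. Every name here (`staged_rollout`, `deploy`, `is_healthy`, the ring and host names) is hypothetical and stands in for whatever deployment tooling an organization actually uses; none of it is a real CrowdStrike or Microsoft API.

```python
from typing import Callable, Dict, List


def staged_rollout(
    rings: Dict[str, List[str]],
    deploy: Callable[[str], None],
    is_healthy: Callable[[str], bool],
) -> List[str]:
    """Deploy ring by ring; stop at the first ring with an unhealthy host.

    Returns the names of the rings that were fully deployed and healthy.
    """
    completed = []
    for ring_name, hosts in rings.items():  # dicts preserve insertion order
        for host in hosts:
            deploy(host)
        if not all(is_healthy(host) for host in hosts):
            # Halt here so later (larger) rings never receive the update.
            break
        completed.append(ring_name)
    return completed


# Hypothetical example: the update crashes every host it touches.
rings = {
    "canary": ["test-vm-1", "test-vm-2"],
    "production": ["prod-1", "prod-2", "prod-3"],
}
touched: List[str] = []
result = staged_rollout(rings, deploy=touched.append, is_healthy=lambda host: False)
print(result)   # rings that completed
print(touched)  # hosts that received the bad update
```

With a health check that always fails, only the two canary hosts are touched and production never sees the update. The sketch only helps if updates actually pass through such a gate, which is the point raised elsewhere in the thread about vendor-pushed auto-updates.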
Re: (Score:2)
I shit you not.
Re: (Score:2)
Thousands of cancelled flights and massive financial disruptions.
Will anyone learn anything?
Unlikely.
Alternate possibility: We learn something. We decide it is too risky to run cyber security audit software, too risky to run antivirus, too risky to run management software with escalated privileges.
We simply set up every computer with a local account and a local administrator, relying on the user to keep the system up to date and secure.
How often do you suppose they will have to cancel flights then?
I'm not being funny here, I'm legit wondering what you hope we learn here. The problem here stems from the very
I Hope This Isn't the New Normal (Score:5, Funny)
Me: Of course!
Tech Support: Did you try restarting it 15 times?
Me: WTF?
Re: (Score:2)
In all fairness, 50% of the responses will probably be "Of course!" to that question as well.
Um no (Score:2)
Re: (Score:2)
Could be a race between getting updated and the crash.
I know that Windows has a rollback for its own updates if boot can't succeed. Guessing third party kernel modules don't get that benefit though.
Re:Um no [Efficiency] (Score:2)
Actually you reminded me of recent problems where the performance of Windows goes to shite and the only interesting clue that I've found so far is an "Efficiency" annotation in "System" and some other program. Maybe someone around here has figured it out already? It might involve Firefox or other "enemy" software running in the dragon's den of Microsoft...
My theory is that Windows has detected some kind of performance problem and the System and the affected program have been triggered to run some kind of re
Re: Um no (Score:2)
Seriously? (Score:2)
For every single machine? Is this real life?
Re: (Score:2, Funny)
For every single machine? Is this real life?
Is this just fantasy?
Caught in a landslide, no escape from reality...
Re: (Score:2)
Open your eyes, look up to the skies and see
I'm just a poor boy, I need no sympathy...
Re: (Score:2)
Anybody who has to reboot a server 15 times has my sympathy.
Re: (Score:2)
woosh
Re:Seriously? (Score:5, Informative)
What is missing from the summary in a rush to get posted so people can shit on Microsoft: this is recommended for Azure VMs. Not physical hardware.
Re: (Score:2)
Is it only the VMs that were affected? Because that's not what other reports sound like. And if it's not, are they just being hung out to dry?
Re: (Score:2)
No, anything running Crowdstrike and Windows with an active internet connection was affected. But for non-VMs you can get into safe mode and delete / move a single file and reboot.
C:\Windows\System32\Drivers\CrowdStrike\C-00000291*.sys - get that out of there and you're fine.
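For illustration only, the manual step above can be sketched in Python. The wildcard pattern is the one quoted in the comment; the function name `remove_channel_files`, the `dry_run` flag, and the example file names are hypothetical. In practice this was done by hand from safe mode or a recovery console, where Python is unlikely to be available.

```python
import tempfile
from pathlib import Path
from typing import List


def remove_channel_files(
    driver_dir: Path,
    pattern: str = "C-00000291*.sys",
    dry_run: bool = True,
) -> List[Path]:
    """Find files matching `pattern` in `driver_dir`; delete them unless dry_run."""
    matches = sorted(p for p in driver_dir.glob(pattern) if p.is_file())
    if not dry_run:
        for path in matches:
            path.unlink()
    return matches


# Demonstration against a throwaway directory with made-up file names,
# so nothing on the real system is touched.
with tempfile.TemporaryDirectory() as tmp:
    demo_dir = Path(tmp)
    (demo_dir / "C-00000291-example.sys").write_text("bad")
    (demo_dir / "C-00000285-example.sys").write_text("ok")
    would_delete = [p.name for p in remove_channel_files(demo_dir)]
    deleted = [p.name for p in remove_channel_files(demo_dir, dry_run=False)]
    remaining = sorted(p.name for p in demo_dir.iterdir())

print(would_delete, deleted, remaining)
```

Defaulting to `dry_run=True` is a deliberate choice: the function lists what it would remove before anything is deleted, and files that do not match the pattern are left alone.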
Re: (Score:2)
Re: (Score:2)
No, only VMs can be fixed like that due to some Azure specific failsafes.
If you're running a normal version of Windows you have alternate options, such as logging in as administrator and deleting a single file. If you can't do that, complain to the person who owns your computer - since it's clearly not under your control.
Clarification needed (Score:2)
Do we need blood from a virgin and hair from a toad for this to work? Should we wait for full moon before attempting it?
Re: (Score:2)
I did search the thread for "dead chicken" but nobody else has got there yet.
Re:Clarification needed (Score:5, Funny)
Blood from a virgin? C'mon, there's no such thing! Get real, bro.
What are you talking about? They're all over the place on this site...
Re: (Score:2)
Somewhere around eight or nine (Score:2)
Re: (Score:2)
"I just left it off and ordered an adult computer system."
Do you need one of those to access porn sites?
Re: (Score:2)
Re: (Score:2)
How does it work is a good question.
It doesn't get you past a security lockout, though, because it's not a feature which lets you into recovery when you're not allowed or whatever.
The theory that Windows eventually automatically goes to a snapshot is a rational one, I don't know if it's true but it sounds reasonable to me based on what I know and have experienced about/with Windows in the past.
On the other hand, I've been rebooting for almost an hour now, I don't know how many attempts it's been but it has
Re: (Score:2)
Re: (Score:2)
You're right, but notice that a large part of the problem is monoculture. Even BSD will occasionally have a problem. (I think the Morris worm started on a Unix system.) And this one isn't affecting Apple or Linux devices...but there will be one analogous. (Though Linux itself is varied enough to limit the effects.)
One should always be wary of monocultures. Sometimes they're REALLY necessary (communication is the only one that comes to mind), but ALWAYS be wary of them.
I've been rebooting for 45 minutes so far (Score:2)
Two blue screens per power on.
The nostalgia is real! I haven't seen a blue screen in a while... I stopped using Windows on my own machines :D
Comment removed (Score:4, Interesting)
Re: (Score:2)
Sorry, but it's easy to design systems with built in race conditions. People have to work to avoid it. If you want to ensure a race condition at boot time, have a variable depend on, e.g., the clock that each process sets separately.
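To make the point about accidental race conditions concrete, here is a small Python sketch. It does not model the clock example from the comment; it swaps in the classic "lost update" case instead, where two workers each read a shared counter and write back the value plus one. Rather than using real threads, it enumerates every possible ordering of the steps so the result is reproducible. All names are made up for the example.

```python
from itertools import combinations


def final_counter(schedule):
    """Run two workers that each do `tmp = counter; counter = tmp + 1`.

    `schedule` is a sequence of worker ids (0 or 1); each appearance runs that
    worker's next step (first its read, then its write).
    """
    counter = 0
    local = {}          # value each worker read
    has_read = set()    # workers that have done their read step
    for worker in schedule:
        if worker not in has_read:
            local[worker] = counter       # step 1: read the shared value
            has_read.add(worker)
        else:
            counter = local[worker] + 1   # step 2: write back what was read, plus 1
    return counter


def all_schedules():
    """Every interleaving of two workers with two steps each (6 in total)."""
    for slots in combinations(range(4), 2):   # positions where worker 0 runs
        yield tuple(0 if i in slots else 1 for i in range(4))


outcomes = {schedule: final_counter(schedule) for schedule in all_schedules()}
for schedule, value in outcomes.items():
    print(schedule, "->", value)
print(sorted(set(outcomes.values())))
```

Only the two orderings where one worker finishes before the other starts give the "expected" answer of 2; the other four lose an update. Nothing in the code looks obviously wrong, which is why avoiding this takes deliberate work (a lock, or an atomic increment).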
Re: (Score:2)
Absolutely false. If a system detects a fault in an automation it should not repeat that automation ad-infinitum. In an ideal world you end up with a recovery system, e.g. Azure rolling back the VM to a prior image.
Having 100% the same behaviour in both good *and* detectable bad scenarios is a recipe for FUBARing your PC.
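The "don't repeat a failing automation forever" idea can be sketched as a counter of consecutive failed boots. This is a toy model in Python, not how Windows or Azure actually implements recovery; the class name `BootGuard`, the threshold of 3, and the action strings are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class BootGuard:
    """Count consecutive failed boots; ask for recovery once a limit is hit."""

    max_failures: int = 3   # hypothetical threshold
    failures: int = 0

    def record_boot(self, succeeded: bool) -> str:
        if succeeded:
            self.failures = 0
            return "normal"
        self.failures += 1
        if self.failures >= self.max_failures:
            self.failures = 0
            return "recovery"   # e.g. roll back to a known-good image
        return "retry"


guard = BootGuard(max_failures=3)
actions = [guard.record_boot(ok) for ok in (False, False, False, True)]
print(actions)
```

The counter has to live somewhere that survives the crash (firmware variables, a boot partition, the hypervisor), which this in-memory sketch does not attempt to show.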
but do not do 20 as then the main power goes down (Score:2)
but do not do 20 as then the main power goes down taking the raptor fences with it.
Re: (Score:2)
but do not do 20 as then the main power goes down taking the raptor fences with it.
Not to mention it will summon the bane of schoolboys everywhere - "Bloody Mary"
And yet the richest company in the world is (Score:5, Funny)
Windows problems - reboot.
Linux problems - be root.
Re: (Score:2)
Mod parent Funny, but root is hard work. And dangerous, too.
reboot Virtual Machines experiencing a BSoD.... (Score:2)
Have any deaths resulted from this? (Score:2)
People dying because of this would be a good reason to sue Microsoft into oblivion.
Also, I wonder how the financial institutions affected by this are going to feel.
Re: (Score:3)
It's not really Microsoft's fault though. This was caused by Crowdstrike. Also, don't these massive companies have a test lab for patches BEFORE they roll out to production? Seems like failures on numerous fronts.
Re: (Score:3)
Re: (Score:3)
100% agree! This seems like a failure to test patches before pushing to production. If these multi-national corporations can't be bothered to protect themselves and want a 3rd party vendor to cause these problems, that's on them as much as the shoddy vendor.
As you said, it's companies being cheap on IT. Instead of seeing IT as an investment into productivity, they see IT as an expense that needs to be cut.
This is a form of karma for the world.
Re: (Score:2)
Example, you are a company that makes widgets. Production is both hated (as it costs money and has workers making stuff) and loved a little as the widgets are sold (Yeah! Sales is loved! Marketing is loved!) But the workers making shit? We gotta find a way to fire them all! I have it, Outs
Re: (Score:3)
Basic CYA - if your responsibility is IT security and the company gets pwnd/ransomware'd etc. because your policies delayed pushing the latest CS definitions, YOU get blamed; if a bunch of boxes crash because CS pushes a bad update, CS gets blamed.
Take this mentality in a bunch of mid-level managers, roll it up to CIOs across a bunch of large enterprises, and here we are.
Security boils down to people and process - which boils down to culture. What we see here today is our culture.
Re: (Score:2)
It's not really Microsoft's fault though. This was caused by Crowdstrike. Also, don't these massive companies have a test lab for patches BEFORE they roll out to production? Seems like failures on numerous fronts.
We are lucky to have Microsoft, the one and only OS that is never at fault for anything. It does take a brave person to stand staunchly behind Microsoft, apparently not mentioning the ridiculous fix. That makes Microsoft look like the most amateurish outfit that ever blessed humanity with their perfection.
A pity that it is impossible to bake in the protection needed instead of relying on an outside agency, to do something Microsoft isn't up to doing. Or maybe not capable.
Re: (Score:2)
Hey, don't look at me as an MS defender. I run Linux to avoid them. I'm just saying, this isn't all on Microsoft. They are part of the problem but they are not the entire problem.
Others have pointed out that Crowdstrike does have a Linux option and not testing a patch before rolling it out to Linux could also cause this issue. That's why test labs should exist before things are pushed to production.
Is it the year of the linux desktop? (Score:2)
I must laugh. Twice. (Score:2)
Switched on the news this morning and heard about the ClownStrike FUBAR. Switched around between a couple of newscasts. The only station that reported "does not affect Linux" was our (much hated on Slashdot) local right wing station.
So the answer is... (Score:2)
Someone upthread noted that the error was preserved through reboots - capacitor. There's a simple answer to that, which I learned from an on-site tech call: unplug your computer, *then* hit the start button, until there's no flash of light. You've now discharged the cap, and can plug it back in and boot.
Re: So the answer is... (Score:2)
Virtual capacitors, as the fix applies to Azure VMs, right?
Re: (Score:2)
Not necessarily the cause, it could be a bad file somewhere in the boot chain, but your point still stands. It's a good thing to try.
They're Just Catching Up (Score:2)
I've been using Windows since the 3.0 days in 1990. Back then you had to reboot any time you installed any piece of hardware, or any significant piece of software. Microsoft promised us they would work on reducing the number of reboots needed. But really they were just saving them all up so you can spend them today, on Reboot Friday. Happy rebooting everybody!
Verified (Score:5, Informative)
My organization has several virtual desktop instances (VDI) running in AWS Workspaces. They got caught in a boot loop and just kept rebooting during the middle of the night, according to logs.
Around 5:30 a.m., after CrowdStrike started pushing the fix (revert), they started getting lucky and getting the update before crashing. One last reboot and they were stable.
This also happened with several AWS EC2 instances. A couple of physical machines needed manual intervention, as did any laptop that was impacted. Laptops that were offline or sleeping when this went down weren't impacted, as long as they weren't powered on until after 5:30 a.m., when the fix was being pushed.
The CrowdStrike agent process starts very early in the boot sequence, and this "fix" is just it phoning home and getting an update before it crashed again.
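If each boot really is a race between the agent fetching the reverted update and the crash, a back-of-the-envelope model explains why "as many as 15" reboots could be needed. The Python sketch below assumes every boot independently wins the race with the same probability; that probability (0.2 here) is purely hypothetical, and real machines will not be this tidy.

```python
import random


def recovery_probability(p_update_first: float, reboots: int) -> float:
    """P(at least one of `reboots` boots gets the update before crashing)."""
    return 1.0 - (1.0 - p_update_first) ** reboots


def simulate_reboots(p_update_first: float, rng: random.Random, limit: int = 1000) -> int:
    """Reboot until the update wins the race; return how many boots it took."""
    for attempt in range(1, limit + 1):
        if rng.random() < p_update_first:
            return attempt
    return limit


p = 0.2  # hypothetical per-boot chance; the real value is unknown
within_15 = recovery_probability(p, 15)

rng = random.Random(0)  # fixed seed so the run is repeatable
trials = [simulate_reboots(p, rng) for _ in range(10_000)]
average = sum(trials) / len(trials)

print(f"chance of recovering within 15 reboots: {within_15:.3f}")
print(f"average reboots needed in simulation: {average:.2f}")  # expect roughly 1/p
```

Under these assumptions the expected number of reboots is 1/p, and the chance of still being stuck after n reboots shrinks as (1 - p)^n. That fits the reports in this thread: most machines recover after a handful of tries, a few need many, and a machine where the update can never win the race (p = 0) will not recover no matter how often it is rebooted.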
In a better world Crowdstrike would be blacklisted (Score:2)
If there was a corporate death penalty this should be it. But in a week or two the Crowdstrike company will be back to normal shareholder operations.
Facepalm (Score:2)
Miiiii-crooooo-sofffft! 3000 times, which will immediately summon the bane of our childhoods Bloody Mary.
Then you will strip naked, and twirl around until you lose your balance.
All of this recorded on webcam and sent to Microsoft so that the fix will be manifested by Beelzebub, their security officer.
Might as well - Microsoft is a joke, and rebooting 15 times just shows how much of a joke they are. I can hardl
BOFH (Score:3)
I have a hazy memory of one of the first BOFH segments, where the BO tells a lUser to turn his computer on and off fifteen times or so. Over the phone he hears the power supply explode and goes back to his Lemmings game happy in the knowledge of a job well done.
How secure can it be? (Score:2)
If security software is crashing itself when the PC is booted up, how secure can the product be? If the design and QA are such that obvious egregious flaws are able to make the front page of the NYT, why should anyone expect the kernel driver and associated software not to be riddled with large numbers of exploitable bugs?
Cure cancer (Score:2)
simply torch the tumor
Re:So like Patch Tuesday? (Score:5, Interesting)
What a sweet gig Microsoft has. They fail, and fail, and fail.....and rake it in!
And then they just keep on failing, and keep on raking in ridiculous amounts of money. It just doesn't stop. The leaning tower never falls.
It's amazing!
Re:So like Patch Tuesday? (Score:4, Informative)
I posted too soon.
I wanted to add that Microsoft succeeds at the one and only thing that actually matters: marketing.
Re: (Score:2)
I wanted to add that Microsoft succeeds at the one and only thing that actually matters: marketing.
They most certainly do not. If they did they could have sold Windows Phones.
Microsoft succeeds at the one and only thing that actually matters: lobbying.
Practically every government computer not used for nuclear blast modeling or some other similar task runs Windows.
Re: (Score:2)
You're right, but I think I'd argue that Microsoft marketing was good enough to keep Windows Phone afloat for much longer than it would have otherwise.
I was issued one of the damned things at work back in the day. The company had standardized on them because a Microsoft sales team had weaseled into the decision chain. When the person responsible for the decision said no, we're going with Blackberry, Windows salesmen went over his head and started a campaign to discredit him. I don't recall that he was fi
Re: (Score:2)
Mod parent Funny in the saddest possible way.
Re: (Score:2)
Re: (Score:3)
I love how Crowdstrike causes an outage but because it affects Windows MS is being blamed. Sorry, if you choose to install Crowdstrike then it is not MS's fault.
Re: (Score:3)
As much as I would love to take a shot at microsoft, this one isn't on them. This is on clownstrike. Microsoft is simply trying to help their customers recover and is sharing what they found. This is what they should be doing.
Clownstrike, on the other hand, is yet another shot across the bow that we're going to have a big problem in the future if IT stays on its current trajectory. This time it can be fixed by reboots, or going around to every single impacted device, booting into safe mode and manually dele
Re: (Score:3)
Microsoft has a fix coming: In the future you won't be able to be administrator on your own Windows machine, and no software will run with any privileges beyond the user. No more custom UIs, explorer replacement, no more management software, no more antivirus (other than Microsoft's sanctioned one), nothing.
That's what you want right? A world where it's impossible to fuck up an OS? That seems to be what you're asking for by pinning the blame here on Microsoft for a 3rd party program. Let's lock it all down so
Re: (Score:2)
Re: (Score:2)
I called our IT. They have more than 500 users out.
They said they're gonna try this out.
If it works I'm asking for a raise!
Hard to imagine the king of enterprise, the most popular OS in the world, the standard by which all other inferior OSs must be judged, those OSs always found lacking - tells us that millions, perhaps billions of computers have to be rebooted 15 times in order to fix a problem.
You'll deserve that raise.