Microsoft Outage Hits Users Worldwide, Leading To Canceled Flights (wsj.com)
Microsoft grappled with a major service outage, leaving users across the world unable to access its cloud computing platforms and causing airlines to cancel flights. From a report: Thousands of users across the world reported problems with Microsoft 365 apps and services to Downdetector.com, a website that tracks service disruptions. "We're investigating an issue impacting users' ability to access various Microsoft 365 apps and services," Microsoft 365 Status said on X early Friday. On its status page for Azure, Microsoft's cloud computing platform, the company said the issue began just before 10 p.m. ET Thursday, affecting systems across the central U.S. In an update, Microsoft said it had determined the cause and was working to restore access to its users.
Does microsoft use crowdstrike? (Score:5, Insightful)
Re:Does microsoft use crowdstrike? (Score:5, Insightful)
Wouldn't be surprised.
Also wouldn't be surprised if nobody cared. When a small service provider has an issue for half an hour, customers pop a vein but when something like this happens, everybody just shrugs...
Re: (Score:3)
It's a difficult situation for anti-virus vendors. On the one hand everyone wants the latest updates as soon as they are released, to protect from zero day malware. On the other hand the vendor can't realistically check every possible configuration for issues, and the usual way of handling that is to do a slow roll-out and stop if you get reports of crashing in the first few hours/days.
Vulnerable to zero days, or vulnerable to bugs in the AV software, pick your poison.
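To make the slow roll-out idea concrete, here is a minimal sketch, not how any particular vendor does it. The `Stage` type, the stage names, the host and crash counts, and the 1% threshold are all made up for illustration; the only point is that the update stops spreading as soon as one stage reports too many crashes.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    hosts: int    # machines receiving the update in this stage (hypothetical numbers)
    crashes: int  # crash reports received during the observation window


def run_rollout(stages, max_crash_rate=0.01):
    """Walk the stages in order and halt at the first one whose crash rate exceeds the limit.

    Returns (names of completed stages, name of the halting stage or None).
    """
    completed = []
    for stage in stages:
        rate = stage.crashes / stage.hosts if stage.hosts else 0.0
        if rate > max_crash_rate:
            return completed, stage.name
        completed.append(stage.name)
    return completed, None


stages = [
    Stage("canary", hosts=100, crashes=0),
    Stage("early", hosts=5_000, crashes=400),       # 8% crash rate, above the limit
    Stage("everyone", hosts=1_000_000, crashes=0),  # never reached in this run
]
completed, halted = run_rollout(stages)
print(completed, halted)
```

The trade-off from the comment shows up directly: every extra stage and every longer observation window delays protection against a zero day, while fewer stages mean a bad update reaches more machines before anyone notices.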
Re: (Score:2, Insightful)
Indeed. The fundamental problem is a different one though: The crappy security level of MS Windows. On operating systems with decent security, you do not even need AV. AV is a crutch, nothing else, a band-aid on the excessively bad situation with Windows security. Hence while a vendor in this space triggered the problem, the majority of the blame lies entirely with Microsoft.
Re: (Score:2)
Which decently secure OS are you thinking about? Linux also gets malware ...
OSes are either tiny and practically irrelevant (e.g. Redox OS), or big and therefore vulnerable.
The in-betweens like FreeBSD are just not popular enough to warrant lots of malware, but that doesn't prove that they are secure.
If most corporate (human-facing) devices ran MacOS or some BSD, there would also be magnitudes more malware for them.
Re: (Score:3)
If most corporate (human-facing) devices ran MacOS or some BSD, there would also be magnitudes more malware for them.
The most valuable data isn't on human-facing devices. The most impactful disruptions aren't to human-facing devices. In both cases the most valuable targets are servers. Linux is very well represented there. Yet there are still many, many more serious exploits for Windows than there are for Linux. Why is that?
Re: (Score:3)
Simple: Because writing malware for Linux is orders of magnitude harder and has a much lower success rate. But the Windows fanbois cannot understand that. Hence they try one invalid apology for Microsoft's crappy products after the other.
Re: (Score:3, Informative)
The problem is Crowdstrike on Windows causing a BSOD. Microsoft's own services are not part of the problem.
Re: (Score:2)
Agreed that this wasn't instigated by Microsoft, but they're getting (some of) the blame - I think reasonably so.
Why should we 'blame' Microsoft?
- It doesn't affect Linux, Mac, Android, IOS, etc - so it's a "Microsoft specific problem" (as opposed to an Oracle problem, a Slack problem or whatever)
- BSODs are pretty low level stuff - there's room to say that such things shouldn't happen, even if a virus scanner does an update
I suspect the update said "your system needs to reboot", and that's where the proble
Re: (Score:3)
I get that this software is supposed to save us from emerging threats, and that those threats come thick and fast. However, I think the biggest issue has been blindly allowing the CrowdStrike agent to update everywhere without vetting it on some test servers (or perhaps one or two servers in each server farm) first for a day or two. Live and learn... Oh, if they're on the stock exchange and you have shares, it's gotta be way too late to start selling them now.
Re: (Score:2)
It's highly invasive, and all software has bugs, and odds are that one day, a particular bug will break a lot of things all at once, predictably.
Why we collectively disregard this, is a bit of a question about human nature.
Re:Does microsoft use crowdstrike? (Score:4, Insightful)
Why we collectively disregard this, is a bit of a question about human nature.
The problem starts earlier: Why are we using a crap-security OS like Windows, which makes things like Crowdstrike necessary?
Re:Does microsoft use crowdstrike? (Score:5, Informative)
Yes. It is. [theverge.com]
They've released a workaround for the matter.
* Boot into safe mode or the Windows Recovery Environment.
* Navigate to C:\Windows\System32\drivers\CrowdStrike
* Locate the file that matches "C-00000291*.sys" for your machine. Delete that file.
* Reboot machine
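For illustration only, the file-matching step above can be sketched in Python. This is an assumption-laden sketch: Safe Mode or the Recovery Environment will not normally have Python available, the function name `remove_channel_files` is hypothetical, and the file names in the demo are invented to fit the pattern. It only shows what "delete whatever matches C-00000291*.sys" means, with a dry run first.

```python
import tempfile
from pathlib import Path


def remove_channel_files(driver_dir, pattern="C-00000291*.sys", dry_run=True):
    """Return the names of files in driver_dir matching pattern; delete them unless dry_run."""
    matches = sorted(Path(driver_dir).glob(pattern))
    if not dry_run:
        for path in matches:
            path.unlink()
    return [path.name for path in matches]


# Demo against a throwaway directory, not a real Windows install.
# Both file names below are invented for the example.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("C-00000291-00000000-00000032.sys", "C-00000285-00000000-00000001.sys"):
        (Path(tmp) / name).touch()
    would_delete = remove_channel_files(tmp)            # dry run: only lists matches
    deleted = remove_channel_files(tmp, dry_run=False)  # actually deletes the matches
    remaining = sorted(p.name for p in Path(tmp).iterdir())

print(would_delete, deleted, remaining)
```

On an affected machine the directory would be the one named in the steps, C:\Windows\System32\drivers\CrowdStrike, and the manual procedure from the advisory remains the reference.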
Directions from the TA posted on the matter by Crowdstrike [crowdstrike.com]
Someone from Twitter posting a screenshot of the TA [x.com]
I would say someone is getting fired, but someone is going to be answering to Congress soon enough.
The BBC is tracking the outages (Score:5, Informative)
They're giving a rolling update here: https://www.bbc.co.uk/news/liv... [bbc.co.uk]
I'm pretty sure it's not DNS (Score:4, Funny)
Re: (Score:3)
I think someone just needs to reboot the internet and we'll be good.
Just be careful with the CAD sequence. The Left-Alt and Alt-Right keys will boot up very different Internets. :-)
Centralize, centralize (Score:5, Insightful)
What possible harm can come from having half the world using cloud services from a tiny number of cloud providers?
I've used them too, but nowadays the stuff I manage is either at a local provider, or on my own hardware. Smaller providers tend to be a bit more expensive, but they also value your business more than AWS, MS & Co.
Re: (Score:2)
Your local provider likely doesn't have the manpower for proper security.
Re: (Score:3)
I don't think proper security requires that much "manpower", at least in terms of the number of people involved, although a few people, or even one person, with brains are definitely needed. Proper coordination and oversight are also of primary importance, so that one hand doesn't open security holes the other hand doesn't know about; the bigger the team and the more people involved, the more likely that is to happen.
Re: (Score:2)
Keep it simple...
Keeping a small environment secure is much easier than a large sprawling mess. The more complex a system is, the greater the chance that someone has fucked up somewhere and left a vulnerability open.
Re: (Score:2)
Indeed. And that is why "software engineering" and "IT systems engineering" routinely does not qualify as "engineering" these days. Proper engineering always respects and follows KISS.
Re: (Score:2)
This is such a stupid statement. It's like saying you can't protect your house unless you are a multinational corporation.
Re: (Score:2)
Probably. Why are they using Windows then? Windows does not come with reasonable security and that is well known. If you do not have the manpower to fix it, then do not use a "solution" that is defective by design.
Re: (Score:2)
A smaller provider is as likely to have an outage as one of the major cloud providers' data centres. The overall impact worldwide will be smaller, but if that's where your stuff is hosted then you don't really care about that.
Re: (Score:2)
Not if the local provider has redundancy and replication in several data centers geolocated far enough from each other. I have that and I am very small. I only rent bare metal servers.
Re: (Score:2)
Doesn't matter, since the providers don't need to be related/associated.
Re: Centralize, centralize (Score:2)
They're not small, but I use SSDNodes as my VPS provider. Zero issues over 6+ years, very happy.
Re: (Score:2)
Actually, you do. If that small provider has problems, you will be able to get help and consulting. If a major provider has issues, the market for that will be empty. And that is one of the reasons why tons of eggs in a single or small number of baskets is such a bad idea.
Re: (Score:2)
Today the problem is a CrowdStrike update causing a BSOD. It's only a "Cloud" problem in that the update got to a lot of PCs very quickly. Microsoft's own Cloud services are not involved.
lack of update control / big updates the cover lot (Score:2)
Lack of update control, with big updates that cover lots of services in one installer, is another issue.
Linux systems are set up with more control, and each service has its own updates.
Re: (Score:2)
Aside from the fact that the issue isn't cloud related as much as it was a Crowdstrike update, the question is do you shoot someone in the head or give them death by a 1000 papercuts.
The entire world being down for a couple of hours is not really any different from an end user point than each individual customer being down at a different time. The other issue with clouds is that many smaller providers were *worse* at this than major ones running the cloud. The latter makes the news, the former was largely u
Re: (Score:2)
What possible harm can come from having half the world using cloud services from a tiny number of cloud providers?
And yet, no one using those services will learn a thing yet again.
Re: (Score:2)
Obviously. People would have to realize they have been stupid. Many will rather die than admit that.
Re: (Score:2)
What possible harm can come from having half the world using cloud services from a tiny number of cloud providers?
What benefit could there be? Well for starters Microsoft's cloud was back up and running within short order.
Windows 365 deployments from Azure were back up and running within short order.
It seems the only people who have lasting effects right now are those people who had Crowdstrike services installed on non cloud provisioned systems. I.e. they would have been better off* running their entire OS in the cloud.
The asterisk is because anyone who has ever used Windows 365 knows it's painfully slow to use a remo
Re: (Score:3)
Except the cloud stuff was painful, but fixable (for us). The REAL nightmare is the local stuff, specifically laptops and machines that have BitLocker installed. The current fix for that is hands-on with the BitLocker recovery key and admin rights.
I'm seeing forum posts of companies with globally dispersed teams with 5,000+ laptops that are just staring into the abyss.
Re: (Score:2)
Let's hope so. Too many people are far too irrational about the cloud and see it as a kind of thing that fixes all problems.
Crowdstrike (Score:2, Informative)
Re: (Score:2)
Give us a perspective of what "Single Point of Failure" means in an internet-connected world.
Re: (Score:2)
^This^
Re: (Score:3)
It actually is a nice combination of the defects of Crowdstrike and the defects of Microsoft Windows. If MS Windows was not such an insecure PoS, something like Crowdstrike would not be needed and would not have to sit deep in the system internals.
Re: (Score:3)
Furthermore, Windows should not be so fragile. That Windows doesn't trap internal errors in subcomponents and handle them gracefully is a Windows defect.
Re: (Score:3)
Indeed. And it is a gross architecture and design error on top of that. Obviously, the blind MS fanbois are now screaming "Cloudstrike!", because these people tipped the house of cards over. In actual reality, most of the blame lies with MS.
Re: (Score:2)
While I'm not going to argue against your point, installing software that modifies the kernel is introducing fragility.
Sure. That is why you should avoid these or keep them to a minimum. On Windows, it seems that is risky and you either get raw Windows (which is insecure and unreliable) or Windows with 3rd party in-kernel elements (which is insecure and unreliable). The actual real problem here is the choice of OS.
Re: (Score:2)
Sorry but horseshit. Linux, Unix, MacOS are not immune from bugs, they are not magic OSes that require no remote management. It is clear you have never managed more than the 2 PCs in your house and thus have zero concept of what these software packages do, or what it means to run management software which has administrative privileges on your OS, or what damage that kind of software can do to *ANY* OS.
Back in 2011... (Score:2)
Let's migrate to cloud!
Re: (Score:2)
And yet the cloud is out.
It has everything to do with cloud. The cloud is out of your control. I didn't even know about this outage and my work day was not affected.
Re: Back in 2011... (Score:2)
Were you even born in 2011?
Wait, who cares?!
*goes back to enjoying retirement*
Coast guard... (Score:5, Funny)
So Crowdstrike protects our computers from viruses. But who protects us from Crowdstrike?
Re:Coast guard... (Score:5, Funny)
While BSODed, your computer is 100% safe from any remote threat. That's a heck of a security record if you ask me.
Oh for points (Score:2)
Such insight, such humour... :)
Re: Coast guard... (Score:3)
Thanks to the Intel Management Engine, I doubt that that's the case.
Re: (Score:2)
Not Microsoft.
Yup. (Score:2, Insightful)
Blame Cloudstrike and not Microsoft (Score:3, Informative)
Meanwhile, the world continues to marvel at the irony of a cybersecurity company taking down the systems around the world due to a botched update.
Re: (Score:2)
This is one reason I think there should be obligatory quality controls on commercial software. The risks are too high.
Re: (Score:3)
The problem is that security software vendors like Crowdstrike often need to deliver things fast. Sure, they do need quality control, but the blame here is mostly on MS for making things like Crowdstrike needed in the first place. AV and other real-time protection cannot be 100% reliable. Hence you should actually run an OS and application landscape that does not need them.
Re:Blame Cloudstrike and not Microsoft (Score:5, Insightful)
Obviously blame Microsoft. They are 80% at fault here by delivering an OS so insecure that things like Crowdstrike are needed to get it to an acceptable level.
Re: (Score:3)
can't even spell CrowdStrike properly
Yes, let me help you:
C - L - O - W - N - S - T - R - I - K - E
You're welcome.
Problems with Microsoft 365 apps (Score:5, Funny)
Soon to be renamed, "Microsoft 364" ... :-)
Re: (Score:3)
Yes, when you combine Microsoft Windows and the CrowdStrike Falcon agent you get Office 364. Should be cheaper... apparently :)
Re:Problems with Microsoft 365 apps (Score:4, Insightful)
Call it Microsoft 360+. That way you get cross-brand recognition with XBox 360, and everyone knows that a + is great marketing.
Even in Europa simple tasks have got impossible (Score:2)
People cannot even open their rental cars because the app cannot establish a connection to the cloud. Stupid idea to make everything dependent on the cloud of a single company. Where are the good old keys?
Re:Even in Europa simple tasks have got impossible (Score:4, Insightful)
If an app doesn't work it doesn't work, the root cause to the user is irrelevant.
Please explain how this bad update would prevent a pair of standard car keys from working?
[crickets]
People who think like you are part of the problem.
Re: (Score:3)
As soon as you run software you didn't write yourself, you've chosen to make yourself dependent on external forces beyond your control. Everyone still does it, because it's impractical not to. Nobody writes their own endpoint protection software.
The problem isn't cloud, it's code quality. Most software companies place more importance on pushing new features quickly than doing proper QA. I imagine this mentality is even worse in the security space, where providing protection from emerging threats is the enti
Re: Even in Europa simple tasks have got impossibl (Score:2)
From what I understand, people aren't disagreeing with your insight on identifying root causes in the software development process -- they're saying car functionality shouldn't be dependent on network access. The OP comment mentioned some cars being unable to unlock at all. It's this coupling of unrelated functions that exponentially increases risk by introducing interdependence.
Re: (Score:2)
The problem affecting most people is the cloud systems using the app. To most people, it might as well be that particular cloud, but it won't just be Azure, it's all systems running Windows "protected" by CrowdStrike.
So yeah, blaming the/a cloud is a stretch. Blaming a cloud for relying on an app with no boundaries set, is IMNSHO fair. If you use the product, or connect to a system that uses it, you likely had a world of hurt today. If you are responsible for your system uptime, and signed off on softwa
Re: (Score:3)
The problem with a keyless solution is multilayered. You are dependent on:
- the mobile phone working
- the mobile phone being charged (is there a power-connection nearby?)
- the app on the mobile phone being up2date and working
- the authentication working (finger-print readable?)
- the internet connection being available
- the cloud-service being accessible
- the CrowdStrike software used by the vendor
- the servers and software used by the vendor
- the bluetooth-receiver in the car functioning too with all ist
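The chain above can be put into rough numbers with a back-of-the-envelope sketch. Everything here is an assumption: the 99% figure per link is invented, the component labels just mirror the list, and treating the links as independent is a simplification. When every link has to work at once, the availabilities multiply.

```python
from math import prod


def chain_availability(availabilities):
    """Availability of a serial chain: every link must work, so the values multiply."""
    return prod(availabilities)


# Hypothetical: each of the nine links in the list works 99% of the time.
keyless_links = {
    "phone working": 0.99,
    "phone charged": 0.99,
    "app up to date": 0.99,
    "authentication": 0.99,
    "internet connection": 0.99,
    "cloud service": 0.99,
    "vendor endpoint protection": 0.99,
    "vendor servers and software": 0.99,
    "bluetooth receiver in the car": 0.99,
}
keyless = chain_availability(keyless_links.values())
physical_key = chain_availability([0.99])  # a single link with the same assumed reliability

print(f"keyless: {keyless:.4f}, physical key: {physical_key:.4f}")
```

With these made-up numbers the keyless chain works only about 91% of the time, against 99% for the single-link key, and each additional dependency pushes the figure lower.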
Re: (Score:3)
People cannot even open their rental cars because the app cannot establish a connection to the cloud. Stupid idea to make everything dependent on the cloud of a single company. Where are the good old keys?
I'd say it was their fault for hiring a car where you cannot get a physical key.
Microsoft Azure down (Score:2)
Re: (Score:2)
Sounds like cheaper than possible systems administration and cheaper than possible or completely missing redundancy. Typical MS "trash" level quality. Why anybody depends on them is beyond me.
Re: (Score:2)
Indeed. That is where the "cheaper than possible" systems administration comes in. Obviously, you apply to one, then give it a week or so and see whether things continue to work. Also obviously, that has not been done here.
Re: (Score:2)
It's beginning to look like complexity of the Microsoft ecosystem is reaching a breaking point. The idea of infinite growth of the codebase/complexity may be at a logical inflection point or concl
I don't get it (Score:3)
Re:I don't get it (Score:5, Informative)
Easy. It's a 3rd party application with escalated privileges, intended to detect and fight off cyber security threats. An admin allowed it to be hooked into more privileged layers of the OS, because it needs those privileges to perform some magic that Excel/Outlook/Teams/Word doesn't need.
This is the second time I've seen CS cause issues... last time it was just burning CPU on a Linux box.
Re: (Score:3)
OK, but why isn't the OS core trapping errors correctly? It's written in C++ (so there's exception handling), and we can assume that all return codes are checked, right?.....
Re: (Score:2)
A lot of the lower parts of the OS are written in C or Assembly (though that is probably a smaller part). Other parts are in Rust and C# (which, to your point, should also be trapping exceptions).
You would hope that things are checked and trapped, but for the level that Crowdstrike is running at, it's one of those "do things at your own risk, you may crash the whole computer". Sounds like someone didn't check something assuming that "there is no way this will never not succeed." Would love to see a po
Re:I don't get it (Score:5, Insightful)
Bad design. CrowdStrike runs with kernel privileges.
The fact that CrowdStrike exists at all is a symptom of bad Windows design.
Re: (Score:3)
Exactly. Bad design by MS that is. First, you should not need anything like Crowdstrike. And second, if you need it, then it should not need to run so that it can crash the kernel.
Re: (Score:2)
Trivially. You can fuck up any OS when you have an application with privileges required to change system settings. I had an official script from Maverik which kernel panicked my Linux machine back in the day as it attempted to force install the wrong RAID module in the kernel and set it to force load on boot.
If you can't fuck up your system with administrative privileges, then you're not really an administrator. :-)
How many times does it have to be said? (Score:2, Insightful)
THERE IS NO CLOUD. There is only somebody else's computers. When you use "the cloud", you are putting information vital to the business you built with your blood, sweat and tears under the care and control of somebody else. When push comes to shove, that person or corporation cares about your business exactly as much as whatever penalty a court might impose if something were to go wrong.
Other peoples' computers might be great for backup. Running your business from them "because savings" is rolling the d
Re: (Score:2)
Many more times. Because far too many people are not listening and are actively in denial.
One basket + mountain of eggs = problems! (Score:2, Interesting)
And a rather pathetic basket at that. Good old MicroCrap at work. I doubt the fanbois that mate their trash great will realize anything though.
Re: (Score:2)
1. Clearly reading /. noise has led you down the wrong path - this is NOT A MICROSOFT ISSUE. Sorry for yelling but your hearing is apparently impaired
2. Please - PLEASE - stop butchering the English language... please.
Re:One basket + mountain of eggs = problems! (Score:4, Insightful)
This is a Microsoft issue. Requires a few working brain cells to see that though. MS set this situation up. If MS were doing decent engineering, something like Crowdstrike would not even be needed. Instead, their crap is so bad that you need to embed 3rd party products deep within the OS to get it to an acceptable level.
So stop yelling, you are looking stupid.
Re: (Score:2)
If MS were doing decent engineering, something like Crowdstrike would not even be needed.
1. NO amount of "engineering" will fix stupid users. Ever.
2. Linux had kernel panic issues with CS not so long ago.... so maybe the OS isn't the issue?
Re: (Score:3)
Extremely bad design by Microsoft means tools like CrowdStrike are needed. This is absolutely a Microsoft issue, just not a direct one.
Re: (Score:3)
1. It is Microsoft who designed an update system which automatically installs updates without system administrator/computer owner confirmation
2. It is Microsoft who so misdesigned the security model of the operating system that a whole class of antivirus software had to be invented.
Re: (Score:3)
1. Clearly reading /. noise has led you down the wrong path - this is NOT A MICROSOFT ISSUE. Sorry for yelling but your hearing is apparently impaired
2. Please - PLEASE - stop butchering the English language... please.
Actually, the Crowdstrike issue was one of two for the day.
The *other* issue they had affected one of my systems caused by inaccessible storage for an Azure SQL service. At that moment, an app of ours mysteriously stopped working and the logs showed entries that boiled down to "timeout waiting to connect". Fire up Azure portal, sure enough, one of the three DBs we have was offline - strange. Then two. Then three. Then one. Round and round we go. Fire up SSMS, no connection. Finally the notice from Azure abo
Re: (Score:2)
The *other* issue they had affected one of my systems caused by inaccessible storage for an Azure SQL service.
Acknowledged... Admittedly few services on our side hit because our geo-redundancy is properly configured and tested quarterly...
N.Europe and W.Europe all good.
The CS noise has had far wider impact.
Time for Professional License Requirements (Score:2)
Re: (Score:2)
Indeed. But people are stupid and do not want to listen. Evidence: How your completely true posting got modded down to -1.
Re: (Score:3)
... into the cloud that you can't get back out of it within 2 hours.
That's one of the rules I live by as an IT expert. See TFA for one of the reasons why.
Forget TFA. I just checked "the cloud's" status page and saw that it was back up and running before you even got out of bed. As an IT expert if you ran your own Windows systems managed by crowdstrike rather than running them in the cloud, you'll still be busy now recovering from the problem while the cloud users were on their merry way.
It helps to understand what the problem is before knee jerking a comment like yours. Then you'd realise that in this case the cloud provider offered the fastest possible reac