Microsoft IT

Microsoft Outage Hits Users Worldwide, Leading To Canceled Flights (wsj.com) 168

Microsoft grappled with a major service outage, leaving users across the world unable to access its cloud computing platforms and causing airlines to cancel flights. From a report: Thousands of users across the world reported problems with Microsoft 365 apps and services to Downdetector.com, a website that tracks service disruptions. "We're investigating an issue impacting users' ability to access various Microsoft 365 apps and services," Microsoft 365 Status said on X early Friday. On its status page for Azure, Microsoft's cloud computing platform, the company said the issue began just before 10 p.m. ET Thursday, affecting systems across the central U.S. In an update, Microsoft said it had determined the cause and was working to restore access to its users.

  • by alavaliant ( 1002928 ) on Friday July 19, 2024 @02:19AM (#64636805)
    I wonder if this relates to the problem today with Crowdstrike releasing a bad update that caused Windows computers worldwide to bluescreen? I have no idea if Microsoft makes any internal use of Crowdstrike. But the timing is close enough to make it plausible...
    • by Kokuyo ( 549451 ) on Friday July 19, 2024 @03:06AM (#64636863) Journal

      Wouldn't be surprised.

      Also wouldn't be surprised if nobody cared. When a small service provider has an issue for half an hour, customers pop a vein but when something like this happens, everybody just shrugs...

      • by AmiMoJo ( 196126 )

        It's a difficult situation for anti-virus vendors. On the one hand everyone wants the latest updates as soon as they are released, to protect from zero day malware. On the other hand the vendor can't realistically check every possible configuration for issues, and the usual way of handling that is to do a slow roll-out and stop if you get reports of crashing in the first few hours/days.

        Vulnerable to zero days, or vulnerable to bugs in the AV software, pick your poison.
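
        A minimal sketch of the slow roll-out described above, written in Python. Everything here is an illustrative assumption rather than any vendor's actual process: the function name `staged_rollout`, the `apply_update` callback, the wave sizes and the 2% failure threshold are all hypothetical.

```python
from typing import Callable, Dict, List, Sequence


def staged_rollout(
    hosts: Sequence[str],
    apply_update: Callable[[str], bool],
    stage_fractions: Sequence[float] = (0.01, 0.10, 0.50, 1.0),
    max_failure_rate: float = 0.02,
) -> Dict[str, object]:
    """Push an update in growing waves and halt when too many hosts fail.

    apply_update(host) is a hypothetical callback that returns True when the
    host is healthy after the update and False when it crashed.
    """
    updated: List[str] = []
    failed: List[str] = []
    done = 0  # number of hosts that have received the update so far
    for fraction in stage_fractions:
        # Each wave extends coverage to `fraction` of the fleet (at least one new host).
        target = min(len(hosts), max(done + 1, int(len(hosts) * fraction)))
        wave = hosts[done:target]
        wave_failures = 0
        for host in wave:
            if apply_update(host):
                updated.append(host)
            else:
                failed.append(host)
                wave_failures += 1
        done = target
        # Stop before the next, larger wave if this one crashed too often.
        if wave and wave_failures / len(wave) > max_failure_rate:
            return {"halted": True, "updated": updated, "failed": failed,
                    "untouched": list(hosts[done:])}
        if done == len(hosts):
            break
    return {"halted": False, "updated": updated, "failed": failed,
            "untouched": list(hosts[done:])}
```

        With 200 hosts and an update that crashes every machine, this sketch stops after the first wave of two hosts and leaves the other 198 untouched. The trade-off from the comment is visible in the parameters: smaller early waves limit the damage from a bad update but delay protection for everyone else.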

        • Re: (Score:2, Insightful)

          by gweihir ( 88907 )

          Indeed. The fundamental problem is a different one though: The crappy security level of MS Windows. On operating systems with decent security, you do not even need AV. AV is a crutch, nothing else, a band-aid on the excessively bad situation with Windows security. Hence while a vendor in this space triggered the problem, the majority of the blame lies entirely with Microsoft.

          • by Mushur ( 870120 )

            Which decently secure OS are you thinking about? Linux also gets malware ...
            Either OSes are tiny and practically irrelevant (e.g. Redox OS), or big and therefore vulnerable.
            The in-betweens like FreeBSD are just not popular enough to warrant lots of malware, but that doesn't prove that they are secure.

            If most corporate (human-facing) devices ran MacOS or some BSD, there would also be magnitudes more malware for them.

            • If most corporate (human-facing) devices ran MacOS or some BSD, there would also be magnitudes more malware for them.

              The most valuable data isn't on human-facing devices. The most impactful disruptions aren't to human-facing devices. In both cases the most valuable targets are servers. Linux is very well represented there. Yet there are still many, many more serious exploits for Windows than there are for Linux. Why is that?

              • by gweihir ( 88907 )

                Simple: Because writing malware for Linux is orders of magnitude harder and has a much lower success rate. But the Windows fanbois cannot understand that. Hence they try one invalid apology for Microsoft's crappy products after the other.

      • Reminds me of the Keynes quote, "Worldly wisdom teaches that it is better for reputation to fail conventionally than to succeed unconventionally."
    • Re: (Score:3, Informative)

      by stereoroid ( 234317 )

      The problem is Crowdstrike on Windows causing a BSOD. Microsoft's own services are not part of the problem.

      • Agreed that this wasn't instigated by Microsoft, but they're getting (some of) the blame - I think reasonably so.

        Why should we 'blame' Microsoft?

        - It doesn't affect Linux, Mac, Android, iOS, etc - so it's a "Microsoft specific problem" (as opposed to an Oracle problem, a Slack problem or whatever)
        - BSODs are pretty low level stuff - there's room to say that such things shouldn't happen, even if a virus scanner does an update

        I suspect the update said "your system needs to reboot", and that's where the proble

    • by bloodhawk ( 813939 ) on Friday July 19, 2024 @03:13AM (#64636883)
      yeah a lot of poorly informed news places ran with saying it is a MS problem before realising crowdstrike isn't a MS product.
      • by Meetch ( 756616 )

        I get that this software is supposed to save us from emerging threats, and that those threats come thick and fast. However, I think the biggest issue has been blindly allowing the CrowdStrike agent to update everywhere without vetting it on some test servers (or perhaps one or two servers in each server farm) first for a day or two. Live and learn... Oh, if they're on the stock exchange and you have shares, it's gotta be way too late to start selling them now.
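
        As a rough illustration of the "one or two servers in each server farm" idea, here is a small Python sketch that picks canary machines per farm. The function `pick_canaries`, the host names and the farm labels are made up for the example; how the update is then held back for a day or two on the remaining servers is outside its scope.

```python
import random
from collections import defaultdict
from typing import Dict, List, Optional


def pick_canaries(
    server_farms: Dict[str, str],
    per_farm: int = 2,
    seed: Optional[int] = None,
) -> Dict[str, List[str]]:
    """Choose up to `per_farm` servers from every farm to receive an update first.

    server_farms maps a server name to the farm it belongs to (hypothetical data).
    """
    rng = random.Random(seed)
    by_farm: Dict[str, List[str]] = defaultdict(list)
    for server, farm in server_farms.items():
        by_farm[farm].append(server)
    # Sample without replacement; small farms contribute whatever they have.
    return {
        farm: sorted(rng.sample(members, min(per_farm, len(members))))
        for farm, members in sorted(by_farm.items())
    }


if __name__ == "__main__":
    inventory = {"web1": "farm-a", "web2": "farm-a", "web3": "farm-a", "db1": "farm-b"}
    print(pick_canaries(inventory, per_farm=2, seed=1))
```

        Only the selected machines would get the new agent version at first; the rest follow once the canaries have stayed healthy for the chosen soak period.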

        • by Bongo ( 13261 )

          It's highly invasive, and all software has bugs, and odds are that one day, a particular bug will break a lot of things all at once, predictably.

          Why we collectively disregard this, is a bit of a question about human nature.

    • by slack_justyb ( 862874 ) on Friday July 19, 2024 @03:31AM (#64636917)

      Yes. It is. [theverge.com]

      They've released a workaround for the matter.

      * Boot into safe mode or the Windows Recovery Environment.

      * Navigate to C:\Windows\System32\drivers\CrowdStrike

      * Locate the file that matches "C-00000291*.sys" for your machine. Delete that file.

      * Reboot machine

      Directions from the TA posted on the matter by Crowdstrike [crowdstrike.com]

      Someone from Twitter posting a screenshot of the TA [x.com]
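
      For anyone scripting the file-deletion step above from a recovery environment that has Python available (an assumption; many recovery environments will not), a hedged sketch follows. The directory and the `C-00000291*.sys` pattern come from the steps above; the function names and the dry-run default are illustrative choices, not part of the TA.

```python
from pathlib import Path
from typing import List


def find_matching_files(driver_dir: Path, pattern: str = "C-00000291*.sys") -> List[Path]:
    """Return the files in driver_dir whose names match the workaround pattern."""
    return sorted(p for p in driver_dir.glob(pattern) if p.is_file())


def remove_matching_files(driver_dir: Path, dry_run: bool = True) -> List[Path]:
    """Delete the matching files, or only report them when dry_run is True."""
    matches = find_matching_files(driver_dir)
    for path in matches:
        if dry_run:
            print(f"would delete {path}")
        else:
            path.unlink()
            print(f"deleted {path}")
    return matches
```

      Calling `remove_matching_files(Path(r"C:\Windows\System32\drivers\CrowdStrike"))` only lists what would be removed; pass `dry_run=False` to actually delete, then reboot as in the last step.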

      I would say someone is getting fired, but someone is going to be answering to Congress soon enough.

  • by jd ( 1658 ) <<moc.oohay> <ta> <kapimi>> on Friday July 19, 2024 @02:23AM (#64636817) Homepage Journal

    They're giving a rolling update here: https://www.bbc.co.uk/news/liv... [bbc.co.uk]

  • by psic ( 776337 ) on Friday July 19, 2024 @02:25AM (#64636821)
    I think someone just needs to reboot the internet and we'll be good.
  • by bradley13 ( 1118935 ) on Friday July 19, 2024 @02:29AM (#64636825) Homepage

    What possible harm can come from having half the world using cloud services from a tiny number of cloud providers?

    I've used them too, but nowadays the stuff I manage is either at a local provider, or on my own hardware. Smaller providers tend to be a bit more expensive, but they also value your business more than AWS, MS, & Co..

    • Your local provider likely doesn't have the manpower for proper security.

      • by ls671 ( 1122017 )

        I don't think proper security requires that much "manpower", at least in terms of the number of people involved, although a few people, or even one person, with brains are definitely needed. Proper coordination and oversight are also of primary importance, so that one hand doesn't open security holes the other hand doesn't know about; the bigger the team and the more people involved, the more likely that is to happen.

      • by Bert64 ( 520050 )

        Keep it simple...

        Keeping a small environment secure is much easier than a large sprawling mess. The more complex a system is, the greater the chance that someone has fucked up somewhere and left a vulnerability open.

        • by gweihir ( 88907 )

          Indeed. And that is why "software engineering" and "IT systems engineering" routinely does not qualify as "engineering" these days. Proper engineering always respects and follows KISS.

      • by MeNeXT ( 200840 )

        This is such a stupid statement. It's like saying you can't protect your house unless you are a multinational corporation.

        • Do you have state-sponsored actors trying to break into your house, like you might get with any system connected to the internet? If so then yeah I'd agree that you might well need some multinational corporation-level resources to protect your home.
      • by gweihir ( 88907 )

        Probably. Why are they using Windows then? Windows does not come with reasonable security and that is well known. If you do not have the manpower to fix it, then do not use a "solution" that is defective by design.

    • by pr100 ( 653298 )

      A smaller provider is as likely to have an outage as one of the major cloud providers' data centres. The overall impact worldwide will be smaller, but if that's where your stuff is hosted then you don't really care about that.

      • by ls671 ( 1122017 )

        Not if the local provider has redundancy and replication in several data centers geolocated far enough from each other. I have that and I am very small. I only rent bare metal servers.

      • by gweihir ( 88907 )

        Actually, you do. If that small provider has problems, you will be able to get help and consulting. If a major provider has issues, the market for that will be empty. And that is one of the reasons why tons of eggs in a single or small number of baskets is such a bad idea.

    • Today the problem is a CrowdStrike update causing a BSOD. It's only a "Cloud" problem in that the update got to a lot of PCs very quickly. Microsoft's own Cloud services are not involved.

    • Aside from the fact that the issue isn't cloud related as much as it was a Crowdstrike update, the question is whether you shoot someone in the head or give them death by 1000 papercuts.

      The entire world being down for a couple of hours is not really any different from an end user's point of view than each individual customer being down at a different time. The other issue with clouds is that many smaller providers were *worse* at this than major ones running the cloud. The latter makes the news, the former was largely u

    • What possible harm can come from having half the world using cloud services from a tiny number of cloud providers?

      And yet, no one using those services will learn a thing yet again.

      • by gweihir ( 88907 )

        Obviously. People would have to realize they have been stupid. Many will rather die than admit that.

    • What possible harm can come from having half the world using cloud services from a tiny number of cloud providers?

      What benefit could there be? Well for starters Microsoft's cloud was back up and running within short order.
      Windows 365 deployments from Azure were back up and running within short order.

      It seems the only people who have lasting effects right now are those people who had Crowdstrike services installed on non cloud provisioned systems. I.e. they would have been better off* running their entire OS in the cloud.

      The asterisk is because anyone who has ever used Windows 365 knows it's painfully slow to use a remo

    • by chill ( 34294 )

      Except the cloud stuff was painful, but fixable (for us). The REAL nightmare is the local stuff, specifically laptops and machines that have BitLocker installed. The current fix for that is hands-on with the BitLocker recovery key and admin rights.

      I'm seeing forum posts of companies with globally dispersed teams with 5,000+ laptops that are just staring into the abyss.

  • It appears to be Crowdstrike related rather than specifically Microsoft.
    • by khchung ( 462899 )

      Give us a perspective of what "Single Point of Failure" means in an internet-connected world.

    • by gweihir ( 88907 )

      It actually is a nice combination of the defects of Crowdstrike and the defects of Microsoft Windows. If MS Windows was not such an insecure PoS, something like Crowdstrike would not be needed and would not have to sit deep in the system internals.

      • by jd ( 1658 )

        Furthermore, Windows should not be so fragile. That Windows doesn't trap internal errors in subcomponents and handle them gracefully is a Windows defect.

        • by gweihir ( 88907 )

          Indeed. And it is a gross architecture and design error on top of that. Obviously, the blind MS fanbois are now screaming "Crowdstrike!", because these people tipped the house of cards over. In actual reality, most of the blame lies with MS.

      • Sorry but horseshit. Linux, Unix, MacOS are not immune from bugs, they are not magic OSes that require no remote management. It is clear you have never managed more than the 2 PCs in your house and thus have zero concept of what these software packages do, or what it means to run management software which has administrative privileges on your OS, or what damage that kind of software can do to *ANY* OS.

  • Let's migrate to cloud!

    • and? this has nothing to do with a cloud outage. this is crowdstrike running on local machines.
      • by MeNeXT ( 200840 )

        And yet the cloud is out.

        It has everything to do with cloud. The cloud is out of your control. I didn't even know about this outage and my work day was not affected.

      • Were you even born in 2011?

        Wait, who cares?!

        *goes back to enjoying retirement*

  • by Biotech9 ( 704202 ) on Friday July 19, 2024 @03:04AM (#64636861) Homepage

    So Crowdstrike protects our computers from viruses. But who protects us from Crowdstrike?

  • Yup. (Score:2, Insightful)

    That's what you get for using Microsoft.
  • by knwny ( 2940129 ) on Friday July 19, 2024 @03:09AM (#64636871)
    As reported in multiple places, including here [theregister.com] and here, [cyberdaily.au] this seems to be due to an issue with Crowdstrike which is causing BSODs in Windows machines. The Tech Alert posted on the Crowdstrike website [crowdstrike.com] (accessible only if you have a Crowdstrike login) provides a workaround which involves deleting files matching “C-00000291*.sys” from the Windows system directory.

    Meanwhile, the world continues to marvel at the irony of a cybersecurity company taking down the systems around the world due to a botched update.

    • by jd ( 1658 )

      This is one reason I think there should be obligatory quality controls on commercial software. The risks are too high.

      • by gweihir ( 88907 )

        The problem is that security software vendors like Crowdstrike often need to deliver things fast. Sure, they do need quality control, but the blame here is mostly on MS for making things like Crowdstrike needed in the first place. AV and other real-time protection cannot be 100% reliable. Hence you should actually run an OS and application landscape that does not need them.

    • by gweihir ( 88907 ) on Friday July 19, 2024 @07:23AM (#64637269)

      Obviously blame Microsoft. They are 80% at fault here by delivering an OS so insecure that things like Crowdstrike are needed to get it to an acceptable level.

  • by fahrbot-bot ( 874524 ) on Friday July 19, 2024 @03:12AM (#64636881)

    Soon to be renamed, "Microsoft 364" ... :-)

  • The people cannot even open their rental cars because the app cannot establish a connection to the cloud. Stupid idea to make everything dependent on the cloud of a single company. Where are the good old keys?

    • It is NOT because they are dependent on the cloud. Crowdstrike is security software installed on the local PCs. Crowdstrike sent out an update around the world that broke their product and BSODed machines. Nothing to do with cloud, except that Crowdstrike's servers that sent the patches were probably cloud based. Whether you use cloud or not, if you had Crowdstrike you are fucked.
      • by Viol8 ( 599362 ) on Friday July 19, 2024 @03:29AM (#64636915) Homepage

        If an app doesn't work it doesn't work, the root cause to the user is irrelevant.

        Please explain how this bad update would prevent a pair of standard car keys from working?

        [crickets]

        People who think like you are part of the problem.

      • by Meetch ( 756616 )

        The problem affecting most people is the cloud systems using the app. To most people, it might as well be that particular cloud, but it won't just be Azure, it's all systems running Windows "protected" by CrowdStrike.

        So yeah, blaming the/a cloud is a stretch. Blaming a cloud for relying on an app with no boundaries set, is IMNSHO fair. If you use the product, or connect to a system that uses it, you likely had a world of hurt today. If you are responsible for your system uptime, and signed off on softwa

      • by MS ( 18681 )

        The problem with a keyless-solution is multilayered. You are dependent on:
        - the mobile phone working
        - the mobile phone being charged (is there a power-connection nearby?)
        - the app on the mobile phone being up2date and working
        - the authentication working (finger-print readable?)
        - the internet connection being available
        - the cloud-service being accessible
        - the crowdstrike-software used by the vendor
        - the servers and software used by the vendor
        - the bluetooth-receiver in the car functioning too with all ist

    • by mjwx ( 966435 )

      The people cannot even open their rental cars because the app cannot establish a connection to the cloud. Stupid idea to make everything dependent on the cloud of a single company. Where are the good old keys?

      I'd say it was their fault for hiring a car where you cannot get a physical key.

  • Microsoft are reporting a misconfiguration in Azure that caused compute resources to lose access to Azure Storage [azure.com] (link may require an Azure subscription). This has caused a large number of Azure-based systems to fall over and MS are picking up the pieces now.
    • by gweihir ( 88907 )

      Sounds like cheaper than possible systems administration and cheaper than possible or completely missing redundancy. Typical MS "trash" level quality. Why anybody depends on them is beyond me.

  • by should_be_linear ( 779431 ) on Friday July 19, 2024 @04:07AM (#64636945)
    How can a 3rd party application bring Windows to BSOD?
    • Re:I don't get it (Score:5, Informative)

      by Meetch ( 756616 ) on Friday July 19, 2024 @04:22AM (#64636961)

      Easy. It's a 3rd party application with escalated privileges, intended to detect and fight off cyber security threats. An admin allowed it to be hooked into more privileged layers of the OS, because it needs those privileges to perform some magic that Excel/Outlook/Teams/Word doesn't need.

      This is the second time I've seen CS cause issues... last time it was just burning CPU on a Linux box.

      • by jd ( 1658 )

        OK, but why isn't the OS core trapping errors correctly? It's written in C++ (so there's exception handling), and we can assume that all return codes are checked, right?.....

        • by Sabalon ( 1684 )

          A lot of the lower parts of the OS are written in C or Assembly (though that is probably a smaller part). Other parts are in Rust and C#. (which to your point should also be trapping exceptions)

          You would hope that things are checked and trapped, but for the level that Crowdstrike is running at, it's one of those "do things at your own risk, you may crash the whole computer". Sounds like someone didn't check something assuming that "there is no way this will ever not succeed." Would love to see a po

    • Re:I don't get it (Score:5, Insightful)

      by 1s44c ( 552956 ) on Friday July 19, 2024 @06:18AM (#64637141)

      Bad design. CrowdStrike runs with kernel privileges.

      The fact that CrowdStrike exists at all is a symptom of bad Windows design.

      • by gweihir ( 88907 )

        Exactly. Bad design by MS that is. First, you should not need anything like Crowdstrike. And second, if you need it, then it should not need to run so that it can crash the kernel.

    • Trivially. You can fuck up any OS when you have an application with privileges required to change system settings. I had an official script from Maverik which kernel panicked my Linux machine back in the day as it attempted to force install the wrong RAID module in the kernel and set it to force load on boot.

      If you can't fuck up your system with administrative privileges, then you're not really an administrator. :-)

  • THERE IS NO CLOUD. There is only somebody else's computers. When you use "the cloud", you are putting information vital to the business you built with your blood, sweat and tears under the care and control of somebody else. When push comes to shove, that person or corporation cares about your business exactly as much as whatever penalty a court might impose if something were to go wrong.

    Other people's computers might be great for backup. Running your business from them "because savings" is rolling the d

    • by gweihir ( 88907 )

      Many more times. Because far too many people are not listening and are actively in denial.

  • And a rather pathetic basket at that. Good old MicroCrap at work. I doubt the fanbois that mate their trash great will realize anything though.

    • 1. Clearly reading /. noise has led you down the wrong path - this is NOT A MICROSOFT ISSUE. Sorry for yelling but your hearing is apparently impaired

      2. Please - PLEASE - stop butchering the English language... please.

      • by gweihir ( 88907 ) on Friday July 19, 2024 @06:17AM (#64637139)

        This is a Microsoft issue. Requires a few working brain cells to see that though. MS set this situation up. If MS were doing decent engineering, something like Crowdstrike would not even be needed. Instead, their crap is so bad that you need to embed 3rd party products deep within the OS to get it to an acceptable level.

        So stop yelling, you are looking stupid.

        • If MS were doing decent engineering, something like Crowdstrike would not even be needed.

          1. NO amount of "engineering" will fix stupid users. Ever.

          2. Linux had kernel panic issues with CS not so long ago.... so maybe the OS isn't the issue?

      • by 1s44c ( 552956 )

        Extremely bad design by Microsoft means tools like CrowdStrike are needed. This is absolutely a Microsoft issue, just not a direct one.

      • 1. It is Microsoft who designed an update system which automatically installs updates without system administrator/computer owner confirmation
        2. It is Microsoft who so misdesigned the security model of the operating system that a whole class of antivirus software had to be invented.

      • 1. Clearly reading /. noise has led you down the wrong path - this is NOT A MICROSOFT ISSUE. Sorry for yelling but your hearing is apparently impaired

        2. Please - PLEASE - stop butchering the English language... please.

        Actually, the Crowdstrike issue was one of two for the day.

        The *other* issue they had affected one of my systems caused by inaccessible storage for an Azure SQL service. At the moment, an app of ours mysteriously stopped working and the logs showed entries that boiled down to "timeout waiting to connect". Fire up Azure portal, sure enough, one of the three DBs we have was offline - strange. Then two. Then three. Then one. Round and round we go. Fire up SSMS, no connection. Finally the notice from Azure abo

        • The *other* issue they had affected one of my systems caused by inaccessible storage for an Azure SQL service.

          Acknowledged... Admittedly few services on our side hit because our geo-redundancy is properly configured and tested quarterly...

          N.Europe and W.Europe all good.

          The CS noise has had far wider impact.

  • This outage affected airlines, national infrastructure, stock markets, hospitals, etc. It's completely insane that the IT industry requires no professional licenses while the local barber has to meet more stringent requirements to cut my hair. Time for the industry to mature and set a minimum bar like Realtors did several decades ago. We can't be worse than Realtors, right?
