Cloud Bug

Dark Day In the AWS Cloud: Big Name Sites Go Down 182

Posted by timothy
from the central-authority-vs-resilience dept.
An outage of one company's servers might only affect that company's customers — but when a major data center for Amazon hits kinks, sites that rely on the AWS cloud services all suffer from the downtime. That's what happened today, when several major sites or online services (like Instagram and AirBnB) were knocked temporarily offline, evidently because of problems at an Amazon data center in Northern Virginia. From TechCrunch's coverage of the outage: "The deluge of tweets that accompanied the services’ initial hiccups first started at around 4 p.m. Eastern time, and only increased in intensity as users found they couldn’t share pictures of their food or their meticulously crafted video snippets. Some further poking around on Twitter and beyond revealed that some other services known to rely on AWS — Netflix, IFTTT, Heroku and Airbnb to name a few — have been experiencing similar issues today."

  • by bill_mcgonigle (4333) * on Sunday August 25, 2013 @08:07PM (#44672605) Homepage Journal

    I thought this might already exist, but I'm not finding it with a quick Google search. Seems like it's a thing that could get ad views from some decent IT audiences.

  • by Anonymous Coward

    When morons can't watch TV (or equivalent) they fuck. 9 months later you'll see a birth rate spike.

  • by JenovaSynthesis (528503) on Sunday August 25, 2013 @08:26PM (#44672699)

    That went down and I think it ate some files with it. Just before the crash my client reported 103 files being removed. They weren't by me.

  • by Guru80 (1579277) on Sunday August 25, 2013 @08:37PM (#44672779)
    I thought for sure the first comment would be "I'm on to you, NSA... downtime for service 'upgrades'." I'm disappointed in you, my tin-foil-hat-wearing brethren.
    • by AHuxley (892839)
      An interesting question is why the brands chose to stay in one part of the USA, given all the cheap power, skilled workers and tax breaks offered by other states.
      What keeps big data clinging to the Eastern USA?
      • by i.r.id10t (595143)

        population density, closeness to physical infrastructure, larger pool of qualified workers (maybe), etc.

  • by MillerHighLife21 (876240) on Sunday August 25, 2013 @08:40PM (#44672811) Homepage

    I've run servers on both Amazon and Rackspace for several years now and I can't recall a single instance of Rackspace having an outage. On the other hand, Amazon seems to have major issues at least 2 or 3 times a year. Is this stuff tracked anywhere?

    • by AHuxley (892839)
      That would make a good site: a historical, long-term heat map of server outages. There is a lot of tech press to search back through, and thankfully you can buy into digital press databases :)
    • by CritterNYC (190163) on Sunday August 25, 2013 @10:29PM (#44673289) Homepage
      It depends which data center you're in. PortableApps.com has been hosted at Rackspace for years and we had multiple major outages due to ongoing power issues in the Dallas data center in 2009. The switch from grid to UPS was failing and would take the whole wing of the data center out, with every server crashing hard. It would take quite a while to come back up. Then we'd have to wait hours for the Rackspace folks to rebuild our corrupted database (fully managed account on a dedicated server). It happened two weekends in a row in June and one other time if I recall correctly, basically costing us a full day of downtime each time.
      • I was a Slicehost user in the St. Louis data center and was then moved to Chicago after the Rackspace acquisition. Even so, there's never been so much as a blip from there in the last 5 years. It probably is data center dependent; I just don't remember ever hearing about anything.

        Friend of mine here in town owns a web business using about 9 Rackspace servers to host 700 websites and he said they hadn't had an outage in the last 8 years.

  • Were there any problems with Amazon.com? You'd assume they use their own service.
  • Isn't this why AWS offers multiple regions?

    Such large sites should understand that having multiple availability zones means nothing if the zones are all in the same region. Oh, and your application would need to be designed for failover.

    In addition, when looking for high-availability, you don't segregate your audience to individual regions. You let the working regions take over for you.

    Or spend the extra money and set up your own co-lo arrangement.
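
    The "let the working regions take over" idea above can be sketched roughly as follows. This is only a minimal illustration, not how any of the affected sites or AWS itself routes traffic: the function name `pick_region`, the region list, and the stand-in health check are all hypothetical, and a real deployment would more likely rely on DNS health checks or a load balancer than on application code like this.

```python
from typing import Callable, Optional, Sequence


def pick_region(
    regions: Sequence[str],
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Return the first region whose health check passes, or None if all fail."""
    for region in regions:
        if is_healthy(region):
            return region
    return None


# Hypothetical state for illustration: us-east-1 is down, the others are up.
down = {"us-east-1"}
regions = ["us-east-1", "us-west-2", "eu-west-1"]

# A stand-in health check; a real one would probe an endpoint in each region.
chosen = pick_region(regions, lambda region: region not in down)
print(chosen)
```

    The point of the sketch is only that the fallback list must span regions; listing several availability zones of one region would not have helped if the whole region is affected.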

  • by gweihir (88907) on Sunday August 25, 2013 @10:12PM (#44673231)

    That things like this will happen with a cloud infrastructure is obvious. That the reliability claims made by the cloud providers are fantasy is also obvious. As soon as they start to do "uptime or else" (meaning you get tons of money as downtime compensation), things may be different, but they will not do that. At this time, the only thing you can do is change to a different cloud provider, which will have the same issues. Uptime guarantees without penalties for failing to meet them are worthless.

    • We built a decentralized network called The Internet, even capable of withstanding global thermonuclear war -- packets rerouted moments after a city disappears from the mesh... And folks use data silos? Protip: Don't centralize services, that's daft in terms of both uptime and congestion.

      • The internet ain't what it used to be. The internet of today couldn't withstand a punk with a BB-gun, let alone a tacnuke.

    • by mcrbids (148650)

      Our contract with the data center where we host has significant penalties for downtime [qualitytech.com]. In about 6 years of hosting there, we've had exactly 2 incidents of less than 1 hour each.

      Of course, the deluge of notifications we get every time a fly causes a ballast to fail in the 3rd light down the main hallway, or when our network usage at 95% exceeds the monthly average by 0.05%, gets a bit annoying, but I have no complaints about the quality of service.
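
      For a sense of scale, the figures in this comment (2 incidents of under 1 hour each in about 6 years) can be turned into a lower bound on availability. This is a back-of-the-envelope sketch: treating the downtime as at most 2 hours and a year as 365.25 days are assumptions, and `availability_pct` is a hypothetical helper name.

```python
def availability_pct(downtime_hours: float, total_hours: float) -> float:
    """Percentage of the period during which the service was up."""
    return 100.0 * (1.0 - downtime_hours / total_hours)


# Assumptions: 6 years of 365.25 days each, and at most 2 hours of downtime
# (two incidents, each shorter than one hour).
total_hours = 6 * 365.25 * 24
lower_bound = availability_pct(2.0, total_hours)
print(f"availability of at least {lower_bound:.4f}%")
```

      Under those assumptions the lower bound works out to roughly 99.996%, which is better than "four nines".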

      • by gweihir (88907)

        Seems some people are getting it right. No surprise. One reason the "cloud" is cheap (well, it is not really if you look closely enough), is that it cuts corners that cannot be cut when reliable operation is needed.

  • by bagboy (630125) <.ten.citcra. .ta. .oen.> on Sunday August 25, 2013 @10:38PM (#44673317)
    public cloud services as "the future". I will never risk my corporate data uptime and reliability to some "location in the cloud". I'll stick to private clouds (VMware/vCenter) where I have control of both hardware and software and reliable failsafe systems. At least then if I have downtime I also have accountability and predictability. The same cannot be said for cloud providers, and no matter what anyone says, once the data leaves your hardware, you have lost that control.
  • by elfprince13 (1521333) on Sunday August 25, 2013 @11:25PM (#44673665) Homepage
    Shouldn't this, technically speaking, be a "bright day" or a "sunny day"? After all, that's what I call it when the cloud-coverage breaks around here.
  • For believing and investing in some handwavy concept called 'cloud' where you abrogate responsibility and take the iOS view (it Just Works) of technology.

  • by Required Snark (1702878) on Monday August 26, 2013 @01:21AM (#44674261)
    It's a thought experiment: pretend it was the FAA having a big chunk of airspace lose all ability to track aircraft, or NOAA losing data collection so that weather forecasts are disrupted. (This, or something like it, happens from time to time.)

    The right wing talking heads on TV would be squealing like stuck pigs. They would be screaming about "gubment" waste and incompetence, and start floating bills to privatize the FAA (or whomever). You'd get the same response on Slashdot as well.

    Meanwhile in real life AWS, Google, and NASDAQ have all had dramatic failures in recent weeks. Although NASDAQ got a fair amount of coverage, and Google got some mention, AWS has been pretty much below the radar for the mainstream media. No one is making dramatic statements on TV about how Google is run by a bunch of idiots, or NASDAQ, a quasi-governmental entity, should be nationalized, because when it fails the entire economy is at risk. As far as critical comments go, it's the sound of crickets.

    Clearly, there is a double standard. When there are problems with technology in the public sector, it's all hostility and table thumping. Similar failures in the private sector are treated like natural disasters completely beyond human control. According to common rhetoric, the private sector is always better than the public sector. Yet when the private sector fails, no one ever compares it to the well-functioning public sector.

    There is clearly a lot of hypocrisy in bashing the government. A lot of political power is at stake, and along with that goes a lot of money. This situation makes some people very happy, because they are getting what they want, both in public policy and private profit.

    • pretend it was the FAA having a big chunk of airspace lose all ability to track aircraft, or NOAA losing data collection so that weather forecasts are disrupted...The right wing talking heads on TV would be squealing like stuck pigs. They would be screaming about "gubment" waste and incompetence.

      Because it's their money being wasted.

      Meanwhile in real life AWS, Google, and NASDAQ have all had dramatic failures in recent weeks. Although NASDAQ got a fair amount of coverage, and Google got some mention, AWS has been pretty much below the radar for the mainstream media. No one is making dramatic statements on TV about how Google is run by a bunch of idiots, or NASDAQ, a quasi-governmental entity, should be nationalized, because when it fails the entire economy is at risk.

      The people who care (i.e. people who were hosting at US-East-1) know, and they have the opportunity to withdraw their custom from AWS. They can employ another provider, or bring it in-house and do it themselves.

      Clearly, there is a double standard.

      No, you are just comparing apples and oranges - people don't bitch about private companies because, worst comes to worst, they can take their custom elsewhere. Government needs to be held to a higher standard precisely because that freedom is lacking.

      • Because it's their money being wasted.

        To some extent. But it is frequently pennies of their money that is being wasted, if that much. They make it sound like they're the sole supporters of whatever government agency had an issue.

        No, you are just comparing apples and oranges - people don't bitch about private companies because, worst comes to worst, they can take their custom elsewhere. Government needs to be held to a higher standard precisely because that freedom is lacking.

        In quite a number of instances, you can't - or at least, there is no comparable product to select. Example: Internet access, health insurance, airport, airline, power, etc.

        And government isn't being held to a higher standard, government is assumed to be incompetent by definition. Not worse, not less efficient, but liter

  • entertainment and social media down? who gives a shit? grow up.

  • I have to say with all of the big names having problems recently this has been one of the best weeks ever for the lowly corporate sys admin. Now if the company's email, file or web server--or even the coffee machine goes down, they can point to the big names that also have problems. It's great to be able to say that even at companies like Amazon, Google or Microsoft with all of their talents their servers also have problems. It's the greatest excuse ever for tripping over the power cord. And if that doesn
