
 



Bug Cloud Microsoft IT

Azure Failure Was a Leap Year Glitch (247 comments)

judgecorp writes "Microsoft's Windows Azure cloud service was down much of yesterday, and the cause was a leap year bug, as the service failed to handle the 29th day of February. Faults propagated, making this a severe outage for many customers, including the UK Government's recently launched G-cloud service."
This discussion has been archived. No new comments can be posted.

  • by firex726 ( 1188453 ) on Thursday March 01, 2012 @11:41AM (#39209035)

    What is with MS and their apparent inability to cope with leap years?

  • by Anonymous Coward on Thursday March 01, 2012 @12:04PM (#39209447)

    ...they just had the most publicly catastrophic failure. I just noticed that all of the Google Chat messages I received yesterday were sent to me at various times on December 31, 1969.

    And it also seems that I didn't even receive any of them until today, March 1, implying that they were incapable of even sending them yesterday.
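A date of December 31, 1969 is the usual sign of a zero or missing Unix timestamp shown in a timezone west of UTC. The comment does not say what went wrong on the sending side, so the following is only a minimal sketch of that general effect; the fixed UTC-5 offset and the `render_timestamp` helper are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical fixed offset standing in for a reader in a UTC-5 timezone.
UTC_MINUS_5 = timezone(timedelta(hours=-5))

def render_timestamp(seconds_since_epoch, tz):
    """Format a Unix timestamp as a wall-clock string in the given timezone."""
    moment = datetime.fromtimestamp(seconds_since_epoch, tz)
    return moment.strftime("%B %d, %Y %H:%M")

# A zeroed timestamp is the epoch itself: midnight UTC on January 1, 1970.
print(render_timestamp(0, timezone.utc))  # January 01, 1970 00:00

# Shifted five hours west, the same instant falls on the previous day.
print(render_timestamp(0, UTC_MINUS_5))   # December 31, 1969 19:00
```

Messages showing "various times" on that date would be consistent with small or truncated timestamp values rather than exactly zero, but that is a guess.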

  • by UnknowingFool ( 672806 ) on Thursday March 01, 2012 @12:27PM (#39209897)
    No, it came from Freescale, in a driver that Toshiba used. Not many know that the original Zune was a Toshiba Gigabeat with a new UI and outer shell.
  • by Greyfox ( 87712 ) on Thursday March 01, 2012 @12:35PM (#39210027) Homepage Journal
    Dealing with time is hard, but it's been amusing to watch them experience problems solved by UNIX decades earlier. Daylight saving time was a constant problem for them in the early days, though they seem to have mostly got that ironed out. Every so often they seem to have a regression for a piece of new hardware. Maybe they'll eventually get it right.

    Funnily enough, I used to work at IBM doing OS/2 tech support. OS/2 and Windows NT share a common heritage, so a lot of the behind-the-scenes problems I witnessed in OS/2 were (and sometimes still are) problems with Windows. I'm not sure if this is one of them, but I got a call once from a guy who was trying to use his OS/2 system to track satellites. The problem was that the OS/2 timer API specified that you could set milliseconds, but it didn't seem to work. I tracked it down to a timing driver which tracked two separate interrupts. The first interrupt happened every few milliseconds and would update the clock millis when that happened. However, if the system was busy it was possible to not handle that interrupt. There was also a system periodic interrupt every 1 second. When that occurred, the system hard-reset the milli time and incremented the seconds. So you could set the millis, but the clock would become inaccurate 1 second later. Just one example of how time has been a thorn in my side for my entire career. I wrote an APAR up on it which was promptly closed "Working as Designed." Dunno if he ever got it fixed...

  • by UnknowingFool ( 672806 ) on Thursday March 01, 2012 @01:38PM (#39211033)
    According to the details I know, it had to do with certificate validation. So part of Azure is using some code that doesn't use standard Windows APIs. Not shocking is that MS does not conform to standards. Shocking is that they don't conform to their own standards.
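The thread does not show the code involved, so what follows is only an illustrative sketch of one common way a leap-day bug can surface in validity-date arithmetic: computing "one year later" by incrementing the year of a date that happens to be February 29. The function names are hypothetical and this is not presented as Azure's actual code.

```python
from datetime import date

def naive_one_year_later(start):
    """Naive approach: bump the year and keep month and day unchanged."""
    # For February 29 the target date does not exist, so this raises ValueError.
    return start.replace(year=start.year + 1)

def safe_one_year_later(start):
    """Fall back to February 28 when the target year has no February 29."""
    try:
        return start.replace(year=start.year + 1)
    except ValueError:
        return start.replace(year=start.year + 1, day=28)

issued = date(2012, 2, 29)

try:
    naive_one_year_later(issued)
except ValueError as err:
    print("naive computation failed:", err)

print(safe_one_year_later(issued))  # 2013-02-28
```

Python's `date.replace` rejects the invalid date outright. Hand-rolled date code that skips such validation can instead produce a nonexistent date that fails later, in a different component.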
  • by jc42 ( 318812 ) on Thursday March 01, 2012 @02:59PM (#39212359) Homepage Journal

    What is with MS and their apparent inability to cope with leap years?

    I would like to know the same thing. This seems to be systemic.

    Yeah; it's systemic. Or at least it used to be a few years back, and I wouldn't be surprised if they haven't fixed the basic problem yet. The problem is fairly simple: Windows' internal clock is in local time.

    I've found that this is all you need to tell a programmer with experience writing date/time code. Any software whose internal clock is in local time will be buggy, and it will never be completely fixed. Attempts to fix bugs will merely introduce bugs elsewhere in the chains of date/time handling. The sensible solution is to adopt a "universal time" internally, and convert at the last stage when you present the date/time to a human user. Yes, you theoretically can work with local time internally, but (teams of) humans can't actually make this work in practice. The best they can do is make it work in the "normal" cases. Bug fixes then tend to just move the time bugs around to different places in the code. But it can be very difficult to get management to accept this and agree to UT-only internally.

    Java also used to specify local time internally (and may still do so, but I haven't used it in years). I worked on a number of projects where, after repeated date/time disasters at every switch to/from DST and every Feb 29, Java was abandoned and everything was rewritten in a language (usually C++) whose libraries supported a UT timestamp and didn't have all those time bugs.

    Does anyone know if MS Windows has introduced a UT internal time yet? If not, then we can reliably predict that such bugs will continue to plague their users.
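The "universal time internally, convert at the last stage" approach described above can be sketched as follows. This is a minimal illustration and not any particular system's design: the fixed offsets `EDT` and `EST` and the helpers `wall_clock` and `display` are assumptions chosen to model the US Eastern autumn clock change of November 4, 2012.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical fixed offsets for US Eastern around the 2012 autumn DST change.
EDT = timezone(timedelta(hours=-4), "EDT")
EST = timezone(timedelta(hours=-5), "EST")

# Two events stored internally as UTC, exactly one hour apart.
first_utc = datetime(2012, 11, 4, 5, 30, tzinfo=timezone.utc)
second_utc = datetime(2012, 11, 4, 6, 30, tzinfo=timezone.utc)

def wall_clock(moment_utc, tz):
    """Convert to local time and drop the offset, as a local-time-only store would."""
    return moment_utc.astimezone(tz).replace(tzinfo=None)

first_local = wall_clock(first_utc, EDT)    # 01:30, before the clocks fall back
second_local = wall_clock(second_utc, EST)  # 01:30 again, after the fall back

print(second_utc - first_utc)      # 1:00:00
print(second_local - first_local)  # 0:00:00

def display(moment_utc, tz):
    """Convert to local time only at the point of presentation."""
    return moment_utc.astimezone(tz).strftime("%Y-%m-%d %H:%M %Z")

print(display(first_utc, EDT))   # 2012-11-04 01:30 EDT
print(display(second_utc, EST))  # 2012-11-04 01:30 EST
```

The two local wall-clock values are indistinguishable once the offset is dropped, so the elapsed time computed from them is zero. The UTC values keep the one-hour gap, and the local rendering is applied only when the time is shown to a person.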

  • In all fairness, Microsoft never figured anyone would still be using this service by the time a leap year rolled around.

    Ah, that explains why Zunes went dark on New-Years 2009... [pcworld.com]

    Think about this. You're a software dev, and you use a MS C++ compiler. They wrote their standard libs, including the "time.h" / <ctime> code... you use their time libraries.
    Now two things:
    0. MS employs some real nut-jobs that can't even use the standard time functions and instead write their own for each project...
    or
    1. MS doesn't even trust their own compiler / libraries to do the right thing?

    It scares me to think that MS makes operating systems... IMHO, they should get back to BASICs.
