Bug IT

The Leap Second Is Here! Are Your Systems Ready? 284

Tmack writes "The last time we had a leap second, sysadmins were taken a bit by surprise when a random smattering of systems locked up (including Slashdot itself) due to a kernel bug causing a race condition specific to the way leap seconds are handled/notified by ntp. The vulnerable kernel versions (prior to 2.6.29) are still common amongst older versions of popular distributions (Debian Lenny, RHEL/CentOS 5) and embedded/black-box style appliances (Switches, load balancers, spam filters/email gateways, NAS devices, etc). Several vendors have released patches and bulletins about the possibility of a repeat of last time. Are you/your team/company ready? Are you upgraded, or are you going to bypass this by simply turning off NTP for the weekend?" Update: 07/01 03:14 GMT by S : ZeroPaid reports that this issue took down the Pirate Bay for a few hours.

  • Irony (Score:5, Funny)

    by bughunter ( 10093 ) <bughunter.earthlink@net> on Saturday June 30, 2012 @05:57PM (#40507561) Journal

    Leap years = no problem.

    Leap seconds = kernel panic.

    I fear for teh internets if we try a leap millisecond.

    • Re:Irony (Score:5, Interesting)

      by at10u8 ( 179705 ) on Saturday June 30, 2012 @06:03PM (#40507587)
      The time service bureaus used to insert leap milliseconds at almost any time. See the bottom plot at http://www.ucolick.org/~sla/leapsecs/amsci.html [ucolick.org] where there were 29 leaps in 3 years.
    • It was not the largest of timer overflows that killed them, but the tiniest leap second...

    • Jokes aside, the issue is that leap seconds are not deterministic.

      TAI, based on whatever flavor of atomic clock is currently the going standard, is markedly more regular than the observed time based on the movement of the earth (UT1). UTC ticks at the same rate as TAI, but is supposed to correspond to UT1, which ticks at an unpredictable rate. So, whenever UT1 and UTC drift too far apart (DUT1 approaches 0.9 seconds), UTC gets a leap second. The UTC tick rate is constant; but the leap-second days are 1 seco
    • Ziggy must be on the sauce again.

    • The feds of every country can't bring TPB down but a leap second can??? Now THAT'S irony
  • by Anonymous Coward on Saturday June 30, 2012 @06:13PM (#40507639)

    For those wondering whether they get one more second of sleep tonight or one less, the rule is 'spring forwards, fall back, summer stand there looking confused'.

  • by Joe_Dragon ( 2206452 ) on Saturday June 30, 2012 @06:16PM (#40507661)

    what about the metric time system?

  • by GoodNewsJimDotCom ( 2244874 ) on Saturday June 30, 2012 @06:25PM (#40507705)
    Hello, Some of us code our systems somewhat like a finite state machine, and we figure our machine will never operate outside it.

    If you're testing whether something that increments has hit a number (like 10) so it can go back to 0, check whether it is > 9 instead of checking whether it == 10.
    There are a lot of defensive coding mechanisms you can use. The downside of this is that when you debug, something can sneak by and put you outside of a state you want, so it makes it ever so slightly harder to debug. But if you're making software that will be used by the public that is hard to give updates, defensive programming can save the day here and there. ,Jim
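The comparison advice above can be sketched in a few lines. This is only an illustration, not anyone's production code: the function names, the limit of 10, and the "corrupted" starting value are all made up for the example.

```python
def advance_strict(counter: int, limit: int = 10) -> int:
    """Wrap to 0 only when the counter lands exactly on the limit."""
    counter += 1
    return 0 if counter == limit else counter


def advance_defensive(counter: int, limit: int = 10) -> int:
    """Wrap to 0 whenever the counter reaches or passes the limit (the '> 9' check)."""
    counter += 1
    return 0 if counter > limit - 1 else counter


# Hypothetical fault: something pushed the counter outside its expected 0..9 range.
corrupted = 11
print(advance_strict(corrupted))     # never equals 10 again, so it keeps climbing
print(advance_defensive(corrupted))  # falls back into the expected range
```

The trade-off is the one described above: the defensive version recovers quietly, which also means the fault that produced the bad value is easier to miss while debugging.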
  • by daff2k ( 689551 ) on Saturday June 30, 2012 @07:52PM (#40508067)

    Looks like Reddit's systems weren't ready for the leap second. It's been down since around midnight (UTC). You'd think a site as big as that would be ready for such an event.

  • Terminology? (Score:2, Interesting)

    by Anonymous Coward

    If 2012 is a leap year, doesn't that make 2012-06-30 23:59 a leap minute?

  • ...at least not on any of my servers, so what's a leap second between friends?

  • They've been "Down for Emergency Maintenance" for quite some time now.

    One would think they'd have seen this coming, because I'm pretty sure there's a /r/leapsecond subreddit that covers this.
  • by jasnw ( 1913892 ) on Saturday June 30, 2012 @09:32PM (#40508453)
    This is also an issue for software that works with GPS data and time. The GPS clocks do not "speeka da leapsecond" so the software needs to keep track of things. There was a 15 second offset, and now it's 16 seconds. This has happened often enough that most areas where this might have been a problem have been discovered, but as slashdotters know, there's new code written every second (even leap seconds), and it ain't all finest kind.
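A rough sketch of the bookkeeping described above, assuming the software keeps its own table of GPS-minus-UTC offsets. The table holds only the two values mentioned (15 seconds, then 16 seconds from 2012-07-01 UTC); the name `gps_to_utc` and the use of UTC-tagged `datetime` objects to carry GPS timestamps are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone

# (UTC instant from which the offset applies, GPS minus UTC in seconds).
# Only the two most recent entries are listed; a real table goes back to 1980.
GPS_UTC_OFFSETS = [
    (datetime(2009, 1, 1, tzinfo=timezone.utc), 15),
    (datetime(2012, 7, 1, tzinfo=timezone.utc), 16),
]


def gps_to_utc(gps: datetime) -> datetime:
    """Convert a GPS-timescale timestamp to UTC using the offset table.

    The GPS timestamp is tagged as UTC purely so the arithmetic works.
    datetime cannot represent 23:59:60, so the leap second itself maps
    onto the following 00:00:00.
    """
    for effective_from, offset in reversed(GPS_UTC_OFFSETS):
        candidate = gps - timedelta(seconds=offset)
        if candidate >= effective_from:
            return candidate
    raise ValueError("timestamp is earlier than the offset table covers")


before = datetime(2012, 6, 30, 12, 0, 15, tzinfo=timezone.utc)
after = datetime(2012, 7, 1, 0, 0, 20, tzinfo=timezone.utc)
print(gps_to_utc(before))  # 15-second offset applies
print(gps_to_utc(after))   # 16-second offset applies
```

Code that hard-codes the offset instead of looking it up is exactly the kind that goes wrong by one second after each leap second.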
  • by mwhahaha ( 172475 ) <mwhahaha AT vt DOT edu> on Saturday June 30, 2012 @09:43PM (#40508483)

    /etc/init.d/ntpd stop; date; date `date +"%m%d%H%M%C%y.%S"`; date;

    Fixed the issues I was having. Credit goes to https://twitter.com/SilvioSantoZ/status/219250677522767872 [twitter.com]. I didn't have to restart anything after running it. YMMV

    • That's a fix? Other than shutting down ntp, what does it do? date `date +"%m%d%H%M%C%y.%S"` is pretty much a no-op.
  • by Urban Garlic ( 447282 ) on Saturday June 30, 2012 @10:19PM (#40508593)

    So the comments are confusing to me as to whether Debian "squeeze" is supposed to have a problem or not, but I have about fifty of these systems running, and as far as I can tell, they're all fine.

    I got a whole bunch of these in the logs:
    > Jun 30 19:59:59 kernel: [timestamp] Clock: inserting leap second 23:59:60 UTC

    I have three of the machines configured as NTP peers to each other, and looking at a few tier-1 time servers. The rest of the machines all use the three local peers as time servers.

    My Debian desktop systems at home also seem to be fine.

  • My Exede satellite internet service was out from 8:00 EDT to 9:50 EDT... I have no way to verify it was caused by the leap second, but it seems a little coincidental.
  • Described here (w/dump): https://lkml.org/lkml/2009/1/2/373 [lkml.org]

    Simple version:
    "Don't kill the messenger", except when the messenger is going to kill you. It's the printk sending notice that the leap second happened that deadlocks against the timer doing the leap second (both vying for xtime_lock). Call it a "feature" of the NTP code. Hence the "turn off NTPD" workaround: if NTP doesn't get notified from somewhere upstream that it should implement the leap second, it won't notify the kernel about it, and the printk sho

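For anyone who wants the shape of that deadlock without reading the kernel dump, here is a loose user-space analogy. It is not the kernel code: `xtime_lock` here is an ordinary Python lock, the logger is hypothetical, and the assumption is simply that the logging path ends up needing the same non-reentrant lock the leap-second code already holds. A short timeout stands in for the hang so the example terminates.

```python
import threading

xtime_lock = threading.Lock()  # stand-in for the kernel lock; not reentrant


def read_clock(timeout: float = 0.1) -> bool:
    """Anything that needs the time must take the lock first."""
    acquired = xtime_lock.acquire(timeout=timeout)
    if acquired:
        xtime_lock.release()
    return acquired  # False means we would have hung forever


def log_message(text: str) -> bool:
    """Hypothetical logger whose path needs the clock (and so the lock)."""
    return read_clock()


def insert_leap_second_buggy() -> bool:
    with xtime_lock:
        # Logging while still holding the lock: the logger can never get it.
        return log_message("inserting leap second")


def insert_leap_second_fixed() -> bool:
    with xtime_lock:
        pass  # adjust the clock here
    # Log only after the lock has been released.
    return log_message("inserted leap second")


print(insert_leap_second_buggy())  # False: the timeout stood in for a deadlock
print(insert_leap_second_fixed())  # True: logging succeeds once the lock is free
```

The workaround of stopping ntpd fits the same picture: if the leap second is never announced, the code path that logs under the lock is never taken.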
  • by jampola ( 1994582 ) on Sunday July 01, 2012 @06:23AM (#40509793)
    ...And some are not. Note to self: Do not take a holiday during a leap second!

    I had 2 Debian Squeeze blade servers in Thailand kernel panic on me at 3am (AEST). What strikes me as odd is that out of the 6 blades we have Debian running on (all running squeeze and kernel 2.6.32 with identical packages), only 2 of them panicked. So much for the advisory saying it only affects kernel 2.6.29. There might be more to it than the kernel but sheesh, I'm on holiday!
