Slashdot
Data Storm Caused Nuclear Plant To Shut Down
Posted by
kdawson
on Sat May 19, 2007 05:00 PM
from the how-not-to-ignore-bad-input dept.
rs232 writes to let us know that the US House of Representatives Committee on Homeland Security called this week for the Nuclear Regulatory Commission to further investigate the cause of excessive network traffic that shut down an Alabama nuclear plant. Investigators want to know whether the data storm could have been initiated from outside the plant.

Re: The reason? (Score:5, Funny)
Storm in the tubes (Score:5, Interesting)
1) They can't describe what happened
2) They can't tell whether outside interference, whatever its nature, occurred
3) That this might have an internal/design cause
Re:Storm in the tubes (Score:5, Insightful)
Re:Storm in the tubes (Score:5, Informative)
I used to work as an embedded developer, and we used that term.
It was used in embedded communications when one or several devices went bonkers and flooded the common bus.
A bit like a packet storm, but without IP or any other packet protocol, so it was called a data storm.
It stands to reason that in a nuclear plant there are a lot of old fogeys, so company jargon might be a bit outdated and odd-sounding to an outsider.
Shut down? (Score:5, Insightful)
Do investigators also want to know how a "data storm" could have caused a nuclear plant to shut down?
nothing to see, move along. (Score:5, Insightful)
Some choice quotes, emphasis added:
An investigation into the failure found that the controllers for the pumps locked up following a spike in data traffic -- referred to as a "data storm" in the NRC notice -- on the power plant's internal control system network. The deluge of data was apparently caused by a separate malfunctioning control device, known as a programmable logic controller (PLC).
"Conversations between the Homeland Security Committee staff and the NRC representatives suggest that it is possible that this incident could have come from outside the plant," Committee Chairman Bennie G. Thompson (D-Miss.) and Subcommittee Chairman James R. Langevin (D-RI) stated in the letter. "Unless and until the cause of the excessive network load can be explained, there is no way for either the licensee (power company) or the NRC to know that this was not an external distributed denial-of-service attack."
Wow. Just...wow. As if you needed more proof that this wasn't a hacking attempt:
"The integrated control system (ICS) network is not connected to the network outside the plant, but it is connected to a very large number of controllers and devices in the plant," Johnson said. "You can end up with a lot of information, and it appears to be more than it could handle."
Seriously, how stupid do you have to be to think "OMG, Haxxors?" Answer: work at Homeland inSecurity, or be a Congresscritter. They already figured it out. It was a controller for a specific piece of equipment that flooded the network and triggered a bug in the variable-frequency-drive controllers for pumps.
You missed one.... (Score:4, Interesting)
Sounds to me that the vendors under-engineered their network and still charged mega-bucks for it. The auditors, I'm sure, are making the most out of this to justify their fee.
Nothing to see, move along - I'll say!
Re:nothing to see, move along. (Score:5, Informative)
However, I will fully put the blame on the PLCs. Those little suckers come in handy but if you don't completely understand every line of code and every instruction they can f_ck you over.
I also love how they say "well if you can't prove it wasn't, then it must have been".
Re:nothing to see, move along. (Score:5, Informative)
It's not stupid. (Score:5, Insightful)
Seriously, how stupid do you have to be to think "OMG, Haxxors?" Answer: work at Homeland inSecurity, or be a Congresscritter. They already figured it out. It was a controller for a specific piece of equipment that flooded the network and triggered a bug in the variable-frequency-drive controllers for pumps.
As someone who used to work in systems engineering for a sister BWR, I think the inspection is a good idea. Oh, there's dumb and there's nuclear dumb but this is not a case of either. Nuclear dumb involves putting machine gun nests inside the plant. Finding the root cause of the accident is a good idea.
Handwaving about a PLC device won't do. What ultimately caused the PLC malfunction needs to be answered at a component level. There's going to be something wrong with it and that should be reported and every other device like it needs to be ripped out and trashed. If there is no component failure, there's a software problem which also must be understood.
Yes, it could have been hackers. The "internal control network" might at some point hit a desk that's connected to the wider world. It could be something mundane and unintentional, like an operator's virused-up laptop.
An outage like that is something that's going to have both NRC and corporate ass-chewers looking at everything. Corporate might want to paint a nice picture for the NRC, but the poor devil that lies to them goes to jail. In either case, the problem will be identified and eliminated.
You might also have noted in the article that this is not the first plant to go thumbs down over some winblows-borne virus. In 2003, the Slammer worm caused havoc at an offline Ohio plant [securityfocus.com]. Yes, that was hackers. They did not mean to do it, but the plant's systems were open to it and failed. That's not acceptable from any standpoint.
Despite the better advice of the computer people at the plants, Entergy is a big M$ Partner. They take the big dogs out fishing and sell them the works. Ten years ago, M$ had something worth while and interesting. It was used in places it should not have been. Worse, the flaws from ten years ago have not been addressed or fixed. A good clean up is in order.
Standards! (Score:5, Insightful)
You'd hope that in something as critical as a nuclear power plant the answer would be, very quickly, "no, it didn't come from an external source because that's impossible". Followed by detailed analysis of the logs to determine which internal system screwed up.
That said, the article is a bit sparse on actual technical details, so my derision may be unwarranted.
Re:Standards! (Score:5, Insightful)
Actually, power plants have to have a connection to the outside world. Why? Load-balancing for the power grid. If another plant goes down somewhere, this plant needs to know about it so that it can adjust output to compensate. For that, all the plants need to be hooked to a communications grid, which could conceivably be hacked (even though -- I would hope -- it's not connected to the Internet).
Re:Standards! (Score:5, Interesting)
Given that, any hacking would have to include a social engineering element designed to fool the operators into making the wrong decisions. If we include that stipulation, yes, it's quite conceivable. If we postulate someone bridging the air gap, maybe by something as simple as hooking a laptop that also contains a wireless card into the control network, then a non-social engineering attack becomes conceivable, but not really otherwise.
DOE and NRC doctrine is that adjusting reactor output based solely on a trigger event outside the core instrumentation is supposed to always require a high-level human decision. Supervisors are also at least supposed to be trained to the point where they can make these decisions without adding any more response time than a conventional (i.e. hydroelectric or coal-based) plant would need for its human-level decision events. (Yes, they have them. For example, the four TVA dams that supply Alcoa aluminum face a whole series of individual and joint human-level decisions every time Alcoa's main furnace system glitches, and these have to include how long Alcoa expects them to need to dump power elsewhere and, for each of them, what options the other three dams are considering.)
The DOE does not legally presume that reactors are even as responsible for balancing the grid as conventional plants, but given how much older a lot of the conventional plants are, it's pretty easy to do much, much better than is strictly required, and it should be noted that, in the last New York blackout all the cascade effects and switching failures happened in 1940's era or earlier fossil fuel plants, and the worst points were 1930's or even 1920's era designs. Still, the rules are that if the conventional plants are failing at load balancing, even if the grid is experiencing severe cascade failures, the nuclear sites will let the whole thing crash rather than take the risks of trying to stabilize the grid by actually modulating their reactions.
Re:Standards! (Score:5, Informative)
After some R&D and building some prototypes of promising new designs I'd be right with you - but our current best bets are things out of South Africa (pebble bed) and India (accelerated thorium) done on very small budgets with very small teams, and they need more work. The mainstream is just chasing taxpayer-supplied pork. If they were after more than a handout they would be putting in some effort - instead they spend orders of magnitude more on PR, advertising and outright bribes than on R&D.
As for costs - you can't just conveniently ignore capital costs. If you could, hydro, wind, solar etc. would win every time, even in those places where it would be a stupid idea or where the capital costs are far too large for the return. Nuclear power is a possibility in those places that have the infrastructure of a weapons program, but everywhere else you would have to build up an entire industry from scratch. Iran is the best current example of where that is taking place, and it has cost them a fortune to do so - hence few people think it is for purely civilian purposes there. In South Africa it was possible to take people from the weapons program to develop pebble bed. It is also far too big an investment for private enterprise - hence no new plants got built while governments had cold feet on the issue, and the "new generation" designs from companies like Westinghouse are just tweaked 1950s designs painted green.
Political FUD (Score:4, Interesting)
What network technology were they using? (Score:5, Interesting)
Re:What network technology were they using? (Score:4, Insightful)
Ethernet isn't perfect but it's the only realistic option. Managed properly, it can be very reliable. The biggest problem I see from this article is that there is a lack of regulation and testing of the equipment that goes into these plants. These poor TCP/IP stacks should never have gotten past the testing phase when it comes to a nuclear power plant.
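To illustrate the kind of input handling a device would need in order to pass such testing, here is a minimal Python sketch of a receiver that silently drops anything it does not expect instead of locking up. The frame layout (1-byte type, 2-byte length, payload, CRC-32), the size limit, and the names build_frame and parse_frame are all hypothetical; nothing here is taken from a real PLC or drive controller.

```python
import struct
import zlib
from typing import Optional, Tuple

# Hypothetical wire format: 1-byte message type, 2-byte payload length,
# payload, then a 4-byte CRC-32 over everything before it.
HEADER = struct.Struct(">BH")
CRC = struct.Struct(">I")
MAX_PAYLOAD = 256            # assumed upper bound for this device
KNOWN_TYPES = {0x01, 0x02}   # assumed set of valid message types


def build_frame(msg_type: int, payload: bytes) -> bytes:
    """Build a well-formed frame (used here only to exercise the parser)."""
    body = HEADER.pack(msg_type, len(payload)) + payload
    return body + CRC.pack(zlib.crc32(body))


def parse_frame(frame: bytes) -> Optional[Tuple[int, bytes]]:
    """Return (msg_type, payload), or None if the frame is malformed."""
    if len(frame) < HEADER.size + CRC.size:
        return None                      # too short to hold header and CRC
    msg_type, length = HEADER.unpack_from(frame)
    if msg_type not in KNOWN_TYPES or length > MAX_PAYLOAD:
        return None                      # unknown type or oversized payload
    if len(frame) != HEADER.size + length + CRC.size:
        return None                      # declared length does not match
    body = frame[:-CRC.size]
    (crc,) = CRC.unpack(frame[-CRC.size:])
    if zlib.crc32(body) != crc:
        return None                      # corrupted in transit
    return msg_type, body[HEADER.size:]
```

Every rejection path returns None rather than raising, so a caller can count and discard bad frames without any of them reaching the control logic.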
Re:What network technology were they using? (Score:4, Interesting)
Even stupider (Score:5, Insightful)
"What is happening in this marketplace is that vendors will build their own (network) stacks to make it cheaper," Peterson said. "And it works, but when (the device) gets anything that it didn't expect, it will gag." So you mean to tell me there is pretty much no enforcement making manufacturers maintain compliance on their products, even if those products are going into a nuclear *ANYTHING*, which in the worst-case scenario could cause a catastrophe? Yet we have regulatory commissions on the flow of ketchup, and regulatory commissions/directions/etc. on weight loss products, lipsticks, etc. (FDA), but this place is not concerned with nuclear plants. Sinful.
Brown's Ferry *AGAIN!?!??!* (Score:4, Informative)
At least their reactor failed to "off" this time...
Schwab
The way I think the conversation went (Score:5, Funny)
"It seems the problem was with the NC9828A chip"
"Oh? And what was the problem?"
"It melted, basically. It went bonkers."
"Ah, and then what happened?"
"Err... it caused the shutdown."
"But how?"
"Well, I presume the AH-982's got deluged with data, so they shut off."
"Ah, so it was some sort of data thing."
"Kind of, the failing chip would start sending data in the network t--"
"Hey, it's like a storm of data! Hah! I get it!"
"Umm, basically."
"Oh man. A data storm! I better tell the NRC"
"Ok, sure."
Later...
"Sir, I have the cause of the shutdown, it was caused by what the tech guys here would call a data storm."
"A data storm? Wow. So your reactors got a bunch of bad datas, right?"
"Errr.. kind of, the microchips melted."
"Data can do that?"
"Yeah, it's like a storm on our, uh, logic networks. I guess that can melt the microchips"
"Uh oh. Maybe this storm came from outside the plant! One of those hacker attacks!"
"Hmmmm, the guy said it melted, but I suppo--"
"Oh crap I better inform Homeland Security!"
"Ok, sure."
Later still...
"Yeah, we had a data storm and it melted the reactor networks."
"How did this data storm happen?"
"I don't think they know yet, but it messed up big time."
"My God. Do you realize this could be Al Qaeda?!!"
"Could realize wha--"
"Al Qaeda! Terrorists. Internets terrorists."
"I don't know if the reactors are hooked up to the Interne--"
"Listen. Keep this quiet, but make sure you tell everyone you know. These reactors are not safe! No one is safe from the terror!"
"Well, it was a data storm. Can terrorists make data storms?"
"Yes. They caused your meltdown."
"No, no, the microchips melted down because of the storm. A meltdow--"
"In the terror business, there's more than one type of meltdown, you just let us handle this."
"Ok, sure."
Re:Redesign the entire infrastructure (Score:5, Insightful)
i think the fact that an unforeseen erroneous condition caused the plant to *shutdown* and not *meltdown* is a pretty good indication that it was designed quite well.
There will always be unforeseen situations. The key is for the system to shut down in an orderly fashion. In programming, this is accomplished through the use of error traps.
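As a rough illustration of what an error trap looks like, here is a minimal Python sketch. The names SensorFault, read_pump_speed and control_loop, and the 0 to 100 plausibility range, are made up for the example; real plant control software is not written this way and the shutdown step here is only a log entry.

```python
class SensorFault(Exception):
    """Raised by the hypothetical read step when input cannot be trusted."""


def read_pump_speed(raw):
    """Validate one reading; the 0-100 range is an assumed plausibility bound."""
    is_number = isinstance(raw, (int, float)) and not isinstance(raw, bool)
    if not is_number or not 0 <= raw <= 100:
        raise SensorFault(f"implausible reading: {raw!r}")
    return float(raw)


def control_loop(readings):
    """Process readings until one fails, then stop in a known safe state."""
    log = []
    try:
        for raw in readings:
            speed = read_pump_speed(raw)
            log.append(f"ok {speed}")
    except Exception as fault:
        # The trap: any error, foreseen or not, ends normal operation and
        # hands control to the orderly shutdown path instead of crashing.
        log.append(f"trap: {fault}")
        log.append("shutdown: pumps to safe state")
    return log
```

Calling control_loop([10, 20, "garbage", 30]) handles the first two readings, traps the third, and never touches the fourth, which is the "shut down rather than carry on with bad data" behaviour described above.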
Now, the hysteria surrounding terrorism is another thing the plant engineers have to worry about.
i just wonder if and when we get to put this hysteria behind us, and get along with our lives. unfortunately, terry gilliam's brazil is on a constant loop in my mind these days. . . .
mr c
Re:Redesign the entire infrastructure (Score:4, Interesting)
This might sound unreasonable but I would never expect a power plant (which has a lot of things depending on it) to shut down unless there was a major failure of a component or some other safety risk. Network traffic on its own, or its effects, should never be the cause. In a nuclear power plant you control ALL the nodes attached to the network; the nodes attached should not be in a position where they can saturate any individual node to the point of failure, especially if that failure causes a shutdown of something as critical as a power station.
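One common way to keep a single noisy sender from saturating a receiver is a per-sender rate limit. The sketch below is a generic token bucket in Python; the class name TokenBucket, the rate and burst values, and the idea of applying it per node are assumptions for illustration, not a description of how the plant's network works.

```python
class TokenBucket:
    """Allow at most `burst` messages at once and `rate` per second on average."""

    def __init__(self, rate: float, burst: float):
        self.rate = rate        # tokens added per second
        self.burst = burst      # maximum tokens that can accumulate
        self.tokens = burst     # start with a full bucket
        self.last = 0.0         # timestamp of the previous call, in seconds

    def allow(self, now: float) -> bool:
        """Return True if a message arriving at time `now` may be processed."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False            # over budget: drop instead of processing
```

With TokenBucket(rate=10.0, burst=5.0), a flood of 100 messages arriving at the same instant gets 5 through and the rest dropped, and the sender is accepted again once time has passed. Timestamps are passed in explicitly so the behaviour is easy to check.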
I can think of times where I have seen massive network spikes, usually caused by issues with routing on fairly non-trivial networks, or loops where mistakes have been made and policies have not been followed (lack of sleep or lack of patience), but then comparing an advertising company's internal network at 3am, or a paper factory's network at midnight, to a nuclear power station is taking it a little far.
That would be fair if we were talking about a software failure after some sort of unforeseen environmental issue; it would even be OK if an auto plant stopped production because of an unforeseen fault. And whilst power plants should certainly fail safe, they should be robust enough that a situation where failure is the only option is extremely difficult to achieve. Whatever happened to redundancy?
I would suggest that this is hype to 1) keep terrorism at the top of everyone's agenda, and make people feel unsafe, after all that sells papers and grabs viewers (which in turn sell advertising) 2) deflect some of the negativity that this incident would produce (I wish that I could blame terrorists for my mistakes sometimes... "no that project plan... I haven't got it, but I'm checking to see if my poor time management is caused by terrorism or simply my inability to organise my resources properly") and 3) attract additional funding, as security risks presumably do; surely it would be nice to get an extra few million in the next budget.
Honestly, this probably shows a component failure and some poor design, understandable, but unacceptable in this area. If, and I say if with some considerable doubt, this turns out to be, or is reported as, an external event, then whoever enabled external network access to what appear to be critical systems within a nuclear power plant on the US mainland needs to be identified and punished, together with the contractors who built or maintained it, the managers or consultants that assessed and managed it and the politicians who have responsibility for public safety. But as I said, it will probably turn out to be a simple component failure and some poor design.
Re:No kidding (Score:5, Funny)
Sometimes such connections are sooo slow, it makes users cry. They don't call it onion routing for nothing, eh?
Re:The last thing we need (Score:4, Funny)