Richard Feynman, the Challenger, and Engineering

Posted by CmdrTaco on Wednesday February 20, @11:28AM
from the this-is-not-warm-fuzzies-on-a-cold-morning dept.
An anonymous reader writes "When Richard Feynman investigated the Challenger disaster as a member of the Rogers Commission, he issued a scathing report containing brilliant, insightful commentary on the nature of engineering. This short essay relates Feynman's commentary to modern software development."

  • External Pressures Ruin Engineering (Score:5, Insightful)

    by eldavojohn (898314) * <my/.username@@@gmail.com> on Wednesday February 20, @11:30AM (#22489314) Homepage Journal
    I'm a software developer. I would like to think of myself as an engineer but to me that's a higher title that belongs to people who actually engineer original ideas.

    The problem with the shuttle disaster (both of them, really) is external pressures that are not in any way scientific. The pressure from your manager at Morton Thiokol to perform better, faster and cheaper. The pressure from the government to beat those damned ruskies into space at all costs.

    So this is really a case of engineering ethics: when do you push back? As a software developer, I never push back. Me: "There's a bug that happens once every 1,000 uses of this web survey but it would take me a week to pin it down and fix it." My Boss: "Screw it--the user will blame that on the intarweb, just keep moving forward." But could I in good conscience say the same thing about a shuttle with people's lives at stake? No, I could not.

    So when an engineer at Morton Thiokol said that they hadn't tested the O-ring at the temperature they faced that fateful day, and that information was either not relayed or was lost on its way up to the people at NASA who were about to launch, it wasn't a failure of engineering, it was a failure of ethics. External forces had mutated engineering into a liability, not an asset.

    And there's a whole slew of them [wikipedia.org] I studied in college:

    * Space Shuttle Columbia disaster (2003)
    * Space Shuttle Challenger disaster (1986)
    * Chernobyl disaster (1986)
    * Bhopal disaster (1984)
    * Kansas City Hyatt Regency walkway collapse (1981)
    * Love Canal (1980), Lois Gibbs
    * Three Mile Island accident (1979)
    * Citigroup Center (1978), William LeMessurier
    * Ford Pinto safety problems (1970s)
    * Minamata disease (1908-1973)
    * Chevrolet Corvair safety problems (1960s), Ralph Nader, and Unsafe at Any Speed
    * Boston molasses disaster (1919)
    * Quebec Bridge collapse (1907), Theodore Cooper
    * Johnstown Flood (1889), South Fork Fishing and Hunting Club
    * Tay Bridge Disaster (1879), Thomas Bouch, William Henry Barlow, and William Yolland
    * Ashtabula River Railroad Disaster (1876), Amasa Stone
    So I agree with Feynman's comments in relation to engineering and the further comments on software development. But I don't find them to be a fault in the nature of engineering, just a fault in our ethics. What does capitalism and competitiveness drive us to do? Cut corners, often.
    • by DBCubix (1027232) on Wednesday February 20, @11:44AM (#22489542)
      The Kansas City Hyatt Regency walkway collapse was an engineering problem. The contractor asked to take a shortcut (instead of threading a nut up a three story threaded rod, they asked to cut the rod and offset it several inches) and the engineers rubber-stamped it without checking what the ramifications would be. The engineering part was not originally flawed, but it was when they approved the change order.
      • by Sanat (702) on Wednesday February 20, @12:49PM (#22490620)
        I stayed at this Hyatt over several different weekends while there was dancing and music on the ground floor. What would happen is that several individuals would get the walkways to start swaying and then reinforce the sway by shifting their bodies at the right instant, causing additional sway from the positive feedback. It was not unusual to experience 3 to 4 inches of sway.

        Although this swaying is not normally mentioned in the articles about the construction of the Hyatt, it went a long way towards weakening and stressing the connectors supporting the floors.

        Two of my friends were dancing on the floor when the walkways gave way and both were killed.

         
      • by russotto (537200) on Wednesday February 20, @01:45PM (#22491552)

        > The Kansas City Hyatt Regency walkway collapse was an engineering problem. The contractor asked to take a shortcut (instead of threading a nut up a three story threaded rod, they asked to cut the rod and offset it several inches) and the engineers rubber-stamped it without checking what the ramifications would be. The engineering part was not originally flawed, but it was when they approved the change order.
        Right, except that the original design wouldn't have worked, as the integrity of the threads could not have been maintained during construction and thus the nut could not have been put on. So in software terms it was a last-minute patch to fix a show-stopper, which wasn't adequately unit-tested.
    • by Vicious Penguin (168888) on Wednesday February 20, @11:48AM (#22489604)
      > What does capitalism and competitiveness drive us to do? Cut corners, often.

      Maybe, but remember what your own example shows -> What is the cost/benefit of fixing/preventing an error? Is a week of debug time worth missing your target ship date? Maybe, maybe not - depends on the error.

      A blanket indictment of capitalism is quite unfair. You would still have the same cost/benefit analysis regardless of the economic system you toiled under.

      It is not possible to engineer against all eventualities; trying to do so will usually keep you from ever getting off the ground.
      • by Protonk (599901) on Wednesday February 20, @12:01PM (#22489800)
        This is true to an extent, but safety concerns can and should be engineered for. You are absolutely right that there is no direct parallel between debugging software for some non-critical application and meeting safety margins for a critical product. However, some software IS critical: flight software (the portion of Feynman's essay about NASA's flight software is amazing), software for hospital applications (pharmacy, PCAs, microsurgery), ABS/suspension control software. Those are applications with VERY critical outcomes. Safety concerns need to be built into the process.

        But I do agree that tradeoffs occur under any system. Those tradeoffs just let us make better decisions under capitalism whereas we can't allow the information from those tradeoffs to inform us economically in a socialist system.

          • by Protonk (599901) on Wednesday February 20, @04:59PM (#22494504)
            It's not a random assertion at all. It's a foundation of economics. The world is full of information particular to place and time; in other words, the nitty-gritty. If you were to make a statistical model of part of the world, that stuff would get buried in the "other" term. Unfortunately, where there is a lot of "other" it becomes hard to model. Take, for instance, who to give cars to. Should I run a survey and have the outcome determine who gets the car? Should I give the car to someone who needs it the most or will use it the most effectively? How do I judge that? How do I stop people from lying to me? I could, alternatively, just sell the car to someone for an agreed-upon price. That means I learn at least how much it is worth to them (it may be worth more) and the car goes somewhere. Prices transmit information and preferences better than any five-year plan or government study. Sometimes markets have failures and those need to be dealt with, but that is not what I am talking about.
    • Re: (Score:3, Interesting)

      There are other disasters that don't stem from the profit motive:

      Loss of the USS Thresher during initial sea trials.

      Steam Line Rupture on the USS Iwo Jima.

      Both of those were caused by engineering (the first) or procurement faults.

      The Thresher was lost with
    • Re: (Score:3, Informative)

      There is a point you miss there I think. It is the top-to-bottom design philosophy vs the bottom-to-top. The first one gives objectives first then designs every part so that it fulfills the general objective. The latter focuses on designing simples element
    • Re: (Score:3, Insightful)

      > I'm a software developer. I would like to think of myself as an engineer but to me that's a higher title that belongs to people who actually engineer original ideas.
      Well I know I'm missing the point of your post with this, but a quick google comes up with this description of an engineer:

      a person who uses scientific knowledge to solve practical problems
      I think your higher title should be an 'inventor'. Engineers are the guys that generally plod away using well tested mechanical or
      • by esocid (946821) on Wednesday February 20, @12:09PM (#22489946)
        Apparently you've never taken engineering ethics. It was the first class I had to take as a general engineering major. Needless to say, I changed majors but still got a hell of a lot out of that ethics class. The parent was right. These were all cases of cutting corners, either in terms of cost or time. Managers wanted it done quickly and cheaply, whether that meant mixing concrete improperly, or buying sub-par materials, or just ignoring what the engineers were telling them. It always came down to about 95% managerial and the rest engineering error.
  • wow (Score:5, Funny)

    by loconet (415875) on Wednesday February 20, @11:36AM (#22489394) Homepage
    For a second there I thought I read "Rogers Communications" and "brilliant" and "engineering" in the same sentence. I thought I had been kicked to an alternate universe where I wouldn't be able to escape. I am glad to be back.
  • A future essay... (Score:3, Funny)

    by StarfishOne (756076) on Wednesday February 20, @11:39AM (#22489454)
    A future essay relates Feynman's commentary to modern web hosting, load balancing and the so-called Slashdot effect.
  • Mirror (Score:3, Informative)

    by fishdan (569872) on Wednesday February 20, @11:40AM (#22489480) Homepage Journal
    http://duartes.org.nyud.net/gustavo/blog/post/2008/02/20/Richard-Feynman-Challenger-Disaster-Software-Engineering.aspx [nyud.net] As a side note, could someone make a Greasemonkey script to make all links from /. run through Coral? It just makes sense.
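    The rewrite such a script would need is small. Here is a minimal sketch of just the URL rule, in Python rather than as a real userscript, assuming (from the mirror link above) that appending ".nyud.net" to the hostname is all Coral needs. The name coralize is made up for illustration, and links with explicit ports or credentials are deliberately left alone rather than guessed at.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: the suffix is taken from the mirror link in this comment.
CORAL_SUFFIX = ".nyud.net"


def coralize(url):
    """Return url with the Coral suffix appended to its hostname.

    Hypothetical helper: anything that is not a plain http://host/... link
    (relative links, https, explicit ports, user info) is returned unchanged.
    """
    parts = urlsplit(url)
    host = parts.hostname  # lowercased by urlsplit; None for relative links
    if parts.scheme != "http" or not host:
        return url
    # netloc differs from the bare host when a port or credentials are present.
    if parts.netloc.lower() != host:
        return url
    # Do not add the suffix twice.
    if host.endswith(CORAL_SUFFIX):
        return url
    return urlunsplit(parts._replace(netloc=host + CORAL_SUFFIX))


if __name__ == "__main__":
    print(coralize("http://duartes.org/gustavo/blog/"))
```

    A userscript would apply the same rule to every link on the page; the page-walking part is left out here.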
  • Faster, Better, Cheaper (Score:5, Insightful)

    by Protonk (599901) on Wednesday February 20, @11:44AM (#22489538)
    To be fair, the Challenger disaster actually preceded NASA's slogan and procurement policy of "faster, better, cheaper" by a bit. More to the point, Feynman's article should be a cautionary tale to ANYONE in an engineering field. It isn't a matter of one field being subject to unscientific pressures and another field being immune. No technology or industry is immune from the pressures and problems that caused the Challenger disaster. Anyone who claims to have safety so well in hand that they need not spend lots of time and effort on it is foolish. The nuclear industry still has to practice strong QC on parts, procedures and maintenance and CONTINUE that practice. Same with commercial aviation, acute medical care, etc. Constant vigilance is rewarded only with another uneventful day. That is the fundamental problem. Vigilance is expensive and time-consuming. These are not pressures from the profit motive. They apply to government as well as civilian ventures.
  • Tag on to a famous essay... (Score:5, Interesting)

    by sphealey (2855) on Wednesday February 20, @11:44AM (#22489540)
    (I will refrain from a four-step Profit post). Standard technique: latch on to an essay by a brilliant and insightful person. Extend the insights of that person slightly into a different field with usual compare-and-contrast, brand-extension writing techniques. Claim that resulting essay (and self) are as insightful as the original essayist.

    It doesn't work 99.994% of the time, generally because very few people are as insightful as the original brilliant person.

    sPh
    • Re: (Score:3, Interesting)

      Good point. I would suggest reading up on Dr Feynman as a precursor. Or, for those who prefer the flickering screen, there are several video interviews with the great man. One from Horizon called "The Pleasure of Finding Things Out" is VERY watchable. Also his
  • Hm. (Score:5, Insightful)

    by gardyloo (512791) on Wednesday February 20, @11:51AM (#22489630)
    The blog post makes a nice contribution by linking to Feynman's original thoughts (for example, here: http://www.ranum.com/security/computer_security/editorials/dumb/feynman.html [ranum.com] ), ones I haven't read for a long time (and was happy to be reminded of). However, the author makes the mistake of thinking that the original thoughts need to be interpreted and summarized for the reader. Feynman's words by themselves are simple to understand, are concise, and contain just the tone for which geeks go gaga. Anyone interested in the subject will be able to make his or her own judgements about the engineering and politics involved in the Shuttle development, engineering in general, and the extensions to software development.
    • Re: (Score:3, Insightful)

      This is a very good point. Feynman has a rare combination of startling intelligence, curiosity, and straightforwardness. Some authors need to be summarized. Feynman just needs to be trotted out every generation or so.
      • Re: (Score:3, Informative)

        I agree, and tried not to summarize at all. Mostly I just tried to link what Feynman said to software, rather than make a fool of myself paraphrasing him. That's also why the entry is really short, and basically tells people to go read the source :) cheer
  • Surely You're Joking (Score:3, Interesting)

    by Yoweigh116 (185130) <yoweigh AT gmail DOT com> on Wednesday February 20, @12:00PM (#22489782) Homepage Journal
    Offtopic, but I highly recommend Surely You're Joking, Mr. Feynman [amazon.com], the autobiographical collection of stories he told to Ralph Leighton. It's got some great stories in it, like when he surreptitiously went around picking locks at Los Alamos or his personal recollections of the Trinity nuclear test.
  • Chartered engineer status (Score:5, Insightful)

    by Martin Spamer (244245) on Wednesday February 20, @12:18PM (#22490080) Homepage Journal
    The biggest problem is that most software developers are NOT chartered professional software engineers, so they have no personal, professional or legal responsibility for their work. That is why IT is full of cowboys and trust is nearly nonexistent. Software engineering must become a chartered-only profession, so that people who are not chartered are not allowed to practice.

    To qualify as a Professional Engineer we should place good practice above short term gains. Professional Engineers should be truthful and objective and have no tolerance for deception or corruption. Professional Engineers only work in areas where they are competent. Professional Engineers build their reputation on merit and their skills through continual learning, and the skills of their charges through ongoing mentoring.

    We wouldn't have to put up with the shoddy work of cowboys, because they wouldn't be allowed to practice. We wouldn't have to put up with orders that counteract professional ethics or good practice, because legal responsibility trumps commercial pressures. The professional wouldn't be undermined by fast to market but poor quality work. We could place trust in third party tools, software & services and we would not have to put up with EULAs that disavow responsibility for damage.
    • Your heart's in the right place, but it would not and cannot work.

      Why? Simply - an excess of demand and a shortage of resources. There is simply too much demand for software development and there aren't enough Computer Science curricula in existence to mee
  • As a software quality professional... (Score:5, Interesting)

    by gosand (234100) on Wednesday February 20, @01:43PM (#22491510) Homepage
    I've been in software quality and testing for 14 years. I've worked at very large corporations as well as startups. There is a WIDE gap in software development process in our industry. Many people like to call themselves software engineers when they are developers. There is a huge difference. Engineering is a discipline that follows well-defined rules, and it usually takes time. But I think the very important thing to point out is that some software requires engineering - other software does not. If I go into a startup company that is trying to develop a blog/wiki site and try to implement a NASA-like software development methodology, they will fail. Likewise, software to control a heart monitor should be engineered and closely controlled. Sometimes quality and perfection are the goal, other times it might be time-to-market that is critical. You have to fit the process to your business. A bridge is a bridge, and they should all be engineered pretty much in the same way. You can't say the same thing about software.

    I think that this is a very key point to software development. I have seen companies who spent entirely too much time and money trying to eliminate all defects from their software when it wasn't the critical part of their business. Yes, we should always strive to eliminate defects, but you can't get them all. You have to know when to pick your battles, and when to accept the risks. If we're talking about life-or-death software, or security, or other very critical things - you need to focus on those.

    There's a grid I have seen used that is a great tool when doing projects.
    Schedule, Cost, Quality, Scope.
    One can be optimized, one is a constraint, and the other two you have to accept. Period. It is a more useful version of the "fast, good, cheap - pick two" rule.
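
    For anyone who wants to sanity-check a project plan against that grid, here is a rough sketch in Python. The role labels ("optimize", "constrain", "accept") and the function name check_grid are my own shorthand, not part of any standard methodology; the only rule encoded is the one stated above.

```python
from collections import Counter

# The four dimensions of the grid, and how many may take each role.
# Role names are illustrative labels, not an established vocabulary.
DIMENSIONS = ("schedule", "cost", "quality", "scope")
EXPECTED = {"optimize": 1, "constrain": 1, "accept": 2}


def check_grid(grid):
    """Return a list of problems; an empty list means the grid obeys the rule."""
    problems = []
    for dim in DIMENSIONS:
        if dim not in grid:
            problems.append(f"missing dimension: {dim}")
    for dim, role in grid.items():
        if dim not in DIMENSIONS:
            problems.append(f"unknown dimension: {dim}")
        if role not in EXPECTED:
            problems.append(f"unknown role for {dim}: {role}")
    # Count how many of the four dimensions were given each role.
    counts = Counter(grid.get(dim) for dim in DIMENSIONS)
    for role, wanted in EXPECTED.items():
        if counts[role] != wanted:
            problems.append(f"{role}: expected {wanted}, got {counts[role]}")
    return problems


if __name__ == "__main__":
    # A heart-monitor style project: quality is optimized, scope is fixed.
    plan = {"schedule": "accept", "cost": "accept",
            "quality": "optimize", "scope": "constrain"}
    print(check_grid(plan))
```

    A plan that tries to optimize two things at once comes back with a non-empty list, which is the whole point of the grid.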