
Richard Feynman, the Challenger, and Engineering

An anonymous reader writes "When Richard Feynman investigated the Challenger disaster as a member of the Rogers Commission, he issued a scathing report containing brilliant, insightful commentary on the nature of engineering. This short essay relates Feynman's commentary to modern software development."
  • by eldavojohn ( 898314 ) * <eldavojohn&gmail,com> on Wednesday February 20, 2008 @11:30AM (#22489314) Journal
    I'm a software developer. I would like to think of myself as an engineer but to me that's a higher title that belongs to people who actually engineer original ideas.

    The problem with the shuttle disasters (both of them, really) is external pressures that are not in any way scientific. The pressure from your manager at Morton Thiokol to perform better, faster and cheaper. The pressure from the government to beat those damned Russkies into space at all costs.

    So this is really a case of engineering ethics, when do you push back? As a software developer, I never push back. Me: "There's a bug that happens once every 1,000 uses of this web survey but it would take me a week to pin it down and fix it." My Boss: "Screw it--the user will blame that on the intarweb, just keep moving forward." But could I consciously say the same thing about a shuttle with people's lives at stake? No, I could not.

    So when an engineer at Morton Thiokol said that they hadn't tested the O-Ring at that weather temperature that fateful day and that information was either not relayed or lost all the way up to the people at NASA who were about to launch--it wasn't a failure of engineering, it was a failure of ethics. External forces had mutated engineering into a liability, not an asset.

    And there's a whole slew of them [wikipedia.org] I studied in college:

    * Space Shuttle Columbia disaster (2003)
    * Space Shuttle Challenger disaster (1986)
    * Chernobyl disaster (1986)
    * Bhopal disaster (1984)
    * Kansas City Hyatt Regency walkway collapse (1981)
    * Love Canal (1980), Lois Gibbs
    * Three Mile Island accident (1979)
    * Citigroup Center (1978), William LeMessurier
    * Ford Pinto safety problems (1970s)
    * Minamata disease (1908-1973)
    * Chevrolet Corvair safety problems (1960s), Ralph Nader, and Unsafe at Any Speed
    * Boston molasses disaster (1919)
    * Quebec Bridge collapse (1907), Theodore Cooper
    * Johnstown Flood (1889), South Fork Fishing and Hunting Club
    * Tay Bridge Disaster (1879), Thomas Bouch, William Henry Barlow, and William Yolland
    * Ashtabula River Railroad Disaster (1876), Amasa Stone
    So I agree with Feynman's comments in relation to engineering and the further comments on software development. But I don't find them to be a fault in the nature of engineering, just a fault in our ethics. What does capitalism and competitiveness drive us to do? Cut corners, often.
    • by DBCubix ( 1027232 ) on Wednesday February 20, 2008 @11:44AM (#22489542)
      The Kansas City Hyatt Regency walkway collapse was an engineering problem. The contractor asked to take a shortcut (instead of threading a nut up a three story threaded rod, they asked to cut the rod and offset it several inches) and the engineers rubber-stamped it without checking what the ramifications would be. The engineering part was not originally flawed, but it was when they approved the change order.
      • by Sanat ( 702 ) on Wednesday February 20, 2008 @12:49PM (#22490620)
        I stayed at this Hyatt over several different weekends while there was dancing and music on the ground floor. What would happen is that several individuals would get the walkways to start swaying and then reinforce the sway by shifting their bodies at the right instant, causing additional sway from the positive feedback. It was not unusual to experience 3 to 4 inches of sway.

        Although this swaying is not normally mentioned in the articles about the construction of the Hyatt, it went a long way towards weakening and stressing the connectors supporting the floors.

        Two of my friends were dancing on the floor when the walkways gave way and both were killed.

         
      • Re: (Score:3, Interesting)

        by MightyYar ( 622222 )
        Same thing happened with the Citibank building in NYC - fortunately that error was caught by a student studying the plans!
      • by russotto ( 537200 ) on Wednesday February 20, 2008 @01:45PM (#22491552) Journal

        The Kansas City Hyatt Regency walkway collapse was an engineering problem. The contractor asked to take a shortcut (instead of threading a nut up a three story threaded rod, they asked to cut the rod and offset it several inches) and the engineers rubber-stamped it without checking what the ramifications would be. The engineering part was not originally flawed, but it was when they approved the change order.
        Right, except that the original design wouldn't have worked, as the integrity of the threads could not have been maintained during construction and thus the nut could not have been put on. So in software terms it was a last-minute patch to fix a show-stopper, which wasn't adequately unit-tested.
        • I've just looked at the Wikipedia article (and sketch) showing the defect. As an engineer (yes, a real one, albeit mechanical discipline), all I can say about what was done is, what an unimaginative solution.

          They could have still split the threaded rod under the upper walkway, and re-joined it with a threaded coupling, just below the nut supporting the upper walkway. If the nut can support the upper walkway, then the threaded coupling could easily support the lower walkway.

          In my experience, the solution u
      • by AJWM ( 19027 )
        The engineering part was originally flawed in that it could not be built as designed. That three-story rod wasn't threaded its entire length in the original design; it was just threaded where the bolt attached. Looked good on paper, darn near impossible to fabricate. That said, though, it would probably have worked if the entire length had been threaded (although threading adds its own set of problems).
        • Re: (Score:3, Interesting)

          Threading also removes material from the total cross-section, and improper threading will cause fissures in the material which, under stress, cause failure.
          This was a combination failure. Like most failures, it requires many things to come into alignment before the disaster occurs. The Space Shuttle and Sky Bridge did not fail because of one thing; rather, several factors came together simultaneously and then the disaster occurred. If any one of these factors were to be mitigated or removed then t
    • by Vicious Penguin ( 168888 ) on Wednesday February 20, 2008 @11:48AM (#22489604)
      > What does capitalism and competitiveness drive us to do? Cut corners, often.

      Maybe, but remember what your own example shows -> What is the cost/benefit of fixing/preventing an error? Is a week of debug time worth missing your target ship date? Maybe, maybe not - depends on the error.

      A blanket indictment of capitalism is quite unfair. You would still have the same cost/benefit analysis regardless of the economic system you toiled under.

      It is not possible to engineer against all eventualities; trying to do so will usually keep you from ever getting off the ground.
      • by Protonk ( 599901 ) on Wednesday February 20, 2008 @12:01PM (#22489800) Homepage
        This is true to an extent, but safety concerns can and should be engineered for. You are absolutely right that there exists no direct corollary between software debugging for some non-critical application and meeting safety margins for a critical product. However some software IS critical. Flight software (This portion of Feynman's essay about NASA's flight software is amazing), software for hospital applications (pharmacy, PCA's, microsurgery), ABS/suspension control software. Those are applications with VERY critical outcomes. Safety concerns need to be built into the process.

        But I do agree that tradeoffs occur under any system. Those tradeoffs just let us make better decisions under capitalism whereas we can't allow the information from those tradeoffs to inform us economically in a socialist system.

        • But I do agree that tradeoffs occur under any system. Those tradeoffs just let us make better decisions under capitalism whereas we can't allow the information from those tradeoffs to inform us economically in a socialist system.

          Wow, random assertion. I have no idea why this would be true, and a good body of work seems to suggest the opposite.

          Capitalism clearly makes poor choices about these tradeoffs when their monetization is incorrect, but so does socialism. Other than that, capitalism seems to make

          • by Protonk ( 599901 ) on Wednesday February 20, 2008 @04:59PM (#22494504) Homepage
            It's not a random assertion at all. It's a foundation of economics. The world is full of information particular to place and time, in other words, the nitty-gritty. If you were to make a statistical model of part of the world, that stuff would get buried in the "other" term. Unfortunately, where there is a lot of "other" it becomes hard to model. Take for instance, who to give cars to. Should I have a survey and have the outcome determine who gets the car? Should I give the car to someone who needs it the most or will use it the most effectively? How do I judge that? How do I stop people from lying to me? I could, alternately, just sell the car to someone for an agreed upon price. That means I learn at least how much it is worth to them (it may be worth more) and the car goes somewhere. Prices transmit information and preferences better than any 5 year plan or government study. Sometimes markets have failures and those need to be dealt with, but that is not what I am talking about.
    • Re: (Score:3, Interesting)

      by Protonk ( 599901 )
      There are other disasters that don't stem from the profit motive:

      Loss of the USS Thresher during initial sea trials.

      Steam Line Rupture on the USS Iwo Jima.

      Both of those were caused by engineering (the first) or procurement faults.

      The Thresher was lost with all hands due to (among other things) a failure in modeling the high pressure air system and inappropriate welds on seawater systems.

      The Iwo Jima suffered a steam line rupture that killed a few guys because the wrong material was used on a high pres/temp
    • Re: (Score:3, Informative)

      by Yvanhoe ( 564877 )
      There is a point you miss there I think. It is the top-to-bottom design philosophy vs the bottom-to-top. The first gives objectives first, then designs every part so that it fulfills the general objective. The latter focuses on designing simple elements and assembling them into more complex elements with defined capacities and known weaknesses.

      This article states that the second approach is inherently better than the top-to-bottom approach. This is clearly an engineering problem. I am not sure I agree wi
      • It seems like bottom-to-top engineering is the best way for an individual to work - they know what they're trying to accomplish and they're the one who understands where the rubber meets the road. Conversely, the top-to-bottom approach seems best suited to large teams, because it should lend itself to more compartmentalization of effort. But then, I'm not an expert on the subject, and life is often counterintuitive. Certainly, however, basically everything I write is bottom to top :)
      • There is a point you miss there I think. It is the top-to-bottom design philosophy vs the bottom-to-top. The first gives objectives first, then designs every part so that it fulfills the general objective. The latter focuses on designing simple elements and assembling them into more complex elements with defined capacities and known weaknesses.

        This article states that the second approach is inherently better than the top-to-bottom approach. This is clearly an engineering problem. I am not sure I agree with the conclusions and acknowledge that most of the Challenger disaster was due to unwelcome pressure, but I don't think you can dismiss the whole issue as not concerning engineering.

        I wouldn't necessarily agree that the Top-Down approach is 100% bad. Both have their strengths and weaknesses for what they can see and cannot see. Bottom-up tends to get too involved in the details, while Top-Down tends to outright ignore the details.

        However, please, please, please,..(can I say it enough?).., please do not mix the two - have one part top-down and another bottom-up. Be consistent in design methodology across the entire project, and find the appropriate balance of both methods.

    • In order to have a real sense of the "nature" of engineering, you have to look at more than the failures. You have to look at the successes that occurred in the midst of these same pressures. I'd start by looking into the Manhattan Project, in which Feynman played a part. The exercise of finding other examples is left for the reader.
    • I'm a software engineer too. However I've worked on projects where a software failure could get people killed or destroy hundreds of millions of dollars of "stuff". For example the software might be processing radar data inside a little gadget that flies at Mach four and carries an explosive warhead. In those cases you don't just say "the user will blame the bug on The Internet" and let it go.

      The thing with software is that it is such a wide field. If you are writing a web based survey program, so what i
    • Re: (Score:3, Insightful)

      by somersault ( 912633 )

      I'm a software developer. I would like to think of myself as an engineer but to me that's a higher title that belongs to people who actually engineer original ideas.

      Well I know I'm missing the point of your post with this, but a quick google comes up with this description of an engineer:

      a person who uses scientific knowledge to solve practical problems

      I think your higher title should be an 'inventor'. Engineers are the guys that generally plod away using well tested mechanical or other scientific knowledge to get everyday jobs done (just like a software engineer really?). I work as IT support/coder for a bunch of engineers here and while they sometimes may be using old ideas in new ways, most of their work is just that plodding awa

      • by Knara ( 9377 )

        Don't tell that to the "professional engineers", though. Their head will fly off if they're one of the 80% of those who think that "software engineer" is tantamount to blasphemy.

        • I can think of some draftsmen that would probably have a chuckle at it, but a few of them have decent computer experience and would probably see my point if I talked about coding in terms of engineering. One of the guys used to work for Rolls Royce, he's our expert when it comes to the blade geometry stuff, and usually tells interesting stories of the olden days and him using FORTRAN and such :p Then there's my uncle who got a degree in Computer Science in the early 90s, but just recently finished a PhD in
          • by Knara ( 9377 )

            Sure, and I'd agree with you, but I point you here [slashdot.org] (later in this very discussion) for an example of what I'm referring to.

          • by Knara ( 9377 )

            Oh, and this AC [slashdot.org] for that matter.

            As if engineering didn't exist before someone made it an attainable certificate.

    • by ccguy ( 1116865 ) *

      "There's a bug that happens once every 1,000 uses of this web survey but it would take me a week to pin it down and fix it."
      How can you give a time frame for a non repeatable bug you still have to pin down? If you had a boss like mine you would be in trouble, because he would say "ok, fix it, I'll get you the time" and if a week later you don't deliver, he wouldn't be very understanding.
      • by uncqual ( 836337 )
        Being able to say with any reasonable degree of confidence that a particular bug is encountered "once every 1,000 uses" implies that it's been encountered multiple times and hence is repeatable - just not yet reliably repeatable on a single test run. If only 1000 test iterations are run and one fails due to "a bug", it's really not possible to say much useful about the odds of the bug being encountered - with some degree of confidence perhaps one could say that it's probably encountered less often than one
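        To put numbers on that intuition, here is a minimal sketch (assuming Python with SciPy; the clopper_pearson helper below is purely illustrative and not anything from TFA) of how wide the plausible failure-rate range really is after observing a single failure in 1,000 independent runs:

        # Sketch: what does one failure in 1,000 independent runs tell us about the true rate?
        # Assumes each run is an independent trial with a fixed, unknown failure probability.
        from scipy.stats import beta

        def clopper_pearson(failures, runs, confidence=0.95):
            # Exact (Clopper-Pearson) confidence interval for a binomial failure rate.
            alpha = 1.0 - confidence
            lower = 0.0 if failures == 0 else beta.ppf(alpha / 2, failures, runs - failures + 1)
            upper = 1.0 if failures == runs else beta.ppf(1 - alpha / 2, failures + 1, runs - failures)
            return lower, upper

        lower, upper = clopper_pearson(failures=1, runs=1000)
        print("95%% CI for the failure rate: %.6f to %.6f" % (lower, upper))
        # Roughly 0.000025 to 0.0056 -- anywhere from ~1 in 40,000 to ~1 in 180 uses.

        In other words, a single observed failure in a thousand runs is consistent with a bug roughly forty times rarer or five times more common than "once every 1,000 uses", which is exactly the point about needing many more iterations before the estimate means much.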
    • * Citigroup Center (1978), William LeMessurier

      In fairness, the ethics involved in the initial "cost-savings" decisions should be separated from ethics of how the situation is handled after the problem is revealed.

      It's pretty well documented that he exhibited uber-ethics by owning up to the engineering problem immediately after a student pointed out the miscalculations: http://en.wikipedia.org/wiki/William_LeMessurier [wikipedia.org]. Point being, he could have just lawyered up but that's not how responsible engineers beha
    • by ceoyoyo ( 59147 )
      That's the difference between an engineer and a not-engineer. Software developers are more like building contractors. You do what the boss says. The engineer, who's in charge of the job, or at least inspecting the job, is required, legally, to tell the bosses to screw off if they ask him to cut a corner that compromises safety. If he doesn't he's finished.
    • Re: (Score:3, Insightful)

      by Omnifarious ( 11933 )

      Blaming the shuttle disaster on capitalism is erroneous. I do not necessarily disagree with your assessment in general, but capitalism was not at fault in that particular instance. What was at fault was bureaucrats trying to look good to their superiors and present a positive public image at the cost of real engineering.

      I would say that in general is the meta-problem, not capitalism. In its current form in the US capitalism has caused the existence of many large entities that use hierarchical systems of

    • by dubl-u ( 51156 ) *
      So I agree with Feynman's comments in relation to engineering and the further comments on software development. But I don't find them to be a fault in the nature of engineering, just a fault in our ethics. What does capitalism and competitiveness drive us to do? Cut corners, often.

      Any approach to engineering that only works with uber-humans, rather than the regular ones we have to work with, strikes me as painfully naive. Much of engineering is about understanding and accepting the nature of the materia
    • by Specter ( 11099 )
      To some extent, this was also a communications problem. Edward Tufte has analyzed the Challenger and Columbia disasters and concluded that they largely occurred because critical information became obfuscated as it moved up the decision tree. Take a look at his analysis of the Columbia disaster:

      http://www.edwardtufte.com/bboard/q-and-a-fetch-msg?msg_id=0001yB&topic_id=1&topic=Ask+E.T [edwardtufte.com].

      Challenger has similar issues. I can't find a direct cite for it but this page:

      http://www.asktog.com/books/c [asktog.com]
  • wow (Score:5, Funny)

    by loconet ( 415875 ) on Wednesday February 20, 2008 @11:36AM (#22489394) Homepage
    For a second there I thought I read "Rogers Communications" and "brilliant" and "engineering" in the same sentence. I thought I had been kicked to an alternate universe where I wouldn't be able to escape. I am glad to be back.
  • Did anyone get through before the story hit the front page? I'd be interested in reading, but Google doesn't have a cached version of the story.
  • by StarfishOne ( 756076 ) on Wednesday February 20, 2008 @11:39AM (#22489454)
    A future essay relates Feynman's commentary to modern web hosting, load balancing and the so-called "Slashdot effect."
  • Mirror (Score:3, Informative)

    by fishdan ( 569872 ) on Wednesday February 20, 2008 @11:40AM (#22489480) Homepage Journal
    http://duartes.org.nyud.net/gustavo/blog/post/2008/02/20/Richard-Feynman-Challenger-Disaster-Software-Engineering.aspx [nyud.net] As a side note, could someone make a Greasemonkey script to make all links from /. run through Coral? It just makes sense
  • already?
  • by Protonk ( 599901 ) on Wednesday February 20, 2008 @11:44AM (#22489538) Homepage
    To be fair, the Challenger disaster actually preceded NASA's slogan and procurement policy of "faster, better, cheaper" by a bit. More to the point, Feynman's article should be a cautionary tale to ANYONE in an engineering field. It isn't a matter of one field being subject to unscientific pressures and another field being immune. No technology or industry is immune from the pressures and problems that caused the Challenger disaster. Anyone who claims to be well enough adapted to safety concerns not to spend lots of time and effort on them is foolish. The nuclear industry still has to practice strong QC on parts, procedures and maintenance and CONTINUE that practice. Same with commercial aviation, acute medical care, etc. Constant vigilance is rewarded only with another uneventful day. That is the fundamental problem. Vigilance is expensive and time consuming. These are not pressures from the profit motive. They apply to government as well as civilian ventures.
    • Yes, like it or not cost analysis and time to market are integral to engineering. Finding the correct balance is what makes a great engineer.
    • Constant vigilance is rewarded only with another uneventful day. That is the fundamental problem. Vigilance is expensive and time consuming.

      You're absolutely right. In this society dominated by the results/delivery-driven mentality, things that do not directly contribute tend to be marginalized. See how companies offshore support and QA because these divisions do not actively generate revenue. It is the same thing.

      For something like QA, no news is good news. For management, no news is waste. Management's me
  • by sphealey ( 2855 ) on Wednesday February 20, 2008 @11:44AM (#22489540)
    (I will refrain from a four-step Profit post). Standard technique: latch on to an essay by a brilliant and insightful person. Extend the insights of that person slightly into a different field with usual compare-and-contrast, brand-extension writing techniques. Claim that resulting essay (and self) are as insightful as the original essayist.

    It doesn't work 99.994% of the time, generally because very few people are as insightful as the original brilliant person.

    sPh
    • Re: (Score:3, Interesting)

      by pilgrim23 ( 716938 )
      Good point. I would suggest reading up on Dr. Feynman as a precursor. Or, for those who prefer the flickering screen, there are several video interviews with the great man. One from Horizon called "The Pleasure of Finding Things Out" is VERY watchable. Also his book "Surely You're Joking, Mr. Feynman!" is a hoot! Highly recommended. Richard Feynman is one of the greatest safe crackers who ever lived and in the top 10 of minds of the 20th Century.
      • Richard Feynman is one of the greatest safe crackers who ever lived

        You're only saying that because he cracked the safe with every bit of information the US had on how to make an A-bomb after the Trinity tests. On second thought, I suppose that is pretty good.

    • While most commentaries on brilliant analysis are not brilliant, a few are.

      Edward Tufte's analysis of Dr. Feynman's brilliant analysis is brilliant, warranting a full chapter in Visual Explanations [amazon.com]. What makes it special is that it is not "hey, yeah, that's a good idea, I'm smart too" but instead a study of why Dr. Feynman's analysis is brilliant.
      • Re: (Score:3, Informative)

        by Phat_Tony ( 661117 )
        I don't have my copy of Visual Explanations handy, but I've read it and I was at a talk Tufte gave on this subject, and my recollection of it is rather different. Without directly criticizing Feynman, Tufte actually comes up with a significantly superior analysis of the root cause of the disaster. Feynman spread the blame around many places, finding bad science, bad engineering, inaccurate statistics, poor procedures and documentation, politics influencing design, and most importantly and famously, a disconn
        • by sphealey ( 2855 )

          === I've read Adventures of a Curious Character and have the utmost respect for Feynman. Every problem Feynman outlined in his analysis was a real problem that NASA should fix. But none of it really pinpointed the exact cause of the disaster. Feynman mostly chalks the failure to postpone the launch up to management's disconnect from engineering, from their mistakes and lack of understanding and therefore overestimating the safety of the shuttle. This puts the blame in the wrong place. The managers were no where n

        • by redelm ( 54142 )
          Sorta. The things that failed were called "O-rings" but have nothing in common with proper O-rings (viscoelastic differential pressure seals). Those things leaked every time, while real O-rings never leak when clean and in good condition. The things that failed are more like packing, which must leak to work.

          The real problem was the closure design was ripped off from a Hahn & Clay patent which NASA and the contractors did not understand and implemented horribly. The large-diameter closure was supposed

  • Hm. (Score:5, Insightful)

    by gardyloo ( 512791 ) on Wednesday February 20, 2008 @11:51AM (#22489630)
    The blog post makes a nice contribution by linking to Feynman's original thoughts (for example, here: http://www.ranum.com/security/computer_security/editorials/dumb/feynman.html [ranum.com] ), ones I haven't read for a long time (and was happy to be reminded of). However, the author makes the mistake of thinking that the original thoughts need to be interpreted and summarized for the reader. Feynman's words by themselves are simple to understand, are concise, and contain just the tone for which geeks go gaga. Anyone interested in the subject will be able to make his or her own judgements about the engineering and politics involved in the Shuttle development, engineering in general, and the extensions to software development.
    • Re: (Score:3, Insightful)

      by Protonk ( 599901 )
      This is a very good point. Feynman has the unique quality of startling intelligence, curiosity, and straightforwardness. Some authors need to be summarized. Feynman just needs to be trotted out every generation or so.
      • Re: (Score:3, Informative)

        I agree, and tried not to summarize at all. Mostly I just tried to link what Feynman said to software, rather than make a fool of myself paraphrasing him. That's also why the entry is really short, and basically tells people to go read the source :) cheers.
        • by Protonk ( 599901 )
          Oh absolutely. I can't read the article right now. :) But I'm not going to crucify you for making the parallels. I remember reading the chapters about NASA's flight software testing and getting goosebumps. It's THAT good. I think you are right for making that parallel and suggesting its relevance. There are a fair number of coders alive today who weren't adults when Mr. Feynman was alive, sadly.
  • scooped! (Score:2, Funny)

    by troybob ( 1178331 )
    And here I was on the verge of releasing my twin papers on how the 9/11 Commission Report can be applied to software development, and how the Warren Commission Report on the Kennedy assassination applies to P2P.
  • Surely You're Joking (Score:3, Interesting)

    by Yoweigh116 ( 185130 ) <(moc.liamg) (ta) (hgiewoy)> on Wednesday February 20, 2008 @12:00PM (#22489782) Homepage Journal
    Offtopic, but I highly recommend Surely You're Joking, Mr. Feynman [amazon.com], the autobiography he narrated on his deathbed. It's got some great stories in it, like when he surreptitiously went around picking locks at Los Alamos or his personal recollections of the Trinity nuclear tests.
    • by Protonk ( 599901 )
      It isn't really off-topic. I think the essay in question comes from the other volume (What do you care what other people think?). Both are outstanding books and well worth the shelf space.
  • I'm not sure if he is stating that a bottom up testing method is readily available in all situations, but it sure is a hell of a lot easier with data rather than with physical designs. Scanning and testing code is much easier than building a CPU and testing it from the bottom up (not that I ever have). He does make the distinction that it is less costly in the long run, and I'd probably agree with him, not from experience with this particular application, but experience in general with preventative maintena
    • by Protonk ( 599901 )
      For critical applications, bottom up design is not impractical. It is impractical for non-critical applications. Even with physical applications, bottom up design has some clear advantages.

      I do not personally feel that one of those advantages is overall cost savings. I think that most top-down design programs are cheaper overall than their bottom-up counterparts (all things being equal). However the benefit in terms of clear and understandable safety margins is almost impossible to replicate.

      Easy exampl
  • by Martin Spamer ( 244245 ) on Wednesday February 20, 2008 @12:18PM (#22490080) Homepage Journal
    The biggest problem is most software developers are NOT chartered professional software engineers, so have no personal, professional and legal responsibility for their work. That is why IT is full of cowboys and trust is nearly non-existent. Software Engineers must become a chartered only profession, so that people who are not chartered are not allowed to practice.

    To qualify as a Professional Engineer we should place good practice above short term gains. Professional Engineers should be truthful and objective and have no tolerance for deception or corruption. Professional Engineers only work in areas where they are competent. Professional Engineers build their reputation on merit and their skills through continual learning and the skills of their charges through ongoing mentoring.

    We wouldn't have to put up with the shoddy work of cowboys, because they wouldn't be allowed to practice. We wouldn't have to put up with orders that counteract professional ethics or good practice, because legal responsibility trumps commercial pressures. The profession wouldn't be undermined by fast-to-market but poor-quality work. We could place trust in third party tools, software & services and we would not have to put up with EULAs that disavowed responsibility for damage.
    • Re: (Score:3, Insightful)

      Comment removed based on user account deletion
    • by dubl-u ( 51156 ) *
      Software Engineers must become a chartered only profession, so that people who are not chartered are not allowed to practice.

      This is a reasonable theory, but I think it's wrong in practice for a few reasons:
      1. Most software is not life-critical; much engineering is.
      2. Scale matters. You don't need to be licensed to build a doghouse or install cabinets. Only some software development is of the scale where quality practices matter.
      3. Practice is quickly changing. For example, agile methods appeared circa ten yea
  • They said that the management at NASA didn't want to cancel the flight of the Challenger because it was such a high-profile launch even though they were warned about the O-rings.
  • Maybe there will be some sunny day when I will listen to what Linus Pauling says about vitamin C, what Fomenko [wikipedia.org] says about history [wikipedia.org] and what Richard Feynman says about programming.

    But that day is not today.
    • Richard Feynman never said anything about software engineering, to the best of my knowledge. Did you read TFA? But should Feynman have commented on software engineering, I for one would have read his comments with the utmost interest. Feynman had extraordinary intelligence and perception and used both in whatever interested him. And his interests were exceptionally varied. Whatever the topic, Feynman's comments are almost always insightful, interesting, and original. Would your same close-minded attitud
      • "Did you read TFA?" Do you mean TSA (the slashdotted article)?

        He probably did not. Please consider my comment as an illustration of a more general idea than plain bashing of the aforementioned scientists (love that too, by the way...)
  • An example of flawed control software leading to fatalities: http://en.wikipedia.org/wiki/Therac-25 [wikipedia.org]
    • Actually, I would classify that as a hardware, not a software problem. Lack of hardware safety interlocks was the real problem. That is why buggy software had fatal consequences. In many industrial settings where safety is an issue, the safety devices are generally not completely under software control, but incorporate hardware interlocks to ensure life-threatening conditions do not occur. Heck, even consumer microwave oven doors have a hardware interlock, as do many other appliances. Safety shouldn't
  • Physics is not engineering. If you get things wrong in physics, usually, nothing happens except maybe an angry letter to the editor. Physicists regularly produce incomplete or even contradictory theories, and nobody dies. Physics doesn't have to interface with people; when coming up with a theory of quantum gravity, you don't have to worry about people pushing the wrong button. And the complexity (in terms of variables, equations, etc.) of all of theoretical physics taken together is probably still less
    • I think you greatly underestimate the complexity of the ideas you need to have under your belt to understand all of theoretical physics.
      • I think you greatly underestimate the complexity of the ideas you need to have under your belt to understand all of theoretical physics.

        Yeah, my boss thinks anything he doesn't understand is easy as well.

      • by nguy ( 1207026 )
        I think you greatly underestimate the complexity of large engineering projects. They may not be intellectually as challenging as theoretical physics, but they have an enormous amount of detail.
    • Physics is not engineering. If you get things wrong in physics, usually, nothing happens except maybe an angry letter to the editor. Physicists regularly produce incomplete or even contradictory theories, and nobody dies

      Ironic that you would post that in a topic about Feynman. While he was a brilliant theoretical physicist, his first job was designing the analog computing machines that launched artillery shells for the Army, and he had a part in building the atomic bomb. He was the guy who used then-theo

  • by gosand ( 234100 ) on Wednesday February 20, 2008 @01:43PM (#22491510)
    I've been in software quality and testing for 14 years. I've worked at very large corporations as well as startups. There is a WIDE gap in software development process in our industry. Many people like to call themselves software engineers when they are developers. There is a huge difference. Engineering is a discipline that follows well-defined rules, and it usually takes time. But I think the very important thing to point out is that some software requires engineering - other software does not. If I go into a startup company that is trying to develop a blog/wiki site and try to implement a NASA-like software development methodology, they will fail. Likewise, software to control a heart monitor should be engineered and closely controlled. Sometimes quality and perfection are the goal, other times it might be time-to-market that is critical. You have to fit the process to your business. A bridge is a bridge, and they should all be engineered pretty much in the same way. You can't say the same thing about software.

    I think that this is a very key point to software development. I have seen companies who spent entirely too much time and money trying to eliminate all defects from their software when it wasn't the critical part of their business. Yes, we should always strive to eliminate defects, but you can't get them all. You have to know when to pick your battles, and when to accept the risks. If we're talking about life-or-death software, or security, or other very critical things - you need to focus on those.

    There's a grid I have seen used that is a great tool when doing projects.
    Schedule, Cost, Quality, Scope.
    1 can be optimized, 1 is a constraint, and the other 2 you have to accept. Period. It is a more useful version of the "fast, good, cheap - pick two" rule.
    • A bridge is a bridge, and they should all be engineered pretty much in the same way. You can't say the same thing about software.

      Well, actually you can. It's just a bit different since in mechanical engineering, you will have one that will specialize in bridges, and another in autos, etc; but management won't try to take a bridge engineer and make him design autos. Whereas in software, it is often the case that the software engineer with specialty X will be as

      • by gosand ( 234100 )
        Well, actually you can. It's just a bit different since in mechanical engineering, you will have one that will specialize in bridges, and another in autos, etc;


        I'll take my bridges designed by civil engineers, thanks. But I get what you meant. :)

  • This story is about Feynman, so it needs to be tagged "richardfeynmanisgod."
  • The blog entry is dated today.
    The link to Feynman's appendix to the Rogers Commission is a link dated 1996.
    Feynman died Feb 15 1988.

    So we're talking about something over 10 years old that a blogger has added a few personal observations to, and it's linked in as slashdot news.
    • And not very much in the way of personal observation at that. It was Feynman himself who made the comparison to best-practice software engineering. All TFA was really doing was pointing out that we already know what best-practices are for critical applications and that we generally don't use them. To anyone who develops software, who was interested enough in either the Challenger explosion or Feynman himself to have read it already, it's not news.
  • From the linked Feynman Essay:

    "There is not enough room in the memory of the main line computers for all the programs of ascent, descent, and payload programs in flight, so the memory is loaded about four time from tapes, by the astronauts."

    Since I've had such stellar success with tapes and drives made this century, I can't imagine trusting landing the shuttle to some made 20+ years ago...

    • by PhxBlue ( 562201 )
      If it ain't broke, don't fix it. The software, at least, ain't broke.
    • From Feynman's book "What Do You Care What Other People Think?": "the computers on the shuttle are so obsolete that the manufacturers don't make them anymore. The memories in them are the old kind, made with little ferrite cores that have wires going through them. In the meantime we've developed much better hardware: the memory chips of today are much, much smaller; they have much greater capacity; and they're much more reliable." (page 192 of my copy; it's in the chapter called An Inflamed Appendix).
  • 01. Don't build solid fuel boosters in sections.

    02. Don't build them out of state so they have to be sectioned to transport by rail.

    03. Don't compromise design so as to get some politician's vote for funding, forcing you to site the solid rocket booster in his state.

    04. Don't ignore safety concerns from your own engineers

    05. It don't take a nuclear physicist to figure this out ..
    • To be fair, I have a feeling that almost all rocket boosters once they reach a certain size have to be built in sections. And maybe there just aren't any companies in Florida that can build such rockets. Engineering ethics is a constant trade off. Some element of risk must be accepted.
  • Feynman Videos. (Score:2, Informative)

    by ip_free ( 544799 )
    If you like Feynman, here are some of his lectures. http://vega.org.uk/video/subseries/8 [vega.org.uk]
  • Marcus Ranum has an interesting talk (MP3) [ranum.com] in which he discusses Feynman's Challenger commentary at some length in the context of designing reliable/secure software systems.

    The talk gets off to a bit of a rough start (see Ranum's comment below), but contains much insight and makes a lot of sense before long. Highly recommended for those in the software development field, where the approach is often 'throw it together, then poke at it and patch it until it stops obviously breaking'; the rigour Feynman

  • by wannabegeek2 ( 1137333 ) on Wednesday February 20, 2008 @03:37PM (#22493264)
    I work in the aerospace industry, specifically an airline, as a manager of an Engineering subgroup. (if "manage" is what you call what I do)

    One of the first things I have a new hire do is read Feynman's appendix to the Challenger Report. Primarily to instill a respect for dealing with data, not desires or pressures, and to (re)enforce the concept that "it worked last time" does NOT make it right or safe to do the same thing again.

    The pressure / desire from above or parallel organizations within the company is constant, and usually precipitated by the latest operational interruption. All too frequently the refrain is along the lines of "but last time you authored a deviation, this is only a little bit more". When I feel the pressure is starting to cause situational ethics creep, I pull out Feynman's appendix, and read it myself, or have the affected person on my staff read it.

    It is amazing how effective it is in restoring sanity, and a healthy respect for the ability of the hardware to kill you (and / or your customers).

    Richard Feynman gave many things to this world, and especially certain segments of it. It's my opinion however that one of his best and most unsung gifts was the Challenger Report Appendix. It should be required reading for ANYONE who will ever touch or direct action on hardware that could even remotely present a potential for injury or death.

    The message was not rocket science, but as the Columbia accident proved, the rocket scientists still can't get it right.
