A Rubric for IT Analysis

Aredridel writes "Zed A. Shaw has an insightful article on how analyses of software systems should be performed, and how they're often done wrong. It should be required reading for all IT journalists, and all readers of IT journals."
This discussion has been archived. No new comments can be posted.

  • by Doc Ruby ( 173196 )
    Would someone please run this rubric through the rubric and say how well it complies?
    • Re:MetaRubricry (Score:2, Interesting)

      by concept10 ( 877921 )
      That's my exact question. When will someone make software that will analyze TFA and tell me if it is worth reading? Think about how much bandwidth could be saved with this app. Sort of like a StumbleUpon for news.
  • Hmmmm.... (Score:5, Funny)

    by extagboy ( 60672 ) on Sunday June 12, 2005 @05:14PM (#12797298) Homepage
    Even worse it works about as well as pricing soap at $1.95 instead of $2.00 to fool people into thinking it's cheaper.

    I think $1.95 is cheaper, isn't it?

    Better run it through the rubric...

    8. Paper does not use the above terms correctly or calculates them incorrectly. Without the data you won't know the second part, but these 6 statistical concepts are very simple to calculate and get right.

    I think it's broken.
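    For what it's worth, the statistics the rubric's item 8 refers to really are one-liners. A minimal Python sketch with made-up timing samples, assuming the six concepts are something like mean, median, mode, variance, standard deviation, and range (the article's exact list may differ):

```python
import statistics

# Hypothetical benchmark timings (ms); not data from the article.
samples = [102, 98, 105, 99, 101, 103, 98, 100]

mean = statistics.mean(samples)          # 100.75: central tendency
median = statistics.median(samples)      # 100.5: middle value, robust to outliers
mode = statistics.mode(samples)          # 98: most frequent value
variance = statistics.variance(samples)  # sample variance (n-1 denominator)
stdev = statistics.stdev(samples)        # square root of the variance
spread = max(samples) - min(samples)     # 7: range of the data

print(mean, median, mode, round(stdev, 2), spread)
```

    If a paper can't get numbers this simple right, that alone is grounds for suspicion.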
    • Yeah, the paper's "broken". The examples of graph manipulation which he used were contrived, to say the least.

      So what if the y axis on one graph had ticks every 100 instead of every 50? I read it as a way of making both graphs the same height - it didn't distort the information. The ticks were clearly labeled (50 or 100 per tick).

  • by Anonymous Coward

    Something for Timothy [slashdot.org]

  • by alanw ( 1822 ) * <alan@wylie.me.uk> on Sunday June 12, 2005 @05:15PM (#12797308) Homepage
    When you have read that article, go and buy a copy of the 1954 classic How to Lie with Statistics by Darrell Huff [wikipedia.org], ISBN 0393310728.
  • by YU Nicks NE Way ( 129084 ) on Sunday June 12, 2005 @05:15PM (#12797312)
    The author of the rubric "carefully" lists examples of things that ought to be seen -- and then carefully extracts two graphs from a long analysis in order to "prove" his claim. Never mind that the things he argues one should look for would be embedded in the materials and methods or results section, not the conclusion or the paper summary. Never mind, either, that his objections are bogus (red versus black ink? Uh, wait -- if the winning system had been shown in red, it would have conveyed how burningly fast the system was.)

    Oh, wait -- it's something which shows that Samba 3.0 is slower than W2K3. Never mind. This is Slashdot, so the editors have gotta troll for ad views.
    • Agreed, I'm sorry, the 'red lines are bad' claim is lame. If the Windows line was in blue he would probably have claimed that it "is intended to remind you of the blue screen of death".

      Moreover, he accuses the example graph makers of bad practice by re-scaling the x-axis, without rescaling the y-axis "to compensate".

      Excuse me? As far as I can see the x axis was scaled in order to display the data in the most room available, not to deceive in some way. The y-axes were left alone because the data range depicted were identical.
      • Excuse me? As far as I can see the x axis was scaled in order to display the data in the most room available, not to deceive in some way. The y-axes were left alone because the data range depicted were identical.
        What graphs are you looking at? The y-axis has been changed and the x-axis has been changed.
        This gives the obvious impression that the results of the two tests lie in the same interval; however, the first graph has much lower y-values than the second.
        The "dots" marking the path of the curve have been
        • by Otter ( 3800 )
          1) Yes, the grandparent got the x- and y-axis confused in his post.

          2) The point of the graphs is that the Windows server has a roughly 75% performance advantage over Samba on both systems. The different y-axes are used because one system is twice as fast as the other and using the same scale on both graphs would leave half of one graph empty. I would say the choice of scales is entirely correct.

          3) The x-axis is labelled in numbers, not intervals. Excel graphs place tickmarks between the labels. You can co

    • I agree with you. The graphs seem perfectly all right to me. Rescaling the y-axis seems like a very sensible thing to do, because you would otherwise get a large white space on the second graph - just because the second computer is roughly two times faster than the first.

      It would be a terrible abuse of graphs if the point was to compare the two computers, but I don't believe it was.

    • I mean he's correct about an improperly scaled graph conveying the wrong things. For example suppose I run some graphics test, call it BitchinFastMark 8002, on two graphics cards. One scores 10837, one scores 10921. Ok what that means is that these two cards are basically the same speed. That change is so small it might be experimental error. If I properly graph the results on a scale from like 0 to 12000, it'll be readily apparent that the two are almost identical. However suppose I scale the graph from 10
      • Actually, only having two values it is impossible to know how much difference there is. If you test 100 video cards made in the last 5 years, and their scores range from 500-11000, then these two cards are basically the same speed. If your range is from 1800-1950 then they are radically different. Numbers are meaningless without units, and units are meaningless if the user doesn't know what they are.
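      The distortion described above is easy to quantify: the fraction of the visible axis that a given difference occupies depends entirely on the axis range you pick. A quick Python sketch using the parent post's hypothetical scores:

```python
# Hypothetical benchmark scores from the parent post.
score_a, score_b = 10837, 10921
delta = score_b - score_a  # 84 points, under 1% of either score

def visual_fraction(delta, axis_min, axis_max):
    """Fraction of the plotted y-axis that the difference occupies."""
    return delta / (axis_max - axis_min)

honest = visual_fraction(delta, 0, 12000)      # zero-based axis
zoomed = visual_fraction(delta, 10800, 11000)  # truncated axis

print(f"honest axis: {honest:.1%}")  # 0.7% of the axis height
print(f"zoomed axis: {zoomed:.1%}")  # 42.0% of the axis height
```

      Same 84-point gap, same data; only the presentation changed, which is exactly the trick a truncated axis plays.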
  • by mister_llah ( 891540 ) on Sunday June 12, 2005 @05:17PM (#12797328) Homepage Journal
    The usage of red and green determines the meaning, if the higher statistic was red, it wouldn't be the "bad" effect he is stating.

    The statement that green is good, red is bad, is not really true. Red is an attention getter; green is an easy, unobtrusive color (relaxing, generally).

    While it is easy enough to make the leap that 'red' is bad because red is often an 'alert' color, the reason red is an alert color is because it is an attention getter, not because it means bad.

    Why else do you think so many people drive red sports cars? If red was bad, why wouldn't they drive green ones? ... and the graphs aren't necessarily misleading in the aspect of spacing, the graph seems to be trying to show the ratio of difference, not the difference amount. ... aside from what looks like a bad example of bad examples... there are some good points in the article...
    • The statement that green is good, red is bad, is not really true. Red is an attention getter; green is an easy, unobtrusive color (relaxing, generally).


      But for someone who is color blind, it isn't going to make any difference anyway. Who's to say what the visual abilities of the person reading your report are going to be?

      The recommended alternative was to use Red and Blue instead.
      • Problem is really not color. Problem is that analysis can only be done with selected samples. Rarely are analyses done with the entire population.

      • You are completely correct, as I was searching to see if anyone made this comment first. I have two coworkers with red/green color blindness and they have problems with graphs and highlighted spreadsheets all the time.
        • An even worse situation was with process monitoring systems, where the color coding scheme was: green meant normal, flashing green meant returning to normal, red meant fault, and flashing red meant about to fail. For the want of a set of RGB color values, that job was unavailable to someone with red/green color blindness. Until, that is, a technician figured out you could swap the green/blue cables on the monitor.
    • You brought up an interesting point and I was following until you named the sports car example.

      People buy red sports cars because they want to be flashy and get attention (I'm sure you agree on this). But it nonetheless has some drawbacks - just see how many tickets you get with a red versus a green sports car.

      The point is, rather, that the author goes a little extreme in his conclusions.

      This particular point, number 10, also shocked me at first but then I've realized that the author has the o
      • People buy red sport cars because they want to be flashy and get the attention (I'm sure you agree on this). But it nonetheless has some drawbacks - just see how much tickets you get when you have a red and then a green sport car.

        I have a green sports car, and I just got a ticket, you insensitive clod!
    • I worked as an installation engineer for a major brand of measurement equipment all over the United States and Canada.

      This makes for amusing stories in all kinds of ways, as many ways as there are to do things wrong, but the one that always makes people's jaws drop is how Caterpillar (in some of its facilities, anyway) uses green colored tags to indicate questionable materiel, and red colored tags to mark stuff that's ready to ship.

      Sorry, but there *are* social implications to colors. They vary by society

      • Aye, there are. I've studied the social implications of colors, which is what I was referring to; it's why I made my post to begin with. Obviously the color red doesn't mean "attention" for absolutely no reason.

        So, yes, there is the connotation of stop / go, but this is NOT the primary American schema of red and green.

        The important point (red) was already made; it involved correcting the paper author's selection of connotation: he selected the wrong implication. ... anyway, if you'd li
  • Good article (Score:5, Interesting)

    by bobbis.u ( 703273 ) on Sunday June 12, 2005 @05:18PM (#12797330)
    Perhaps the author of the Openoffice.org vs MS Office [slashdot.org] comparison should have read it first.

    I hate it when people lie with statistics. Even the BBC did it recently when they were trying to justify 1 million GBP on their new weather program. They said 7/10 people either liked the new system the same as the old one or preferred the new one. Perhaps they could also have said 9/10 liked the new system the same as the old one or preferred the old one? Who knows when you lump categories together like that without providing the raw data?
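    The lumping trick is easy to demonstrate. With one plausible (entirely made-up) breakdown of 10 responses, both the flattering 7/10 claim and the unflattering 9/10 counter-claim are simultaneously true, because "liked it the same" gets counted in both camps:

```python
# Hypothetical raw counts; the broadcaster never published theirs.
responses = {"prefer_new": 1, "same": 6, "prefer_old": 3}
total = sum(responses.values())  # 10 respondents

# Each press release lumps "same" in with its preferred category.
new_or_same = responses["prefer_new"] + responses["same"]  # 7
old_or_same = responses["prefer_old"] + responses["same"]  # 9

print(f"{new_or_same}/{total} liked the new system the same or preferred it")
print(f"{old_or_same}/{total} liked the old system the same or preferred it")
```

    Without the raw counts there is no way to tell which reading is honest, which is exactly the point.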

    • You said it before I could (I know this post is a little later, but I have other things to do)...
      I'm glad I'm not the only one who thinks it odd that slashdot posts an article that pretty much bashes the previous entry.
      Can we call this a dupe^(-1)?
  • somewhat obvious? (Score:3, Interesting)

    by moz25 ( 262020 ) on Sunday June 12, 2005 @05:18PM (#12797331) Homepage
    What he's stating seems rather obvious, but then again I might not be his target audience. One thing he seems to be missing is: who is paying for the test and is the one in whose favour the test turns out to be also the one who paid for it?
  • I just sign my evaluations. My regular readers can get used to my way of doing things, and benchmark me :) Like, if this idiot (me) finds it easy to use, it's probably underfeatured ...

    Apart from that remark, I think the linked article is well-meaning but total BS.
  • Who am I to say that this is a basic set of requirements for an analysis study?

    A very good question indeed, my dear friend.
    • by Anonymous Coward
      Indeed, and it's followed by an answer. What exactly is your point?
      That pulling fragments of a text out of its context serves to confuse?

      Well, I'm not getting these requirements from some arbitrary place, but from many books on properly displaying statistical data and graphical information. Read any book by Edward Tufte on displaying information [edwardtufte.com], and just about any book on statistics for giving accurate information.

      In addition, I've developed this list after years of reading, writing, and studying studies

  • You have to be kidding me. The last three jobs I had, I got dinged if I did analysis of any sort. Most software developers skipped the analysis and design part, because Managers wanted them to start coding on the first day and not stop until it was ready for QA to look it over. I called it "Seat of your pants" programming. Often I had to fix problems in other developers' programs and they did not have proper documentation, source code comments, naming conventions, flow charts, or any sort of documentation to help me figure it out at all.

    Requirements kept coming in, and they changed daily. Often what I started writing at 8am was useless by 4:45PM, when the requirements changed on the fly and ad hoc and required me to program something else to replace it before I went home for the night. While I could have waited until the requirements were locked in, there was no such thing; any idea anyone had was instantly accepted by a manager and given to me to put into the program. Combo boxes became ListViews, then combo boxes again, then a text box, then a ListView again, and then a combo box. Database names for tables and columns were always changed, and the thousands of SQL queries in my programs that accessed them needed to be changed as well.

    Management didn't think anything of it, and kept their "We cannot say no to anyone, no matter how insane the request" attitude.

    Analysis, hooo haaaa! Yeah I wish! Corporate America apparently does not believe in it anymore.
    • Your description is disturbingly similar to what I've seen at my last three jobs. I think it's an industry standard. Perhaps something out of ISO ...
    • There are two ways I read your post.

      Way 1: Yes, I've seen this sort of thing before. At one company where I took over as Engineering head, the programming teams had failed to make decent forward progress. One reason is that I counted 26 people elsewhere in the company who were empowered to change the specs with a phone call (and some of them made a habit of doing so daily).

      Way 2: Maybe the problem you're solving isn't really well enough defined for anyone to have written a spec up front. Maybe flipping
      • #1 Bingo, too many chiefs, not enough braves. Programmers need to be empowered as well to decide what to accept and refuse based on their knowledge of the technology.

        #2 Do you mean Just In Time (JIT) development? That usually has a prototype with a small group of people who provide feedback and changes. Yet even that is managed so that it does not change daily, and the changes actually make sense. The way I had it originally was "The Right Way" that the end users wanted it, but Managers kept reading books
        • No choice but to take the paycheck, get the resume tuned up, and run for the hills as soon as possible. Unbelievable story.
          • True story. I tried to find another company, but it was 2000 and 2001, when the IT jobs were scarce and nobody wanted to hire, as the Dotcom bubble burst and they got 500 IT resumes a week and considered that "Programmers are a dime a dozen".

            I got really sick with the extra stress, so sick that in 2001 I lost my job, found another in 2002, same thing happened, and finally my doctor ordered me not to work any more.

            I am now the pathetic creature you see before you, true story.
  • by Otter ( 3800 )
    Gee, I wonder how well the "study" in the previous article (Open Office "better" than Word based on startup time) complies with this standard...?

    At any rate, I disagree with his complaints about graphs. Choosing an appropriate y-axis scale obviously changes the impact of the presentation, but that hardly makes one scale more intrinsically "good" than another. In this case, Samba and Windows are compared on two different servers. One is twice as fast as the other, the software packages have similar relative

    • "use the same number of decimal places in each label! I grit my teeth whenever I see 0, 0.5, 1, 1.5, ...)"

      Agreed. The number of decimal places *should* tell you the precision to which the data was measured.

  • Unfortunately there are too few people out there with scientific training; that especially includes many journalists and management. Attempting to get them to apply some rigour is a futile task, especially when they have to present to an audience with no scientific training.

    Standard deviations and measurement errors are for engineers. The papers you get from companies are sales tools, nothing more. Simply treat them with the scepticism (caveat emptor) they deserve and try $WHATEVER yourself with your syste
  • INSIGHTFUL?!?! (Score:3, Insightful)

    by imsabbel ( 611519 ) on Sunday June 12, 2005 @05:36PM (#12797464)

    "Also look at the axes and their layout. The first graph has the y-axis (left side) going in 50 increments, and the second graph has the y-axis going in 100 increments. This distorts the graphs to make it look like they are the same results, but actually they look very different when graphed properly. What's worse is that the x-axis for both graphs is the same which means they are changing one scale (y-axis) without adjusting the other scale (x-axis). This creates a distorted graph."


    Well, no idiot. When graphed properly, they look the same. Both tests show an absolutely comparable performance ratio. What does it matter that the faster machine runs both OSes faster? How does this skew anything? Is relative speed increase a new concept for the creator of the article?

    A REAL loaded graph would suppress the y-axis or something to push the lower graph further down, or skew the proportions.

    Man, is today really shit article day on slashdot?
  • What's a rubric (in my best Bill Cosby voice)
    • by Colin Smith ( 2679 ) on Sunday June 12, 2005 @05:50PM (#12797558)
      No, really. That's how it started, usually the title of a section, paragraph or similar.

      Obviously the bit of red text contained something someone thought was important so eventually the word came to mean an important rule or important passage. These days it means an important set of rules.

      http://www.dictionary.com/ [dictionary.com]
      http://www.m-w.com/
      http://www.askoxford.com/?view=uk [askoxford.com]

      • (Noah enters, and begins working in his garden, digging)

        God: (standing on a chair behind Noah, he rings a bell once) NOAH.

        Noah: (Looks up) Is someone calling me? (Shrugs and goes back to his work)

        God: (Ding) NOAH!!

        Noah: Who is that?

        God: It's the Lord, Noah.

        Noah: Right ... Where are ya? What do ya want? I've been good.

        God: I want you to build an ark.

        Noah: Right ... What's an ark?

        God: Get some wood and build it 300 cubits by 80 cubits by 40 cubits.

        Noah: Right ... What's a cubit?

        God: We

    • Let's see, I used to know what a rubric was.

      Well, don't you worry about that, get some software, analyze it. . .

      KFG
  • Why won't they just stick to the basics?

    Fire it up on your intel based PC, running windows. If it doesn't work at all, mark it down for requiring non-standard hardware.
  • ... the data points the author criticizes are not data points but line decorations for black and white readability.
    • No, they're data points. Notice how at the end the line dips downward and then right back up, at an angle, not at a curve. If they weren't data points, they wouldn't do that. If they aren't data points, then it's a gross misrepresentation, because any sane person will assume they are, for the reasons that they are points, and the reason I outlined above.
  • Would somebody please write a rubric for slashdot, to help it realize that posting blog crap that is biased and generally full of inaccuracies and problems in testing isn't news?
    • "posting blog crap that is biased and generally full of inaccuracies and problems in testing isn't news"

      Sorry, you've lost me. I don't see how that differs from what journalists produce in magazines, newspapers and (online) journals.

      • Oh, well it's very simple. Journalists, and I'm talking journalists (if you're going to call crap like blogs journalism then you've already ruined your argument), are supposed to write an objective piece, do research, obtain evidence of the story, etc. Blogs are just opinions; presenting them as news, which Slashdot keeps trying to do, only hurts the credibility of Slashdot. It seems Slashdot is just posting some mindless drivel that is meant to hurt MS, but ends up hurting Slashdot the most. The worst thing in t
        • That's the freaking point. He's a dude in the consumer end of things bitching about what he DOESN'T see or what he does see that is WRONG.

          He certainly has a point that stats are spun any way that sells. Classic example would be the Pentium 4. The high clock rate is meant to show that it outperforms the competition when in fact Intel's own lower-clockrate processors often eat it.

          Similarly you have all these TCO studies against Linux that are usually totally FOS.

          So some guy with a blog wrote about wha
          • P4 spec performance. [aceshardware.com]
            The only consumer processor comparable to the P4 is AMD64, and it has a lower base int score, with higher int peak, and lower base/peak fp scores.
            • Benchmarks are often useless; first off, what the fuck operations does "content creation" or "business app" do?

              I do things like "compile source" or "make RSA keys" and can measure the time gaps [in favour of the AMD/AMD64] with a fucking wrist watch!!!

              Sure the P4 can do some things VERY quick [e.g. 128-bit load/store or SSE2] but because its ALU is so very inefficient it dies on pretty much anything else.

              The Athlon Barton [32-bit core] shares the overall ALU design with the AMD64. So to say only the 64
              • You are correct that the only true benchmark is the time it takes for a program to complete. However, SPEC benchmarks provide a reasonably good approximation of integer and floating point performance.

                As for the 32-bit AMD, they have the 3200 (2.2GHz) benchmarked, and it barely scores half as fast as the AMD64 FX

                Do you have any links about the P4 ALU being inefficient? I'm interested.
  • Who commissioned the study?

    It's inevitably the company who comes out smelling like a rose, but it's never stated up-front.

    disclaimer:
    I'm not a member of the anything-but-Microsoft crowd. Microsoft products supply my income and have done so since I left the mainframe market fifteen years ago.

    I will say I take no pleasure in seeing research results showing a Windows-based product to be exponentially superior to another product (e.g. Linux) without a statement as to what caused the study to be made: w
  • This article works so well considering that it follows immediately after Performance of OpenOffice.org and MS Office [slashdot.org]. I wonder what this author (excellent article, btw) would make of the data from that other "IT Analysis" paper?
  • Naturally, despite Zed's assertion, not all graphs need to avoid his enumeration of pitfalls. For example, if the target audience is mathematically unsophisticated, most statistics (save perhaps mean) are inappropriate. Or perhaps red is meaningful in some contexts. Or maybe the presenter wants to show a very small delta, so an axis range is chosen to illustrate this.

    Nevertheless, Zed's enumeration can be extremely valuable in helping a discerning reader (who doesn't already know it all!) to critical

  • With the scientific rigor proposed in the article, no PHB will be able to understand it. Without special "keywords", the PHBs will have to waste their precious time reading the whole document and getting all the details.

    Let me propose my own list of what a successful IT article should have:
    1. Name recognition. If it fails to mention a well known company, it's not worth reading. Good example: Microsoft vs. Linux. Bad example: Gentoo vs. Debian. Rule of thumb, if none of the companies/brands mentioned
  • Lots and lots of good points in this article, but a sloppy presentation. It appears that the standard deviation of this piece is probably quite wide. Maybe if the author had known he was going to be featured on Slashdot he would have taken more care in his work.
    • It appears that the standard deviation of this piece is probably quite wide.
      What does that mean? Or what do you think it means? It doesn't actually mean anything.
  • According to the same charts he lambasts, the Microsoft configuration outperformed the Samba configuration over the entire range of client connections. 900 to 600mbps at peak, and 350 to 200mbps at peak. Aside from altering the data, how can one deliberately modify or misrepresent these results?

    That looks convincing at any scale, regardless of "how the x axis is ticked". What x-axis tickmarks would you like, to make any difference at all? And would aligning the triangles and squares with the tickmarks ma
  • A lot of good points made, especially on being able to reproduce the test results, etc.

    But his example just indicates he has an axe to grind. The color bias thing is just bogus. His complaints about the readability of the graph seem to miss the point that graphs show trends, tables show individual points.

    I've seen far worse graphs, where they cut out entire sections of the y-axis to show you a remarkable graph where 98 is a whole lot higher than 94 because they're not showing you 1-90.

    Which serves a us

"May your future be limited only by your dreams." -- Christa McAuliffe

Working...