Linux Has Fewer Bugs Than Rivals
Posted by CmdrTaco on Tue Dec 14, 2004 09:10 AM
from the preaching-to-the-converted dept.
sushant_bhatia_progr writes "Wired has an article stating that according to a four-year analysis of the 5.7 million lines of Linux source code conducted by five Stanford University computer science researchers, the Linux kernel programming code is better and more secure than the programming code of most proprietary software. The report, set to be released on Tuesday, states that the 2.6 Linux production kernel, shipped with software from Red Hat, Novell and other major Linux software vendors, contains 985 bugs in 5.7 million lines of code, well below the industry average for commercial enterprise software. Windows XP, by comparison, contains about 40 million lines of code, with new bugs found on a frequent basis. Commercial software typically has 20 to 30 bugs for every 1,000 lines of code, according to Carnegie Mellon University's CyLab Sustainable Computing Consortium. This would be equivalent to 114,000 to 171,000 bugs in 5.7 million lines of code."
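The arithmetic in the summary is easy to check. Below is a minimal sketch in C; the function names `bugs_per_kloc` and `expected_bugs` are made up for illustration and do not come from the study or from Wired. It only reproduces the numbers quoted above and says nothing about how the bugs were counted.

```c
#include <assert.h>

/* Bugs per 1,000 lines of code (KLOC). Hypothetical helper name. */
double bugs_per_kloc(long bugs, long lines)
{
    assert(lines > 0);
    return (double)bugs * 1000.0 / (double)lines;
}

/* Bugs expected in `lines` lines of code at a given per-KLOC density.
 * Integer division means only whole thousands of lines are counted. */
long expected_bugs(long lines, long density_per_kloc)
{
    assert(lines >= 0 && density_per_kloc >= 0);
    return lines / 1000 * density_per_kloc;
}
```

With the figures from the summary, `expected_bugs(5700000, 20)` and `expected_bugs(5700000, 30)` give the quoted 114,000 and 171,000, while `bugs_per_kloc(985, 5700000)` comes out at a bit over 0.17 bugs per 1,000 lines.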
Re:Now tell us what the bugs are (Score:4, Informative)
(http://paradoxinc.net/ | Last Journal: Tuesday March 09 2004, @09:18AM)
Reading TFA: "The Linux source-code analysis project started in 2000"...
This is an ongoing study. To find the bugs, check out Bugzilla.kernel.org [kernel.org]; it's not like they hide them or anything.
Is anyone else bothered by this? (Score:5, Insightful)
(Last Journal: Friday January 07 2005, @06:23PM)
Why does "Linux is just a kernel" suddenly not apply here? Is it because we're bashing Microsoft? This is kind of lame and makes the community look silly and vitriolic.
Re:Is anyone else bothered by this? (Score:4, Informative)
(http://shadowlife.ca/ | Last Journal: Monday January 02 2006, @02:53PM)
Linux is a kernel. The problem is that this is comparing the Linux kernel to all of Windows XP.
Mistake (Score:3, Funny)
I think they mean "40 million lines of bugs" :)
Re:Mistake (Score:5, Insightful)
(http://www.pobox.com/~chrish/)
This just in! "Hello world" has 0 bugs per three lines of code! Most stable and secure software ever devised!
Re:Mistake (Score:4, Insightful)
(Last Journal: Sunday November 12 2006, @07:08PM)
Then, when speaking of XP, they don't quantify the bugs, but merely say, "more are being found daily". Great... a pear.
Re:Mistake (Score:4, Funny)
(http://slashdot.org/)
Actually, hello world has the highest ratio of bugs/program complexity I've seen. Depends on who is doing the implementation, I guess.
Kjella
IEFBR14 (Score:4, Funny)
The purpose of IEFBR14 was to do exactly nothing, and pass a zero return code to the caller after doing the 'nothing' (branching on the return address in register 14 - thus BR 14).
This was actually more useful than it sounds and was used frequently in MVS JCL (Job Control Language) to make JCL do its thing without having to run a real program in a JCL 'step'.
Thing is, this program that had to do precisely nothing, had no less than 3 patches issued from IBM. Mostly to do with not clearing R15 (the return code register) correctly.
Go figure!
Re:Mistake (Score:5, Insightful)
(http://www.battlebazaar.net/)
There's no central Linux repository for reporting bugs in ancillary packages, while at Microsoft's site, all reported bugs go into a single database that can be queried. Each individual Linux distro and each individual Linux package maintains its own bug lists, which would have to be somehow amalgamated.
In order to do the stated comparison, you would need to state what distribution you were using, you would have to state which patches you were using, and you would have to document where the information on bug counts came from for both products.
On top of all this, bugs per 1000 lines of code, while an industry metric, isn't a valid measurement of code quality, either, by the way.
One good code metric is the number of control points per function. The more control points per function, the more likely that you have a bug. It's always possible to take a complicated function and break it up into smaller pieces, and with C/C++ (but not
Another good metric is the distribution of severity of the reported issues. Draw a bell curve of severity of issues, compare percentage to percentage. You're not concerned with 1000 bugs vs. 100 bugs, you're concerned about how many of those bugs compromise your system or cause a crash.
Another good metric is delta bugs over time. There will always be bugs, everyone knows that there will be bugs. The question is whether the bug count is going up or down.
Another good metric is delta time between open and close. Again, there will always be bugs, but how fast they are getting closed is a measure of whether a product is good or bad.
Another good metric is distribution of the number of days that bugs have been open and reported (by severity).
The reason for this is that the number of lines to implement a given function using a given algorithm varies between programmers and programming styles. Some will use two lines or statements to do what can be done in one, to make code easier to read. The line number is something that a developer can manipulate easily.
However, these other measurements are tied directly into the quality of the code. Nobody cares how big the Linux kernel is compared to the Windows kernel, or vice versa. What they care about is how well the Linux kernel works compared to the Windows kernel.
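Two of the metrics above (time between open and close, and the spread of severities) are simple to compute once the bug reports are in one place. Here is a rough sketch in C. The `struct bug` layout, the four severity levels, and both function names are assumptions made up for this example; no real bug tracker is being described.

```c
#include <assert.h>
#include <stddef.h>

#define SEVERITY_LEVELS 4

/* Hypothetical bug record; field names are illustrative only. */
struct bug {
    int severity;    /* 0 = cosmetic ... 3 = crash or compromise */
    int opened_day;  /* day number when reported */
    int closed_day;  /* day number when fixed, or -1 if still open */
};

/* Mean days between open and close, over closed bugs only.
 * Returns -1.0 when no bug has been closed yet. */
double mean_days_to_close(const struct bug *bugs, size_t n)
{
    long total = 0;
    size_t closed = 0;
    for (size_t i = 0; i < n; i++) {
        if (bugs[i].closed_day >= 0) {
            total += bugs[i].closed_day - bugs[i].opened_day;
            closed++;
        }
    }
    return closed ? (double)total / (double)closed : -1.0;
}

/* Fill share[s] with the fraction of all bugs reported at severity s. */
void severity_shares(const struct bug *bugs, size_t n,
                     double share[SEVERITY_LEVELS])
{
    size_t count[SEVERITY_LEVELS] = {0};
    for (size_t i = 0; i < n; i++) {
        assert(bugs[i].severity >= 0 && bugs[i].severity < SEVERITY_LEVELS);
        count[bugs[i].severity]++;
    }
    for (int s = 0; s < SEVERITY_LEVELS; s++)
        share[s] = n ? (double)count[s] / (double)n : 0.0;
}
```

Comparing the `share` arrays of two products percentage to percentage is the comparison suggested above, and neither number depends on lines of code.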
Any count related to bugs, also, needs to take into account the fact that on Windows, you have billions of users any of whom could find and report a bug. On Linux, bugs are more likely to go undiscovered for a longer period of time, simply because there aren't as many people trying to hit them.
The Windows and Linux kernels tend to be very similar in bug counts. The kernels of both OSes tend not to have bugs, because kernels tend to have simple code that's hard to mess up in any meaningful way.
It's only when you start including all these ancillary subsystems, device drivers, etc. that you start to see significant percentages of the bugs.
And it is exceptionally hard to get an accurate count of those on Linux. On vMac, I documented approximately one bug in ten that I fixed, and I think that would be typical of other open source projects (although I can't swear to that).
I think that on these other, valid measurements that aren't dependent on lines of code, that if you could collect the statistics on a package per package basis, and compare them, that Open Source would still come out ahead of Windows.
It's just that using a meaningless metric on widely divergent code bases to prove any point is irresponsible, and by reporting it, the media is doing a disservice to the Open Source community by perp
Re:Mistake (Score:5, Funny)
(http://slashdot.org/)
We're trying to bash the dogshit out of MS products here and you are messing it up!
Go to your cubicle!
Congratulations... (Score:5, Funny)
(Last Journal: Friday February 04 2005, @03:38PM)
Re:Congratulations... (Score:5, Insightful)
Seth Hallem, CEO of Coverity, a provider of source-code analysis, noted that the majority of the bugs documented in the study have already been fixed by members of the open-source development community.
Re:20-30 bugs per 1000 lines??? (Score:5, Insightful)
(http://www.pangalactic.org/ | Last Journal: Wednesday May 05 2004, @12:34AM)
It's not.
There are bugs that affect applications to a level the user notices, and there are bugs that affect the maintainability of the code, the reusability of the code, and the ease of use of the code. Most bugs are hidden from the user's view.
Does a stack overflow in a piece of code that only occurs when a craftily created exploit is executed constitute a bug? Yes it does -- even when no exploit exists. Under normal operations it does not affect the use of the system -- at least not until an exploit is developed and widely used. The fact that Windows, especially the older versions of the OS, was vulnerable to so many simple exploits illustrates the bugginess of the code. And most of the bugs are not even close enough to the surface to be so readily exploited.
Does an end user see API bugs? Does anyone outside of Microsoft experience architectural flaws in the Windows OS? No, but that does not mean they do not exist.
What about code that misbehaves when presented with certain data? That may not be a problem for the original application that the code was written for, but any time that code gets reused, the programmer must know about these "hidden features" (read: bugs). Again, the user never sees this sort of bug.
The number they came up with wasn't pulled out of their ass. See this article on measuring bugs [ganssle.com] for a more detailed discussion on the topic.
When reading it, bear in mind that most commercial software is produced for in-house use, receives very little QC, and frequently does not even compile cleanly. It's usually just "good enough" to get the job done.
Re:20-30 bugs per 1000 lines??? (Score:5, Funny)
Sounds like Windows to me!
It's a joke, laugh.
Re:20-30 bugs per 1000 lines??? (Score:4, Insightful)
It's a joke, laugh.
It's a sure sign that the moderators are getting stupid when you have to point out the jokes to them.
Re:20-30 bugs per 1000 lines??? (Score:5, Interesting)
Keep in mind that you need to know the definition of a bug. It's not necessarily what you think it might be, but what the researchers defined. By their definition a condition that could never occur could be considered to be a bug. For example:
int foo ()
{
    if (0)
        return;        /* never reached; would leave the return value undefined */
    do_something();
    return (0);
}
This overly-simple example could be considered to be a bug. If the condition is ever true the function will return an undefined value, but the condition will never be true so you couldn't possibly return an undefined value. It's not at all uncommon to find code with similar logic scattered throughout - improperly defined loops, conditionals, etc. could result in theoretical bugs that no path of execution can actually get to.
Then there are the kinds of bugs that only occur in extremely specific situations. About 13 years ago I had to track down a bug that caused a report package to crash. It took me a while to figure it out but eventually I did. The program would crash only on specific days. It'd only crash on Wednesdays. It'd only crash on certain Wednesdays - Wednesdays in September. Even more specifically, usually only the 3rd or 4th Wednesday in September.
The bug was that whoever wrote the code that printed a header on the reports was extremely anal about memory usage. He calculated exactly how many characters it would take for a buffer to hold the full date. The problem was he miscalculated by 1 character. With "Wednesday" being the longest day spelled out and "September" being the longest month, a 2 digit date (eg. Wednesday September 23) meant that the full date string would overflow the buffer by 1 character. This kind of bug wouldn't show up very often - only a few times a year - but it was a pretty nasty one when it did.
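The buffer arithmetic behind that kind of bug can be sketched in C. This is not the original report code, which isn't shown; the function names and the "weekday month day" format string are assumptions for illustration. The post doesn't say which character was miscounted; forgetting the terminating NUL is just one plausible way to end up one short.

```c
#include <assert.h>
#include <stdio.h>

/* Bytes needed to hold "<weekday> <month> <day>", including the
 * terminating NUL. Hypothetical helper, not from the original program. */
size_t date_buffer_size(const char *weekday, const char *month, int day)
{
    /* With a NULL buffer and size 0, C99 snprintf writes nothing and
     * only reports how many characters it would have written. */
    int len = snprintf(NULL, 0, "%s %s %d", weekday, month, day);
    assert(len >= 0);
    return (size_t)len + 1;  /* +1 for the '\0' that is easy to forget */
}

/* Format into a caller-supplied buffer without ever overflowing it.
 * Returns 1 if the whole date fit, 0 if it had to be truncated. */
int format_date(char *buf, size_t size,
                const char *weekday, const char *month, int day)
{
    int len = snprintf(buf, size, "%s %s %d", weekday, month, day);
    return len >= 0 && (size_t)len < size;
}
```

"Wednesday September 23" is 9 + 1 + 9 + 1 + 2 = 22 characters, so it needs 23 bytes. A 22-byte buffer holds every single-digit day in September and every two-digit day in a shorter month, which is why a miscount like this can go unnoticed for most of the year.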
Re:Congratulations... (Score:5, Informative)
(http://www.cs.cmu.edu/~leak)
1. code patterns -- if you see something that looks like a pattern, it is probably a bug... "if(x = 0)", for example. of course, you have to check that it actually IS a bug, but you can catch certain common things that way.
2. type safety -- tools can go through your code (either statically or while it's running) and look for type violations. for example, you might write an int to an unsigned int, or mix up pointers and ints, which could be bad. you can catch a stunning number of bugs this way.
3. pointer analysis -- another annoying bug can be in aliasing, where you have multiple pointers that may or may not be pointing at the same memory. are you really
I'm not sure what sorts of current tools are released by these researchers, but this is a very basic overview of the techniques I've heard about people using recently. (Repeat disclaimer: I'm a theorist.)
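To make the three categories above concrete, here is a small C sketch with one toy function per category. All the function names are invented for this example, and these are simply the kinds of mistakes such tools are said to look for, not output from Coverity or from any particular checker.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* 1. Code pattern: '=' assigns 0 to x, so the test is always false.
 * The doubled parentheses only silence the usual compiler warning. */
int buggy_is_zero(int x)
{
    if ((x = 0))
        return 1;
    return 0;
}

/* The intended version: '==' compares instead of assigning. */
int is_zero(int x)
{
    return x == 0;
}

/* 2. Type mix-up: a negative int stored in an unsigned int wraps
 * around, so -1 becomes UINT_MAX. */
unsigned int store_as_unsigned(int value)
{
    return (unsigned int)value;
}

/* 3. Aliasing: when a and b point at the same int, the second write
 * replaces the first. Returns 1 if they are distinct, 2 if they alias. */
int write_both(int *a, int *b)
{
    assert(a != NULL && b != NULL);
    *a = 1;
    *b = 2;
    return *a;
}
```

Calling `write_both(&p, &q)` and `write_both(&p, &p)` returns different values from identical code, which is why a tool has to reason about what the pointers might refer to rather than just read the source text.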
Lea
How can one be sure (Score:5, Insightful)
(http://vectorgeeks.org/)
Conflict of interest... (Score:5, Funny)
(Last Journal: Sunday October 02 2005, @11:20PM)
Re:Conflict of interest... (Score:5, Funny)
(http://www.thedruid.co.uk/ | Last Journal: Monday June 21 2004, @06:14AM)
See Also: Diebold [wikipedia.org]
What about the ones they missed? (Score:3, Informative)
(http://127.0.0.1/ | Last Journal: Monday May 09 2005, @04:20PM)
Of course, we must remember, "It's not a bug, it's a feature!"
Apple != Orange (Score:5, Interesting)
(http://slashdot.org/)
The Windows XP code base includes all of the extraneous crap that gets bundled with and on top of the kernel.
The "Linux" code base just includes the kernel.
Re:Apple != Orange (Score:5, Insightful)
(Last Journal: Thursday March 15 2007, @12:56PM)
This is what you get for integrating your web browser into your operating system. Legality aside, there was a low cunning to that business move when M$ did it. Now, however, that decision is coming back to bite them on the tender bits: the browser is part of the OS, ergo bugs in the browser count as bugs in the OS.
Re:Apple != Orange (Score:4, Insightful)
Exactly how does making the browser part of the operating system make Windows more usable?
Because they did this, I now can't remove IE if I want to. I have trouble with Outlook if I set Mozilla as my default browser. I can't upgrade IE without patching the OS (which means other programs that use IE components might break). There is no way for me to downgrade to earlier browser versions. There is no way for me to have multiple versions of IE installed, so it makes it hard to test. Because IE is in the operating system, Microsoft has delayed releasing new features until the next version of the OS is shipped.
I can't think of a single thing that integrating the browser into the OS buys you, except some code reusability. From my engineering perspective, this wasn't done for any legitimate engineering reason, but because of legal and business reasons.
Re:Apple != Orange (Score:4, Informative)
(http://seenonslash.com/ | Last Journal: Friday May 11 2007, @04:02PM)
And both can be achieved without OS integration. Rendering for any 3rd party app can be direct to the video driver if the OS allows it. That's not integration.
It's already been proven that startup time for all Office apps is from hidden API calls near the start of the executable code. They load the visual interface before the application's actually ready for use. That plus pre-loading of DLLs gives fast startup. Office isn't considered part of the OS, yet IE is. Therefore fast startup times have nothing to do with integration.
Try supporting 5 different applications vs. one. Over the phone. With a user with no training or previous knowledge.
I have. An entire office of old-fashioned accountants who prefer ledgers and pencils. How is blending 2 apps tightly together better than having 2 separate apps? If there's a problem with Firefox I can tell someone to not launch it. If there's a problem with IE parts of it are in memory whether I choose them to be or not. If there was less integration in Windows then it could be trimmed down to a minimal size for each user. Instead everything including the kitchen sink must continually be supported. You're only increasing your headache by using Windows and its tight integration.
I'm questioning whether the statistic mentioned is valid or not. Can this number even be trusted?
Not purely as fact. Yet someone who reads the study may determine that it's better to have the code open to all who can fix bugs instead of one select group. Or it may give insight to management that security can be better achieved when they can have their own people analyze the code. When read properly I don't see how anything but good can come from a study such as this.
Re:Apple != Orange (Score:5, Insightful)
Do you use only the Linux kernel? No. You run Slackware, or Fedora, or SuSE (my personal choice), or Gentoo, or something. A fresh copy of the latest version of a distribution is what should have been analyzed. All of the other stuff that's piled onto the kernel is what you use, such as the GUI, time-killing games, and all the other shit that a standard XP install disc contains. You don't click on line 11359 of the Linux kernel to open up Konqueror. You click on an icon, which starts a process that you use. The stuff piled onto the kernel is far more buggy than the kernel would be. I'd bet that a distribution of Linux is still far less buggy than Windows is, but it's certainly not 985 vs. 800,000. For example, when I minimize my windows on XP, there isn't hard drive activity for the next five minutes, as can happen with SuSE with not very much open. I'm sure something's misconfigured, but if a distribution, in 2004, is that poorly configured by default, then that's a bug.
Not completely scientific (Score:5, Interesting)
(http://www.fantasticdamage.com/)
"[Linux has] 985 bugs in 5.7 million lines of code, well below the industry average for commercial enterprise software. Windows XP, by comparison, contains about 40 million lines of code, with new bugs found on a frequent basis."
So Linux has 985 bugs. Windows has bugs that appear frequently. Ok that doesn't really tell me anything. I tried to dig a bit deeper [zdnet.co.uk] and came up with: "Coverity has not analysed the source code to Microsoft Windows because the company does not have access to the source code, Hallem said. Apple Computer's Mac OS X has a great deal of proprietary programming, but the core of the operating system is based on BSD, an open-source operating system similar to Linux."
So everything is based on estimates. Now, you know and I know that the Linux kernel has fewer bugs... but this is a tentative (at best, shoddy at worst) way of presenting that idea.
Linux Kernel vs Windows XP (Score:3, Insightful)
What about if you throw in KDE or GNOME, Mozilla, etc, everything that you'd have to add to really equal the features of Windows XP....
The real question... (Score:3, Insightful)
(http://www.sophicstudios.com/)
Mistaking symptoms for bugs (Score:3, Insightful)
The best you can hope to do is to count the number of *symptoms*, but that's not the same thing as bugs. It's possible, even likely, that a single bug in the code manifests itself as 100 apparently different external symptoms.
So this comparison is apples to oranges and completely meaningless.
So in all this code auditing... (Score:3, Funny)
(http://www.kickthebobo.com/erotech/index.html | Last Journal: Thursday November 15, @02:53PM)
Update (Score:4, Funny)
(http://web.lemuria.org/)
Where do they get these numbers??? (Score:3, Interesting)
I'm going to call bullshit on this. Maybe this number is based on some rule that every variable must be asserted, everything exception checked, etc. (even if these conditions rarely or never happen).
If they're counting bugs like I count bugs - i.e., in a normal operating environment the software loses data, produces incorrect results or limits operability - then there is no way that a commercially viable product can have this number of bugs.
smells like BS (Score:5, Insightful)
How do you analyze 5.7 million lines of code except to run it through a static analyzer? Static analyzers can't detect most errors, which tend to be data dependent. So a company selling an uberlint donates their tool, 5 academics run it and write a paper. *Blech*
"Commercial software typically has 20 to 30 bugs for every 1,000 lines of code" Bull. Software with 20-30 errors per KSLOC doesn't work. (I know, I just spent a year trying to save a project that had 10 errors per KSLOC. Even at that defect density (found and fixed) it was undeliverable.) Notice that the link in TFA doesn't take you to anything that supports this assertion, just to the organization's home page. I couldn't find anything to support these numbers.
Weaselly non-comparisons: "2.6 Linux production kernel, ... contains 985 bugs in 5.7 million lines of code" as opposed to "Windows XP, by comparison, contains about 40 million lines of code, with new bugs found on a frequent basis." What, bugs in Linux are found on an infrequent basis?
This is all hot air.
OK (Score:3, Funny)
(http://www.spamgourmet.com/)
I hope they submitted a patch
Bug This! (Score:3, Interesting)
The report, set to be released on Tuesday, states that the 2.6 Linux production kernel, shipped with software from Red Hat, Novell and other major Linux software vendors, contains 985 bugs in 5.7 million lines of code, well below the industry average for commercial enterprise software.
Commercial software (at this point in time) has its priority on releasing new versions often, because each release is a salable item. Linux, on the other hand, gets forked or changes version whenever "Linus feels it's ready". BIG DIFFERENCE. Here's why.
Commercial software vendors decide how much each bug is worth. If the bugs are cheap (not show stoppers), just minor things they can't foresee causing them to lose money... they will ship it. Acceptable known bugs. Project management decision.
Open source has time on its hands. Developers can look over the code carefully and spend time on bugs that commercial outfits wouldn't even bother with... But the problem (like with software project management) is that you can't tell which bugs will be the nasties when you choose to ignore them. Fewer bugs == more secure software, fewer nasties.
If commercial software decided to play the careful-release, minimize-bugs game... they would make less money initially, but in the end it would work out. Microsoft and its ilk can certainly compete with Linux, but they made a choice long ago not to. They made a choice to RELEASE FAST and MAKE MONEY FAST! (hey, that sounds like spam).
m
Flawed comparison (Score:3, Insightful)
(http://projekt.dnsalias.org/)
Of course it is.
Most proprietary software consists of user-level apps where a bug here or a bug there isn't critical. The economic loss that can be attributed to a bug in MS Word's bullet-point algorithm isn't as great as that from a datacenter going down due to an Oops in the Linux box running the SQL database. To put it simply: there are different requirements for different tasks. A comparison between the quality of WinXP's GUI and GNOME, for example, would have been much more interesting.
I don't understand how this story could make slashdot's frontpage.
Classic case for the do-nothing argument (Score:4, Insightful)
Linux Has Fewer Bugs Than *One* Rival (Score:3, Interesting)
(http://www.mavetju.org/)
How about comparing it with MacOS/X, FreeBSD and others?
Re:Kernel is not the problem (Score:3, Informative)
(http://www.requisitesystems.com/)
You might want to talk to Nvidia about that. They are able to produce a driver that does this, but they choose not to.