Bug Hunting Open-Source vs. Proprietary Software

Posted by Zonk on Sat Oct 07, 2006 01:41 PM
from the i-am-a-big-fan-of-quality dept.
PreacherTom writes "An analysis comparing the top 50 open-source software projects to proprietary software from over 100 different companies was conducted by Coverity, working in conjunction with the Department of Homeland Security and Stanford University. The study found that no open source project had fewer software defects than proprietary code. In fact, the analysis demonstrated that proprietary code is, on average, more than five times less buggy. On the other hand, the open-source software was found to be of greater average overall quality. Not surprisingly, dissenting opinions already exist, claiming Coverity's scope was inappropriate to their conclusions."

Related Stories

[+] Developers: Coverity Report Finds OSS Bug Density Down Since 2006 79 comments
eldavojohn writes "In 2008, static analysis company Coverity analyzed security issues in open source applications. Their recent study of 11.5 billion lines of open source code reveals that between 2006 and 2009 static analysis defect density is down in open source. The numbers say that open source defects have dropped from one in 3,333 lines of code to one in 4,000 lines of code. If you enter some basic information, you can get the complimentary report that has more analysis and puts three projects at the top tier in quality of the 280 open source projects: Samba, tor, OpenPAM, and Ruby. While Coverity has developed automated error checking for Linux, their static analysis seems to be indifferent toward open source."
This discussion has been archived. No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
  • by pembo13 (770295) on Saturday October 07 2006, @01:47PM (#16349275) Homepage
    I scanned through the article; it didn't seem to mention how they tested the top proprietary software. I can well understand that there are a lot of bugs in open source code since it is written by humans. But humans also write the proprietary code. How did they test it?
  • What's a bug? (Score:5, Insightful)

    by BadAnalogyGuy (945258) <BadAnalogyGuy@gmail.com> on Saturday October 07 2006, @01:48PM (#16349279)
    Knuth used to have this great offer where he'd send you a check for pi or e or something if you managed to find a bug in his code.

    Well, what is a bug?

    I doubt he'd send me a check if I told him that TeX doesn't have an easily accessible iconic user interface. No, his concept of a bug is a deviation from the specified functionality.

    But what if that functionality is wrong or sucks?

    Apple does really well at creating functionality that doesn't suck. They suffer from the same problems of deviations from the spec as much as anyone, but they manage to mold their spec around what users want. Microsoft, to some extent, does the same and they release products that conform to what users want (generally) because they change the spec as necessary when customers demand change.

    If you are implementing towards a standard (like most OSS projects with any traction are wont to do), then you are necessarily restricted by what that spec says. If the spec says to do something inane, the standard-follower must implement it that way.

    I don't really have a point here except to say that unless they say "this is what we mean by bug", there can be no way to really examine their results.
    • Re:What's a bug? (Score:4, Informative)

      by AJWM (19027) on Saturday October 07 2006, @02:35PM (#16349655) Homepage
      Knuth used to have this great offer where he'd send you a check for pi or e or something if you managed to find a bug in his code.

      I think you're conflating two things. The check was (is?) for $50 or some such. The version number of the software is pi (or e) to whatever number of decimals, where each subsequent release adds a decimal place (becomes a closer approximation to the real thing.)

      No, his concept of a bug is a deviation from the specified functionality.

      That's the only reasonable definition of a bug in the software.

      But what if that functionality is wrong or sucks?

      Then that's a bug in the specification or in the requirements. I spent the better part of six months debugging the requirements on a major project once. Part of that was getting mutual agreement from three major customers, part of that was resolving internal inconsistencies in the requirements document, and part of that was a high level design process in parallel, to be sure we had a chance of actually satisfying the requirements.

      Of course the end user (especially of off-the-shelf software) generally doesn't differentiate between a bug in the software vs a bug in the specification or requirements. The end user generally never sees the spec, and only has a vague idea of the requirements. (Sometimes worse than vague -- how many people do you know who use a spreadsheet for a database?)

      (And to BadAnalogyGuy -- I'm not disagreeing, just amplifying.)
    • by jchenx (267053) on Saturday October 07 2006, @02:41PM (#16349703) Journal
      I work at MS. In my group (and I imagine it's the same in others), a bug can be many things. Here's what they typically are though:

      1. A product defect
        - This is the typical meaning behind the word "bug".
      2. DCR (Design Change Request)
        - That's where your TeX complaint would fall under. It's "by design" that it doesn't have an iconic user interface, but that doesn't mean it's something that shouldn't be addressed ever
      3. Work item
        - This is actually a result of the bug tracking system that we use. Rather than sending e-mail, which often gets lost, we often track work items as bugs. For example, "Need to turn off switch X on the test server when we get to milestone Y"

      To further complicate things, there is a severity and priority attached to every bug. Severity is a measure of the impact the bug has on the customer/end-product. It can range from 1 (Bug crashes system) to 4 (Just a typo). Priority is a measure of the importance of the bug. It ranges from 0 (Bug blocks team from doing any further work, must fix now), to 3 (Trivial bug, fix if there is time). (I don't know why the ranges don't match, BTW, seems silly to me)

      As anyone who works on a large-scale project probably knows, there is always a wide range of bugs, across all the pri/sev levels. To me, a simple count of all the bugs isn't terribly useful. A project could have a ton of bugs, but with most of them being DCRs (which are knowingly going to be postponed till the next release) and/or low pri/sev bugs. Or maybe it's the beginning of the project and they're all known work items. Or a project could have only a few bugs, but with all of them being critical pri/sev ones.

      So, whenever I see a report that simply talks about bug count, I take it with a huge grain of salt. If I had to guess (I skimmed the article), it seems like OSS projects have far more bugs, but perhaps lower pri/sev since the product itself has been evaluated as being higher quality. In the end, it's the quality that the customer really cares about.
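The point about raw counts can be sketched in a few lines of Python. This is only an illustration: the `Bug` record, the two sample projects, and the rule that "critical" means a severity 1 defect with priority 0 or 1 are all assumptions made up for the example, not anything taken from a real bug tracker.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bug:
    kind: str      # "defect", "dcr", or "work_item" (the three types listed above)
    severity: int  # 1 (crashes system) .. 4 (just a typo)
    priority: int  # 0 (blocks the team) .. 3 (fix if there is time)


def summarize(bugs):
    """Return (raw bug count, number of critical defects).

    "Critical" here is a hypothetical rule: a real defect with
    severity 1 and priority 0 or 1.
    """
    critical = [
        b for b in bugs
        if b.kind == "defect" and b.severity == 1 and b.priority <= 1
    ]
    return len(bugs), len(critical)


# Hypothetical project A: many entries, all postponed DCRs or work items.
project_a = [Bug("dcr", 3, 3)] * 40 + [Bug("work_item", 4, 2)] * 10
# Hypothetical project B: only three entries, all of them crashers.
project_b = [Bug("defect", 1, 0)] * 3

print(summarize(project_a))  # (50, 0)
print(summarize(project_b))  # (3, 3)
```

By raw count, project A looks far worse than project B, yet only B has anything that would stop a release.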
      • Not quite (Score:5, Interesting)

        by The_Wilschon (782534) on Saturday October 07 2006, @04:04PM (#16350231) Homepage
        Bugs (a.k.a. Entomology)

        Donald Knuth, a professor of computer science at Stanford University and the author of numerous books on computer science and the TeX composition system, rewards the first finder of each typo or computer program bug with a check based on the source and the age of the bug. Since his books go into numerous editions, he does have a chance to correct errors. Typos and other errors in books typically yield $2.56 each once a book is in print (pre-publication "bounty-hunter" photocopy editions are priced at $.25 per), and program bugs rise by powers of 2 each year from $1.28 or so to a maximum of $327.68. Knuth's name is so valued that very few of his checks - even the largest ones - are actually cashed, but instead framed. (Barbara Beeton states that her small collection has been worth far more in bragging rights than any equivalent cash in hand. She's also somewhat biased, being Knuth's official entomologist for the TeX system, but informal surveys of past check recipients have shown that this holds overwhelmingly for nearly everyone but starving students.) This probably won't be true for just anyone, but the relatively small expense can yield a very worthwhile improvement in accuracy.
        This is from the TeX users group site, at http://www.tug.org/whatis.html [tug.org].
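The doubling in the quoted passage is easy to check. The sketch below assumes the reward starts at exactly $1.28 in year 0 and doubles once per year (the quote only says "$1.28 or so"), and it works in whole cents to avoid floating-point rounding.

```python
START_CENTS = 128    # $1.28, the starting figure given in the quote
MAX_CENTS = 32768    # $327.68, the maximum given in the quote


def reward_cents(years):
    """Reward in cents after `years` yearly doublings, capped at the maximum."""
    return min(START_CENTS * 2 ** years, MAX_CENTS)


for years in range(10):
    print(years, f"${reward_cents(years) / 100:.2f}")
```

Under that assumption the cap is reached after eight doublings, since 1.28 × 2^8 = 327.68.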
  • by Herkum01 (592704) on Saturday October 07 2006, @01:55PM (#16349335)

    "Deanna Asks A Ninja: What is the circumference of a moose?!"

    "It's michael pailum with his face in a pie times douglas adams squared."

    This answer makes as much sense as the article.

    Except "Ask A Ninja" made more sense. And was more accurate. And more entertaining.

    Can I just get a Ninja hit out on this guy or something so these articles will not make it to Slashdot anymore?

  • by Anonymous Coward on Saturday October 07 2006, @02:03PM (#16349389)
    ...and while it is on the list on the web page, I was happy to determine that most of the issues they found were false alarms. They found three real bugs, none of which were likely to bite, and even if they did bite it is not exploitable. Nonetheless, those bugs probably wouldn't have been found otherwise, so I was happy for the scan.

    Rather than brag (I won't say who I am or the name of my project), I'm just going to sit back and read all the defensive flames from self-appointed "security experts" whose open-source project didn't do so well. After all the flames from these "security experts" that I've endured, I'm going to enjoy watching them squirm.

    It's karma.
  • Misquoting TFA (Score:5, Informative)

    by Harmonious Botch (921977) on Saturday October 07 2006, @02:07PM (#16349425) Homepage Journal
    While I appreciate that PreacherTom was good enough to bring this to us, the sentence "...no open source project had fewer software defects than proprietary code." just does not match TFA.

    TFA says that no open source project is as good as the BEST of proprietary, but it also says that the AVERAGE open source is better than the AVERAGE proprietary.
  • Not quite... (Score:5, Insightful)

    by Timothy Brownawell (627747) <tbrownaw@prjek.net> on Saturday October 07 2006, @02:11PM (#16349449) Journal
    The study found that no open source project had fewer software defects than proprietary code. In fact, the analysis demonstrated that proprietary code is, on average, more than five times less buggy. On the other hand, the open-source software was found to be of greater average overall quality.

    No, *popular* open-source software is 5x as buggy as *safety-critical* closed software. The linked dissenting opinion [fortytwo.ch] is at least partly right; they're comparing apples to oranges.

    Maybe they should try comparing open- and closed-source software that's actually trying to solve the same problem? That'd be a bit more valid of a comparison...

  • by rduke15 (721841) <(rduke15) (at) (gmail.com)> on Saturday October 07 2006, @02:16PM (#16349511)
    The article makes it quite clear that the proprietary software which is much better than open source is mission-critical software. A class of software where ensuring minimum bugs is a top priority, and also a class of software which mostly just does not exist in OSS. If you are an OSS developer, would you try to develop open source air traffic control software? And even if yes, how would you do it anyway?

    Basically, my own conclusion from reading the article was that it IS possible to write excellent software with very few bugs, if that is a top priority. And, that the author seems to say that while mission-critical software (which happens to be proprietary) is fortunately much better than the rest, among all that other non-mission-critical software, open source tends to be better than proprietary.

    Not surprising, and quite encouraging...
  • by oohshiny (998054) on Saturday October 07 2006, @02:34PM (#16349643)
    The selection of programs from the two populations of programs (open source, proprietary) are not going to be comparable: vendors of proprietary software have a say over which code gets scanned, and they are going to select a different population of programs than the company selected for open source projects. This isn't a fixable problem: there is no way of doing this sort of study so that you can compare the two data sets. The best they could do is compare something like OpenOffice against Microsoft Office, or Apache against IIS.

    Furthermore, Coverity simply cannot accomplish what they claim to accomplish: there is no way of detecting "bugs" automatically--if there were, compilers would already be doing it. Coverity effectively does little more than compare code against a set of internal coding conventions; that can be useful if it's done right, but it's not a measure of code quality. Some completely correct code will score thousands of violations against their tool, while other code may contain thousands of bugs, none of which register. Furthermore, it is likely that a lot of their customers are Windows based and that Coverity is biased towards Windows-based coding conventions, giving more false positives on non-Windows code. Before publishing such comparisons, Coverity first would need to demonstrate that their tool does not contain such biases.

    Finally, and perhaps most importantly, the company isn't publishing its data, so nobody can verify or even evaluate their claims. Not only do they fail to publish their raw data (obviously, they can't do that for proprietary software), they also fail to list their summary statistics by vendor and project (which they could, but obviously won't do). They don't even give a summary statistic by class of application, class of organization, and code size. Their results are meaningless because they're not reproducible.

    These numbers tell you nothing about FOSS code quality relative to commercial code quality. What they tell you is that Coverity apparently doesn't know how to do statistics, misrepresents what their product can do, and doesn't know how to report experimental results properly. Now, do you want to put your trust in such a company?
  • by Ibag (101144) on Saturday October 07 2006, @02:43PM (#16349717)
    If you look at the summary, you come to the conclusion that proprietary software is five times less buggy than open source. It is also unclear how software can have five times as many bugs but be of higher quality. However, if you read the article, you find:

    In our research using automatic bug-hunting technology, no open-source project we analyzed had fewer software defects (per thousand lines of code) than the top-of-the-line closed-source application. That proprietary code, written for an aerospace company, is better than the best in open source--more than five times better, in fact. That company's software won't let you down when you're flying from New York to London.

    If we ignore that the automatic bug finding algorithms might not be a good measure for anything, we have a few issues with the summary. The richest American is twice as rich as the richest Swiss man. Does it follow that Americans are on average twice as rich as Swiss people? No. In the same way, the statement does not imply that the average open source software has five times as many bugs as the average proprietary software does. The coding practices of mission critical apps like flight control systems are different from those of most of the industry, and it is almost wrong to lump them together with everything else.

    The problem with statistics is not that they give an inaccurate picture, or even that selecting the right statistics can give a skewed picture, but that people who don't appreciate what statistics actually give use them to form opinions, make decisions, and summarize articles. Statistics don't lie, but the people who misreport them do, even if they don't realize it.
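The best-versus-average distinction can be shown with made-up numbers. Every figure below is hypothetical (defects per million lines, chosen only to make the arithmetic clean); none of it comes from the Coverity study.

```python
from statistics import mean

# Hypothetical defect densities, in defects per million lines of code.
open_source = [200, 300, 400]
proprietary = [40, 600, 900, 1000]  # one aerospace-grade outlier at 40


def compare(group_a, group_b):
    """Return (ratio of the two best scores, mean of a, mean of b)."""
    return min(group_a) / min(group_b), mean(group_a), mean(group_b)


best_ratio, mean_open, mean_prop = compare(open_source, proprietary)
print(best_ratio)            # 5.0: the best proprietary project is 5x better
print(mean_open, mean_prop)  # 300 635: yet the average open project is better
```

Both statements from the article hold at once in this toy data: the single best project is proprietary by a factor of five, while the average favors open source.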
    • Re: (Score:3, Insightful)

      You are assuming that "a whole lot of people" actually check the code and submit patches for FOSS projects... My guess is that most testing, even of FOSS software, is done with the compiled program, not by reading the source code.
        • And n00b developers are also capable of finding bugs. Aren't they?
          No, they are not, at least not to the extent of an experienced developer.
          Going through the code does not find bugs. Either you take a formally correct approach, that is, a walkthrough or a code inspection, and then you may find bugs, or you only have the chance to find occasional off-by-one errors in a loop or array index. Just by looking over code, as you say in your n00b approach, you only find suspicious pieces of code.
          What now? You change it to be less suspicious? And then? You commit it? So you don't know if something elsewhere is breaking now because of your change? Ah .... you have test cases for the software? So you run them after your refactoring? What now? All pass as before? Oops, if so: then you had no test case for that piece of suspicious code you just fixed! So you still don't know if there was an error or not!

          Testing means to DEFINE how individual pieces of code should behave and writing a test case exactly for that. Changing software and fixing bugs means to have tests, lots of tests, not eyeballs.

          angel'o'sphere

          P.S. That does not mean that formal walkthroughs / inspections don't work, they do!! But informal ones are only interesting for educational purposes.
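A minimal sketch of what "DEFINE how individual pieces of code should behave" can look like in practice. The function `last_n` and its edge cases are hypothetical, picked because slicing with `-n` is a classic place for an off-by-one style mistake (`items[-0:]` returns the whole list, not an empty one).

```python
def last_n(items, n):
    """Return the last n items of a list (the whole list if n exceeds its length)."""
    if n <= 0:
        # Without this guard, items[-0:] would return the entire list.
        return []
    return items[-n:]


def test_last_n():
    """Each assertion pins down one piece of defined behavior."""
    assert last_n([1, 2, 3], 2) == [2, 3]     # typical case
    assert last_n([1, 2, 3], 0) == []         # boundary: nothing requested
    assert last_n([1, 2, 3], 5) == [1, 2, 3]  # boundary: more than available
    assert last_n([], 1) == []                # boundary: empty input


test_last_n()
print("all tests passed")
```

If someone "cleans up" the guard because it looks suspicious, the second assertion fails at once, which is the feedback that eyeballing the code alone does not give.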
    • Re: (Score:3, Informative)

      Open source software is tested by a whole lot of people all over the world, and everyone is free to take the code and test it. Proprietary software, on the other hand, is tested by far fewer individuals.

      That sounds rather idealistic... The coverage on OSS varies a lot. Most is not tested much, and the testing is not systematic and analyzed, but ad hoc. And if a bug is found, many just shrug and think of it as buggy software, but don't do more about it. There is

    • by Alien54 (180860) on Saturday October 07 2006, @02:01PM (#16349381) Journal
      The problem is that there are different types of bugs. Things like a typo in a help file, or American spelling vs British spelling, vs a bug where the app crashes the system when installed on a system with an early version of Quicktime are classified differently.

      The summary just says all bugs, which is not fair if the proprietary has 5 times the number of critical or super-critical bugs.
      • Even worse. (Score:5, Insightful)

        by khasim (1285) <brandioch.conner@gmail.com> on Saturday October 07 2006, @02:32PM (#16349635)
        He's comparing "bugs" in a project such as Apache with "bugs" in the software controlling a jet engine on an airplane.

        He refuses to accept that different projects have different requirements. When the project results in people dying if it fails, you spend a LOT more money and time finding all the "bugs".

        When the worst that happens is that you don't see a web page, your money/time requirements are not so high.

        Even so, from his finding, Open Source is, on average, better than the closed source projects (not counting the closed source projects that result in loss-of-life in the event of a failure).

        He's an idiot for confusing the different requirements.
        • Re:Even worse. (Score:5, Insightful)

          by phantomfive (622387) on Saturday October 07 2006, @04:17PM (#16350317) Homepage Journal
          Don't listen to the slashdot summary. It's terrible. The author is not against open source, he talks about the "brilliant open-source community."

          What this guy is trying to say (besides 'buy my software') is that open source can do better (the title of his article is "...what open-source developers can learn....."). He wants people to use stricter development practices; things like automatic testing, nightly builds, etc.

          Furthermore, he is probably right, automatically testing code ala j-unit or cpp-unit is a great idea when you are getting contributions from many different people. If that became common practice in the open-source world, the code quality would improve. He's not saying open-source is bad, he's saying it could get better.

          This guy is not an idiot, you just didn't understand his point.
              • The codebase is very old and contains a bunch of legacy stuff nobody really understands, as the codebase has passed hands from a German company to Sun to the OpenOffice.org foundation. It's also picked up a layer of Java along the way (for whatever reason).

                It's too bad because it actually works kinda okay, but it's a real effort to get your hands dirty with.
                Blender is also like that... it seems when a codebase has 'gotten around' it tends to pick up the bad habits of all the hands its been through.

                MySQL is in a bad state because it's really only developed by MySQL AB -- no one else is contributing to it so they have no reason to make it any more maintainable than it is. PostgreSQL, on the other hand, had the luxury of being the fruit of some academic research projects and was rewritten once or twice, so it's a little more maintainable.
      • by LetterRip (30937) on Saturday October 07 2006, @02:42PM (#16349709)
        Coverity scanner only checks for programming errors. Ie things that cause crashes, etc.

        However as others have pointed out they are comparing mission critical software to non mission critical software. What should have been done (as has also been pointed out) is to cluster by usage case or software field. So databases to databases, browsers to browsers, generic office usage to generic office usage, etc.

        LetterRip
    • by linuxci (3530) on Saturday October 07 2006, @03:47PM (#16350161) Homepage
      I hate reports like this; there are so many reasons that bug counts don't prove anything. This all reminds me of the times MozillaQuest [mozillaquest.com] used to delight in posting Mozilla bug counts as a measure of quality (now MozillaQuest doesn't seem to mention Mozilla anymore, but a good parody of their Mozilla reporting is here [mozillaquestquest.com]).

      Now these days you often get studies claiming that proprietary software is less buggy than free software, but it misses some very significant points, the ones we used to respond to MozillaQuest articles still apply very much to today:

      • Free software projects very often have an open bug database so it's easy to see how many open bugs are in a project, most proprietary software doesn't have an open bug database so you have to trust the manufacturer and your own testing
      • Not all bugs in open databases are really bugs. Some are requests for enhancement, some are duplicates and some are rants
      • In some cases one person's bug may be another person's feature (e.g. if an application does something differently to the platform guidelines, some people may like this alternative behaviour, others will consider it a bug).
      • The profit motive - companies have a lot to lose by letting people know about bugs, volunteer led projects tend to want people to know about bugs in the hope someone will help fix them (this is getting a bit blurred now that more and more organisations are making money off free software but the fact still is with proprietary software you can't fix the bugs so they gain nothing by telling you about them)
      Sorry if this is redundant, I'm working on call at the moment and was halfway through typing this when I had some work to do!
    • by Anonymous Coward on Saturday October 07 2006, @02:24PM (#16349567)
      > wine for example only has 0.112 / 1000 lines of code as well.
      > and we all know it by far doesn't always do what we want it to do. ;)

      Well duh! It is an implementation of the Windows API. And when considering how often the WinAPI does what you want, I think they have made a perfect copy.
    • by Reziac (43301) * on Saturday October 07 2006, @02:35PM (#16349651) Homepage Journal
      Quoth the poster:

      linux 2.6: 3,315,274 lines of code, 0.138 / 1000 lines of code.
      kde: 4,518,450 lines of code, 0.012 bugs / 1000 lines of code.

      So far so good! But for contrast, I'll add this stat from TFChart:

      Gnome: 31,596 lines of code, 1.931 bugs / 1000 lines of code.

      Eeeep!!

      (No wonder I prefer KDE :)
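For scale, the densities quoted in this comment can be turned back into approximate absolute defect counts. The sketch assumes the three figures are exactly as quoted (lines scanned, defects per 1,000 lines); the helper name `implied_defects` is made up for the example.

```python
# Figures as quoted in the comment above: (lines of code, defects per 1,000 lines).
SCANNED = {
    "linux 2.6": (3_315_274, 0.138),
    "kde": (4_518_450, 0.012),
    "Gnome": (31_596, 1.931),
}


def implied_defects(lines_of_code, defects_per_kloc):
    """Absolute defect count implied by a size and a per-1,000-line density."""
    return lines_of_code * defects_per_kloc / 1000


for name, (loc, density) in SCANNED.items():
    print(f"{name}: about {implied_defects(loc, density):.0f} defects")
```

On these numbers Gnome's much higher density comes from a far smaller scanned codebase: roughly 61 flagged defects, against roughly 54 for KDE and 458 for the kernel, so the density comparison alone may overstate the gap.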

    • by tb3 (313150) on Saturday October 07 2006, @02:38PM (#16349681) Homepage
      Are you nuts? Or are you just trying to see how many vapid over-generalizations you can jam into a single comment?

      Propriety software traditionally undergoes a formalized, designed testing process. It's not perfect, but it's an ordered approach to boundary testing, design level implementation of quality, and more.
      Says who? QA and testing cover the entire gamut, from formalized unit-testing at every level to 'throw it at the beta testers and hope nothing breaks'. It's got nothing to do with 'proprietary' (not 'propriety') vs open source.

      Open source software must rely on after-the-fact testing in the form of "this broke when I tried to do this".
      Where on Earth did you get that? Are you completely oblivious to all the testing methodologies and systems developed by the open source community? Here's a few for you to research: JUnit, Test::Unit, and Selenium.

      Commercial software has a strong QA engineering component. Open Source software relies primarily on a black box testing approach.
      Again with the generalizations! Commercial software development is, by definition, proprietary, so you don't know how they do it! They might tell you they have a 'strong QA engineering component' (whatever that means) but they could be full of shit!