I've been writing code since 1979 (doh!) and writing C (and, later, C++) code since the early 80s. I've seen, and written, nearly every possible way you can write bad code. But I'm pretty good about learning from my mistakes, and I like to think I don't make the same mistake twice once the errors of my ways are pointed out.
But some programming mistakes are just hard to avoid. And that's why we have compilers as our first line of defense. There's lots and lots of code that they just won't allow through, and lots of code they will at least complain about.
Yes, you can use other tools to check your source code. The most common tool that reads C and C++ source code and warns you about dubious constructs is called lint. The problem is, it is a beast to use and a bear to configure, especially if you are trying to run it on already-written code. As it is supposed to, lint complains about everything, no matter how trivial the problem may or may not be. So you spend a lot of time tuning its settings to reflect how aggressive you want it to be, and you keep tuning them forever after. This means it generally isn't used, often to the detriment of the code.
There are many other tools that do lint-like things, and even more. I've used Parasoft's tools and have found them to be very nice. While we were at a local Linux show, we talked to a company that sold a very comprehensive source code analysis tool.
Each of these commercial tools has a big drawback - cost. At US$1000 and up, they are not minor investments. Now, admittedly, US$1000 is not that much to a company. I get paid a pretty penny to code, as do my co-workers, and a US$1000 tool that saved 10 hours of work would begin paying for itself. But it is still a hard thing to justify, as Jack Ganssle talks about. And each takes time to learn, and more time to keep using properly, in order to be most effective.
So the front line of code analysis remains the compiler. It is, after all, the final arbiter of what your code can look like. So we've come to expect some modest level of analysis from it, warning us about dubious code: unpredictable side effects, unused variables, using an assignment in an if statement, and so on.
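To make "dubious code" concrete, here's the classic assignment-in-an-if mistake, in a little throwaway program of my own (get_status is just a made-up stand-in, not anything from the code discussed below):

// assign_in_if.cpp - illustrative only: the sort of dubious construct a
// compiler's warning pass will flag for you.
const int ERROR = 1;

int get_status()        // made-up stand-in for some real status call
{
    return 0;
}

int main()
{
    int rv = get_status();
    if ( rv = ERROR )   // '=' where '==' was meant: always taken, and it clobbers rv
    {
        // ... "handle" the error
    }
    return 0;
}

A compiler with its warnings turned on will flag that if as suspicious, which is exactly the kind of help we want.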
And, unfortunately, I got bit by a couple of GCC's peccadilloes. GCC is the standard compiler in the Linux, *BSD and Unix world. Freely available and incredibly powerful, it is really just a front end to a whole suite of compilation and linking tools. It has dozens and dozens of options, but the one we're concerned with here is -Wall, which turns on a broad set of warnings so the compiler gets persnickety about your code. So I gave it some code that looked something like this:
#include <stdlib.h>

const int OK=0;
const int ERROR=1;
const int MAX_RETRIES=3;

int CheckSomething();   // defined elsewhere; returns OK or ERROR

int main( int /*argc*/, char* /*argv*/[] )
{
    int rv = CheckSomething();
    if ( rv == ERROR )
    {   // error return, try to work around it
        // ... do some recovery stuff, give up if it doesn't work
    }
    // ... everything's okay, so carry on
    return 0;
}
Everything was fine and going according to plan. A little later, it turned out that CheckSomething should be tried a few times before assuming there is a problem. So I quickly (ahh, the key word) changed it to be:
#include <stdlib.h>

const int OK=0;
const int ERROR=1;
const int MAX_RETRIES=3;

int CheckSomething();   // defined elsewhere; returns OK or ERROR

int main( int /*argc*/, char* /*argv*/[] )
{
    int rv;
    for ( int rcnt=0; rcnt < MAX_RETRIES && rv != OK; ++rcnt )
        rv = CheckSomething();
    if ( rv == ERROR )
    {   // error return, try to work around it
        // ... do some recovery stuff, give up if it doesn't work
    }
    // ... everything's okay, so carry on
    return 0;
}
Do you see the problem? At first glance, it looked okay, so off it went. No complaints from GCC, the code worked perfectly and everyone was happy.
Then, months later, I got a complaint that the code wasn't working any more. It wasn't recovering from the error correctly. A quick re-scan of the code and, sure enough, the problem was pretty evident. But my question then became: why didn't GCC complain about the error? It's a pretty common error, and one that compilers have been warning about for decades. It turns out there are two intertwined reasons that this bug remained hidden under a rock for so long, only rearing its ugly head when the error case cropped up.
- GCC zero-initializes automatic variables when the code is built in 'debug' mode with the -g flag. Normally, automatic variables (i.e., ones declared inside a function rather than globals) are not initialized to any predictable value by the compiler; the programmer needs to do it explicitly. But in my builds GCC set them to zero, and it does so only when building for debugging, not when you use the optimizer (the -O flag). (A small test program after this list illustrates both points.)
- The -Wall flag doesn't warn about using uninitialized variables unless the -O flag is also used. The man page says:
These warnings are possible only in optimizing compilation, because they require data flow information that is computed only when optimizing. If you don't specify -O, you simply won't get these warnings.
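To see both points in action, here's a minimal test program of my own (a throwaway sketch, not code from the project in question). The comments describe the behaviour as I observed it and as the man page excerpt above explains it; the language itself promises nothing about what an uninitialized local contains.

// uninit_demo.cpp - a throwaway sketch for poking at the two behaviours above.
#include <cstdio>

int main()
{
    int x;    // automatic variable, deliberately left uninitialized

    // g++ -g uninit_demo.cpp
    //   In my debug builds x tends to come out as 0 - but reading it is still
    //   undefined behaviour; the zero is an accident of the build, not a guarantee.
    //
    // g++ -Wall uninit_demo.cpp
    //   No warning about x, just as the man page excerpt says: without -O there
    //   is no data flow analysis to feed the uninitialized-variable check.
    //
    // g++ -Wall -O uninit_demo.cpp
    //   Now the compiler has the data flow information it needs and warns about
    //   the uninitialized use of x.
    std::printf( "x = %d\n", x );
    return 0;
}

Which is exactly the combination of circumstances that hid the bug in my retry loop.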
So I got nailed by probably the oldest programming bugaboo in the book - using an "uninitialized" variable. It worked in the normal case because I was building with the -g flag, which left rv set to 0, which, you'll remember, is the OK status. So the loop body never executed, CheckSomething was never called, and things continued along their merry way. That's fine until something actually goes wrong and you want to recover from it; you'll never even know about it.
So the obvious fix is:
#include <stdlib.h>

const int OK=0;
const int ERROR=1;
const int MAX_RETRIES=3;

int CheckSomething();   // defined elsewhere; returns OK or ERROR

int main( int /*argc*/, char* /*argv*/[] )
{
    int rv=ERROR;
    for ( int rcnt=0; rcnt < MAX_RETRIES && rv != OK; ++rcnt )
        rv = CheckSomething();
    if ( rv == ERROR )
    {   // error return, try to work around it
        // ... do some recovery stuff, give up if it doesn't work
    }
    // ... everything's okay, so carry on
    return 0;
}
And now we're golden. We're guaranteed to call CheckSomething at least once, and we'll correctly capture its return value for the check afterwards. So the moral of the story is to use -Wall, but to be sure to also compile the final version with -O so you actually get all the warnings. And to look into getting a C++ code analyzer!
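If you want to play with the flag combinations yourself, here's a self-contained variant of the fixed program. The CheckSomething stub is pure invention on my part (fail twice, then succeed) so the retry loop has something to chew on; the real function obviously does something more useful.

// retry_demo.cpp - self-contained sketch of the fixed retry loop; the
// CheckSomething stub below is made up for the demo, not the real function.
#include <cstdio>

const int OK=0;
const int ERROR=1;
const int MAX_RETRIES=3;

// Illustrative stand-in: fail twice, then succeed, so the retry loop has work to do.
int CheckSomething()
{
    static int calls = 0;
    return ( ++calls < 3 ) ? ERROR : OK;
}

int main()
{
    int rv = ERROR;   // initialized, so the loop is guaranteed to run at least once
    for ( int rcnt = 0; rcnt < MAX_RETRIES && rv != OK; ++rcnt )
        rv = CheckSomething();

    if ( rv == ERROR )
        std::printf( "still failing after %d tries\n", MAX_RETRIES );
    else
        std::printf( "recovered, rv = %d\n", rv );
    return 0;
}

Build it with g++ -Wall -O and then try deleting the =ERROR initializer; with the GCC versions I'm describing here, the uninitialized-use warning comes right back.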