Friday, October 28, 2005

A Programmer's Cautionary Tale

Warning - geek jargon ahead, including light C code reading and other programming nerdliness.



I've been writing code since 1979 (doh!) and writing C (and, later, C++) code since the early 80s. I've seen, and written, nearly every possible way you can write bad code. But I'm pretty good about learning from my mistakes, and I like to think I don't make the same mistake twice, once the error of my ways is pointed out.



But some programming mistakes are just hard to avoid. And that's why we have compilers as our first line of defense. There's lots and lots of code that they just won't allow through, and lots of code they will at least complain about.



Yes, you can use other tools to check your source code. The most common tool that reads C and C++ source code to warn you about dubious constructs is called lint. The problem is, it is a beast to use and a bear to configure, especially if you are trying to run it on already written code. As it is supposed to, lint complains about everything, no matter how trivial a problem it may or may not be. So you spend a lot of time getting its settings so that it reflects how aggressive you want it to be, and you keep doing this. This means it generally isn't used, often to the detriment of better code.



There are many other tools that do lint-like things, and even more. I've used Parasoft's tools and have found them to be very nice. While we were at a local Linux show, we talked to a company that sold a very comprehensive source code analysis tool.



Each of these commercial tools has a big drawback - cost. At US$1000 and up, they are not minor investments. Now, admittedly, US$1000 is not that much to a company. I get paid a pretty penny to code, as do my co-workers, and a US$1000 tool that saved 10 hours of work would begin paying for itself. But it still is a hard thing to justify, as Jack Ganssle talks about. And each takes time to begin using, and to continue to use in order to be most effective.



So the front line of code analysis remains the compiler. It is, after all, the final arbiter of what your code can look like. So we've come to expect some small level of meta-analysis, warning us of dubious code, like unpredictable side effects, unused variables, using an assignment in an if statement, etc. And, unfortunately, I got bit by a couple of GCC's peccadilloes.



GCC is the standard compiler in the Linux, *BSD and Unix world. Freely available, incredibly powerful, it is really just a front end to a whole suite of compilation and linking tools. It has dozens and dozens of options, but the one we're concerned about here is -Wall, which turns on a large batch of warnings (though, despite the name, not quite all of them), so it will be persnickety about your code. So I gave it some code that looked something like this:





#include <stdlib.h>

const int OK=0;
const int ERROR=1;

const int MAX_RETRIES=3;

int CheckSomething();   // returns OK or ERROR; defined elsewhere

int main( int /*argc*/, char* /*argv*/[] )
{
    int rv = CheckSomething();

    if ( rv == ERROR )
    {   // error return, try to work around it

        // ... do some recovery stuff, give up if it doesn't work

    }

    // ... everything's okay, so carry on

    return 0;
}


Everything is fine, and going according to plan. A little later, it turned out that CheckSomething should be tried a few times before it is assumed there is a problem. So I quickly (ahh, the key word) changed it to be:



#include <stdlib.h>

const int OK=0;
const int ERROR=1;

const int MAX_RETRIES=3;

int CheckSomething();   // returns OK or ERROR; defined elsewhere

int main( int /*argc*/, char* /*argv*/[] )
{
    int rv;

    for ( int rcnt=0; rcnt < MAX_RETRIES && rv != OK; ++rcnt )
        rv = CheckSomething();

    if ( rv == ERROR )
    {   // error return, try to work around it

        // ... do some recovery stuff, give up if it doesn't work

    }

    // ... everything's okay, so carry on

    return 0;
}


Do you see the problem? At first glance, it looked okay, so off it went. No complaints from GCC, the code worked perfectly and everyone was happy.



Then, months later, I got a complaint that the code wasn't working any more. It wasn't recovering from the error correctly. A quick re-scan of the code and sure enough, the problem was pretty evident. But my question then became, why didn't GCC complain about the error? It's a pretty common error, and one that compilers have been warning about for decades. Turns out there are two intertwined reasons that this bug remained hidden under a rock for so long, only rearing its ugly head when the error case cropped up.




  • In unoptimized 'debug' builds (using the -g flag and no -O), automatic variables often just happen to start out as zero. Normally, automatic variables (i.e., ones that are declared inside a function and are neither global nor static) are not initialized to any predictable value by the compiler; the programmer needs to do it explicitly. GCC doesn't promise that zero, though. It is most likely a side effect of where the unoptimized code puts the variable on the stack, and it can disappear once you use the optimizer (the -O flag).

  • The -Wall flag doesn't warn about using uninitialized variables unless the -O flag is used. The man page says:

    These warnings are possible only in optimizing compilation, because they require data flow information that is computed only when optimizing. If you don't specify -O, you simply won't get these warnings.



So I got nailed by probably the oldest programming bugaboo in the book - using an "uninitialized" variable. It worked in the normal case because, in my -g builds, rv came up as 0, which, you'll remember, is the OK status. So the loop condition failed right away, CheckSomething was never called, and things continued along their merry way. That's okay unless there is a problem and you want to try to fix it: you'll never know about it.



So the obvious fix is:

#include <stdlib.h>

const int OK=0;
const int ERROR=1;

const int MAX_RETRIES=3;

int CheckSomething();   // returns OK or ERROR; defined elsewhere

int main( int /*argc*/, char* /*argv*/[] )
{
    int rv=ERROR;   // start pessimistic so the loop runs at least once

    for ( int rcnt=0; rcnt < MAX_RETRIES && rv != OK; ++rcnt )
        rv = CheckSomething();

    if ( rv == ERROR )
    {   // error return, try to work around it

        // ... do some recovery stuff, give up if it doesn't work

    }

    // ... everything's okay, so carry on

    return 0;
}


And now we're golden. We're going to call CheckSomething at least once, and now we'll correctly get the return value for checking later. So the moral of the story is to use -Wall but to be sure to compile the final version with -O to get all the warnings. And to look into getting a C++ code analyzer!
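
Another way to make this kind of slip harder to write is to tuck the retry into a small helper built around a do-while loop, so the check always runs at least once and the status is never read before it is set. This is only a sketch: retryUntilOk and flakyCheck are names made up for illustration, not anything from the real code.

```cpp
#include <functional>

const int OK = 0;
const int ERROR = 1;

const int MAX_RETRIES = 3;

// Hypothetical helper: call check() until it returns OK or maxRetries
// attempts have been made. The do-while guarantees at least one call,
// so rv is always assigned before it is tested.
int retryUntilOk( const std::function<int()>& check, int maxRetries = MAX_RETRIES )
{
    int rv = ERROR;     // defensive default; overwritten by the first call
    int attempts = 0;

    do
    {
        rv = check();
        ++attempts;
    } while ( rv != OK && attempts < maxRetries );

    return rv;
}

// Hypothetical stand-in for CheckSomething(): fails twice, then succeeds.
int flakyCheck()
{
    static int calls = 0;
    return ( ++calls < 3 ) ? ERROR : OK;
}
```

With that in place, main would just say int rv = retryUntilOk( flakyCheck ); and carry on with the same if ( rv == ERROR ) recovery as before.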



Thursday, October 27, 2005

fortune and fame

The fortune command spills out a small witticism or other interesting quote. I've talked about it before here, where I pointed out there was a FreeBSD tips fortune file.

Well, it's pretty easy to add your own quote file, but I can never remember how to do it, so I'm documenting it here.

fortune generally reads its data files from /usr/share/games/fortune - all the files that have a '-o' appended are "offensive" ones. You can pass a quote file (soon I'll tell you how to make one) on its command line and it will use that one. The special file name "all" will use all the ones in the "standard" places. Other useful options are -a, to include "offensive" quotes, -e, to consider them all to have equal sizes (otherwise it chooses which file based upon the size of the file), and the -l/-s flags to show only long or short ones.

To build your own fortune file, create a simple text file that has no extension and is a list of strings, separated by a line that has just a '%' on it. It would look something like this:


It's amazing how many people in this world are born at third base,
and think they've hit a triple.
Albert L. Lilly III
%
Eagles may soar, but weasels don't get sucked into jet engines.
Todd C. Somers
%
There is no psychiatrist in the world like a puppy licking your face.
Ben Williams
%


Then run the strfile command on it, which generates a .dat file with indices into this quote file for fortune to use:


$ strfile quotes
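
To get a feel for what an index like that amounts to, here's a rough C++ sketch that scans quote text in the format above and records the offset where each quote starts. It is not strfile's actual .dat layout, which I haven't tried to reproduce; quoteOffsets is just a name made up for illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical illustration: return the byte offset at which each quote
// begins in text where quotes are separated by a line holding only '%'.
std::vector<std::size_t> quoteOffsets( const std::string& text )
{
    std::vector<std::size_t> offsets;
    std::size_t pos = 0;          // offset of the line being examined
    bool atQuoteStart = true;     // true when the next non-'%' line starts a quote

    while ( pos < text.size() )
    {
        std::size_t eol = text.find( '\n', pos );
        if ( eol == std::string::npos )
            eol = text.size();

        const std::string line = text.substr( pos, eol - pos );

        if ( line == "%" )
            atQuoteStart = true;          // separator: the next line begins a quote
        else if ( atQuoteStart )
        {
            offsets.push_back( pos );     // first line of a new quote
            atQuoteStart = false;
        }

        pos = eol + 1;                    // step past the newline
    }

    return offsets;
}
```

Picking a random quote then comes down to picking a random entry from that list and reading from there up to the next '%' line.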


So now you can run fortune on this file:


$ fortune quotes


The easiest way to integrate this into the normal fortune lookup is to put it into the /usr/share/games/fortune directory, but that isn't usually writable by normal folks. This means you need to create an alias to do it:


$ alias fortune="fortune -ae all ~/quotes"


The -a says to include the "offensive" ones, the -e says to choose equally among the files, the 'all' says to include all the standard ones and then I add my own file at the end.

Again, don't forget about the freebsd-tips fortune file:


$ \fortune freebsd-tips


gives you a fortune from the tip file. I used the backslash to avoid the fortune command getting replaced by my alias (all of this discussion assumes you are using bash for a shell). To see if it is working correctly add in the -f flag, which displays the files it will use:


$ \fortune -fae all ~/quotes

___% /usr/share/games/fortune
___% fortunes
___% fortunes2
___% freebsd-tips
___% murphy
___% startrek
___% zippy
___% fortunes2-o
___% limerick
___% murphy-o
___% fortunes-o



Again, note the backslash to "quote" the f, so that alias expansion doesn't happen. It's a nice little shortcut if you think an alias is getting in the way.

For some reason, the FreeBSD.org man pages on the web, found at FreeBSD Hypertext Man Pages, don't have the 'fortune' page - probably a bug.

strfile


Tuesday, October 25, 2005

New PC-BSD release

PC-BSD 0.8.3 has been released to the wild. Once again, I'm going to download it and give it a try on my test machine. This time I won't even try to make it work with my KVM box. I have a USB mouse hooked directly up to it.

PC-BSD - Personal Computing, served up BSD Style!



Tuesday, October 18, 2005

portaudit

Cool port that keeps a database of all the ports that have security problems, and it won't let you install them without a manual override. I'm going to look a little more into this one!

FreshPorts -- security/portaudit



Wednesday, October 12, 2005

grep help

I finally broke down and figured out how to better use the recursive grep function. Although I've been using the bash shell via Cygwin on Windows for a few months now, I'm still partial to 4NT as a command shell. It has a few nice features I wish that bash had:


  • A better directory recall

  • The '...' (and beyond) facility for changing directories upwards an arbitrary number of parents

  • A directory database

  • A much easier to use "FOR" facility from the command line

  • The FFIND /S command, which recursively searches for text in files



I knew about grep's -R flag, but found it hard to use and couldn't figure out how to recursively search just .h files, for instance. I finally broke down and read the man page, and here's what I found out.

To recursively search for something, you can use the -R flag to grep:


$ fgrep -R BIGENDIAN .


To recursively check just .h files:


$ fgrep -R --include="*.h" BIGENDIAN .


Where:
fgrep - "fixed-string" grep (the same as grep -F), often thought of as "fast" grep. You can use this if you don't use any regexps in the search pattern
-R - recursively search folders, starting with this one (.)
--include="*.h" - just include *.h files. You need to put the quotes around the wildcard, or the shell will try to expand it and it won't work like you think. That's why you can't just do:


$ fgrep -R BIGENDIAN *.h


It will still just search local .h files, because the shell expands the wildcard before fgrep gets it. So if there are two .h files in the current directory, the command looks like this after the shell gets done with it:


$ fgrep -R BIGENDIAN foo.h bar.h


You can add multiple --includes, and there's even a --exclude you can add to exclude files of that pattern from the search.
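
As I understand it, the difference is that --include checks the wildcard against each file's name as grep walks the tree, instead of letting the shell expand it once up front. Here's a rough C++ sketch of that idea using POSIX fnmatch(). It is not grep's own code, and matchesInclude is a made-up name.

```cpp
#include <fnmatch.h>   // POSIX glob-style matching
#include <string>

// Hypothetical helper: does the file-name part of path match the glob?
bool matchesInclude( const std::string& glob, const std::string& path )
{
    // Strip any leading directories so only the file name is tested
    const std::string::size_type slash = path.rfind( '/' );
    const std::string name =
        ( slash == std::string::npos ) ? path : path.substr( slash + 1 );

    // fnmatch() returns 0 when the name matches the pattern
    return fnmatch( glob.c_str(), name.c_str(), 0 ) == 0;
}
```

So a file several folders down, like ./include/sys/endian.h, still matches "*.h" because only its name is tested, while ./src/main.c is skipped.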

grep man page



Tuesday, October 11, 2005

/dev/random

I was looking into some random number generator stuff and came across this page on getting /dev/random ready for generating some good random numbers. Mind you, this is just for 4.x (like my server), as 5.x and beyond have a built-in setup for doing that random number thing (see the note at the bottom of this page).



Prepping /dev/random in FreeBSD 4.x



Thursday, October 6, 2005

New FreeBSD.org home page

Whoa! I just went to the FreeBSD.org home page and found the all-new layout. It's got some nice things about it, but I surely miss the easy links to the Ports page especially. I had to go on a bit of a quest in fact to find it. FYI, you can find it under the "Get FreeBSD" button at the top, then the last item on the menu at the left.



Still, it's a pretty clean looking layout, and maybe my original shock will give way to a grudging acceptance.



The FreeBSD Project



Drupal

Yet another "Content Management Platform". I've talked about them before (like here), and here is another one, called "Drupal". Something I'd surely love to play with in my non-existent free time.



drupal.org | Community plumbing

Port description for www/drupal