Programmer Superstitions

Why do we cling to some programming traditions so strongly? Jesse Liberty asks us to pause, ponder and re-assess the value of many of our superstitions.

Superstition: widely held but irrational belief (Concise OED).

Ancestor Worship: the custom of venerating deceased ancestors (Merriam Webster).

Apophenia: seeing patterns or connections in random data (Wikipedia).

There are a number of practices that we engage in – no, that we cling to, defend, and teach to others – that amount to one or another of these forms of magical thinking. This is often just fine, and no harm is done (other than to our self-image as rational geeks), but some of these totemic rituals are stumbling blocks to our ability to produce reliable software. From time to time we might want to stop and question our most cherished assumptions.

Data Hiding

I’ve been writing in and teaching C++ and C# for fifteen years. I know well the iron-clad rule of object-oriented programming that class data should be hidden (private) and accessed through either a property (C#) or an accessor function (C++). Thus:
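(A minimal sketch, using the Employee class and the employeeName backing variable that we will return to below.)

    public class Employee
    {
        // the backing data, hidden from clients of the class
        private string employeeName;

        // the only sanctioned way in or out
        public string EmployeeName
        {
            get { return employeeName; }
            set { employeeName = value; }
        }
    }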

There are good reasons for this rule. Data hiding makes for a better decoupling of classes, and allows the programmer to intercept the access of private data and apply rules or other processing. It is possible, for example, to check whether the client accessing a value has the correct permissions to see or modify that value, and also to massage the data in appropriate ways.
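Here the accessors could genuinely earn their keep. A sketch of that idea (the CallerHasPermission helper and the trimming are purely illustrative, not any real API):

    public class Employee
    {
        private string employeeName;

        public string EmployeeName
        {
            get
            {
                // hypothetical permission check before handing out the value
                if (!CallerHasPermission("EmployeeName"))
                {
                    throw new System.Security.SecurityException("Access denied");
                }
                return employeeName;
            }
            set
            {
                // massage the incoming data before storing it
                employeeName = (value == null) ? string.Empty : value.Trim();
            }
        }

        // stand-in for whatever permission mechanism the application actually uses
        private bool CallerHasPermission(string memberName)
        {
            return true;
        }
    }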

But look closely at the first example above: it is not unusual. The backing data is stored in a private member variable, and full access is provided through a get and a set accessor, neither of which does anything but return or set the value. That is, the accessors add no immediate value at all.

Why do we do this? The last, desperate excuse, as you will find in many computer books, including my own (I add with some chagrin), is that making the backing variable private allows you to change how you store the data without breaking any client of your Employee class. You could, for example, decide some time in the future to retrieve the name from a database.
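(A sketch of that promised future: the get accessor quietly switches to a database lookup, and clients of Employee are none the wiser. LoadNameFromDatabase is a hypothetical stand-in for whatever data access code the application would use.)

    public class Employee
    {
        // now merely a cache for the value fetched from storage
        private string employeeName;

        public string EmployeeName
        {
            get
            {
                if (employeeName == null)
                {
                    employeeName = LoadNameFromDatabase();   // hypothetical lookup
                }
                return employeeName;
            }
            set { employeeName = value; }
        }

        // stand-in for real data access code
        private string LoadNameFromDatabase()
        {
            return "a name fetched from the database";
        }
    }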

The rational part of me suspects that the number of person-hours wasted, both by making this variable private and by providing do-nothing accessor functions, swamps any possible benefit. And yet, I can’t quite bring myself to rewrite this as follows:
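(Again just a sketch: the class reduced to a public field, nothing more.)

    public class Employee
    {
        // the data, exposed directly: no backing variable, no do-nothing accessors
        public string EmployeeName;
    }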

It looks wrong. It goes against what I’ve been taught and what I ‘know’ to be correct. It makes me itch.

The problem is that I don’t have a good rational reason not to write this. This is just the equivalent of not stepping on cracks in the sidewalk. One could even argue that the second version is more efficient, easier to understand, easier to maintain, and the risk is vanishingly small (one can always wrap it in a property later, with client classes none the wiser!).

Now, do I have the courage and moral fiber to put that in commercial code? In my next book even?

Ancestor Worship

Let’s take an example where we are not only being irrational, but also making our lives harder and our code more expensive to write and to maintain.

You may want to sit down for this one, but I’m going to dare to ask: why do we insist that C-derived languages (such as C#) continue to be case sensitive? Other than as homage to Kernighan [1] and Ritchie [2], I can find little justification; after 20+ years of writing in C, C++ and C#, I believe I can safely say that the disadvantages of case sensitivity swamp the advantages.

The only clear advantage I have ever found is the ability to make the name of a property the PascalCase version of the camelCase name of its backing variable, as we saw above. That is, the backing variable might be employeeName and the property EmployeeName; because C# is case sensitive, these are considered two different identifiers. This avoids having to name the backing variable _employeeName or some such ugly alternative.

In exchange for that convenience, we enjoy hours of debugging, trying to find where we inadvertently introduced a new variable or method name because of a misplaced shift-key.
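A taste of the problem, as a contrived sketch: one misplaced shift key inside the setter and the assignment refers to the property rather than the field. Because the two spellings are legally distinct identifiers, the compiler accepts it without complaint, and every assignment to EmployeeName recurses until the stack overflows.

    public class Employee
    {
        private string employeeName;

        public string EmployeeName
        {
            get { return employeeName; }
            // the shift key slipped: this assigns to the property, not the field,
            // so the setter calls itself until a StackOverflowException results
            set { EmployeeName = value; }
        }
    }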

Has any bright graduate student done research on the cost/benefit of case sensitivity? Is there any rational reason that in 2006 C# continues this ‘tradition’ that was established thirty years ago? Or might it be a lingering fear of showing disrespect to the icons of our industry; the mighty heroes who created the C family, defeated Troy and bequeathed us these scriptures by which we live?

Or Maybe Not…

There is an argument that case sensitivity makes more sense with some human languages other than English, and may even make sense as an optimization for some data structures, such as hash tables. Such arguments, however, speak to the need for an optimizing compiler to handle the issue; there is no reason for the language to do so.

In any case, before you write in: I’ve stopped caring much about the specifics here, because Visual Studio goes a long way toward fixing the problem; its IntelliSense effectively makes case sensitivity a non-issue. I’m just using this as an example of how we cling to irrational traditions rather than as a major obstacle to writing clean code. The truth is, anyone writing C# without Visual Studio deserves the tsuris that results.

C++ programmers like to suffer anyway, so this just feeds the beast.

Ancestor Worship II

Here’s another example of latent ancestor worship (or at least of very old habits dying hard). There is a wonderful myth that American standard railroad tracks are the width they are (4 feet, 8.5 inches) because that is the way they were built in England, because that is the gauge used by the first tramways, because that is the width of the wagon wheels built to fit the ruts in old English roads, which were in turn dug by Imperial Roman chariots.

The myth has tremendous lasting power (you can find it all over the net) because it feels right [3]. We do that kind of thing a lot; we build the streets of Boston on old cow paths [4]; we unconsciously follow old patterns, even when those patterns no longer make sense or are no longer necessary.

How many times have you seen (or written) code like this:
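(A representative sketch in C#.)

    int[,] grid = new int[10, 20];

    for (int i = 0; i < grid.GetLength(0); i++)
    {
        for (int j = 0; j < grid.GetLength(1); j++)
        {
            grid[i, j] = i * j;   // the venerable i and j, doing what they have always done
        }
    }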

In this example (for those of you who do not spend your time writing C#), the counter variables are i and j. Even those of us who would never dream of using single-letter variable names feel perfectly comfortable using the venerable i and j in these circumstances, and we have been doing so since we learned to program.

Why, for heaven’s sake, do we use these particular letters? Surely a and b would make much more sense. Even x and y might make more sense; but it turns out that in Fortran (remember Fortran? Remember Eisenhower?) variables whose names began with the letters I through N were implicitly integers (a convention inherited from the even older tradition of mathematicians using i through n as subscripts for integers), and, well, we just got into the habit. This one is fairly harmless, a ritualized and vestigial part of the programming mind that we’re surprisingly reluctant to let go of.

Pattern Recognition [5]

One of the most powerful forms of magical thinking is Apophenia: seeing patterns or connections in random data. The tendency towards Apophenia is probably hardwired into the human brain; it is the price we pay for the very advantageous human ability of pattern recognition (an adaptive part of our intelligence that helps us know when to run and when to hunt) but it can also lead us astray (arguably it is the basis of our belief in many pseudo-sciences).

Apophenia is certainly pervasive in consulting. A classic example was the 1980s tendency to study ‘excellence’ in successful companies, trying to extract the apparently essential elements that led to their success.

Unfortunately, not only did other companies find it far more complex and difficult to reproduce that success by following these patterns, but even the iconic companies themselves felt the worm turn over time. They kept repeating their patterns, but the outcomes were different. What went wrong?

It became clear that the patterns of ‘excellence’ we saw in successful companies, with their great customer service and their attention to employees and small details (as admirable and perhaps as necessary as those efforts were), were not as tightly connected with success as we had thought. Correlation is not always causation, as we so often learn after losing our shirts.

Software Process Patterns

In software, we’ve identified various patterns of failure that we’ve tried to ‘correct’ over the past two or three decades. Lack of documentation and communication was ‘corrected for’ by the promulgation of standards such as CMM and ISO 9000. Unfortunately, this led to rigidity and an inability to respond to rapidly changing requirements. In turn, this was ‘corrected for’ with more nimble Agile approaches such as eXtreme Programming [6].

In the 1990s, there was a series of divergent and later convergent methodologies offered for creating well-designed object-oriented programs [7]. Learning these ‘methodologies’ involved a significant investment of time and energy, and advocates of each methodology became highly motivated to demonstrate that their method (and perhaps only their method) would lead to success: ‘when all you have is a hammer, the whole world looks like a nail’. There is always a fine line between rationally embracing a methodology that will ensure success of a project, and irrationally clinging to an approach because it is what you know.

We know that personal investment clouds judgment; just ask the scientists who work for Big Tobacco or Big Oil, or for that matter, Big anything!

We have to be wary of following the development patterns of successful software companies on the premise that they must know the secret. It wasn’t that long ago that the top database was dBase, the top spreadsheet was Lotus 1-2-3, the top word processor was MultiMate (or was it WordStar… no, maybe it was WordPerfect), and the top compiler was made by Borland. Each of these companies clearly knew how to produce excellent software. After all, their development processes worked at least once.

It simply isn’t clear that you can capture a ‘winning’ process and reproduce it like a recipe, or that failure to follow the ‘recipe’ can be shown to be the sole reason for failure of a project. We’re getting pretty good at knowing what doesn’t work (it is always easier to create chaos than order) but not so good at finding what does work consistently.

I recently testified as an ‘expert witness’ in a civil lawsuit, at which the opposing ‘expert witness’ asserted that the ‘failure’ of the project could be attributed to a lack of strict compliance with the ISO 9000 standard.

Smart people can have a reasonable discussion about whether ISO 9000 will improve the likelihood of success on very large projects (e.g. the software for the mission to Mars). I personally would not like to work on a software project that is managed using anything like such a bureaucratic, heavyweight, inflexible, document-intensive, rigid process, but that does not necessarily mean that I can prove that no project would ever benefit from it.

I had no hesitation, however, in asserting under oath that a project with ten developers would benefit from strict adherence to ISO 9000 like a drowning man would benefit from being thrown an anchor. It is my opinion that knee-jerk reliance on a process like ISO 9000 to guide you through each project is a form of Apophenia; the connection between the pattern of ISO 9000 compliance steps and success, however measured, is imaginary.

Looking Where the Light Is

There is an old joke about a man searching for his keys under a street lamp. He lost the keys in the alley behind him, but he searches under the lamp because that is where the light is.

In our desperate attempt to gain control over very complex processes, with so much money at stake and so many examples of previous failures, we often fall victim to seeing apparent patterns (be they processes or otherwise) where none exist. We examine various projects and say: ‘Aha! I see why this project worked and that one didn’t: the difference was too much/too little analysis/design/documentation/process/oversight/communication. And all we have to do is increase/decrease the number/length/duration/sequence/complexity/formality of the meetings/documents/diagrams/studies/sign-offs, etc.’

These false patterns lead us astray; they offer us the promise that if we paint by the numbers, we too can be Renoir. It may be, however, that the variables of a successful process are far more complex, including, if we are terribly unlucky, factors over which we have little or no control, or, only marginally better, factors over which we will have no control until our tools and technologies mature.

Or, it may just be that some developers are better at the ‘art’ of programming and shipping product, and that the old adage “’tis a poor carpenter who blames his tools” applies to software as well as it does to other crafts.

The Scientific Method

Over the years, at least to some degree, society has given up many (though not all) of its superstitions when presented with more compelling alternatives. One of the most effective techniques for distinguishing between superstition and truth (or some approximation of truth) is the scientific method; in short, controllable, measurable, reproducible effects subjected to peer review.

It’s hard to do that sort of thing when you’re trying to hit a deadline, and it’s particularly hard to sort out all the alternatives when there are so few objective comparisons. When was the last time you were able to find anything like an objective answer to the question “which is better: Java or .NET?” (Please don’t write in, my mailbox fills quickly).

It is particularly interesting that the work done at universities and research centers is often not only unrelated to, but totally disparaged by, the folks who write code for a living. That is not the way things work in other engineering fields, and I’m not convinced we can afford the disconnect for much longer. We seem to be writing 21st-century software with a 12th-century mindset, and that can’t be good.

Perhaps we need to put some time into establishing metrics for the software development process: an agreed-upon way to measure how long development takes and how successful the results are. Just defining those two concepts could take a while, let alone figuring out how to measure them. Once we have such metrics, however, we could apply them to a variety of approaches and objectively measure, across large samples (to remove confounding variables), which techniques seem to work under various conditions. It would be a start.


[1] Brian Kernighan – according to Wikipedia, he wrote the first ‘Hello, World’ program and co-authored the first book on C, but he denies having created C, saying “It’s entirely Dennis Ritchie’s work.”

[2] Dennis Ritchie, creator of C, key developer of Unix, and programming icon.

[3] The truth is (as always) more complicated. For a complete examination, take a look at Snopes (a wonderful source for debunking and disentangling urban legends).

[4] Actually, according to William Fowler, director of the Massachusetts Historical Society, this too is an enduring myth, as reported in the Boston Globe on April 25, 2004.

[5] With thanks to William Gibson

[6] See Agile Software Development by Robert C. Martin

[7] See for example Object-Oriented Analysis and Design by Grady Booch, The Object Advantage by Ivar Jacobson, Object-Oriented Modeling and Design by James Rumbaugh, Analysis Patterns by Martin Fowler, Object-Oriented Software Construction by Bertrand Meyer, etc.