Why Test-Driven Development? (Part 1)

This article is part of a larger ongoing series from Alex Bunardzic on Test-Driven Development. You can view all the articles on this page.

Software development is a very tough discipline. Robert Martin (better known as Uncle Bob) holds that it is the toughest discipline of all, because never before in human history has there been an expectation to deal with such insane levels of detail. Software development is so tough that it even gave rise to a movement called Extreme Programming (XP). We're talking about the extreme levels of focus and hard work needed to produce quality software.

Since the core activity in software development is programming, what is the main challenge we face when attempting to program computers? Whenever we sit down to write some code, we are confronted with two equally demanding challenges:

  1. We are focused on writing code that behaves as expected.
  2. We are focused on writing code that is well structured.

It turns out that holding both expectations in our head at the same time tends to produce a lot of confusion. Trying to focus on two things at the same time is often a recipe for producing substandard material.

Is there a way out of this impasse? Yes, there is – it was proposed by Kent Beck in the 1990s with his formulation of the Extreme Programming discipline. Kent proposed a different approach to writing programs, one that segregates the two activities above. This way of writing programs consists of first focusing on producing code that behaves as expected. During this first phase, we deliberately ignore the quality of the code structure. In other words, we just write the code that comes easily to us, meaning it's okay during this phase to break many rules and produce code that will make many professionals cringe. That first pass of writing code (the first draft) is sloppy and messy, and that's okay. We give ourselves permission to be sloppy at this juncture. Our aim at this point is single-minded: create code that behaves as expected.

And now for the crucial question: how do we know that the sloppy code we just wrote behaves as intended? We could manually verify the behavior of the newly created code, but that would be very wasteful. Hey, here's an idea – since we are very good at automating processes, why don't we investigate automating our own development process? Instead of doing tedious, repeatable chores manually, why don't we offload those chores to the machines?

What repeatable manual chores am I referring to here? I'm mostly thinking about the following scenario: we make a diff (a change) to the source code with the intention of getting it to produce some desired behavior. We save those changes and then embark on performing the following manual steps:

  1. Compile the code
  2. Build the app
  3. Configure the app
  4. Run the app
  5. Navigate to the login page on the app
  6. Log in with special credentials that simulate a specific user role
  7. Navigate the app to reach the section that is affected by the code change
  8. Enter some test data
  9. Examine the resulting output
  10. Log out and log in again with different credentials that simulate a different user role
  11. Repeat steps 7 to 9

Whew! That's a lot of manual heavy lifting. Why are we doing things in such a pedestrian way? All of the above steps can easily be automated. Automating them not only saves time but also spares us from 'fat finger' mistakes.

So, how do we automate the above chores? Simple – we write some code. But wait a minute – didn't we just say that writing code is super tough, because we are striving to satisfy two incompatible expectations (i.e., the expectation that the code behaves to our satisfaction and the expectation that the code is well structured)? Well, in this case, the code we write to automate development chores is a different breed of code. Unlike the implementation/production/shipping code, which must be well designed and properly structured, the code we write to automate our development activities need not be that fancy.

Rules for automating behavioral expectations

There is an adage in the software development profession: “Functionality is an asset; code is a liability.” That means we want to implement certain functionality, and if we can do it without having to write any code, that would be the most desirable outcome. However, oftentimes it is not possible to implement new functionality without writing some code, so a compromise seems unavoidable.

Functionality is what we call system behavior. Once implemented, the system behaves in a certain automated fashion. While developing the system, we strive to make it behave in the expected way(s).

To keep expectations about behavior separate from expectations about code structure, we must write the expectation first. Let's illustrate with a simple example. Suppose we are developing a tip calculator, and we expect it to calculate tip percentages correctly. We can say, "given an order total of $100.00, when the service was good, add $15.00 to the total to produce the grand total of $115.00". That expectation stipulates that the system will calculate a 15% tip for good service.

Okay, but how do we automate that expectation? We do it by writing a test. The test first arranges the preconditions by declaring the order total to be $100.00 and the service level to be 'good'. The test then envisions that there is a system in place and that the test will be its first client. The test (i.e., the client) notifies the system, saying, "here is the order total of $100.00, and the service was good; please calculate the tip and add it to the order total".

To many software developers, this approach feels a bit odd, a bit backwards. Why are we spending time imagining how we could be interacting with an imaginary system? Isn’t it smarter to first build a system, and only then attempt to interact with it?

The reason we envision how the client's interaction with a not-yet-created system will go lies in our desire to keep our options open. We prefer to remain flexible. If we were to wait until the system was built, we'd be coerced into having no choice but to interact with it in ways that may not make much sense to us, the client. The ease of understanding the system and the usability of the system are important aspects of its design, which is why we pursue this train of thought.

So, by following this open-ended approach, we find ourselves in a very flexible position: we can envision any type of interaction with the nascent "tip calculator" system. We could call it directly and synchronously (i.e., blocking); we could try calling it asynchronously; we could even attempt to interact with it by publishing messages and letting the system subscribe to our publishing channel and consume our messages. The sky is the limit.

Once we formulate our first stab at how we would like the client to interact with the system, we can pretend that the interaction happened. The only thing left to do is to verify whether the system behaved the way we expect it to behave. In this case, we expect the system to return $115.00 as the order's grand total.
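As a minimal sketch, here is what such a test might look like in Python. The module name tip_calculator and the function name calculate_grand_total are our own inventions for this illustration – nothing dictates them yet – and the direct, synchronous call is just one of the interaction styles mentioned above:

    # test_tip_calculator.py -- the expectation, written before the system exists
    from tip_calculator import calculate_grand_total

    def test_good_service_adds_15_percent_tip():
        # Arrange: set up the preconditions
        order_total = 100.00
        service_level = "good"

        # Act: interact with the not-yet-built system
        grand_total = calculate_grand_total(order_total, service_level)

        # Assert: verify the expected behavior (the post-condition)
        assert grand_total == 115.00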

Now, in practical terms, this arrangement means that the expected system behavior cannot happen yet. We can try to run this half-baked test, and it will fail, of course. It cannot succeed for the simple reason that the system it is supposed to interact with does not exist yet.

But this failure is a good thing. It lets us know that our expectation hasn’t been met, and it gives us the impetus to keep going in the attempt to develop a system that satisfies our expectation.

The most important thing in this little exercise is to focus 100% on our expectations regarding the system behavior. We need not waste any time pondering the implementation of the expected behavior. In other words, we are focused on formulating what the system is supposed to be doing, not on how the system is supposed to be doing it.

Separating the what from the how (separating the intention from the implementation) is a very powerful practice. It lets us focus on one thing at a time, which increases our chances of delivering quality work. Divide and rule is a maxim that applies to pretty much all situations.

The next thing to notice in the above exercise is that we are not formulating any conditional logic. The expectation – the test we have laid out above – proceeds in a straight line: from arranging the preconditions (i.e., "given the order total is $100.00 and the service was good"), to triggering the system behavior (i.e., sending the message to the yet-to-be-built system), to asserting the produced results (i.e., did the system return a $115.00 grand total?). The last part is often referred to as the post-condition (or, simply, the assertion).

Pay close attention to the fact that nowhere in the flow of the above test do we expect the code to make any branching decisions (in the form of "if this condition, then do that, otherwise do the other thing"). This ground rule is super important. It can be formulated as "never add any conditional logic to your tests!"

Of course, a testing framework itself possesses that branching logic, in the form of "if the expectation has been met, the test passes; otherwise, the test fails". Such branching logic is baked into the assertion statement offered by the testing framework; however, we don't need to add any further logic to it – it is offered to us for free.

To summarize, two ground rules when automating behavioral expectations are:

  1. Envision what needs to be done, not how it will get done
  2. Avoid implementing any conditional (i.e., branching) logic in the test

How do we work on meeting the expectation?

Now that we have crafted the shell of our expectation, how do we fulfill that expectation? In a more technical parlance, how do we make the failing test pass?

The recommended way to do that is to focus on two things:

  1. Taking only small steps
  2. Taking ordered small steps

The small-steps approach is another unusual aspect of this discipline. Professional software developers sometimes view that approach as suitable only for novices who are still learning the ropes. Then, once the ramping-up process is done, novices graduate to the intermediate/senior level, at which point they are expected to dispose of the sheepish small steps and take on coding challenges in bigger strides.

That, however, is not the Test-Driven Development way. In TDD, we really cherish small steps. And by small steps, I'd insist, we mean micro-steps – one micro-step at a time.

Before we take a closer look at why micro-steps are so desirable, let's look at what is meant by a micro-step. The first thing to ask, always, is: what is the smallest step we can take now? Usually, the smallest step is the simplest step, but that's not always the case.

In this case, however, the smallest step we could possibly take is crafting a skeleton of the system we intend to endow with the expected behavior. It could be a module, a class, or some other construct containing executable code that can be triggered. Usually, such a skeleton system can accept some input values and return some output values.

So, we take our first small step by crafting a class or a module. That class/module can contain a simple function or method that accepts arguments, or input parameters. For example, it can accept the order total and the service level as parameters.
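Continuing the illustrative Python sketch from above, such a skeleton might look like this:

    # tip_calculator.py -- the smallest possible skeleton of the system;
    # it accepts the inputs our test sends, but it does not behave yet
    def calculate_grand_total(order_total, service_level):
        pass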

OK, but now that we have a simple system that accepts those values, how can we make that system behave? How can we make it return the expected value (in our case, $115.00)? We take the next simplest possible step – we return a hardcoded value. The body of the method remains empty, save for one simple statement (sketched here in the same illustrative Python):
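    def calculate_grand_total(order_total, service_level):
        # the simplest step that can possibly work: return, hardcoded,
        # the exact value the first test expects
        return 115.00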

And with that simple change, we have now built a system that behaves as expected – it accepts the order total of $100.00 and the service level labeled 'good', and then functions (i.e., behaves) by returning the value $115.00. In effect, the test now passes! The expectation has been met. We're done here.

Or are we?

More complex behavior

If every order total were $100.00 and every service level were 'good', we wouldn't need to develop an automated tip calculator app. What happens when, to keep things extremely simple, the order total is still $100.00 but the service level is 'great'?

In that case, our expectation is that the system should add $20.00 to the order total, producing a grand total of $120.00. We can now write another test that formulates that expectation. When we run the test, it will fail. Why? Because it will send the order total of $100.00 and the service level 'great' to the tip calculator, but it will receive back the actual value of $115.00. And because it was expecting the actual value to be $120.00 in the case of great service, the assertion will fail.

How do we make the second test pass? The only way to make it pass is by replacing the hardcoded return value with some processing logic. The processing logic will have to make calculating decisions based on the type of the service: if the service was 'good', add 15% to the order total; if the service was 'great', add 20% to the order total. Both tests now pass.
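In the same illustrative Python, that processing logic might look something like this (the rounding to cents is our own addition, to keep floating-point arithmetic from spoiling the dollar-amount comparisons):

    def calculate_grand_total(order_total, service_level):
        # calculating decisions based on the type of the service
        if service_level == "good":
            return order_total + round(order_total * 0.15, 2)
        if service_level == "great":
            return order_total + round(order_total * 0.20, 2)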

What happens when we expect the system to calculate the tip for excellent service? We formulate another expectation. We create a new test that expects the grand total to be $125.00, given the order total of $100.00 and the service level 'excellent'.

Of course, this new test will fail (as it should). It fails because the system doesn't know what to return when the service level is 'excellent'. So, we need to modify our processing code to make it recognize the service level 'excellent' and then calculate the appropriate tip and add it to the order total (by applying 25% to the order total).
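Continuing the sketch, recognizing the third service level is one more small, ordered step:

    def calculate_grand_total(order_total, service_level):
        if service_level == "good":
            return order_total + round(order_total * 0.15, 2)
        if service_level == "great":
            return order_total + round(order_total * 0.20, 2)
        if service_level == "excellent":
            return order_total + round(order_total * 0.25, 2)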

As should already be obvious from this exercise, we are growing the capabilities of our system in a very gradual, piecemeal fashion. And each step of the way is marked by a failing test that then passes once we make modifications to our shipping code.

It is also worth noting that our progression moves from very concrete code (as in a hardcoded value) toward more abstract code (replacing the hardcoded value with actual calculations that are performed on demand). We will see that TDD is a discipline that keeps nudging us, little by little, away from more concrete code toward more abstract, more generic code.
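For instance, one small move in that more generic direction might be to turn the service levels and their tip rates into data, so the calculation itself is written only once. This is, again, just one possible sketch, not a prescribed destination:

    # the tip rates become data; the calculation becomes generic
    TIP_RATES = {"good": 0.15, "great": 0.20, "excellent": 0.25}

    def calculate_grand_total(order_total, service_level):
        tip = round(order_total * TIP_RATES[service_level], 2)
        return order_total + tip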

For now, we have illustrated only one side of the Test-Driven Development process. We have accomplished the first challenge: making the code behave as expected. A bigger, more important side of this process – creating well-structured code – will be discussed in the next article.

Conclusion

The above-described approach to developing software will probably look ridiculous to developers who have never tried it. I know it looked not only ridiculous but also very stupid to me when I first encountered it back in 2002. I could not see why anyone would waste time writing more code than is necessary. Why write those executable expectations/tests when all we're really asked to deliver is the shipping code? Obviously, no customer will ever agree to pay for those tests. Isn't it logical, then, to avoid wasting time writing them?

Of course, even back then we all agreed that having regression tests is necessary. But those regression tests were to be written after we finished writing the shipping code, not before. Also, traditionally, regression tests were the jurisdiction of the testers, or QA – not something software developers were expected to produce.

When it was explained to me that writing the test before we write the code helps with the code quality, that didn't make any sense to me either. I was confused: if we don't think developers are capable of writing quality code without tests (which are supposed to drive the code quality), what makes us think that those same developers are suddenly capable of writing quality code in tests? Isn't that approach just pulling the wool over our eyes? Tests are also code, and if we are doubtful that developers can write quality shipping code, why are we not doubtful that they can write quality test code?

We will clarify that dilemma in the following article. Still, for now, we should review one important aspect of the example we've examined so far. By following the simplest TDD practice, we were able to implement the tip calculator functionality without ever resorting to any manual steps. None of the 11 steps listed in the introduction had to be executed. From that, we see that TDD offers full software development process automation.

When making changes to our code, we never have to manually compile, configure, run, log in, navigate, and type in test data to verify that the change works as expected. Instead, we let our tests do all the heavy lifting. Tests are there to eliminate manual waste; they are fast, reliable, and repeatable at will.

Tests are also great for providing many other useful services that enhance the process of software development and ensure higher quality. More on that in the next article.