A TDD Journey: 5 - Tests vs. Code; Refactor Friendliness; Test Parameterization

Test-Driven Development (TDD) has a workflow of writing some test code, then writing some production code to make the test pass. That is necessary but not sufficient: you must also make sure the test and the code together are doing what you think! Michael Sorens continues his series by introducing test case parameterization for avoiding code duplication with no additional code complexity.

This is part 5 of our exploration into practicing hands-on TDD. Unlike most of my multi-part series, it is not advisable to join this one in the middle. So if you are arriving fresh, please go back and review part 1 for an overview of TDD, then the subsequent parts, which both built out our tests and code and introduced ancillary concepts and techniques crucial to the TDD approach.

Here are the tests we have developed so far. This list of behaviors documents the rather simple operations our class under test knows how to handle at this point in development:
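Collecting the names referenced throughout this installment, that list reads approximately as follows:

    WidgetActivator_constructor_accepts_an_IWidgetLoader
    WidgetActivator_constructor_accepts_an_IWidgetPublisher
    Execute_delegates_to_IWidgetLoader_to_load_widget_details
    Execute_delegates_to_IWidgetPublisher_to_publish_widget
    Execute_returns_true_if_details_are_loaded
    Execute_returns_false_if_no_details_to_load
    Execute_returns_false_if_publishing_failed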

Importance of Tests vs. Code

Let’s start off this installment with a question to consider: which is more important in TDD, test code or production code? Consider for a moment before reading further. Then write down your answer so you cannot hedge :-).

I predict you will scoff at the question: obviously your production code is more valuable because that is what you sell or deliver or install. Your customers do not pay you for your test code. Indeed, your customers never see your test code. But is it really so obvious? You learned in part 4 that everything you need to know about your class under test is embodied in a simple list of the names of the tests.

From just that information you can deduce that we are testing a WidgetActivator; it has an Execute method; Execute returns a Boolean status; the WidgetActivator uses an IWidgetLoader and an IWidgetPublisher, both of which have a method that also returns a Boolean status; and the return value of Execute is a conjunction of the return values of its components. And that is just from the test names. With the full bodies of the test methods you could be more precise and more detailed. I submit, in fact, that if you lost your production code you could recreate it reliably and accurately from the tests!

How about the other way? If you lost your tests, what would happen? For a brief moment in time you have working production code. But as you proceed with new development, bug fixes, patches, and the like, your code evolves, and you no longer have the safety net of your TDD-developed unit tests to guarantee its stability. You could recreate some set of unit tests, of course, but it is rather improbable that you could recreate your complete set of tests to provide full coverage of your code. So let me ask again: which is more important, test code or production code?


TEST: The last test covered when publishing fails; let’s now put in a test for when publishing succeeds…

CODE: That test fails for the same reason the WidgetLoader test failed when the load was successful: we did not isolate the WidgetLoader in the test. Here it is just the reverse: we need to isolate the IWidgetPublisher by forcing the WidgetLoader to always return true; then this test passes.
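As a sketch of what that test might look like with NUnit and Moq, inside the test fixture built up in earlier parts (the interface and method names are taken from the test names discussed in this series; the exact arrangement may differ from the original listing):

    [Test]
    public void Execute_returns_true_if_publishing_succeeded()
    {
        var stubWidgetLoader = new Mock<IWidgetLoader>();
        var stubWidgetPublisher = new Mock<IWidgetPublisher>();
        // Isolate the publisher: force loading to always succeed.
        stubWidgetLoader.Setup(loader => loader.Load()).Returns(true);
        stubWidgetPublisher.Setup(publisher => publisher.Publish()).Returns(true);
        var widgetActivator = new WidgetActivator(stubWidgetLoader.Object, stubWidgetPublisher.Object);

        Assert.That(widgetActivator.Execute(), Is.True);
    }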

But wait a minute: this test is a sham! It purports to validate the behavior that IF publishing succeeds THEN Execute succeeds. Period. It does not say that if publishing succeeds when loading succeeds then Execute succeeds. Yet that is exactly what we just coded. So this reveals a weakness in the behavior specification, i.e. the test name. It might be better to state

    Execute_reflects_the_publishing_result_if_publishing_succeeded,

suggesting there are other factors that are also reflected in the result of Execute. But that is a bit too wibbly-wobbly. I prefer something like

    Execute_returns_true_when_publishing_succeeds_after_loading_succeeds.

Similarly, we will be best served by renaming the related tests:

Execute_returns_true_if_publishing_succeeded =>
    Execute_returns_true_when_publishing_succeeds_after_loading_succeeds
Execute_returns_false_if_publishing_failed =>
    Execute_returns_false_when_publishing_fails_after_loading_succeeds
Execute_returns_true_if_details_are_loaded =>
    Execute_returns_true_when_loading_succeeds_and_then_publishing_succeeds
Execute_returns_false_if_no_details_to_load =>
    Execute_returns_false_when_loading_fails_and_then_publishing_succeeds

These names are much clearer because they explicitly show which component we are isolating while holding the other component constant.

Reprise: Beware Passing Tests

A key tenet of the previous article in this series bears repeating:

Watch out for tests that pass for the wrong reason!

The penultimate test we wrote was the opposite of our latest test: Execute_returns_false_if_publishing_failed. And in the last article you saw how that test passed only coincidentally. We needed to fix the production code to make it pass for the right reason. But let’s take one more look at that test now with its new name:
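Here is a sketch of that renamed test as it stands, before any fix (again, the names and arrangement are assumed from the surrounding discussion):

    [Test]
    public void Execute_returns_false_when_publishing_fails_after_loading_succeeds()
    {
        var stubWidgetLoader = new Mock<IWidgetLoader>();
        var mockWidgetPublisher = new Mock<IWidgetPublisher>();
        // Note: nothing here makes the loader succeed!
        mockWidgetPublisher.Setup(publisher => publisher.Publish()).Returns(false);
        var widgetActivator = new WidgetActivator(stubWidgetLoader.Object, mockWidgetPublisher.Object);

        Assert.That(widgetActivator.Execute(), Is.False);
    }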

Do you see that this test still passes for the wrong reason (albeit a different wrong reason)? Here the test name points us to the culprit. This time it is the test itself that needs correcting, with the very same fix we just used in the latest test: we must take the WidgetLoader out of the equation. As written, this test will pass regardless of what the mockWidgetPublisher does, because the stubWidgetLoader invokes its default behavior, returning false. We need to make loading succeed, as the test name now says, so that the results are meaningful:
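Concretely, the fix amounts to one added line in the arrange section of the sketch above:

    // Take the WidgetLoader out of the equation by making loading succeed.
    stubWidgetLoader.Setup(loader => loader.Load()).Returns(true);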


Now that we have the loader and the publisher nominally acting reasonably, let’s consider what to do next. Of course, you do not have any domain knowledge about this WidgetActivator and its components just from the class names, but let’s agree it is reasonable that if we fail to load our widget there is no point in publishing it.

TEST: In this test we force the loader to report failure, then we check that the publisher is never used. This test fails because, as written, the WidgetActivator always invokes both the loader and the publisher.
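A sketch of such a test, using Moq’s Verify to assert that Publish is never called (names assumed as before):

    [Test]
    public void Execute_only_publishes_if_loading_succeeds()
    {
        var stubWidgetLoader = new Mock<IWidgetLoader>();
        var mockWidgetPublisher = new Mock<IWidgetPublisher>();
        // Force loading to fail; publishing should then be skipped.
        stubWidgetLoader.Setup(loader => loader.Load()).Returns(false);
        var widgetActivator = new WidgetActivator(stubWidgetLoader.Object, mockWidgetPublisher.Object);

        widgetActivator.Execute();

        mockWidgetPublisher.Verify(publisher => publisher.Publish(), Times.Never());
    }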

CODE: Let’s add a conditional to only invoke the publisher when appropriate. I have also renamed the local variable from result to success to make it more meaningful in this context.
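The production code might then look something like this (the field names here are my own guesses):

    public bool Execute()
    {
        bool success = _widgetLoader.Load();
        if (success)
        {
            // Only attempt to publish when loading succeeded.
            success = _widgetPublisher.Publish();
        }
        return success;
    }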

CODE: That made this new test pass, but it made one of our first tests fail,
Execute_delegates_to_IWidgetPublisher_to_publish_widget.
The reason is clear: in this barebones test we gave the loader and publisher no mocking behaviors, so method calls basically return null or false, as appropriate. The code change we just put in inhibits the call to the publisher, hence the test failure. Rather than fix this test, though, I am going to be a bit more draconian and just delete it.

Why delete it?

  1. This test is rather naïve: implicitly it says that we use the IWidgetPublisher in all cases, but the last test just invalidated that assumption.
  2. This test is now redundant, covered by the other IWidgetPublisher tests we have written since. Clearly, if we are testing whether publishing failed or succeeded, then we are making use of the IWidgetPublisher.
  3. This test is “non-refactor-friendly” (explained in the next section).

For the same reasons I am also going to delete the very first test,
    Execute_delegates_to_IWidgetLoader_to_load_widget_details.
I also do not see a great need for our very elementary tests,
    WidgetActivator_constructor_accepts_an_IWidgetLoader
and
    WidgetActivator_constructor_accepts_an_IWidgetPublisher,
so I will delete those now as well.

Refactor-Friendly Tests

A bit of an explanation: refactoring, the process of restructuring code without changing its behavior, is an integral part of TDD. Indeed, the term “red-green-refactor”, introduced in part 1 of this series and originally espoused by James Shore, is almost synonymous with TDD. We have, in fact, been following this process: write a bit of a failing test (the test runner reports a failure, usually in red); write a bit of production code to make the test pass (green); then reflect upon the code, tidy it up, improve it, consolidate it (refactor).

Those of us in the .NET world are terribly spoiled with the refactoring power of ReSharper or CodeRush. (I understand IntelliJ IDEA is as good in the Java world, though I have not tried it.) These tools let you move classes around, extract members out of classes, generate interfaces, and much more.

But consider the test Execute_delegates_to_IWidgetPublisher_to_publish_widget, which specifically refers to the interface IWidgetPublisher in its name. That reference would not be noticed by ReSharper or its kin, so if you chose to rename IWidgetPublisher to, say, IWidgetGluepot, this test would essentially become an orphan, mentioning an interface that no longer exists. In short, explicitly embedding the name of a class, interface, etc., within a test name is non-refactor-friendly.


Let’s look again at the latest test, Execute_only_publishes_if_loading_succeeds, sketched above.

Does the body of this test fulfill the expectation of the name of the test? The answer is “No.” Can you spot why? The name of the test actually makes two claims: publish if loading succeeds, and do not publish if loading fails.

Parameterized Test Cases

Here we are going to introduce a new construct from the NUnit framework: the TestCase attribute. Using TestCase instead of Test lets you specify parameters to the test, which is quite useful when you want to write two or more nearly identical tests. In this instance we want one test where the load succeeds and one where it fails. So we introduce a parameter value in the TestCase attribute, add a parameter to the method signature that maps to that parameter value, and finally use that parameter as the value for the Load method to return:
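A sketch of the parameterized test; the body is the same as before except that the hard-coded false fed to Load is now the loadSucceeds parameter:

    [TestCase(false)]
    public void Execute_only_publishes_if_loading_succeeds(bool loadSucceeds)
    {
        var stubWidgetLoader = new Mock<IWidgetLoader>();
        var mockWidgetPublisher = new Mock<IWidgetPublisher>();
        // The TestCase value flows in here via the loadSucceeds parameter.
        stubWidgetLoader.Setup(loader => loader.Load()).Returns(loadSucceeds);
        var widgetActivator = new WidgetActivator(stubWidgetLoader.Object, mockWidgetPublisher.Object);

        widgetActivator.Execute();

        mockWidgetPublisher.Verify(publisher => publisher.Publish(), Times.Never());
    }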

With that in place, we can now add the second, very similar test case with just an attribute addition:
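That is, one more attribute on the same method (a sketch):

    [TestCase(false)]
    [TestCase(true)]
    public void Execute_only_publishes_if_loading_succeeds(bool loadSucceeds)
    {
        // ...body unchanged...
    }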

In your unit test runner of choice, run all the tests and see what happens. In ReSharper’s unit test runner, for example, these two new tests are now grouped together under the one test name. But more importantly, notice that the new test fails…


…because while we configured the test to take different inputs, we did not tell it how that should affect the output. One small change will clear up that error: making the output depend on that input parameter.
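In the sketch above, that one small change would be to the Verify call:

    // Expect one call to Publish when loading succeeds, none when it fails.
    mockWidgetPublisher.Verify(publisher => publisher.Publish(),
        loadSucceeds ? Times.Once() : Times.Never());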

Notice how straightforward it is to understand this test. If you give this to someone who has never seen it, the thought process to understand it might go something like this:

The test name states that it should check that publishing depends on whether loading succeeds or not. There are thus two test cases defined: one that forces the load to succeed and one that forces it to fail. If it succeeds, the Verify statement wants Publish to be hit once; if it fails, Publish should not be hit at all.

Note that I have introduced only a very small code complication to handle the two test cases: the ternary conditional in the Verify statement. If I needed anything more complicated than that, I would generally use two separate tests, because making test code more complex just to save a few lines of test code is a false economy and a bad practice. Tests should be as simple as possible so that you or others can easily understand them.

You could be even more stringent, though, allowing no additional code complexity beyond the parameters introduced with TestCase(). You could rewrite the test to meet this tighter constraint by adding a second parameter. Because Times.Once() is a method, though, we cannot pass it in as a value; the values supplied to TestCase must be compile-time constants. But Once just means 1 and Never just means 0, so the following is completely equivalent to the last version, with no additional code complexity:
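A sketch of that rewrite (the second parameter name is my own choice):

    [TestCase(false, 0)]
    [TestCase(true, 1)]
    public void Execute_only_publishes_if_loading_succeeds(bool loadSucceeds, int expectedPublishCalls)
    {
        var stubWidgetLoader = new Mock<IWidgetLoader>();
        var mockWidgetPublisher = new Mock<IWidgetPublisher>();
        stubWidgetLoader.Setup(loader => loader.Load()).Returns(loadSucceeds);
        var widgetActivator = new WidgetActivator(stubWidgetLoader.Object, mockWidgetPublisher.Object);

        widgetActivator.Execute();

        // Times.Exactly(1) is Once; Times.Exactly(0) is Never.
        mockWidgetPublisher.Verify(publisher => publisher.Publish(), Times.Exactly(expectedPublishCalls));
    }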


Test case parameterization is a powerful mechanism for avoiding code duplication with no additional code complexity. The above test allowed only two possible inputs, but imagine if you needed to test that a set of various strings were all recognized as dates, including perhaps 1/3/2014, 1/3/14, 01/03/14, 1-3-2014, 2014-01-03, and others. Rather than repeating essentially the same test with just a different constant, use a set of TestCase attributes on top of a single test to save a lot of up-front coding, and a lot of maintenance should things change down the road.
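To make that concrete, here is a hypothetical sketch, with DateTime.TryParse standing in for whatever date-recognition logic you are really testing:

    [TestCase("1/3/2014")]
    [TestCase("1/3/14")]
    [TestCase("01/03/14")]
    [TestCase("1-3-2014")]
    [TestCase("2014-01-03")]
    public void Date_string_is_recognized_as_a_date(string input)
    {
        // Note: TryParse is culture-sensitive; these cases assume a US-style culture.
        DateTime result;
        Assert.That(DateTime.TryParse(input, out result), Is.True);
    }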