On Writing Unit Tests for C#

Is it realistic to keep to principles of 'Test-First' and 100% coverage for unit tests when in the heat of developing commercial C# applications? Does rigorous unit-testing lead naturally to good design by enforcing testability, low coupling and high cohesion? Patrick Smacchia gives his opinion based on hard-won experience.

I began writing Unit Tests for my C# code around 8 years ago. I’d like to describe some of the ways in which the experience of practicing real-world unit testing for all these years has changed the way I code.

Test Organization

Testability is about designing software that is easy to test, rather than about writing tests. By developing the tests alongside the code, you greatly increase your chances of writing testable code. The TestFixture class implements a test fixture: it specifies the test conditions, such as the set-up and clean-up steps that need to be repeated during testing. It generally includes Set-Up and Tear-Down methods that run before and after each test, or before and after the entire suite of tests.
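As a minimal sketch, here is what such a fixture looks like with NUnit. The ShoppingCart class and its tests are hypothetical, included only to keep the example self-contained; NUnit also offers fixture-level set-up/tear-down attributes that run once for the whole suite rather than around each test.

    using NUnit.Framework;

    // Hypothetical class under test, included only to make the sketch self-contained.
    public class ShoppingCart
    {
        public int ItemCount { get; private set; }
        public void Add(string item) { ItemCount++; }
    }

    [TestFixture]                      // one test fixture per tested class
    public class ShoppingCartTests
    {
        private ShoppingCart _cart;

        [SetUp]                        // runs before each test
        public void SetUp()
        {
            _cart = new ShoppingCart();
        }

        [TearDown]                     // runs after each test
        public void TearDown()
        {
            _cart = null;              // nothing real to clean up in this toy example
        }

        [Test]
        public void NewCart_IsEmpty()
        {
            Assert.AreEqual(0, _cart.ItemCount);
        }

        [Test]
        public void Add_IncrementsItemCount()
        {
            _cart.Add("book");
            Assert.AreEqual(1, _cart.ItemCount);
        }
    }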

There should be one TestFixture class for every class that you test. By following this rule, matching each class with a TestFixture class, these operations …

  • writing a new test suite for a newly created class
  • writing some new tests for an existing class
  • refactoring tests when refactoring some classes
  • identifying which test to run to test a class

…are easier and more straightforward. Common sense dictates that there might be trivial exceptions to the one-to-one correspondence rule just stated. For example, very simple classes that are pure immutable entities (made of simple constructors, simple getters, and read-only backing fields) might well be tested merely through TestFixture classes corresponding to more complex tested classes.

If you find that you need more than one test fixture class for a class, you can create several, but a class that needs many test fixture classes might violate the Single Responsibility Principle: http://en.wikipedia.org/wiki/S…

It is good practice to keep all tests in the same test fixture class, with each test responsible for calling the right initialisation procedure; but if you are faced with the need to run a lot of tests, it makes sense to have several test fixture classes simply to group the tests appropriately.

Tests must be written in dedicated test assemblies, because production assemblies must not be burdened with test code; tests and tested code should therefore live in different VS projects. These projects can live in the same VS solution or in different VS solutions. However, it is useful to have them in the same VS solution so that you can navigate easily between tested code and tests.

Even if you keep code and tests in separate VS solutions, you will still need a global VS solution containing both, which is useful anyway when you use refactoring tools to refactor the tested code and its tests simultaneously.

Test-First and specifications

A practice that is made popular by Agile and Extreme Programming methodologies is ‘test-first development’. From Wikipedia: ‘The tests should be written before the functionality that is being tested’.

I don’t agree with that. Tests for a portion of code should be written at the same time as the code itself. I advocate this because not all edge cases can be foreseen up-front, even by domain experts or professional testers. Edge cases are generally discovered only while thoroughly implementing an algorithm. This is an iterative process where you write a few tests, then write the few corresponding lines of code and methods, and repeat until the features and classes are completely implemented and 100% covered by tests.

By talking with many who advocate test-first principles, I find that they too describe a process where code and test writing are iterated. We are certainly all doing it the same way in reality, and it is the term ‘test-first’ that seems misleading to me. With experience, one can anticipate how to design the code to be testable. I often write the code with a testable design first, then the associated tests, and then repeat.

The Test-first principle is still useful when tests can be written up-front by a domain expert or a client, with dedicated tooling such as FitNesse. In circumstances like this, Test-first is a great forum for communicating with the domain expert in order to flush out any chance of a misunderstanding.

It is important to keep in mind that it is the code itself rather than the test suite that represents the complete specification. Tests can be seen as a specification scaffold that is removed at runtime. Tests can also be envisaged as a secondary, less-detailed, specification; useful to ensure that the main specification, the code itself, respects the range of requirements that is materialized by test assertions.

Test and Code Contracts

Calls from tests to tested code shouldn’t violate tested code contracts. When a contract is violated by running tests, then either the tested code, the test itself, or the specification is buggy.

Assertions in tests are not much different from code contract assertions. The only real difference is that assertions in tests are not checked at production time. Both kinds of assertions exist for correctness, to sound the alarm as soon as something goes wrong when the code runs.
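A small illustration of the parallel, using hypothetical names: the contract assertions below use Debug.Assert (Code Contracts or explicit guard clauses would serve the same purpose and behave differently in release builds), while the test assertion lives in the test suite and only runs when the tests run.

    using System.Diagnostics;
    using NUnit.Framework;

    public static class Discount
    {
        public static decimal Apply(decimal price, decimal rate)
        {
            // Contract-style assertions: checked while the code itself runs
            // (in debug builds, with Debug.Assert).
            Debug.Assert(price >= 0, "price must be non-negative");
            Debug.Assert(rate >= 0 && rate <= 1, "rate must be in [0, 1]");

            return price * (1 - rate);
        }
    }

    [TestFixture]
    public class DiscountTests
    {
        [Test]
        public void Apply_TakesTenPercentOff()
        {
            // Test assertion: same intent, but only executed when tests run.
            Assert.AreEqual(90m, Discount.Apply(100m, 0.10m));
        }
    }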

Test Coverage

The raw number of tests is a meaningless metric. Dozens of [TestCase]s can be written in minutes, whereas a tricky integration test can take days to write. When it comes to quantifying a testing effort and the robustness of a test suite, the more significant metrics are

  • The percentage of code covered by tests
  • The number of different assertions executed.

Not all classes need to be 100% covered by tests, or even covered at all. Typically, UI code is decoupled from logic because UI code is hard to test properly, despite progress in this field. Notice that the practice of decoupling UI code, which is good design, is naturally enforced by tests. Hence UI classes should be concerned only with UI code and with calls into the logic/domain of the program. Logic and domain code must be written in non-UI classes to make it testable.
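A hedged sketch of this separation, with invented names: the calculation is placed in a plain class that unit tests can exercise directly, while the UI class (shown only as a comment) merely forwards to it.

    using NUnit.Framework;

    // The domain logic lives in a plain, UI-free class...
    public class InvoiceCalculator
    {
        public decimal TotalWithVat(decimal amount, decimal vatRate)
        {
            return amount * (1 + vatRate);
        }
    }

    // ...so a UI class (not unit-tested) only wires events to that logic, e.g.:
    //   void OnComputeClicked(object sender, EventArgs e)
    //   {
    //       totalLabel.Text = new InvoiceCalculator().TotalWithVat(amount, 0.2m).ToString();
    //   }

    [TestFixture]
    public class InvoiceCalculatorTests
    {
        [Test]
        public void TotalWithVat_AddsTwentyPercent()
        {
            Assert.AreEqual(120m, new InvoiceCalculator().TotalWithVat(100m, 0.2m));
        }
    }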

Apart from code that is obviously hard to test, such as UI code, it is necessary to aim for 100% coverage when testing a class. 90% coverage is not enough, because experience shows that the 10% of non-tested code always seems to contain most of the unidentified bugs.

If you find that, in most situations, the last 10% of the code of a class takes as much effort to cover as the first 90%, you’ll know that this 10% of code needs to be redesigned to be easily testable. The 100% coverage practice is costly only insofar as it forces substandard code to be well designed.

If code is 100% covered, then there is no dead code at all, and the code is liable to be pretty much 100% correct, as long as the tested code is stuffed with contract assertions. Indeed, in such conditions every statement covered by tests fulfils its unit action, and all contextual data values are in well-defined ranges, checked by contract assertions.

The ‘number of touches’ count (how many times a statement is executed by tests) is not really relevant. If, for example, I know that a piece of code works fine for all positive integers, I don’t need to test all 2 billion positive values just to increase that count. Testing the 4 values 1, 10, 100,000, and int.MaxValue should be enough. Notice that among these 4 values I included the 2 obvious edge cases, 1 and int.MaxValue, plus 2 values that are different enough from each other. Your goal here is to always strive to identify the edge cases that are contractually supported, and to write tests for them. Edge cases are the enemy of testers, and you’ll often spot them automatically while increasing the coverage ratio of a class from 90% to 100%.
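To make this concrete, here is a hypothetical example using NUnit’s [TestCase] attribute to pin down exactly those four representative values; the Digits class exists only to make the sketch self-contained.

    using System;
    using NUnit.Framework;

    // Hypothetical code under test.
    public static class Digits
    {
        // Returns the number of decimal digits of a positive integer.
        public static int Count(int n)
        {
            if (n <= 0) throw new ArgumentOutOfRangeException(nameof(n));
            int count = 0;
            while (n > 0) { count++; n /= 10; }
            return count;
        }
    }

    [TestFixture]
    public class DigitsTests
    {
        // The four representative values: both contractual edge cases
        // (1 and int.MaxValue) plus two sufficiently different values.
        [TestCase(1, 1)]
        [TestCase(10, 2)]
        [TestCase(100000, 6)]
        [TestCase(int.MaxValue, 10)]
        public void Count_WorksForPositiveIntegers(int input, int expected)
        {
            Assert.AreEqual(expected, Digits.Count(input));
        }
    }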

As a side note, let’s mention that “100%” coverage is not actually complete coverage, because of the way many .NET code coverage tools work. Language constructs such as ternary operators, inner method calls, or LINQ queries are treated as one statement (one monolithic sequence point) by code coverage tools. For example, if just the false expression of a ternary operator is covered, then the true expression is assumed to be covered as well. Some .NET code coverage tools, such as NCover, are a bit smarter because they measure an additional branch-coverage metric, which provides a finer-grained view of the coverage of monolithic sequence points and statements.
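The following hypothetical snippet shows the blind spot (exact behaviour varies between coverage tools): a single test executes only one branch of the ternary, yet statement-level coverage reports the method as fully covered.

    using NUnit.Framework;

    public static class Sign
    {
        // A single statement containing a ternary operator: many coverage tools
        // count the whole line as one sequence point.
        public static string Describe(int n)
        {
            return n >= 0 ? "non-negative" : "negative";
        }
    }

    [TestFixture]
    public class SignTests
    {
        [Test]
        public void Describe_NonNegative()
        {
            // Executes only the 'true' branch of the ternary, yet statement-level
            // coverage reports the whole return statement (and the method) as covered.
            Assert.AreEqual("non-negative", Sign.Describe(5));
        }
    }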

It is a good practice to tag a class that has been 100% covered with an arbitrary [FullCovered] attribute, because it documents that the class is, and must remain, 100% covered. This will help the developers in charge of any future refactoring.
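Such an attribute is trivial to define yourself; the sketch below assumes nothing more than a home-made marker attribute and a hypothetical class to decorate.

    using System;

    // A home-made marker attribute: [FullCovered] is a team convention,
    // not a framework or NUnit type.
    [AttributeUsage(AttributeTargets.Class)]
    public sealed class FullCoveredAttribute : Attribute { }

    // Usage: documents that this (hypothetical) class is, and must remain, 100% covered.
    [FullCovered]
    public class OrderValidator
    {
        public bool IsValid(decimal amount) { return amount > 0; }
    }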

It is very hard to break code that has been 100% covered by tests and that contains all the relevant contracts. This comes close to the seemingly magical state of having bug-free code.

Personally I consider that the job is done only when all those classes that can be covered by tests are indeed 100% covered by tests.

Test Execution

A fundamental difference between unit tests and integration tests is that unit tests are in-process only: they don’t require touching a database, the network, or the file system. The distinction is useful because unit tests are usually one or several orders of magnitude faster to run. As a side note, it would be more precise to say that unit tests often rely on mocked objects to avoid accessing out-of-process resources (DB, file, network…), while still focusing on testing the code that triggers access to those resources.

Tests related to a piece of code that is just written, or just refactored, must be run from the developer machine before the code is committed.

Tests should be fast to run, taking a few minutes at the most, because tests must also be run often on the developer machine so as to detect regressions as soon as possible, even before committing code.

By having a one-to-one correspondence between tested class and test class, along with 100% coverage, it is then easy to identify which tests to run for a particular piece of code.

Test and Refactoring

Refactoring is an essential activity for maintaining a clean code base and sustaining a long-term development effort. But refactoring comes at the high price of potentially breaking code that was initially working fine. This price can be especially high if the piece of code to be refactored is deemed brittle and the developers who originally wrote it are no longer part of the team. This is where tests come into play. Something that experience has shown me countless times is that code that is 100% covered by tests, and that contains all relevant contracts, can be refactored at will. In these conditions, there is very little chance of missing a behaviour regression; this is a consequence of the earlier remark that such code is pretty close to bug-free. Hence tests not only deliver high correctness, but also greatly assist in maintaining a sustainable rhythm of development.

When some refactored code is not properly tested, it is a good practice to write the tests that should have been written in the first place. This is a good place to make a distinction between the different types of refactoring we are talking about:

  • In the case of a major refactoring of a portion of code or even a component, it is really worth taking the time to write tests properly and achieve 100% coverage.
  • In the case of many minor refactorings, as when fixing several pesky bugs in a row, it is nice to write tests that cover the changes. But if the code that has been refactored was not covered in the first place, it almost inevitably means that it is not easily testable. Common sense tells us that refactoring a large section of code in haste, just to cover a newly-refactored line, is not the right thing to do. Here, we are talking about bug-prone code that hasn’t been tested: if there aren’t enough resources for a proper refactoring now, then sooner or later the unavoidable need to refactor it entirely will arrive.

Test and Code Evolution

Normally, in development work, the team has a stable and robust version of the application working well in production. One can expect that all code untouched since this stable release works reasonably well. In an ideal agile world, this code is 100% covered by tests and can be assessed as correct automatically, just by running the corresponding tests. In the real world, this untouched code is only partially covered by tests, but the knowledge that it doesn’t cause problems in production is valuable in itself. Indeed, by relying on this information, the team can focus the testing effort on newly written or recently refactored code. Clearly, the amended and new code is more likely to contain bugs than untouched code that works fine in production.

Also, for any sufficiently mature code base, code that is altered or added between two public releases represents a small fraction of the code base. Therefore, as all developers intuitively know, this perimeter represents a relatively small spot of bug-prone code on which to concentrate testing resources. This underlines the fact that the team must not only identify accurately which code has been touched and developed, but also carefully look at the code-coverage ratio of this hot code area.

Test and Design

If you are finding it difficult to cover certain code with unit tests, it needs to be redesigned to be easily coverable by unit tests. This problem is often solved by creating one or more classes to handle the logic that was formerly implemented by the code that was hard to test. At this point, it is often worth revisiting the list of common design patterns, such as MVC, to see if one or several of them can help in properly redesigning the code.

The previous point can be restated this way: writing a complete test suite naturally leads to good design, enforcing low coupling and high cohesion.

Singletons should be prohibited because they make the code hard to test. It is harder to recycle and re-initialize a singleton object for each test than it is to create a fresh object for each test.
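A brief sketch of why, with invented names: the static singleton drags its state from one test to the next, whereas a plain class can be re-created in [SetUp] so every test starts from a known state.

    using NUnit.Framework;

    // Hard to test: the single static instance carries state from one test to the next.
    public sealed class ThemeSingleton
    {
        public static readonly ThemeSingleton Instance = new ThemeSingleton();
        private ThemeSingleton() { }
        public string Theme = "Light";
    }

    // Easier to test: a plain class that each test fixture can re-create.
    public class ThemeSettings
    {
        public string Theme = "Light";
    }

    [TestFixture]
    public class ThemeSettingsTests
    {
        private ThemeSettings _settings;

        [SetUp]
        public void SetUp()
        {
            _settings = new ThemeSettings();   // fresh, predictable state for every test
        }

        [Test]
        public void DefaultTheme_IsLight()
        {
            Assert.AreEqual("Light", _settings.Theme);
        }
    }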

Mocking should be used to transform integration tests into unit tests by abstracting the tests from out-of-process concerns such as database access, network access, or the file system. In these conditions, mocking helps to separate concerns in the code and hence increases the value of the design. Mocking is often made easier by Dependency Injection principles, so if you are unsure about these, learning or revisiting them might be a good idea.
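As an illustration under these assumptions (the interface and class names are invented), the out-of-process dependency is hidden behind an interface injected through the constructor; a hand-rolled fake keeps the test in-process, and a mocking library such as Moq could generate an equivalent test double.

    using NUnit.Framework;

    // The out-of-process dependency is abstracted behind an interface...
    public interface IOrderStore
    {
        void Save(string orderId);
    }

    // ...the production implementation would hit the database; the class under
    // test only depends on the abstraction (constructor injection).
    public class OrderProcessor
    {
        private readonly IOrderStore _store;
        public OrderProcessor(IOrderStore store) { _store = store; }

        public void Process(string orderId)
        {
            // ...domain logic...
            _store.Save(orderId);
        }
    }

    // A hand-rolled fake keeps the test in-process.
    public class FakeOrderStore : IOrderStore
    {
        public string LastSaved;
        public void Save(string orderId) { LastSaved = orderId; }
    }

    [TestFixture]
    public class OrderProcessorTests
    {
        [Test]
        public void Process_SavesTheOrder()
        {
            var store = new FakeOrderStore();
            new OrderProcessor(store).Process("A-42");
            Assert.AreEqual("A-42", store.LastSaved);
        }
    }

The constructor injection is what makes the substitution possible: the test decides which implementation the class under test talks to, without the class knowing or caring.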

Mocking is also useful for isolating code that is completely uncoverable by tests, such as calls to MessageBox.Show().
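One common way to do this, sketched with hypothetical names, is to wrap the call behind a small interface so that unit tests substitute a fake and only a one-line production adapter remains untested.

    // The uncoverable call is pushed behind an interface; the production adapter
    // is shown as a comment to avoid a WinForms dependency in this sketch.
    public interface IMessageService
    {
        void ShowError(string text);
    }

    // Production adapter, excluded from the coverage goal:
    //   public class MessageBoxService : IMessageService
    //   {
    //       public void ShowError(string text) { MessageBox.Show(text); }
    //   }

    // Test double substituted in unit tests:
    public class FakeMessageService : IMessageService
    {
        public string LastError;
        public void ShowError(string text) { LastError = text; }
    }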

Test Tooling

Personally I am happy with:

  • NUnit: I like the fact that only NUnit.Framework.dll needs to be referenced from the tests project and that NUnit is supported by all testing tools.
  • NCover: This is the fastest code coverage tool available. Its performance is almost magical: in coverage mode, tests execute almost as fast as without coverage. Contrary to VS coverage, the assembly instrumentation phase is completely transparent to the user, and the coverage result files are easier to work with. NCover is also very easy to use and integrates with most testing tools.
  • TestDriven.NET: Flawless VS integration made me addicted to it. Resharper also comes with cool testing facilities, but my addiction to the simplicity of TD.NET is too strong.

When it comes to testing, NDepend is especially suited to creating rules that continuously check code coverage, ensuring that new and refactored code is properly covered by tests, and navigating between tested code and tests across multiple VS solutions opened in multiple VS instances.

It is also worth keeping an eye on tools such as .NET Demon, ContinuousTests and NCrunch (currently all in beta) that can identify which tests to run when a portion of code is touched, and run them in the background.