More tests are not always good – Why you should stop at 100% test coverage

When I started in the programming industry almost a decade ago, writing tests at all was still a fairly new practice for many developers. As one of the early TDD advocates in my social environment, I recommended many best practices and books to students and colleagues. Kent Beck, Robert Martin and many other more or less public figures provided invaluable advice and did a lot of persuading with their books and talks, and they continue to do so. Today I enjoy a situation where you almost have to justify every test you don’t write.

The basics – Why should you write tests?

In order to see which tests you should not write, we should first clarify why tests are good. Tests are a powerful tool to

  • Guard against regression
  • Provide documentation

These are the two main reasons for tests that will make your project fly, and they are powerful enough so that you should change your working habits in order to get as many tests as sensible.

Why should you limit your tests?

However, tests are not a value in themselves. The payload of a programming job is the production code, not the tests. Just like rocket fuel, tests also add drag. This is why you should limit tests to what is really necessary:

  • Drag 1: They pin down method signatures and class structures
  • Drag 2: They pin down behaviour

Pinning down means that every use of a method or behaviour makes it more difficult to change. If you change a method with a TDD mindset, you go through the following steps:

  1. Locate the tests above your change point
  2. Execute them all to see them pass
  3. Change the appropriate test so that it expects the new behaviour(*)
  4. Make sure your code base compiles
  5. Execute the tests to see the changed one fail
  6. Apply the change to the method
  7. Execute the tests and deal with the consequences (i.e. change other tests that relied on the old behaviour)

In that list, steps 4 and 7 deserve special attention. Step 4 corresponds to drag 1, the pinning down of method signatures and class structures. Step 7 corresponds to drag 2, the pinning down of behaviour, but it also often suffers from drag 1.

The more code (including tests) explicitly relies on the same method signature, the more work you will have in steps 4 and 7.
The more tests rely, implicitly or explicitly, on the same behaviour, the more work you will have in step 7.

(*) If the change is driven from inside the production code, have a look at Kent Beck’s description of Refactoring by Example.
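To make the two drags concrete, here is a minimal Python sketch (the function `parse_price` and its values are invented for illustration): the test’s call pins down the signature, and its assertion pins down the behaviour.

```python
def parse_price(text):
    """Parse a price string like ' $3.50 ' into a float."""
    return float(text.strip().lstrip("$"))

def test_parse_price():
    # Drag 1: this call relies on the exact signature "text -> float";
    # renaming the function or adding a parameter breaks it.
    # Drag 2: the assertion relies on the exact behaviour; changing
    # what parse_price returns breaks it.
    assert parse_price(" $3.50 ") == 3.50

test_parse_price()
```

Every additional test that calls `parse_price` adds to the same drag, which is the argument for stopping once the behaviour is covered.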

How can one achieve thorough tests without overly pinning down a certain behaviour?

First of all, stop writing tests once you are at 100% code coverage. Do not include a class in a unit test when that class already has 100% code coverage.
If that class is a dependency that is used throughout your system, decouple it and test it separately with a mocking framework. Be aware that mocking is a white-box testing technique which pins down method signatures and class structures.
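A minimal sketch of this decoupling in Python, using the standard-library `unittest.mock` (the `Checkout` class and its gateway dependency are hypothetical):

```python
from unittest.mock import Mock

class Checkout:
    """Depends on a payment gateway; only Checkout is under test here."""
    def __init__(self, gateway):
        self.gateway = gateway

    def pay(self, amount):
        return "paid" if self.gateway.charge(amount) else "declined"

# The real gateway (assumed to already have 100% coverage) is
# replaced by a mock, so this test does not exercise it again.
gateway = Mock()
gateway.charge.return_value = True

assert Checkout(gateway).pay(10) == "paid"

# The white-box cost: the mock pins down the name and signature of
# charge(); renaming that method breaks this test.
gateway.charge.assert_called_once_with(10)
```

Note how the last line makes the drag visible: the test now encodes structural knowledge of the dependency, not just observable behaviour.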

How can one achieve thorough tests without overly pinning down method signatures or class structures?

This can be done by reducing explicit usage of a method or class. Do not include separate tests for implementation details. To achieve this, prefer black box testing over white box testing. Sometimes it is enough to provide tests for the API and omit lower-level tests for internals that would just test the same behaviour from a different perspective.
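A small Python sketch of the difference (the `Stack` class is invented for illustration): the test touches only the public API, so the internal representation stays free to change.

```python
class Stack:
    def __init__(self):
        self._items = []        # implementation detail, not tested directly

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()

# Black-box test: only push() and pop() are exercised. Swapping the
# internal list for a linked list would leave this test untouched.
s = Stack()
s.push(1)
s.push(2)
assert s.pop() == 2
assert s.pop() == 1
```

A white-box test asserting on `s._items` would cover the same behaviour from a second angle while additionally pinning down the internal structure.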

Can I just not pin down these things?

You could, but then you would not have tests. Finding the optimum for both test coverage and keeping the drag down is a difficult balance. We have yet to discover the Rocket Equation for unit testing.

It is not always possible to minimize both drags. To get to 100% code coverage you will sometimes have to decide whether it is better to pin down behaviour or method signatures and class structures.


tl;dr: don’t repeat yourself

  • Jerry
    Posted at 4:40 am, February 24, 2014


    I also find it’s a pain to maintain software created by TDD (a failed test first, write every line of production code to pass a failed test), if the requirement is going to change.

    During the implementation of changed requirement, it’s extremely hard to follow TDD (a failed test first, and make changes to production code only to pass one failed test at a time). I described this situation here,

  • Ian Bull
    Posted at 11:32 pm, February 26, 2014

    Re: decouple and test separately (with a mocking framework).

    I find this actually couples the code even more, once with the real implementation and once with the mocking framework.

    Example: Say you have two classes (client and business). The business has all the business logic (and 100%) coverage. When testing the client, you can either use the business class, or a mock (where the mock needs to mimic what the business would do). Now, what if the business logic changes? What if the client ‘expected’ the old behaviour.

    Since the client is using mocks, those tests will continue to pass, however, in production it will fail. To avoid this, the business class developers will need to update the mocks to continue to mimic the same behaviour.

    Thoughts? Or maybe I missed something.

  • Matthias Kempka
    Posted at 9:54 am, February 27, 2014

    Ian, I completely agree. This is the part of “dealing with consequences” that is spilling over from drag 1, but much better explained. Thanks.
