More tests are not always good – Why you should stop at 100% test coverage
When I started in the programming industry almost a decade ago, writing tests at all was still a fairly new thing for many developers. As one of the early TDD advocates in my social environment, I recommended many best practices and books to students and colleagues. Kent Beck, Robert Martin and many other more or less public figures provided invaluable advice and did a lot of persuading with their books and talks, and they continue to do so. Today I enjoy a situation where you almost have to justify every test you don’t write.
To see which tests you should not write, we should first clarify why tests are good. Tests are a powerful tool to:
- Guard against regression
- Provide documentation
These are the two main reasons for tests that will make your project fly, and they are powerful enough that you should change your working habits to get as many tests as is sensible.
Why should you limit your tests?
However, tests are not a value in themselves. The payload of a programming job is the actual code, not the tests. Just like rocket fuel, tests also add drag. This is why you should limit tests to what is really necessary:
- Drag 1: They pin down method signatures and class structures
- Drag 2: They pin down behaviour
Pinning down means that every use of a method or behaviour makes it more difficult to change. If you change a method with a TDD mindset, you go through the following steps:
1. Locate the tests above your change point
2. Execute them all to see them pass
3. Change the appropriate test so that it expects the new behaviour(*)
4. Make sure your code base compiles
5. Execute the tests to see the changed one fail
6. Apply the change to the method
7. Execute the tests and deal with the consequences (i.e. change other tests that relied on the old behaviour)
Steps 4 and 7 are the ones to watch in that list. Step 4 corresponds to drag 1, the pinning down of method signatures and class structures. Step 7 corresponds to drag 2, the pinning down of behaviour, but it often suffers from drag 1 as well.
The more code (including tests) explicitly relies on the same method signature, the more work you will have in steps 4 and 7.
The more tests rely implicitly or explicitly on the same behaviour, the more work you will have in step 7.
(*) If the change is driven from inside production code, have a look at Kent Beck's description of Refactoring by Example.
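The two drags can be illustrated with a minimal sketch. The functions `format_price` and `receipt_line` and both test classes are hypothetical names invented for this example; they are only meant to show one explicit and one implicit dependency on the same method.

```python
import unittest


def format_price(amount_cents):
    """Hypothetical production function: render cents as a decimal string."""
    return f"{amount_cents / 100:.2f}"


def receipt_line(name, amount_cents):
    """Hypothetical caller that relies on the output of format_price."""
    return f"{name}: {format_price(amount_cents)}"


class FormatPriceTest(unittest.TestCase):
    # Calls format_price directly: pins its name and parameter list
    # (drag 1) as well as its output format (drag 2).
    def test_formats_cents_as_decimal(self):
        self.assertEqual(format_price(1999), "19.99")


class ReceiptLineTest(unittest.TestCase):
    # Never mentions format_price, yet still depends on its output
    # format: an implicit pin on behaviour (drag 2 only).
    def test_line_contains_name_and_price(self):
        self.assertEqual(receipt_line("Tea", 250), "Tea: 2.50")
```

Adding a parameter to `format_price` would force edits in `FormatPriceTest` and in `receipt_line` before anything runs again (step 4). Changing its output format would make both tests fail when the suite is executed (step 7), even though only one of them names the method.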
How can one achieve thorough tests without overly pinning down a certain behaviour?
First of all, stop writing tests when you are at 100% code coverage. If a class already has 100% code coverage, do not pull it into yet another unit test.
If that class is a dependency that is used throughout your system, decouple it and test it separately with a mocking framework. Be aware that mocking is a form of white box testing, which pins down method signatures and class structures.
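As a rough sketch of that trade-off, the following uses `Mock` from Python's standard `unittest.mock` module. `OrderService` and its `tax_for` collaborator method are hypothetical names chosen for illustration, not part of any real code base.

```python
from unittest.mock import Mock


class OrderService:
    """Hypothetical class under test; depends on a widely used tax calculator."""

    def __init__(self, tax_calculator):
        self._tax_calculator = tax_calculator

    def total(self, net_amount):
        return net_amount + self._tax_calculator.tax_for(net_amount)


def test_total_adds_tax():
    # The real tax calculator has its own tests; here a mock replaces it,
    # so its behaviour is not pinned down a second time.
    tax_calculator = Mock()
    tax_calculator.tax_for.return_value = 19

    assert OrderService(tax_calculator).total(100) == 119

    # This is the white box part: the test names the collaborator's method
    # and its exact arguments, so renaming tax_for or adding a parameter
    # breaks the test even if the total is still correct.
    tax_calculator.tax_for.assert_called_once_with(100)
```

The mock removes the duplicated behaviour check, but the last assertion shows what is paid for it: the test now knows how `OrderService` talks to its dependency.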
How can one achieve thorough tests without overly pinning down method signatures or class structures?
This can be done by reducing explicit usage of a method or class. Do not include separate tests for implementation details. To achieve this, prefer black box testing over white box testing. Sometimes it is enough to provide tests for the API and omit lower-level tests for internals that would just test the same behaviour from a different perspective.
Can I just not pin down these things?
You could, but then you would not have tests. Finding the optimum for both test coverage and keeping the drag down is a difficult balance. We have yet to discover the Rocket Equation for unit testing.
It is not always possible to minimize both drags. To get to 100% code coverage you will sometimes have to decide whether it is better to pin down behaviour or method signatures and class structures.
tl;dr: don’t repeat yourself