
Unit testing revelations

on Feb 17th, 2009

The other day I experienced an unexpected light bulb moment concerning unit testing. Maybe this one is obvious to most of you, but I wish someone had told me earlier. So here goes.

My biggest gripe with unit testing has been that I couldn’t get satisfactory answers to these two questions:

  1. Why should I practice Test-First?
  2. How do you test the tests?

Concerning the first issue: we discussed some papers that were trying to answer that question when I attended a software quality course at university. The gist of the results was: There is no statistically significant difference in code quality between Test-First and Test-Later. (Sorry, can’t find links to the papers atm. Holler if you want me to find them and I’ll do some more digging.)

The second issue is discussed often as well: If tests are code and code should be tested, doesn’t that lead to more and more tests? This is sometimes referred to as the stack overflow of unit testing.

The revelation I had was this: These two questions answer each other! What’s the easiest way to test a test? A broken implementation. Where do you get a broken implementation? Just use an incomplete implementation. If you practice Test-First, all your implementations start off incomplete by definition. This means that each assertion in your test is guaranteed to fail at least once, giving you the confidence that the assertions actually perform significant work.
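To make that concrete, here is a minimal sketch, assuming Python's standard unittest module. The two slugify functions are hypothetical and exist only for illustration. The same two assertions are run first against a deliberately incomplete stub and then against the finished code:

```python
import unittest


def slugify_incomplete(title):
    """Hypothetical first stub: written only so the tests can run."""
    return ""


def slugify_finished(title):
    """Hypothetical finished version: lowercase words joined by hyphens."""
    return "-".join(title.lower().split())


def make_test_case(slugify):
    """Build the same test case against whichever implementation is passed in."""

    class SlugifyTest(unittest.TestCase):
        def test_lowercases(self):
            self.assertEqual(slugify("Hello"), "hello")

        def test_joins_words_with_hyphens(self):
            self.assertEqual(
                slugify("Unit testing revelations"),
                "unit-testing-revelations",
            )

    return SlugifyTest


def run(slugify):
    """Run the tests and return (tests run, tests that failed)."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(make_test_case(slugify))
    result = unittest.TestResult()
    suite.run(result)
    return result.testsRun, len(result.failures)


red = run(slugify_incomplete)   # both assertions fail against the stub
green = run(slugify_finished)   # the same assertions pass once the code is done
print(red, green)               # (2, 2) (2, 0)
```

The failing first run is the "test of the test": each assertion has been seen to reject a wrong implementation before it is trusted to accept a right one.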

I am often surprised that assertions that I expect to fail actually go through just fine. This usually means one of two things: Either the functionality is already there by some fluke (cf. accidental correctness), or my test is incorrect. In either case I can then fine-tune and adapt the assertions to make sure that they fail and thus test some missing functionality.
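A small sketch of that situation, again in Python and with hypothetical names (dedupe_stub, weak_check and tightened_check are made up for this example). The first check passes against the incomplete stub because its input never exercises the missing behaviour; the tightened check fails as it should:

```python
def dedupe_stub(items):
    """Hypothetical incomplete implementation: returns the input unchanged."""
    return list(items)


def dedupe_finished(items):
    """Hypothetical finished implementation: drops repeats, keeps first-seen order."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


def weak_check(dedupe):
    # The input has no duplicates, so the stub's pass-through answer is already right.
    return dedupe([1, 2, 3]) == [1, 2, 3]


def tightened_check(dedupe):
    # The input contains a duplicate, so only a real implementation can pass.
    return dedupe([1, 1, 2]) == [1, 2]


print(weak_check(dedupe_stub))           # True: passes "by fluke", proves nothing yet
print(tightened_check(dedupe_stub))      # False: now the check tests missing functionality
print(tightened_check(dedupe_finished))  # True
```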

Two conundrums solved.

4 Responses to “Unit testing revelations”

  1. Nilanjan says:

    Sweet, welcome to the test-infected developers club. There is no way to learn TDD by just reading books. The only way is to practice and pair with an experienced developer, and eventually, within a week or two, you will have the light bulb moment.

  2. There’s definitely a difference in code readability when doing test first. So much so that it’s easy to make out whether the code was developed test-first just by looking at it.

    It’s hard to prove this statistically, but test-first leaves me with a good feeling: I know the code works and is readable.

  3. There is a very big difference between test first and test later.

    Test first forces you to think about a class/method/function/whatever from the point of view of clients of that code, i.e. how it is going to be used rather than how I am going to implement it.

    Also, if you find that the test requires too much setup, fixture code, etc., then it’s telling you there is a problem with the class: there is too much coupling going on, and there is almost always something missing. This also helps design.

    My experience of teams that practice test later is usually a rapid degradation to functional tests only and then to testing nothing, mainly because the coupling and code quality make it harder and harder to test.

    Test later also means missing things: how are you going to be sure that you’ve covered everything? I don’t mean in terms of code coverage but in terms of intent.

    I am interested to know how the studies you mentioned measured quality, which is very difficult to do in a meaningful way.

  4. manuel says:

    Alright, I dug up one of the papers (which cites other papers concerning TDD). Their conclusion seems to be that while the quality is largely the same, the productivity is better with Test-First.

    http://iit-iti.nrc-cnrc.gc.ca/publications/nrc-47445_e.html

    The way most of these studies assess and measure “quality” is by having a large set of test vectors and corresponding expected (i.e. known good) results. This set is not known to the people doing the implementations. They may sometimes have a small subset as basic acceptance tests. With this kind of setup, “quality” can be measured as the number of passing tests from the hidden set. This of course assumes that quality is correctness. There may be other aspects of “quality” that are not considered in this setup, e.g. readability, flexibility, performance, modularity, etc. So as I said, in this case “quality” = “correctness”.

    A big problem with all of these studies is external validity (the authors are quite forthright in admitting that), so these results should be taken with a healthy grain of salt, which is generally a good idea with these kinds of topics.
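The hidden-test-set measurement described in the last comment can be sketched roughly as follows. This is only an illustration under assumptions: the scoring function, the integer-division task, the test vectors and the two submissions are all hypothetical and are not taken from the cited paper.

```python
def correctness_score(implementation, hidden_vectors):
    """Fraction of hidden (arguments, expected) pairs the implementation gets right."""
    passed = 0
    for args, expected in hidden_vectors:
        try:
            if implementation(*args) == expected:
                passed += 1
        except Exception:
            # A crash counts as a failed vector, not as an aborted measurement.
            pass
    return passed / len(hidden_vectors)


# Hypothetical hidden set for a "safe integer division" task.
HIDDEN_VECTORS = [
    ((6, 3), 2),
    ((7, 2), 3),
    ((-7, 2), -4),
    ((1, 0), None),  # assumed spec: dividing by zero yields None
]


def submission_a(a, b):
    """Hypothetical submission that forgets the divide-by-zero case."""
    return a // b


def submission_b(a, b):
    """Hypothetical submission that handles the divide-by-zero case."""
    return None if b == 0 else a // b


print(correctness_score(submission_a, HIDDEN_VECTORS))  # 0.75
print(correctness_score(submission_b, HIDDEN_VECTORS))  # 1.0
```

A score like this captures correctness only; readability, flexibility and the other aspects mentioned above do not show up in it.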

© EclipseSource 2008 - 2011