When you start learning to write unit tests and practice Test-Driven Development, you will hear all these wonderful things about tests. They will increase your code quality. They will change the way you write code for the better. By the end of the book, they’ve taught you 137 ways to write a unit test and absolutely nothing about changing the way you write code.
That catches up with you very quickly as you discover that most of the code you have in your current system is painful or impossible to write tests for. No one who teaches TDD also teaches you how to write code that is easy to test. And to add insult to injury, even if you are able to wrap a few tests around some gnarly ball of spaghetti, those tests may fail intermittently even when nothing is broken.
Test coverage is also difficult to achieve for poorly designed code, and both Gary Bernhardt and J.B. Rainsberger explain why.
If you have N-conditionals in your program, N-branches, N-decisions, you have 2^N paths. If you have 500 conditionals in your program this is a number with 150 digits in it. That’s too large a number. It’s too hard to decide which of those paths are worth testing.
Gary Bernhardt, Boundaries [00:05:40]
The problem is not that you need this many tests. The problem is that you don’t have the first chance in hell of writing this many tests. So how many tests are you really gonna write? By my rough calculations you will write somewhere between 1% and 80% of the tests that you need. And you don’t have the faintest idea whether you are closer to 1% or 80%. That my friends is not good enough. We can do better.
J.B. Rainsberger, Integrated Tests Are A Scam [00:21:20]
If you have to rely on a static analysis tool to know what your test coverage is, you have already failed.
There are two major reasons that code is difficult to test: code that has dependencies and code that makes decisions.
When your code calls other code, you have two choices for how to test it, and each one is worse than the last. You can write an isolated test or an integrated test. To write the isolated test, you have to use a mocking framework to set up mock dependencies and stub out simulated return values. Even with dependency injection (DI), these tests are not fun to write. If you decide to write the integrated test, it gets worse. The number of tests you have to write to achieve proper test coverage explodes. Integrated tests are often orders of magnitude slower and sometimes fail intermittently even when nothing is broken.
I consider control structures, mathematical operations, and string manipulation to be “decisions” made by code. When you add an if statement to your code, there are now two possible paths for evaluation/execution: one for the if block and one for the else block. The code will “decide” which path to execute at runtime. Switches add one path for each case. Loops add at least three paths: zero, one, and many iterations. In order to make sure that all paths are functioning correctly, you will have to write a test for each. Mathematical operations and string manipulation are also examples of code that may require tests with several possible inputs to achieve proper test coverage.
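As a rough illustration of counting paths, here is a minimal sketch. The Paths class and its describe and total methods are hypothetical names invented for this example.

```java
// Hypothetical example: every name here is invented for illustration.
final class Paths {
    // One if/else: two paths, so two tests.
    static String describe(int balance) {
        if (balance < 0) {
            return "overdrawn";
        } else {
            return "ok";
        }
    }

    // One loop: at least three paths (zero, one, and many iterations).
    static int total(int[] prices) {
        int sum = 0;
        for (int price : prices) {
            sum += price;
        }
        return sum;
    }
}
```

Covering every path in this sketch takes five tests: two for describe (negative and non-negative balance) and three for total (an empty array, one element, several elements).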
Most code doesn’t just do one of these things; it combines the worst of both worlds. Writing tests for code that has many paths through it and calls lots of other code is a nightmare. You have to write hundreds of tests, each of them long and tedious because of the mocks and stubs you have to set up.
The first rule: code can either have dependencies, or it can make decisions, but never both.
Basically, don’t cross the streams.
A Few Long Tests
If your code calls other code, then there can only be one single path through that code. No math, no control structures. If you’re calling other code, then the only thing you should be doing is calling other code and passing the output from the previous calls to the next calls. This way, even if you have to do a great deal of test setup for mocks and stubs, you will only have to write that test once.
    // Mock the three collaborators.
    RequestValidator validator = Mockito.mock(RequestValidator.class);
    DTOConverter converter = Mockito.mock(DTOConverter.class);
    UserService userService = Mockito.mock(UserService.class);

    // Values that flow between the collaborators.
    Request request = new Request();
    RequestDAO requestDao = new RequestDAO();
    Response response = new Response();

    // Stub out the simulated return values.
    when(validator.validate(request)).thenReturn(true);
    when(converter.convertToDAO(request)).thenReturn(requestDao);
    when(userService.processRequest(requestDao)).thenReturn(response);

    // Exercise the subject under test.
    UserController userController = new UserController(validator, converter, userService);
    Response actualResponse = userController.postRequest(request);

    // Check that each collaborator was called and the response was passed back.
    verify(validator).validate(request);
    verify(converter).convertToDAO(request);
    verify(userService).processRequest(requestDao);
    assertThat(actualResponse, is(response));
    // Please don't make me do that again.
That’s an exhausting amount of setup and the subject under test only has three collaborators in this case. Imagine how much more fun it will be if you have 15 dependencies.
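For reference, here is a minimal sketch of what a controller satisfying the test above might look like. The type and method names are taken from that test, but their bodies and the interface shapes are assumptions, since the original implementation is not shown. In particular, this sketch assumes the validator signals failure on its own (for example by throwing), so the controller never branches on the returned boolean.

```java
// A sketch only: the shapes of these types are assumed from the test above.
class Request {}
class RequestDAO {}
class Response {}

interface RequestValidator { boolean validate(Request request); }
interface DTOConverter { RequestDAO convertToDAO(Request request); }
interface UserService { Response processRequest(RequestDAO requestDao); }

final class UserController {
    private final RequestValidator validator;
    private final DTOConverter converter;
    private final UserService userService;

    UserController(RequestValidator validator, DTOConverter converter, UserService userService) {
        this.validator = validator;
        this.converter = converter;
        this.userService = userService;
    }

    // One path: call each collaborator in order and hand the output along.
    // No if, no loop, no math.
    Response postRequest(Request request) {
        // Assumption: an invalid request is rejected inside the validator itself.
        validator.validate(request);
        RequestDAO requestDao = converter.convertToDAO(request);
        return userService.processRequest(requestDao);
    }
}
```

Because postRequest has exactly one path, the single long test is all the coverage it needs.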
Many Short Tests
If the code makes decisions, it will require you to write many tests for proper coverage. Make your life easier by not calling other code from within those branches. Writing 127 tests with mocks and stubs is never going to happen, but writing 127 tests that are all one liners is a breeze. This function is easy to write tests for because it doesn’t call other code:
    assertThat(sum(0, 0), equalTo(0));
    assertThat(sum(1, 0), equalTo(1));
    assertThat(sum(0, 1), equalTo(1));
    assertThat(sum(1, 1), equalTo(2));
    assertThat(sum(10, 200), equalTo(210));
    // I can do this all day.
Don’t Nest Decisions
The second rule is to avoid nesting decisions. Say you have three bits of code that each have ten paths. If you nest those decisions, it will require 10 * 10 * 10 paths and 1000 tests. If you flatten those decisions you have 10 + 10 + 10 or 30 tests. You can write 30 tests and get to 100% test coverage. You’re never going to write 1000 tests.
This is why mixing decisions with dependencies explodes the number of tests we have to write. When code has several paths and, within one of those paths, calls other code that also has multiple paths, the path counts multiply. But even if you aren’t calling other code, you can still make the same mistake if you nest decisions within other decisions. Nesting loops and control structures does the same combinatorial damage. Try to keep it flat. The Linux kernel coding style, which Linus Torvalds wrote, sets indentation at 8 characters and says that if you need more than 3 levels of indentation, you’re screwed anyway, and should fix your program.
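The multiply-versus-add arithmetic can be sketched on a smaller scale. The shipping example below is hypothetical: every enum, method name, and price is invented for illustration. Three decisions with three paths each would need 3 * 3 * 3 = 27 tests if nested; kept flat, they need 3 + 3 + 3 = 9, plus one test for the single-path method that wires them together.

```java
// Hypothetical shipping example: all names and prices are invented.
enum Size { SMALL, MEDIUM, LARGE }
enum Region { DOMESTIC, CONTINENTAL, OVERSEAS }
enum Speed { STANDARD, EXPRESS, OVERNIGHT }

final class ShippingQuote {
    // Decision 1: three paths, three one-line tests.
    static int baseCents(Size size) {
        switch (size) {
            case SMALL: return 500;
            case MEDIUM: return 800;
            default: return 1200; // LARGE
        }
    }

    // Decision 2: three paths, independent of size and speed.
    static int addRegion(int cents, Region region) {
        switch (region) {
            case DOMESTIC: return cents;
            case CONTINENTAL: return cents + 400;
            default: return cents + 1500; // OVERSEAS
        }
    }

    // Decision 3: three paths, independent of size and region.
    static int addSpeed(int cents, Speed speed) {
        switch (speed) {
            case STANDARD: return cents;
            case EXPRESS: return cents + 700;
            default: return cents + 2000; // OVERNIGHT
        }
    }

    // One path: only calls other code and passes each output to the next call.
    static int quoteCents(Size size, Region region, Speed speed) {
        return addSpeed(addRegion(baseCents(size), region), speed);
    }
}
```

Each decision can be tested on its own with one-liners, and quoteCents needs only one test to confirm the wiring.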
The combinatorics problem of code complexity comes up in the Software Talks over and over and over. Gary Bernhardt, J.B. Rainsberger, Jonathan Blow, Chad Fowler, Rich Hickey, and the paper Out of the Tar Pit all discuss how fatal this mistake is and how to avoid it.
Later I will cover specific patterns and tactics for flattening decision trees as well as a lesson on how to organize code that has dependencies and code that makes decisions into useful modules.