The interview of episode 5 is with Willem van den Ende. He goes into quite some detail about what functional languages bring to the TDD table and where you might handle development in those languages differently.

Willem is a long-time practitioner of all things Agile. He is one of the founding fathers of XP Days Benelux and still very much into training, coaching and practising software development, which he has done in a dozen languages (or more). You can learn more about Willem by following him on Twitter.

Q: When was your first contact with TDD and what did you think about it at the time?

Willem: I started reading the C2 wiki, at that time known as the Portland Pattern Repository, shortly after it got started in 1996 or 1997. Various people were writing about what would later become Extreme Programming. I vaguely remember reading the pages about test-first programming and thinking 'that looks interesting, but I don't quite get how it works'.

Refactoring and iterative development resonated more with me at the time. I had just removed some defects from a fresh legacy C++ GUI by removing 90% of the code while keeping the functionality. Other people on that project were writing end-to-end tests, and we would run them every night on all workstations in the company. That had great value for finding defects, but because the tests were written after the fact, with little communication in the development team, they had negligible impact on the design of the rest of the system. 'My' GUI fell outside of that, so I ended up doing piecemeal refactorings in the Solaris C++ debugger. It allowed modifications in the implementation with reloading, as long as you did not change any interfaces.

Q: What did eventually convince you that TDD is a worthwhile approach?

Willem: This was several years later, in 1999 or 2000, after Kent Beck's eXtreme Programming book finally came out. A colleague, Erik Groeneveld, and I had a bit of time on our hands, so we sat together and did a small exercise. It only took an hour and a half. We were going: 'that is very interesting, I have no idea how it works, but the design we ended up with was completely different from what we had planned, and much better'. I've used that sitting together for an hour and a half a lot in coaching and teaching. It does not take long to get a feel for how TDD works, and get hooked enough to try it in more situations. After that it is difficult to go back and do what you did before (most of the time).

Q: What has changed in the way you practice and teach TDD since the early days?

Willem: I delete tests more often after I'm done building up a design, keeping only the more interesting ones. I worry much less about integration and acceptance tests. I prefer to put software in the hands of the users. I've been teaching ATDD and BDD from the beginning. In projects I find it much more important to go for that elusive Metaphor, and find a common language that all stakeholders can understand. Whether you automate some of that understanding and how, is less important.

When you do ATDD or automate BDD tests, you often end up with a couple of levels of indirection. Maintaining those indirections is not always worth it. I was hugely impressed when Ward Cunningham showed the swimlanes visualization he made when he was working for the Eclipse Foundation. He was making a workflow system, and the acceptance tests were a visualization of workflow examples. My takeaway from that is, if you can find the metaphor, and the coverage that automated customer-facing end-to-end tests can have has value in your situation, make the effort and tailor a solution. So for my most recent project that involved writing customer-facing end-to-end tests for a simulator, we tried to specify end-to-end tests in BDD's given-when-then format, but none of the specs that came out were interesting enough from the customers' perspective, or thorough enough from the developers' perspective. The developers had tried FitNesse before that, and it just did not provide enough details for the clients. The clients could all program, or read programs up to a point (PhDs in physics, econometrics, that kind of thing). The customers wanted to know the calculations that went into the acceptance tests. We ended up writing acceptance tests in Clojure, with a small support library to keep the tests clean and readable.

It was great to sit with the customers and go through the tests we wrote, to check that we were all understanding the same thing, and the simulator was doing what they expected it to be doing. Don't assume your users or clients are dumb, make an effort to find the most expressive way to communicate. And be very selective in the number of automated end-to-end tests you have. For any moderately interesting system, maintaining them while the system changes can be a huge burden. Don't forget that you want to be able to keep evolving. And if you can't find an effective and efficient way to do end-to-end tests, leave it for a while, focus on shipping your software and get feedback from real users. Try end-to-end tests later and see if you can come up with something better. It often takes a few iterations.

In teaching, I spend much more time letting participants figure things out for themselves. We've started explaining TDD with Mock objects through finding responsibilities in your software, and hexagonal architecture as a metaphor for overall system design - as opposed to using Mocks to break nasty dependencies.

I'm considering dropping ATDD and BDD from my trainings. I've found value in unit-level TDD in every single project I've done since learning it. ATDD and BDD much less so; they can easily lead to bureaucracy and test maintenance with little value. So I prefer to make sure we ship first and get feedback from actual use. Show, don't tell. If you think of valuing "working software over comprehensive documentation", is ATDD or BDD in our project the former, or the latter? With frequent and early deliveries I don't have to ask that question. That is not to say that there is no value in ATDD or BDD, I just don't think there is a single way of doing it that is valuable in a large enough number of projects to put into open-enrollment trainings anymore.

Q: Are there situations in which you consider TDD not to be the right approach for developing software?

Willem: When I write a two-line bash script. Or when I can use the 'tests' as the program itself. I've been exploring logic programming in the form of Answer Set Programming. It does not work well for all problems, but when it works, your 'tests' are the program.

I have written somewhat larger programs without tests on purpose. It sort of works for a while, until my programs grow bigger. Adding tests then, and refactoring, is not so much fun.

I also worked on one project where the people and the framework we were using were not very amenable to TDD. In that case I prioritized more frequent delivery, and automating the many manual steps that were needed to make a delivery. I tried TDD first, though, because it is one of the best value-for-effort practices I know, and you don't need much to get started.

Q: What do you think is TDD's relevance in today's world of lean startups, functional and concurrent programming, continuous delivery and mobile everywhere?

Willem: I find that I go faster with TDD than without, and so do most of the projects I've coached - in languages like Java, PHP, C#, Ruby - within a couple of weeks. So I think if you believe, as some people seem to do, that lean startup and TDD are opposites, you may have missed a few tricks.

Most of the time I find TDD cheaper to set up than continuous delivery. So I usually start with TDD and go from there. Having said that, I'm trying to set myself up with a cheap and simple continuous delivery solution, so that I can quickly ship and, equally important, roll back. It's a yes-and situation. Without shipping to users, nothing much matters. But without TDD or a different mechanism to get early feedback on our design and code quality, you cannot continue for long.

Functional programming is a big space, a complex system in and of itself. There are some aspects that can help reduce the number of not-so-interesting tests. You might come across statements like these: "I don't need TDD because I do everything in the REPL". I guess people who make this statement have never used a Smalltalk environment in anger, or worked on a Lisp machine. Both have significantly more interactive power than, say, Clojure's REPL. I just completed a project in Clojure, and yes, the REPL is useful. I have used it a lot. And still I wrote tests first (and sometimes after, since I did try REPL-driven development). Boy, was I glad I had some integration tests when I needed to port the software to Windows near the end of the project. Those were tests I initially had written to drive what API I wanted on the application side, and to find the APIs I needed on the OS side. I was also glad to write tests to figure out how to design some of the more difficult parts of the application.

In Clojure's case, the JVM just is not that good at reloading code, and since your editor lives outside the REPL, it never knows when you've renamed a method; that's why code that you expect to break, because it calls the function by its old name, still runs, and vice versa. You get used to it, so you just reload very often, but it is a very different experience from a Smalltalk environment where you keep working in the same live image for days or weeks at a time, and you can modify all the tools (like the debugger) to your heart's and workflow's content. In Haskell I'm not that fluent; I use the REPL mostly for finding out what types my code has, and for compiling my code as I go, sending it to the REPL.

Even when you have something as highly interactive as a Smalltalk environment, there are limits to growth. I recently worked on a product in Smalltalk. It had only a few integration-like tests, some of which were red all the time, so you could not say it was in a known space. The product had problems growing beyond the initial number of customers for several reasons, some of them non-technical. One of the main technical reasons I could see was that most of its development was done with Smalltalk's super REPL, the debugger. It's great, very dynamic, you can change everything on the fly while debugging, adding methods as you go. In this case that also meant that the logic was very hard to follow. Adding a new programmer on the existing code meant long pair-debugging sessions, and the code was much more complicated than it had to be. We added a new component, dependent on the rest through a clearly defined messaging interface, that was much easier to reason about.

Remember where TDD came from? Oh, right. Smalltalk. That is what Kent, Ward, Ron and Chet programmed in. Dynamic environments are great for TDD, because you can get fast feedback. At the same time you need it, because a year from now you will have forgotten your REPL or debugger session, but your tests will have remembered. So I call bullshit on the REPL argument. Having said that, there are projects like Gorilla-REPL moving the Clojure REPL in a more interactive workbook-like direction. But having something that was not built for interactive use, like the JVM, underneath means that it will take a while. There are people working on making reloading faster and more predictable, but it is going slowly.

There are a lot of potential benefits, and activity, in the functional programming space. Some mean you have to write fewer tests (static or gradual typing, clearly delineated spaces where mutable state and IO go), others mean you can write more powerful tests (Haskell's QuickCheck).
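
As a rough illustration of what a QuickCheck-style test does, the sketch below states a property (decoding an encoded string gives back the original) and checks it against randomly generated inputs instead of a handful of hand-picked examples. It is a hand-rolled Python stand-in using only the standard library, not QuickCheck itself; the run-length functions and check_round_trip are hypothetical names chosen for the example.

```python
import random


def run_length_encode(text):
    """Encode a string as (character, count) pairs, e.g. 'aaab' -> [('a', 3), ('b', 1)]."""
    encoded = []
    for ch in text:
        if encoded and encoded[-1][0] == ch:
            # Same character as the previous run: extend that run.
            encoded[-1] = (ch, encoded[-1][1] + 1)
        else:
            encoded.append((ch, 1))
    return encoded


def run_length_decode(pairs):
    """Inverse of run_length_encode."""
    return "".join(ch * count for ch, count in pairs)


def check_round_trip(trials=200, seed=0):
    """Property: decode(encode(text)) == text for randomly generated strings."""
    rng = random.Random(seed)  # fixed seed so a failure can be reproduced
    for _ in range(trials):
        length = rng.randint(0, 20)
        # A two-letter alphabet makes repeated runs likely.
        text = "".join(rng.choice("ab") for _ in range(length))
        assert run_length_decode(run_length_encode(text)) == text, text
    return trials
```

A real property-based testing library adds things this sketch lacks, such as shrinking a failing input to a minimal counterexample.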

There is another argument I often hear: "I don't need TDD because I have static typing". Don't be fooled by Java's, and to a lesser extent, C#'s half-baked type systems. There are type systems out there that will, similar to TDD, help you drive your design. Having said that, even in Haskell with its advanced type system I still write some tests to drive out interesting behaviour and interfaces. Test-after works slightly better than in, say, Java, because of the advanced type system, and because most of the program consists of simple functions. Nevertheless, I still find working test-first useful to figure out more complicated parts of the program.

When embarking on something new, try to understand how you are working, where your feedback is coming from. Then understand where the 'new' thing is coming from. While trying the new thing out, keep reflecting and check your understanding against your experiences. I did a project in Clojure, because it was a good fit for the people, the project and the problem, and I was curious. Now I'm trying Haskell, because I wanted to try both a dynamic and a more static functional language to get a feel for the differences.

Q: Willem, you've tried out TDD in lots of different environments and on different platforms. Do you see other tools, e.g. in functional languages, that could have similar benefits to TDD with less trouble?

Willem: I've just started a new project using Yesod, a somewhat Rails-like framework in Haskell. Not that I like big frameworks per se, but I don't have enough understanding of Haskell and its ecosystem to collect my own stack out of libraries. Michael Snoyman, Yesod's maintainer, seems to have good taste in selecting libraries and delineating dependencies. One of the reasons for choosing Yesod is that it makes good use of Haskell's type system to prevent mistakes like creating a route named "/hello" in your application, and then making a typo when you create a link in HTML, e.g. '<a href="helo">'. In most frameworks that would require end-to-end tests or manual tests to turn up; here it is a compile error. Also very interesting is using the type system to make sure that two sides of a remote procedure call are using the same data. Yesod together with Fay (a subset of Haskell that compiles to relatively compact JavaScript) looks promising, as well as research in Haste. Again, fewer slow and brittle end-to-end tests to write, leaving more time to focus on interesting aspects of your problem domain.
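
The idea of treating routes as typed values rather than hand-typed strings can be sketched outside Haskell too. The Python below is only an approximation with hypothetical names (Route, link), and it is not how Yesod works internally: Python raises the error when the misspelt line runs, not at compile time, although a static checker such as mypy, if you use one, would report the unknown member before the program runs.

```python
from enum import Enum


class Route(Enum):
    # Hypothetical routes; each member is the single place where its path is spelled out.
    HELLO = "/hello"
    ABOUT = "/about"


def link(route: Route, label: str) -> str:
    """Build an anchor tag from a Route member instead of a hand-typed path."""
    if not isinstance(route, Route):
        # Reject raw strings so a mistyped path cannot slip through.
        raise TypeError(f"expected a Route, got {route!r}")
    return f'<a href="{route.value}">{label}</a>'
```

With this in place, link(Route.HELLO, "Hello") builds the anchor for "/hello", while a misspelling such as Route.HELO fails with an AttributeError instead of silently producing a dead link.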

Both Clojure and Haskell force you to be explicit when you want mutable state. Constants are the norm, variables the exception. This makes parts of the program easy to test: a test provides the input, and you just check the output. In addition, Haskell forces you to clearly mark the places where you use IO. These two combined were mind-bending for me. Not that I normally have IO everywhere, but this forces you to really think hard about the design of an application. After doing the Clojure project I'm getting more comfortable with it; although I cheated with the use of IO here and there, I only used one bit of mutable state close to the end of the project. The problem lent itself well to that approach, so that helped.
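
A minimal sketch of that separation, in Python with hypothetical names: the calculation is a pure function over immutable values, so a test only supplies input and compares output, while printing is confined to one small function at the edge. Python does not enforce this split the way Haskell's type system does; here it is only a convention.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after construction
class Order:
    quantity: int
    unit_price_cents: int


def order_total_cents(orders):
    """Pure: the result depends only on the input; no state is mutated, no IO happens."""
    return sum(o.quantity * o.unit_price_cents for o in orders)


def format_total(total_cents):
    """Pure: turn a non-negative amount in cents into a display string."""
    return f"{total_cents // 100}.{total_cents % 100:02d}"


def main():
    # The only place that performs IO (printing); everything above is testable without it.
    orders = (Order(2, 250), Order(1, 999))
    print(format_total(order_total_cents(orders)))


if __name__ == "__main__":
    main()
```

Testing the pure part needs no setup or mocks: build a tuple of Order values, call order_total_cents, and compare the number that comes back.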

An interesting development I found is that some aspects of functional programming are trickling down to the curly bracket languages, e.g. lambdas in Java, but more interestingly, gradual typing for PHP with Facebook's Hack language. I have not tried it, and I expect it will help more on the defect prevention side than the drive-your-design side, but at the very least it will improve largely untested PHP code, and allow PHP programmers to focus their TDD efforts on more interesting parts of the application.

I would not say all this comes with less trouble. Like TDD, learning languages like Clojure and Haskell and trying to understand their idioms and culture takes time. Your performance will probably go down more or less, before it goes up. I think I've found a good way to explain the use of Clojure to other people, I hope to find the same for Haskell, so I can use one of the two with more people around me.

On the research front, I'm keeping an eye on dependently typed languages like Idris, where you can specify even more constraints than in Haskell. Where in TDD we work with examples, and hope we have enough examples to cover ourselves, you go more in the direction of providing complete proofs.

Underlying the application of TDD are a few things for me. XP's question: 'What has changed in technology over the last ten years that allows us to do things differently?' and TDD's 'Test everything that could possibly break'. New technology allows us to reduce the number of things that could possibly break. But that number is not zero, and it's unlikely that it will ever be. Instead we will probably use that reduced number to build more complicated, more interesting systems, so we will still benefit from TDD. The other question is: "Where do you get feedback from?". Never forget to combine TDD and other technical practices with a healthy interest in those that use your software, and how you can make their lives better.

Many thanks, Willem, for answering my questions!

Other episodes of the series: