Test-driven development in object-oriented programming languages (OOPLs) is mostly focused on example-based test cases, aka “plain old unit tests”.
Let’s say we want to check if the common JDK function java.util.Collections.reverse()
works as expected.
We use a simple JUnit Jupiter test for that purpose:
import java.util.*;
import org.assertj.core.api.Assertions;
import org.junit.jupiter.api.Test;

class ListReverseTests {
    @Test
    void reverseList() {
        List<Integer> aList = Arrays.asList(1, 2, 3);
        Collections.reverse(aList);
        Assertions.assertThat(aList).containsExactly(3, 2, 1);
    }
}
Many, hopefully most, developers have written similar tests for years or even decades, usually with good success and a reasonable capability to detect common programming errors.
There is one thought, though, that’s always been nagging in the back of my mind:
How can I be confident that reverse also works with 5 elements? With 5000? With an empty list?
With elements of different types? The number of doubts can grow as high as I allow it.
One way to fight this type of uncertainty is to add more examples and test cases. I calm myself with the hope that my choice of examples is sufficiently representative to catch bugs now and regressions in the future. When in doubt I add another test case, and yet another. Model-based testing approaches (e.g. equivalence classes and parameter combinatorics) address this exact problem but usually err on the side of too many tests. And every single test that does not reveal an error now or in the future is a waste of resources and additional maintenance effort.
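To make the “add another example” strategy concrete, here is a minimal sketch in plain Java without any test framework. The class name MoreReverseExamples and the helper reversedCopy are made up for this illustration; the inputs simply mirror the doubts listed above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper class for illustration; not part of the test shown above.
class MoreReverseExamples {

    // Returns a reversed copy so that the input list stays untouched.
    static <T> List<T> reversedCopy(List<T> original) {
        List<T> copy = new ArrayList<>(original);
        Collections.reverse(copy);
        return copy;
    }

    // Checks a handful of hand-picked examples; true only if all of them hold.
    static boolean allExamplesPass() {
        // An empty list stays empty.
        boolean emptyList = reversedCopy(new ArrayList<Integer>()).isEmpty();

        // Five elements.
        boolean fiveElements = reversedCopy(Arrays.asList(1, 2, 3, 4, 5))
                .equals(Arrays.asList(5, 4, 3, 2, 1));

        // A different element type.
        boolean strings = reversedCopy(Arrays.asList("a", "b", "c"))
                .equals(Arrays.asList("c", "b", "a"));

        // 5000 elements: element i must end up at index size - 1 - i.
        List<Integer> big = new ArrayList<>();
        for (int i = 0; i < 5000; i++) {
            big.add(i);
        }
        List<Integer> bigReversed = reversedCopy(big);
        boolean fiveThousand = bigReversed.size() == 5000;
        for (int i = 0; i < big.size(); i++) {
            fiveThousand &= bigReversed.get(i).equals(big.get(big.size() - 1 - i));
        }

        return emptyList && fiveElements && strings && fiveThousand;
    }
}
```

Every new doubt costs another hand-written block like the ones above, which is exactly the maintenance burden this paragraph describes.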
We can approach the question of correctness from a different angle:
Under what preconditions and constraints (e.g. the range of input parameters) should the functionality under test lead to which postconditions (results of a computation)? And which invariants should never be violated in the course?
This combination of preconditions and qualities that are expected to be present is also called a property.
Let’s formulate a property for the reverse function in plain English:
For any given list of elements, applying reverse twice should result in the original list.
If we can now translate this prose into a computer-executable form, e.g. programming language code, and also let the computer generate a wide range of examples that conform to the preconditions, then we have arrived at Property-based Testing (PBT).
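Before reaching for a framework, the two ingredients just named (an executable check plus generated inputs) can be sketched by hand in plain Java. Everything in this sketch is an illustrative assumption: the class and method names, the size limit of 50 elements, and the way the seed is passed in. Real PBT libraries do considerably more than this.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical, hand-rolled property check; not how any real PBT library is implemented.
class HandRolledReverseProperty {

    // The property: reversing a list twice yields the original list.
    static boolean reverseTwiceIsOriginal(List<Integer> original) {
        List<Integer> copy = new ArrayList<>(original);
        Collections.reverse(copy);
        Collections.reverse(copy);
        return copy.equals(original);
    }

    // Generates a random list with 0 to 50 elements (the limit is an arbitrary choice).
    static List<Integer> randomList(Random random) {
        int size = random.nextInt(51);
        List<Integer> list = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            list.add(random.nextInt());
        }
        return list;
    }

    // Runs the property against `tries` generated lists.
    // Returns the first falsifying list, or null if none was found.
    static List<Integer> findCounterexample(int tries, long seed) {
        Random random = new Random(seed);
        for (int i = 0; i < tries; i++) {
            List<Integer> candidate = randomList(random);
            if (!reverseTwiceIsOriginal(candidate)) {
                return candidate;
            }
        }
        return null;
    }
}
```

Calling HandRolledReverseProperty.findCounterexample(1000, 42L) returns null as long as Collections.reverse() behaves correctly. Passing the same seed again reproduces exactly the same sequence of generated lists.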
The mother of all modern PBT frameworks is QuickCheck for Haskell. The property from above could be translated into QuickCheck like this:
prop_reversed :: [Int] -> Bool
prop_reversed xs =
    reverse (reverse xs) == xs
As a functional language with a strong type system, Haskell allows this very concise property specification:
- The first line declares the property as a function that takes a list of Int as input parameter and returns a boolean value to represent the property’s check result. The fact that we only consider lists of integral numbers is a concession to the way QuickCheck works internally. From the point of view of the abstract property the concrete type is not of interest.
- The remaining lines contain the implementation: we apply reverse twice to the input list xs and compare the result with the original list.
At test runtime QuickCheck will generate a number (usually 100) of (mostly) random lists and call prop_reversed with each of them. If even a single call returns False, the property is considered falsified and the test run fails. Thus it’s important to recognize that PBT cannot prove that a property is correct; it can only try to find examples that refute it!
What you can also notice is that this property is necessary but not sufficient. A trivial implementation of reverse that returned the input list unchanged would also pass. This raises an important question which will be tackled later in this series: Can property-based tests replace our good old example tests or are they “just a complement”?
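That the round-trip property alone does not pin down reverse can be checked with a deliberately wrong implementation. In the following plain-Java sketch, brokenReverse is a hypothetical function that returns its input in unchanged order; all names are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; the names are made up for this sketch.
class InsufficientProperty {

    // Deliberately wrong "reverse": returns a copy in the original order.
    static <T> List<T> brokenReverse(List<T> original) {
        return new ArrayList<>(original);
    }

    // The round-trip property, applied to the broken implementation.
    static boolean reverseTwiceIsOriginal(List<Integer> original) {
        return brokenReverse(brokenReverse(original)).equals(original);
    }

    // The example-based check from the first test: [1, 2, 3] must become [3, 2, 1].
    static boolean exampleHolds() {
        return brokenReverse(Arrays.asList(1, 2, 3)).equals(Arrays.asList(3, 2, 1));
    }
}
```

Here reverseTwiceIsOriginal holds for every input, while exampleHolds() returns false: the single example catches a bug that the property cannot see.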
Translating the above QuickCheck function into Java with just JUnit is not simple. That’s why we pull in another test engine: jqwik. We will have a closer look at jqwik in the next article. For a start, all we do is translate the Haskell code from above into a jqwik test method:
import java.util.*;
import net.jqwik.api.*;

class ListReverseProperties {
    @Property
    boolean reverseTwiceIsOriginal(@ForAll List<Integer> original) {
        return reverse(reverse(original)).equals(original);
    }

    private <T> List<T> reverse(List<T> original) {
        List<T> clone = new ArrayList<>(original);
        Collections.reverse(clone);
        return clone;
    }
}
Let’s have a closer look at the code:
- We work around the in-place behaviour of Collections.reverse() by writing our own reverse method that clones a list before reversing it.
- The property method is annotated with @Property so that IDEs and build tools will recognize it as such, at least if they support the JUnit platform.
- The @ForAll annotation tells jqwik that you want the framework to generate instances for you. A parameter’s type, here List<Integer>, is considered to be the fundamental precondition.
- Returning a boolean value is the simplest form of communicating the result of checking. Alternatively you can use any assertion library like AssertJ or JUnit Jupiter itself.
Running a successful jqwik property is as quiet as running a successful JUnit test. If not instructed otherwise, jqwik will invoke each property method 1000 times with different input parameters. As long as you write microtests, running a test that often is not a problem. If needed, you can tune the number of tries up or down.
In order to see a falsified property, we remove one of the calls to reverse:
@Property
boolean reverseTwiceIsOriginal(@ForAll List<Integer> original) {
    return reverse(original).equals(original);
}
and look at the test output:
ListReverseProperties:reverseTwiceIsOriginal =
org.opentest4j.AssertionFailedError: Property [ListReverseProperties:reverseTwiceIsOriginal]
falsified with sample [[0, 1]]
|-----------------------jqwik-----------------------
tries = 1 | # of calls to property
checks = 1 | # of not rejected calls
generation-mode = RANDOMIZED | parameters are randomly generated
after-failure = SAMPLE_FIRST | try previously failed sample, then previous seed
seed = -6242034001699631739 | random seed to reproduce generated values
sample = [[0, 1]]
original-sample = [[0, 1453667410, -606426461, 0, 585908698, -1915683337, 48, -1748790884, 10, 1993507675, -325]]
Quite a bunch of information: you can see the number of tries, the number of checks that were actually run, the random seed, the originally falsified sample (original-sample), and the simplest falsified sample that was found (sample).
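The step from the long original sample to the much smaller one is commonly called shrinking. The following plain-Java sketch shows the general idea only: it is a naive greedy strategy written for this illustration and makes no claim about how jqwik actually shrinks. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Predicate;

// Naive, hypothetical shrinker for illustration; NOT jqwik's actual algorithm.
class NaiveShrinker {

    // The falsified property from above: a list equals its own reversal.
    static boolean reverseIsOriginal(List<Integer> original) {
        List<Integer> copy = new ArrayList<>(original);
        Collections.reverse(copy);
        return copy.equals(original);
    }

    // Repeatedly simplifies a falsifying sample as long as the property keeps failing.
    static List<Integer> shrink(List<Integer> failing, Predicate<List<Integer>> property) {
        List<Integer> current = new ArrayList<>(failing);
        boolean improved = true;
        while (improved) {
            improved = false;
            // Step 1: try to drop single elements.
            for (int i = 0; i < current.size(); i++) {
                List<Integer> candidate = new ArrayList<>(current);
                candidate.remove(i);
                if (!property.test(candidate)) {
                    current = candidate;
                    improved = true;
                    i--; // the next element has moved to index i
                }
            }
            // Step 2: try to move single elements closer to zero by halving them.
            for (int i = 0; i < current.size(); i++) {
                int value = current.get(i);
                if (value == 0) {
                    continue;
                }
                List<Integer> candidate = new ArrayList<>(current);
                candidate.set(i, value / 2);
                if (!property.test(candidate)) {
                    current = candidate;
                    improved = true;
                }
            }
        }
        return current;
    }
}
```

Called as NaiveShrinker.shrink(originalSample, NaiveShrinker::reverseIsOriginal) with the original sample from the output above, this sketch ends with a two-element list that still falsifies the property. It does not necessarily arrive at the same list that jqwik reports.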
If you want to see all the generated lists, just add a @Report annotation to the property method like this:
@Property
@Report(Reporting.GENERATED)
You will see a surprisingly large variety in both list size and number range.
The example in this article is rather simple. Hopefully, it got you thinking nevertheless, and some questions have started to cross your sceptical mind.
These and other questions will show up again in later articles. Next time, though, we will first take a closer look at how jqwik works and at alternative PBT libraries on the JVM.