Test-driven development in object-oriented programming languages mostly relies on example-based test cases, also known as “plain old unit tests”. Let’s say we want to check whether the common JDK method java.util.Collections.reverse() works as expected. We use a simple JUnit Jupiter test for that purpose:

import java.util.*;
import org.assertj.core.api.Assertions;
import org.junit.jupiter.api.Test;

class ListReverseTests {
    @Test
    void reverseList() {
        List<Integer> aList = Arrays.asList(1, 2, 3);
        // Collections.reverse() reverses the list in place
        Collections.reverse(aList);
        Assertions.assertThat(aList).containsExactly(3, 2, 1);
    }
}

Many, hopefully most, developers have written similar tests for years or even decades, usually with good success and a reasonable capability to detect common programming errors. There is one thought, though, that has always been nagging at the back of my mind: How can I be confident that reverse also works with 5 elements? With 5000? With an empty list? With elements of different types? The number of doubts can grow as high as I allow it.

One way to fight this type of uncertainty is to add more examples and test cases. I reassure myself with the hope that my choice of examples is sufficiently representative to catch bugs now and regressions in the future. When in doubt I add another test case, and yet another. Model-based testing approaches (e.g. equivalence classes and parameter combinatorics) address this exact problem but usually err on the side of too many tests. And every single test that does not reveal an error now or in the future means wasted resources and additional maintenance effort.


We can approach the question of correctness from a different angle:

Under what preconditions and constraints (e.g. the range of input parameters) should the functionality under test lead to which postconditions (results of a computation)? And which invariants should never be violated in the course?

This combination of preconditions and qualities that are expected to be present is also called a property.

Let’s formulate a property for the reverse function in plain English:

For any given list of elements, applying reverse twice should result in the original list.

If we can now translate this prose into a computer-executable form, e.g. programming language code, and also let the computer generate a wide range of examples that conform to the preconditions, then we have arrived at Property-based Testing.
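
To make the idea concrete before reaching for a framework, here is a minimal hand-rolled sketch in plain Java. It is not how any PBT library works internally; the class name HandRolledReverseProperty, the method names, the size limit of 20 elements and the seed are all assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Optional;
import java.util.Random;

// Hypothetical helper class: a hand-rolled property check, not a library API.
class HandRolledReverseProperty {

    // The property: reversing a list twice yields the original list.
    static boolean reverseTwiceIsOriginal(List<Integer> original) {
        List<Integer> copy = new ArrayList<>(original);
        Collections.reverse(copy);
        Collections.reverse(copy);
        return copy.equals(original);
    }

    // Generates a random list with 0 to 20 elements (limit chosen arbitrarily).
    static List<Integer> randomList(Random random) {
        int size = random.nextInt(21);
        List<Integer> list = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            list.add(random.nextInt());
        }
        return list;
    }

    // Returns the first generated list that falsifies the property, if any.
    static Optional<List<Integer>> findCounterexample(long seed, int tries) {
        Random random = new Random(seed);
        for (int i = 0; i < tries; i++) {
            List<Integer> candidate = randomList(random);
            if (!reverseTwiceIsOriginal(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Optional<List<Integer>> counterexample = findCounterexample(42L, 100);
        System.out.println(counterexample
            .map(list -> "falsified with " + list)
            .orElse("property held for all generated lists"));
    }
}
```

Passing the seed explicitly keeps a run reproducible: the same seed produces the same sequence of generated lists.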


The mother of all modern PBT frameworks is QuickCheck for Haskell. The property from above could be translated into QuickCheck like this:

prop_reversed :: [Int] -> Bool
prop_reversed xs =
  reverse (reverse xs) == xs

As a functional language with a strong type system, Haskell allows this very concise property specification.

At test runtime QuickCheck will generate a number (usually 100) of (mostly) random lists and call prop_reversed with each of them. If even a single call returns False, the property is considered falsified and the test run fails. Thus it’s important to recognize that

PBT cannot prove that a property is correct; it can only try to find examples that refute it!

What you can also notice is that this property is necessary but not sufficient. A trivial implementation of reverse that returned the input list unchanged would also pass. This raises an important question which will be tackled later in this series: Can property-based tests replace our good old example tests or are they “just a complement”?
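
This weakness is easy to demonstrate in plain Java. The following sketch uses hypothetical names (WeakPropertyDemo, identityReverse, roundTripHolds, mirrorsPositions) and shows that a do-nothing implementation satisfies the round-trip property, while a second, stricter property exposes it. The stricter property is just one possible choice for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical demo class; none of these names come from a library.
class WeakPropertyDemo {

    // A deliberately wrong "reverse": returns an unchanged copy of the input.
    static List<Integer> identityReverse(List<Integer> input) {
        return new ArrayList<>(input);
    }

    // Round-trip property: applying the function twice gives back the original.
    static boolean roundTripHolds(UnaryOperator<List<Integer>> reverse, List<Integer> original) {
        return reverse.apply(reverse.apply(original)).equals(original);
    }

    // Stricter property: element i of the input sits at index size-1-i of the output.
    static boolean mirrorsPositions(UnaryOperator<List<Integer>> reverse, List<Integer> original) {
        List<Integer> reversed = reverse.apply(original);
        if (reversed.size() != original.size()) {
            return false;
        }
        for (int i = 0; i < original.size(); i++) {
            if (!original.get(i).equals(reversed.get(original.size() - 1 - i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Integer> sample = Arrays.asList(1, 2, 3);
        System.out.println(roundTripHolds(WeakPropertyDemo::identityReverse, sample));   // true
        System.out.println(mirrorsPositions(WeakPropertyDemo::identityReverse, sample)); // false
    }
}
```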

Let’s do it in Java

Translating the above QuickCheck function into Java with just JUnit is not simple. That’s why we pull in another test engine: jqwik. We will have a closer look at jqwik in the next article. For a start all we do is translate the Haskell from above into a jqwik test method:

import java.util.*;
import net.jqwik.api.*;

class ListReverseProperties {

	@Property
	boolean reverseTwiceIsOriginal(@ForAll List<Integer> original) {
		return reverse(reverse(original)).equals(original);
	}

	// Returns a reversed copy and leaves the input untouched
	private <T> List<T> reverse(List<T> original) {
		List<T> clone = new ArrayList<>(original);
		Collections.reverse(clone);
		return clone;
	}
}

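Let’s have a closer look at the code: the @ForAll annotation tells jqwik to generate values for the parameter, here lists of integers, and the boolean return value reports whether the property held for a given input. The private reverse helper works on a copy, so the generated list itself is never modified.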

Running a successful jqwik property is as quiet as running a successful JUnit test. If not instructed otherwise, jqwik will invoke each property method 1000 times with different input parameters. As long as you write microtests, running a test that often is not a problem. If needed, you can tune the number of tries up or down.

In order to see a falsified property, we remove one of the calls to reverse:

@Property
boolean reverseTwiceIsOriginal(@ForAll List<Integer> original) {
    return reverse(original).equals(original);
}

and look at the test output:

ListReverseProperties:reverseTwiceIsOriginal =
org.opentest4j.AssertionFailedError: Property [ListReverseProperties:reverseTwiceIsOriginal]
    falsified with sample [[0, 1]]

tries = 1                     | # of calls to property
checks = 1                    | # of not rejected calls
generation-mode = RANDOMIZED  | parameters are randomly generated
after-failure = SAMPLE_FIRST  | try previously failed sample, then previous seed
seed = -6242034001699631739   | random seed to reproduce generated values
sample = [[0, 1]]
original-sample = [[0, 1453667410, -606426461, 0, 585908698, -1915683337, 48, -1748790884, 10, 1993507675, -325]]

Quite a bunch of information: You can see the number of test tries, the number of actually run checks, the random seed, the originally falsified sample, and the simplest falsified sample that was found.
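
How a long original sample can be reduced to a short one is easier to picture with a toy example. The sketch below is a naive, greedy reduction written in plain Java; it is not jqwik's actual shrinking algorithm, and the names NaiveShrinker, reverseOnceIsOriginal and shrink are hypothetical. It only drops elements and does not simplify the remaining values, so it will not necessarily arrive at [0, 1].

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical illustration of shrinking; not jqwik's actual algorithm.
class NaiveShrinker {

    // The falsified property from above: a single reverse equals the original.
    static boolean reverseOnceIsOriginal(List<Integer> original) {
        List<Integer> copy = new ArrayList<>(original);
        Collections.reverse(copy);
        return copy.equals(original);
    }

    // Greedily removes single elements as long as the property stays falsified.
    static List<Integer> shrink(List<Integer> failing, Predicate<List<Integer>> property) {
        List<Integer> current = new ArrayList<>(failing);
        boolean progress = true;
        while (progress) {
            progress = false;
            for (int i = 0; i < current.size(); i++) {
                List<Integer> candidate = new ArrayList<>(current);
                candidate.remove(i); // removes the element at index i
                if (!property.test(candidate)) {
                    // Still a counterexample, so continue from the smaller list.
                    current = candidate;
                    progress = true;
                    break;
                }
            }
        }
        return current;
    }

    public static void main(String[] args) {
        // First five values of the original sample shown in the output above.
        List<Integer> originalSample = Arrays.asList(0, 1453667410, -606426461, 0, 585908698);
        System.out.println(shrink(originalSample, NaiveShrinker::reverseOnceIsOriginal));
    }
}
```

For this particular property the greedy loop always ends with a two-element list, because any list with fewer than two elements equals its own reverse.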

If you want to see all the generated lists, just add a @Report(Reporting.GENERATED) annotation to the property method.


You will see a surprisingly large variety in both list size and number range.

Further Questions

The example in this article is rather simple. Hopefully, it got you thinking nevertheless, and some questions started to cross your sceptical mind.

These and other questions will show up again in later articles. Next time, though, we will first take a closer look at how jqwik works and at alternative PBT libraries on the JVM.