Maybe you’re writing a brand new class, or perhaps you’re staring at a thousand line file and wondering - what should I test? The answer is actually simple - you should test the functional requirements. Translating that to practice however, can be challenging.

Whether you’re following TDD principles or writing new tests for existing components, it is far too easy to fall into the trap of testing the implementation versus testing the requirements. Why is that? Well, because it’s easy and intuitive. You see a method, you figure out what it does, and you write a test for it to make sure it’s doing what it does. That sounds like the right thing to do, doesn’t it? Well, it turns out it’s not! Let’s take a look at some examples, but first, let’s reiterate why we want to have unit tests to begin with.

Unit tests exist to prevent us from breaking the functionality of our software. When we make a change and a test breaks, it should indicate one of two things:

  1. We regressed an existing piece of functionality
  2. We removed, modified, or added desired functional behavior, but did not update a test to reflect this change

A well written test should not cause us to:

  1. Debug why the test broke
  2. Question whether the test is still valid
  3. Remove tests even though we did not remove functionality
  4. Reimplement the test because we changed implementation detail
  5. Generally speaking, require more time to make our code change in components/tests that are unrelated to the functional change we’re trying to accomplish

Similar to the notion of code smells, the above symptoms are often test smells to experienced testers. The best way to avoid these are to step back for a minute and think deeply about what you should test - and the answer should inevitably be a list of requirements that should always be fulfilled and never broken.

Examples

Let’s take a look at the following Sorter class:

class Sorter {
   public static void merge(int arr[], int l, int m, int r) {
       int n1 = m - l + 1;
       int n2 = r - m;

       int L[] = new int[n1];
       int R[] = new int[n2];

       for (int i = 0; i < n1; ++i)
           L[i] = arr[l + i];
       for (int j = 0; j < n2; ++j)
           R[j] = arr[m + 1 + j];


       int i = 0, j = 0;

       int k = l;
       while (i < n1 && j < n2) {
           if (L[i] <= R[j]) {
               arr[k] = L[i];
               i++;
           } else {
               arr[k] = R[j];
               j++;
           }
           k++;
       }

       while (i < n1) {
           arr[k] = L[i];
           i++;
           k++;
       }

       while (j < n2) {
           arr[k] = R[j];
           j++;
           k++;
       }
   }

   public static void sort(int arr[], int l, int r) {
       if (l < r) {
           int m = (l + r) / 2;

           sort(arr, l, m);
           sort(arr, m + 1, r);

           merge(arr, l, m, r);
       }
   }
}

Oh, it’s a mergesort! You may now be inclined to write the following tests (among others):

@Test
public void testMerge() {
   int[] arr = { 4, 3, 5, 7, 2};
   Sorter.merge(arr, 2, 3, 4);
   assertArrayEquals(new int[] { 4, 3, 2, 5, 7 }, arr);
}

@Test
public void testSort() {
   int[] arr = {3,1,2};
   Sorter.sort(arr, 0, 2);
   assertArrayEquals(new int[] {1,2,3}, arr);
}

Does anything seem wrong here?

What if we decided we wanted to change our sorting algorithm to quicksort? Now we have a dilemma. To implement quicksort, we got rid of the merge() method. But there’s a breaking test! Is this test still valid? Is it ok to remove? Oh no, a test smell!

I know, at this point you’re saying to yourself, well duh we can remove testMerge. It’s quite obvious in this contrived example (we’ll take a look at a slightly broader case shortly). But what if this was part of a much larger system with many dependencies and it wasn’t obvious? What’s the lesson learned here? The takeaway is that the requirements of this system that cannot be broken is its ability to sort. We have not deprecated this functionality, and therefore we should not be removing any tests (test smell). How that sort happens doesn’t matter - it’s an implementation detail! The only thing that should have ever been tested here is the sort() method.

Now consider this example. Suppose you’re building a stock ticker that receives a feed of the stock price of a symbol. In your application you need to be able to show the highest price of that symbol. Maybe your code and tests look like this:

public class Ticker {
   private String symbol;
   private List<Double> prices = new ArrayList<>();

   public Ticker(String symbol) {
       this.symbol = symbol;
   }

   public void tick(double newPrice) {
       prices.add(newPrice);
   }

   public double[] getSortedPrices() {
       double[] priceArray = prices.stream().mapToDouble(Double::doubleValue).toArray();
       Sorter.sort(priceArray, 0, prices.size() - 1);
       return priceArray;
   }
}

public class Application {
   Ticker yextTicker;

   public void renderMaxPrice() {
       double[] sorted = yextTicker.getSortedPrices();
       render("High: " + sorted[sorted.length - 1]);
   }
}

// Tests that the prices are returned sorted.
@Test
public void testSorted() {
   Ticker ticker = new Ticker("YEXT");
   ticker.tick(17.0);
   ticker.tick(25.0);
   ticker.tick(23.0);
   assertArrayEquals(new double[]{17.0, 23.0, 25.0}, ticker.getSortedPrices(), 0.001);
}

Seems reasonable, right? There’s a method to sort the prices and we can use that sorted array to figure out min/max/avg etc.

Now re-read that last sentence … we just described how we’re going to fulfill a requirement, not what the requirement is. This is implementation detail! What happens if we realize that we have thousands of ticks and sorting the array every time is inefficient? What if we want to just keep track of maxPrice in a variable and instead expose the following method:

public void getMaxPrice() { return maxPrice; }

public void tick(double newPrice) {
   if (newPrice > maxPrice) {
       maxPrice = newPrice;
   }
   // … 
}

Let’s say we’ve added this method and got rid of our getSortedPrices() method. Uh oh … what happens to testSorted()? What test smells are we encountering here from the list above? Again, you may say this is obvious in this context, but what if in the real system getSortedPrices() is used in hundreds of other places?

These are the kinds of situations we encounter when we unintentionally (or intentionally) test implementation and not the requirement. Implementation changes often. It’s very easy to fall into this trap; the code above seems very innocent at first glance.

Let’s talk about how we could have avoided it when we authored the test:

  1. It starts with the comment - “Tests that prices are returned sorted”. This is not a good comment. The first reason is because it is quite obvious that’s what the test is doing by reading the code. What a developer will want to know when looking at this test 2 years later is why the test is necessary. What is the functional requirement that the test is fulfilling/what shouldn’t we break? A good practice in authoring tests is to first write the comments - describe what functionality your test is testing that should not break over time. A good comment here would be along the lines of “The highest value that is ever captured by the ticker is the historical maximum value of the symbol. We need to be able to retrieve this value to render for the user.” Our updated test now looks like this:
    // The highest value that is ever captured by the ticker is the 
    // historical maximum value of the symbol. We need to be able to 
    // retrieve this value to render for the user.
    @Test
    public void testSorted() {
       Ticker ticker = new Ticker("YEXT");
       ticker.tick(17.0);
       ticker.tick(25.0);
       ticker.tick(23.0);
       assertArrayEquals(new double[]{17.0, 23.0, 25.0}, ticker.getSortedPrices(), 0.001);
    }
    
  2. The above comment leads you to a more intuitive name for the test method that corresponds to the requirement being tested. TestSorted is really not what we want to test; we don’t care if it’s sorted as we have no functional requirements around it ever being sorted. What we do care about is that we can get the maximum price back, so maybe our test should be testMaximumPrice and assertion should just check for the maximum price. Now our test looks like this perhaps:
    // The highest value that is ever captured by the ticker is the
    // historical maximum value of the symbol. We need to be able to 
    // retrieve this value to render for the user.
    @Test
    public void testMaximumPrice() {
       Ticker ticker = new Ticker("YEXT");
       ticker.tick(17.0);
       ticker.tick(25.0);
       ticker.tick(23.0);
       double[] sorted = ticker.getSortedPrices();
       assertEquals(25.0, sorted[sorted.length - 1], .001);
    }
    
  3. Having appropriate comments and names will actually lead you to design the actual class better. Do we really want to testMaximumPrice by calling getSortedPrice and checking the last element? Or should our Ticker class just expose a getMaximumPrice method? Under the covers, perhaps Ticker is still sorting and returning the highest, but this can help us clarify the interface.
    // The highest value that is ever captured by the ticker is the 
    // historical maximum value of the symbol. We need to be able to 
    // retrieve this value to render for the user.
    @Test
    public void testMaximumPrice() {
       Ticker ticker = new Ticker("YEXT");
       ticker.tick(17.0);
       ticker.tick(25.0);
       ticker.tick(23.0);
       assertEquals(25.0, ticker.getMaxPrice(), .001);
    }
    

Note above that you may argue that the test code is obvious and doesn’t need comments. This is one scenario where I would advocate against the general rule not to comment self-documenting code. Tests should always have comments explaining why they need to exist (as opposed to what they are testing, which should be obvious from the code). What seems like an obvious requirement to you today will be legacy to someone else in matter of years if not months.

As an aside, these 3 step actually demonstrate the magic of TDD done right … tests ultimately drive better design in the product!

Alas, if we had put this thought into the test to begin with, our refactoring would be much simpler when we decide to keep track of the highest price in a single variable instead of sorting our price list. Our test now tests our requirements, not how we fulfill them!