How not to write property tests in JavaScript
Property-based tests give us more confidence in our code. They’re great at catching edge-cases we may not have thought of otherwise. But this confidence comes at a cost. Property tests take more effort to write. They force you to think hard about what the code is doing, and what its expected behaviour should be. It’s hard work. And on top of that, running 100+ tests, is always going to take longer than running 3-5 example-based tests. This cost is real, and it raises the question: How do we keep ourselves from over-specifying or writing unnecessary tests?
Avoid reimplementing the function under test
The most common beginner mistake we see is re-implementing the system under test. And it makes sense. Because coming up with properties that should always be true about our code is difficult. To use a silly example, let’s imagine we’re writing some function to sort a list of comments by date. The code looks something like this:
const sortByPostDate = (comments) =>
[...comments].sort((c1, c2) => c1.posted.valueOf() - c2.posted.valueOf());
We want to make sure that the sort function results in everything being in order. If we’re not thinking too hard, we might write something like this:
describe('sortByPostDate()', () => {
it('should always return comments in sorted order', () =>
fc.assert(
fc.property(fc.array(generateComment()), (comments) => {
const sortedComments = sortByPostDate(comments);
const expected = comments.slice(0).sort(({ posted: d1 }, { posted: d2 }) => {
if (d1 < d2) return -1;
if (d1 > d2) return 1;
return 0;
});
expect(sortedComments).toEqual(expected);
}),
));
});
Here, our test reimplements the same logic as sortByPostDate()
, so it doesn’t tell us much. All we can say is that we have the ability to write the same function two different ways.
Thinking in properties
A better approach would be to ask what properties do we expect to hold when we sort our list of comments? And we can brainstorm some ideas:
- Sorting shouldn’t add or remove any elements.
- Sorting shouldn’t change any of the elements in the array.
- The posted date for the first item should be smaller than all the other posted dates.
- The posted date for the last item should be larger than all the other posted dates.
- Sorting two arrays with the same elements should produce the same result. Even if the two arrays are in a different order.
Now we can think about which of these laws we want to test. Let’s suppose we want to make sure sorting doesn’t add or remove elements. We could start by testing that the sorted array has the same length as the input array:
describe('sortByPostDate()', () => {
it('should always return a list with the same length, for any list of comments', () =>
fc.assert(
fc.property(fc.array(generateComment()), (comments) => {
const sortedComments = sortByPostDate(comments);
expect(sortedComments).toHaveLength(comments.length);
}),
));
});
That test gives us a little more confidence. But, what if the sort function removes one element, and adds another one? The .length
test won’t catch that. Let’s add another test to check that each item from the input array exists in the output array:
describe('sortByPostDate()', () => {
it('should always return a list of the same length, for any list of comments', () =>
fc.assert(
fc.property(fc.array(generateComment()), (comments) => {
const sortedComments = sortByPostDate(comments);
expect(sortedComments).toHaveLength(comments.length);
}),
));
it('should always contain each element from the input list, for any list of comments', () =>
fc.assert(
fc.property(fc.array(generateComment()), (comments) => {
const sortedComments = sortByPostDate(comments);
sortedComments.forEach((comment) => {
expect(sortedComments.includes(comment)).toBe(true);
});
}),
));
});
With that in place, we’re now covering the first two properties from our brainstorm list. If you’re paying attention though, you’ll notice something. If we remove a single test, we can’t guarantee either property. And neither of these tests addresses the actual sorting aspect of our function. Properties 3 and 4 might move us further in that direction though.
Let’s take another look at those properties:
- The posted date for the first item should be smaller than all the other posted dates.
- The posted date for the last item should be larger than all the other posted dates.
These two are corrollaries of each other. If we can show that one of them holds, then we could write a proof showing that the other property holds too. Thus, we’ll focus on the first one.
Now, if we ponder this a little, we can extend the property a bit. If we’ve sorted the array, then the first posted date should be the earliest. That is, it’s earlier than every item that comes after it. But, the second item should also have a date earlier the items that come after it. And the third one. And so on. That suggests a recursive proof for checking that we’ve sorted the array:
An array is sorted if the first value is lower than all the other values, and the rest of the array is sorted.
Putting that into code, we get:
const isSortedAsc = (list) => {
if (list.length <= 1) return true;
const [head, next, ...tail] = list;
return head <= next && isSortedAsc([next, ...tail]);
};
It’s not the most efficient code in the world. But it will test if an array of numbers are in order. And we can use it in a property test:
it('should always return elements sorted in order of post date, for any list of comments', () =>
fc.assert(
fc.property(fc.array(generateComment()), (comments) => {
const sortedComments = sortByPostDate(comments);
expect(isSortedAsc(sortedComments.map(({ posted }) => posted.valueOf()))).toBe(
true,
);
}),
));
We’ve now covered that our function sorts without modifying, adding or removing elements. But we still have one more property from our brainstorm left.
Are we over-specifying?
The last property we brainstormed was:
- Sorting two arrays with the same elements should produce the same result. Even if the two arrays are in a different order.
This is certainly something that ought to be true. So we could most certainly write a property test for it:
// A quick-and-dirty shuffle function.
const shuffle = (arr) =>
arr.reduce(
({ shuffled, toShuffle }) => {
const idx = Math.floor(Math.random() * toShuffle.length);
return {
shuffled: shuffled.concat([toShuffle[idx]]),
toShuffle: [...toShuffle.slice(0, idx), ...toShuffle.slice(idx + 1)],
};
},
{ shuffled: [], toShuffle: arr },
).shuffled;
// … Back to our test code
it('should return identical arrays, for any pair of shuffled arrays', () =>
fc.assert(
fc.property(fc.array(generateComment()), (comments) => {
const shuffledComments = shuffle(comments);
const sortedCommentsA = sortByPostDate(comments);
const sortedCommentsB = sortByPostDate(shuffledComments);
expect(sortedCommentsA).toEqual(sortedCommentsB);
}),
));
The question is, do we need this test? Does it tell us anything the others don’t? Think about it for a moment. If I asked you, how would you answer?
The answer is, yes it does tell us something. But we may not care. The ‘identical arrays’ property will fail for a specific edge case. It will fail when there’s more than one comment with the same date (down to the millisecond). In that case, the built-in sort function will leave the array entries in whichever order it finds them. And that order may be different if we’ve shuffled the arrays.
Does it matter though? Well, it depends. It depends on whatever else is going on in our system. And the reasons why we wanted to sort the list in the first place. If our goal is to show the user comments in a sensible order, it may not matter. But what if we’re trying to reconcile a stream of edits to a document? In that case, the non-determinism has potential to cause serious problems. But for the majority of cases, we won’t need that last property test.
This example generalises to a rule of thumb: Avoid specifying more than you need to. Now, someone may be thinking, this rule works for any automated test. But, for property tests, it’s useful to keep asking: “Is this property already proven (or inferred) by other properties?”
Does this need to be a property?
There’s lots of situations where property tests work, but might not be necessary. Imagine we’re creating a generic TextField
component. We’re using to help us lay out some forms for our team. It might look something like this:
const TextField = ({ id, name, label, value, placeholder = '', maxlength = 255 }) => (
<div className="FormField">
<label className="FormField-label" htmlFor={id}>
{label}
</label>
<input
type="text"
name={name}
value={value}
id={id}
placeholder={placeholder}
maxLength={maxlength}
/>
</div>
);
The question is, are there any properties that ought to hold for a component (or function) like this? Most of the function is putting the props into placeholders. Are there properties we can define here?
We do want to make sure each input prop ends up in the right place. But a handful of examples in a describe.each()
table would give us confidence there. I can think of but one property that seems important to assert here:
- The
htmlFor
prop of the label should always refer to theid
prop of the input.
If we break that linkage then it’s an accessibility fail. So we could write a property test for it:
const generateProps = () =>
fc.record(
{
id: fc.string(),
name: fc.string(),
label: fc.string(),
value: fc.string(),
placeholder: fc.string(),
maxlength: fc.double(),
},
{ requiredKeys: ['id', 'name', 'label'] },
);
describe('TextField', () => {
it('should always link the label to the input field, given any set of input props', () =>
fc.assert(
fc.property(generateProps(), (props) => {
const wrapper = shallow(<TextField {...props} />);
expect(wrapper.find('label').prop('htmlFor')).toBe(
wrapper.find('input').prop('id'),
);
}),
));
});
Now, someone may be thinking that even this is overkill. A handful of example tests in describe.each()
would be enough for this too. And in the scenario I gave, we’re using this component to lay out a single form. We might use it, say, ten times total? If that’s the scenario, we could conceivably create an example for every id
we pass in. And we know the internals here, so we can verify visually that id
doesn’t interact with other props. In that scenario, running hundreds of tests for this component might be a waste of time. We can generalise this idea to a rule too:
If you can list out all the inputs you’ll give the function, it may not need a property test.
Do write property tests for shared utilities and libraries
What if the form scenario were different? What if this is part of a design system? People might throw all kinds of weird and wonderful props at this component. In this case, property tests become a lot more valuable. Even writing the generator raises some interesting questions:
- The prop
maxlength
has type number. This means people could pass any kind of floating point value. What should happen if someone enters a negative value? Or a fractional value? The HTML specification states that this should be positive integer. But our type system can’t represent that. How do we want to handle it? - We have three required props for the component. But they’re all strings. And it’s entirely possible for someone to provide an empty string. Is that a problem? If so, what should happen if people try it?
In both cases, a property test could help, but how we write the test depends on the answers we give.
Why bother with property tests?
We’ve talked a lot about how expensive and difficult property tests are. And, given all that, it seems reasonable to ask: Why bother? Are property tests worth the effort? Wouldn’t it be better to focus on integration and end-to-end tests? After all, these tests give a lot of ‘bang for buck.’ They don’t just test that individual components are working. Instead, they test that components are working together to deliver customer value. And that’s what it’s all about, right?
That’s all true. As tests, integration, and end-to-end tests deliver the most value. But like with Test Driven Development (TDD), tests aren’t the point. The reason I became enthusiastic about TDD was not because I got lots of tests. I became enthusiastic about TDD because when I practised it, I wrote better code. The discipline of thinking about tests forced me to clarify my intent. I started writing code in smaller, more comprehensible chunks. Not only did the code need less maintenance, but when it did, I dreaded going back to the old code less.
Then I discovered property-based testing. It takes all those benefits of TDD and increases them an order of magnitude. I thought I understood my code. Then I started thinking about properties and learned I did not. Instead of thinking about whether my code worked I began to think about whether it’s correct.
Writing tests first forces you to think about the problem you're solving. Writing property-based tests forces you to think way harder.
— Jessica Joy Kerr (@jessitron) April 25, 2013
Experienced software engineers all give lip service to “thinking through edge cases.” We’re supposed to consider every possible thing the world might throw at our code. Property tests force you to actually do it.
It’s not just about edge cases though. Thinking about properties is a mindset. And this mindset is so valuable that it’s worth practising, even if you delete all the tests aftewards. Sure, you’d then need to write some other tests to catch regressions. But if property tests are slowing your builds, delete them. Copy the properties into code comments or add .skip
to your tests so you can get them back if you need. The tests aren’t the point, they’re a side-benefit.
Sure, there are no silver bullets in software development. Property tests are not magic fairy dust you sprinkle over you code to make everything better. They won’t even guarantee bug-free code. And, as we’ve discussed, they’re slow to run and hard to write. But they’re worth it. Yes, be careful with them. No, they may not suit every single situation. The act of even thinking about them though, will help you write better code.