When does Test-Driven Development make sense?
Most discussions on Test-Driven Development (TDD) focus on its mechanics: write a test, watch it fail, make it pass. But TDD is fundamentally about a shift in perspective. It is a discipline that forces a developer to consider their code from the outside in, or in other words, to make themselves a client of their own API. At times, it is useful to remind ourselves that this “client” isn’t always an abstract concept that exists to facilitate the discipline; they can be a real person who needs to get real work done.
This article considers what it is like to be that person. It is an argument in favour of TDD from the grounded, practical perspective of a developer working on code that was written with them in mind.
Story time
Imagine you start work one morning. You lean back in your chair, watching your laptop cycle through its boot process, feeling the weight of an expertly-poured mug of fazenda reis in your hand. The images on screen flow from login graphics to desktop, startup apps flickering open and closed, until your company Slack chat opens up. It’s here that you see it. One new unread message. You hunch forward, allowing the anticipation to build, before you finally resolve to click it.
1 New Unread Message
Hi Tom!
You know how our users are able to set custom photos for their contact display pictures? Earlier this week we started migrating people to a new server for hosting these images, but we’re getting complaints. Specifically, people can’t see the custom photo they set. Instead, they either:
- See the display picture that the contact set themselves.
- See a stock image we use for any contacts without a display picture.
If a customer sets a custom photo for a contact, then that should be what they see. This is causing a lot of support calls, so could you take a look as a priority?
I’ll call later to check in, thanks!
You lean back in your chair again to contemplate the request. This is strange. Why would this person (let’s assume a Product Owner) contact you? This isn’t an area of the code you’ve ever worked on before. Surely it would be faster to ask somebody familiar with this feature? Also, why did he call you Tom? (I’m assuming that if you’re reading this, there’s a good chance your name isn’t Tom.)
Weird.
You open your IDE and navigate to the part of the code relevant to this feature. Now, I’d like you to answer a hypothetical.
Question: In an ideal world, what would you want to see? What in your IDE would immediately lighten your mood before you even start working on a solution?
Take some time to think it over, and then read on to hear my answer.
The Perfect Bug
If this were a real scenario, I’d assume it was an aggregation problem, and I’d start hunting for a composite class. For example:
class PrioritisedContactPhoto(
    private val userSetPhoto: ContactPhoto,
    private val remotePhoto: ContactPhoto,
    private val fallbackPhoto: ContactPhoto
) : ContactPhoto {
    override fun flow(contact: Contact): Flow<Photo> =
        combine(
            userSetPhoto.flow(contact),
            remotePhoto.flow(contact),
            fallbackPhoto.flow(contact)
        ) { userSet, remote, fallback ->
            when {
                userSet.exists() -> userSet.photo()
                remote.exists() -> remote.photo()
                else -> fallback.photo()
            }
        }
}
A “contact photo” can be defined in one of three locations:
- UserSetPhoto: The user sets a custom photo for a contact, which takes absolute priority.
- RemotePhoto: The contact sets their own photo. This takes priority if the current user hasn’t set a custom photo for that contact.
- FallbackPhoto: If a photo isn’t set by the current user or by the contact, or if there is a problem retrieving photos, we specify a fallback photo so we can display something in the UI.
The bug we’re investigating suggests the UserSetPhoto implementation isn’t working. If userSet.exists() returns false, we’ll fall through to either the RemotePhoto or the FallbackPhoto. Knowing this, my next question is simple:
Question: What circumstances could cause userSetPhoto::exists to come back as false?
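It may help to pin the question down with code. Below is a deliberately simplified, hypothetical stand-in for UserSetPhoto: the ServerStatus enum, the plain-function shape and the single-status model are my own simplification, not the article’s actual implementation (which presumably uses Kotlin Flows and a real HTTP client).

```kotlin
// Hypothetical, simplified sketch of the class under investigation.
enum class ServerStatus { OK, Unauthorised }

class UserSetPhoto(private val server: () -> ServerStatus) {
    // The assumption encoded here: any status other than OK
    // (including 401 Unauthorised) means "no custom photo exists".
    fun exists(): Boolean = server() == ServerStatus.OK
}

fun main() {
    // The new media server answers 401 as an authorisation challenge,
    // but this code reads 401 as "the photo does not exist".
    val photo = UserSetPhoto { ServerStatus.Unauthorised }
    println(photo.exists()) // false
}
```

Under this model, the answer to the question is: any server response other than a 200, including the 401 that the new server uses for a very different purpose.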
Test suites? Who cares?
To answer my earlier question, the absolute best-case scenario for me is finding a well-designed feature with a succinct test suite. I don’t want to have to guess the developer’s intentions from the code: I want access to tests that show a detailed record of the contract they defined for this feature. I want tests that show the assumptions and expectations they built into their mental model, on the screen in front of me in black and white (or white and black, if you prefer dark mode).
What might this look like?
I search for a class called UserSetPhotoTest, and after some scrolling, I find:
class UserSetPhotoTest {

    // Set-Up etc...

    @Test
    fun `when media server responds with unauthorised, then user set photo does not exist`() = runTest {
        // Server returns 401 status for every call
        val mockServer = { flowOf(Unauthorised) }
        val userSetPhoto = UserSetPhoto(mockServer)

        userSetPhoto.exists() shouldBe false
    }

    // The rest of the test suite
}
And all at once, I feel a great weight has been lifted.
You’ve lost me. What just happened?
This test tells me the “bug” in question wasn’t the result of developer carelessness or oversight, but rather an outdated assumption. All features require the developer to define a boundary between the world they can control and the one they can’t. In this case, the specifics of the server interactions required an assumption:
The original author’s assumption
Our media server responds with a 401 HTTP status code if:
- We request a photo for a contact, and
- No photo exists for that contact on the server.
Therefore, if we receive a 401 when requesting a user set photo for a contact, no photo exists.
For the original media server, this was true, but the reasoning does not carry to the new media server. That’s fine; change happens. What matters is that whoever wrote this feature understood the problem, specifically its scope and boundaries, and now I can revise their work to make it fit an updated understanding of reality.
@Test
fun `when media server responds with unauthorised, and challenge succeeds, then user set photo exists`() = runTest {
    // Server returns 401 status for the first call,
    // then 200 for the second call
    val mockServer = { flowOf(Unauthorised, OK) }
    val userSetPhoto = UserSetPhoto(mockServer)

    // Retry after the first 401.
    userSetPhoto.exists() shouldBe true
}

@Test
fun `when media server responds with unauthorised, and challenge fails, then user set photo does not exist`() = runTest {
    // Server returns 401 status for the first call
    // and for the retry
    val mockServer = { flowOf(Unauthorised, Unauthorised) }
    val userSetPhoto = UserSetPhoto(mockServer)

    // The server isn't lying. The photo doesn't exist.
    userSetPhoto.exists() shouldBe false
}
I won’t write the code for a retry mechanism, either, as it isn’t relevant here, but seeing a test suite of this quality gives me the confidence to make changes freely.
Lots of praise for buggy code…
That’s because, in a very real sense, this isn’t a bug. Let’s start with my tentative definition of what a bug is.
Bug (my definition): Behaviour that violates a previously defined and agreed-upon constraint. A bug is only possible when the inputs and outputs of a feature are explicit and the actual behaviour does not match expectations.
For example, using our Contact Photo example, imagine an informal requirements document with the following section:
Contact Photo Requirements
- The user is able to set custom photos as display pictures for their contacts. These photos are stored on a media server.
- A 401 response from the media server following a request for a custom photo for a contact should be treated as an authorisation challenge. The application should retry with the correct authorisation header. A subsequent response of:
- 200 = A custom photo exists. Show the custom photo in-app.
- 4xx = No custom photo exists. Use an appropriate fallback.
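The decision rule in this hypothetical document can be encoded as a pure function. This is a sketch under my own simplified names and raw Int status codes, not anything from the article’s codebase; it deliberately ignores how the retry itself is performed and looks only at the statuses that come back.

```kotlin
// Hypothetical encoding of the requirements document's rule.
// firstStatus: status of the initial photo request.
// retryStatus: status of the retry after a 401 challenge
// (null if no retry happened).
fun customPhotoExists(firstStatus: Int, retryStatus: Int?): Boolean = when {
    firstStatus == 200 -> true                        // photo exists
    firstStatus == 401 && retryStatus == 200 -> true  // challenge passed
    else -> false                                     // 4xx: use a fallback
}
```

Under these requirements, customPhotoExists(401, 200) is true, which is exactly the case where the pre-migration assumption (401 means no photo) gives the wrong answer.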
If such a document exists, then we have a bug. The input relevant to this situation (the 4xx response) is clearly defined, and so is the outcome, so the test suite is simply wrong. More typically in these situations, there is no test at all.
We aren’t dealing with this situation, though. Instead, the developer made an unavoidable judgement call about a dependency outside of their control. They aren’t responsible for the media server; they can’t control its behaviour. The best they can do is make a reasonable effort to infer its internal workings and capture them in unit tests. What these tests tell me is:
Back in the original author’s mind
I assume that the media server returns a 401 when the user has not set a custom photo for a specific contact. If this is true, then the feature will behave as follows.
And I’m fine with that. They made a call, it didn’t entirely stand the test of time, but it’s documented, it’s clear, and I can build on it.
Why didn’t the author plan ahead?
Change is an intrinsic part of software development, and it’s futile to deny it or prevent it. At some point, a dependency will cause our program to fail, and nobody has the foresight to know when or where it will happen. We can do our best to predict these areas and build in safeguards, but to try to construct a system in such a way that it is unbreakable can result in brittleness that does more harm than good.
In the case of planning ahead, there is a careful balance between adding noise to code for problems that may never even materialise and stripping code back to the bare minimum, which may miss some edge cases but is easier to modify when those edge cases come to light.
Important: A good measure of fitness is how quickly your heart rate recovers after exercise, more so than your resting pulse alone.
In the same way, a good measure of a system’s design is how quickly it can adapt to support new requirements, more so than whether a working system is resistant to change.
(That isn’t medical advice, by the way, but I’m sure you see my point.)
It essentially comes down to this closing thought.
Question: If you had just started a new job where you had to maintain an existing code base, would you prefer:
- A system that was, for now, working perfectly, but which had no tests or up-to-date documentation?
- A system that had open bug tickets, but extensive, rational and pertinent test suites?
I would take the latter every time. Let me know whether or not you feel the same way!
Credit: Hero image by Alexandre Debiève on Unsplash