This article was originally presented as a “brown bag talk”, an internal series of talks where Praelexis employees meet over lunch to share and discuss technical topics that interest them.
In today’s post we’re going to (briefly) explore one of my favourite topics: technical documentation. We’re going to look at some reasons why your technical documentation sucks, and, along the way, touch on how we can maybe make things better. The post is a touch hyperbolic and tongue-in-cheek, but I can assure you that the lessons contained within are not — they are extremely valuable and useful to teams of one, as well as teams of one thousand.
Also, while non-technical folks can learn something here, this post is squarely aimed at the technical / developer crowd, and sometimes specifically at Python developers.
Without further delay, here’s why your documentation sucks.
Documentation is like [pizza]. Even when it’s bad, it’s better than nothing
Someone on the internet, probably.
You 100% absolutely need to document your code. No matter how good your product, library, or tool is, if your documentation sucks, people aren’t going to use your tool. Period. It’s that simple. If you force people to use your tool without good documentation, they won’t just be ineffective — they’ll also dislike you. Worst of all, if you’re forced to work on your own codebase without excellent documentation, you’ll begin disliking yourself.
Not having documentation is like trying to wander around in a dark cave without a torch — it’s torturous (ha!). All you do is get lost and confused. It leads to frustration and misunderstanding how things actually work, particularly if you encounter a project for the first time (which is the same as coming back to a project a few months later).
If you think you don’t need documentation, then your documentation sucks. And not having documentation also means your documentation sucks.
This is a common mistake I see beginner programmers make. But what about the following:
Anything that helps people understand the behaviour and intention of your code is documentation!
An example of descriptive variable names:
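As a hypothetical illustration (the rental scenario and every name below are invented for this sketch), compare a terse function with one whose names carry the intent and the units:

```python
# Hard to follow: the reader has to guess what d, r and 24 mean.
def calc(d, r):
    return d * r * 24


# Easier to follow: the names document the intent and the units.
HOURS_PER_DAY = 24


def total_rental_cost(rental_days, hourly_rate):
    rental_hours = rental_days * HOURS_PER_DAY
    return rental_hours * hourly_rate
```

Both functions compute the same value, but only the second one can be read without opening the caller to find out what the arguments are.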
And some good unit tests that document how things are supposed to behave:
(and please, remember to write tests!).
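A minimal sketch of tests acting as documentation, assuming a made-up `apply_discount` function (not from any real codebase). The test names read as a specification of the expected behaviour:

```python
def apply_discount(price, percentage):
    """Return the price after subtracting a percentage discount."""
    if not 0 <= percentage <= 100:
        raise ValueError("percentage must be between 0 and 100")
    return price * (100 - percentage) / 100


def test_discount_of_zero_percent_leaves_price_unchanged():
    assert apply_discount(200, 0) == 200


def test_discount_of_25_percent_reduces_price_by_a_quarter():
    assert apply_discount(200, 25) == 150


def test_discount_above_100_percent_is_rejected():
    # Plain try/except keeps the sketch free of third-party test helpers.
    try:
        apply_discount(200, 150)
    except ValueError:
        pass
    else:
        raise AssertionError("expected a ValueError for a 150% discount")
```

Someone who has never seen `apply_discount` can learn its contract, including the error case, just by reading the test names.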
Use everything at your disposal to make your code easy to understand. Documentation extends quite far beyond docstrings and comments. Don’t ignore variable names, clean code and good tests. Otherwise your documentation will suck.
Another common beginner mistake (this is often how you’re taught at university, so it might not be entirely your fault). Your team asks you to document your code. And you produce something like this:
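A made-up illustration of that style (not taken from a real codebase), where every comment simply restates the line below it:

```python
# Define a function called add that takes a and b
def add(a, b):
    # Add a and b and store the answer in result
    result = a + b
    # Return result
    return result


# Call add with 1 and 2 and store the answer in total
total = add(1, 2)
```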
Now, I know your intentions are good when you write code like this. But I can read code. And guess what, your teammates can read code too.
Instead, use comments to explain something that is surprising or unexpected, or that is deliberate for a non-obvious reason. For example, take a look at this snippet from one of our Django codebases:
If you’re not familiar with Django, you might not know that you can build up a single query by using the | operator on multiple Q objects, instead of step-by-step narrowing down your selection. It’s a neat little trick, so the developer added a comment (and a reference to a StackOverflow question!) to explain what the intention is, and why it’s there.
Take another example:
When modifying a session in a Django test, the change will not be saved unless the session is first assigned to a variable, because each access to self.client.session returns a new session object. An unsuspecting developer might come along and think they can modify the session directly via self.client.session, but then later be confused about why the session isn’t saved. Again: use comments to explain things that are surprising!
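The same principle applies outside Django. As a hypothetical sketch (the function name and the invoicing scenario are assumptions for illustration), a single comment can flag a deliberate choice that would otherwise look like an unnecessary detour:

```python
from decimal import ROUND_HALF_UP, Decimal


def round_invoice_total(amount):
    """Round a decimal amount, given as a string, to a whole number."""
    # Deliberately NOT using the built-in round(): it rounds halves to the
    # nearest even number (round(2.5) == 2), which surprises people who
    # expect halves to always round up on an invoice.
    return Decimal(amount).quantize(Decimal("1"), rounding=ROUND_HALF_UP)
```

Without the comment, a well-meaning teammate might "simplify" this back to `round()` and quietly change the results.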
And you don’t have to take it from me either:
A delicate matter, requiring taste and judgement. I tend to err on the side of eliminating comments, for several reasons. First, if the code is clear, and uses good type names and variable names, it should explain itself. Second, comments aren’t checked by the compiler, so there is no guarantee they’re right, especially after the code is modified. A misleading comment can be very confusing. Third, the issue of typography: comments clutter code.
Rob Pike, “Notes on Programming in C”
Use comments wisely, otherwise your documentation will suck.
This is almost the opposite case of Rule 2: your code works, but it’s poorly written and confusing. So you think to yourself, “Ah hah! Rather than fix this, I’ll just add some documentation to explain what’s happening. That should be ok.”
And you’d be wrong.
Take the following implementation of Fizz Buzz:
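A hypothetical stand-in for that kind of code (not the author's original snippet) might look like this, with a perfectly accurate comment sitting on top of an opaque one-liner:

```python
def fizz_buzz(i):
    # Return "Fizz" for multiples of 3, "Buzz" for multiples of 5,
    # "FizzBuzz" for multiples of both, and the number itself otherwise.
    return "FizzBuzz"[i * i % 3 * 4 : 8 - -(i ** 4) % 5] or i


for number in range(1, 16):
    print(fizz_buzz(number))
```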
From the comments, I understand what the code is supposed to do, but I have no clue how it actually works.
As a side note, it’s also highly probable that if you can’t write clean code, you likely won’t write good documentation anyway:
A common fallacy is to assume authors of incomprehensible code will somehow be able to express themselves lucidly and clearly in comments.
Kevlin Henney
So focus on your fundamentals! If you can’t write clean code, your code sucks.
But your documentation also probably sucks.
Our first Python-specific point 🐍.
A lot of Python developers know about PEP-8, which is the official style guide for Python code. What not a lot of Python developers know is that there is also a documentation equivalent: PEP-257.
It’s worth reading through PEP-257 (it’s not long!), but my favourite rule is one that I see even senior developers skip: use the imperative mood in the first line of a docstring:
The docstring is a phrase ending in a period. It prescribes the function or method’s effect as a command (“Do this”, “Return that”), not as a description; e.g. don’t write “Returns the pathname …”.
Extract from PEP-257
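A small sketch of the difference, using two invented helper functions that behave identically and differ only in the mood of their docstring:

```python
import os


def get_extension_descriptive(path):
    """Returns the file extension of the given path."""  # descriptive: discouraged
    return os.path.splitext(path)[1]


def get_extension(path):
    """Return the file extension of the given path."""
    return os.path.splitext(path)[1]
```

The second docstring reads as a command ("Return ..."), which is the form PEP-257 prescribes.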
Follow PEP-257, otherwise your documentation sucks.
Luckily for us, the Python ecosystem is so large and mature that a number of really smart people have already spent a lot of time thinking about things like documentation. To date, there are three major docstring formats in Python that most projects will use: reStructuredText, NumpyDoc, and the Google Python Style Guide.
Each style guide has a clear specification that you can (and should) follow.
The mere existence of these docstring style guides doesn’t necessarily mean they’re good, but standardising on an accepted format nets you a few easy wins:
Let’s take a look at what these three major formats look like:
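As a sketch, here is the same invented function documented three times, once per format (reStructuredText, NumpyDoc, then Google style). The function itself is an assumption made purely for comparison:

```python
def scale_rest(values, factor):
    """Multiply every value by a constant factor.

    :param values: Numbers to scale.
    :type values: list[float]
    :param factor: Multiplier applied to each value.
    :type factor: float
    :returns: A new list containing the scaled values.
    :rtype: list[float]
    """
    return [value * factor for value in values]


def scale_numpy(values, factor):
    """Multiply every value by a constant factor.

    Parameters
    ----------
    values : list of float
        Numbers to scale.
    factor : float
        Multiplier applied to each value.

    Returns
    -------
    list of float
        A new list containing the scaled values.
    """
    return [value * factor for value in values]


def scale_google(values, factor):
    """Multiply every value by a constant factor.

    Args:
        values (list[float]): Numbers to scale.
        factor (float): Multiplier applied to each value.

    Returns:
        list[float]: A new list containing the scaled values.
    """
    return [value * factor for value in values]
```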
It doesn’t matter too much which format you choose, as long as you choose one and stick to it!
Except maybe it does matter which docstring format you choose?
I personally (fight me!) think that reStructuredText is the wrong format. Here’s why:
Fun fact: We standardised on reStructuredText at Praelexis. This was at a time before NumpyDoc or Google Python Style Guide were as popular as they are now. We’re in the process of considering moving to a more readable format, now that better things are available.
Consider the following function (pulled again from one of our codebases):
And now compare that with the following:
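As a hypothetical stand-in for both versions (the function and its names are invented here, not pulled from any real codebase), the first definition below carries a bare-minimum docstring, while the second explains intent, inputs, and failure modes:

```python
def normalise(scores):
    """Normalise scores."""
    total = sum(scores)
    return [score / total for score in scores]


def normalise_documented(scores):
    """Scale scores so that they sum to 1.

    Downstream code treats the result as a probability distribution, which
    is why we divide by the total rather than by the maximum.

    Args:
        scores: Non-negative numbers, at least one of which is non-zero.

    Returns:
        A new list of floats in the same order as ``scores``.

    Raises:
        ValueError: If the scores sum to zero, since the result would be
            undefined.

    Example:
        >>> normalise_documented([1, 3])
        [0.25, 0.75]
    """
    total = sum(scores)
    if total == 0:
        raise ValueError("scores must not sum to zero")
    return [score / total for score in scores]
```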
Which would you rather have? I think it’s fairly obvious. More explanation, more reasoning, more exposed “thinking process” is almost universally preferred. You’ll see this echoed across all of the major Python projects: scikit-learn, numpy, pandas, etc. all often have more documentation than code. This isn’t an accident!
Some other points to keep in mind when going beyond the bare minimum:
Always do more than the bare minimum. You’ll be thanked by your teammates, users, and yourself.
If you do the bare minimum, your documentation sucks.
Even if you assume the perfect build system (excellent unit tests, linting, style checks, and so on), you’ll still have the following issues with documentation:
When it comes to docstrings and comments, documentation is code. But it’s code that doesn’t get executed or tested (yes, I know about doctests, that’ll help with testing examples, but not with prose). As a result, you have to be extra vigilant and disciplined when it comes to maintaining your documentation. Make a point of revisiting it, making sure it’s still up to date and relevant. Otherwise you might inadvertently do the worst thing documentation can do (besides not existing): point someone in the wrong direction.
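To make the doctest caveat concrete, here is a minimal sketch using the standard library's `doctest` module. The `slugify` function and the `count_doctest_failures` helper are invented for illustration. The summary line of the docstring is deliberately wrong, yet the doctest still passes because only the example is executed:

```python
import doctest


def slugify(title):
    """Convert a title into an UPPERCASE slug separated by underscores.

    The sentence above is deliberately out of date. doctest cannot notice,
    because it only executes the example below.

    >>> slugify("Why Your Documentation Sucks")
    'why-your-documentation-sucks'
    """
    return "-".join(title.lower().split())


def count_doctest_failures(func):
    """Run the examples in the docstring of func and return the failure count."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    tests = finder.find(func, name=func.__name__, globs={func.__name__: func})
    for test in tests:
        runner.run(test)
    return runner.summarize(verbose=False).failed


# The example passes, so no failures are reported despite the stale prose.
print(count_doctest_failures(slugify))
```

The tooling keeps the example honest, but keeping the prose honest is still a job for a human reviewer.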