Documentation is a last resort. Few people want to read a document, and fewer still want to write it. I think the appropriate attitude towards documentation is to have as little of it as possible.
First, try to not write documentation
In an ideal world, we don’t need any documentation because everyone already knows everything. A couple of things you can do to reduce the amount you need:
Hire smart people
You’re already trying to do this, so really you just need to believe you’re getting it right.
If you find yourself writing up how to use or configure some commonly-used off-the-shelf tool, you’d almost always be better served by, at most, linking to that tool’s own documentation.
There is so much writing already on the Internet about setting up laptops or how to host services in the cloud that you really shouldn’t have any need to write your own copy.
Don’t do weird things
When you build a new thing, do it in the same way as everyone else, using the same tools as everyone else, in a way that is obvious and, at worst, easily googleable.
Historically we built weird bespoke systems to get as much productivity as possible out of as little computer as possible, but nowadays work-hours are much more expensive than CPU-hours, so the better optimisation is to make things easy and obvious for people.
Next, write as little as you can
Train your staff
Training is much better than documentation; it’s better to think of this problem as one of ‘knowledge’ rather than of ‘documentation’. Your written docs can then be simple, terse reminders of the key points.
It is always better to be trained for a situation or task than for there to simply be documentation available; apart from anything else, the training explicitly announces the presence of things that are not obvious.
Done well, a training session also gives the opportunity to test that the process or system is completely understood before there are any negative outcomes from misunderstanding.
Ideally, somebody is responsible for each of the training sessions, and so naturally becomes a subject-matter expert in it and keeps the material up-to-date. Even if they don’t, so long as the presenter is familiar with the topic they will naturally correct the slides and other material as they deliver it.
Use signposts rather than prose
We’ve all seen the alert runbooks which essentially boil down to a list of graphs to check and commands to run to figure out the actual current state of a system. These runbooks should be replaced with annotated dashboards in your monitoring system.
Don’t send someone to a Confluence doc with links to saved Kibana queries and some if/thens; send them straight to a Grafana board with graphs showing the current state and accompanying text.
But, when you need to write some docs
Think about why you’re writing them
You shouldn’t ever be writing a document without knowing who you expect to be reading it and what they want to get out of it.
Remember that your readers don’t generally want to be reading; write a skimmable article using a large number of small paragraphs with frequent descriptive headings. The Monzo Style Guide is a great thing to follow.
Organise them
There are often four types of documentation in the tech bit of a company:
- Runbooks and playbooks for performing tasks and responding to incidents.
- Minutes from standing meetings like on-call reviews, project checkpoints and the like.
- Design and decision documents laying out the possible ways to approach a problem and explaining which is best and why.
- Policies that need to be referred to when drawing up new processes etc.
These should be searchable completely separately; if I’m searching the runbooks for the name of an alert I don’t want to find every time that came up in the fortnightly on-call review meeting.
You can do this in Confluence with Spaces, but you might equally put your policy on an intranet site and use Google Docs for minuting meetings and for storing design docs.
Make them easy to find
There are two ways people will come to your document: by searching for it, or by following a specific link. It’s hard to get links wrong, but there are two things you can do to make documents easier to find by search.
The first is to use consistent terminology, which also makes for better docs in general. Decide whether to use product names (“S3”, “CloudFront”) or generic terms (“Object Storage”, “CDN”), and also write a glossary for your peculiar internal names for things.
There are a few places where one word can mean two opposite things: when traffic is on its way from the web-server to the firewall, is it egress because it’s on its way out of the cluster, or ingress because it’s on its way into the firewall? To my mind it is egress because it’s leaving the cluster, and incoming because it’s approaching a host.
There’s lots of scope for this sort of confusion, and maintaining a glossary is a great way both to highlight the possibility and to set a standard that avoids it.
The second is to accept that clumsy titles sometimes lead to better results; don’t be afraid to do a little keyword-stuffing.
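If your docs live somewhere you can get at them as plain text, the consistent-terminology rule can be checked mechanically. What follows is only a rough sketch under that assumption: the `GLOSSARY` mapping and the `find_inconsistent_terms` function are hypothetical names made up for illustration, and the terms in the mapping are just the examples from above, with product names taken as the preferred form.

```python
import re

# Hypothetical glossary: preferred term -> variants we have decided not to use.
# The entries are illustrative assumptions, not a recommendation.
GLOSSARY = {
    "S3": ["object storage", "object store"],
    "CloudFront": ["cdn"],
}


def find_inconsistent_terms(text, glossary=GLOSSARY):
    """Return (line_number, variant, preferred) for each discouraged term found."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for preferred, variants in glossary.items():
            for variant in variants:
                # Whole-word, case-insensitive match so "CDN" and "cdn" are both caught.
                pattern = rf"\b{re.escape(variant)}\b"
                if re.search(pattern, line, flags=re.IGNORECASE):
                    findings.append((line_number, variant, preferred))
    return findings


if __name__ == "__main__":
    sample = "Upload the file to S3.\nThe CDN will cache it."
    for line_number, variant, preferred in find_inconsistent_terms(sample):
        print(f"line {line_number}: found '{variant}', glossary prefers '{preferred}'")
```

Something like this is only worth having if it runs wherever the docs are edited; otherwise the glossary itself, kept short, does most of the work.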
Use consistent templates
When we write documentation for a system we tend to begin with an introduction to the topic, explain the background of the problem it solves, the problems we hit along the way and then describe the perfect system and the one we ended up with.
In general, when we arrive at documentation we want to immediately know the system we ended up with and what’s odd about it. Almost nobody is interested in the history.
Once you’ve got the separation down as above, you’ll find that you put the history bit in the section for design docs, and so the runbooks don’t end up with those paragraphs.
Try to delete them
I tend to find that when left with the appropriate access to Confluence I get annoyed with out-of-date docs about every year to eighteen months and go on a deletion spree.
I think this is probably better done in a more rigorous process which I haven’t yet found. I’ve toyed in the past with modifying MediaWiki to get lists of unused documents, and I imagine there are some periodic-review plugins for Confluence.
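As a starting point for something more rigorous, here is a minimal sketch that lists deletion candidates. It makes two assumptions that may not hold for you: that the docs are files in a directory (a checked-out docs repository, say, rather than Confluence), and that "not modified in a year" is an acceptable stand-in for "unused", which is not the same thing as never being read. The `stale_documents` function is a hypothetical name for illustration.

```python
import time
from pathlib import Path

SECONDS_PER_DAY = 86400


def stale_documents(root, max_age_days=365, now=None):
    """Return files under root whose last modification is older than max_age_days.

    Modification time is used as a rough proxy for staleness; it says
    nothing about whether anyone still reads the document.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    stale = [
        path
        for path in Path(root).rglob("*")
        if path.is_file() and path.stat().st_mtime < cutoff
    ]
    return sorted(stale)


if __name__ == "__main__":
    # "docs" is an assumed directory name; point this at wherever the files live.
    for path in stale_documents("docs", max_age_days=365):
        print(path)
```

The output is a list to review, not a list to delete blindly; a policy document can be a year old and still correct.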