Last weekend I finally got around to reading the Visible Ops handbook. I'd been meaning to read it for a while - ever since I first heard Gene Kim speak at a conference last year.
All in all it's an interesting read. I didn't find the book particularly controversial. But, then again, it is nearly 7 years since the book was first published. I think few people would disagree that many of the practices are now part of the 'received wisdom' of operations. I also found it useful to get a reminder of the ITIL jargon. Using the correct terms for this sort of thing is something I'm a bit weak on, but as the authors say in the book, having a common language is vital.
Working for a smaller company, I suppose some of the talk about audits and change-review boards sounded like overkill. It did make me realise, though, that the natural aversion to these things usually has more to do with past experience of bureaucratic processes than with a feeling that audits and change reviews are, in themselves, a bad thing.
The other week I gave a lecture to students at Warwick about what it's like to be a software developer at Kasabi [2, 3]. I had lots of things to talk about: our product, our team, our culture, etc. However, I wanted to share something about our working practices. The practices we use have certainly evolved over time, inspired by various methodologies and communities including Scrum, Kanban, and dev-ops. However, if I had to choose one thing that I couldn't do without, I think it would be having a code-review process that's tightly coupled to the build/release process.
Nearly a year ago, our team moved to using Gerrit as our principal review tool. The way it integrates with version control and continuous integration is fantastic. We were already following a continuous deployment model, but I feel that following the Gerrit workflow has helped us get closer to doing truly automated deployments. We're also a distributed team, so spending extra time on code reviews can make up for less time spent pair-programming.
So let's return to Visible Ops... for me, one of the biggest take-aways from reading the book was the idea that the acceptable number of unauthorised changes should be zero. Two points follow directly from this: 1) that code-review / change-review should be a straightforward and fast process (so that it's not routinely circumvented), and 2) that even emergency changes should go through some kind of review process. In fact, the authors of Visible Ops make the valid point that emergency changes often benefit from closer review. For me, the simplicity of using a review tool (like Gerrit) that's built into the release process facilitates both of these goals. There have been times when I've needed to push an emergency fix. But because the tooling is relatively quick and simple to use, more and more I'm finding that I prefer to take these changes down the normal review path. Even though we're not doing heavy-weight audits, from an ITIL perspective, it's reassuring to know that these changes are going through a robust, well-understood, and auditable process.
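To make the "zero unauthorised changes" idea a bit more concrete, here is a rough sketch in Python of what a release gate along those lines might look like. To be clear, this isn't Gerrit's API or anything we actually run: the `Change` class, its fields, and the `can_deploy` function are all names I've made up for illustration. The only point is that the emergency flag doesn't buy a change any exemption from review.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Change:
    """A proposed change. All fields are hypothetical, not Gerrit's data model."""
    change_id: str
    approved_by: Tuple[str, ...] = ()
    emergency: bool = False


def unauthorised(changes: List[Change]) -> List[Change]:
    """Return the changes that have no recorded approval.

    The emergency flag is deliberately not consulted here:
    emergency changes need a reviewer like any other change.
    """
    return [c for c in changes if not c.approved_by]


def can_deploy(changes: List[Change]) -> bool:
    """Allow a release only when the number of unauthorised changes is zero."""
    return len(unauthorised(changes)) == 0


release = [
    Change("c1", approved_by=("alice",)),
    Change("c2", approved_by=("bob",), emergency=True),  # reviewed emergency fix
    Change("c3", emergency=True),                        # pushed without review
]

blocked = [c.change_id for c in unauthorised(release)]
print("deploy allowed:", can_deploy(release))
print("blocked changes:", blocked)
```

With the sample data above, the release is held back by `c3` alone; the reviewed emergency fix `c2` passes like any normal change. In practice the review tool enforces this for you, which is rather the point of having it built into the release process.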