Check-in before code review is an antipattern
Posted at 07:00 on 09 July 2015
I’ve never been that satisfied with most explanations that I see on the Internet of why Git is better than Subversion. Usually they wax lyrical about distributed versus centralised workflows or the advantages of branching and merging, but I find that kind of misses something because it takes way too long to get to the point. The question is, what is Git’s biggest advantage, in business terms, over Subversion?
The answer is quite simple. Git supports workflows that Subversion does not, that have significant benefits for your code quality, team collaboration and knowledge sharing.
There are a few such workflows, and the ones that have become popular all have one thing in common. Code gets reviewed before it is merged into the main codebase, rather than waiting till after the fact.
In actual fact, Git’s flexibility about when you conduct code reviews leaves Subversion dead in the water. Pull requests on a web-based Git server allow you to not only review code before it is merged, but to involve the whole team in the code review and even to carry out code reviews on work that is still in progress if you’re that way inclined. In effect, every task, every user story, every feature becomes a complete conversation.
To be fair, you can adopt this workflow with Subversion, using either task branches or patches, but Subversion makes branching and merging so clunky, user-unfriendly and error-prone that it simply isn’t practical, and besides being similarly clunky, submitting patches blocks further work until your submission has been reviewed. Consequently in practice, on most Subversion-based projects, commit-before-review is the norm, and review-before-merge is only used on the most high-impact, high-risk work. And it shows—trunk-based Subversion-hosted projects almost always have a far, far lower quality than pull request-based Git-hosted projects.
Why the difference? Simple. In the commit-before-review workflow, every check-in becomes a fait accompli.
This can be a recipe for disaster.
If bad code gets checked in, you have to explicitly ask for it to be backed out or modified—and you have to follow through to ensure that this is done. Sometimes it can’t be backed out or modified, because other code has been checked in that depends on it. Contentious design decisions can all too easily be steamrollered in without any discussion—and in the event of a disagreement, backing them out can all too easily be filibustered.
There’s also a strong psychological pressure to let standards slip as well. When you’re checking in code as a fait accompli, it’s all too easy to check in ill-thought-out variable names, poor test coverage, that doesn’t follow the team’s coding standards, with useless commit summaries to boot. It’s also far too easy for your reviewer (singular—you seldom if ever get more than one person reviewing your code in this model) to decide to pick his or her battles and only focus on the more important things.
On the other hand, when the default action is “reject,” as with pull requests, the onus is on you as the author of the code to prove that your changes are fit for purpose. This gives you all the more incentive to get things right—to stick to the team’s agreed coding conventions, to write tests, to separate concerns correctly, and so on. It also means that you pay more attention to making your code readable and your commit summaries informative. After all, your team-mates (plural) are going to have to make this judgment call based on whether they can understand what you’ve done or not.
Another significant benefit of pull requests is that they dramatically improve knowledge sharing among the team. A new developer may submit a pull request that reinvents methods that already exist, or that violates coding standards that they didn’t know existed. Pull requests are an opportunity for education here—you can easily point them in the right direction. On the other hand, with commit-before-review, because it is so easy to overlook things such as these, opportunities to educate your team-mates get lost.
One other thing bears saying here. Even if you do manage to adopt a pull request-like workflow with Subversion, you still face one major limitation: changesets in Subversion are immutable. With Git, if the commit history of a task branch makes it difficult to review, you can always ask the author to revise it—clarifying commit summaries, squashing superfluous changesets, and perhaps (for experienced Git users) even teasing changesets apart. You can do this quite effectively with the git rebase --interactive
command. With Subversion, on the other hand, once it’s in, you’re stuck with it.
Pull requests are not the only advantage that Git has over Subversion. But they are the most important and the most business-critical. A pull request-based workflow with Git will give you a codebase that is much cleaner and much more robust, with fewer nasty surprises and an informative and useful source history. Trunk-based development in Subversion, on the other hand, can leave you with a very bad taste in your mouth. For this reason, sticking with Subversion raises serious questions about the quality and maintainability of your codebase.