james mckay dot net
because there are few things that are less logical than business logic

Posts tagged: dvcs

Martin Fowler and feature branches revisited

Fábio asks me this by e-mail:

I found this very interesting article (http://web.archive.org/web/20110721063430/http://jamesmckay.net/2011/07/why-does-martin-fowler-not-understand-feature-branches/) that once was in your blog, and it seems it can only be accessed in the archive, or at least I was not able to find it in the live version of your blog.

Is there a reason for that? Do you still keep the same opinion? If not, can you give me a quick hint of what to read to reach the same conclusion?

In answer to his question, I took the original blog post down in January 2013 along with everything else on my blog, because I wanted to give it a complete reboot. In the years since then, I’ve restored some of the old posts, going as far back as 2009, but I hadn’t restored that particular one, mainly because I felt that I’d been too confrontational with it and it hadn’t made me look good. However, since it’s out there in archive.org, and people are still asking about it, I’ve now restored it for historical reference. Unfortunately, I haven’t restored the comments because I no longer have a backup of them.

A bit of historical background.

I wrote the original post in July 2011. At the time, distributed version control tools such as Git and Mercurial had been around for about six years, but were still very new and unfamiliar to most developers. Most enterprise organisations were deeply entrenched in old-school tools such as Subversion and Team Foundation Server, which made branching and merging much harder and much more confusing than necessary. Most corporate developers were terrified of the concept, and vendors of these old-school tools were playing on this fear for all it was worth. This was intensely frustrating to those of us who had actually used Git and Mercurial, could clearly see the benefits, and yet were being stonewalled by 5:01 dark matter colleagues and managers who didn’t want to know. To us, Subversion was The Enemy, and to see someone of Martin Fowler’s stature apparently siding with the enemy was maddening.

On the other hand, these tools weren’t yet quite enterprise ready, with only weak Windows support and rudimentary GUIs. SourceTree was not yet a thing. Furthermore, the best practices surrounding them were still being thrashed out and debated by early adopters in the blogosphere and at conferences. In a sense, nobody properly understood feature branches yet. If you read all the discussions and debates at the time, we didn’t even agree about exactly what feature branches were. Martin Fowler, Jez Humble and others were working with the assumption that they referred to branches lasting several days or even weeks, while people such as Adam Dymitruk and myself went by a much broader definition that basically amounted to what we would call a pull request today, albeit with the proviso that they should be kept as short as possible.

These days, of course, everybody (or at least, everybody who’s worth working for) uses Git, so that particular question is moot, and we can now freely discuss best practices around branching and merging as mature adults.

The state of the question in 2017.

I’m generally somewhat more sympathetic to Martin Fowler’s position now than I was six years ago. Most of his concerns about feature branches are valid ones. Long-lived feature branches can be difficult to work with, especially for inexperienced teams and badly architected codebases, where big bang merges can be a significant problem. Feature branches can also be problematic for Continuous Integration and opportunistic refactoring, so they should very much be the exception rather than the rule. Feature toggles can also provide considerable benefits. You can use them for A/B testing, country-specific features, premium features for paying customers, and so on and so forth. I even started writing my own .NET-based web framework built around feature toggles, though as I’m no longer working with .NET, it’s not being actively developed.

However, many of my original points still stand. Short-lived branches, such as pull requests, are fine, and should in fact be the default, because code should be reviewed before it is integrated, not after the fact. Furthermore, if you’re using feature toggles solely as a substitute for branching and merging, you are releasing code into production that you know for a fact to be immature, untested, buggy, unstable and not fit for purpose. Your feature toggles are supposed to isolate this code of course, but there is always a risk that the isolation could be incomplete, or that the toggle could be flipped prematurely by mistake. When feature branches go wrong, they only go wrong in your development environment, and the damage is relatively limited. When feature toggles go wrong, on the other hand, they go wrong in production—sometimes with catastrophic results.

Just how catastrophic? The feature toggles article on Martin Fowler’s own website itself cites an example where badly implemented feature toggles cost one company $460 million in just 45 minutes. The company concerned, we are told, “went from being the largest trader in US equities and a major market maker in the NYSE and NASDAQ to bankrupt.” To be fair, there were other problems—they relied extensively on manual deployment processes, for example—but it is still a cautionary tale for anyone who thinks feature toggles should always be viewed as a Best Practice without exception.

Of course, not all feature toggles carry this level of risk, and in many cases, the effects of inadvertent exposure will be mostly harmless. In these cases, feature toggles will indeed be the better option, not least because they’re easier to work with. But in general, when deciding whether to use a feature branch or a feature toggle, always ask yourself the question: what would be the damage that this feature would cause if it were activated prematurely? If it’s not a risk you’re prepared to take, your code is better off on a separate branch.
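To make that risk concrete, here is a minimal sketch of a feature toggle in Python. Every name in it (FLAGS, is_enabled, experimental_discount, checkout_total) is hypothetical and invented for illustration; it is not taken from any particular toggle library. It only shows that the unfinished code is deployed alongside the stable code, and that a single flipped value is all that separates the two.

```python
# Hypothetical names throughout: this is an illustration, not a real library's API.

FLAGS = {"new_discount_engine": False}  # shipped to production switched off


def is_enabled(name: str) -> bool:
    """Return the toggle state; unknown toggles default to off."""
    return FLAGS.get(name, False)


def experimental_discount(total: float) -> float:
    """Unfinished code path: deployed, but meant to stay dormant."""
    return total * 0.5  # placeholder logic that has not been reviewed


def checkout_total(total: float) -> float:
    """Production code path guarded by the toggle."""
    if is_enabled("new_discount_engine"):
        return experimental_discount(total)
    return total


before = checkout_total(100.0)        # toggle off: the stable path runs
FLAGS["new_discount_engine"] = True   # one premature flip, e.g. a config mistake
after = checkout_total(100.0)         # the immature code path is now live
```

With the same code on a separate branch, the equivalent mistake would only ever affect a development environment.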

Check-in before code review is an antipattern

I’ve never been that satisfied with most explanations that I see on the Internet of why Git is better than Subversion. Usually they wax lyrical about distributed versus centralised workflows or the advantages of branching and merging, but I find that kind of misses something because it takes way too long to get to the point. The question is, what is Git’s biggest advantage, in business terms, over Subversion?

The answer is quite simple. Git supports workflows that Subversion does not, that have significant benefits for your code quality, team collaboration and knowledge sharing.

There are a few such workflows, and the ones that have become popular all have one thing in common. Code gets reviewed before it is merged into the main codebase, rather than waiting till after the fact.

In actual fact, Git’s flexibility about when you conduct code reviews leaves Subversion dead in the water. Pull requests on a web-based Git server allow you to not only review code before it is merged, but to involve the whole team in the code review and even to carry out code reviews on work that is still in progress if you’re that way inclined. In effect, every task, every user story, every feature becomes a complete conversation.

To be fair, you can adopt this workflow with Subversion, using either task branches or patches, but Subversion makes branching and merging so clunky, user-unfriendly and error-prone that it simply isn’t practical, and besides being similarly clunky, submitting patches blocks further work until your submission has been reviewed. Consequently in practice, on most Subversion-based projects, commit-before-review is the norm, and review-before-merge is only used on the most high-impact, high-risk work. And it shows—trunk-based Subversion-hosted projects almost always have a far, far lower quality than pull request-based Git-hosted projects.

Why the difference? Simple. In the commit-before-review workflow, every check-in becomes a fait accompli.

This can be a recipe for disaster.

If bad code gets checked in, you have to explicitly ask for it to be backed out or modified—and you have to follow through to ensure that this is done. Sometimes it can’t be backed out or modified, because other code has been checked in that depends on it. Contentious design decisions can all too easily be steamrollered in without any discussion—and in the event of a disagreement, backing them out can all too easily be filibustered.

There’s also strong psychological pressure to let standards slip. When you’re checking in code as a fait accompli, it’s all too easy to check in code with ill-thought-out variable names and poor test coverage, code that doesn’t follow the team’s coding standards, with useless commit summaries to boot. It’s also far too easy for your reviewer (singular: you seldom if ever get more than one person reviewing your code in this model) to decide to pick his or her battles and only focus on the more important things.

On the other hand, when the default action is “reject,” as with pull requests, the onus is on you as the author of the code to prove that your changes are fit for purpose. This gives you all the more incentive to get things right—to stick to the team’s agreed coding conventions, to write tests, to separate concerns correctly, and so on. It also means that you pay more attention to making your code readable and your commit summaries informative. After all, your team-mates (plural) are going to have to make this judgment call based on whether they can understand what you’ve done or not.

Another significant benefit of pull requests is that they dramatically improve knowledge sharing among the team. A new developer may submit a pull request that reinvents methods that already exist, or that violates coding standards that they didn’t know existed. Pull requests are an opportunity for education here—you can easily point them in the right direction. On the other hand, with commit-before-review, because it is so easy to overlook things such as these, opportunities to educate your team-mates get lost.

One other thing bears saying here. Even if you do manage to adopt a pull request-like workflow with Subversion, you still face one major limitation: changesets in Subversion are immutable. With Git, if the commit history of a task branch makes it difficult to review, you can always ask the author to revise it—clarifying commit summaries, squashing superfluous changesets, and perhaps (for experienced Git users) even teasing changesets apart. You can do this quite effectively with the git rebase --interactive command. With Subversion, on the other hand, once it’s in, you’re stuck with it.

Pull requests are not the only advantage that Git has over Subversion. But they are the most important and the most business-critical. A pull request-based workflow with Git will give you a codebase that is much cleaner and much more robust, with fewer nasty surprises and an informative and useful source history. Trunk-based development in Subversion, on the other hand, can leave you with a very bad taste in your mouth. For this reason, sticking with Subversion raises serious questions about the quality and maintainability of your codebase.

Mercurial is doing better than you think

Here’s something interesting that I came across the other day. Apparently, Facebook has recently switched from Git to Mercurial for source control for its internal projects, and on top of that, they have hired several key members of the core Mercurial crew, including Matt Mackall, the Mercurial project lead.

Their main reason for this decision is performance: some of Facebook’s repositories are so large that they are bringing Git to its knees. They looked at the Git source code and the Mercurial source code and felt that the latter would be easier to fine tune to give them the performance that they needed.

This is an interesting development. With Git now on the verge of overtaking Subversion to become the most widely used SCM in corporate settings, it’s tempting to write off Mercurial as something of a lost cause. Back in January, when I posted a suggestion on the Visual Studio UserVoice forums that Microsoft should support Mercurial as well as Git in Team Foundation Server, I thought it would be doing well to get three hundred votes and then plateau. But as it stands, it has now passed 2,500 votes and is still going strong, making it the tenth most popular open request on the forums and the second most popular request for TFS in particular, with nearly twice as many votes as the original DVCS request had. Git may have cornered the market for open source collaboration, but its unnecessarily steep learning curve and often pathological behaviour make it surprisingly unpopular with the majority of developers for whom public collaboration on open source projects is not a priority.

It’ll be interesting to see the outcome of this, but one thing is for certain: it’s not all over yet.

The writing is on the wall for Subversion as Git takes over

This time last year, the Eclipse Community Survey noted that Git’s market share had risen from 12.8% to 27.6%, while Subversion had dropped from a seemingly unassailable 51.3% to 46.0%. This year’s survey results, published yesterday, note that this trend has continued: Git/GitHub has risen to 36.3% while Subversion has dropped to 37.8%. Subversion may still be in the top slot for now, but its lead is tiny and it is rapidly losing ground.
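As a back-of-the-envelope reading of those figures, the sketch below works out how small the remaining lead is and how fast it is closing. The straight-line extrapolation is purely an assumption made for illustration; the survey itself makes no such projection.

```python
# Eclipse Community Survey shares quoted above, in percent.
git_last, git_now = 27.6, 36.3
svn_last, svn_now = 46.0, 37.8

# Subversion's remaining lead, in percentage points.
gap = svn_now - git_now

# How quickly the two are converging: Git's gain plus Subversion's loss, per year.
closing_rate = (git_now - git_last) + (svn_last - svn_now)

# Assumption: last year's changes simply continue in a straight line.
months_to_crossover = 12 * gap / closing_rate

print(f"lead: {gap:.1f} points, closing at {closing_rate:.1f} points per year")
print(f"crossover in roughly {months_to_crossover:.1f} months")
```

On these numbers the lead is a small fraction of one year's movement, which is why it can fairly be called tiny.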

Other data sources, such as itjobswatch.co.uk, paint a similar picture. Look at how demand for Git skills has grown in recent years:

[Figure: Git demand according to itjobswatch.co.uk]

Job trackers such as this tend to give Subversion a bigger lead, because they focus on the rather more conservative corporate market and purposely ignore the world of hobbyists and open source developers. But even so, the trend is clear. Thirteen percent of UK programming jobs now ask for Git experience. Seventeen percent ask for Subversion, but the gap is narrowing rapidly and it is almost certain now that Git will overtake Subversion in corporate settings by the end of this year.

We are now fast approaching the point at which not using Git will increasingly hurt developers and companies alike. As a developer, a lack of Git experience is now starting to call into question your willingness and ability to keep your skills up to date. As a company, if you don’t use Git, you will find yourself competing for good developers against companies who do. Once you’ve got used to Git, Subversion is a painful experience, and fewer and fewer competent developers will be prepared to put up with it given the choice.

Then there are third party products and services. Already we are seeing an increasing number of these coming on the market which only support Git — GitHub and Heroku being two prominent examples. Those that do support other alternatives are increasingly treating them as an afterthought, with only limited features. Even if you’re a Microsoft-only shop, Git is getting harder to avoid. Entity Framework and ASP.NET MVC, along with several other Microsoft-run projects, are now hosted using Git. Team Foundation Server is introducing Git as a first-class source control option, complete with the tight end to end integration experience which TFS users value so much. Windows Azure makes Git one of its main avenues for deployment.

Not only has Subversion fallen behind, its development is painfully slow. Subversion 1.7, originally scheduled for the spring of 2010, was only released in October 2011 — a year and a half late. Subversion 1.8 is also a year late and has had its scope cut back by a half. Subversion 1.9, tentatively slated for this time next year, could well see even more significant delays, especially if the shift in demand forces its key players to divert resources to Git-based products and services. Subversion 1.10, the first to promise some genuinely useful new features (shelving and checkpointing), is “speculatively at best” scheduled for mid-2015. It is quite possible that it may never be released.

Subversion has no future. It is old, obsolete, decrepit technology and you need to be planning for its end of life. Git, on the other hand, is rapidly becoming the lingua franca of source control throughout the entire software industry. Love it or hate it, but if you don’t take it seriously, it won’t be long before the industry doesn’t take you seriously.

On Git’s growth and the reliability of the Eclipse survey

I’ve had quite a few comments on my blog post about Git questioning my conclusions, so I thought I’d better follow them up, since my original post didn’t make it entirely clear why I’d come to the conclusions that I had.

Why did I write this post?

Because Git’s growth exceeded my best-case expectations by a significant margin.

Up to now there’s been a possibility (and quite a plausible one at that) that Git’s popularity might be an illusion, thanks to the echo chamber of the blogosphere and its noisy fanboy culture, and that its usage might be largely restricted to open source projects, Ruby on Rails shops, and hobbyists, with little traction in corporate settings. This is what the Subversion folks would like us to believe:

Greg Stein, an Apache vice chairman and vice president of the Subversion project, suggests that Git’s rise may not be a deep one. “I think a lot of what we’re seeing is that Git has got a lot of mind share. [But] I don’t know if it’s necessarily true that a lot of corporate development shops are switching [to Git],” he says.

In the past year or so, there’s been some anecdotal evidence coming to light against this position, but no conclusive quantitative surveys up to now showing the extent of Git’s adoption in the enterprise. But the past three years have given us enough data to start coming up with some testable predictions as to what this year’s results would be, to confirm whether or not this might be the case.

In the end, Git’s result was significantly higher than even my most optimistic prediction, which leaves little or no room for argument on this one. Any claims that it hasn’t yet “crossed the chasm” are now quite frankly preposterous. The myth that it is not suitable for enterprise settings has been well and truly busted.

Busted!

Isn’t this just fanboyism?

I’d just like to clarify in response to this one that I’m not a Git fanboy. I still have a strong preference for Mercurial, and have no intention of stopping using it in the immediate future. It is much more polished, much easier to learn, much more predictable, and all in all a much better advert for what distributed version control should be. It’s also much more powerful than Git users give it credit for, and much less intimidatingly elitist. In any case, I can always use hg-git to integrate it with Git repositories if necessary.

[Figure: Git versus Mercurial is The Hunger Games of programming]

Personally, I feel that the software development world has been bullied into accepting Git instead by people who have insisted on treating the whole DVCS scene as if it were The Hunger Games, and if they’d only just recognised that the first and most important rule of marketing is that first impressions count the most, we’d have got to where we are now a whole lot faster.

In actual fact, many people who use Git are unhappy with it. It’s rather telling that Git not only has more “hates” on amplicate.com than TFS, but a higher ratio of “hates” to “loves.” Maxx Daymon says this:

Of the shops I know that have converted to git, it was driven by an aggressive few, and the majority of developers are and remain unhappy about it. They may get over it, but they certainly haven’t become git evangelists in the process.

Yes, Git has won, but it’s won a battle that it had no business fighting in the first place. Whether we like it or not, it’s well on its way to becoming the industry standard, so we just need to get used to it, like every other industry standard. (Heck, who in their right mind still thinks that XML was a good idea?)

Do the survey’s demographics affect my conclusions?

Two or three people asked me about potential sources of bias in the survey. While there will be demographic factors at work, as far as I can tell, these tend to reinforce my conclusions rather than undermine them.

The survey’s demographics are Eclipse users. Eclipse is a Java IDE, a whole lot of related tools, and the organisation (the Eclipse Foundation) that oversees their development. Java is, by some accounts, the most widely used programming language in the world, but it is not a trendy or cool language to work with like Python or Ruby (Android development notwithstanding), and Eclipse is not a trendy or cool IDE to use, so the chances are high that if you’re using it, you’re getting paid for it. In fact, the survey actually asked this question: only 3.8% of respondents were individuals, not associated with any organisation, and 6.7% were students.

The biggest area that is under-represented here is, obviously, Microsoft developers. This means mainly that we have little or no idea as to TFS’s market share. It also means that Mercurial usage may be under-represented too, but anecdotal evidence seems to suggest that there is a general drift away from Mercurial towards Git in the Microsoft world as well.

I wondered if Android development could be skewing the results, since Git is the de jure standard SCM for Android development, and Eclipse is the IDE recommended by Google. However, only 4.1% of respondents said that they are primarily developing mobile applications, so I don’t think that’s much of a factor after all.

All in all, it seems clear to me that we have a pretty enterprisey demographic here, certainly not one composed of early adopters and innovators. In fact, if anything, early adopters and innovators are probably under-represented. Besides, if this survey really had been influenced by early adopters and innovators, it would have shown a much higher result for Git last year as well.

There is one other issue to be addressed here. 84% of the respondents to the survey write code in their spare time, so perhaps they are early adopters and innovators after all? In actual fact, this wouldn’t make the slightest bit of difference to the survey outcome for the following reason. Any software development team will contain a mixture of early adopters, of pragmatists, of conservatives, and of laggards. But they will all use the same primary source control system. Early adopters in Subversion shops will tick the Subversion box wishing they could be ticking the Git box instead. Mercurial fans in Git shops will tick the Git box wishing they could be ticking the Mercurial box instead. Maybe some people in Git shops will tick the Git box wishing they could be ticking the Subversion box instead. And so on.

The fact remains that these are corporate developers working in corporate shops telling us what their companies are using, not what they are trying to get their companies to use.

Isn’t this an argumentum ad populum?

The implication here is that argumentum ad populum (the idea that the most popular tool is the best) is a fallacy.

This overlooks the fact that at a certain point, increased mindshare does start to produce tangible benefits. Development proceeds faster. More products and services are released that support it. It becomes easier to recruit developers who are familiar with it. It becomes easier to find training for it. The gap between it and the alternatives widens, making it an even more compelling option. The effect becomes self-reinforcing and can even end up with the leading product establishing itself as a standard. Economists call these factors “network effects,” and they are the reason why, for example, the inefficient, uncomfortable and wrist-trashing QWERTY keyboard layout still has an iron grip on typing, and we have so much difficulty persuading people to consider technically better alternatives such as Dvorak or Colemak.

We’re seeing this with Git. An increasing number of services are coming online that only support Git: high profile examples include GitHub and Heroku. Bitbucket introduced Git support back in the autumn, only a year and a half after declaring that it would never do so. Windows Azure has first-class support only for Git and TFS, with neither Subversion nor Mercurial getting so much as a mention. Some of the Git IDEs may still be rough around the edges now, but what will they be like in a year’s time?

Another effect of increasing mindshare is that Git is turning into an industry standard. For starters, it has been the de facto standard for open source for some time now. For some areas, it is even the de jure standard. If you want to do Ruby on Rails development, or Linux kernel development, or jQuery plugin development, or Android development, or Node.js development, for example, it’s simply not up for discussion: you use Git. This may not yet be the case for whatever area of development you are involved with, but it’s only a matter of time.

Don’t tools and languages come and go ridiculously quickly?

One objection was that fashions in software development tools and languages change very quickly and developers are a pretty fickle bunch.

In actual fact, this is only true for languages and tools that are currently in vogue with the cool kids. Once something approaches the top of the mainstream, it’s a completely different ball game.

Let’s consider programming languages as an example. At the moment, there is a lot of excitement around languages such as CoffeeScript, Clojure and Scala. CoffeeScript recently entered the top ten on GitHub, for instance. However, if you look at the Tiobe Index, you’ll see that these languages are all fairly low down the overall popularity list. Question tags on Stack Overflow and searches of job listings paint a similar picture. The list of top ten programming languages in industry changes at a rate that can best be described as glacial. Tiobe Software noted last month that the only change to their top ten list in the past eight years has been that Objective-C is in and Delphi is out. This is understandable, because once code has been written, it tends to stick around.

Source code management tools are another area where change at the top is slow. Again, you can understand why. Your source control system has been entrusted with the safekeeping of your intellectual property — your assets, your team’s entire commercial value. It also sits at the heart of your processes and infrastructure, and changing it for another system is a major undertaking and not a decision to be taken lightly. When you do change, you do so as a carefully planned exercise spanning months or even years.

For this reason, I didn’t expect Git to poll much more than about 20% or so this year. The fact that it has gone from less than 13% market share to more than 27% in a single year is quite extraordinary, and not to be taken lightly. It’s now up near the top of the mainstream — a position from which it won’t easily be shifted — and if its current rates of growth continue, it’s on course to overtake Subversion to become the most widely used SCM industry-wide within a matter of months.

What results did I expect, and why?

Based on the figures from previous years, I expected Git to poll somewhere in the 18-20% range.

The best possible scenario would have been where Git grew industry-wide at roughly the same rate as GitHub. GitHub reached one million users in mid-August, and today it has grown to just over 1.7 million, a year-on-year growth of about 90%. If Git usage industry-wide had grown at this rate, we would expect it to score about 24% in the survey. However, due to the natural conservatism of corporate development shops, I thought this would be an upper limit, and a pretty unlikely one at that.
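The arithmetic behind that best-case figure can be sketched as follows. The user counts and last year's survey share of 12.8% are the ones quoted in these posts. The ten months assumed between the two GitHub counts is my own assumption, since the exact interval is not stated, so treat the output as a rough check rather than a reconstruction of the original calculation.

```python
# Figures quoted in the posts.
users_then = 1.0e6           # GitHub users in mid-August
users_now = 1.7e6            # GitHub users "today"
git_share_last_year = 12.8   # percent, previous Eclipse survey

# Assumption (not stated in the post): about ten months between the two counts.
months_elapsed = 10

# Scale the observed growth up to a full twelve months.
annual_factor = (users_now / users_then) ** (12 / months_elapsed)
annual_growth = annual_factor - 1

# Best case: Git's survey share grows at the same annual rate as GitHub.
projected_share = git_share_last_year * annual_factor

print(f"annualised growth: {annual_growth:.0%}")
print(f"projected Git share: {projected_share:.1f}%")
```

Under that assumption the annualised growth comes out at just under 90% and the projected share at a little over 24%, in line with the figures above.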

On the other hand, if Greg Stein’s analysis proved to be correct, and Git adoption really was hitting a roadblock in corporate settings, I would have expected its growth over the past year to have slowed, leaving it perhaps somewhere in the region of 16%, mainly because Git has been close to saturation point among high-profile open source projects and Rails shops for quite some time now. Open source projects which have only recently migrated, such as Django and MediaWiki, are fairly late adopters in this respect, and ones that are still clinging to Subversion, such as WordPress and Apache, are notoriously conservative.

The results for Mercurial took me by surprise, however. I expected it to have grown slightly — possibly as high as 6% — or at the very least to have held steady. The fact that it has lost nearly half its users was most disappointing.

What about interoperability tools?

Mercurial and Git do work fairly well together via the hg-git extension, though you may find one or two corner cases where it may not be totally seamless. For example, Git supports octopus merges (where you can merge several branches into one in a single commit) but Mercurial doesn’t, and I’m not sure whether hg-git could handle that consistently. I did also end up with duplicate changesets on one occasion a couple of years ago, though I never got round to looking into exactly why, and it may have been due to a problem that has since been fixed.

Both Git and Mercurial can be used as a front-end to Subversion, and if you are stuck with Subversion I strongly recommend it if possible. However, there are limitations to what you can achieve with these tools, and you lose out on the significant benefits to team workflow that DVCS tools can provide. They are also a very leaky abstraction: Subversion only supports a single line of development on each named branch, for instance, so you will have to make much more extensive use of rebase, which is a riskier and more difficult operation than merging. There are also plenty of ways in which Subversion users can really mess with your day, such as renaming a branch, or moving a subtree from one branch to another, or using file externals.

Git can be used alongside TFS with git-tfs, though I wasn’t able to get it to work due to networking issues so I don’t know how well it works.

Should you switch to Git?

My whole point in these posts has been that you can no longer dismiss DVCS out of hand as only being suitable for open source and noisy fanboys, and that you need to take it seriously, no matter how big or small your team. Whether or not you adopt Git wholesale is a different matter that depends on several factors.

As I said, switching your source control system is a major undertaking and not one to be taken lightly. On some existing projects, especially those with a lot of deeply entrenched processes and infrastructure, or long-term contracts in place, it may not even be possible. For projects whose main phase of development is over and which are now in maintenance mode, it’s probably not worth it. In some cases, specialised regulatory requirements may pose additional constraints, though personally I’m sceptical about claims to that effect: contrary to what some vendors will tell you, it’s certainly possible to be ISO 9001 or CMMI compliant using Git.

However, it’s becoming abundantly clear now that the writing is on the wall for Subversion. For now, it’s still the most widely used option, and its decline over the past year may have been slower than I expected, but if current trends continue, it will be overtaken by Git industry-wide in about a year’s time. It is a very basic, spartan system even by centralised source control standards and its development is slow. No doubt it will take several years to die off completely (eight percent of developers are still using CVS, believe it or not) but it’s only a matter of time before you will need to plan for its end of life.

Despite its loss of mindshare, I still think Mercurial is worth considering as an alternative to Git. I do wonder how long it can sustain enough momentum to remain viable, but should you need to, switching from Mercurial to Git is almost trivial thanks to hg-git. But certainly for new projects, you need a very convincing reason to stick with old-school centralised source control now. And these reasons are starting to run out.