james mckay dot net

because there are few things that are less logical than business logic
04 Oct

Perforce Merge: a very nice free replacement for TortoiseMerge

No matter which source control tool you’re using, sooner or later you’ll encounter a merge conflict. When this happens, a decent graphical merge tool is a must-have.

There are two different types of merge tools. Two-way merge tools show you your version of the file and the other person’s version of the file side by side. Three-way merge tools also show you the original file in the middle. This helps clear up a lot of confusion, since you can see what the original file looked like before anyone did anything to it.
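To make the value of the original version a bit more concrete, here is a minimal Python sketch of the decision a three-way merge makes for a single chunk of text. The function name `merge_chunk` and the `CONFLICT` marker are made up for illustration; real merge tools work on aligned diff hunks and are considerably more sophisticated than this.

```python
CONFLICT = object()  # hypothetical sentinel meaning "a human has to decide"


def merge_chunk(base, mine, theirs):
    """Decide the merged text for one chunk, given all three versions."""
    if mine == theirs:
        # Both sides agree: nobody changed it, or both made the same change.
        return mine
    if mine == base:
        # Only the other person changed this chunk, so take their version.
        return theirs
    if theirs == base:
        # Only I changed this chunk, so keep my version.
        return mine
    # Both sides changed the chunk in different ways: a genuine conflict.
    return CONFLICT


if __name__ == "__main__":
    print(merge_chunk("total = 0", "total = 0", "total = 1"))
    print(merge_chunk("total = 0", "sum = 0", "total = 1") is CONFLICT)
```

A two-way tool only sees `mine` and `theirs`, so it cannot tell "only they changed it" apart from "only I changed it". That is exactly the confusion the original file clears up.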

So far, I’ve been using TortoiseMerge as my merge tool of choice, since it comes with TortoiseSVN, it’s familiar, it’s reasonably usable, and it is not too ugly. The only downside is that it’s two-way, rather than three-way. TortoiseHg gives you kdiff3 by default instead, which is a three-way merge tool, but it’s an absolute eyesore and its usability leaves a lot to be desired. Up to now, I’ve always switched it out in favour of TortoiseMerge.

Recently I came across the Perforce merge tool P4Merge (hat tip: Novaleaf Game Studios) and I must say that I’m impressed. It gives a very clear, intuitive view of what’s changed, with a text editor underneath that lets you resolve the conflicts easily. The icons on the right-hand side of the text editor allow you to select which version you want to cherry pick. Oh, and visually, it looks fantastic.

Perforce merge tool in action

P4Merge comes with the Perforce client tools which are a free download: if you’re not using Perforce itself for source control, select only the merge tool on the installation wizard and deselect everything else.

image

Once you’ve installed P4Merge, TortoiseHg will automatically detect it and list it as an option in the TortoiseHg configuration dialog or merge wizard. If you’re using Subversion or Git with their respective Tortoises, you need to specify the command line in the options dialog: “Using a cool merge tool with SVN or GIT” tells you how. Team Foundation Server is somewhat more complicated, but still doable: “Using P4Merge with Visual Studio 2008 and TFS” explains how to tackle it.

The only downside compared with TortoiseMerge is that cherry-picking changes only works at the block level, rather than line by line. However, since the resolution panel at the bottom is a free-form text editor, you can easily copy and paste as necessary, so this is no big deal. I think I’ll be using it as my merge tool of choice from now on.

30 Sep

Solving the tangled working copy problem with hunk selection and Mercurial Queues

Programming is full of dilemmas.

You’ll be deep in concentration, working on your new application, adding some new payment options, when all of a sudden you notice a potential race condition in a nearby method that might cause customers to be billed twice. You know it’ll take all of two lines to fix, so you pop in the fix and carry on with your new functionality.

A few minutes later, you notice that another method is pulling in an RSS feed from a hard-coded, and outdated, source, so you stop to extract it to a configuration setting and use the more up-to-date feed.

You finish fixing up your new functionality, then you come to check in your code. Now, you have a problem. You have three separate changes tangled up in your working copy.

Most developers would simply bundle all three changes into a single commit, possibly only leaving a commit summary (you do fill in your commit summaries, don’t you?) saying “Added some new payment options to application.” This is misleading, because it doesn’t say anything about the race condition or the RSS fix.

You could say “Added some new payment options to application, fixed a race condition and used a more up to date feed.” But this doesn’t make it all that clear which part of your commit fixes which problem. Someone looking through your history six months later might see your race condition fix has introduced a regression, not realise that it is there to fix a race condition, and revert it to what it was before.

You really need to observe the Single Responsibility Principle, and split the three tasks into separate commits.

So, what do you do?

With traditional source control tools, you are likely to be told, “You should have shelved your changes, reverted your working copy, and performed these tasks as separate commits. Or, if your source control tool doesn’t support shelving, you should save a patch, then revert your working copy, then make the new change, then re-apply the patch.”

There’s just one problem with this bit of advice. It is inefficient, and a total mismatch to the way your average programmer’s brain works.

To see why, let’s rewind your last half hour of coding and start again.

You’ll be deep in concentration, working on your new application, adding some new payment options, when all of a sudden you notice a potential race condition in a nearby method that might cause customers to be billed twice. You know it’ll take all of two lines to fix, but you need to keep these changes separate.

So you shelve your changes, revert your working copy, getting prompted to save/reload/merge your files in the process, and then Visual Studio insists on reloading your entire solution because you had changed something in the .sln file. And since your solution contains more than three projects and they reference more than two assemblies that aren’t in the GAC, it takes forever to reload and you’ve got distracted onto something else while you’re waiting.

By the time you manage to start editing your project again, you’ve been completely knocked out of the zone, and you’ve forgotten why you shelved your changes in the first place.

You see? All the so-called best practice advice about shelving, reverting your working copy, and all that, overlooks one very important fact about programming: it is a mentally intensive discipline that often requires you to juggle several complex details in your mind at once. Even small diversions, such as having to save files, wade through menus to find your shelving tool, and then think of a name for your shelve set, can have a detrimental effect on your workflow. They add to the mental burden on you and make your job more difficult. Shelving is not a best practice at all, but a workaround for the fact that you don’t have the right tools for the job.

Wouldn’t it be better just to get the changes down as you notice them and then use a tool that lets you sort out your commits later, going through all the changes you’re checking in, cherry-picking them into a series of separate patches?

Git users wax lyrical about the index, or staging area, because it is designed to solve just this problem. It provides an intermediate store between your working copy and your history, where you can stage your changes, not just one file at a time, but one hunk at a time, using the command git add -p. Once you’ve staged your changes in this way, you can then commit them as a separate, logical change set.

Mercurial has a similar feature in TortoiseHg called “hunk selection.” By double-clicking on a change in the “Hunk selection” tab on the commit dialog, you can include or exclude it from the check-in. If you’re a command line freak, the record extension does something similar, and the crecord extension allows you to take it down to the line-by-line level.

image

You can click on “Commit preview” once you’re done to see what’s going to go in your commit.

There’s just one problem with all this though. As Eric Sink points out, you’re checking in a version of your code which you’ve never tested. This is a bad practice, and it can bite you if you ever need to run git/hg bisect to track down a regression.

So let’s sum up what your options are so far.

  • Check in everything in a single commit at once. This is bad practice.
  • Use git add -p or hg record/TortoiseHg’s hunk selection to separate out your changes into separate commits. This is also bad practice.
  • Use shelving and patches to separate out your changes. This is a hack, which slows you down and risks knocking you out of the zone and making you lose track of your changes altogether.

So is there anything we can do to fix this? As a matter of fact, it turns out that there is.

One of my favourite features of Mercurial is the mq (Mercurial Queues) extension. This may sound a little esoteric, but what it does is quite simple. You can put a whole series of commits into a separate staging area, where you can edit them, reorder them, apply them, unapply them, chop and change them, split them up or combine them together, and of course, most importantly, run your unit tests on them, to your heart’s content before applying them to your master repository.

Let’s just say I am working on some changes to my Comment Timeout WordPress plugin. I’ve done two different things: updated the version number to 2.1.2, and tidied up some code formatting. I want to separate these into two different commits. First of all, I select the hunks that I want to go into the first commit, and then I type a name for the patch into the “QNew” box (keep this short, a couple of words should do):

image

You’ll note that the “Commit” button changes to “QNew” to indicate that your next commit goes into the patch queue. Clicking this will automatically show you the patch queue and change the button to “QRefresh”:

image

You can change the message, or edit the files, or select and unselect hunks to your heart’s content, then click QRefresh. Then you can add a second commit by typing another name into the QNew box:

image

Clicking the “QNew” button creates a second patch:

image

Okay, so now we have a whole series of patches. It’s a bit like the Git index, except that rather than having just one staging area, you have several, all stacked one on top of each other. In the Repository Explorer, these revisions appear as a regular part of your DAG:

image

The yellow label “qparent” indicates the parent revision on top of which the patch queue is being applied; “qbase” indicates the first patch in the queue; “qtip” indicates the last; and the blue labels give the names of the patches. You could push them to another repository if you wanted, but I don’t recommend this. Keep them on your own machine for the time being.

Now that we’ve separated out our commits into a series of patches, we can get on with the job of placating the people who are worried about best practices. Namely: testing each patch before applying it.

First, double click on “[qparent]”:

image

You’ll note that our two patches have both dropped below the line, and they’re now greyed out. If you take a look at the repository explorer, you’ll see that there’s no sign of them:

image

The last revision has been marked in bold to indicate that that’s the one where your working copy is at.

If you double click on “tidy up” it will move above the line and turn blue again, to indicate that your working copy has been updated to this version:

image

That patch is now where your working copy is at. Do whatever testing you want to do on it, then click on the next one to apply it:

image

Once you’re satisfied that all your patches are ready, right-click on any of them and choose “Finish applied”:

image

Hey presto! Your work is now all committed to your repository, ready to be pushed, pulled or otherwise shared with the big wide world.

image

There are other things you can do with patches in your queue which I haven’t covered here, such as reordering them, or combining two or more of them into one.
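For the command line freaks, the test-each-patch routine above can be sketched as a small Python script that drives the mq commands `hg qpop --all`, `hg qpush` and `hg qfinish --applied`. The function names, the `patch_names` list (which you might take from the output of `hg qseries`) and the test command are placeholders of mine, so treat this as a sketch under those assumptions rather than a finished tool.

```python
import subprocess


def run_command(args):
    """Run a command and return its exit code (0 means success)."""
    return subprocess.call(args)


def check_patch_queue(patch_names, test_command, run=run_command):
    """Apply each mq patch in turn and run the tests on top of it.

    Returns the name of the first patch that fails to apply or whose tests
    fail, or None if every patch passed and the queue was finished.
    """
    # Unapply everything so that the working copy is back at qparent.
    run(["hg", "qpop", "--all"])
    for name in patch_names:
        # Apply the next patch in the series.
        if run(["hg", "qpush"]) != 0:
            return name
        # Test the working copy exactly as this commit will record it.
        if run(test_command) != 0:
            return name
    # Every patch passed: turn the applied patches into regular changesets.
    run(["hg", "qfinish", "--applied"])
    return None
```

Calling `check_patch_queue(["version", "tidy up"], ["my-test-runner"])` (the first patch name and the test runner are hypothetical) would pop everything, push and test each patch in turn, and only finish the queue if nothing failed. The `run` parameter is there so that you can swap in a fake runner and try the logic out without touching a real repository.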

Patch queues and hunk selection are two extremely powerful features of Mercurial. While they require a little bit of care and attention in order to adhere to best practices, this is no more arduous than the discipline needed for any source control tool, and they can provide a significant productivity boost, simply because they let your tools work around you rather than forcing you to work around your tools.

20 Sep

On code generation

A lot of developers love code generation. Some pretty smart ones hate it. The following is my observation on the matter.

Code generation done right can save masses of time. Code generation done wrong has the exact opposite effect.

Here is what code generation done wrong looks like. I am asked to help out with your project, and told that it’s corrupting data. After a bit of rootling around, I find two hundred business service classes, each containing numerous very similar looking methods that all follow this pattern:

public bool SaveWidget(Widget widget) {
    try {
        dataSession.Save(widget);
        return true;
    }
    catch (Exception) {
        return false;
    }
}

Pffft. Pokémon exception handling all over the place. No wonder it’s corrupting data.

If you’d left your templates in your solution, with a comment explaining where to find them, I could get rid of Pikachu and Bulbasaur and friends in five minutes. But since you didn’t, I have the thankless task of spending the rest of the week going through the whole lot by hand.

Here’s how to do code generation right:

  1. Do include a header at the top of each autogenerated file, indicating (a) that it has been autogenerated, and (b) what it was autogenerated by.
  2. Do include your template files and/or scripts in source control.
  3. Do re-generate your autogenerated files as part of your build process, before running your unit tests.
  4. Don’t edit autogenerated files manually. Use partial classes and partial methods instead.

By following these four simple rules, you can ensure that if someone comes to your project at a later date and finds that there is a fundamental flaw in your generated code, they can easily fix it.
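As a rough illustration of rules 1 and 2, here is a minimal Python sketch of a generator that stamps every file it produces with a header saying what generated it. The template text, the script name `generate_services.py` and the template name `service.template` are all made up for the example; in a real project the template would live in its own file under source control, and the generated method body would be whatever your project actually needs.

```python
from string import Template

# Hypothetical template, inlined here only to keep the sketch self-contained.
SERVICE_TEMPLATE = Template('''\
// <auto-generated>
//   Generated by $generator from $template_name.
//   Do not edit by hand: change the template and re-generate instead.
// </auto-generated>
public partial class ${entity}Service {
    public void Save$entity($entity item) {
        dataSession.Save(item);
    }
}
''')


def generate_service(entity, generator="generate_services.py",
                     template_name="service.template"):
    """Render the source for one service class, header included."""
    return SERVICE_TEMPLATE.substitute(
        entity=entity, generator=generator, template_name=template_name)


if __name__ == "__main__":
    print(generate_service("Widget"))
```

Because the output is declared as a partial class, hand-written additions can go in a separate file (rule 4), and running the script as a build step before the unit tests takes care of rule 3.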

17 Sep

Diaspora

I was hoping that Diaspora would prove to be a Facebook killer, but somehow, I don’t think that’s going to happen. Their choice of technology stack — Ruby on Rails with a MongoDB back-end — has seen to that.

Rails is something of a darling among geeks, and it is great for web applications that you only install on your own servers. So too are the plethora of NoSQL databases that seem to be all the rage among that kind of crowd these days. However, they are totally unsuitable for distribution to the general public.

Why? Simply because most cheap’n’nasty web hosts do not support them.

This is something that the WordPress core developers understood. Their decision to continue supporting PHP 4 long after its end of life brought howls of derision from WordPress plugin developers everywhere, but they were thinking first and foremost about their users — people with only limited computer skills who would deploy WordPress on their own shared hosting accounts. They wanted to get the widest possible audience.

Diaspora should be targeting run-of-the-mill Facebook users, not geeks. If it had been written in PHP with a MySQL back end, it would be more likely to see widespread adoption. However, by opting for Ruby on Rails, they’ve placed a barrier to adoption in front of the kind of technically-capable-but-not-particularly-geeky users on a tight budget who would otherwise set up Diaspora seeds for local schools, churches, clubs and so on. They risk turning it into a geek ghetto.

Sigh. Perhaps I’ll just have to put up with Faceborg after all.

04 Sep

Send patchbombs to the mailing list, not pull requests to the project lead

I sent a pull request to the lead developer of an open source project that I’m starting to contribute to. This is the default that both github and bitbucket give you, and the first thing that newbie open source contributors on those sites will think of doing.

I got this response from him:

I’d *massively* prefer patch submission via patchbomb to the … list – that’s generally where they come in, and sometimes people other than me find potential problems in the patch. Can you do that?

D’oh! My bad! It’s the same kind of thing as the rule that you should ask the whole community, not just one of its members. Admittedly I had noticed the existence of the mailing list, but only after I sent off the pull request, so his response did not surprise me.

Moral of the story: if you want to contribute to an open source project, look for a mailing list before you do anything else. And if there is one, use it.

23 Jun

Phone faux pas

This blog entry is directed at two individuals in particular. Neither of them is likely to read my blog, but those of you who do should take great care not to emulate either of them.

First, to the gentleman who was at the rear of the central section of coach number eight in the 18:32 Southern train from London Victoria to Southampton Central and Bognor Regis this evening, who got off at Three Bridges.

Your favourite TV programme as a child may have been Top Cat. Your favourite TV programme as a thirty-something adult may still be Top Cat. You may think it’s cool to have the Top Cat theme tune as the ringtone on your mobile phone even though you’re in your thirties or forties. But when said mobile phone starts blaring out said ringtone at full volume in a train full of tired commuters for several minutes, it gets extremely annoying. Please, choose something less grating.

Besides, what on earth were you smoking that you slept through a ringtone like that for nearly ten minutes? It must have been pretty potent.

Second, to whoever rang said gentleman no less than ten times round about 7pm this evening.

Unless your problem is genuinely important and genuinely urgent, ringing someone’s phone repeatedly when they don’t answer the first couple of times is rude. They may not be in a position to answer, and by calling them over and over and over and over again for several minutes, you are sending out a signal that you are a demanding, obnoxious type with no respect whatsoever for other people’s personal space. Just leave a message, and if they agree that it’s important, they’ll get back to you. Besides, you could be inflicting the theme tune to Top Cat on a train carriage full of tired commuters on their way home, because the person you’re calling is too non compos mentis to answer.

21 Jun

TortoiseHg as a github client on Windows

(Update: I’ve updated these instructions for Mercurial 1.6/TortoiseHg 1.1.)

I’m going to get really controversial here and say that I think Mercurial is better than git. My reasoning (as with the reasoning of everyone else who takes sides in this particular debate) is entirely subjective, so we won’t belabour the point here too much. Nevertheless, some of us do have a preference for one over the other, and many Subversion refugees like me who do most of their work in Windows tend to lean towards Mercurial.

But there’s no denying that github is fast becoming the Facebook of open source programming (albeit hopefully without the unethical bits, Farmville, and people tagging you in embarrassing photos for all and sundry to see), and if you want to strut your stuff as a developer, that’s the place to do it. Github is, of course, a hosting facility for git repositories, as one would expect of a site whose name says what it means and means what it says.

Fortunately, it is quite possible to use Mercurial as a client against github repositories via the hg-git extension, and you can pull and push from one to the other pretty much losslessly.

However, setting it all up on Windows is not entirely straightforward, and there doesn’t seem to be a decent guide to it anywhere on the Internet: most of the instructions that you read assume that you’re using either (a) Linux or a Mac, (b) the command line, or (c) both. You also have to figure it out from various places all over the web, and searches on Google and Stack Overflow proved to be surprisingly fruitless. Furthermore, the most comprehensive howto that I came across elsewhere contained several instructions that were just plain wrong.

So, after spending two solid evenings struggling against a myriad of error messages and cryptic dialog boxes, I finally managed to get it working, and for future reference (and anyone else who wants to know how), I’ve documented what I’ve found actually works for me as best I can.

1. Install TortoiseHg and hg-git.

Install TortoiseHg 1.1 or later. If you are using an earlier version, upgrade: these instructions may work if you don’t, but I can’t make any guarantees.

I downloaded hg-git by cloning the repository. You can get it from either github or Bitbucket. The advantage of cloning the repository is that you can upgrade to the latest version quickly and easily by hg pull then hg update, or use the graphical tools if you prefer. You can also easily switch between the bleeding edge version of the code and a stable release if you like.

hg clone http://bitbucket.org/durin42/hg-git c:\abc\mercurial\hg-git

I downloaded hg-git into the directory c:\abc\mercurial\hg-git. If you put it elsewhere in your filespace, alter these instructions to suit.

2. Update to the appropriate version of hg-git.

If you are using TortoiseHg 1.1, you will need to use hg-git 0.2.3. If you ignored my advice to upgrade, and are still using version 1.0, you will need to use hg-git 0.2.1. Don’t use version 0.2.2: it doesn’t work with either version of TortoiseHg.

The official hg-git documentation tells us that we also need to download and install Dulwich 0.4.0 or later. The latest version of hg-git requires Dulwich 0.6.0. In any case, Dulwich is included with TortoiseHg (version 0.6.0 with TortoiseHg 1.1; version 0.5.0 with TortoiseHg 1.0) so you don’t need to do anything else there. Open up the TortoiseHg repository explorer on your clone of hg-git, choose the “Tagged” radio button to show only tagged releases, and update to version 0.2.3:

image

3. Configure Mercurial to use hg-git and an appropriate SSH client.

To do this, you need to edit your mercurial.ini file. You can get to this simply by choosing “Global Settings” on the TortoiseHg context menu in Windows Explorer, and clicking “Edit file” to bring it up in Notepad. Add the following lines to your configuration file:

[extensions]
hggit = C:\abc\mercurial\hg-git\hggit

[ui]
ssh = "C:\Program Files\TortoiseHg\TortoisePlink.exe"

The [extensions] section loads hg-git into Mercurial; the ssh option in the [ui] section specifies an SSH command line client to use to communicate with github. TortoiseHg gives us TortoisePlink, which works fine for me.

4. Create a public key/private key pair.

There are some instructions on github on how to create a public key/private key pair. Unfortunately, these don’t tell you that key pairs come in two formats: OpenSSH (as used by git itself and github), and PuTTY (as used by Tortoise Everything).

A simpler approach is to download PuTTY (you can get it from here) and use PuTTYgen to generate your key pair:

PuTTYgen screenshot

Once you have generated your SSH key, copy and paste the “Public key for pasting into OpenSSH authorized_keys file” into github. Save your public key and private key to your hard disk somewhere.

5. Start Pageant

Pageant is a program that stores all your private keys in memory, where the SSH client that we configured Mercurial to use above can find them. It comes with both PuTTY and TortoiseHg. You can set it to load in your private key(s) when you log on to Windows by creating a new shortcut in the Startup folder of your Start menu with this command:

"C:\Program Files\TortoiseHg\Pageant.exe" "c:\abc\github.ppk"

Note that if you don’t start Pageant first and load in your private key, you will not be able to push to github.

6. Clone a repository and start pushing!

You should make sure that you get the format of your repository URL correct. It should be:

git+ssh://git@github.com/your-github-username/your-repo-name.git

The rest from there on is all plain sailing. All being well, you should now be able to pull from your github repository and push changes back up as if it were a Mercurial repository.
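If you want a quick sanity check on the URL before you paste it into TortoiseHg, something like the following Python sketch would do. The function name and the set of characters allowed in the user and repository names are my assumptions for the example, not anything that hg-git itself defines.

```python
import re

# Matches the URL form given above. The characters allowed in the user name
# and repository name are an assumption made for this sketch.
GITHUB_URL = re.compile(r"^git\+ssh://git@github\.com/[\w.-]+/[\w.-]+\.git$")


def looks_like_hggit_github_url(url):
    """Return True if the URL has the git+ssh form described above."""
    return GITHUB_URL.match(url) is not None


if __name__ == "__main__":
    url = "git+ssh://git@github.com/your-github-username/your-repo-name.git"
    print(looks_like_hggit_github_url(url))
```

It only checks the shape of the URL, not whether the repository exists or whether your key is accepted.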

Things to check if it goes wrong.

Now, all this is a bit of a fiddly process: there is plenty of room for error, and some of the error messages you are likely to get can be a little bit cryptic. However, most of the errors I ran into were due to me trying things that weren’t properly documented, and they all boiled down to a few things that you can check if you’ve followed the above instructions:

  • Are you using the correct version of hg-git? You can use versions later than 0.2.1, but they need a later version of Dulwich than the one that comes with TortoiseHg 1.0.
  • The “ssh” option in your mercurial.ini file should only specify the name of the executable, without command line options. Some articles tell you that you can fill in the path to your private key in this option. Personally, I couldn’t get this to work, so I just stuck with Pageant.
  • Is Pageant running?
  • Is your private key loaded into Pageant?
  • Do your public and private keys match?
  • Is your private key saved in PuTTY format? If you generated your key pair using git, as per the instructions on github, it will be saved in OpenSSH format instead, and Pageant can’t handle that. [1]
  • Have you specified the URL to your github repository correctly? The version I gave above works, while missing out various parts of the URL (e.g. using “github.com” instead of “git@github.com”) doesn’t.

[1] You can tell the difference between a PuTTY private key and an OpenSSH private key by opening them in Notepad. An OpenSSH private key will start off looking like this:

-----BEGIN RSA PRIVATE KEY-----
<transmission line noise>
-----END RSA PRIVATE KEY-----

whereas a PuTTY private key will look like this:

PuTTY-User-Key-File-2: ssh-rsa
Encryption: none
Comment: imported-openssh-key
Public-Lines: 6
<transmission line noise>
Private-Lines: 14
<transmission line noise>
Private-MAC:
<transmission line noise>
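The Notepad check in the footnote can also be sketched in a few lines of Python. The function name and the return values are made up for illustration, and it only looks at the first line of the file, exactly as you would by eye.

```python
def private_key_format(text):
    """Guess the format of a private key from the first line of the file."""
    lines = text.strip().splitlines()
    first_line = lines[0] if lines else ""
    if first_line.startswith("PuTTY-User-Key-File-"):
        # PuTTY format, which is what Pageant expects.
        return "putty"
    if first_line.startswith("-----BEGIN") and first_line.endswith("PRIVATE KEY-----"):
        # OpenSSH format, as generated by following the github instructions.
        return "openssh"
    return "unknown"


if __name__ == "__main__":
    print(private_key_format("PuTTY-User-Key-File-2: ssh-rsa\nEncryption: none\n"))
    print(private_key_format("-----BEGIN RSA PRIVATE KEY-----\n"))
```

To use it on a real key, read the file contents into a string first and pass that in. If it comes back as "openssh", the key is in the format that Pageant can’t handle.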

17 Jun

Someone must be wrong about solar weather

The Daily Telegraph reliably informs us that we should expect the mother of all solar storms in 2013, which will unleash power cuts, the apocalypse, and Visual SourceSafe on us all.

New Scientist, meanwhile, reliably informs us that not only is the sun unusually quiet at the moment, with its current solar minimum carrying on much longer than normal, but that it could possibly remain that way for much of the rest of the 21st century.

One of them must be wrong.

11 Jun

The meaning of football

It’s World Cup time again. For the next few weeks, a certain sport will be celebrated, broadcast, and grossly over-hyped worldwide 24 hours a day by all and sundry.

It is to this particular game that the word “football” refers. Not, as they seem to think on the other side of the Atlantic, to some pretender to the name.

This one should be a no-brainer. Since the word “football” is a combination of the words “foot” and “ball,” logically it should be used to describe the game that involves the greatest amount of direct interaction between foot and ball. Running around with a vaguely haggis-shaped object under your arm just doesn’t quite cut it somehow, unless you reject the idea that words should say what they mean and mean what they say.

Perhaps in the interests of semantic integrity, we should rename American football to something like “arm haggis.” It’s more of a mouthful, and might confuse people into thinking it’s Scottish, but even so it’s a more accurate description of their particular sport than “football.”

10 May

Introductory videos on IOC containers

Dependency injection is one of those concepts in computer programming that looks weird and complex when you don’t understand it, but once you do, you wonder how you managed without it. A bit like distributed source control. Unfortunately, if you don’t understand it properly and implement it incorrectly, you can lose its benefits and end up wondering, “What was the point?”

For developers new to the concept, David Hayden has a series of video tutorials that provide what’s probably the best introduction to it that I’ve come across. He uses Microsoft’s Unity Container for most of his examples, but the concepts can easily be adapted for other libraries such as Ninject, Autofac, or Castle Windsor. He explains in some detail how to use them properly within both ASP.NET and WebForms, and demonstrates what kind of things they can achieve: