February 2010

04
Feb

Catching Exception is almost never justified and almost always harmful

I was doing an ad-hoc review of another developer’s code not long ago when I saw something like this:

try {
    return bool.Parse(GetSomething());
}
catch (Exception) {
    return false;
}

I gently pointed out to him that this is a bad practice. Apart from the fact that you can use bool.TryParse() instead of bool.Parse(), your GetSomething() method may be throwing exceptions indicating a rather more serious problem, such as your database being down.
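As a rough illustration of the TryParse alternative, here is a minimal sketch. The FlagReader class and its ParseFlag method are hypothetical names of my own, and the getSomething delegate stands in for the GetSomething() call above, whose real behaviour I am only assuming.

```csharp
using System;

public static class FlagReader
{
    // Hypothetical wrapper: getSomething stands in for GetSomething() above,
    // which we assume may throw if, say, the database is down.
    public static bool ParseFlag(Func<string> getSomething)
    {
        // Nothing is caught here, so any exception from getSomething()
        // propagates to the caller instead of being turned into "false".
        string raw = getSomething();

        // bool.TryParse reports malformed or null input by returning false
        // rather than by throwing.
        bool value;
        return bool.TryParse(raw, out value) && value;
    }
}
```

Malformed input still yields false, as in the original snippet, but a failure inside the lookup is no longer silently mistaken for one.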

Catching Exception is one of my pet peeves, but sadly it’s far too common, even among smart developers who I’d have expected to know better, and it crops up in commercial products and open source projects alike. Part of the problem is the code samples in the MSDN documentation itself, which are littered with completely unnecessary try ... catch (Exception) blocks that people copy and paste without thinking. But it’s also a quick and dirty hack: it’s easier to simply catch Exception and cross your fingers than to look up the documentation to find out exactly what you should be catching.

But this is reckless and dangerous. Catching exceptions inappropriately can lead to some very serious bugs in your code — serious, because you are deliberately ignoring them while they wreak havoc with your data. In one instance, I was asked to troubleshoot an application where a database upgrade had been botched and nobody had noticed for several days until the users started complaining that their changes weren’t being saved. You may also be ignoring misconfiguration, missing assemblies, external services being offline, and so on. And even if the effects aren’t serious, the bugs can still be particularly difficult to track down, as your logs will likely contain misleading error reports, if indeed they contain any error reports at all.

Catching general exception types without re-throwing them is almost never justified, and almost always harmful.

The correct approach to exceptions is to allow them to bubble up to the topmost level of your code, and handle them there by logging them and presenting an appropriate error message to the user. For ASP.NET applications, this is the Application_Error event handler in your global.asax file, or perhaps an error logging framework such as ELMAH. For console applications, it is your Main method. For separate threads, it is the topmost method of the thread. And so on.
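For the console application case, a top-level handler might look something like the sketch below. The TopLevelHandler class, the Run method and the wording of the user message are all illustrative assumptions of mine, and a real application would probably write to its logging framework rather than to a plain TextWriter.

```csharp
using System;
using System.IO;

public static class TopLevelHandler
{
    // Hypothetical helper: runs the whole application body inside the single
    // general exception handler and returns a process exit code.
    public static int Run(Action body, TextWriter errorLog)
    {
        try
        {
            body();
            return 0;
        }
        catch (Exception ex)
        {
            // Log the full details, including the stack trace, for developers.
            errorLog.WriteLine(ex);
            // Then give the user a plain message instead of a raw crash.
            errorLog.WriteLine("Sorry, something went wrong. The error has been logged.");
            return 1;
        }
    }
}
```

Main would then do little more than return TopLevelHandler.Run(() => DoWork(args), Console.Error), where DoWork is whatever the application actually does. This is the one place where a general catch is reasonable, because nothing above it could handle the error any better.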

Well-written code has very few try ... catch blocks. The most common case where you would have a general exception handler is when you need to roll back a transaction or otherwise leave your application in a consistent state when you re-throw:

try {
    BeginTransaction();
    DoStuff();
    Commit();
}
catch {
    RollBack();
    throw;
}

Aside: when you re-throw the exception, always use throw; here (which preserves the stack trace), not throw ex; (which doesn’t).
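To see the difference between throw; and throw ex; for yourself, something like the following sketch should do. All the names here are made up for the demonstration, and the NoInlining attributes are only there so that the JIT compiler keeps each method as a separate stack frame.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class RethrowDemo
{
    // The method where the exception really originates.
    [MethodImpl(MethodImplOptions.NoInlining)]
    static void OriginalThrower()
    {
        throw new InvalidOperationException("boom");
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    static void RethrowPreserving()
    {
        try { OriginalThrower(); }
        catch (InvalidOperationException) { throw; }       // keeps the original trace
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    static void RethrowResetting()
    {
        try { OriginalThrower(); }
        catch (InvalidOperationException ex) { throw ex; } // trace restarts here
    }

    // Returns the stack trace text of the exception thrown by the action.
    static string TraceOf(Action action)
    {
        try { action(); return ""; }
        catch (InvalidOperationException e) { return e.StackTrace ?? ""; }
    }

    // Does the trace still mention the method that originally threw?
    public static bool PreservingKeepsOrigin()
    {
        return TraceOf(RethrowPreserving).Contains("OriginalThrower");
    }

    public static bool ResettingKeepsOrigin()
    {
        return TraceOf(RethrowResetting).Contains("OriginalThrower");
    }
}
```

PreservingKeepsOrigin() should report true and ResettingKeepsOrigin() false: after throw ex; the trace begins at the re-throwing method, so the frame that would tell you where the problem actually started is gone.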

Apart from that, you should only catch specific exception types that you are both able and willing to handle meaningfully. Certainly, catching Exception should be treated as the nuclear option — and if there really is no alternative, you should always log the exceptions and rigorously justify your decision both in comments and in a code review. And next time you are tempted to write catch (Exception), ask yourself this question:

What would this code do if the exception were due to a botched deployment, an out of memory error, or a misconfiguration?
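By way of contrast, here is a minimal sketch of catching one specific exception type that the code can handle meaningfully. The OptionalSettings class and the idea of an optional settings file are assumptions invented for the example.

```csharp
using System;
using System.IO;

public static class OptionalSettings
{
    // Hypothetical helper: reads an optional settings file and falls back to
    // a default value if the file does not exist.
    public static string ReadOrDefault(string path, string fallback)
    {
        try
        {
            return File.ReadAllText(path);
        }
        catch (FileNotFoundException)
        {
            // The one failure we are able and willing to handle: the file is
            // optional, so its absence is not an error.
            return fallback;
        }
        // Everything else (permission problems, other I/O errors, bad
        // arguments, out of memory) is deliberately not caught and bubbles
        // up to the top-level handler.
    }
}
```

Checking File.Exists first would not remove the need for the catch, since the file can disappear between the check and the read. The point is that a missing file gets the fallback, while a botched deployment or misconfiguration still surfaces as an error.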

02
Feb

Are deletionists harming Wikipedia?

There’s a discussion over on the Colemak forums at the moment about the Wikipedia problem. It seems that, not content with having the article deleted on the grounds of non-notability a while ago, some Wikipedians are trying to eradicate every last mention of the layout from anywhere on the site. The deletion decision had eventually ended up as a redirect to a section on the Keyboard layout article, but it seems that even that’s been removed now, by a particularly argumentative individual who is rigidly and inflexibly applying his interpretation of the Reliable Sources policy.

Now as a satisfied Colemak typist I may be somewhat biased on this matter, but this one should be obvious. Colemak may be a pretty niche subject, but it has been covered a couple of times in the media—not a lot, but usually sufficient to at least get a “no consensus” decision in an AfD debate, which automatically defaults to “keep.” On top of that, it is included in X11 and every Linux distribution going. It’s one of only about half a dozen options for keyboard layout variant displayed on the installation screens of Ubuntu. It’s right in your face, not tucked away in some obscure and dangerous config file. Everyone who installs Ubuntu will be aware of it. Some of them will want to find out more about it. And they will expect Wikipedia to say something about it. But it won’t.

Of course, if it were just Colemak that were affected, I’m sure you could just dismiss this as a fanboy rant on my part, but this actually illustrates a much wider problem. With over three million articles, on everything from minor league ice hockey players to fictional foods in Babylon 5, Wikipedia is now the first place people turn to for information on anything obscure and only marginally notable. Wikipedia’s end users expect it to be an indiscriminate collection of information. Yet an indiscriminate collection of information is one of the things that Wikipedians are adamant that Wikipedia is not.

This is like being told that a problem in Sage or QuickBooks that is causing your tax return to be filled out with gibberish is not a bug, but a feature.

The problem is that there is a massive disconnect between Wikipedia’s users—casual visitors who often don’t even bother to create an account—and its overlords—the regular, active Wikipedians with edit counts in the thousands or even tens of thousands and an encyclopaedic knowledge and understanding of its policies. It is at its most striking in the whole inclusionist versus deletionist debate. And the deletionists are alienating a lot of would-be Wikipedians.

It turns out that this is one of the biggest criticisms levelled at Wikipedia by occasional editors. People come onto the site knowing nothing of Wikipedia’s policies, but plenty about some (possibly very niche) subject. They make half a dozen or so edits, then return a week later to find that their article has been deleted with no apparent explanation. Or perhaps it will be flagged with a deletion debate, crammed full of arcane and cabalistic abbreviations such as WP:NFT, WP:NOTE, WP:V, WP:WAX, WP:SOAP, WP:IAR, and so on, all pointing to Wikipedia’s byzantine and convoluted policies, guidelines and procedures. What kind of impression does this leave on the casual editor? That Wikipedia is a hideout for a bunch of antisocial, bureaucratic teenage control freaks, a kind of online equivalent to the kids on the beach who kick the sandcastle you’ve just spent three hours building into your face. And since first impressions count the most, they will go off, never contribute anything else, and rant on blogs and forums about how insular and out of touch with Real Life these Wikipedians are.

Why is this harming Wikipedia? Because these are the people who contribute the overwhelming majority of substantive, meaningful content to the site.

This study by Aaron Swartz will be particularly enlightening to anyone who doubts this claim. His research on a data dump of Wikipedia indicated that most contributions of actual substantive content are made by new and casual users, many of whom never even create an account and most of whom only make a handful of edits to the site. Regular Wikipedians, on the other hand, tend to spend most of their time tidying things up—moving text around, correcting spelling mistakes, wikifying things—and deleting stuff.

I’ve sometimes looked at these deletion debates and wondered how many of the people voting for deletion with reference to obscure areas of Wiki policy even begin to understand the subject matter of the article under discussion itself. Some of the arguments for deletion of Colemak are laughable for starters. They’d have us believe that nobody uses it (a brief glance at the activity on the forums and the Facebook group and even the AfD debate itself will quickly dispel this notion); that X11 is an anarchic free-for-all where you could submit a patch containing a rootkit backdoor and it would be accepted; and that the only way to enable Colemak in Ubuntu is to edit some obscure and dangerous config file where it’s buried in a list of gazillions of options and a slight typo will make your computer unbootable.

Certainly, searches for reliable sources are usually cursory: no hits on Google News, no hits on Google Scholar, so delete. Blogs are automatically not considered reliable sources, even if they’re written by experts in the industry such as Tim Bray, Simon Willison or Jeff Atwood. In fact, Jeff Atwood’s Wikipedia entry also fell foul of the deletionists a year ago, when Stack Overflow was in public beta, which shows just how completely out of touch with reality they are. (Incidentally, web development is one area in particular where WP:RS is a very bad metric for notability, simply because it’s an industry where a lot of key activity happens at the grassroots level. The sources that web developers regard as reliable enough for practical purposes are generally high profile blogs like Jeff’s, while the academics writing papers on how to use lines of code per day as a productivity metric are frequently regarded as an irrelevance at best and harmful at worst.)

There’s also a lot of bluster and bullying that goes on when the deletionists crop up. Throwing acronyms around sends a signal to newbies that they’re not welcome. If you tweet about a deletion debate, you’re accused of canvassing and booed off. Anonymous accounts and new users are often regarded with suspicion as potential sock puppets. Most people find it hostile and intimidating, and perhaps even a bit childish, but the deletionists don’t care. They’re so obsessed with making Wikipedia what they think it should be that they’ve completely lost sight of the end users.