
June 2007

30
Jun

Another day, another OS reinstallation

After three weeks or so of running Ubuntu on my laptop at home as my primary OS, I reinstalled Windows on it yesterday evening. No doubt this is a move that will meet with howls of derision from everyone who expects me to be an über-geek, and the bearded sandal-wearing idealists who think that Microsoft should be nuked, but the fact is that I just don’t think much of Linux on the desktop. It’s more secure and more stable than Windows, and less prone to spyware and all that, and it has some great geeky features (I just love that 3D Sierpinski screensaver) but it has one big problem: visual aesthetics.

Besides Ubuntu’s depressing brown colour scheme, which makes it look like a plate of mince, the biggest problem is fonts. Ubuntu’s default out of the box fonts are ghastly, dumpy, squat monstrosities, and the rendering engines in both Gnome and KDE are pathetic, giving uneven stroke widths and nasty colour fringing even on the Windows core fonts, no matter what settings I used for the sub-pixel rendering. I think they must be using a similar approach to Apple, in attempting to preserve font shapes over and above on-screen crispness and readability, though Apple does it a lot better. Or maybe I’m just spoilt: once you have seen ClearType in action on Windows, the Linux sub-pixel font rendering seems pretty lame by comparison.

Another problem with Linux is software. I really missed Windows Live Writer, especially having used the new beta 2 version with its much improved WordPress support, and while I guess I could have tried installing it using Wine, I decided in the light of the fonts issue not to bother. There are a couple of equivalents available for Linux, such as Drivel, but they are nowhere near as slick as Windows Live Writer. I also much prefer Corel Draw (and Paint.NET for the simpler stuff) to the GIMP, Microsoft Office to OpenOffice, and of course I was missing out on Visual Studio.

This isn’t to say that I won’t be using Linux at all, of course. I have been running Ubuntu servers on VMware both at work and at home and I will almost certainly continue to do so. I don’t know if I’ll try a desktop installation of Ubuntu on VMware though: when I’ve done this in the past it has tended to get neglected somewhat, though it does occasionally come in useful for things such as testing cross-browser compatibility. However, I don’t think I’ll be making much use of it as a primary OS in the immediate future.

29
Jun

Case sensitivity is evil, but we still have to live with it

The issue of case sensitivity in programming languages is one of those religious wars that we developers get into every now and then. Some people, the kind who were brought up on Unix, C, and all that, swear by it. Others think it is the worst travesty to befall computing since software patents. The argument is especially prevalent in the .NET world, where you have two main languages: C#, which is case sensitive, and VB.NET, which isn’t.

I am not particularly keen on case sensitivity myself, though most of my work is done in case sensitive languages. It could be argued that it trains the mind to pay close attention to small details, but there are already so many small details to take careful note of in code that programming does that anyway, and case sensitivity just adds to the burden. It doesn’t necessarily make your code look any tidier either. In fact, it can cause more problems than it solves. Come to that, will someone please enlighten me as to exactly what problems case sensitivity is supposed to solve in the first place?

Take for instance this C# code snippet:

class Foo {
    private int bar = 0;

    public int Bar {
        get { return bar; }
    }
}

Now this is all well and good — a private field encapsulated as a read only property. This is the kind of thing that you encounter daily when you are working with C# code. When I was more inexperienced I used to use this convention all the time: camelCase for the fields and PascalCase for the properties. However, one simple typo can spell disaster:

class Foo {
    private int bar = 0;

    public int Bar {
        get { return Bar; }
    }
}

Spot the difference? The getter for Bar, rather than returning the contents of the field, will now call itself, giving unwanted recursion and a stack overflow. And do you think IntelliSense makes it any better? In your dreams. I had this problem bite me several times simply because IntelliSense sneakily changed the case of bar to Bar and I didn’t notice, before I wised up and started prepending the private fields with an underscore:

class Foo {
    private int _bar = 0;

    public int Bar {
        get { return _bar; }
    }
}
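The self-calling getter is not peculiar to C#: any case sensitive language with properties can be tripped up the same way. Here is a minimal sketch in Python, with names that mirror the C# example rather than Python's own naming conventions. The classes `Foo` and `BrokenFoo` and the helper `getter_recurses` are made up for illustration. Python raises a catchable `RecursionError` where the C# version overflows the stack, so the difference can be checked safely:

```python
class Foo:
    """A plain field `bar` exposed through a read-only property `Bar`."""

    def __init__(self):
        self.bar = 0

    @property
    def Bar(self):
        return self.bar  # correct: reads the lower-case field


class BrokenFoo:
    """Identical except for one capital letter in the getter."""

    def __init__(self):
        self.bar = 0

    @property
    def Bar(self):
        return self.Bar  # typo: the property reads itself and recurses


def getter_recurses(obj):
    """Return True if reading obj.Bar ends in runaway recursion."""
    try:
        obj.Bar
    except RecursionError:
        return True
    return False
```

Calling `getter_recurses(Foo())` and `getter_recurses(BrokenFoo())` shows that the two classes behave completely differently, even though they differ by a single capital letter.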

Now this is a simple example, but there are other more complex ones that I could give. And because most people don’t tend to notice the exact case of identifiers, it is all too easy to end up getting the wrong one — or even, particularly if you are maintaining someone else’s code, to fail to notice that there is a wrong one to get in the first place.

Naming conventions can help with this. Both .NET and Java have specific standards, but even then there is still scope for ambiguity. You are supposed to write identifiers in PascalCase or camelCase depending on their visibility and purpose, with the first letter of each word in the identifier (after the first, in camelCase) capitalised. However, in some cases it isn’t that clear whether you should consider some identifiers as one word or two. Do you write Filename or FileName, for instance?

It would be easy if all languages were case insensitive, like VB, Delphi or Fortran. Unfortunately, even these languages often have to communicate with other languages and platforms. In .NET in particular, you may have to slap a [CLSCompliant] attribute on your assembly one day so that it can interoperate with someone else’s code in a language on the other side of the Great Divide. When that happens, if you have a namespace called ee.cummings in one place and EE.Cummings in another, both VB.NET and C# will choke on it. Alternatively, you may need to port your code from a language that is case insensitive to one that is case sensitive, or vice versa.

The problem with case sensitivity is that it is so pervasive. All the C-like languages are case sensitive, at least in part¹, as are Python and Ruby. That accounts for nine of the current top ten programming languages. Some important cross-language protocols such as XML and SOAP are case sensitive. If you expose some methods as a web service, they are case sensitive. Some bits of URLs may or may not be case sensitive depending on the underlying operating system and the web programmer’s predilections. And then there are those times when you can’t quite remember off the top of your head.

To avoid problems, I operate with two basic guidelines, regardless of what language or platform I am using.

1. When choosing new identifier names, assume that the language is case insensitive. Don’t choose names for your identifiers that vary by case alone. This means that if the language is case insensitive, or your code has to interface with or be ported to such a language, you will avoid name collisions.

2. When using existing identifiers, assume that the language is case sensitive. Be consistent in the case you use when referencing identifiers. This means that if the language turns out to be case sensitive after all, your variables will be found correctly. IntelliSense and other similar technologies can help here, provided you have stuck to (1).
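Guideline (1) is easy to check mechanically. As a rough sketch (the function name `case_collisions` is made up, and how you collect the list of identifiers from your code is left open), you can group names by their case-folded form and report any group with more than one spelling:

```python
from collections import defaultdict


def case_collisions(identifiers):
    """Map each case-folded name to its spellings, where there is more than one."""
    groups = defaultdict(set)
    for name in identifiers:
        groups[name.casefold()].add(name)
    # Keep only the groups in which at least two distinct spellings were seen.
    return {key: sorted(names) for key, names in groups.items() if len(names) > 1}
```

Fed the names from this post, for instance `["bar", "Bar", "_bar", "ee.cummings", "EE.Cummings"]`, it flags `bar`/`Bar` and the two namespaces but leaves `_bar` alone, which is exactly why the underscore prefix sidesteps the problem.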

Love it or loathe it, case sensitivity is here to stay. We still have to live with it, so we might as well just get used to it. Still, it would be more bearable if programming languages and operating systems enforced both these rules, rather than just one or the other. This happens in the .NET framework when you mark your assembly with the [CLSCompliant] attribute, but this only extends to publicly visible members, so you still have scope for the problem outlined in the examples above. Even better would be for you to be able to choose for yourself which case sensitivity rules you wanted to use as a compiler or interpreter option.

¹ PHP variables are case sensitive; functions and constants are case insensitive. Some implementations of JavaScript are stricter about case sensitivity than others.

20
Jun

Typing perfection?

I have given up on Dvorak once and for all. It does make for much more disciplined typing, but I found that just as I was getting up to speed on it, I was beginning to experience some discomfort in my right hand and arm. There are some nasty artefacts in Dvorak, perhaps the worst of which is the position of the L key, in the top right hand corner of the keyboard. Having to stretch your pinky as much as that gets really sore after a while. Since the main reason I started looking into alternative keyboard layouts in the first place was that I have been experiencing some general fatigue and mild discomfort on and off in my right arm for the past two years, I thought that it would be prudent to take note. I was also finding it very uncomfortable to type URLs on my Kinesis keyboard, where said pinky has to do the Riverdance to handle the forward slash and the shift key for the colon, then move out of the way to let your right middle finger handle the “www”.

At the moment I am back on QWERTY at work and hating it. However, there is a very promising new kid on the block as far as keyboard layouts are concerned: Colemak. Unlike Dvorak, it takes QWERTY as its starting point and only shuffles some of the keys around, leaving almost all the punctuation and symbols and some of the less frequently used letters in pretty much the same place. This makes for a much more comfortable typing experience that is also much easier to learn, and it has none of the nasty artefacts of Dvorak either.

The Colemak layout

After only two or three evenings, I am already more comfortable with it than I was after three weeks of going completely cold turkey on Dvorak on my first attempt back in July 2000. It also seems that switching back and forth between QWERTY and Colemak will be much easier than switching back and forth between QWERTY and Dvorak. You can get full instructions on how to use it, and a Windows installer, from the Colemak website. Hopefully it won’t be too long before I am good enough at it to be able to use it at work too.

Update: I didn’t switch to Colemak in the end. (See discussion.)

08
Jun

Another crack at Dvorak

Over the past week or so I’ve been having another go at typing Dvorak. I’ve been getting rather frustrated recently at my long-standing indiscipline and uncoordinated habits on a QWERTY layout, particularly in my right hand, and I’ve been anxious to ditch it in favour of something a bit more sane and logical for quite a while now. The problem, of course, is the amount of time and effort it takes to make the switch, and the dire impact that it has on your productivity during the first couple of weeks.

This time I think it is within my grasp, however. This is my third attempt to get Dvoraking, and I can now easily manage over 30 words per minute on my laptop, where I have rearranged the key caps. I am still a bit slower on my Kinesis keyboard which does not have the keys relabelled, so I am having to learn to touch type properly on it, and that makes it a bit more of an effort. Nevertheless, it is now at the stage at which the impact on my productivity is minor enough for it to be tolerated at work, and once you reach that stage, it is plain sailing all the way.

Dvorak keyboard layout

I am well impressed with just how much more comfortable it is than QWERTY, and also that it seems to encourage and even enforce much more disciplined keyboard habits. I find that the fingers on my right hand tend to gravitate naturally to the home keys for their resting position now, for instance, whereas on QWERTY they tended to gravitate to anything but the home keys. I am also finding it much easier to type without having my palms resting on the wrist rests at the front of the keyboard all the time.

One thing I have found however is that if you frequently have to remote desktop into other computers and servers, a reprogrammable keyboard is absolutely essential. Terminal Services uses the keyboard layout programmed into the server rather than your local machine, so unless you are prepared to switch back and forth all the time between QWERTY and Dvorak (and everything that I have read on the subject is unanimous that you shouldn’t while you’re learning), relying on the ability to change the keyboard settings in Windows simply won’t cut the mustard. The keyboard switcher in Windows can be pretty temperamental at the best of times, and nice as it would be to switch all the servers I access to Dvorak, there are other people besides me who have to log in as an administrator, and if they end up typing gibberish or are unable to even log in thanks to the Dvorak layout, they are likely to get rather annoyed.

It seems that there are quite a few alpha geeks and bloggers who type Dvorak. Well-known Dvorakists include BitTorrent inventor Bram Cohen and WordPress head honcho Matt Mullenweg. For a light-hearted and entertaining look at the benefits of the Dvorak keyboard layout, check out DVZine.org, a Dvorak advocacy site in web comic format. It’s a seriously cool intro to it that is well worth a read, even if you don’t plan to switch.

05
Jun

Bad Behavior does not like Windows Live Writer

There is a bug in the newly released Windows Live Writer beta 2 that causes it to choke if you are also using Bad Behavior on your blog.

I first discovered this when I installed it yesterday to check it out. When it refused to update the theme from my blog, I wondered at first if there was a problem with my custom theme, but then half an hour later I looked at my home page again to find that all the comments on my blog had closed. A quick investigation showed that Bad Behavior had been choking on the requests from Windows Live Writer and logging the failed attempts, which were then being picked up by my new plugin, Three Strikes and You’re Out.

It turns out that the problem stems from the fact that Bad Behavior expects Internet Explorer to include an “Accept” header with every HTTP request, and if it gets something that claims to be Internet Explorer yet doesn’t match up to its expectation, it throws an error.

Fortunately, it is not too difficult to fix this, though you do need to tweak the code base of Bad Behavior. Open the file msie.php in the bad-behavior subdirectory of your Bad Behavior plugin and find the lines which say:

if (!array_key_exists('Accept', $package['headers_mixed'])) {
    return "17566707";
}

Change this to read as follows:

if (strpos($package['headers_mixed']['User-Agent'], "Windows Live Writer")
    === FALSE && !array_key_exists('Accept', $package['headers_mixed'])) {
    return "17566707";
}

You should then be able to use Windows Live Writer on your blog once again, without losing the protection offered by the Bad Behavior plugin.
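To see at a glance which requests the patched check lets through, here is the same logic restated as a small Python sketch. This is an illustration only, not Bad Behavior's own code: the function name `msie_check`, the plain dictionary of headers and the convention of returning `None` for "no objection" are all assumptions.

```python
def msie_check(headers):
    """Return the error code if a request claiming to be IE looks fake, else None."""
    user_agent = headers.get("User-Agent", "")
    is_live_writer = "Windows Live Writer" in user_agent
    # Only complain about a missing Accept header when the client is
    # not Windows Live Writer, mirroring the patched PHP condition.
    if not is_live_writer and "Accept" not in headers:
        return "17566707"
    return None
```

A request with no “Accept” header is still rejected unless its user agent string mentions Windows Live Writer, so the exemption is as narrow as it can be. Bear in mind that anyone can put that string in their user agent, spammers included.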

03
Jun

Comment Timeout 2.0 and friends

The first alpha versions of my new WordPress comment plugins are now available for download.

Comment Timeout 2.0 closes comments on posts on your blog a certain time after they are posted. It has been rebuilt from the ground up to incorporate some new features:

  • You can now override the default settings to allow certain posts to have the discussion kept open for a shorter or longer time, or even indefinitely.
  • You can define a “popularity level” above which the discussion can be kept open for an even longer period of time if you so desire.
  • You can have comments on older posts sent to the moderation queue instead of closing the discussion altogether.
  • The comment form now indicates when the discussion for a particular post will close.
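The rules in this list boil down to two small decisions: when does a post's discussion close, and what happens to a comment that arrives after that. The sketch below is in Python for illustration and is not the plugin's code. Every name and default in it is made up (including the 14 day timeout), and measuring popularity by comment count is an assumption.

```python
from datetime import datetime, timedelta


def closing_time(posted, default_days=14, override_days=None, keep_open=False,
                 comment_count=0, popular_threshold=None, popular_extra_days=0):
    """Return when the discussion closes, or None if it stays open indefinitely."""
    if keep_open:
        return None
    # A per-post override replaces the blog-wide default.
    days = default_days if override_days is None else override_days
    # Posts above the popularity level earn some extra time.
    if popular_threshold is not None and comment_count > popular_threshold:
        days += popular_extra_days
    return posted + timedelta(days=days)


def comment_action(now, closes_at, moderate_instead=False):
    """Decide what to do with a comment submitted at `now`."""
    if closes_at is None or now < closes_at:
        return "accept"
    return "moderate" if moderate_instead else "reject"
```

The value returned by `closing_time` is also what a comment form would need in order to tell visitors when the discussion will close.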

Some features were added to version 1.3 but have now been spun off into two separate plugins:

Three Strikes and You’re Out examines your Bad Behavior logs and your spam queue and closes comments across the board on your blog when you are visited from any IP addresses that have been repeatedly misbehaving (the default settings are three times in a week). It also defines a couple of hooks and adds a new logging table to the database, so other plugins can register naughty events (e.g. failed captcha tests) or override the counting mechanism (e.g. to implement whitelists or blacklists).
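The counting rule itself is simple. Here is a minimal sketch in Python, assuming the logged events are available as (IP address, timestamp) pairs. The function name and the event format are made up for illustration; only the defaults of three strikes in a week come from the description above.

```python
from datetime import datetime, timedelta


def is_struck_out(events, ip, now, max_strikes=3, window=timedelta(days=7)):
    """Return True if `ip` has at least `max_strikes` events within `window` before `now`."""
    recent = [ts for addr, ts in events if addr == ip and now - window <= ts <= now]
    return len(recent) >= max_strikes
```

Events older than the window simply drop out of the count, so an address that stops misbehaving gets its commenting rights back without anyone having to intervene.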

Link Limits rejects comments which contain BBCode or more than two normal hyperlinks. I’ve found that this blocks approximately 80% of spam, yet genuine comments exceeding these limits are almost non-existent. It informs your commenters that this restriction is in place. It also logs any violations to Three Strikes and You’re Out, but it works perfectly well if you do not have Three Strikes and You’re Out installed.

I’ve marked them all as “alpha 1” status, which means use at your own risk, though I am dogfooding them on my own blog. If you have any problems with them, I’ve written a post on how to report problems with WordPress plugins; please read it before giving me a shout, though I do welcome feedback and suggestions of course.
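A test of this kind can be sketched in a few lines of Python. The details here are assumptions rather than the plugin's actual rules: “normal hyperlinks” are taken to mean HTML anchor tags with an `href`, BBCode is detected only by its `[url]` and `[link]` tags, and the function name is made up.

```python
import re

# Assumption: a "normal hyperlink" is an HTML anchor tag carrying an href.
ANCHOR = re.compile(r"<a\s[^>]*\bhref\s*=", re.IGNORECASE)
# Assumption: BBCode shows up as [url], [url=...], [link] or their closing tags.
BBCODE = re.compile(r"\[/?(?:url|link)\b[^\]]*\]", re.IGNORECASE)


def violates_link_limits(comment, max_links=2):
    """Return True if the comment contains BBCode links or too many hyperlinks."""
    if BBCODE.search(comment):
        return True
    return len(ANCHOR.findall(comment)) > max_links
```

A comment with one or two ordinary links passes. A third link, or a single `[url=...]` tag, gets it rejected.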

Starting with these plugins, I have changed the licensing terms. Whereas the old versions were GPL, these ones are available under the MIT/X11 licence. It is GPL compatible but doesn’t have the “copyleft” element. This means that if you wanted to, you could adapt them for use with another, non-GPL, CMS or blog program.