08
Mar

Command line instructions are not a good marketing strategy

Dear fellow Mercurial fans,

Please stop using the command line when you’re writing articles telling us how wonderful Mercurial is.

I don’t need to be convinced that it is superior to Subversion. I’ve been using it for about nine months alongside our central Subversion repository at work, as well as for my private projects at home, and there’s no doubt in my mind which is better by a long shot. Easy branching and merging, and local versioning for experimental development and refactoring, are killer features as far as I’m concerned. And ease of use is supposed to be its big selling point over git.

But other developers do need convincing, and if you’re apparently fanboying the command line, it doesn’t help. In fact, it’s downright embarrassing. Remember, you may be a Linux geek who writes code for fun at weekends, but most of them are nine to five Windows developers who switch out of code mode the minute they leave the office and don’t want to have to learn anything new unless it’s strictly necessary. To them, it looks elitist, arrogant, off-putting, and Luddite.

When I first heard about Mercurial and git about two years ago, neither of them had any form of graphical user interface to speak of. It was a case of hg this, hg that, git this, git that in a command shell versus TortoiseSVN’s repo-browser, show log and commit dialogs. You know, like, where you can actually see what you’re doing? Where you can frequently figure out what you need to do by experimentation and educated guesses rather than having to wade through a morass of man pages? Forget it, I thought. Come back to me in a year or two’s time when you have a decent graphical front end for it. In the meantime, I’m sticking with TortoiseSVN.

Heck, I’m the kind of developer who likes to try out new things. I like Linq, and MVC, and jQuery, and Python, and Colemak keyboards. I know Linux and I’m not afraid to use it. If I was put off by the impression that Mercurial was command-line only, what hope do you have of convincing the rank and file Windows developers who are scared of the command prompt?

Nowadays, of course, we have TortoiseHg, which gives it a decent, powerful and intuitive front end. In fact it was TortoiseHg that sold me on Mercurial in the first place, because it lets you see exactly what you’re doing when you’re branching and merging, as well as flattening out the learning curve dramatically. Just take a look at its repository explorer, for instance:

[Screenshot: the TortoiseHg repository explorer]

See? You even get a nice little graph showing you exactly where all your branches are. Context menus make it easy to figure out what to do next and actually do it. Oh, and it shows you the most recent changes first, rather than just vomiting everything out onto the screen and leaving you staring at changeset zero, like you get when you run hg log:

[Screenshot: the output of hg log in a command prompt]

To a seasoned developer, there are advantages to the command prompt. It’s easier to type into your blog, easier to copy and paste, and easier to script. But there is a time and a place for everything, and introductory tutorials for tools with perfectly good graphical front ends are not the time and place for a command prompt. Doing a screen capture, firing up Paint.net and cropping your image to the right size may be more of a faff, but in an introductory tutorial, merely typing hg push instead is either outright elitism or sheer laziness. Please, cut it out. Use TortoiseHg to introduce Mercurial, and keep the command line for more advanced tasks.

04
Feb

Catching Exception is almost never justified and almost always harmful

I was doing an ad-hoc review of another developer’s code not long ago when I saw something like this:

try {
    return bool.Parse(GetSomething());
}
catch (Exception) {
    return false;
}

I gently pointed out to him that this is a bad practice. Apart from the fact that you can use bool.TryParse() instead of bool.Parse(), your GetSomething() method may be throwing exceptions indicating a rather more serious problem, such as your database being down.
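As a minimal sketch of the bool.TryParse() alternative (FlagReader and ParseFlag are names I have made up for illustration), the parsing step can be written so that it never needs a catch block at all:

```csharp
using System;

public static class FlagReader
{
    // Hypothetical helper: treats null, empty or unparseable text as false.
    // Nothing is caught here, so an exception thrown while *fetching* the
    // text (a database being down, say) still reaches the caller.
    public static bool ParseFlag(string text)
    {
        bool value;
        return bool.TryParse(text, out value) && value;
    }
}
```

The original snippet would then become return FlagReader.ParseFlag(GetSomething()); and any failure inside GetSomething() would surface instead of silently turning into false.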

Catching Exception is one of my pet peeves, but sadly it’s far too common, even among smart developers that I’d have expected to know better, cropping up in commercial products and open source projects alike. Part of the problem is the code samples in the MSDN documentation itself, which are littered with completely unnecessary try ... catch (Exception) blocks, that people copy and paste without thinking about it. But it’s also a quick and dirty hack — it’s easier to simply catch Exception and cross your fingers than to look up the documentation to find out exactly what you should be catching.

But this is reckless and dangerous. Catching exceptions inappropriately can lead to some very serious bugs in your code — serious, because you are deliberately ignoring them while they wreak havoc with your data. In one instance, I was asked to troubleshoot an application where a database upgrade had been botched and nobody had noticed for several days until the users started complaining that their changes weren’t being saved. You may also be ignoring misconfiguration, missing assemblies, external services being offline, and so on. And even if the effects aren’t serious, the bugs can still be particularly difficult to track down, as your logs will likely contain misleading error reports, if indeed they contain any error reports at all.

Catching general exception types without re-throwing them is almost never justified, and almost always harmful.

The correct approach to exceptions is to allow them to bubble up to the topmost level of your code, and handle them there by logging them and presenting an appropriate error message to the user. For ASP.NET applications, this is the Application_Error event handler in your global.asax file, or perhaps an error logging framework such as ELMAH. For console applications, it is your Main method. For separate threads, it is the topmost method of the thread. And so on.
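For the console application case, a top-level handler might look something like the sketch below. TopLevel and Run are hypothetical names, and writing to a TextWriter stands in for whatever logging you actually use:

```csharp
using System;
using System.IO;

public static class TopLevel
{
    // Runs the application body and deals with anything that escapes it
    // in exactly one place. Returns a process exit code.
    public static int Run(Action body, TextWriter log)
    {
        try
        {
            body();
            return 0;
        }
        catch (Exception ex)
        {
            // This is the topmost level, so a general handler is appropriate:
            // log the full details (type, message, stack trace) and report failure.
            log.WriteLine("Unhandled error: " + ex);
            return 1;
        }
    }
}
```

Your Main method would then be little more than return TopLevel.Run(RunApplication, Console.Error); where RunApplication is a placeholder for the real entry point of your program.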

Well written code has very few try ... catch blocks. The most common case where you would have a general exception handler is when you need to roll back a transaction or otherwise leave your application in a consistent state when you re-throw:

try {
    BeginTransaction();
    DoStuff();
    Commit();
}
catch {
    RollBack();
    throw;
}

Aside: when you re-throw the exception, always use throw; here (which preserves the stack trace), not throw ex; (which doesn’t).
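If you want to see the difference for yourself, here is a small sketch (all names are made up for the demonstration) that captures the stack trace produced by each style of re-throw:

```csharp
using System;
using System.Runtime.CompilerServices;

public static class RethrowDemo
{
    // NoInlining keeps this method visible as its own stack frame.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void Fail()
    {
        throw new InvalidOperationException("original failure");
    }

    public static void RethrowPreserving()
    {
        try { Fail(); }
        catch (Exception) { throw; }       // keeps the frames recorded so far
    }

    public static void RethrowResetting()
    {
        try { Fail(); }
        catch (Exception ex) { throw ex; } // trace restarts at this line
    }

    // Runs the action and returns the stack trace of whatever it throws.
    public static string TraceOf(Action action)
    {
        try { action(); return ""; }
        catch (Exception ex) { return ex.StackTrace ?? ""; }
    }
}
```

The trace from RethrowDemo.TraceOf(RethrowDemo.RethrowPreserving) still mentions Fail, which is where the problem actually happened, while the trace from RethrowResetting starts at the catch block and has lost that frame.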

Apart from that, you should only catch specific exception types that you are both able and willing to handle meaningfully. Certainly, catching Exception should be treated as the nuclear option — and if there really is no alternative, you should always log the exceptions and rigorously justify your decision both in comments and in a code review. And next time you are tempted to write catch (Exception), ask yourself this question:
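To illustrate what catching a specific type looks like, here is a sketch for one case where I would be both able and willing to handle the failure: an optional file that may legitimately not exist. SettingsLoader and ReadOrDefault are hypothetical names.

```csharp
using System;
using System.IO;

public static class SettingsLoader
{
    // Returns the file's contents, or the fallback if the file is absent.
    public static string ReadOrDefault(string path, string fallback)
    {
        try
        {
            return File.ReadAllText(path);
        }
        catch (FileNotFoundException)
        {
            // Expected and recoverable: the file is optional.
            return fallback;
        }
        // Everything else (access denied, a missing directory, out of memory
        // and so on) is deliberately left to propagate to the caller.
    }
}
```

The point of the design is that only the one failure I have thought about is swallowed; a permissions problem or a botched deployment still shows up in the logs.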

What would this code do if the exception were due to a botched deployment, an out of memory error, or a misconfiguration?

02
Feb

Are deletionists harming Wikipedia?

There’s a discussion over on the Colemak forums at the moment about the Wikipedia problem. It seems that, not content with having the article deleted on the grounds of non-notability a while ago, some Wikipedians are trying to eradicate every last mention of the layout from anywhere on the site. The deletion decision had eventually ended up as a redirect to a section on the Keyboard layout article, but it seems that even that’s been removed now, by a particularly argumentative individual who is rigidly and inflexibly applying his interpretation of the Reliable Sources policy.

Now as a satisfied Colemak typist I may be somewhat biased on this matter, but this one should be obvious. Colemak may be a pretty niche subject, but it has been covered a couple of times in the media—not a lot, but usually sufficient to at least get a “no consensus” decision in an AfD debate, which automatically defaults to “keep.” On top of that, it is included in X11 and every Linux distribution going. It’s one of only about half a dozen options for keyboard layout variant displayed on the installation screens of Ubuntu. It’s right in your face, not tucked away in some obscure and dangerous config file. Everyone who installs Ubuntu will be aware of it. Some of them will want to find out more about it. And they will expect Wikipedia to say something about it. But it won’t.

Of course, if it were just Colemak that were affected, I’m sure you could just dismiss this as a fanboy rant on my part, but this actually illustrates a much wider problem. With over three million articles, on everything from minor league ice hockey players to fictional foods in Babylon 5, Wikipedia is now the first place people turn to for information on anything obscure and only marginally notable. Wikipedia’s end users expect it to be an indiscriminate collection of information. Yet an indiscriminate collection of information is one of the things that Wikipedians are adamant that Wikipedia is not.

This is like being told that a problem in Sage or QuickBooks that is causing your tax return to be filled out with gibberish is not a bug, but a feature.

The problem is that there is a massive disconnect between Wikipedia’s users—casual visitors who often don’t even bother to create an account—and its overlords—the regular, active Wikipedians with edit counts in the thousands or even tens of thousands and an encyclopaedic knowledge and understanding of its policies. It is at its most striking in the whole inclusionist versus deletionist debate. And the deletionists are alienating a lot of would-be Wikipedians.

It turns out that this is one of the biggest criticisms levelled at Wikipedia by occasional editors. People come onto the site knowing nothing of Wikipedia’s policies, but plenty about some (possibly very niche) subject. They make half a dozen or so edits, then return a week later to find that their article has been deleted with no apparent explanation. Or perhaps it will be flagged with a deletion debate, crammed full of arcane and cabalistic abbreviations such as WP:NFT, WP:NOTE, WP:V, WP:WAX, WP:SOAP, WP:IAR, and so on, all pointing to Wikipedia’s byzantine and convoluted policies, guidelines and procedures. What kind of impression does this leave the casual editor? That Wikipedia is a hideout for a bunch of antisocial, bureaucratic teenage control freaks: a kind of online equivalent to the kids on the beach who kick the sandcastle you’ve just spent three hours building into your face. And since first impressions count the most, they will go off, never contribute anything else, and rant on blogs and forums about how insular and out of touch with Real Life these Wikipedians are.

Why is this harming Wikipedia? Because these are the people who contribute the overwhelming majority of substantive, meaningful content to the site.

This study by Aaron Swartz will be particularly enlightening to anyone who doubts this claim. His research on a data dump of Wikipedia indicated that most contributions of actual substantive content are made by new and casual users, many of whom never even create an account and most of whom only make a handful of edits to the site. Regular Wikipedians, on the other hand, tend to spend most of their time tidying things up—moving text around, correcting spelling mistakes, wikifying things—and deleting stuff.

I’ve sometimes looked at these deletion debates and wondered how many of the people voting for deletion with reference to obscure areas of Wiki policy even begin to understand the subject matter of the article under discussion itself. Some of the arguments for deletion of Colemak are laughable for starters. They’d have us believe that nobody uses it (a brief glance at the activity on the forums and the Facebook group and even the AfD debate itself will quickly dispel this notion); that X11 is an anarchic free-for-all where you could submit a patch containing a rootkit backdoor and it would be accepted; and that the only way to enable Colemak in Ubuntu is to edit some obscure and dangerous config file where it’s buried in a list of gazillions of options and a slight typo will make your computer unbootable.

Certainly, searches for reliable sources are usually cursory: no hits on Google News, no hits on Google Scholar, so delete. Blogs are automatically not considered reliable sources, even if they’re written by experts in the industry such as Tim Bray, Simon Willison or Jeff Atwood. In fact, Jeff Atwood’s Wikipedia entry also fell foul of the deletionists a year ago, when Stack Overflow was in public beta, which shows just how completely out of touch with reality they are. (Incidentally, web development is one area in particular where WP:RS is a very bad metric for notability, simply because it’s an industry where a lot of key activity happens at the grassroots level. The sources that web developers regard as reliable enough for practical purposes are generally high profile blogs like Jeff’s, while the academics writing papers on how to use lines of code per day as a productivity metric are frequently regarded as an irrelevance at best and harmful at worst.)

There’s also a lot of bluster and bullying that goes on when the deletionists crop up. Throwing acronyms around sends a signal to newbies that they’re not welcome. If you Twitter about a deletion debate, you’re accused of canvassing and booed off. Anonymous accounts and new users are often regarded with suspicion as potential sock puppets. Most people find it hostile and intimidating, and perhaps even a bit childish, but the deletionists don’t care. They’re so obsessed with making Wikipedia what they think it should be that they’ve completely lost sight of the end users.

16
Dec

Can your database versioning tool do this?

I’ve been evaluating DbGhost recently. It’s one of those database lifecycle management tools that, at first glance, seems to be based around the whole schema comparison/data comparison approach.

Unfortunately, I distrust this approach intensely simply because it’s such a leaky abstraction. Nevertheless, people who have used DbGhost tend to wax lyrical about it, and some people even report that their DBAs like it. This means that either they’re missing something about it, or else I am.

Indeed, it seems that DbGhost does allow you to throw custom scripts into the mix somehow or other, for the cases that schema and data comparison just can’t handle.

So, rather than give any specific comments on DbGhost, or any other database lifecycle management solution, I shall propose a scenario that can be used for evaluation of any tool or approach to database lifecycle management.

This scenario is completely fictitious and not related to anything I’m working on, but it represents the kind of changes that you are likely to come across sooner or later in your application lifecycle. And in recent months I’ve had to perform several database refactorings much more complex than this.

Let’s say you have a database, with a Users table, which in Version 1.0 looks something like this:

Schema for v1.0

As you can see, it is not in first normal form, as you discover when one of your users phones up complaining that they can’t register six e-mail addresses. So in version 1.1, you extract the Email fields to a separate table:

Schema for v1.1

This can be done using a SQL query like this:

create table UserEmails(
    EmailID integer not null identity(1, 1) primary key clustered,
    UserID integer not null foreign key
        references Users(UserID)
        on update cascade on delete cascade,
    Email nvarchar(100)
)

insert into UserEmails(UserID, Email)
    select UserID, Email from Users
        where Email is not null and Email != ''

insert into UserEmails(UserID, Email)
    select UserID, Email2 from Users
        where Email2 is not null and Email2 != ''

insert into UserEmails(UserID, Email)
    select UserID, Email3 from Users
        where Email3 is not null and Email3 != ''

alter table Users
    drop column Email, column Email2, column Email3

Now one point about this refactoring is that it is impossible to complete it correctly using schema comparison and data comparison tools. This is true of any normalisation refactoring, or indeed, any refactoring where you are moving live, constantly changing data from one table to another. Another point is that these are fairly common scenarios. They are not some obscure academic concept only of interest to PhD students; it is inevitable that you’ll have to get your hands dirty and write some SQL at some stage in your application’s lifecycle if you want to hit the high notes with it. In fact, in my experience, approximately 20-30% of all database migrations that I write are beyond the scope of schema and data comparison.

So that brings you to version 1.1 of your product. Then you realise that your website-y fields are not only not in first normal form, they’re out of date. You still have a field for your users’ Pownce profiles! In case you’d forgotten, Pownce doesn’t exist any more. In fact, nobody used Pownce in the first place even though it had people like Robert Scoble fanboying it. And what about the other social networking sites that aren’t listed? Like, for instance, Flickr, Delicious, or Github? It seems that a bit more normalisation is in order:

Schema for v1.2

We’re now up to version 1.2 of our product. But it doesn’t stop there! In version 1.3, we do even more normalisation. Look at those IsAdministrator and IsStaff columns. We need to move them into a separate Roles table to give us more granular control over our website security:

Schema for v1.3

We now have three upgrades to our product, and in each upgrade, we have performed a database refactoring that to the best of my knowledge and understanding cannot be handled correctly by schema comparison and data comparison tools alone. These changes need to be scripted by hand; there are no two ways about it.

So here are my questions for DbGhost, or for any competing product or process for managing your database changes:

  • Does it give you a means to upgrade this database to version 1.3 from any previous version, be it 1.0, or 1.1, or 1.2?
  • In one step?
  • Despite the fact that none of the migrations are possible using schema comparison or data comparison tools?
  • And is the process idiot-proof, intuitive and well documented?

This kind of thing is very straightforward with a migration-based approach, such as that offered by Ruby on Rails. I haven’t yet figured out how to do it with a comparison based tool such as DbGhost, but if it can handle it nonetheless, I will be most impressed.
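To show why the migration-based approach copes with this so easily, here is a rough sketch of the core idea in C#. It is not Rails’s API or anybody else’s: Migration and Migrator are names I have invented, and the Apply delegates stand in for the hand-written SQL scripts for each version.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class Migration
{
    public Migration(int version, string description, Action apply)
    {
        Version = version;
        Description = description;
        Apply = apply;
    }

    public int Version { get; }        // 1 = upgrade to v1.1, 2 = to v1.2, ...
    public string Description { get; }
    public Action Apply { get; }       // stand-in for running a SQL script
}

public static class Migrator
{
    // Applies every migration newer than currentVersion, oldest first,
    // and returns the version the database ends up at.
    public static int Upgrade(int currentVersion, IEnumerable<Migration> migrations)
    {
        var pending = migrations
            .Where(m => m.Version > currentVersion)
            .OrderBy(m => m.Version);

        foreach (var step in pending)
        {
            step.Apply();                  // a real tool would wrap this in a transaction
            currentVersion = step.Version; // and record it in a version table
        }
        return currentVersion;
    }
}
```

Because each database records which version it is at, the same call upgrades a 1.0, 1.1 or 1.2 database to 1.3 in one step: it simply runs whichever scripts have not been applied yet, in order.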

30
Nov

Keep your passwords safe with KeePass

Website logins scare me. It’s frightening how many incompetent and/or lazy and/or irresponsible web developers there are out there who see nothing wrong with storing passwords in plain text in a database, and even worse, give attackers wiggle room to find them by peppering their code with SQL injection vulnerabilities.

Unfortunately, with so many different websites implementing their own login systems, inevitably you have to create dozens of different accounts. And to get round this, pretty much everyone re-uses their passwords all over the place.

The result of this is that if you register on, say, a Christian dating website that subsequently gets hacked, you run the risk of your Facebook account being compromised.

But it simply isn’t practical to have a different password for every site you register on.

Or is it?

Recently I decided to do something about it, so I downloaded and installed KeePass. It’s a Windows program that keeps all your passwords in a strongly encrypted database, allowing you to have different passwords for every site where you have an account, and make them as strong as the site will allow. It has an auto-type feature, where you can get it to enter your user name and password into a web input form for you, and there is a version that you can save on a USB key disk and run on any computer, even if you don’t have administrative rights on it.

[Screenshot: KeePass]

With a tool such as this, you can make your passwords as strong as you like. I set the password generator to choose 25 character passwords containing any kind of character that it’ll give me: letters, numbers, punctuation marks, brackets, you name it. Passwords such as these would keep all the computers in the world guessing well into the Degenerate Era.
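For a rough idea of how strong that is, here is a back-of-the-envelope calculation. It assumes the generator picks each character uniformly at random from the 94 printable ASCII characters (excluding the space); that figure is my assumption, not something taken from KeePass’s documentation.

```csharp
using System;

public static class PasswordStrength
{
    // Bits of entropy in a password of `length` characters, each chosen
    // uniformly and independently from `alphabetSize` possible symbols.
    public static double EntropyBits(int length, int alphabetSize)
    {
        return length * Math.Log(alphabetSize, 2);
    }
}
```

PasswordStrength.EntropyBits(25, 94) comes out at roughly 164 bits, comfortably more than the 128 bits usually considered out of reach of brute force.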

I’m now trying to remember all the websites where I’ve ever registered an account, so I can change my password on all of them. I’ve done all the high risk ones that I use regularly, such as my bank, my web hosting, Facebook and so on. Google has been jogging my memory on various other ones — some of which I had forgotten even existed.

28
Oct

A day of Stack Overflow

Half a dozen or so of us from work were at the London Stack Overflow Dev Days event with several hundred other developers today. I’ve been pretty impressed with the way Jeff Atwood and Joel Spolsky’s enterprise has turned out to be such a resounding success, and I’ve also been an avid reader of Jeff’s blog, Coding Horror, for several years now, so I was naturally delighted to get the opportunity to go.

Encountering Joel and Jeff in real life was an interesting experience, since I’ve only ever read their blogs and Twitter feeds up to now. Over the past year or so, I’ve had to get used to seeing certain people in real life that most people only ever see on TV or on the Internet, but it still seems a bit odd when you do. It certainly gives you a totally different impression of them from what you had before though. You can certainly see why Joel and Jeff in particular are both so successful in what they do: as well as being excellent online communicators, they are both brilliantly engaging and entertaining public speakers. So too was Jon Skeet, who gave a very funny talk about localisation entitled “Humanity: Epic Fail,” assisted by a sock puppet called Tony the Pony. Joel’s talk on FogBugz was a pretty hard sell, but it certainly looks impressive, boasting a feature set that makes Trac look like Notepad.

The other talks included an introduction to Python by Michael Sparks of the BBC, who explained to us Peter Norvig’s 21 line spelling corrector (it didn’t escape my attention that Jon Skeet spent the lunch break porting it to C#); introductions to mobile development for no fewer than three rival platforms (Google Android by Reto Meier, iPhone by Phil Nash, and Qt/Nokia by Pekka Kosonen); introductions to jQuery (Remy Sharp) and Yahoo! Developer Tools (Christian Heilmann); and an academic talk on “How not to design a scripting language” by Paul Biggar, who recommended the book “Engineering a Compiler,” by Cooper and Torczon as a superior alternative to the Dragon Book.

I spoke to Jeff during the afternoon break and asked him if he had any plans to publish the best of Coding Horror in a book. He said he’d thought about it a bit, but wasn’t entirely convinced it was worth doing. It’s something I’ve recently thought that he’d do well to do: a lot of his posts are ones I’d consider “must-reads” for every working developer, and if he did, I’d buy it like a shot. He wouldn’t be the first person to do something like that either; after all, Joel did it (twice), and so did Raymond Chen. It was interesting what he asked me when I told him I work for Parliament: he was most interested to know whether Britain is part of Europe or not. It’s a good question, that. Officially we are, but unofficially I sometimes think that as a country, we’re not entirely sure ourselves.

There were just two disappointments to the day. One was the catering. I was half expecting something along the lines of a buffet lunch—after all, I do tend to think of the Fog Creek Way as one where they go the extra mile to get these things perfect—but it turned out to be the kind of mass produced sandwiches that you get in a motorway service station that are all ridiculously overpriced, taste exactly the same as each other, and don’t meet with my approval anyway because they’re spread up with margarine. The other disappointment was the venue itself. Kensington town hall simply is not big enough for however many of us (800? 1000?) were there today. Consequently it felt very crowded and claustrophobic, and even a little bit uncomfortable, especially during the breaks when we all crowded into the foyer and had to form a queue stretching seemingly all the way to Barking and back to get to the food.

The day ended at about ten past six and I came away with a whole lot of freebies: a Qt rucksack, a copy of the Aardvark’d DVD, a handful of FogBugz pens, and a handful of Stack Overflow, Server Fault and Superuser stickers. All in all, it was a pretty full day (I had to get up half an hour earlier than usual and I got home an hour and a half later than usual, and sitting through seven hours of talks was pretty intense) but it was well worth it.

26
Oct

How to validate a URL in .NET

System.Uri.TryCreate.

You don’t need to use regular expressions.

More generally, if you are trying to do something extremely common, the chances are that whatever framework you’re using, there’s a method or function somewhere in there which will do it for you. And it will almost certainly do it much better than your home-brewed solution will.
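In case it helps, here is a minimal sketch of what that looks like in practice. UrlValidator and IsHttpUrl are names I have made up, and restricting the result to http and https is my own assumption about what “valid” usually means on a website:

```csharp
using System;

public static class UrlValidator
{
    // True if the text parses as an absolute http:// or https:// URL.
    public static bool IsHttpUrl(string candidate)
    {
        Uri uri;
        return Uri.TryCreate(candidate, UriKind.Absolute, out uri)
            && (uri.Scheme == Uri.UriSchemeHttp || uri.Scheme == Uri.UriSchemeHttps);
    }
}
```

The scheme check is there because Uri.TryCreate on its own will happily accept things like ftp: or mailto: addresses, which are well-formed URIs but probably not what you want in a “website” field.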

29
Sep

Book review: The Art of Unit Testing

Unit testing is one of the programming disciplines that, to my mind at least, sets apart the reputable developers from the cowboys. Test driven development has become increasingly popular over the past few years, and for many teams, it’s becoming as fundamental a discipline as using source control. It’s an idea that sounds almost like a no-brainer at first: you write a bunch of methods that feed test data into your code and examine the result, and they report either success or failure. Then when you start extending and modifying your code, you can re-run your tests to check that everything still works as it’s supposed to, and you haven’t broken any dependencies.

Yet of all the programming disciplines that I’ve had to get to grips with over the years, unit testing has probably been one of the most difficult. Setting up your project to facilitate automated testing involves quite a lot of groundwork. You have to configure a test environment, perhaps with a test database or other form of test data source, and you also have to completely re-think the way you write your code. It requires greater discipline — it’s all too easy to think “Stuff that,” and slip into the old mindset of just slapping down your code and testing it manually. Some legacy code may seem almost impossible to test, especially if it is a Big Ball of Mud with long, multi-purpose methods and no clear separation of concerns. Then there is the question of what to do with external dependencies — components that are beyond the control of your unit tests, such as third party web services, input devices, and even the current date and time.

If you’re wrestling with problems such as these, Roy Osherove’s book, The Art of Unit Testing, is a must-read. Most treatments of unit testing that I’ve come across only cover the topic at a fairly basic level, but this one picks up where they leave off. I was pleased to see some good solid chapters on stubs, mocks and dependency injection, for instance: these are essential tools needed to break external dependencies. I was also pleased to see a whole chapter devoted to working effectively with legacy code; another whole chapter is devoted to the political and organisational implications of introducing unit testing into your organisation.
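To give a flavour of what breaking an external dependency means, here is a small sketch of my own (not taken from the book) that uses a stub and constructor injection to take the current date and time out of the equation. IClock, StubClock and Greeter are all invented for the example.

```csharp
using System;

// The dependency, expressed as an interface so it can be swapped out.
public interface IClock
{
    DateTime Now { get; }
}

// Production implementation: the real system clock.
public sealed class SystemClock : IClock
{
    public DateTime Now { get { return DateTime.Now; } }
}

// Test stub: always reports the time it was given.
public sealed class StubClock : IClock
{
    private readonly DateTime _now;
    public StubClock(DateTime now) { _now = now; }
    public DateTime Now { get { return _now; } }
}

// The code under test receives its clock through the constructor.
public sealed class Greeter
{
    private readonly IClock _clock;
    public Greeter(IClock clock) { _clock = clock; }

    public string Greet()
    {
        return _clock.Now.Hour < 12 ? "Good morning" : "Good afternoon";
    }
}
```

A test can then construct new Greeter(new StubClock(new DateTime(2009, 9, 29, 9, 0, 0))) and check for “Good morning” regardless of when the test suite happens to run, while production code passes in a SystemClock.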

The book assumes a basic familiarity with object oriented programming and design patterns (which every working developer should understand properly anyway) but it isn’t a difficult read, and it explains things very clearly. Code samples are given in C#, but developers working with other languages will also find much in it of benefit. There are plenty of suggestions that may not have occurred to you, and you’ll end up much more aware of the “tricks of the trade.”

My only disappointment was that four major, and potentially thorny, aspects of the subject (database, UI, web and thread-related testing) were only given a brief overview in the second half of Appendix B. Of course it could be argued that these were outside the scope of this book: indeed, Osherove argues that unit testing in some of these areas in particular has a relatively low return on investment, and in others, unit testing frameworks and methodologies are still very much in their infancy. However, I was hoping for a somewhat more extensive treatment of these aspects of the subject and it was a little bit disappointing that they only got six pages.

But this doesn’t detract from the value of the book. Whether unit testing is your “thing” or not, it is very much a must-read for every working .NET developer, and indeed for developers working with other languages too. It deserves to be as much a classic as books such as Code Complete, Design Patterns, or Martin Fowler’s Refactoring.

25
Sep

Joel Spolsky, cowboy coder

Some people at work think I’m a bit of a Joel Spolsky fanboy. I’ve certainly got a lot of useful hints and tips out of his blog, Joel on Software, and I’ve been recommending it as a must-read for fellow developers and managers alike.

Until now.

What’s really, really annoying me about Joel these days is that he’s started preaching cowboy coding. And I don’t like it.

His latest post, The Duct Tape Programmer, illustrates this perfectly. He has yet another dig at Architecture Astronauts who come up to you when you’re racing to get an upgrade ready for deployment and tell you that you need to refactor your core code to use multi-apartment threaded COM. Not unreasonable. But then, he goes on to wax lyrical about Jamie Zawinski, who is, to use Joel’s words, the Pretty Boy of Software Development. He wrote Netscape Navigator in the 1990s and is Joel’s hero because—get this—he didn’t write unit tests:

Zawinski didn’t do many unit tests. They “sound great in principle. Given a leisurely development pace, that’s certainly the way to go. But when you’re looking at, ‘We’ve got to go from zero to done in six weeks,’ well, I can’t do that unless I cut something out. And what I’m going to cut out is the stuff that’s not absolutely critical. And unit tests are not critical. If there’s no unit test the customer isn’t going to complain about that.”

Now perhaps Mr Zawinski is brilliant enough to get away with a “duct tape programming” approach, but it is not something that we should be encouraging, not least because it is deeply unprofessional. Besides, what he says about unit testing is just WRONG.

Here’s the score with test-driven development. Once you know what you’re doing, it doesn’t slow you down overall. Yes, it takes longer to write the code in the first place, but this is offset by a decrease in time spent debugging, and on top of that, you get a dramatic increase in confidence in your code, a more robust design, and a reproducible, easy to run way of verifying that your code does what it’s supposed to. Every professional developer who is concerned about quality these days writes unit tests insofar as they can. If you don’t like the idea, you are a cowboy coder. Period.

Joel’s sentiments such as these have been bugging me since the start of this year, when he crossed swords with “Uncle Bob” Robert C Martin over the SOLID principles on the Stack Overflow podcast. Back in January, his position was summarised like this:

Joel revisits the SOLID principles, and compares them to designing replaceable batteries, or a headphone jack, into your product. Appropriate in some narrow cases, but not all the time. Imagine a consumer product where every single part of it could be plugged in and replaced with another compatible part. Is that realistic?

Actually, Joel, I can perfectly well imagine such a consumer product. In fact I own one such consumer product myself where this kind of design pattern is very important. It’s called a car.

Let me explain. On the journey up to Faith Camp in August, another driver ran into the back of said car and left it looking rather sorry for itself. The repairs took three weeks, and involved replacing the rear bumper, the rear hatch, and various panels. Ford makes appropriate parts that can be slotted and bolted into place with relative ease as replacements for the damaged originals.

Imagine a car built the way Joel’n’Jeff seemed to be suggesting: one monolithic structure with non-interchangeable parts. One ding like that and it’s a write-off. When the tyres wear thin, or the bulbs go, you have to scrap it. Is that realistic?

Architecture astronauts are a straw man here. We’re not talking about highfalutin over-engineered ProviderSingletonVisitorAdapterMediatorRepositoryFactory design patterns, nor are we talking about more layers of abstraction than The Princess and the Pea. We’re talking about a common sense approach to software development. It’s the kind of thing that you’re either doing anyway without knowing what it’s called, or else it gives you that “aha!” moment when you first encounter it and makes you think, “Now how come I never thought of that before?”

One other thing. When you’re repairing a car after an accident, you most certainly do NOT just use duct tape and WD-40. No matter how much of a Pretty Boy you think you are.

16
Sep

If you are saving passwords in clear text, you are probably breaking the law

The Christian dating website that got hacked by 4chan back in August was a textbook illustration of why you should not store users’ passwords in plain text. Most of the users of the site had re-used their user names and passwords on Facebook and their e-mail accounts, which were compromised as a result, in many cases in extremely embarrassing ways.

Reading about this (and a similar event somewhat closer to home a week later) has got me thinking about the whole issue again. A couple of years ago, Mats Helander proposed on his blog that saving plain text passwords should be illegal. (Unfortunately he lost his domain name to squatters a few months later, but the post is still up in the Wayback Machine.) His post was in response to some of Jeff Atwood’s readers, who pointed out that many web developers have bosses and clients who insist on them storing passwords in clear text so that they can e-mail password reminders to their users. To be sure, you can try explaining to them that there are alternative approaches that don’t compromise usability, but if your boss is an “I’m not a computer person” type, or just doesn’t care, you might as well try to strike a match on jelly, or you may even find your job on the line. However, if you could tell your boss or clients that they were asking you to do something illegal, you’d be in a much stronger position to push back.
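For what it's worth, one common alternative to e-mailing password reminders is a single-use, expiring reset link. The following is only a rough sketch of the idea in Python, not a recommendation of any particular framework: the function names, the `record` dictionary standing in for a database row, and the one-hour lifetime are all assumptions of mine for illustration.

```python
import hashlib
import hmac
import secrets
import time

RESET_TTL_SECONDS = 3600  # assumed lifetime of a reset link: one hour


def issue_reset_token(now=None):
    """Create a random token to e-mail to the user, plus the record to store.

    Only a hash of the token is stored, so a leaked database does not
    hand out usable reset links.
    """
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(32)
    record = {
        "token_hash": hashlib.sha256(token.encode("utf-8")).hexdigest(),
        "expires_at": now + RESET_TTL_SECONDS,
        "used": False,
    }
    return token, record


def redeem_reset_token(token, record, now=None):
    """Return True once for a valid, unexpired token, and False otherwise."""
    now = time.time() if now is None else now
    if record["used"] or now > record["expires_at"]:
        return False
    presented = hashlib.sha256(token.encode("utf-8")).hexdigest()
    if not hmac.compare_digest(presented, record["token_hash"]):
        return False
    record["used"] = True  # single use: a second attempt is rejected
    return True
```

The user gets back into their account and chooses a new password, and at no point does the site need to know or store the old one.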

Now I am not a lawyer, but the other day, I took a close look at the Data Protection Act 1998, and if I understand it correctly, saving passwords in clear text is indeed illegal here in the UK.

The relevant part of the Act is Schedule 1, Part I, paragraph 7, which states the seventh of eight Data Protection Principles:

Appropriate technical and organisational measures shall be taken against unauthorised or unlawful processing of personal data and against accidental loss or destruction of, or damage to, personal data.

This is expanded on in Schedule 1, Part II, paragraphs 9-12, which tells us how to interpret this principle. Paragraph 9 in particular says:

9 Having regard to the state of technological development and the cost of implementing any measures, the measures must ensure a level of security appropriate to—

(a) the harm that might result from such unauthorised or unlawful processing or accidental loss, destruction or damage as are mentioned in the seventh principle, and

(b) the nature of the data to be protected.

It should be noted that these restrictions apply to “personal data” as well as to “sensitive personal data.”

As Mats argued, and I would reiterate, and the Christian dating website/4chan incident illustrates dramatically, losing people’s passwords has the potential for immense harm. Defacing Facebook profiles can cause serious embarrassment and possibly even wreck careers, but if the attacker then gets access to your e-mail account, they can obtain or request new passwords for even more sensitive websites such as your bank, your credit cards, and so on.

It seems obvious to me that storing plain text passwords in a database most certainly does not “ensure a level of security appropriate to the harm that might result from such unauthorised or unlawful processing or accidental loss” as required by the law. The state of technological development provides us with a much better solution — a one-way salted hash, which is computationally infeasible to reverse engineer — and since there are still perfectly adequate solutions to the login recovery problem, the cost of doing so is negligible.
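To show how little effort this takes, here is a minimal sketch of a salted one-way hash using nothing but Python's standard library. It is an illustration under assumptions, not a definitive implementation: I have picked PBKDF2 with SHA-256 as one example of a salted hash, and the function names and the iteration count are my own choices.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor; tune it for your own hardware


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest). Store both; never store the password itself."""
    salt = os.urandom(16)  # a fresh random salt for every password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive the digest from the login attempt and compare the two."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    # Constant-time comparison, so timing does not leak how close a guess was.
    return hmac.compare_digest(candidate, expected)
```

Because every user gets a different salt, two people who pick the same password end up with different digests, and at login you only ever compare digests. The plain text never needs to touch the database.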

I’d be interested to hear from anyone who specialises in the legal issues surrounding computer security whether my understanding of the Data Protection Act is correct here. Do you concur with my conclusions? Or do you think that the law needs to be made more explicit on this matter?