Programming

29
Jun

Why SQL Server 2005 database projects in VSTS are a bad idea

I’ve been working a bit lately with a project that uses SQL Server 2005 database projects in Visual Studio 2008 Team System. These are different from the conventional database projects that you get in Visual Studio Professional, since they have extra features that allow you to do schema and data comparisons, and, in theory at least, manage database deployments and migrations.

The idea is that you should be able to design your database using visual designers rather than having to write all that nasty SQL code to script it for you. Visual designers make things so much easier at the planning and initial design stage, and once you are done, you can use the various schema comparison and script generation tools to generate your production database.

The problem comes when you want to manage your database’s entire lifecycle. I’m sure that many developers will have scratched their heads at some stage about this problem. You chop and change your database on your development server, most likely using the visual tools—but how do you reliably replicate these changes on your live server?

The Microsoft approach here is to rely on the schema comparison tools to generate change scripts that you can then run against your database. Some people think it’s a silver bullet. I beg to differ.

The first problem is that while schema comparisons can make a good starting point, the scripts they generate don’t always work properly out of the box, if at all. Some database refactorings simply can’t be done using schema comparison tools. Examples include normalisation refactorings such as moving data from one table or column to another; introducing constraints or changing a column’s data type when you need to do some data cleanup first; or modifying reference data. Even relatively straightforward refactorings—or even, in some cases, no refactoring at all—can be problematic: if your production and development databases get their collation orders out of sync, for instance, the script may refuse to run at all. And the thought that anyone would blindly use this option on the project properties page makes me shiver:

Perform "smart" column name matching when you add or rename a column

In other words, you’re asking it to guess what’s changed.

Testability—and when you’re dealing with an abstraction as leaky as this one, testability is vital—is another issue. Unfortunately, SQL Server 2005 database projects have serious shortcomings in this area too. They do offer unit testing features, but these only apply to the final database. There doesn’t seem to be any way of integration testing your migrations themselves: you don’t have a consistent record of what’s changed in a format that can easily be applied to a blank or reference database, so there’s no way of verifying that you’re getting the expected results when you’re going from “before” to “after.”

Then there are the change scripts generated by schema compare itself. They are a morass of long winded, convoluted, hard to maintain, spaghetti code. Adding a column to a database table involves dropping and re-creating the table: this is understandable if you need to put the column in the middle of the roster, since SQL Server does not have an AFTER clause in the ALTER TABLE statement like MySQL does, but even if you add a column on to the end, it still drops and re-creates the table. A task that needs only one or two lines of code ends up taking eighty. If you rely on schema comparison tools, sooner or later you are going to need to edit your scripts, and when that happens you’ll find that you’d have been quicker just writing the change that you needed by hand in the first place.

All in all, this seems far too leaky an abstraction to give me any confidence in it to manage a database lifecycle. There is simply no substitute for scripting every database migration, checking it into source control, and having your unit tests run them all on a blank, or reference, database, and having some record in your production database of which scripts have been run and which haven’t. And while schema comparison tools may be a life saver if you lose track of things for any reason, they are a very poor alternative to generating your migration scripts by hand.
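To make that concrete, here is a minimal sketch of the kind of migration runner I have in mind. It is written in Python against SQLite (through the standard sqlite3 module) purely as a stand-in for SQL Server, and the schema_migrations table, the script names and the apply_migrations helper are all hypothetical names of my own choosing, not part of any tool mentioned above.

```python
import sqlite3

# Hypothetical hand-written migrations, in the order they must run.
# In a real project each one would be a .sql file in source control.
MIGRATIONS = [
    ("001_create_customer",
     "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"),
    ("002_add_customer_email",
     "ALTER TABLE customer ADD COLUMN email TEXT"),
]


def apply_migrations(conn, migrations):
    """Run each migration not yet recorded in the database.

    Returns the names of the migrations applied by this call.
    """
    # The database itself keeps the record of which scripts have been run.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_migrations (name TEXT PRIMARY KEY)"
    )
    already_run = {
        row[0] for row in conn.execute("SELECT name FROM schema_migrations")
    }
    applied = []
    for name, sql in migrations:
        if name in already_run:
            continue
        conn.execute(sql)
        conn.execute("INSERT INTO schema_migrations (name) VALUES (?)", (name,))
        conn.commit()
        applied.append(name)
    return applied


if __name__ == "__main__":
    # A blank in-memory database, as an integration test would use.
    conn = sqlite3.connect(":memory:")
    print(apply_migrations(conn, MIGRATIONS))  # first run applies everything
    print(apply_migrations(conn, MIGRATIONS))  # second run has nothing to do
```

Note that adding the email column takes one line here. Because the runner works the same way on a blank database as on a live one, the whole chain of scripts can be exercised from a test before it goes anywhere near production.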

01
Jun

Why would anyone not use source control?

There’s a question over on Stack Overflow that asks if there are any good reasons for not using source control. It’s a question I’ve been racking my brains over for a while now, especially since you do occasionally encounter people who claim they have good reasons not to. The most common such reason that I come across is that they’re a lone developer — an excuse that simply shows that they haven’t a clue what source control actually is.

One person pointed out that physicists are particularly unlikely to use source control:

For the casual programmers – those to whom programming is just a tool, such as many of the people I work with (scientists) – much of the work is hackish and small scale, there may be a dozen other things that are more likely to fail outside the code which could also be eliminated with better practices.

As a colleague put it, “we don’t get published for writing beautiful code”.

Interesting point that. Most programs written by physicists tend to be no more than a few hundred lines long, or even just a Microsoft Excel spreadsheet, and once they’re debugged and working, they usually don’t change. This is of course the exact opposite of business and web programming, where requirements change faster than you can keep up with them. However, you can’t really generalise here. I’d be very surprised, for instance, if NASA doesn’t use some form of source control for the Mars rovers.

Another person gave an answer that was especially worth commenting on:

“For the first 10 years of kernel maintenance, we literally used tarballs and patches, which is a much superior source control management system than CVS is” –Torvalds

If you’ve got quick/easy/automatic backups, you’ve already got 95% of what most of us use VC for. Somebody with a local DVCS repository on his HD but no backups is actually in much worse shape.

Using a VCS does have a real cost, and it’s usually a small one but not always. Every VCS I’ve ever used, I’ve had days where I had to fight with it for hours just to get it to do something that should have been simple.

To those that think “There are no good reasons not to use version control”, where does it end? Must every project have 100% unit test code coverage? Must every project have code reviews? Coding standards? A complete functional spec?

There’s a whole spectrum of programming projects in the world. Not everybody is writing code for the space shuttle. Sometimes being able to diff my code from 11:00am and 11:30am is simply not that important.

Some are merely managing globally-distributed teams of thousands writing operating system kernels.

This is another interesting point — if the Linux kernel managed fine without source control for ten years, why should we use it? In actual fact, the commenter is not entirely correct: the Linux kernel has been under source control since 2002 and Linus Torvalds even wrote his own source control system because he was dissatisfied with all the others that were available at the time. But this is an indictment of CVS in particular, not of source control in general — at the time the choice that you had was between that and something costing an arm and a leg.

This highlights another fairly common reason why people shy away from source control: they perceive it as being more trouble than it’s worth. In recent years, most developers’ first experience of source control has been Subversion. Once you get used to it, Subversion is pretty powerful and works very well, but unfortunately it is not a good example to throw at beginners when telling them they need to use source control. Getting your project under source control in the first place with it is a faff, and I’ve lost count of the number of times that it’s gotten so confused with itself that I’ve had to do a fresh checkout just to get it working properly again. And all those extraneous .svn directories that pollute your project’s filespace can be a major irritation at times.

So what is the best option to convince the naysayers? In a word: Mercurial.

Recently I’ve been playing with some of the new distributed source control systems such as Git and Mercurial, and I get the impression that they are much better suited to new and casual developers than Subversion. They’re a lot easier to use for starters — in combination with visual front ends such as TortoiseHg, you can get your entire project under source control with only three or four mouse clicks. They also have fewer pitfalls and gotchas — you can rename and delete files and directories much more easily without creating a whole lot of confusion, for instance.

Another big advantage of modern distributed source control systems such as Mercurial is that they scale down as well as up. Mercurial creates a single .hg directory in your project’s root which acts as a complete repository in and of itself. For a lone developer this is probably all you need, in tandem with a decent backup strategy, and it even makes it entirely reasonable to get your throwaway scripts under source control. After all, throwaway scripts have a rather nasty habit of not being as throwaway as we first thought they would be.

For development teams, you can have a central repository in addition to the developers’ personal ones, and push the changes to the central server once you’re done. For really big projects, you can have a whole hierarchy of source control servers, with changes being pushed up to the next level once they have passed quality control and whatever other processes you may have in place.

There may have been reasonable excuses for not using source control five years ago on small, trivial projects. But with the latest generation of tools, these excuses are getting flimsier and flimsier every day. Even for physicists.

30
May

Sorry, but I am not a SharePoint expert

I’ve just been taking a look to see who’s following me on Twitter, and it seems that I’ve picked up a handful of SharePoint developers along the way. No doubt this stems from the fact that two of my most popular blog entries are SharePoint posts, almost entirely due to the fact that they feature rather prominently in various Google searches of a SharePointesque nature. It makes me wonder if I’ve unwittingly picked up a bit of a reputation as something of a B-list SharePoint guru.

Well I’m sorry to disappoint you folks, but I’m not one.

Those particular blog entries were actually my initial impressions of the first, and so far the only, SharePoint project I have ever worked on. I had only the vaguest idea of what I was doing and most of my SharePoint efforts at the time were firmly in the “cargo cult” category, as they generally are when you’re plunged in at the deep end with a new, unfamiliar and complex technology, no training, and a tight deadline. Furthermore, neither of them was intended as a knowledge base type article, but as a rant: one of them about insanely over-complicated functionality and the other about an idiotic MSDN knowledge base article that didn’t work.

Now I have no idea what effect this post is going to have on my Feedburner subscriptions and Twitter following. If you fall into that category you’re more than welcome to stick around of course, but I just don’t have anything more to say on the subject. I think my SharePoint skills advanced beyond the cargo cult stage as the project progressed, but since I have not been developing for that particular platform for nearly a year now, I am no longer blogging about it either.

15
May

Initial SQL files breaking in Django syncdb

There’s a fairly long-standing bug in Django where initial SQL files break if they contain a double hyphen in a quoted string when you run manage.py syncdb. This bug also causes problems if you’re trying to create stored procedures in MySQL and need to change the delimiter to allow multiple statements.

Seems the problem is some overly naive code to split the SQL file into individual statements. Unfortunately there’s no easy fix without at least partially reinventing the SQL parser or spawning a separate mysql command line client process. Best approach in the meantime is to avoid using stored procedures and check any scripts that you do have using Django’s unit test framework.
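To illustrate the class of failure, here is a small sketch. It is not Django’s actual code: naive_split is a hypothetical splitter of my own that strips -- comments and splits on semicolons with no awareness of quoted strings, and quote_aware_split is an equally hypothetical alternative that tracks single-quoted strings. The second one is still nowhere near a real SQL parser (it ignores backslash escapes, /* */ comments and MySQL’s DELIMITER command), which is rather the point.

```python
import re

# Two statements; the first has "--" and ";" inside a quoted string.
SAMPLE = (
    "INSERT INTO note (body) VALUES ('a -- b; c'); -- trailing comment\n"
    "INSERT INTO note (body) VALUES ('it''s ok');"
)


def naive_split(sql):
    """Strip '--' comments and split on ';', ignoring quoted strings."""
    without_comments = re.sub(r"--.*", "", sql)
    return [s.strip() for s in without_comments.split(";") if s.strip()]


def quote_aware_split(sql):
    """Split on ';' and drop '--' comments, but only outside '...' strings."""
    statements, current = [], []
    in_string = False
    i = 0
    while i < len(sql):
        ch = sql[i]
        if in_string:
            current.append(ch)
            if ch == "'":
                # A doubled quote ('') closes and immediately reopens the string.
                in_string = False
        elif ch == "'":
            in_string = True
            current.append(ch)
        elif sql.startswith("--", i):
            # Comment outside a string: skip to the end of the line.
            newline = sql.find("\n", i)
            i = len(sql) if newline == -1 else newline
            continue
        elif ch == ";":
            statements.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
        i += 1
    statements.append("".join(current).strip())
    return [s for s in statements if s]


if __name__ == "__main__":
    print(naive_split(SAMPLE))        # mangled: the first string literal is cut short
    print(quote_aware_split(SAMPLE))  # two intact statements
```

The naive version treats everything after the -- inside the first string literal as a comment, so the two statements collapse into a single broken one.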

11
May

Reinventing the wheel, badly

A few years ago, I inherited a VB.NET application in which every method (many of which were four hundred lines long, copied and pasted all over the place, and peppered with vague sounding variable names such as blnRunIf and blnRunElse) contained this boilerplate code:

Sub MyMethod()
    Try
        '
        ' ** snip ** '
        '
    Catch ex As Exception
        Throw New Exception ("MyClass.MyMethod::" + ex.Message)
    End Try
End Sub

Those who do not understand Exception.StackTrace are doomed to reinvent it, badly.

04
May

Handling exceptions in assembly-level setup methods in MbUnit

MbUnit allows you to run assembly-level setup and teardown methods as part of your unit tests using the AssemblyCleanUpAttribute:

[assembly: AssemblyCleanUp(typeof(AssemblyCleaner))]
public class AssemblyCleaner
{
    [SetUp]
    public static void SetUp()
    {
        // blah
    }
    [TearDown]
    public static void TearDown()
    {
        // blah
    }
}

This is very useful if you want to do something like restore your database to a known configuration, perhaps incorporating all your change scripts into your unit tests. Unfortunately, there is a little gotcha. If your SetUp() method throws an exception, none of your unit tests will run, but MbUnit will still report success.

For what it’s worth, I think this is a bug, not a feature, but there is a way round it. Capture any exception, and create a unit test that re-throws it:

[assembly: AssemblyCleanUp(typeof(AssemblyCleaner))]

public class AssemblyCleaner
{
    private static Exception setupException = null;

    [SetUp]
    public static void SetUp()
    {
        try {
            // blah
        }
        catch (Exception ex) {
            setupException = ex;
        }
    }

    [TearDown]
    public static void TearDown()
    {
        // blah
    }

    internal static void RethrowSetupException()
    {
        if (setupException != null) {
            // Wrap the original exception to preserve its stack trace
            throw new InvalidOperationException(
                "An error occurred when setting up the tests",
                setupException);
        }
    }
}

// You need to have your unit tests in a separate class.
// MbUnit doesn't like you including test fixtures
// in your assembly cleanup class.

[TestFixture]
public class AssemblyCleanerTest
{
    [Test]
    public void ReportSetupException()
    {
        AssemblyCleaner.RethrowSetupException();
    }
}

29
Apr

XsltArgumentList violates the Single Responsibility Principle

Can somebody please enlighten me as to why XsltMessageEncountered and AddExtensionObject are members of System.Xml.Xsl.XsltArgumentList?

It seems like a violation of the Single Responsibility Principle to me: neither of them has anything to do with the list of arguments that you pass into your transform. It would make more sense if they were members of XslCompiledTransform instead.

27
Apr

Using jQuery? Check out jQuery UI

I’ve been a fan of jQuery for quite some time now, but for various reasons I’d never got round to investigating jQuery UI until earlier this week, when I had my attention drawn to it when I was asked why I hadn’t used it in a project that I’ve been working on. So this weekend, seeing as I needed a date picker and a tab control for another project, I decided to download it and have a tinker.

It’s a whole bunch of UI widgets and interactive features that sit on top of jQuery, in much the same way as Scriptaculous sits on top of Prototype, and I’m pretty impressed with it. It’s every bit as easy to use as jQuery itself, and it follows the same philosophy — for example, you can add a date picker to as many textboxes as you like with a single line of code:

$('.datepicker').datepicker();

There are other widgets in addition — an accordion, an in-page dialog box (which seems to be a possible alternative to Cody Lindley’s Thickbox), a progress bar, and a slider. Interactions include drag and drop, resizing, fancy selecting, and drag and drop sorting, and there is a whole bunch of interesting effects and transitions on top of that. There are also seventeen different visual themes to choose from, or alternatively, you can design your own either manually or using a handy web-based theme roller application.

There’s quite a lot of code to it — a full installation with one of the seventeen themes on offer will add 300K to your page download, so it’s probably not ideal for pages on your site that get a lot of traffic from different visitors. But you can reduce the size of the files to only include what you need in a custom generated download. If you are already using jQuery and are looking for a date picker, or a tab control, or any of the other widgets and effects that it has to offer, it’s well worth checking out.

20
Apr

Some thoughts on the role of open source experience in recruitment

Jeff Atwood wrote an interesting post the other day where he asked the question, Is Open Source Experience Overrated? He quoted an anonymous developer who had lamented not being able to find a job despite being the architect of a couple of open source projects:

One company seemed impressed with my enthusiasm for the job but it was part of their policy to provide coding tests. This seemed perfectly reasonable and I did it by using the first solution I thought about. When I got to the phone interview, the guy spent about five minutes telling me how inefficient my coding solution was and that they were not very impressed. Then I asked whether he had looked at the open source projects I mentioned. He said no – but it seems his impression was already set based on my performance in the coding test. The coding test did not indicate what criteria they were using for evaluation but my solution seemed to kill the interview.

I must confess that I was a little bit incredulous when I read this. It seems the correspondent was expecting the recruiter to consider his open source contributions as a substitute for his poor performance in the coding test — and it would be an irresponsible recruiter who did that.

The problem with open source contributions is that they don’t necessarily tell a recruiter what they need to know about you. They indicate passion and enthusiasm for programming, which is a plus, but they don’t necessarily indicate competence — and passion is no substitute for competence. In fact, some open source projects, such as this example cited by Ayende the other week, are very badly written and do not reflect well on the developer. And in the recruitment process, it’s competence that a responsible recruiter will be looking for first and foremost. That’s why it’s so important to have a coding test. The team needs to have a certain baseline standard, and that standard needs to be treated as a “must have” — failing the technical test should be an automatic “no hire,” regardless of passion or experience. Yes, they should take a look at your open source contributions, but they only really become relevant in the later stages of the recruitment process, once they have carried out the technical screening and are getting into the final round of interviews.

Here are some examples of things that a recruiter may want to know about your abilities that he or she is unlikely to find in a typical open source project:

  • How quickly you can write code
  • How quickly you can troubleshoot code
  • How quickly you can learn new technologies
  • The extent of your understanding of the company’s core technologies (how much you have committed to memory, and how much you need to refer to the documentation)
  • How well you can understand and implement customer requirements
  • How clean you can keep your code when working to tight deadlines
  • Whether you can grok recursion
  • Whether you understand Big O Notation and its implications for performance
  • How well you can integrate with the rest of the team

Different companies also have different attitudes towards open source. The ones that are most likely to be interested will advertise job openings in the open source community itself — they reason that that way they have a generally richer pool of talent to draw on than the hordes of unqualified and mediocre people that you get in places like monster.com and recruitment agencies. Larger organisations, on the other hand, are only likely to take notice if your contributions are to a project that they’re actually using, such as NHibernate or Tomcat, and some companies actually have a prejudice against it, so it pays to do some research and bear in mind that YMMV.

One other thing. If you want to pitch your open source contributions to your prospective employer, you need to make sure that your code is the best quality that you can give it. You might find that they take it into account in ways that you hadn’t expected, and if you’re doing stupid things such as silently swallowing exceptions, or writing thousand-line, hundred-parameter, multi-responsibility, copy-and-paste methods, or giving them names like DoIt(), or if your code formatting is sloppy and difficult to follow with inconsistent indentation, it may work to your disadvantage if they ask a technical reviewer to look at it. Open source coding isn’t like coding for work or profit: you have no deadlines, so you can afford to give it a lot more tender loving care.

14
Apr

Beware of second hand contracts

There is a corollary to the concept of technical bankruptcy that I wrote about last week. When you take over a project that was originally built by another company, there will obviously be the inevitable ramp-up time as your developers get to grips with the system, but on top of that, you are taking on that project’s technical debt.

We learned this the hard way with one of the worst projects I ever worked on. The client needed a new front end to their website in a timescale of only six weeks, and the sales guys said that of course we could do it, as non-technical sales guys generally do. However, only when they had signed on the dotted line did we get to look at the code.

The levels of technical debt were horrendous. The system, which had little or no coherent documentation, consisted of dozens of classic ASP files of extremely ugly copy-and-paste code, two thousand line stored procedures with up to eighty parameters each, and I’m sure that some of the identifier names were in Klingon. Five months of late nights, much stress and one ruined Christmas later, we eventually delivered something that I considered approximately halfway between “me-ware” and “pre-alpha” in quality, then spent the next two months frantically fixing bugs in the system.

There is a cautionary tale here. When you are taking over a system that was developed by another company, there is absolutely no way that you can give any estimates whatsoever until you have looked at the code base. A five-stage booking form may not sound like a big deal, but when you have to plug it into convoluted and undocumented spaghetti code that borders on the incomprehensible, all bets are off. In this case, the business logic was all in the two thousand line stored procedures, which were designed entirely with the workflow of the old site in mind, and the workflow of the new site was substantially different. On top of that, the client also wanted web services integration, so we had to build a fairly complex abstraction layer on top of it.

In any case, you should certainly ask why the previous company isn’t supporting it any more. It may be that the technical debt grew to levels that they can no longer handle, and that is why they are fobbing it off onto you.