james mckay dot net

because there are few things that are less logical than business logic

March 2009

30 Mar

Django custom manage.py commands not committing transactions?

Over the weekend I was doing a bit more Django development, and I ran into a rather strange problem. I had written a custom command for manage.py whose last task was to update some rows in a couple of database tables using some raw SQL commands. Strangely, the changes weren’t showing up when I looked at the data in MySQL Query Browser afterwards. Even more strangely, when I copied and pasted the raw SQL and ran it in Query Browser, it worked fine.

A little extra debugging code confirmed that the SQL commands were being executed, which strongly suggested that they were running inside a transaction that was never committed. The fix was simply to add these two lines at the end of my custom command:

from django.db import transaction
transaction.commit_unless_managed()

I’m not sure whether this is a bug or a feature. Django’s default transaction behaviour, in HTTP requests at any rate, is the autocommit model: every change you make is committed as soon as it is run. I’d have expected management commands to behave the same way, but clearly something different is happening here. If anyone can enlighten me, I’d be interested to know.
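
For anyone who wants to see the symptom in isolation, here is a minimal sketch using Python’s built-in sqlite3 module rather than Django and MySQL. That substitution is my own assumption for illustration, as are the notes table and the helper names. It shows a write that has been executed but stays invisible to a second connection until commit() is called, which is the same pattern I was seeing from Query Browser. It says nothing about why Django’s management commands behave as they do.

```python
import os
import sqlite3
import tempfile


def visible_rows(path):
    """Count rows as seen from a fresh, independent connection."""
    conn = sqlite3.connect(path)
    try:
        return conn.execute("SELECT COUNT(*) FROM notes").fetchone()[0]
    finally:
        conn.close()


def demo():
    """Return the row counts another connection sees before and after commit."""
    fd, path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    try:
        writer = sqlite3.connect(path)
        writer.execute("CREATE TABLE notes (body TEXT)")
        writer.commit()

        # With sqlite3's default settings, this INSERT implicitly opens a
        # transaction that stays open until commit() or rollback().
        writer.execute("INSERT INTO notes (body) VALUES ('hello')")
        before = visible_rows(path)  # the other connection cannot see the row yet

        writer.commit()
        after = visible_rows(path)  # now the row is visible
        writer.close()
        return before, after
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

The statement runs without error in both cases. Only the commit decides whether anyone else gets to see the result.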

26 Mar

OO: the lingua franca of modern software design

There is no excuse whatsoever these days for working professionally as a developer and not being thoroughly familiar with object-oriented programming.

Occasionally I meet people who have some crazy idea that it isn’t necessary, claiming that it leads to over-complicated code that is difficult to understand. Indeed, there are some software packages — WordPress springs particularly to mind — that make very little, if any, use of OO, yet they do pretty well. Furthermore, I wouldn’t necessarily expect new junior developers to be completely up to speed with it — after all, we all need to start off somewhere, and it’s perfectly reasonable to cut them some slack right at the start of their programming careers.

However, if you are working professionally with code, whether as a web developer or as any other kind of developer, then regardless of whether or not you are actually using it, getting to understand OO properly needs to be right at the top of your list of priorities.

Why? Because OO is the lingua franca of modern software design.

OO has been mainstream for nearly twenty years now, and before that it had been knocking around in academic circles and research labs since the sixties. In fact, it’s now pretty pervasive — nearly every framework out there is built round OO principles, and you can’t work as a developer for long without running into it.

More importantly, OO is a communication tool for discussing your application architecture and design decisions with your fellow developers. If you don’t understand it properly, or if you misunderstand it, you will not be able to hold a meaningful discussion with your colleagues about the structure and high level architecture of your code. You will not be able to understand why computer programs are designed the way they are, or why frameworks work the way they do. You will not be able to design your code to be easily extensible, nor will you be able to build usable, stable service interfaces. And you will be the one writing convoluted spaghetti code that is almost impossible to maintain.

You need to understand it correctly, though. A lot of developers think they understand OO but in fact harbour numerous fuzzy conceptual misunderstandings about what it involves. Just wrapping a bunch of procedural code in a class doesn’t make it object-oriented any more than going into McDonald’s turns you into a hamburger. Not knowing the difference between static methods and instance methods is a dead giveaway here: if you are creating instances of a class that consists entirely of stateless helper methods, you’re doing it wrong. I’ve come across numerous examples of this kind of thing over the years:

public class Helper
{
    public int SumOfSquares(int a, int b)
    {
        return a * a + b * b;
    }
}

/* elsewhere */

var objHelper = new Helper();
return objHelper.SumOfSquares(a, b);

which could just as easily be written as:

public class Helper
{
    public static int SumOfSquares(int a, int b)
    {
        return a * a + b * b;
    }
}

/* elsewhere */

return Helper.SumOfSquares(a, b);

Now this is a relatively trivial example and one that is fairly easy to fix, though I have come across variations on this theme with the potential to introduce resource leaks or other obscure, hard-to-track-down bugs. Much more serious, and harder to fix, are the maintenance nightmares: methods taking dozens of arguments, all of them primitive types such as int, char or string; very long methods containing a lot of complicated copied-and-pasted code; or huge, ten-thousand-line classes with more responsibilities than the Secretary General of the United Nations. These are the kinds of problems that OO, design patterns and the SOLID principles are designed to solve, people.
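
To make the first of those nightmares concrete, here is a small sketch of the usual remedy, written in Python for brevity rather than the C# above. The Address type and both functions are hypothetical names of my own, not taken from any real codebase. Values that always travel together are grouped into one small object, so the method signature names a concept instead of listing loose strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Address:
    """Groups values that always travel together (hypothetical example type)."""

    street: str
    city: str
    postcode: str

    def label(self) -> str:
        # Formatting lives next to the data it formats.
        return f"{self.street}, {self.city} {self.postcode}"


# Before: every caller must pass loose strings in exactly the right order.
def format_label_primitive(name: str, street: str, city: str, postcode: str) -> str:
    return f"{name}\n{street}, {city} {postcode}"


# After: the signature names a concept, and the argument list stops growing
# every time an address gains a field.
def format_label(name: str, address: Address) -> str:
    return f"{name}\n{address.label()}"


if __name__ == "__main__":
    home = Address("1 High Street", "Leeds", "LS1 1AA")
    print(format_label("Ann", home))
```

With four arguments the difference is modest. With a few dozen, swapping two strings by accident becomes very easy, and the grouped version at least gives each cluster of values a name.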

Well-designed OO code may be difficult for novice developers to understand, but it poses no problem for experienced OO practitioners. Poorly written code, on the other hand, produced by people with vague, woolly ideas about how code is supposed to be written, is a huge problem for novices and experts alike.

17 Mar

Making the most of your source control summaries

The summary for Django’s changeset number 9756 caught my attention recently:

Fixed #8138 — Changed django.test.TestCase to rollback tests (when the database supports it) instead of flushing and reloading the database. This can substantially reduce the time it takes to run large test suites.

This change may be slightly backwards incompatible, if existing tests need to test transactional behavior, or if they rely on invalid assumptions or a specific test case ordering. For the first case, django.test.TransactionTestCase should be used. TransactionTestCase is also a quick fix to get around test case errors revealed by the new rollback approach, but a better long-term fix is to correct the test case. See the testing doc for full details.

Many thanks to:

* Marc Remolt for the initial proposal and implementation.
* Luke Plant for initial testing and improving the implementation.
* Ramiro Morales for feedback and help with tracking down a mysterious PostgreSQL issue.
* Eric Holscher for feedback regarding the effect of the change on the Ellington testsuite.
* Russell Keith-Magee for guidance and feedback from beginning to end.

Before I came across this particular essay (via Eric Holscher and Simon Willison), like most developers, I only ever put a short one-sentence summary in my source control commits. It had never occurred to me to do anything else — in fact when Trac lists the changes in your repository, it only displays the first line anyway. Some developers don’t even do that, instead leaving the summary blank altogether.

But hey, I thought — why not? Why shouldn’t we write a whole two paragraphs to add a bit more detail to our descriptions of our source control commits? After all, something short and simple may be OK when you’re the only developer working on a project, but when you’re working as part of a team, you need to keep the communication up.

You see, suppose that one day you are testing some code that you have been working on and you discover that it is barfing. So you scratch your head for a bit, step through it in the debugger, and so on, and you narrow it down to one particular line of code. You look at the revision logs and you find that I was the one who checked in the offending code. You’ll probably be upset with me, but if I have given a decent explanation of what I’ve done when I check in my changes to source control, it will make the situation a lot easier for all concerned.

So don’t just gloss over your source control summaries: make the most of them. You’ll be doing all your colleagues a massive favour if you do.