14
Jan

The top 25 most dangerous programming errors

(Via BBC News and Coding Horror): The 2009 CWE/SANS Top 25 Most Dangerous Programming Errors is a list of the most significant programming errors that can lead to serious software vulnerabilities, compiled by MITRE and the SANS Institute with input from security experts at organisations including the US National Security Agency. Everyone working with code in any capacity whatsoever, at any level, needs to know this stuff cold. Everyone who manages them needs to make sure that they do. And everyone who recruits them needs to ask about this stuff at interview time. There’s really no excuse for hiring people who think that it’s okay to construct SQL commands by smashing strings together willy-nilly with user input.

I was rather disappointed to see that it isn’t explicit enough on the issue of plain text passwords in your user database, nor is there any mention of the increasingly popular password anti-pattern of asking users for their Gmail passwords so you can import their contact list. Both of these are particularly insidious because in addition to being frighteningly dangerous from the point of view of identity theft and phishing, they are frequently demanded by bosses and clients who either don’t see why they should be a problem or are willing to take on the quite unacceptable risks that they introduce.

01
Sep

Commenting your code for speed reading

Here are a couple of important facts about code that you write, that many developers tend to overlook:

  1. You will spend much more time reading it than writing it, often after not having looked at it for months.
  2. Other people will also have to read it.

With this in mind, where possible, I like to use a coding style that helps me to speed-read my code. The idea is that you should be able to scan through a source file, picking out individual methods very quickly, and go from a grand overview of what’s going on to a detailed look at individual methods. It’s a style that I first picked up at university when looking at another developer’s code, and I have yet to see anything that (in my humble opinion at any rate) beats it for clarity. It looks like this:

/* ====== MyMethod ====== */

/// <summary>
///  XML comment goes here
/// </summary>
/// <param name="str">
///  A <see cref="System.String" /> which is passed as a parameter
///  to your method.
/// </param>
/// <returns>
///  An integer, representing something or other returned by your
///  method.
/// </returns>

int MyMethod(string str)
{
    // do something
}

The key features are as follows:

  1. The method signature is prefixed by a header comment and an (optional) doc comment block.
  2. The header comment consists of the method name surrounded on each side by six equals signs.
  3. There are two blank lines before the header comment, one blank line between the header and the doc comment, and one blank line between the doc comment and the method signature.

This may sound pretty pedantic and exacting, but it’s just my style, and I find it clear, crisp and very effective for the purpose. There are of course other similar variations on the theme, so it’s more the principle that matters rather than the exact details. Highlighting the name of your method or property in a header comment in particular works wonders: you can page through your source file in seconds, and perhaps even relax the focus in your eyes a bit, and zoom in on the method you’re after very quickly. The decorative equals signs draw attention to the text of the headers without overwhelming them, which means you can pick them out and read them in a fraction of a second when you’re scanning through your source code.

One thing that I’m not too happy about, however, is the XML documentation comments that Microsoft has come up with for C# and VB.NET. The angle bracket tax gives you a whole lot of visual clutter to deal with. Javadoc and PHPdoc style comments seem much more elegant and simple:

/* ====== MyMethod ====== */

/**
 * Doc comment goes here
 *
 * @param str
 *  A String which is passed as a parameter to your method.
 * @return
 *  An integer, representing something or other.
 */

int MyMethod(string str)
{
    // do something
}

Some developers tend to go a bit overboard with decorative bits and pieces on their comments. This is a real-life example from a PHP date/time library that I have encountered:

//==============================================================================
// +---------------------------------------------------------------------------+
// | Function: getMonthName                                                    |
// +---------------------------------------------------------------------------+
// | Accepts:  Nothing                                                         |
// +---------------------------------------------------------------------------+
// | Returns:  The name of the month                                           |
// +---------------------------------------------------------------------------+
// | Description:                                                              |
// |                                                                           |
// | Returns the name of the month of the timeStamp                            |
// +---------------------------------------------------------------------------+
//==============================================================================
	function getMonthName()
	{
		return date('F', $this->_timeStamp);
	}
	
	
//==============================================================================
// +---------------------------------------------------------------------------+
// | Function: getWeekDayName                                                  |
// +---------------------------------------------------------------------------+
// | Accepts:  Nothing                                                         |
// +---------------------------------------------------------------------------+
// | Returns:  The name of the weekday                                         |
// +---------------------------------------------------------------------------+
// | Description:                                                              |
// |                                                                           |
// | Returns the name of the week day of the timeStamp                         |
// +---------------------------------------------------------------------------+
//==============================================================================
	function getWeekDayName()
	{
		return date('l', $this->_timeStamp);
	}

Very pretty, but the decorations dwarf both the code and the comments themselves. Personally I find it slower to scan, partly because the comments take up more space, which means you get less code on the screen at the same time, but also because the decoration is distracting. The difference is only a fraction of a second for each method, but it does add up in a big class.

Besides, that kind of bling is a complete faff to type.

Needless to say, this isn’t the only thing you need to do to make your code easy to read. You also need to choose sensible method names, neither too long nor too short, that describe what your method does as precisely and clearly as possible in about 30 characters or fewer. Consistent indentation is also pretty important (Visual Studio is great in this respect: it does it all for you if you press Ctrl-K, D), as is keeping your line length down to something sensible. I find that the maximum line length you can sensibly get away with is about 96 characters: beyond that, you have to scroll horizontally all over the place, and your lines wrap when you try to print them out. Oh, and if you use the visual designers in SQL Server Management Studio, please tidy up the resulting SQL after you copy and paste it into your stored procedures. Otherwise you are making life difficult for whoever comes after you and has to understand and maintain the stuff.
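If you want to check the line-length rule mechanically rather than by eye, something along these lines would do. It is only a rough sketch in Python: the function name and the sample text are made up for illustration, and the 96-character limit is simply my own preference from above.

```python
MAX_LINE_LENGTH = 96  # personal preference from the text, not a standard


def find_long_lines(source, limit=MAX_LINE_LENGTH):
    """Return (line_number, length) pairs for every line longer than `limit`."""
    long_lines = []
    for number, line in enumerate(source.splitlines(), start=1):
        if len(line) > limit:
            long_lines.append((number, len(line)))
    return long_lines


# Hypothetical sample: one short line, one line that is too long, one at the limit.
sample = "short line\n" + "x" * 97 + "\n" + "y" * 96
print(find_long_lines(sample))  # [(2, 97)]
```

A line of exactly 96 characters passes; only longer lines are reported.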

What do you think? Do you use any particular stylistic conventions on your comments, or do you find incremental search, document outlines, syntax highlighting and code folding sufficient? Can my style of commenting be improved on?

09
Jul

What are valid reasons for hating a programming language?

Marco Arment (in a response to Jeff Atwood) came up with a list of five common complaints that are not valid reasons for hating a programming language:

  • You’re unfamiliar with it.
  • You don’t like the language’s vendors.
  • Idiots often use it to write bad code.
  • It doesn’t fully resemble your favorite language.
  • You don’t like its syntax.

PHP’s not a perfect language, of course. Nothing is. But it’s by far the best language to use for nearly any web application, as long as you use an appropriate framework and good coding practices. Any language without either of those is bad.

I’d agree that PHP has its foibles, such as its lack of closures, its bloated global namespace and its ad-hoc, inconsistent approach to conventions. However, none of these are insurmountable, and in fact, once you get used to its foibles, working with PHP isn’t all that hard.

A few days ago I finished up on several days’ work on a PHP project and started on a SharePoint workflow. It’s an exercise that I highly recommend for anyone who is tempted to write an anti-PHP rant, because it will put everything in perspective, and give you a list of valid reasons for hating a programming language, platform or framework:

  1. Vertical and infinite learning curve.
  2. Unnecessary complexity.
  3. Useless documentation.
  4. Excessive pitfalls, gotchas and leaky abstractions.
  5. Design that obstructs you from adopting best practices.
  6. Meaningless, cryptic and uninformative error messages, buried in a massive log file deep in an obscure part of the bowels of your file system.
  7. Painful debugging process.

Check this out: to debug a SharePoint workflow you can’t just hit run in Visual Studio. You have to attach the debugger to the w3wp.exe process, and even then it still won’t hit your breakpoints or break on exceptions. According to this blog post, you have to copy the .pdb file by hand into a part of the GAC’s directory structure that you can’t even access through Windows Explorer, so you have to do it via the command line. On top of that, the edit-compile-test loop is agonising — it can take two minutes to re-compile, deploy, and re-load your test site into your browser, after which you may well have to remove and restart the workflow.

Admittedly it’s not all bad. The vertical and infinite learning curve at least gives you some immense satisfaction when you finally “get it”. However, getting there feels at times like climbing the Inaccessible Pinnacle in rollerblades with your hands tied behind your back.

06
Jun

How to become a better .NET developer

If I could give one single piece of advice to ASP.NET developers everywhere, it would be this:

Learn another web development environment.

I really cannot emphasise this strongly enough. From what I’ve observed, developers who only work with ASP.NET seem to have quite a bit of difficulty thinking outside of the Microsoft box. I am frequently confronted with indiscriminate and even inappropriate use of aspects of the .NET framework that don’t scale, such as DataSets, view state, or drag-and-drop programming. There’s nothing wrong with all these per se, but one of the most important things you need to know about how to use them is when not to use them. When all you have is a hammer, everything starts to look like a nail.

The ASP.NET Web Forms model in particular was originally designed to make web development look like Windows development, and ease the transition for VB6 developers from programming for rich Windows clients to the web. The result of this is that it has made the easy aspects of web development almost brain dead, while introducing a horrendously leaky abstraction layer that makes the hard things even harder, with masses of gotchas and pitfalls to trip you up if you venture outside it.

Platforms such as PHP, Ruby on Rails or Python don’t have the same leaky abstractions, so developers tend not only to program “closer to the metal” but to think closer to the metal as well. This is why most of the cool sites, with stunning Ajax effects, tend to be built on these platforms, while ASP.NET is largely languishing in the enterprisey world of Dilbert-esque cubicle farms.

I recommend you choose your alternative carefully, however. Rails and Python are the best choices. They will teach you patterns, practices, conventions, O/R mapping, MVC, and all round agile and pragmatic programming, and they tend to be taken up by smart and experienced developers who know what they’re doing. I have mixed feelings about Java: while you can learn a lot from it, like .NET it is very enterprisey, and at a time when everyone is getting excited about dynamic languages, Java is heading in completely the opposite direction. And I certainly don’t recommend PHP as a learning exercise: it is a beginners’ language — and a mind-bogglingly badly designed one at that — and while PHP guys are generally pretty enthusiastic and some of them are quite smart, and there are some decent PHP frameworks such as CakePHP and Symfony, the overwhelming majority of the PHP community simply don’t have what it takes to be programmers. Having said that, you need to know it, simply because it’s so pervasive.

You should also learn Linux if you can. It will teach you about modular design and the value of scripting everything that can be scripted. This is right at the heart of why Unix is Unix: a large part of its philosophy involves chaining text-based programs where the output of one can be passed as the input to another, to produce some fairly powerful command-based functionality, and scripting repetitive tasks so that their outcomes can be reliably reproduced. These are philosophies that seem largely lost in the world of Windows, which relies much more heavily on the visual, drag, drop and click approach of dialog boxes and wizards, even though they are every bit as essential if you want to have robust procedures and practices in place.

And whichever platform you take on board, you simply must familiarise yourself thoroughly with CSS, DHTML, JavaScript and Ajax, and at least one JavaScript framework such as Prototype or jQuery.

Personally, I still think that ASP.NET is technically the best platform on which to develop scalable, high performance, reliable web applications. However, in order to make the most of it, you need to have a good feel for what approaches you can import and learn from other platforms. Otherwise you will be stuck with the limitations and leaky abstractions of Web Forms.

31
May

Productivity metrics: garbage in, garbage out

I came across this article today when I was googling for a link for another blog entry. I was flabbergasted to see that it was written by someone with a PhD, appears in a professional engineering journal, and is currently linked from their home page:

Over time, there have been many attempts to define metrics that effectively measure software development productivity. Most of the ones that I have seen are amazingly complicated and very difficult to apply.

I think there is a simpler productivity metric which should be used across the industry: the total number of lines of code in the organization divided by the number of people who are working on that code (including QA as well as development). For short, I will call this metric the LOC per head.

I propose that this measurement is an excellent representation of the development organization’s true productivity. If the number rises, it means that the development organization is more productive. If it decreases, it means that the organization is less productive.

Ah, the old lines of code chestnut again. For some reason, managers seem to love it. The only problem is, it’s totally brain-dead. Like government targets, any formal productivity metric can and will be gamed — usually with disastrous results, as Joel Spolsky points out.

You want lines of code? Be prepared for your code base to be poisoned with endless copy and paste code and needless repetition, which, as any competent developer will tell you, is a nightmare to maintain. Or you may even end up with a joker on your team who decides to script the process and produce a million lines of code a second without even turning up at the office.
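To put some numbers on it, here is a small sketch in Python. The function names and the toy snippets are invented for illustration, and I am assuming the metric counts non-blank lines. The same one-line calculation, copied and pasted under three names, triples "LOC per head" without adding any functionality at all.

```python
def count_lines(source):
    """Count the non-blank lines in a piece of source code."""
    return sum(1 for line in source.splitlines() if line.strip())


def loc_per_head(sources, headcount):
    """The quoted metric: total lines of code divided by number of people."""
    return sum(count_lines(s) for s in sources) / headcount


# One shared helper, written once.
tidy = ["def area(w, h):\n    return w * h\n"]

# The same logic copied and pasted under three different names.
padded = [
    "def room_area(w, h):\n    return w * h\n",
    "def field_area(w, h):\n    return w * h\n",
    "def screen_area(w, h):\n    return w * h\n",
]

print(loc_per_head(tidy, headcount=2))    # 1.0
print(loc_per_head(padded, headcount=2))  # 3.0
```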

Besides, some frameworks such as Ruby on Rails or jQuery allow you to accomplish much more with far fewer lines of code. The first release of 37signals’ Ta-Da List, a full-blown commercial product, contained fewer than 600 lines of Ruby code. So does that make DHH and colleagues unproductive? Of course not! On the contrary, it makes them brilliant.

You want lots of check-ins to source control? Fine, you’ll end up with dozens of them just to correct a single spelling mistake — and as a side effect, a version history that leaves everyone totally confused as to exactly what’s been going on.

You want lots of bug fixes in the issue tracker? Expect your developers to deliberately write bugs into their code so that they can “fix” them.

You want to compensate for this by penalising bug reports? You’re asking your developers to mislead your testers about what functionality is actually in the code base so they’ll pick up on fewer bugs.

And so on, and so on.

As the old computing adage goes, garbage in, garbage out.

29
May

What is the difference between a web designer and a web developer?

We got an application in from a seemingly very talented web designer the other day in response to our job posting. With some pretty impressive artwork on her online portfolio, she might be a serious consideration if we were looking for someone to fulfil a role involving primarily graphic design.

However, there is just one question. We are looking for a developer, rather than a designer — so will she make the grade in that particular department?

I get the impression that the difference between web developers and web designers is somewhat lost on many people. This is probably quite understandable: the boundary between the two is rather blurry, with a good deal of overlap, both require a lot of creativity, and many people manage to handle both roles remarkably well. However, they involve completely different skill sets and aptitudes.

Designers tend to focus very much on the front end. They are (or at least they should be) good at art and graphic design, and if they are designing for the web, they should know HTML and CSS. They will be able to produce great WordPress themes, Flash animations and other eye candy. They most likely also know some basic PHP, MySQL and JavaScript.

The great unknown, however, is how well they can handle the more technical aspects of building a web application. Some of them are good at this, some are not so good. It is all too easy to forget that web development is software development — as a web developer, you are concerned with the much more technical aspects of the job. You need to understand database normalisation and object oriented design patterns, for starters, otherwise you will end up producing unnecessary duplication and bad code. You also need to have a firm grasp of security — at the very least you should understand topics such as SQL injection, cross site scripting and defence in depth. Then there are other aspects such as data structures, string manipulation, regular expressions, web services, scalability, caching, threading, concurrency, transactions, and so on. If any of that sounds like Klingon to you, then either you are not a developer or else you need to mug up on a few basic essentials.

Indeed, since you have to understand fairly difficult concepts such as concurrency, scalability and threading, web development can actually be harder to get right than traditional desktop development.

I sometimes wonder if web development gets such a bad reputation for code quality because there are a lot of people out there describing themselves as web developers when actually they are better suited to working as web designers. In order to be a good developer you need to be able to think at multiple levels of abstraction at the same time, pick up on patterns in things, and so on. Not everyone has the brain circuitry that enables them to do this.

By all accounts, a good test of this is how you handle recursion. Many people — even some computer science students — simply can’t understand it, viewing it purely as a bug that causes a stack overflow and therefore needs to be avoided. However, being able to use recursion effectively is a fundamental skill that crops up over and over again in programming. Traversing a directory tree, the nodes in a DOM document, or the page structure in a hierarchical content management system, should be second nature to all developers everywhere.
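For anyone who hasn’t met it, this is roughly what that looks like. It is a minimal sketch in Python, assuming a hypothetical content management system whose pages are nested dictionaries with a "title" and an optional list of "children"; the function calls itself once for each child page.

```python
def walk(page, depth=0):
    """Yield (depth, title) for a page and, recursively, all of its descendants."""
    yield depth, page["title"]
    for child in page.get("children", []):
        # The recursive step: each child is walked exactly like its parent.
        yield from walk(child, depth + 1)


# Made-up page structure for illustration.
site = {
    "title": "Home",
    "children": [
        {"title": "About", "children": [{"title": "Team"}]},
        {"title": "Blog"},
    ],
}

for depth, title in walk(site):
    print("  " * depth + title)
```

The recursion stops by itself at pages with no children, which is why it does not overflow the stack on a finite tree.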

20
May

Where are all the passionate .NET developers?

We’re looking to take on another developer.

The majority of our work is in C#/.NET, so obviously we’ve adjusted our skills requirement accordingly. However, what we are really looking for are smart people who get things done and have a real passion for what they are doing. If you’re smart and passionate, it isn’t a disaster if you don’t have two years of .NET experience, because smart, passionate developers can pick up pretty much anything very quickly, and besides, in this game you have to be learning very quickly all the time.

So, how can you identify the passionate ones?

For starters, I personally think that CVs tell you very little. When I see your average developer CV, my eyes tend to glaze over and all I see is white noise. They show that you have x years of experience in y platform, and that you know what all the current buzzwords are, but that is about it. They don’t tell me whether you spent those x years cutting and pasting code snippets out of those stupid PHP tutorials that teach you to write SQL injection vulnerabilities, or whether you were implementing recursive algorithms and Markov chains in your sleep.

No, the easiest way to get a decent first impression is to Google them and see what their online footprint looks like. You can typeset your CV in Comic Sans for all I care, but if we find you have a blog, we will sit up and take notice. Merely the fact that you are going beyond the 9-5 mentality and showcasing your skills to the world puts you head and shoulders above the crowd.

However, even then, there are blogs and there are blogs. Some developer blogs are very dry indeed — they consist of little more than a string of deadpan howtos and regurgitations of whatever SDK you are using. I’m not saying your blog shouldn’t contain any of those at all, but you need to convey some life with them. What’s the story behind the bug you’re blogging about? What’s your opinion on Hungarian notation? I don’t care if you say something I don’t agree with — the very fact that you actually have an opinion and aren’t being totally insipid is worth a tremendous amount.

Even better are contributions to an open source project. They don’t have to be in .NET — if all your publicly showcased code is in PHP, that’s fine. Rails is even better, simply because Rails developers seem to be the most passionate ones of the lot. One of the best conversations with another developer that I’ve had in a long time was with a Rails developer at MiniBar about a year ago. His enthusiasm was infectious.

And this is where my gripe is. Why don’t we see the same passion and enthusiasm in .NET land?

This is something I’ve noticed in general. PHP often has a reputation for producing a lot of bad code, but PHP developers are much more likely to blog, and their blogs frequently seem to have a lot more sparkle to them. The PHP guys that I know may not necessarily be brilliant coders, but they almost all have much more passion and drive than their .NET and Java counterparts. I think it’s fair to say that this exhibits itself in higher standards too, particularly visually: more often than not, PHP and Rails blogs are pure eye candy, and you certainly never see any of them producing anything as gross as purple and blue Lucida Sans.

You see it in the open source world too. My friend Sam McGeown recently lamented the fact that there are no real .NET WordPress killers. I don’t think it’s likely that there ever will be either: open source is generally acknowledged to be very much a second class citizen in the Microsoft ecosystem, and far too many open source .NET projects simply peter out and die completely after a year or two.

Some people think the problem is that Microsoft has been dragging its heels over open source for far too long. This is true to an extent, but apart from that, the problem is that the .NET (and to a lesser extent, Java) ecosystems are just too enterprisey for their own good. They tend to find their niche in large development teams in large companies, where developers are generally small fish in a huge pond. In the enterprise, you are spending all day every day implementing frustratingly crazy business rules, and you are not writing code for the end users but for their bosses, who often won’t sign off on an Ajax drop down search if it costs them an extra five hundred pounds. In an environment such as that, code gets written to the lowest common denominator and there can be little impetus to pull out all the stops and go the extra mile. The way up the career ladder is not to become a better developer, but to step off the coding ladder altogether and into project management, or enterprise architecture, or an MBA, and make way for another generation of mediocre programmers.

Unfortunately, nearly all the developers in the .NET ecosystem seem to have most of their commercial experience in that kind of setup. They can maybe offer us seven or eight years of experience as 9-5 developers, but the passion just isn’t there. Sure, there are people who buck the trend, but I can’t avoid the conclusion that the overwhelming majority of smart, passionate, enthusiastic developers work with PHP, Rails and Python.

26
Sep

Is your code held together with bits of string?

Meh.

I hate naive code that sends data to a database by concatenating it into a SQL string.

Unfortunately, there is far too much of it knocking around — no doubt because of the proliferation of rubbishy tutorials that teach beginners that that is the way to do database access.

Take this C# example:

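// Deliberately bad example: do not copy this.
public int InsertEvent(DateTime date, string description)
{
    using (SqlConnection cn = new SqlConnection(connectionString)) {
        cn.Open();
        // The values are concatenated straight into the SQL text.
        SqlCommand cmd = new SqlCommand(
            "insert into Events (Date, Description) values ('" +
            date.ToString() + "', '" +
            description + "');select @@IDENTITY",
            cn);
        // @@IDENTITY comes back as a decimal, so convert rather than cast.
        return Convert.ToInt32(cmd.ExecuteScalar());
    }
}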

It’s not just the SQL injection vulnerability that makes this code stink like a sewer: it has localisation problems as well.

Here in the UK we write dates as day/month/year, so today would be 26/09/2007. However, on the other side of the pond, they write dates as month/day/year, so today would be 09/26/2007. So if the locale of your ASP.NET application is different from the locale of your database login, you will get either the wrong date or a data conversion error.
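Both failure modes are easy to reproduce. The sketch below uses Python’s standard datetime module rather than ASP.NET and SQL Server, and the helper name and format constants are my own, but the ambiguity is exactly the same: a date with a day of 12 or less silently turns into the wrong date, and anything later in the month fails outright.

```python
from datetime import datetime

UK_FORMAT = "%d/%m/%Y"  # day/month/year
US_FORMAT = "%m/%d/%Y"  # month/day/year


def parse(text, fmt):
    """Parse a date string, returning None if it is not valid for the format."""
    try:
        return datetime.strptime(text, fmt).date()
    except ValueError:
        return None


# 9 October written UK-style: read US-style it silently becomes 10 September.
print(parse("09/10/2007", UK_FORMAT))  # 2007-10-09
print(parse("09/10/2007", US_FORMAT))  # 2007-09-10

# 26 September written UK-style: there is no month 26, so US-style parsing fails.
print(parse("26/09/2007", US_FORMAT))  # None
```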

On your development computer and your production server you will probably have your locales set up so that it all works correctly. However, it causes problems when you have to set up the application on a new box — for instance, when another developer starts to work on the project. Especially if the other developer is in another country.

Please stop doing this!!!!

Any decent, modern programming language will let you use parametrised queries to keep your SQL and data separate. These also allow you to send dates and times to the database in native, unambiguous datetime format, avoiding any thorny localisation issues, and they all but eliminate SQL injection vulnerabilities.
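Here is a minimal sketch of the idea, using Python and its built-in sqlite3 module as a stand-in for C# and SQL Server. The table layout and function name are assumptions for illustration. SQLite has no native date type, so the date is passed as an ISO 8601 string, which is still unambiguous whatever the locale; with SQL Server you would pass a proper datetime parameter instead.

```python
import sqlite3
from datetime import date


def insert_event(conn, when, description):
    """Insert an event using placeholders and return the new row id."""
    cursor = conn.execute(
        # The SQL text contains only placeholders; the data travels separately.
        "INSERT INTO Events (Date, Description) VALUES (?, ?)",
        (when.isoformat(), description),
    )
    conn.commit()
    return cursor.lastrowid


# In-memory database with a hypothetical Events table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Events (Id INTEGER PRIMARY KEY, Date TEXT, Description TEXT)"
)

# A hostile description is stored as plain data rather than executed as SQL.
hostile = "x'); DROP TABLE Events; --"
event_id = insert_event(conn, date(2007, 9, 26), hostile)
print(conn.execute("SELECT Date, Description FROM Events").fetchall())
```

The table survives the hostile input because the database never parses the description as part of the command.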

The only excuse for using string concatenation in this way is that you still have to support PHP 4, which does not give you the option of parametrised SQL queries for MySQL. Even then, if at all possible, you should be upgrading to PHP 5, which does (through the mysqli and PDO extensions).