james mckay dot net

Blah blah scribble scribble waffle waffle
25
Sep

Dolstagis.Web 0.3 is out

The latest version of my hobby project has now been released to NuGet for anyone who wants to tinker.

When I first started it off a year ago, it was mainly an experiment to see just how long it would take me to write a minimum viable alternative MVC framework. I was mainly inspired in this by tinkering with some of the alternatives to ASP.NET MVC such as NancyFX, FubuMVC and so on, but also because I’d run out of ideas for something to keep me entertained on my daily commute into London. In the end it took me about a month or so.

Of course, there’s a vast difference between “minimum viable” and “actually usable,” and all I was interested in at the time was a basic proof of concept, so once I had the bare minimum up and running, I decided to just park it. But then back in August I was inspired to take it a bit further, so since then I’ve been doing a bit more work on it, and now I think I’ve got it to a point at which I can realistically start dogfooding it on another hobby project.

Over the coming weeks I’ll be blogging about some of the features I’ve been building into it, but for now here’s a summary of what I’ve been working on over the summer:

  • Everything is now OWIN-based rather than going through a custom abstraction layer based around HTTP handlers and modules.
  • The feature switching API has now been implemented. You can make your features switchable based on a configuration setting or a go-live date, or you can even write your own custom feature switches based on various aspects of the request, such as user agent or IP address.
  • Session state, cookie handling and a rudimentary user authentication mechanism are now available.
  • The routing engine has been rewritten. Each Feature now gets its own route table, which can be switched out for a custom implementation if the out-of-the-box option isn’t suitable for your needs.
  • I’ve taken a first stab at implementing some basic model binding.
  • Version 0.1’s Modules have been renamed to Features for consistency with the concept of feature switches.

Here are some ideas that I’m thinking of implementing in due course:

  • A Razor view engine adapter
  • Anti-XSRF protection
  • A Ruby on Rails-style “message flash”
  • An asset pipeline for bundling and minification
  • Some form of content negotiation
  • An A/B testing plugin built around the feature switch API
  • The ability to declare dependencies between features

It’s still not quite production ready yet, so it’s best to stick to breakable toy projects for now. Suggestions are, of course, always welcome.

18
Sep

How not to do logging: catch-log-throw

Way back in the mists of time, I worked on a project whose log files started spiralling out of control.

This wasn’t actually surprising, because the codebase in question was riddled with method after method that looked something like this:

public Widget GetWidget(int id)
{
    log.Debug("Getting widget with id " + id);
    try {
        var result = repository.GetWidget(id);
        if (result != null) {
            log.Debug("Successfully got widget with id: " + id);
        }
        else {
            log.Debug("No widget found with id: " + id);
        }
        return result;
    }
    catch (Exception ex) {
        log.Warn("Error fetching widget with id " + id, ex);
        throw;
    }
}

I’d objected to this about a year previously, but had encountered some stiff resistance from our team’s Best Practices Guy, who had been responsible for it in the first place.

The problem here is that the same exception was being logged multiple times, complete with deep stack traces, cluttering up the log files, making them very difficult to read and in the process making them grow out of control.

This is what catch-log-throw does.

But it doesn’t just cause problems with your infrastructure. It makes your code hard to read, hard to review, and easy to miss things. Our Best Practices Guy denied this when I said so, claiming that it was perfectly clear what it was doing, but you’ll see what I mean when I strip out the logging statements:

public Widget GetWidget(int id)
{
    return repository.GetWidget(id);
}

Two things become obvious here:

If you really do need this level of detail in your logs (and you usually don’t), a cleaner way to do it is to use an aspect-oriented framework such as Castle DynamicProxy or PostSharp. There’s really no need to clutter up your codebase with noise like this.
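To give a rough idea of what the aspect-oriented route looks like: Castle DynamicProxy and PostSharp are external packages, so this sketch swaps in .NET's built-in System.Reflection.DispatchProxy (available in .NET Core and later, not the classic .NET Framework) to show the same interception idea. IWidgetRepository, WidgetRepository and LoggingProxy are hypothetical names made up for the illustration, and a plain list stands in for a real logger.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical interface and implementation, for illustration only.
public interface IWidgetRepository
{
    string GetWidget(int id);
}

public class WidgetRepository : IWidgetRepository
{
    public string GetWidget(int id)
    {
        return "widget-" + id;
    }
}

// Wraps any interface implementation and records each call before forwarding it.
// The logging lives here once, instead of being repeated in every method.
public class LoggingProxy<T> : DispatchProxy where T : class
{
    private T target;
    private List<string> log;

    protected override object Invoke(MethodInfo targetMethod, object[] args)
    {
        log.Add("Calling " + targetMethod.Name);
        return targetMethod.Invoke(target, args);
    }

    public static T Wrap(T target, List<string> log)
    {
        // DispatchProxy.Create returns an object that is both a T and a LoggingProxy<T>.
        T proxy = Create<T, LoggingProxy<T>>();
        var self = (LoggingProxy<T>)(object)proxy;
        self.target = target;
        self.log = log;
        return proxy;
    }
}
```

Calling LoggingProxy<IWidgetRepository>.Wrap(new WidgetRepository(), log) gives back an IWidgetRepository whose calls are recorded, while WidgetRepository itself stays free of logging statements.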

As a general rule, you should only log exceptions in catch { } blocks where they are not being re-thrown. If you’re catching it to recover from it and continue, log it as a warning; if you’re reporting an error to the end user, log it as an error. In general, a catch { } block should either log the exception or re-throw it. Unless you have a very good reason to do so, it shouldn’t do both.
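Here is a minimal sketch of that rule. ListLog is a hypothetical in-memory stand-in for a real logger, and WidgetRepository, WidgetService and WidgetPage are made-up names rather than code from the project described above. The lower layers let the exception propagate untouched, and the one place that actually recovers from it logs it exactly once.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a real logger: it just records what it is given.
public class ListLog
{
    public readonly List<string> Entries = new List<string>();

    public void Warn(string message, Exception ex)
    {
        Entries.Add("WARN " + message + ": " + ex.Message);
    }
}

// Hypothetical repository that always fails, to force the error path.
public class WidgetRepository
{
    public string GetWidget(int id)
    {
        throw new InvalidOperationException("database unavailable");
    }
}

public class WidgetService
{
    private readonly WidgetRepository repository = new WidgetRepository();

    // No try/catch here: this method cannot recover, so it neither catches nor logs.
    public string GetWidget(int id)
    {
        return repository.GetWidget(id);
    }
}

public class WidgetPage
{
    // Public only so that the sketch can inspect what was logged.
    public static readonly ListLog Log = new ListLog();

    private readonly WidgetService service = new WidgetService();

    // This is the one place that recovers and continues, so it is the one place that logs.
    public string Render(int id)
    {
        try {
            return service.GetWidget(id);
        }
        catch (InvalidOperationException ex) {
            Log.Warn("Falling back to placeholder for widget " + id, ex);
            return "(widget unavailable)";
        }
    }
}
```

However many layers sit between WidgetPage and the repository, the failure produces a single log entry with a single stack trace.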

11
Sep

How not to do logging: unnecessary abstractions

This is a very common pattern that I see over and over again in project after project:

public class MyService
{
    private ILogger _logger;
    /* snip */

    public MyService(ILogger logger, /* snip */)
    {
        _logger = logger;
        /* snip */
    }
}

There are a few problems here.

1. Don’t use dependency injection to create your loggers.

The problem with using dependency injection to create your loggers is that it denies you access to one of the most useful features of logging frameworks such as log4net and NLog: hierarchical loggers. The recommended way to instantiate loggers is to have just one for each class, with each logger being named after the class in which it is used. For example, with log4net, you would do this:

namespace MyNamespace
{
    public class MyService
    {
        private static readonly ILog log
            = LogManager.GetLogger("MyNamespace.MyService");

        /* snip */
    }
}

With NLog it is even simpler:

namespace MyNamespace
{
    public class MyService
    {
        private static readonly Logger log
            = LogManager.GetCurrentClassLogger();

        /* snip */
    }
}

Why is this so important? Simple. It allows you to fine tune your logging output on a namespace-by-namespace or a class-by-class basis. For example, you could send debugging information from NHibernate’s internals to a separate file, or log debug information only for your e-mail handling classes.

On the other hand, when you’re using your IOC container to create a logger, you can only specify a single named logger right across the board for your entire application. Sure, some IOC containers give you a way to determine the type of object into which you are injecting your logger, but others don’t, and I’ve never seen this done anyway even with those that do. You end up completely losing access to the hierarchy.

Another problem with using an IOC container here is that it limits your use of logging to classes that were created by the container in the first place. Sure, if your container exposes a service locator as a singleton (as for example StructureMap does with ObjectFactory.Instance) you could use that, but it’s ugly, not all IOC containers do that, and those that do aren’t always used that way anyway.

Finally, injected loggers do have an impact on performance. If you are injecting your loggers, your IOC container has to do more work each time you instantiate a new service, in order to locate the right logger and pass it in as a parameter. On the other hand, by creating loggers as static readonly members, you are only creating a single logger once per AppDomain for each class. This performance difference is admittedly small, but with classes that are instantiated frequently, it can easily add up.

2. Don’t abstract your loggers in application code.

There’s a case for writing an abstraction layer around your loggers when you’re creating a NuGet package for third party distribution. Some .net developers use log4net because it’s the one everyone’s heard of; others swear by NLog because it’s the best; and then you have the Microsoft-only crowd who won’t touch anything other than the Logging Application Block. As a third party library developer, you have to support all three. (Well, maybe not so much the third, because the kind of people who use the Logging Application Block are often the kind of people who won’t touch your library with a barge pole because you’re not Microsoft.)

But as an application developer, you don’t have to support anyone other than yourself, so an abstraction layer is superfluous here.

Besides being superfluous, the main problem with abstracting your loggers is that most people get it wrong. Your typical logging facade looks like this:

public interface ILogger
{
    void LogFatal(string message);
    void LogError(string message);
    void LogWarning(string message);
    void LogInfo(string message);
    void LogDebug(string message);
}

That’s all. You’re denying access to a whole lot of important features of your logger. For example, consider this code:

foreach (PropertyInfo prop in type.GetProperties())
{
    _logger.LogDebug(String.Format("Property {0} has type {1}",
        prop.Name, prop.PropertyType.Name));

    DoSomething(prop);
}

Even if your logger’s logging level is set to something higher than Debug, you are still calling String.Format and various reflection properties in a loop. In some cases, this can have a significant performance impact. What you should be doing instead is making use of log4net’s IsDebugEnabled property:

foreach (PropertyInfo prop in type.GetProperties())
{
    if (log.IsDebugEnabled) {
        log.DebugFormat("Property {0} has type {1}",
            prop.Name, prop.PropertyType.Name);
    }

    DoSomething(prop);
}

Exceptions are another one. You need to be able to pass exceptions to your logger, especially at the warning, error and fatal levels.
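A small sketch of both points, under stated assumptions: MiniLog is a hypothetical stand-in that only mimics the shape of a real logger's API (it is not log4net), and Describe stands in for an expensive String.Format or reflection call. The counter shows how much formatting work is done when Debug logging is switched off.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in that mimics the shape of a real logger's API;
// it is not log4net, just enough to show the two features in question.
public class MiniLog
{
    public bool IsDebugEnabled { get; set; }
    public readonly List<string> Entries = new List<string>();

    public void Debug(string message)
    {
        if (IsDebugEnabled) {
            Entries.Add("DEBUG " + message);
        }
    }

    // Accepts the exception itself, not just a message string.
    public void Error(string message, Exception ex)
    {
        Entries.Add("ERROR " + message
            + " [" + ex.GetType().Name + ": " + ex.Message + "]");
    }
}

public static class GuardDemo
{
    public static int DescribeCalls;

    // Stands in for an expensive String.Format or reflection call.
    private static string Describe(int item)
    {
        DescribeCalls++;
        return "Item " + item;
    }

    public static void Unguarded(MiniLog log, int count)
    {
        for (int i = 0; i < count; i++) {
            // The argument is evaluated even when Debug is switched off.
            log.Debug(Describe(i));
        }
    }

    public static void Guarded(MiniLog log, int count)
    {
        for (int i = 0; i < count; i++) {
            if (log.IsDebugEnabled) {
                log.Debug(Describe(i));
            }
        }
    }
}
```

With IsDebugEnabled set to false, Unguarded still calls Describe once per iteration, whereas Guarded never calls it. A facade that exposes only LogDebug(string) leaves you with the first version and nothing else.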

3. Don’t mock your loggers in tests.

Of course, all this raises the question about testability. What about mocking your loggers, you may ask?

The answer is simple: you don’t need to.

Your logging statements shouldn’t affect the outcome of your tests. If they do, then you must be doing something pretty esoteric with your logging and getting it wrong, in which case, your tests should fail.

In any case, your tests are one place more than any other where you should be able to inspect your logging output. Mocking them loses you access to this vital information.

04
Sep

A maturity model for best practices

As developers, we’ve all encountered them. People who promote or even attempt to enforce a specific (usually inefficient, misunderstood, and inappropriately applied) way of working, and who dismiss any suggestion to the contrary out of hand with “you’re not sticking to best practices.”

Here’s my response to such people. It’s a quick and dirty scale to assess your maturity in how you deal with the concept of best practices:

Level 0: You don’t have any concept of best practices. You just fudge along and more or less busk it.

Level 1: You are aware of some common practices, which you follow simply because you are told that they are “best practices.” You attempt to master them because you believe that doing so will be advantageous to your career. You think that if you don’t stick to them, you’ll run into problems down the road, but you can’t say what those problems are.

Level 2: You are able to explain the benefits of your “best practices.” You promote them and seek to enforce them on your team, and dismiss any suggestion of alternatives as cowboy territory. You are aware of what problems they claim to solve, but not of how effective they are at actually solving them in practice.

Level 3: You are able to explain the disadvantages of your “best practices,” to propose alternatives, and to assess which ones are actually relevant to your specific situation.

Level 4: You look for evidence that your “best practices” actually deliver the benefits that they claim to deliver. You are able to identify those that do not, or whose supposed benefits are out of date, no longer relevant, fallacious, or confused with other concepts.

Level 5: You are able to come up with criteria by which to evaluate best practices for fitness for purpose.

Where do you think these Best Practices Guys fit in?

28
Aug

Sorting out the confusion that is OWIN, Katana and IAppBuilder

I’ve been doing some more work on my MVC hobby project lately, and one thing I’ve been working on has been replacing the rather poorly thought out abstraction layer around the host process with OWIN.

If you’ve never come across OWIN before, it’s the new standard way of decoupling .net-based applications and frameworks from the servers that run them, a bit like WSGI for Python or Rack for Ruby. It means that you can host your web application not only in IIS but also in a console application, or a Windows application, or even in Apache under Mono on a Linux server. The first version of the standard was finalised about two years ago.

The OWIN specification is elegantly simple. You just have to provide a delegate of type Func<IDictionary<string, object>, Task> — or in other words, something that looks like this:

public Task SomeAppFunc(IDictionary<string, object> environment);

where the environment dictionary provides a standard set of keys containing things such as the request and response headers, body, and so on. This delegate is called the AppFunc. The values are all BCL types, so you don’t have to take dependencies on anything else. In fact, the OWIN specification explicitly says this:

OWIN is defined in terms of a delegate structure. There is no assembly called OWIN.dll or similar. Implementing either the host or application side the OWIN spec does not introduce a dependency to a project.

So putting all this together, a “Hello World” OWIN application would look something like this:

public Task HelloWorldAppFunc
    (IDictionary<string, object> environment)
{
    var responseHeaders = environment["owin.ResponseHeaders"]
        as IDictionary<string, string[]>;
    var responseBody = environment["owin.ResponseBody"]
        as Stream;
    responseHeaders["Content-Type"] = new string[]
        { "text/plain" };
    // Write the bytes directly and hand the resulting Task back to the host
    // (Encoding lives in System.Text).
    var bytes = Encoding.UTF8.GetBytes("Hello world");
    return responseBody.WriteAsync(bytes, 0, bytes.Length);
}

That’s it. But now we need to find somewhere to host it — and here, we run up against a problem.

The problem is that most of the OWIN “hello world” tutorials that you see on the web simply don’t look like that. Take for instance the one you see on www.asp.net:

public void Configuration(IAppBuilder app)
{
    // New code:
    app.Run(context =>
    {
        context.Response.ContentType = "text/plain";
        return context.Response.WriteAsync("Hello, world.");
    });
}

Just a minute … what’s this IAppBuilder? And where in the OWIN specification are there classes with a Response property, or a ContentType property and a WriteAsync method?

What you are looking at is not OWIN itself, but a set of libraries created by Microsoft called Katana. These libraries provide, among other things, some strongly typed wrappers around the AppFunc defined in the OWIN specification, so in one sense they’re useful in reducing boilerplate code.

The problem here is that Katana is built on an obsolete pre-release draft of the OWIN specification. The IAppBuilder interface was originally described in initial drafts of the OWIN specification, but it has since been removed. IAppBuilder is defined in a NuGet package called Owin (the assembly is owin.dll), but the community voted to sunset this back in May, and it’s now considered deprecated; new OWIN-related libraries should not use it. That’s why it’s so difficult to find any documentation on IAppBuilder itself: a Google search for IAppBuilder.Use merely leads to a couple of extension methods provided by Katana.

So…given our nice shiny AppFunc, how do we host it?

In theory, we should be able to just pass it to the host process. Some OWIN hosts, such as Nowin, let you do just that, by passing it into the ServerBuilder.SetOwinApp method. With Katana, it’s a little bit more complicated.

The IAppBuilder interface declares a method called Use, whose method signature looks like this:

IAppBuilder Use(object middleware, params object[] args)

Intuitively, you’d expect to be able to just pass your AppFunc into the Use method. Unfortunately, if you try this, it throws an exception. What you actually have to do is to pass a middleware delegate. OWIN middleware (and this isn’t documented in the spec) is a delegate which takes one AppFunc and returns another AppFunc:

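// Shorthand for readability: in a real source file, using-aliases at the top
// level must spell out fully qualified type names and cannot refer to each other.
using AppFunc = Func<IDictionary<string, object>, Task>;
using MiddlewareFunc = Func<AppFunc, AppFunc>;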

Confused? So was I at first.

The AppFunc that was passed in to the MiddlewareFunc is simply the next step in the chain. So your AppFunc should do what it has to do, then either call or ignore the AppFunc which was passed in. For example, this middleware would just log the start and end of each invocation to the console:

app.Use(new Func<AppFunc, AppFunc>(next => async env => {
    Console.WriteLine("Starting request");
    await next(env);
    Console.WriteLine("Ending request");
}));

If you are writing a self-contained application rather than middleware, your AppFunc will be the last step in the pipeline, so you will want to ignore the “next” AppFunc. You would therefore do this:

app.Use(new Func<AppFunc, AppFunc>(ignored => HelloWorldAppFunc));
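To make the chaining concrete, here is a sketch that composes a pipeline by hand and invokes it against a hand-built environment dictionary, with no host and no Katana involved. Pipeline, Trace, Build and Run are hypothetical names made up for the illustration, and this is only one straightforward way of folding middleware together, not a description of what any particular host does internally.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Threading.Tasks;
using AppFunc = System.Func<
    System.Collections.Generic.IDictionary<string, object>,
    System.Threading.Tasks.Task>;

public static class Pipeline
{
    // The terminal application: writes a body and has no "next" step to call.
    public static Task HelloWorld(IDictionary<string, object> env)
    {
        var body = (Stream)env["owin.ResponseBody"];
        var bytes = Encoding.UTF8.GetBytes("Hello world");
        return body.WriteAsync(bytes, 0, bytes.Length);
    }

    // Middleware: takes the next AppFunc and returns a new AppFunc wrapping it.
    public static Func<AppFunc, AppFunc> Trace(List<string> trace, string name)
    {
        return next => async env => {
            trace.Add(name + " start");
            await next(env);
            trace.Add(name + " end");
        };
    }

    // Wrap from the last middleware inwards, so the first one listed runs first.
    public static AppFunc Build(AppFunc app,
        params Func<AppFunc, AppFunc>[] middleware)
    {
        AppFunc pipeline = app;
        for (int i = middleware.Length - 1; i >= 0; i--) {
            pipeline = middleware[i](pipeline);
        }
        return pipeline;
    }

    // Invoke the pipeline against a minimal environment and return the body text.
    public static string Run(AppFunc pipeline)
    {
        var body = new MemoryStream();
        var env = new Dictionary<string, object> {
            { "owin.ResponseBody", body },
            { "owin.ResponseHeaders", new Dictionary<string, string[]>() }
        };
        pipeline(env).Wait();
        return Encoding.UTF8.GetString(body.ToArray());
    }
}
```

Building Pipeline.Build(Pipeline.HelloWorld, Pipeline.Trace(trace, "outer"), Pipeline.Trace(trace, "inner")) and passing it to Run records the outer start, the inner start, the inner end and then the outer end, with the application itself running in the middle.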

There are other ways of registering OWIN apps or middleware with a Katana host, by passing a middleware type or instance with a specific signature, or by using one of Katana’s strongly-typed wrappers, but none of these are defined in the OWIN specification, so I won’t dwell on them here.

Fortunately, this is set to be clarified in ASP.NET vNext: there’s been a lot of feedback from the community that IAppBuilder shouldn’t be the only way of creating an OWIN pipeline, and that the Katana wrapper classes, OwinMiddleware, OwinRequest and OwinResponse, have been causing some confusion, so the means to host a raw OWIN application or middleware will become more transparent. In the meantime, I hope that this clears up some of the confusion.

21
Aug

Query Objects: a better approach than your BLL/repository

If you’ve been following what I’ve been saying here on my blog and on the ASP.NET forums over the past month or so, you’ll no doubt realise that I’m not a fan of the traditional layered architecture, with your presentation layer only allowed to talk to your business layer, your business layer only allowed to talk to your repository, only your repository allowed to talk to your ORM, and all of them in separate assemblies for no reason whatsoever other than That Is How You Are Supposed To Do It. It adds a lot of friction and ceremony, it restricts you in ways that are harmful, its only benefits are unnecessary and dubious, and every implementation of it that I’ve come across has been horrible.

Here’s a far better approach:

public class BlogController : Controller
{
    private IBlogContext _context;

    public BlogController(IBlogContext context)
    {
        _context = context;
    }

    public ActionResult ShowPosts(PostsQuery query)
    {
        query.PrefetchComments = false;
        var posts = query.GetPosts(_context);
        return View(posts);
    }
}

[Bind(Exclude="PrefetchComments")]
public class PostsQuery
{
    private const int DefaultPageSize = 10;

    public int? PageNumber { get; set; }
    public int? PageSize { get; set; }
    public bool Descending { get; set; }
    public bool PrefetchComments { get; set; }

    public IQueryable<Post> GetPosts(IBlogContext context)
    {
        // Typed as IQueryable<Post> so that the results of Include, Skip and
        // Take can be assigned back to the same variable.
        IQueryable<Post> posts = Descending
            ? context.Posts.OrderByDescending
                (post => post.PostDate)
            : context.Posts.OrderBy(post => post.PostDate);
        if (PrefetchComments) {
            posts = posts.Include("Comments");
        }
        if (PageNumber.HasValue && PageNumber.Value > 1) {
            posts = posts.Skip
                ((PageNumber.Value - 1) * (PageSize ?? DefaultPageSize));
        }
        posts = posts.Take(PageSize ?? DefaultPageSize);
        return posts;
    }
}

A few points to note here.

First, you are injecting your Entity Framework DbContext subclass (the implementation of IBlogContext) directly into your controllers. Get over it: it’s not as harmful as you think it is. Your IOC container can (and should) manage its lifecycle.

Secondly, your query object follows the Open/Closed Principle: you can easily add new sorting and filtering options without having to modify either the method signatures of your controllers or its own other properties and methods. With a query method on your Repository, on the other hand, adding new options would be a breaking change.

Thirdly, it is very easy to avoid SELECT n+1 problems on the one hand while at the same time not fetching screeds of data that you don’t need on the other, as the PrefetchComments property illustrates.

Fourthly, this approach is no less testable than your traditional BLL/BOL/DAL approach. By mocking your IBlogContext and IDbSet<T> interfaces, you can test your query object in isolation from your database. You would need to hit the database for more advanced Entity Framework features of course, but the same would be true with query methods on your repository.

Fifthly, note that your query object is automatically created and populated with the correct settings by ASP.NET MVC’s model binder.

All in all, a very simple, elegant and DRY approach.
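To illustrate the testability point, here is a sketch of the paging and sorting logic exercised against an in-memory list. It rests on a simplifying assumption: SimplePostsQuery is a cut-down, hypothetical variant that takes an IQueryable<Post> directly rather than an IBlogContext, and it leaves out PrefetchComments, so that no Entity Framework types are needed to run it. The Post class and the sample data are made up as well.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, minimal entity for the sketch.
public class Post
{
    public int Id { get; set; }
    public DateTime PostDate { get; set; }
}

// Cut-down variant of the query object: it takes an IQueryable<Post>
// directly, so it can be exercised without a database or a context.
public class SimplePostsQuery
{
    private const int DefaultPageSize = 10;

    public int? PageNumber { get; set; }
    public int? PageSize { get; set; }
    public bool Descending { get; set; }

    public IQueryable<Post> Apply(IQueryable<Post> source)
    {
        int pageSize = PageSize ?? DefaultPageSize;
        IQueryable<Post> posts = Descending
            ? source.OrderByDescending(post => post.PostDate)
            : source.OrderBy(post => post.PostDate);
        if (PageNumber.HasValue && PageNumber.Value > 1) {
            posts = posts.Skip((PageNumber.Value - 1) * pageSize);
        }
        return posts.Take(pageSize);
    }
}

public static class PostsQueryDemo
{
    // Five posts dated on consecutive days, with ids 1 to 5.
    public static IQueryable<Post> SamplePosts()
    {
        return Enumerable.Range(1, 5)
            .Select(i => new Post {
                Id = i,
                PostDate = new DateTime(2014, 8, i)
            })
            .ToList()
            .AsQueryable();
    }
}
```

Asking for page 2 with a page size of 2 in descending order skips the two newest posts and returns the next two. A new sorting or filtering option becomes one more property and one more clause in Apply, with no change to any method signature.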

14
Aug

If your tests aren’t hitting the database, you might as well not write tests at all

Out of all the so-called “best practices” that are nothing of the sort, this one comes right up at the top of my list. It’s the idea that hitting the database in your tests is somehow harmful.

I’m quite frankly amazed that this one gets as much traction as it does, because it’s actively dangerous. Some parts of your codebase require even more attention from your tests than others — in particular, parts which are:

  1. easy to get wrong
  2. tricky to get right
  3. not obvious when you’re getting it wrong
  4. difficult to verify manually
  5. high-impact if you do screw up.

Your data access layer, your database itself, and the interactions between them and the rest of your application fall squarely into all the above categories. There are a lot of moving parts in any persistence mechanism — foreign key constraints, which end of a many-to-many relationship you declare as the inverse, mappings, migrations, and so on, and it’s very easy to make a mistake on any of them. If you’ve ever had to wrestle with the myriad of obscure, surprising and gnarly error messages that you get with both NHibernate and Entity Framework, you’ll know exactly what I mean.

If you never test against a real database, but rely exclusively on mocking out your data access layer, you are leaving vast swathes of your most error-prone and business-critical functionality with no test coverage at all. You might as well not be testing anything.

Yes, tests that hit the database are slow. Yes, it’s off-putting to write slow tests. But tests that don’t hit the database don’t test things that need to be tested. Sometimes, there are no short cuts.

(Incidentally, this is also why you shouldn’t waste time writing unit tests for your getters and setters or for anaemic business services: these are low-risk, low-impact aspects of your codebase that usually break other tests anyway if you do get them wrong. Testing your getters and setters isn’t unit testing, it’s unit testing theatre.)

“But that rule just applies to unit tests. Integration, functional and regression tests are different.”

I agree there, and I’m not contradicting that. But if you’re saying “don’t hit your database in your unit tests” and then trying to qualify it in this way, you’re just causing confusion.

Regardless of what you are trying to say, people will hear “don’t hit the database in your tests, period.” People scan what you write and pick out sound bites. They see the headline, and skip over the paragraph about integration tests and so on as if it were merely a footnote.

By all means tell people to test their business logic independently of the database if you like, but phrase it in a way that’s less likely to be misunderstood. If you’re leaving them with the impression that they shouldn’t be testing their database, their data access layer, and the interaction between them and the rest of your application, then even if that isn’t your intention, you’re doing them a serious disservice.

24
Jul

Interchangeable data access layers == stealing from your client

This is one of those so-called “best practices” that crops up a lot in the .net world:

You need to keep the different layers of your application loosely coupled with a clean separation of concerns so that you can swap out your data access layer for a different technology if you need to.

It all sounds right, doesn’t it? Separation of concerns, loose coupling…very clean, SOLID, and Uncle Bob-compliant, right?

Just. A. Minute.

The separation of concerns you are proposing is high-maintenance, high-friction, usually unnecessary, obstructive to important performance optimisations and other requirements, and, as this post by Oren Eini aka Ayende Rahien points out, usually doesn’t work anyway.

In what universe is it a best practice to allocate development time and resources, for which your client is paying, towards implementing a high-maintenance, high-friction, broken, unnecessary, non-functional requirement that they are not asking for, at the expense of business value that they are?

In the universe where I live, that is called “stealing from your client.”

Nobody is saying here that separation of concerns is bad per se. What is bad, however, is inappropriate separation of concerns — an attempt to decouple parts of your system that don’t lend themselves to being decoupled. Kent Beck has a pretty good guideline as to when separation of concerns is appropriate and when it isn’t: you should be dealing with two parts of your system which you can reason about independently.

You cannot reason about your business layer, your presentation layer, and your data access layer independently. User stories that require related changes right across all your layers are very, very common.

Every project that I’ve ever seen that has attempted this kind of abstraction has been riddled with severe SELECT n+1 problems that could not be resolved without breaking encapsulation.

(Nitpickers’ corner: I’m not talking about test mocks here. That’s different. It’s relatively easy to make your test mocks behave like Entity Framework. It’s orders of magnitude harder to make NHibernate or RavenDB behave like Entity Framework.)

If you can present a valid business case for making your persistence mechanism interchangeable, then it’s a different matter, of course. But in that case, you need to implement both (or all) the different options up-front right from the start, and to bear in mind that the necessary separation of concerns almost certainly won’t cleanly follow the boundary between your business layer and your DAL. You should also warn your client of the extra costs involved, otherwise you won’t be delivering good value for money.

16
Jul

The Anaemic Business Layer

The three-layer architecture, with your presentation layer, your business layer and your data access layer, is a staple of traditional .net applications, being heavily promoted on sites such as MSDN, CodeProject and the ASP.NET forums. Its advantage is that it is a fairly canonical way of doing things, so (in theory at least) when you get a new developer on the team, they should have no trouble in finding where everything is.

Its disadvantage is that it tends to breed certain antipatterns that crop up over and over and over again. One such antipattern is what I call the Anaemic Business Layer.

The Anaemic Business Layer is a close cousin of the Anaemic Domain Model, and often appears hand in hand with it. It is characterised by business “logic” classes that don’t actually have any logic in them at all, but only shunt data between the domain model returned from your ORM and a set of identical model classes with identical method signatures in a different namespace. Sometimes it may wrap all the calls to your repository in catch-log-throw blocks, which is another antipattern in itself, but that’s a rant for another time.
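In case the pattern isn’t familiar, this is roughly what it looks like. Every name here is hypothetical and the repository just fabricates a record, but the shape should be recognisable: two identical classes and a “business logic” class that does nothing except copy one into the other.

```csharp
using System;

// What the ORM returns (hypothetical names throughout).
public class CustomerEntity
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// An identical class that would live in a separate "business" namespace.
public class CustomerModel
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Stand-in repository that fabricates a record instead of hitting a database.
public class CustomerRepository
{
    public CustomerEntity GetById(int id)
    {
        return new CustomerEntity { Id = id, Name = "Customer " + id };
    }
}

// The "business logic" class: no rules, no decisions, just copying.
public class CustomerBusinessLogic
{
    private readonly CustomerRepository repository = new CustomerRepository();

    public CustomerModel GetCustomer(int id)
    {
        CustomerEntity entity = repository.GetById(id);
        return new CustomerModel { Id = entity.Id, Name = entity.Name };
    }
}
```

Adding a single field to the customer now means touching CustomerEntity, CustomerModel and the copying code in CustomerBusinessLogic, for no change in behaviour.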

The problem with the Anaemic Business Layer is that it makes your code much more time consuming and difficult to maintain, since you have to drill down through more classes just to figure out what is going on, and you have to edit more files to make a single change. This in turn increases risk because it’s all too easy to overlook one of the places where you have to make a change. It also makes things restrictive, because you lose access to certain advanced features of your ORM such as lazy loading, query shaping, transaction management, cross-cutting concerns or concurrency control, that can only properly be handled in the business layer.

The Anaemic Business Layer is usually symptomatic of an over-strict and inflexible insistence on a “proper” layered architecture, where your UI is only allowed to talk to your business layer and your business layer is only allowed to talk to your data access layer. You could make an argument for the need for encapsulation — so that you can easily change the implementation of the methods in the business layer if need be — but that’s only really important if you’re producing an API for public consumption by the rest of the world. Your app is not the rest of the world, and besides, those specific changes tend not to happen (especially for basic CRUD operations), so I’d be inclined to call YAGNI on that one.

The other reason why you might have an Anaemic Business Layer is that you’ve got too much going on in your controllers or your data access layer. You shouldn’t have any business logic in either, as that hinders testability, especially if you’re of the school of thought that says your unit tests shouldn’t hit the database. But if that’s not the case, then it’s time to stop being so pedantic. An Anaemic Business Layer serves no purpose other than to get in the way and slow you down. So ditch your unhelpful faux-“best practices,” bypass it altogether, and go straight from your UI to your repository.

10
Jul

On dark matter developers and the role of GitHub in hiring

The term “dark matter developer” was coined by Scott Hanselman a couple of years ago, when he wrote this:

My coworker Damian Edwards and I hypothesize that there is another kind of developer than the ones we meet all the time. We call them Dark Matter Developers. They don’t read a lot of blogs, they never write blogs, they don’t go to user groups, they don’t tweet or facebook, and you don’t often see them at large conferences. Where are these dark matter developers online?

The problem with Scott’s post is that he doesn’t give a very clear definition of dark matter developers. Certainly, I get the impression that a lot of people confuse dark matter developers with 501 developers, or low-end developers, or technological conservatives. Take a look at this Hacker News thread for instance—some people seemed to be categorising themselves as dark matter developers even though they were actively contributing to open source projects.

Here’s my suggestion for a more precise definition of a dark matter developer:

A dark matter developer is someone who has not made any evidence publicly available that they are able to do what their CV and LinkedIn profile claim that they can do.

This is pretty much the point that Troy Hunt makes in his blog post, “The ghost who codes: how anonymity is killing your programming career.” To be sure, he does throw in a curve ball where he confuses passion and competence, and I did take him to task on that one at the time, but that aside, the whole point he was making was actually a valid one. What evidence have you made publicly available that you really do have the skills listed on your CV or your LinkedIn profile?

Sadly, for well over 90% of developers out there in the market for a job, the answer is: none whatsoever.

I’ve heard plenty of reasons why this might be the case, but I have yet to hear a good one. For example, some people claim that requiring a GitHub account discriminates against busy people and people with families. That seems like just making excuses to me. You don’t have to spend five hours a day at it, or even five hours a week (and a recruiter expecting something of that order probably would be discriminating unfairly). But if you can’t manage to rustle up five hours in a year to put together one or two small but well-written projects to showcase when you’re looking for a job, do you really have your priorities right?


Cueball: There’s a reason for everything.
Megan: Yeah, but it’s not always a good reason.
(from “Time”, xkcd)

This certainly wouldn’t be accepted in certain other lines of work. You wouldn’t hire a photographer or an architect who didn’t have a portfolio, for example, nor would you take an academic seriously without the all-important track record of peer-reviewed publications in scholarly journals. With that in mind, it seems a bit odd to me that we shouldn’t have something similar in the very industry that practically invented the online portfolio. Or that having something similar, we fail so spectacularly at actually making use of it.

The gold standard here is, of course, an active GitHub account. Some people have objected to the concept of GitHub as your CV in recent months for various reasons, but none of them have come up with any better suggestions, and the fact remains that even just one or two public GitHub or Bitbucket pull requests or similar shows that your code has been reviewed and endorsed by other developers. But even if you haven’t had any pull requests, any code up there is better than nothing. Even simple snippets will do—GitHub lets you post gists on https://gist.github.com/ just by copying and pasting from your IDE to your browser. In the absence of code itself, informed discussions about programming still carry some weight. A blog, or some answers on Stack Overflow, are both ways of supporting your credentials.

At the end of the day, being a dark matter developer probably won’t stop you getting a job, and similarly an online presence won’t guarantee you one. It simply isn’t feasible at this stage to systematically reject everyone who isn’t on GitHub, as some people advocate, especially in the .net ecosystem, which is still frustratingly conservative and traditional. But having some code publicly available for hiring managers to review could certainly make the difference between your CV being noticed and it being lost in a pile along with thirty other CVs that all look exactly the same as each other. Besides, things are changing, and I sometimes wonder: five years or ten years from now, might dark matter developers end up finding themselves unemployable?