Fuzzy dates aren’t as good an idea as you think

I received an e-mail from a colleague the other day about some code that I’d recently pushed to GitHub. Since I’d pushed some more changes round about the time he sent the e-mail, I needed to know which revision he was referring to.

There’s just one problem:

image

Of course, I could have got the proper times from SourceTree or typing git log in the console, but it’s still annoying, especially since the GitHub page was more easily to hand. And GitHub does show you the exact time as a tooltip—something I missed at the time—but it’s still annoying, especially if you have to hover over half a dozen different datestamps to find the one you’re looking for.

We need to have a rethink about fuzzy dates. Yes, I know that it’s friendly and cuddly and warm and fuzzy and cute (and more interesting to code) to say “two days ago” or “eighteen hours ago,” but when I’m trying to refer back to 17:22 precisely, it’s utterly useless and just adds friction without providing any value whatsoever.

Inseparable concerns

Separation of concerns is often cited as the reasoning behind the traditional three-layer architecture. It is important, otherwise you will end up with a Big Ball of Mud.

However, in order to separate out your concerns, you must first categorise them correctly as either business concerns, presentational concerns or data access concerns. Otherwise you will end up with unnecessary complexity, poor performance, anaemic layers, and/or poor testability.

Unfortunately, most three-layer applications completely fail to categorise their concerns correctly. More often than not this is because it is simply not possible to do so, as some concerns fall into more than one category and can’t be refactored out without introducing adverse effects. I propose the term inseparable concerns for such cases.

The key to separation of concerns is to let it be driven by your tests. Under TDD, the first thing you would do if a particular line of code contained a bug would be to write a failing unit test that would pass given the expected correct behaviour. It is what this test does that tells you whether the code under test is a business concern, a data access concern, or a presentational concern.

It is a presentational concern if the test simulates raw user input, or examines final rendered output. For example, mocking any part of a raw HTTP request (GET or POST arguments, cookies, HTTP headers, and so on), verifying the returned HTTP status code, or examining the output generated by a view. In general, if it’s your controllers or your views that you’re testing, it’s a presentational concern.

It is a business concern if the test verifies the correctness of a business rule. Basically, this means that queries are business concerns, period. If they are not returning the correct results, then they have not implemented some business rule or other correctly. Other examples of business concerns include validation, verifying that the data passed to the database or a web service from a command is correct, or confirming that the correct exception is thrown in response to various failure modes.

It is a data access concern if the test requires the code to hit the database. Note that this is where the so-called “best practice” that your unit tests should never hit the database breaks down: if you are adhering to it strictly, sooner or later you will encounter a bug where it stops you from writing a failing test. Most people, when confronted with such cases, skip this step. Don’t: TDD should take precedence. Set up a test database and write the test already.

It is an inseparable concern if it falls into more than one of the above categories. Pretty much any performance-related optimisation that you do will be an example here. For example, if you have to bypass Entity Framework and drop down to raw SQL, you will have to hit the database to verify that business logic is correct. Therefore, it is both a business concern and a data access concern.

Inseparable concerns are much more prevalent than you might expect. IQueryable<T> is the best that we’ve got in terms of making your business and data access layers separable, but, as Mark Seemann points out, it still falls short because NotSupportedException. Another example is calling .Include() on a DbSet to include child entities. Although this is a no-op on Mock<IDbSet<T>>, you can’t verify that you are making the correct calls to .Include() in the first place without hitting the database. Besides which, if you’re mocking DbSet<T> instead of IDbSet<T>, as you’re supposed to be able to do with EF6, calling .Include() throws an exception.

I would just like to stress here that inseparable concerns are not an antipattern—they are a fact of life. All but the simplest of code bases will have them somewhere. The real antipattern is not introducing them, but trying to treat them as if they were something that they’re not.

The Repository Facade

Most developers use the term “Repository” to refer to a wrapper or abstraction layer around your O/R mapper, supposedly to let you switch out one persistence mechanism for another. However, if you look at its definition in its historical context, you’ll see that this isn’t what it refers to at all.

The Repository pattern is a part of your O/R mapper itself.

The Repository pattern was first described as follows in Martin Fowler’s Patterns of Enterprise Application Architecture:

Mediates between the domain and data mapping layers using a collection-like interface for accessing domain objects.

Patterns of Enterprise Application Architecture was written in 2003, at a time when O/R mapping technology was in its infancy. Most ORMs were commercial products, very simple by today’s standards — more akin to the likes of Dapper or PetaPoco than to modern heavyweights like NHibernate or Entity Framework. Hand-rolled data access layers were very much the order of the day. Furthermore, many of the patterns described in P of EAA — Table Data Gateway, Row Data Gateway, Data Mapper, Unit of Work, Identity Map, Lazy Load, and so on, all catalogue what are now different components of modern-day ORMs.

So when the Repository pattern talks about mediating between the domain and “data mapping layers,” it isn’t referring to your ORM as a whole, as most developers seem to assume, but to just one component of your ORM — specifically, the component that copies data from the results of the generated SQL query into your entities. This mediating layer is also an element of functionality provided by modern ORMs.

For example, Entity Framework’s DbSet<T> is a Repository. So too is NHibernate’s ISession, with methods such as QueryOver<T>().

So what is the wrapper class that people write around their ORMs then, the one that they tend to refer to as a Repository? A more accurate term for this is, in actual fact, a Repository Facade.

It’s important to draw the distinction, especially with the debate around whether this pattern has any value or not. Referring to your ORM itself as a Repository makes it easy for people to make the conceptual leap that allows them to just plug Entity Framework straight into their business service classes without the additional layer of abstraction, but on the other hand it can cause a bit of confusion if you then start saying that “the Repository pattern is harmful.” That’s why I’m now being careful to use the term “Repository” to refer to Entity Framework, NHibernate or the RavenDB client itself, and the term “Repository Facade” to refer to the practice of adding an extra abstraction layer around it.

Why feature switches?

The key feature of Dolstagis.Web is that it builds the concept of feature switches and Branch by Abstraction right into its core. To anyone who recalls the position that I adopted three years ago in The Great Feature Branch Debate, this will no doubt come as a bit of a surprise. Am I jumping the shark here?

Not really. I may have been objecting at the time to the black-and-white “feature branches bad, feature toggles good” line being adopted by Martin Fowler and his ThoughtWorks colleagues, but my own opinion is somewhat more nuanced than the polar opposite of “feature toggles bad, feature branches good.” There are actually a lot of clever things that you can do with feature toggles:

  • You can set them to flip at a specific date and time, for example, if you want to launch something at a conference.
  • You can have them go by IP address. For example, country-specific features.
  • You can use them for A/B testing.
  • You can use them to turn off features and degrade your site’s performance gracefully under load.
  • You may occasionally have a feature that needs to be turned off in order to perform a specific upgrade.

The problem with feature switches is that when they are used as an alternative to branching and merging, you are regularly deploying code into your production environment that you know for a fact to be buggy, immature and incorrect. If your feature switches aren’t properly architected, this can end up being exposed to the general public — with disastrous results. Breaking your tasks down into small user stories can help, but this is not always the case.

The commonest mistake here is that your feature switch doesn’t switch everything. On an ASP.NET web application, for example, it may switch out your controllers, or your routes, but you will still have the same views, the same JavaScript, the same CSS and the same images and other static assets available for both the “on” and “off” states. It’s difficult to turn static assets on or off when by default they’re all served up by IIS from your web application’s filespace.

Dolstagis.Web gets round this problem quite simply by requiring you to expose your static files and your views explicitly in your feature definition classes. If it hasn’t been added, it won’t be shown: it’s as simple as that. In fact, you aren’t even limited to using your application filespace: you could just as easily include your static files and views as resources within your assembly. You can even have the feature that you are toggling in a separate assembly altogether, and modify your build process so that your production version doesn’t even have a copy of the immature and buggy code.

Of course this doesn’t guarantee you that you’ll get your feature switches right, and there are still ways in which you can get them wrong, but hopefully this approach will help to make it easier to avoid running into problems.

Query Objects: a better approach than your BLL/repository

If you’ve been following what I’ve been saying here on my blog and on the ASP.NET forums over the past month or so, you’ll no doubt realise that I’m not a fan of the traditional layered architecture, with your presentation layer only allowed to talk to your business layer, your business layer only allowed to talk to your repository, only your repository allowed to talk to your ORM, and all of them in separate assemblies for no reason whatsoever other than That Is How You Are Supposed To Do It. It adds a lot of friction and ceremony, it restricts you in ways that are harmful, its only benefits are unnecessary and dubious, and every implementation of it that I’ve come across has been horrible.

Here’s a far better approach:

public class BlogController : Controller
{
    private IBlogContext _context;

    public BlogController(IBlogContext context)
    {
        _context = context;
    }

    public ActionResult ShowPosts(PostsQuery query)
    {
        query.PrefetchComments = false;
        var posts = query.GetPosts(_context);
        return View(posts);
    }
}

[Bind(Exclude="PrefetchComments")]
public class PostsQuery
{
    private const int DefaultPageSize = 10;

    public int? PageNumber { get; set; }
    public int? PageSize { get; set; }
    public bool Descending { get; set; }
    public bool PrefetchComments { get; set; }

    public IQueryable<Post> GetPosts(IBlogContext context)
    {
        var posts = Descending
            ? context.Posts.OrderByDescending
                (post => post.PostDate)
            : context.Posts.OrderBy(post => post.PostDate);
        if (PrefetchComments) {
            posts = posts.Include("Comments");
        }
        if (PageNumber.HasValue && PageNumber > 1) {
            posts = posts.Skip
                ((PageNumber - 1) * (PageSize ?? DefaultPageSize));
        }
        posts = posts.Take(PageSize ?? DefaultPageSize);
        return posts;
    }
}

A few points to note here.

First, you are injecting your Entity Framework DbContext subclass (the implementation of IBlogContext) directly into your controllers. Get over it: it’s not as harmful as you think it is. Your IOC container can (and should) manage its lifecycle.

Secondly, your query object follows the Open/Closed Principle: you can easily add new sorting and filtering options without having to modify either the method signatures of your controllers or its own other properties and methods. With a query method on your Repository, on the other hand, adding new options would be a breaking change.

Thirdly, it is very easy to avoid SELECT n+1 problems on the one hand while at the same time not fetching screeds of data that you don’t need on the other, as the PrefetchComments property illustrates.

Fourthly, this approach is no less testable than your traditional BLL/BOL/DAL approach. By mocking your IBlogContext and IDbSet<T> interfaces, you can test your query object in isolation from your database. You would need to hit the database for more advanced Entity Framework features of course, but the same would be true with query methods on your repository.

Fifthly, note that your query object is automatically created and populated with the correct settings by ASP.NET MVC’s model binder.

All in all, a very simple, elegant and DRY approach.

(Hat tip: Jimmy Bogard for the original inspiration. This version simply adds the twist of having your query objects created and initialised by ASP.NET MVC’s model binder.)