Inseparable concerns

Separation of concerns is often cited as the reasoning behind the traditional three-layer architecture. It is important, otherwise you will end up with a Big Ball of Mud.

However, in order to separate out your concerns, you must first categorise them correctly as either business concerns, presentational concerns or data access concerns. Otherwise you will end up with unnecessary complexity, poor performance, anaemic layers, and/or poor testability.

Unfortunately, most three-layer applications completely fail to categorise their concerns correctly. More often than not this is because it is simply not possible to do so, as some concerns fall into more than one category and can’t be refactored out without introducing adverse effects. I propose the term inseparable concerns for such cases.

The key to separation of concerns is to let it be driven by your tests. Under TDD, the first thing you would do if a particular line of code contained a bug would be to write a failing unit test that would pass given the expected correct behaviour. It is what this test does that tells you whether the code under test is a business concern, a data access concern, or a presentational concern.

It is a presentational concern if the test simulates raw user input, or examines final rendered output. For example, mocking any part of a raw HTTP request (GET or POST arguments, cookies, HTTP headers, and so on), verifying the returned HTTP status code, or examining the output generated by a view. In general, if it’s your controllers or your views that you’re testing, it’s a presentational concern.

It is a business concern if the test verifies the correctness of a business rule. Basically, this means that queries are business concerns, period. If they are not returning the correct results, then they have not implemented some business rule or other correctly. Other examples of business concerns include validation, verifying that the data passed to the database or a web service from a command is correct, or confirming that the correct exception is thrown in response to various failure modes.

It is a data access concern if the test requires the code to hit the database. Note that this is where the so-called “best practice” that your unit tests should never hit the database breaks down: if you are adhering to it strictly, sooner or later you will encounter a bug where it stops you from writing a failing test. Most people, when confronted with such cases, skip this step. Don’t: TDD should take precedence. Set up a test database and write the test already.

It is an inseparable concern if it falls into more than one of the above categories. Pretty much any performance-related optimisation that you do will be an example here. For example, if you have to bypass Entity Framework and drop down to raw SQL, you will have to hit the database to verify that business logic is correct. Therefore, it is both a business concern and a data access concern.

Inseparable concerns are much more prevalent than you might expect. IQueryable<T> is the best that we’ve got in terms of making your business and data access layers separable, but, as Mark Seemann points out, it still falls short because NotSupportedException. Another example is calling .Include() on a DbSet to include child entities. Although this is a no-op on Mock<IDbSet<T>>, you can’t verify that you are making the correct calls to .Include() in the first place without hitting the database. Besides which, if you’re mocking DbSet<T> instead of IDbSet<T>, as you’re supposed to be able to do with EF6, calling .Include() throws an exception.

I would just like to stress here that inseparable concerns are not an antipattern—they are a fact of life. All but the simplest of code bases will have them somewhere. The real antipattern is not introducing them, but trying to treat them as if they were something that they’re not.

If your tests aren’t hitting the database, you might as well not write tests at all

Out of all the so-called “best practices” that are nothing of the sort, this one comes right up at the top of my list. It’s the idea that hitting the database in your tests is somehow harmful.

I’m quite frankly amazed that this one gets as much traction as it does, because it’s actively dangerous. Some parts of your codebase require even more attention from your tests than others — in particular, parts which are:

  1. easy to get wrong
  2. tricky to get right
  3. not obvious when you’re getting it wrong
  4. difficult to verify manually
  5. high-impact if you do screw up.

Your data access layer, your database itself, and the interactions between them and the rest of your application fall squarely into all the above categories. There are a lot of moving parts in any persistence mechanism — foreign key constraints, which end of a many-to-many relationship you declare as the inverse, mappings, migrations, and so on, and it’s very easy to make a mistake on any of them. If you’ve ever had to wrestle with the myriad of obscure, surprising and gnarly error messages that you get with both NHibernate and Entity Framework, you’ll know exactly what I mean.

If you never test against a real database, but rely exclusively on mocking out your data access layer, you are leaving vast swathes of your most error-prone and business-critical functionality with no test coverage at all. You might as well not be testing anything.

Yes, tests that hit the database are slow. Yes, it’s off-putting to write slow tests. But tests that don’t hit the database don’t test things that need to be tested. Sometimes, there are no short cuts.

(Incidentally, this is also why you shouldn’t waste time writing unit tests for your getters and setters or for anaemic business services: these are low-risk, low-impact aspects of your codebase that usually break other tests anyway if you do get them wrong. Testing your getters and setters isn’t unit testing, it’s unit testing theatre.)

“But that rule just applies to unit tests. Integration, functional and regression tests are different.”

I agree there, and I’m not contradicting that. But if you’re saying “don’t hit your database in your unit tests” and then trying to qualify it in this way, you’re just causing confusion.

Regardless of what you are trying to say, people will hear “don’t hit the database in your tests, period.” People scan what you write and pick out sound bites. They see the headline, and skip over the paragraph about integration tests and so on as if it were merely a footnote.

By all means tell people to test their business logic independently of the database if you like, but phrase it in a way that’s less likely to be misunderstood. If you’re leaving them with the impression that they shouldn’t be testing their database, their data access layer, and the interaction between them and the rest of your application, then even if that isn’t your intention, you’re doing them a serious disservice.

First impressions of MSpec

For several years now, I’ve used Gallio and MbUnit for unit testing, but just the other day I was introduced to MSpec. I’ve been posting some of my initial thoughts about it on Twitter and I thought I’d better expand on them here. The idea is that you should be able to express your tests succinctly in a language that is supposedly closer to English than to C#.

Something like this:

[Subject(typeof(LogoController))]
public class when_index_action_is_invoked
{
    static LogoController controller;
    static ViewResult result;

    Establish context = () => {
        controller = new LogoController();
    };

    Because of = () => result = controller.Index();

    It should_return_a_ViewResult_with_default_view_name = () => {
        result.ViewName.ShouldBeEmpty();
    };
}

Whoa there. Just what’s going on here?

Whatever it’s doing, it completely subverts the C# language, gives the finger to Resharper and StyleCop, and throws out every best practice and coding convention in the book. Multiple classes per file. All lower-case names, with underscores. And for class and class property names, which should be nouns, you get verbs (Establish), pronouns (It) and prepositions (Because of). It’s like waking up one morning and discovering that the word “the” is no longer the definite article, but now means Gatwick Airport.

When I see code like this, it reminds me of one thing in particular.

Why’s (Poignant) Guide to Ruby.

Chunky bacon!Now I’ve no idea where Joel Spolsky got the idea that Python is for Asperger’s geeks, but he was definitely onto something when he said that Ruby is for tear-streaked emo teenagers, because the Poignant Guide is the very manifesto for that particular stereotype. With the aid of talking cartoon foxes making inexplicable remarks about chunky bacon, _why devotes 176 pages to the thesis that if your code looks sufficiently like English, it automatically qualifies as a Mills and Boon novel.

I say this because MSpec was, in fact, heavily influenced by RSpec, a similar framework for Ruby, and I can’t help getting the impression that throwing out all the .NET conventions looks like an attempt to make C# look more like Ruby — in a kind of Ruby-that-tries-to-look-like-English way.

The principles behind MSpec seem sound enough. A typical unit test should have three phases to it: Arrange (set up your sample data and context), Act (carry out the action under test), and Assert (make sure that you get the correct results). MSpec represents these as the Establish, Because of, and It clauses respectively. One advantage is that every assertion is treated as a separate test, so you can have multiple Asserts for a single context and action, and see their outcomes separately in your reports.

Is it any better than the conventional way of writing tests? One of the claims of MSpec is that it is designed to reduce noise in tests. I’m not sure that it does. In effect, each MSpec class performs the same work as a single test method in MbUnit, and in fact the framework is coercing a C# class to behave as if it were a function. This, along with lower case identifiers with underscores, and multiple classes per file, look like examples of dragging Ruby conventions kicking and screaming into the .NET world, and this inevitably introduces a lot of cognitive dissonance, which can be quite confusing until you’ve figured out what’s going on. It also makes me worry how many leaky abstractions are involved, and I doubt if it reduces the amount of code that you have to write overall.

Long lines are another thing that worry me — the MSpec best practices say that you should have only one line of code in your Because and It clauses, and as a result, some of the MSpec code I’ve encountered extends well past 80 columns. I hate long lines of code with a passion because having to scroll left and right all the time dramatically reduces readability and makes it much harder to diff your code in source control.

Of course, these are just my initial impressions. Perhaps in a few months’ time, you may find me in slim-fit jeans and a tight T-shirt, my eyes hidden behind long side-swept bangs, and tears streaming down my cheeks at the poignancy of MSpec. But as far as I can see at the moment, it’s a bit of an acquired taste.