james mckay dot net
because there are few things that are less logical than business logic

Error handling is the one thing that puts me off learning Go

It seems that all the cool kids are learning Go these days. It’s certainly appealing to be able to get the performance of C without all the headaches, and to be able to package your program up into a single, tight binary without masses and masses of bloated dependencies. Furthermore, since it’s the language of choice for a lot of important software, such as Kubernetes and Terraform, I’m probably going to have to get my head round it one way or another sooner or later.

But there’s one thing about Go that I really, really do not like one little bit: its approach to error handling. Rather than having exceptions, it reports errors as return codes.

Go’s justification for not having exceptions is as follows:

We believe that coupling exceptions to a control structure, as in the try-catch-finally idiom, results in convoluted code. It also tends to encourage programmers to label too many ordinary errors, such as failing to open a file, as exceptional.

Go takes a different approach. For plain error handling, Go’s multi-value returns make it easy to report an error without overloading the return value. A canonical error type, coupled with Go’s other features, makes error handling pleasant but quite different from that in other languages.

They really need to give some examples to back up this assertion, because when they say that exceptions result in convoluted code, I have no idea what on earth they are talking about. Sure, I’ve seen code that gets exception handling wrong, but that was more due to the code itself being bad rather than any problem with the concept of exceptions itself. I’ve also worked with codebases that get it right, and all I can say is that exceptions done right are much easier to work with and reason about than error codes.

Wrong reasons for objecting to exceptions

There are various reasons why people don’t like exceptions. Some people react against them because they were popularised by Java, and a whole lot of other Bad Things were popularised by Java as well. And yes, maybe Java did make some mistakes by implementing checked exceptions, but please, don’t throw out the baby with the bathwater.

Others complain about exceptions because people get them wrong, doing stupid things like this:

try:
    do_something()
except:
    pass

Please, people. The correct response to misuse is not disuse, but proper use. People will do stupid things with any programming language construct. It doesn’t mean that the constructs themselves are bad.

Others consider not having to write extra error handling code as laziness. But so what? Work is not about favouring busyness over laziness; work is about delivering value to your customers. If you can deliver the same value in half the time with half the bugs and half as many lines of code, you’re not being lazy; you’re being efficient.

Others complain about exceptions crashing their Python or C# code with an ugly looking stack trace. But this is easy to fix, simply by implementing a global exception handler at the top of your code, sending the stack trace to a logging service such as Elasticsearch, and just showing an appropriate message. In any case, which is worse: a stack trace, or a program that silently corrupts your data?
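A minimal sketch of such a global handler in Python, using sys.excepthook from the standard library. The logger name "app" and the wording of the user-facing message are assumptions for illustration. Shipping the log records to an external logging service would mean attaching a suitable logging handler, which is not shown here.

```python
import logging
import sys

# Hypothetical logger name; attach whatever handlers your logging setup needs.
logger = logging.getLogger("app")


def handle_uncaught(exc_type, exc_value, exc_traceback):
    """Log the full stack trace and show the user a short message instead."""
    # Leave Ctrl-C alone so the program can still be interrupted normally.
    if issubclass(exc_type, KeyboardInterrupt):
        sys.__excepthook__(exc_type, exc_value, exc_traceback)
        return
    # The stack trace goes to the log, not to the user.
    logger.error(
        "Unhandled exception",
        exc_info=(exc_type, exc_value, exc_traceback),
    )
    print("Sorry, something went wrong. The error has been logged.",
          file=sys.stderr)


# Install the handler once, at the top of the program.
sys.excepthook = handle_uncaught
```

Nothing else in the program has to change: code that does not catch an exception still stops at the point of failure, and the handler decides what the user sees.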

Others complain that exceptions mean that you don’t know which functions might throw errors and which might not. But the only safe assumption that you can make is that any line of code might throw an error, and often for reasons that you are not expecting and did not anticipate.

What are you supposed to do with errors anyway?

The most important thing you need to realise is that both exceptions and error codes should have a very specific meaning — namely, that your method was unable to do what its specification says that it does. It could have failed for any number of reasons: bad user input, missing dependencies, external services having gone offline, timeouts, foreign key violations, null references, division by zero, out of memory, stack overflow, array bounds errors, or even literal bugs. But the important information that they convey is that you asked some other code to do X, and for whatever reason, it did not do X.

It is completely unhelpful to try to categorise them as “ordinary errors” or “exceptional errors.” This distinction is so vague and ambiguous as to be effectively meaningless, and in any case will depend more on context and your own use cases than on any intrinsic properties of the exceptions themselves. The main distinction that you need to make with errors is between those that you have anticipated and can meaningfully correct, and everything else.

For errors that you are able to handle, the correct action will usually be specified in your user stories, and as such, they will need to be handled on a case by case basis. But for errors that you have not yet anticipated, 95% of the time the correct action will be to assume that your own code is also unable to do what it is supposed to, stop what it is doing, and report a failure to the caller.

It is almost never appropriate for your code to carry on regardless after an error. If it does so, it will be running under assumptions that are incorrect. At best, it will result in further errors. At worst, it will silently corrupt your data.

Exceptions are about convention over configuration and safe defaults

Convention over configuration is a principle of language and framework design that says that if one specific course of action predominates, it should be made an implicit convention, and extra code should only be needed to override it.

With error codes, the default behaviour is to do precisely what you are not supposed to do — carry on regardless. Consequently, every single function call needs to be followed by a test for the return value. And the code to do so will be mindlessly, frustratingly repetitive:

    if err := datastore.Get(c, key, record); err != nil {
        return &appError{err, "Record not found", 404}
    }
    if err := viewTemplate.Execute(w, record); err != nil {
        return &appError{err, "Can't display record", 500}
    }

But what this is doing is exactly what exceptions do anyway! The whole point of exceptions is to take this repetitive boilerplate code and make it implicit. Other mechanisms, such as try/catch/finally blocks, exist to provide clear and specific ways to override this convention. The result is code that is clearer and easier to understand, with a significantly improved signal-to-noise ratio.
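To illustrate, here is a rough Python counterpart to the Go snippet above. It is only a sketch: RecordNotFound, load_record, render, view_record and handle_request are hypothetical names, and a plain dict stands in for the datastore.

```python
class RecordNotFound(Exception):
    """Hypothetical error raised when a key is not in the datastore."""


def load_record(datastore, key):
    if key not in datastore:
        raise RecordNotFound(key)
    return datastore[key]


def render(record):
    # Raises KeyError if the record has no "title" field.
    return "<h1>{title}</h1>".format(**record)


def view_record(datastore, key):
    # No error-handling code here: if either call fails, this function
    # stops and the error propagates to the caller automatically.
    record = load_record(datastore, key)
    return render(record)


def handle_request(datastore, key):
    # The one place where the default is overridden, mapping failures
    # to the same status codes as the Go version.
    try:
        return 200, view_record(datastore, key)
    except RecordNotFound:
        return 404, "Record not found"
    except Exception:
        return 500, "Can't display record"
```

The function that does the work, view_record, contains only the two calls that matter. The mapping from failures to status codes lives in one place rather than after every call.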

Yet the language designers of Go consider this boilerplate a virtue!

In Go, error handling is important. The language’s design and conventions encourage you to explicitly check for errors where they occur (as distinct from the convention in other languages of throwing exceptions and sometimes catching them). In some cases this makes Go code verbose, but fortunately there are some techniques you can use to minimize repetitive error handling.

Whatever happened to DRY? Whatever happened to convention over configuration?

Cleaning up

When your exception handling code demands more complex scenarios than just propagating the error up the call stack, 90% of the time it will simply be to clean up: close file handles, release locks, roll back transactions, and then propagate the error condition up the call stack. Furthermore, the cleanup code will usually be common to a whole block of other method calls, any of which could raise an error. This Python code is an example:

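import json

f = open('data.json')
try:
    # Start the transaction inside the outer try so the file is always closed
    transaction = repository.start_transaction()
    try:
        data = json.load(f)
        repository.update(data, transaction)
        transaction.commit()
    except:
        # Undo any partial changes, then re-raise the original error
        transaction.rollback()
        raise
finally:
    f.close()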

I shall leave it as an exercise for the reader to translate this block of code into Go. Go does give you the defer statement, which lets you queue up functions to run when the enclosing function returns, but if you have to run separate code for success (transaction.commit()) and failure (transaction.rollback()), as is the case here, your code will be significantly more complex. Additionally, many exception-based languages give you syntactic sugar for these most common exception handling cases: in particular, using in C#, or with in Python.
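As a sketch of what that syntactic sugar buys you, the same commit-or-rollback logic can be wrapped in a context manager and reused with a with statement. The names transaction_scope and import_data are hypothetical, and the repository object is assumed to offer the same start_transaction() and update() methods as in the example above.

```python
import json
from contextlib import contextmanager


@contextmanager
def transaction_scope(repository):
    """Commit if the with block succeeds; roll back and re-raise if not."""
    transaction = repository.start_transaction()
    try:
        yield transaction
        transaction.commit()
    except BaseException:
        transaction.rollback()
        raise


def import_data(path, repository):
    # The file is closed and the transaction is committed or rolled back
    # automatically, whichever way the block is left.
    with open(path) as f, transaction_scope(repository) as transaction:
        data = json.load(f)
        repository.update(data, transaction)
```

The cleanup rules are written once in transaction_scope, and every caller gets them by default.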

Stick to the conventions of your language

Of course, if you’re using Go, handling error codes is what you have to do. The Go designers decided to do without exceptions, and to introduce them at this stage would just cause confusion. You would end up with some functions returning error codes while others throw exceptions under exactly the same failure conditions.

Now Go does have a panic/recover construct that is similar to exceptions in some respects. But it is rarely — and inconsistently — used. We are told that it is supposed to be reserved for “truly exceptional conditions.” But what, exactly, makes one condition “truly exceptional” and another not? Why, for example, are array bounds errors exceptional, but Println errors, bad format strings, and broken connections are not? There is neither rhyme nor reason to the distinction.

The Go community seems to be avoiding problems with error handling for now, but that is mainly because Go programmers tend to be experienced, high-end developers who are used to the discipline of meticulously writing all the extra error-handling code. But with the rise in Go’s popularity, sooner or later it is going to experience an eternal September, with newbies piling in and forgetting the all-important error-handling boilerplate code left, right and centre, and when that happens, they will discover that return codes instead of exceptions are no panacea.

You have to tell AWS CLI that your EC2 instance is not in Virginia

Here’s a little gotcha with AWS that I keep running into time and time again. The aws command line interface, and AWS API libraries such as boto3, will always default to the us-east-1 (Virginia) region, even when running on EC2 instances in other regions.

This is not what you expect, and it is almost never what you want.

There is an issue on the awscli GitHub issue tracker to fix this, but it is still open four years after first being raised, with no indication when (or even whether) it will ever be addressed.

User @BradErz suggests including these lines in your user_data to set the default region:

region=$(curl http://169.254.169.254/latest/dynamic/instance-identity/document|grep region|awk -F\" '{print $4}')
echo "[default]" > /root/.aws/config
echo "region = ${region}" >> /root/.aws/config

Note however that this will only set the default region for the root user; you will need to configure aws-cli separately for any other logins on your instance.
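If you would rather do this from Python than from shell, the same steps can be sketched with the standard library alone. This is an illustration rather than a tested deployment script: the function names are made up, and like the curl command above it assumes that the instance metadata endpoint answers plain unauthenticated requests. Passing the home directory as a parameter is one way to cover logins other than root.

```python
import json
import os
import urllib.request

IDENTITY_URL = (
    "http://169.254.169.254/latest/dynamic/instance-identity/document"
)


def fetch_identity_document(timeout=2):
    """Fetch the instance identity document (only works on an EC2 instance)."""
    with urllib.request.urlopen(IDENTITY_URL, timeout=timeout) as response:
        return response.read().decode("utf-8")


def region_from_identity_document(document_text):
    """Parse the JSON document instead of picking it apart with grep and awk."""
    return json.loads(document_text)["region"]


def write_default_region(region, home):
    """Write a minimal ~/.aws/config for the user whose home directory is given."""
    config_dir = os.path.join(home, ".aws")
    os.makedirs(config_dir, exist_ok=True)
    with open(os.path.join(config_dir, "config"), "w") as f:
        f.write("[default]\nregion = {}\n".format(region))
```

On an instance, you would call write_default_region(region_from_identity_document(fetch_identity_document()), "/root"), and repeat the call with the home directory of any other user that needs it.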

Annoying as this behaviour is, I would be surprised to see it fixed any time soon, as it would be a breaking change.

Design refresh

I updated the design of my blog over the weekend.

My main goal was to switch back to a new, responsive version of my original blog theme, with its orange and blue colour scheme. Since the start of this year I’d been trying out a variety of off-the-shelf WordPress themes to give me a responsive (and therefore more SEO-friendly) design, but I was never really satisfied with any of them, so orange and blue is back, with its first major design refresh since 2011.

I’d thought that making it responsive would be a massive undertaking, but at the end of the day it only took me a couple of hours on Saturday morning. I was helped greatly in this by the fact that I was already using Less CSS as a pre-processor. I was also able to use Google’s mobile friendly test tool to quickly identify and fix the issues that needed fixing.

If you shrink the window down to below 400 pixels, you’ll see that the text in the header starts to shrink with it to fit. I’d seen a few other sites do this, and it turns out to be very simple to implement, using a combination of media queries and viewport units. In case you’re interested, here is the Less CSS mixin that I’m using to achieve this:

.responsive-font-size(@size, @resize-below) {
    font-size: @size;
    @media screen and (max-width: @resize-below) {
        font-size: unit(@size * 100 / @resize-below, vw);
    }
}
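The arithmetic in the mixin can be checked with a few lines of Python. This is only a worked example: the function names are invented, and the 32 pixel font size in the usage note is an assumed value rather than the one used on this blog.

```python
def font_size_vw(size_px, resize_below_px):
    """The vw value the mixin emits inside the media query."""
    return size_px * 100 / resize_below_px


def rendered_size_px(size_px, resize_below_px, viewport_px):
    """Font size in pixels that the browser ends up using."""
    if viewport_px > resize_below_px:
        # Media query does not apply: fixed size.
        return size_px
    # 1vw is 1% of the viewport width.
    return font_size_vw(size_px, resize_below_px) * viewport_px / 100
```

With an assumed 32 pixel heading and a 400 pixel breakpoint, font_size_vw(32, 400) gives 8, so the heading is set to 8vw. That is exactly 32 pixels at a 400 pixel viewport, so there is no jump at the breakpoint, and it scales down to 16 pixels at 200.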

I’ve also restored my old blog posts, together with most of the comments and attached pictures, from a backup that I’d forgotten that I had. In the process of doing so, I’ve updated all the internal hyperlinks and image locations to point to https:// URLs on my blog; this was needed to eliminate mixed-content security warnings in the browser address bar on older posts that contained images. I achieved this quite simply and elegantly by using sed on the output of mysqldump to perform an appropriate find and replace before reloading. The exact command to use is left as an exercise for the reader.

Behind the scenes, I’ve moved it onto a $5/month 512MB DigitalOcean droplet. This is all you need for a blog that only gets a hundred or so hits a day, and a bit of load testing with Apache JMeter suggested to me that it should be able to handle a spike from Hacker News if necessary; apparently hitting the HN home page can get you about 6,000 hits an hour. I’ve scripted the server setup using Terraform and I’ve also got a couple of scripts to back up and restore the data. This means that I can tear down and rebuild it very quickly if need be, in accordance with the modern best practice of treating your servers as cattle rather than pets.

If you have any problems with it, or if anything doesn’t look right, please let me know. If you want to be reminded what the old version looked like, here are some archive.org snapshots for 2007, 2010, 2011, 2012, and 2016.

Introducing Lambda Tools: a new framework for deployment to AWS Lambda

Lambda Tools is a new project that I’ve been working on over the past few weeks or so. It is a build and deployment toolkit, written in Python, to make it easier to work with AWS Lambda functions. It started out as a user story in our backlog at work but since then it’s grown into a full blown open source project with a life of its own.

The “serverless” model offered by AWS Lambda is a useful and potentially cost-effective one, especially for scheduled tasks that only need to run every so often and don’t require a lot of resources. The “free tier” doesn’t expire at the end of your initial twelve month trial, which is an added bonus.

The downside is that it can be tricky to work with. If your function requires additional libraries, it starts to get a bit more complex, and if any of these are written partly in C and require special compilation, things can get pretty messy. On top of that, you will want to write unit tests for your functions and set up some sort of Continuous Delivery pipeline for them.

Lambda Tools includes several features to make these things easier. For example, it gives you an option to build your function in a Docker container to avoid these messy “invalid ELF header” errors. You configure it simply by creating a YAML file called aws-lambda.yml, which might look like this for example:

version: 1

functions:
  hello_world:
    runtime: python3.6
    build:
      source: src/hello_world
      requirements:
        - file: requirements.txt

    deploy:
      handler: hello.handler
      role: service-role/NONTF-lambda

      description: A basic Hello World handler
      memory_size: 128
      package: build/hello_world.zip
      region: eu-west-2
      timeout: 60

      tags:
        Account: Marketing
        Application: Newsletters

Here are some of the other ideas that I’m thinking of implementing for it:

  • A unit test runner
  • Support for Python 2.7
  • Support for languages other than Python (.NET, Node.js, and so on)
  • Integration with Terraform:
    • The ability to plug it into Terraform’s external data source provider
    • The ability to read configuration from stdin
    • A scaffolding engine to generate Terraform modules on the fly
  • Integration with triggers such as CloudWatch cron jobs, API Gateway, and so on
  • A full-blown sample application
  • The ability to include or exclude specific files when building your package

At the moment it only supports Python 3.6 but nevertheless it is in a usable state. You can install it using pip install lambda_tools and it includes full instructions in the readme file on the GitHub repository. Additionally, if you’re interested in getting involved with its development, feel free to fork the project and send me a pull request or three. If it’s anything more complex, raise a ticket on the GitHub issue tracker and we’ll chat about it there.

Necessary and sufficient conditions

Take a look at these two statements. Are they both saying the same thing?

  1. “If you are using HTTPS, then your website is secure.”
  2. “If you are not using HTTPS, then your website is not secure.”

In actual fact, they are not. Furthermore, only the second statement is true: the first statement is false.

The first statement is an example of a sufficient condition. If it were true, all you would need to do to secure your website would be to install an SSL certificate and you’d be done.

The second statement, on the other hand, is an example of a necessary condition. There are, of course, other things you need to do to ensure that your website is secure: for example, take care to avoid SQL injection and cross-site scripting attacks, keep your servers patched and up to date, and so on. But you still need to use HTTPS in addition to all these. If you don’t, your site will be vulnerable to a man-in-the-middle attack.

You can see the difference if I draw up a truth table for a sufficient condition:

Sufficient condition   Other stuff   Secure?
No                     No            No
Yes                    No            Yes
No                     Yes           Maybe
Yes                    Yes           Yes

On the other hand, a necessary condition looks like this:

Necessary condition   Other stuff   Secure?
No                    No            No
Yes                   No            Maybe
No                    Yes           No
Yes                   Yes           Maybe

Some conditions can be both necessary and sufficient. In this case, the truth table looks like this:

Necessary and sufficient condition   Other stuff   Secure?
No                                   No            No
Yes                                  No            Yes
No                                   Yes           No
Yes                                  Yes           Yes

A necessary and sufficient condition can be written as “if and only if.” This is sometimes shortened to “iff.”
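The two definitions can also be written down as checks over every possible scenario. The sketch below uses a deliberately simplified toy model in which a site counts as secure when it uses HTTPS, is patched and has safe code. That model and all the names in it are assumptions for illustration, not a claim about real-world security.

```python
from itertools import product


def is_sufficient(condition, outcome, worlds):
    """Sufficient: whenever the condition holds, so does the outcome."""
    return all(outcome(w) for w in worlds if condition(w))


def is_necessary(condition, outcome, worlds):
    """Necessary: whenever the outcome holds, so does the condition."""
    return all(condition(w) for w in worlds if outcome(w))


# Every combination of three yes/no facts about a website.
worlds = [
    {"https": h, "patched": p, "safe_code": c}
    for h, p, c in product([False, True], repeat=3)
]


def secure(w):
    # Toy model: secure only if all three facts hold.
    return w["https"] and w["patched"] and w["safe_code"]


def uses_https(w):
    return w["https"]


def does_everything(w):
    return w["https"] and w["patched"] and w["safe_code"]


https_is_necessary = is_necessary(uses_https, secure, worlds)
https_is_sufficient = is_sufficient(uses_https, secure, worlds)
everything_is_both = (
    is_necessary(does_everything, secure, worlds)
    and is_sufficient(does_everything, secure, worlds)
)
```

In this toy model, https_is_necessary comes out true and https_is_sufficient comes out false, while doing everything is both necessary and sufficient, simply because the model defines security that way.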

Insufficient does not mean unnecessary.

The most common misunderstanding that people have about necessary and sufficient conditions is the mistaken belief that one implies the other. Or that a lack of one implies a lack of the other.

  • It is possible for conditions to be sufficient but not necessary.
  • It is possible for conditions to be necessary but not sufficient.

Take, for example, this comment:

Google is just a bully because it is so big. It can go f*** itself. A standard webpage is not insecure and the use of SSL doesn’t make it secure either. Maybe everyone forgets that when SSL certs were comprised. I do work on e-commerce sites and I have seen clients who sites got hacked, not because of lack of SSL, but because of bad code on their backend. The hackers proceeded to add code so they would get emailed the credit card info after it was submitted. The user would never know, because the big green icon in the browser said it was secure. The whole thing is just a way for companies to make money.

This commenter correctly realised that SSL is insufficient but he then assumed that this means that SSL is therefore unnecessary. This is of course incorrect. SSL may be insufficient, but it is very, very necessary.

Unfortunately, in the world of IT security, there are plenty of necessary conditions. But there are no sufficient ones.