How to measure things


Posted at 09:00 on 02 October 2017

¹³Do not have two differing weights in your bag – one heavy, one light. ¹⁴Do not have two differing measures in your house – one large, one small. ¹⁵You must have accurate and honest weights and measures, so that you may live long in the land the Lord your God is giving you. ¹⁶For the Lord your God detests anyone who does these things, anyone who deals dishonestly. -- Deuteronomy 25:13-16

Now I'm going to start off with a trigger warning here: this post contains equations.

If that puts you off, then don't even think of wading into the creation and evolution debate. There are many, many aspects of the subject that involve mathematics, and if you aren't able to get your head round that simple fact, you will just end up getting things wrong, claiming that evolution is about crocoducks and shape-shifting cat-dogs, looking completely clueless, and undermining everything that you stand for.

Having said that, the equations in this post aren't particularly advanced, and I'm more interested in drawing your attention to the fact that they exist than trying to do anything with them. But they are important, because they concern one of the most fundamental, basic skills in science: the art of measurement. As such, this is the first thing that you learn in the first half hour of the first practical class of any undergraduate physics degree course. Working scientists need to know this stuff cold — and so too does anyone who wants to teach in their church about creation and evolution.

Measurement 101.

Now when I talk about measurement, you probably think of getting out a tape measure, stretching it from one end of a piece of furniture to the other, reading a single number off it, and leaving it at that. Your desk may be 180 centimetres wide, for example. But that's all you get — a single figure.

That single figure isn't good enough for science.

When scientists measure things, they don't just want to know the value itself; they also want to know how much confidence they can place in it. For that reason, they always seek to determine its uncertainty, or standard error. Additionally, when they plug their results into their equations to get a final value, they include the standard errors as well.

For example, they're not content with knowing that the earth is about 4.5 billion years old. They want to know how far on either side of 4.5 billion years the "real" value might fall. So they will tell you that the age of the earth is 4.54±0.05 billion years, an error of about ±1%. Taking that as a 95% confidence interval, it means they are 95% confident that the earth is older than 4.49 billion years and younger than 4.59 billion years.

For comparison, the error in your car's speedometer is about ±2.5%.
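The percentage and the range quoted above for the age of the earth can be checked with a few lines of arithmetic. This is only a sketch; the variable names are my own.

```python
# Age of the earth as quoted above, in billions of years.
age = 4.54
error = 0.05

# Relative error: the size of the error bar as a fraction of the value.
relative_error = error / age

# The range covered by the error bar.
lower, upper = age - error, age + error

print(f"relative error: +/-{100 * relative_error:.1f}%")
print(f"range: {lower:.2f} to {upper:.2f} billion years")
```

The relative error comes out at about 1.1%, which is the "about ±1%" quoted above.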

How errors are calculated

There is no guesswork involved in calculating error bars. On the contrary, they are measured and calculated according to specific statistical formulae.

In order to determine the errors involved in a measurement, you first take multiple readings, $latex x_1 \cdots x_n$, then calculate the mean, $latex \bar{x}$:

$latex \bar{x} = \frac{1}{n}\left (\sum_{i=1}^n{x_i}\right ) = \frac{x_1+x_2+\cdots +x_n}{n}&s=2$

and the sample standard deviation, $latex \sigma$:

$latex \sigma = \sqrt{\frac{\sum_{i=1}^n (x_i - \bar{x})^2}{n-1} } &s=2$
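As a minimal sketch, here is how those two formulae might look in Python. The readings are made-up values for the width of a desk, and the function names are my own, not taken from any particular library.

```python
import math

def mean(readings):
    """Arithmetic mean: the sum of the readings divided by n."""
    return sum(readings) / len(readings)

def sample_std_dev(readings):
    """Sample standard deviation, dividing by n - 1 as in the formula above."""
    n = len(readings)
    if n < 2:
        raise ValueError("need at least two readings")
    x_bar = mean(readings)
    squared_deviations = sum((x - x_bar) ** 2 for x in readings)
    return math.sqrt(squared_deviations / (n - 1))

# Hypothetical repeated measurements of a desk width, in centimetres.
readings = [180.2, 179.8, 180.1, 179.9, 180.0]

x_bar = mean(readings)
sigma = sample_std_dev(readings)
print(f"width = {x_bar:.2f} +/- {sigma:.2f} cm")
```

Python's standard library also provides `statistics.mean` and `statistics.stdev`, which should give the same results.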

What does this all mean?

It's important to realise that error bars don't represent "hard" limits; they only represent the "spread" of your results, and as such, a probability distribution for what future measurements are likely to report.

Your results are usually assumed to follow a normal distribution. This may not be exact, but it is usually a good approximation. A normal distribution is a bell-shaped curve that looks like this:

[Figure: A normal distribution curve. Source: Wikipedia (M W Toews)]

It is described by this equation:

$latex f(x \; | \; \bar{x}, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2} } \; e^{ -\frac{(x-\bar{x})^2}{2\sigma^2} } &s=2$

The important things to note here are the percentage figures:

68.3% of results will be within $latex \pm \sigma$ of the mean.
95.4% of results will be within $latex \pm 2 \sigma$ of the mean.
99.7% of results will be within $latex \pm 3 \sigma$ of the mean.
The number of results further than $latex \pm 4 \sigma$ from the mean will be negligible (about 0.006%).
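If you want to check these percentages for yourself, the fraction of a normal distribution lying within $latex \pm k\sigma$ of the mean is $latex \mathrm{erf}(k/\sqrt{2})$. Here is a short sketch using the error function from Python's standard library; the function name `fraction_within` is my own.

```python
import math

def fraction_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3, 4):
    print(f"within +/-{k} sigma: {100 * fraction_within(k):.3f}%")
```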

Different scientific papers quote error bars in different ways. Some of them use $latex \pm\sigma$, simply indicating the standard deviation. Some of them use $latex \pm 2\sigma$, because a 95% confidence level is more intuitively meaningful. Sometimes they will quote a different value called the standard error of the mean, given by this equation:

$latex \sigma_m = \frac{\sigma}{\sqrt{n}} &s=2$

The standard error of the mean is usually used when a large number of readings have been taken with a view to pinning down a value as accurately as possible. Loosely speaking, it gives the probability distribution for the "real" result as opposed to the probability distribution for future measurements.
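A minimal sketch of the standard error of the mean in Python follows. The readings are made-up values and the function name is my own; `statistics.stdev` is the standard library's sample standard deviation (the one that divides by $latex n-1$).

```python
import math
import statistics

def standard_error_of_mean(readings):
    """Sample standard deviation divided by the square root of the number of readings."""
    return statistics.stdev(readings) / math.sqrt(len(readings))

# Hypothetical repeated measurements, in centimetres.
readings = [180.2, 179.8, 180.1, 179.9, 180.0]

sigma = statistics.stdev(readings)
sigma_m = standard_error_of_mean(readings)
print(f"standard deviation:         {sigma:.3f} cm")
print(f"standard error of the mean: {sigma_m:.3f} cm")
```

Because of the $latex \sqrt{n}$ in the denominator, taking four times as many readings only halves the standard error of the mean.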

Calculating the error in the final results

So, let's say you have a number of measurements, and you want to use them to find your final result, say, the age of a rock formation. Let's call your measurements $latex A \pm \sigma_A, B \pm \sigma_B, \cdots $ and so on, with your final result given by the function $latex Z(A, B, \cdots)$. Then, provided the errors in your measurements are independent of each other, the error in $latex Z$ will be given by this formula:

$latex (\sigma_Z)^2 = \big(\frac{\partial Z}{\partial A}\sigma_A\big)^2 + \big(\frac{\partial Z}{\partial B}\sigma_B\big)^2 + \cdots &s=2$
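Here is a rough sketch of that formula in Python. It estimates each partial derivative numerically with a central finite difference rather than working it out by hand, which is my own shortcut and only an approximation. It also assumes that the errors in the inputs are independent. The function name `propagate_error` and the example values are hypothetical.

```python
import math

def propagate_error(func, values, sigmas, rel_step=1e-6):
    """Estimate sigma_Z for Z = func(*values), assuming independent errors.

    Each partial derivative is approximated by a central finite difference.
    """
    variance = 0.0
    for i, (value, sigma) in enumerate(zip(values, sigmas)):
        # Choose a small step relative to the size of the input.
        step = rel_step * (abs(value) if value != 0 else 1.0)
        upper = list(values)
        lower = list(values)
        upper[i] = value + step
        lower[i] = value - step
        partial = (func(*upper) - func(*lower)) / (2 * step)
        # Add this input's contribution (dZ/dA * sigma_A)^2.
        variance += (partial * sigma) ** 2
    return math.sqrt(variance)

# Hypothetical example: Z = A * B with A = 10.0 +/- 0.1 and B = 5.0 +/- 0.2.
sigma_z = propagate_error(lambda a, b: a * b, [10.0, 5.0], [0.1, 0.2])
print(f"Z = {10.0 * 5.0:.1f} +/- {sigma_z:.2f}")
```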

If you aren't familiar with partial derivatives, here are some special cases that crop up quite a lot:

For $latex Z = A+B$ or $latex Z = A-B$: $latex (\sigma_Z)^2 = (\sigma_A)^2 +(\sigma_B)^2 &s=2$
For $latex Z = A \times B$ or $latex Z = A / B$: $latex \big(\frac{\sigma_Z}{Z}\big)^2 = \big(\frac{\sigma_A}{A}\big)^2 + \big(\frac{\sigma_B}{B}\big)^2 &s=2$
For $latex Z=A^m$: $latex \big(\frac{\sigma_Z}{Z}\big)^2 = \big(m \frac{\sigma_A}{A}\big)^2 &s=2$
For $latex Z=\ln A$: $latex (\sigma_Z)^2 = \big(\frac{\sigma_A}{A}\big)^2 &s=2$
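These special cases translate into code quite directly. The following is only a sketch: the function names and the example values are made up, and all of it assumes that the errors in $latex A$ and $latex B$ are independent.

```python
import math

def sigma_sum(sigma_a, sigma_b):
    """Error in Z = A + B or Z = A - B: absolute errors add in quadrature."""
    return math.hypot(sigma_a, sigma_b)

def sigma_product(z, a, sigma_a, b, sigma_b):
    """Error in Z = A * B or Z = A / B: relative errors add in quadrature."""
    return abs(z) * math.hypot(sigma_a / a, sigma_b / b)

def sigma_power(z, a, sigma_a, m):
    """Error in Z = A ** m: the relative error is multiplied by m."""
    return abs(z) * abs(m * sigma_a / a)

def sigma_log(a, sigma_a):
    """Error in Z = ln(A): the absolute error in Z is the relative error in A."""
    return abs(sigma_a / a)

# Hypothetical measurements: A = 10.0 +/- 0.1 and B = 5.0 +/- 0.2.
a, sigma_a = 10.0, 0.1
b, sigma_b = 5.0, 0.2

print(f"A + B  = {a + b:.2f} +/- {sigma_sum(sigma_a, sigma_b):.2f}")
print(f"A / B  = {a / b:.2f} +/- {sigma_product(a / b, a, sigma_a, b, sigma_b):.2f}")
print(f"A ** 2 = {a ** 2:.1f} +/- {sigma_power(a ** 2, a, sigma_a, 2):.1f}")
print(f"ln(A)  = {math.log(a):.3f} +/- {sigma_log(a, sigma_a):.3f}")
```

Notice that in the sum and the quotient, the result is dominated by the less precise of the two measurements.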

What to look for in evidence for the age of the earth.

This is a very basic introduction to how measurements are taken and how errors are calculated, but hopefully it will give you a flavour of the process involved, and an understanding that, far from being guesswork, it's an exact, rigorous and systematic discipline. There's a lot more that could be said on the subject: for example, there are specific equations to use when fitting a line or a curve to a graph.

I haven't said anything about systematic errors either: these are errors that may affect all measurements in an experiment to an equal extent, and may be caused by such things as contamination, zero errors, or mis-calibrated equipment. A whole raft of techniques is needed to deal with these errors, but the problem is by no means insurmountable. The way they are handled is very similar in many respects to the way that historical assumptions are handled.

The most important thing to take away from all this, however, is that errors and uncertainties can be, and are, quantified. This one fact is the point of failure for many, many young-earth arguments. They try to demonstrate, for example, that radiometric dating is "unreliable," or that certain assumptions are "generous to uniformitarians." This kind of talk is completely unscientific: real scientists will attempt to quantify exactly how unreliable radiometric dating is, and will want to establish precise limits on how much historical rates could have varied.

When you're reviewing young-earth evidence, always look for the error bars. Are they quoted consistently? They may quote them some of the time but not others. Are the error bars that they are omitting likely to be large enough to nullify their arguments? Are they rejecting high-precision measurements with tiny error bars in favour of low-precision measurements with large error bars? Are they giving equal weight to both high-precision measurements and low-precision measurements? (This was one of the fatal flaws in Barry Setterfield's c-decay hypothesis, for example.) Are they interpreting them realistically? You may see them highlight a discordance of, say, $latex 4 \sigma $ here and there, but realistically, does that justify claims that the methods concerned can't distinguish between thousands and billions when $latex \sigma $ is just one or two percent?
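To put some numbers on two of those questions, here is a sketch with entirely made-up values. The first part uses the inverse-variance weighted mean, a standard way of combining measurements of differing precision that I haven't covered in this post, to show why high-precision and low-precision measurements shouldn't be given equal weight. The second part shows how small a $latex 4 \sigma$ discordance is when $latex \sigma$ is 2%.

```python
import math

def weighted_mean(values, sigmas):
    """Inverse-variance weighted mean and its error, assuming independent errors."""
    weights = [1.0 / s ** 2 for s in sigmas]
    total_weight = sum(weights)
    combined = sum(w * v for w, v in zip(weights, values)) / total_weight
    combined_sigma = math.sqrt(1.0 / total_weight)
    return combined, combined_sigma

# Hypothetical: a precise measurement (100.0 +/- 1.0)
# and an imprecise one (110.0 +/- 10.0) of the same quantity.
values = [100.0, 110.0]
sigmas = [1.0, 10.0]

naive = sum(values) / len(values)  # treats both as equally reliable
combined, combined_sigma = weighted_mean(values, sigmas)
print(f"equal weighting:            {naive:.1f}")
print(f"inverse-variance weighting: {combined:.1f} +/- {combined_sigma:.1f}")

# A 4 sigma discordance when sigma is 2% of the measured value.
sigma_fraction = 0.02
discordance_sigmas = 4
shift = discordance_sigmas * sigma_fraction
print(f"a {discordance_sigmas} sigma discordance shifts the result by {100 * shift:.0f}%")
```

With equal weighting, the imprecise measurement drags the answer halfway towards itself; with proper weighting it barely moves it. Likewise, a shift of 8% is nowhere near the factor of hundreds of thousands needed to turn billions into thousands.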

Somehow, I think not. When you are told that some new source of error in radiometric dating has been discovered, for example, your first question should always be, exactly how much of a problem is it? If you can't get a straight answer to that question, treat it with a hefty pinch of salt. Chances are, it's far less of a problem than it's being made out to be.