Equation for the experimental variogram

This simple equation is possibly one of the most influential and important drivers of the mining industry and yet most people working with resource companies have never heard of it and those that have don’t understand it.

Without this equation, the industry would lack a robust basis for resource evaluation and valuation. We would not have the basic tools we take for granted around all of our production predictions and it’s likely our cash flow forecasts would be even more inaccurate and imprecise.

It’s a truly critical equation and beautiful in its simplicity and yet… even those who read the math and routinely work with this essential tool often fail to appreciate its value. Often the gap between theory and practice is so great that we forget the implications.

We build models, mines and entire companies on sometimes weak application of this simple concept. That may be appropriate but let’s make sure we do so in an informed manner and not out of ignorance.

So, what is this magical piece of math? It’s the formula for the experimental (semi)variogram. The variogram is the foundation of geostatistical estimation and simulation techniques. It drives our understanding of spatial variability, provides a way to account for sample quality, and informs us about the differences we can expect between drill hole samples and month-by-month production grades. The variogram is fundamental in evaluating and quantifying the quality of all resource estimates (and hence everything downstream from the resource estimate). We use the variogram to optimise interpolation and to control simulation.

The importance of the experimental variogram and the variogram models we fit to the experimental data cannot be overstated.

Even if you don’t have a math background, the variogram is conceptually very easy to understand and I encourage you to read on! Understanding such a vital concept at a conceptual level will enhance your understanding of resource risk and estimation precision, and ultimately make you better informed about why resource estimates can sometimes be incorrect or inappropriate.

Let’s start with the equation again…
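Written out in its standard form, using the symbols explained below, the experimental semivariogram looks like this. I’ve assumed N is the number of sample pairs separated by h, and x_i is the location written as Xi in the breakdown that follows:

```latex
\hat{\gamma}(h) = \frac{1}{2N} \sum_{i=1}^{N} \left[ Z(x_i) - Z(x_i + h) \right]^2
```

Read it from the inside out: take the difference between a sample and its partner a distance h away, square it, add up those squares for all N pairs, then divide by 2N.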

Mathematicians, like geologists, love their jargon. In this case some of the jargon is in the letters and symbols in the equation. Once you understand the notation, understanding the concept is much easier. Here’s a breakdown:

γ (gamma) is the value we are calculating – the little ^ means it is calculated from our data (experimentally) not from a model of some sort

h is a separation vector (distance and direction) also known as the ‘lag’

Z is a variable, like the grade of a sample

Xi is shorthand for the XYZ coordinates of the sample Z

The Σ means to sum all the values to the right for the limits set below (i = 1) and above (N). In other words, sum the outcome of the equation for all samples (Z) at locations Xi

Don’t let someone else’s jargon prevent your understanding.

The variogram is nothing more than a statistic. It is a statistic that describes the spatial variability of some measure (e.g., grades). That description can be presented in many different ways but the common format for the variogram is a 2-dimensional plot with an x-axis of h vs a y-axis of γ (gamma). What are these two values?

h is a vector, or more simply a distance in a specific direction. It represents the separation distance in that direction.

γ is half of the average squared difference between pairs of points separated by h. That is, take the squared difference between one sample and a second sample that is separated from the first by h (a distance and direction), average that over every such pair, and divide by 2.


We can plot these differences (the y-axis) vs. distance/direction (the x-axis). If you stop and think about that for a moment you might be able to intuitively see what a variogram plot should look like. You might also start to recognise that the variogram is a type of map of the average difference in pairs of data in space.

What does your intuition say? Think about it this way…

  • What would be the difference between one point compared to itself (i.e., a zero distance)? If you said zero you pass the basic math test…
  • Now think about samples down a drill hole. Imagine we have a sample every metre down the hole.
  • What do you think the difference between the first and second sample would be when compared to the difference between the first and third or first and fourth sample? Would the difference be more likely to be increasing or decreasing?
  • What about the first sample and the 20th or 50th?

If you are having trouble visualising this think about it in terms of how well correlated the samples are or if there is likely to be any sort of relationship between the samples as they get further and further separated. Remember that variables like grades tend to have some sort of structure – imparted during the mineralising event. Higher grades tend to occur in zones allowing us to isolate (and mine) these higher-grade parts of mineralised systems. This is another way of saying that we can draw contours around grades of similar values and expect those contours to be a reasonable model of the grade distribution. If we couldn’t group high and low grades spatially we would have extreme difficulty mining the deposit economically (but more on that later).

This sort of thinking is something that tends to come naturally to geologists but can be harder to grasp for those without scientific or engineering training. I think most people would recognise that the difference between a sample and itself should be zero, even if they think it’s a pointless comparison. What happens as the distance between samples increases though?

As we compare samples with greater and greater separation they tend to be less similar to each other – the differences increase with distance (on average!). The experimental variogram statistic is exactly that. It is a measure of the average squared difference of all pairs of points separated by increasing distance in a given direction (halved, hence the ‘semi’). You may ask why we square these differences. There are a couple of reasons; the most critical is that it ensures the values are all positive. If we didn’t square the differences then the contribution of some pairs might be negative and others positive, so summing the differences would not give us a meaningful result.
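To see why the squaring matters, here is a minimal Python sketch. The six grades are made up purely for illustration. It compares the plain average of the signed differences with the average of the squared differences for neighbouring samples.

```python
# Made-up 1 m assays (g/t), purely illustrative.
grades = [1.2, 1.5, 1.1, 1.8, 1.4, 1.3]

# Difference between each sample and the next one down the hole (lag = 1 m).
diffs = [grades[i + 1] - grades[i] for i in range(len(grades) - 1)]

# Positive and negative differences cancel when averaged as they are...
mean_signed = sum(diffs) / len(diffs)
# ...but squared differences are all positive, so nothing cancels.
mean_squared = sum(d * d for d in diffs) / len(diffs)

print(f"signed differences : {[round(d, 2) for d in diffs]}")
print(f"mean signed diff   : {mean_signed:.3f}")
print(f"mean squared diff  : {mean_squared:.3f}")
```

The signed differences nearly cancel (their sum is just the last grade minus the first), so their average says almost nothing about how variable the grades are. The squared version keeps every pair's contribution.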

If you remember nothing else about the variogram remember this… The variogram is a measure of the difference between samples over increasing distance.

Let’s delve a bit deeper though…

While this basic understanding is a good start it pays to look at what’s happening and what might be driving the variogram statistic. Picture a drill hole (see below). We have 1m grade assays for 500m along the hole and we want to know what this hole is telling us about spatial continuity. A good start is to calculate the experimental variogram in the down hole direction. How?

Here’s a snapshot of the first few samples in our drill hole. I’ve labelled them with the depth or distance down the hole. For now, let’s agree that if we compare every sample to itself (distance = 0) the variogram will be zero. Now let’s look at a distance of 1m.

Here we compare every pair of samples that are separated by 1m. That is sample 0 to sample 1, sample 1 to sample 2 and so on… We can put this all in a table to summarise all the pairs and their squared differences.
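As a rough sketch of that table, the Python below pairs up samples 1m apart and lists their squared differences. The depths and grades are hypothetical, and `pair_table` is just a helper name I’ve made up for this illustration.

```python
# Hypothetical assays keyed by depth down the hole (m); values are made up.
assays = {0: 1.2, 1: 1.5, 2: 1.1, 3: 1.8, 4: 1.4, 5: 1.3}

def pair_table(assays, lag):
    """Return rows of (depth_from, depth_to, z_from, z_to, squared_difference)."""
    rows = []
    for depth in sorted(assays):
        partner = depth + lag
        if partner in assays:
            z_from, z_to = assays[depth], assays[partner]
            rows.append((depth, partner, z_from, z_to, (z_from - z_to) ** 2))
    return rows

table = pair_table(assays, lag=1)
print("from  to   z_from  z_to  sq_diff")
for depth, partner, z_from, z_to, sq in table:
    print(f"{depth:>4} {partner:>3} {z_from:>8.2f} {z_to:>5.2f} {sq:>8.3f}")

# Half the average squared difference is the variogram value at this lag.
gamma_1 = sum(row[4] for row in table) / (2 * len(table))
print(f"gamma(1 m) = {gamma_1:.3f}")
```

Calling `pair_table(assays, lag=2)` gives the equivalent table for samples 2m apart, with one fewer pair because the hole runs out sooner.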

We can repeat this comparison for a separation of 2, 3, 4 and so on, generating the squared differences each time and tabulating these values. I can also plot the paired data as scatter-plots. If I keep going like this for all 500 samples in my drill hole I will end up with a multitude of tables of paired data and their differences. It soon gets out of control and is hard to visualise. Like all good scientists faced with too much data we summarise this multitude of paired tables and paired scatterplots to make them easier to understand and visualise. Remember this though… when we summarise data (no matter how) we lose some of the richness present in the unsummarised data. Sometimes that can be a problem.

So, the summary step. Simplify our analysis and present the results in some more meaningful manner. Something that captures the essence of our analysis of the squared differences. Well, the easiest way to summarise a list of numbers is to take their average. Here’s what we get when we look at all 500 samples down the drill hole and the average pair differences at separations ranging from 0 to 40m. You will notice that the scatterplots look worse as the separation increases and the average squared differences increase as the separation increases – that matches our intuition. Taking the concept of summarising the outcomes one step further, let’s chart the average squared differences against the separation distance.

And that brings us to the variogram as it is classically presented. A plot of gamma (γ) against the distance (h). A simple summary statistic that is easy to understand when explained. It shows us the average difference between pairs of sample data separated by a distance in a specific direction. The average difference of the pairs in this case increases with distance before levelling off at around 40m.
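The whole down-hole calculation fits in a few lines of Python. This is a sketch under stated assumptions, not the method used for the charts in this post: the 500 "assays" are synthetic (a moving average of random noise, chosen so that similarity fades over about 40 samples) and `experimental_variogram` is an illustrative name.

```python
import random

def experimental_variogram(values, max_lag):
    """Half the average squared difference for each whole-sample lag, 0..max_lag.

    Returns a list of (lag, number_of_pairs, gamma) tuples.
    """
    result = []
    for lag in range(max_lag + 1):
        sq_diffs = [(values[i] - values[i + lag]) ** 2
                    for i in range(len(values) - lag)]
        result.append((lag, len(sq_diffs), sum(sq_diffs) / (2 * len(sq_diffs))))
    return result

# Synthetic stand-in for 500 one-metre assays: a 40-sample moving average of
# random noise, so neighbouring values are alike and the similarity fades
# over roughly 40 m. This is NOT real drill-hole data.
random.seed(42)
window = 40
noise = [random.gauss(0.0, 1.0) for _ in range(500 + window - 1)]
grades = [sum(noise[i:i + window]) / window for i in range(500)]

for lag, n_pairs, gamma in experimental_variogram(grades, max_lag=60):
    if lag % 10 == 0:
        print(f"h = {lag:>2} m   pairs = {n_pairs:>3}   gamma = {gamma:.4f}")
```

Note how the number of pairs shrinks as the lag grows. Points at long lags are averages of fewer pairs, so they are less reliable than points at short lags.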

But why is this variogram so important?

We’ve gone through the steps to develop a statistic that summarises the average differences between pairs of data at different separations in a given direction. If we could model the distance vs. difference relationships we have determined from our sample data, and if we could create that model in every direction, we would effectively have a map of how different (on average) we would expect any two samples to be when compared to each other. Stop and think about that for a minute. Could that be useful? Maybe think about it in terms of estimating the grade at a point in space from a set of samples. Here, we are trying to estimate an unknown value from one (or more) surrounding samples.

Knowing our expectation of the average difference between the unknown point and the known values (in each direction) gives us valuable insight into how related or unrelated each sample is to the unknown point.

This is the framework for kriging. We interpolate unknown values at locations away from known samples and we use the variogram we have modelled to guide how we weight each sample in the estimation process. Put differently we use the variogram to help identify which samples should have more influence or less influence on the grade at our new location. I’ll cover that in more detail in another blog. Suffice to say that the variogram (or more correctly the model we fit to the experimental variogram) underpins every aspect of kriging and geostatistical resource estimation.
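To give a flavour of how a variogram model turns into sample weights, here is a small ordinary kriging sketch in Python. Everything in it is an assumption made for illustration: the spherical model with a sill of 1, a range of 40m and no nugget, the three sample locations, the target point, and the helper names.

```python
import math

def spherical(h, sill=1.0, rng=40.0):
    """Spherical variogram model with no nugget (parameters are assumptions)."""
    if h >= rng:
        return sill
    r = h / rng
    return sill * (1.5 * r - 0.5 * r ** 3)

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def ordinary_kriging_weights(samples, target, model=spherical):
    """Weights for each sample when estimating the value at `target`."""
    n = len(samples)
    # Model variogram between every pair of samples, plus a final row and
    # column that force the weights to sum to 1.
    a = [[model(math.dist(samples[i], samples[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    # Model variogram between each sample and the point being estimated.
    b = [model(math.dist(s, target)) for s in samples] + [1.0]
    return solve(a, b)[:n]  # drop the Lagrange multiplier

samples = [(0.0, 0.0), (30.0, 0.0), (0.0, 60.0)]  # hypothetical (east, north)
target = (10.0, 5.0)
weights = ordinary_kriging_weights(samples, target)
for s, w in zip(samples, weights):
    print(f"sample at {s}: distance {math.dist(s, target):5.1f} m, weight {w:.3f}")
```

In this layout the closest sample gets the most weight and the sample beyond the 40m range gets very little, which is the variogram model doing its job.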

I hope you’ve managed to stick with this explanation of the variogram and the most important equation in the resource industry. If you have you will understand it is a summary statistic that describes the spatial variability of data. The average squared difference at different separations in different directions. Conceptually simple.

There are a few key points to note about this statistic. Things worth bearing in mind if you are ever faced with a variogram in the wild or if you are talking to a resource specialist and they mention the term.

  1. The variogram is a summary of a summary of a summary. Each point on the variogram plot you may see actually summarises one property of a scatterplot of pair differences – by taking the average of those squared differences. Here’s the problem… averages are notoriously sensitive to outliers. All it takes is one or two extreme values and the average changes dramatically. Think about the series of numbers 1, 2, 3, 4, 5, 6; the average is 3.5. Now if we change the final value from 6 to 60 the average changes to 12.5. It’s sensitive to the extremes (outliers). It’s relatively easy to see extremes on a scatterplot – look for the points that don’t appear to be related, those that sit at odds with the rest of the data. It’s not as easy to see those extremes when the data is summarised by averaging. Consequently, a variogram can look much worse than it may be in reality – all due to some extreme outliers. How do these outliers occur? Many ways but typically you will have paired data points (at a given separation) where one sample is very different to the other points. Removing that one sample from the analysis may give a totally different perception of the spatial grade distribution;
  2. The degree of sensitivity to those outliers is worse than you may think. Remember we are dealing with the squared differences. Take a number and square it and you increase the ‘contrast’. The difference looks more extreme. So, if there are outliers in the data, by squaring the differences we increase the influence of those outliers.
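Both points are easy to check with a few lines of Python. The first part reproduces the 1 to 6 example above. The second uses made-up grades with one extreme assay swapped in, and `gamma_lag1` is an illustrative helper for the variogram value between neighbouring samples.

```python
def mean(values):
    return sum(values) / len(values)

# The example from the text: one extreme value drags the average.
print(mean([1, 2, 3, 4, 5, 6]))    # 3.5
print(mean([1, 2, 3, 4, 5, 60]))   # 12.5

def gamma_lag1(values):
    """Half the average squared difference between neighbouring samples."""
    sq = [(values[i] - values[i + 1]) ** 2 for i in range(len(values) - 1)]
    return sum(sq) / (2 * len(sq))

clean = [1.2, 1.5, 1.1, 1.8, 1.4, 1.3]    # made-up grades
spiked = [1.2, 1.5, 1.1, 18.0, 1.4, 1.3]  # same grades, one extreme assay

# Compare how much the outlier inflates the plain average versus gamma.
mean_ratio = mean(spiked) / mean(clean)
gamma_ratio = gamma_lag1(spiked) / gamma_lag1(clean)
print(f"average grade inflated {mean_ratio:.1f}x")
print(f"gamma at lag 1 inflated {gamma_ratio:.0f}x")
```

The single extreme assay touches two of the five pairs, and because its differences are squared it inflates the variogram value far more than it inflates the average grade.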

These two points reinforce the need for a robust decision about what data should be included in an estimation domain (i.e., the stationarity – see this post). Pool data that belongs to 2 different populations and you are likely to make the variogram look worse than reality. The picture becomes blurred.

  3. The above examples have been very simple – a necessity for understanding the concept of a variogram. In the real world, we very rarely (never?) deal with sample data that is organised on a regular grid. This affects the experimental variogram calculation. To ensure we have sufficient pairs of data in at least 3 orthogonal directions (we need these for the mathematical models) we must relax some of the conditions we use to work out sample pairs. We do this by relaxing two aspects of the experimental variogram:
    • The separation distance is not usually an absolute. We allow separation distances that ‘are about the same’ to be summarised together in the average squared difference. The degree of relaxation is controlled in the analysis; and
    • The direction is likewise not absolute. Instead we have tolerance bands looking for samples that are aligned in the direction of interest. These bands are typically some angular wedge or a cone.

These two departures from a strict adherence to separation and direction once again blur the experimental variogram. By widening the tolerance limits on these two dimensions we increase the likelihood that we will be summarising the squared differences of pairs of data that are not truly part of the information we are trying to understand.
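Here is a rough 2D sketch of how those two tolerances work when searching for pairs. The five sample locations and grades are invented, the function and parameter names are mine, and real software usually adds further controls (such as a maximum bandwidth) that I’ve left out.

```python
import math

def directional_pairs(points, lag, lag_tol, azimuth_deg, angle_tol_deg):
    """Index pairs whose separation is within lag +/- lag_tol and whose
    direction is within angle_tol_deg of the azimuth (clockwise from north).
    Direction is treated as an axis, so h and -h count as the same."""
    ux = math.sin(math.radians(azimuth_deg))
    uy = math.cos(math.radians(azimuth_deg))
    pairs = []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            dx = points[j][0] - points[i][0]
            dy = points[j][1] - points[i][1]
            dist = math.hypot(dx, dy)
            if dist == 0 or abs(dist - lag) > lag_tol:
                continue
            cos_angle = abs(dx * ux + dy * uy) / dist
            angle = math.degrees(math.acos(min(1.0, cos_angle)))
            if angle <= angle_tol_deg:
                pairs.append((i, j))
    return pairs

def gamma(points, pairs):
    """Half the average squared grade difference, or None if there are no pairs."""
    if not pairs:
        return None
    sq = [(points[i][2] - points[j][2]) ** 2 for i, j in pairs]
    return sum(sq) / (2 * len(sq))

# Invented samples as (easting, northing, grade); not on a regular grid.
points = [(0.0, 0.0, 1.2), (1.0, 9.5, 1.6), (-0.5, 21.0, 1.1),
          (8.0, 12.0, 2.4), (2.0, 30.5, 1.9)]

# Looking north at a 10 m lag: tight tolerances versus relaxed ones.
strict = directional_pairs(points, lag=10, lag_tol=0.5,
                           azimuth_deg=0, angle_tol_deg=5)
loose = directional_pairs(points, lag=10, lag_tol=2.5,
                          azimuth_deg=0, angle_tol_deg=22.5)
print("strict pairs:", strict, "gamma:", gamma(points, strict))
print("loose pairs :", loose, "gamma:", gamma(points, loose))
```

With tight tolerances no pairs qualify, so there is nothing to average. Relaxing them finds pairs, but those pairs are neither exactly 10m apart nor exactly aligned north, which is the blurring described above.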

More Later

To wrap up what is already a long and technical post… The variogram is an extremely powerful concept. Variogram models (or curves fitted to the experimental variograms derived from our sample data) play a fundamental role in resource estimation and grade interpolation. Without the variogram working as a map of the spatial differences between samples in all directions any grade interpolation is nothing more than an expert judgement. The variogram provides a robust and auditable framework for the interpolation outcomes. Still, it is not without its own issues, namely poor implementation and sometimes pragmatic application that ignores the basis of this key summary statistic.

If there is one aspect of resource estimation I encourage every geologist, engineer, metallurgist and manager to understand and investigate, it’s the variogram.

In the next post, we will take a look at the anatomy of the variogram and explain how to interpret this vital geostatistical tool.






