It had to happen. There’s no way to avoid it and still derive reasonable and robust grade predictions. We must talk about statistics. I’ll try and be gentle, make it as intuitive as possible. Don’t be scared now, we are only going to dip our toes into the water and check the temperature. The water’s clear, we can see the bottom of the pool, it should be safe!

This series of blog posts is titled The Truth About Estimation for a reason. We must remember resource estimation is a predictive science. We are predicting tonnes and grade from sparse data. That word predicting is the key. If I go to my favourite on-line dictionary the meaning of ‘predict’ is:

- to declare or indicate in advance; especially: foretell on the basis of observation, experience, or scientific reason

There you go. All those geologist-engineer-metallurgist jokes about what does 1+1 equal are true! Geology, specifically resource geology, is nothing but fortune-telling!

## About Inference

Seriously, there’s something in that basic definition that helps us understand our resource estimates. We are predicting (estimating) on the basis of observations (geological and geochemical data) using scientific reasoning – or at least we should be.

This is where statistics comes into the picture. It’s the field of statistics, specifically probabilities, that we use in resource estimation. And yet I bet most of us forget this foundation and simply charge onto crunching numbers.

Once the drilling is finished, the assays are back from the lab, the core is all logged and you’ve developed that robust 3-dimensional geological interpretation it’s time to bring it all together and finish the job. Surely the hard work is already done…

About now you might be thinking it’s time to start kriging or doing an inverse distance estimate or whatever other approach you fancy. Not so. Between the geological interpretation and estimating is one of the most important and overlooked aspects of resource modelling. We need to develop our estimation domains. You see, a set of estimation domains is not equivalent to the geological interpretation.

An estimation domain (or more commonly a set of domains across the deposit) defines the volume and spatial location within which we group sample data together to form an estimate. Strictly, it’s the domains, not the geology, that define the boundaries of the estimate. In some cases the two (the domain interpretation and the geology interpretation) may be the same, but in others that may not be appropriate. Using the ‘raw’ geological interpretation, no matter how robust, may cause serious issues with the estimate.

Why? It comes back to the nature of prediction and the underlying statistical framework that scientific prediction is based on – statistical inference.

Like all aspects of inductive inference, we are using data to derive a conclusion (our estimates). That means we should understand a bit about statistical inference, its strengths, its weaknesses and limitations.

The first thing to realise about models based on inductive inference is that they are fundamentally limited by the data they are based upon. That may seem obvious but it’s a critical point and it raises many challenges in resource modelling (and indeed in other model-based disciplines including artificial intelligence!). A model based on inference cannot predict an event that is not represented in the data (like a black swan), and it will faithfully reproduce any bias in the data it is given. Examples from other disciplines include:

- An AI-based photo filter that ‘beautifies’ your selfie and was found to be lightening skin tones as part of the beautification process (the training data was predominantly light-toned faces);
- A bias in Google ads where women were shown fewer high-paid job adverts when compared to men; and
- A passive application that measures road pothole frequency using data collected from smartphone accelerometers, where there is a subtle data collection bias due to differential uptake of smartphone technology across different socio-economic groups.

The second thing to realise about inductive inference models is that they rely on an ‘assumption of sameness’ or ‘constancy’. That is, the data is assumed to be representative of the whole and furthermore the order of data collection (sampling) is assumed to be immaterial. Many geologists will be familiar with this type of assumption. It is similar to the Uniformitarian school of geological thinking that emerged in the late 18th century and became popular in the 19th. Uniformitarianism advocates that geological processes we observe in the present are the same as those that occurred in the past and therefore the geological record can be explained entirely in terms of our current observations of geological processes.

And the last property of an inductive inference model worth noting is that they can be wrong. Even the best inductive methods applied to all the data may give rise to incorrect predictions.

Let’s get back to the problem – estimation. Why on earth do we need all this discussion about induction and inference? It all comes back to that question of domains and domaining. You see, the inferential nature of our estimation models implies certain things about the domains we use for those models (remember, domains are not necessarily equivalent to the geological interpretation). In particular, our domains need to address the three characteristics of inductive inference models:

- Data driven and yet data limited;
- The principle of ‘sameness’ or representivity; and
- The potential for being incorrect.

In summary that means we need to ensure the data in our domains are apples and apples, not apples and oranges while at the same time allowing for the possibility that those apples may actually be plums!

## The basis of inference – stationarity

This is the idea behind what statisticians and geostatisticians term Stationarity.

Stationarity is an expert decision. It is the decision we make when we say “all these samples are statistically (and geologically) the same.” We, as the person responsible for the estimate, decide what data it is appropriate to pool together into a single grouping. This assumption of stationarity is a property of the model we develop. If we change our domaining (stationarity) decision the model we develop will also change.

Here’s the thing… stationarity is a property of the model, not the physical spatial distribution. It cannot be directly confirmed from the data (remember the model is data limited). Stationarity is a theory about our geology, samples and data. As such, the stationarity theory can be falsified – collect more data and see if it contradicts the stationarity assumption (note, samples that confirm the stationarity assumption do not ‘prove’ the theory – the next sample may be the data that falsifies the assumption).

So how do we assess stationarity? With statistics.

## EDA

Have you ever wondered why exploratory data analysis (EDA) is routinely included as part of resource estimation? Those endless tables and graphs (often consigned to appendices) showing statistical analysis of different rock types, often quoting numbers you haven’t seen since high school like skewness or kurtosis? All too often EDA is conducted by rote, with little ‘analysis’ and an emphasis on presenting statistical summaries. Instead of this mindless and mind-numbing number crunching, EDA should form a vital part of your domaining decisions and stationarity analysis.

Recall that we are after domains where all the data is ‘the same’. In other words, we want a single population where all samples drawn from that population are representative of that population. Apples and apples.

Here’s where it gets a little more confusing. There are different forms of stationarity, or more correctly ‘degrees’ of stationarity. Drawing on the apple analogy, think Granny Smith vs. Red Delicious vs. Royal Gala. The differences in the degrees of stationarity relate to the ‘strictness’ of the assumptions. There’s a good paper here.

In practice, most often we are concerned with Intrinsic Stationarity or in some cases Second Order Stationarity. We weaken the requirements for stationarity to allow us to work with incomplete, wide-spaced and sparse data where we are forced to make assumptions about the similarity of the variables we are worried about (and possibly immediately regret that assumption!). The assumptions we impose by our decision of stationarity include:

- The average grade of spatial subsets of the domain is approximately the same. This is the moving windows test. Overlay the volume with a grid of blocks and determine the average (and variance or standard deviation) within each block. Check the block averages (and variances) are approximately the same;
- There is no trend in the data. Using the same overlay of blocks or the raw data itself, check if there is a spatially-associated change in the average grade. That is, does the grade increase or decrease in some particular direction? If it does, your domain is not strictly stationary;
- Check the spatial variance (the variogram or the correlogram). Random sub-samples of the data should result in similar experimental variograms – allowing for the sparsity of sample pairs. Similarity should be assessed in terms of anisotropy, range and sill. Thus, if there is a change in the apparent direction of maximum continuity or if the range/sill change, your domain is not strictly stationary; and
- Look across the domain boundary. What is the nature of the transition from one domain to the next? Is there a sharp change in grade? Is the change gradational or even non-existent? This type of boundary analysis helps us understand the similarity/difference of samples in one or another domain.
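The first two checks in the list above (moving windows and trend) can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the data is synthetic and one-dimensional (samples along a single 400 m line), the function names `window_stats` and `trend_slope` are made up for this example, and the 100 m window size is arbitrary. A real deposit needs 3D blocks and a good deal more care.

```python
import random
import statistics

def window_stats(samples, cell):
    """Group (x, grade) samples into windows of width `cell` along x.
    Returns {window_index: (count, mean, standard deviation)}."""
    cells = {}
    for x, grade in samples:
        cells.setdefault(int(x // cell), []).append(grade)
    return {
        k: (len(v), statistics.mean(v),
            statistics.stdev(v) if len(v) > 1 else 0.0)
        for k, v in sorted(cells.items())
    }

def trend_slope(samples):
    """Least-squares slope of grade against x (grade units per metre)."""
    xs = [x for x, _ in samples]
    gs = [g for _, g in samples]
    mx, mg = statistics.mean(xs), statistics.mean(gs)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (g - mg) for x, g in zip(xs, gs))
    return sxy / sxx

def make_samples(rng, n, drift):
    """Synthetic samples: base grade 2.0, optional linear drift, random noise."""
    out = []
    for _ in range(n):
        x = rng.uniform(0.0, 400.0)
        out.append((x, 2.0 + drift * x + rng.gauss(0.0, 0.3)))
    return out

rng = random.Random(42)                        # fixed seed, repeatable illustration
flat = make_samples(rng, 200, drift=0.0)       # no trend
trended = make_samples(rng, 200, drift=0.005)  # grade rises by ~2 units over 400 m

flat_stats = window_stats(flat, 100.0)
trended_stats = window_stats(trended, 100.0)

for name, stats, data in (("flat", flat_stats, flat),
                          ("trended", trended_stats, trended)):
    print(name, "slope per metre:", round(trend_slope(data), 4))
    for k, (n, mean, sd) in stats.items():
        print("  window", k, "n =", n, "mean =", round(mean, 2), "sd =", round(sd, 2))
```

How different the window means can be before you split the domain is still your call; the code only puts numbers in front of you.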

Here’s where it starts to get really interesting… and somewhat muddy as well.

Stationarity is a property of the model. It’s a decision we (as experts) have made and it impacts on all aspects of the model. Take a look back at those assumptions again and note the ambiguity: phrases like “on average” and “similar”. How similar things must be before we decide to change the domain interpretation is our choice, and we need to make informed decisions. That means we need to have some idea of when similarity matters more and when it matters less. Here’s a starting point: let’s think about what the domain is used for or, to put it a bit more succinctly, what aspects of the model are affected by imposing a decision of stationarity.

## Stationarity matters!

By definition a domain is the spatial set of ‘like’ or ‘similar’ samples. Therefore, all of the inferences we make from those samples are affected by the domain decision. This includes:

- The experimental variogram and the variogram model we derive from the experimental variogram; and
- The samples we select (allow) and assign weights for estimation.

That’s two fairly fundamental aspects.

Firstly, for estimation (and simulation) algorithms based on one or the other flavour of kriging, the variogram model is central. The sample weights and any support correction are derived from the variogram. Change the variogram model – change the estimate. Change your domaining (stationarity) decision – change the estimate.
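To see how a domain decision feeds through to the variogram, here is a minimal sketch. The assumptions are all mine: synthetic samples every 5 m along a single line, two populations with different mean grades and no spatial correlation at all (pure nugget), and a hypothetical helper called `experimental_variogram` that uses the classical estimator (half the average squared difference of pairs in each lag bin).

```python
import random

def experimental_variogram(samples, lag, n_lags):
    """Classical semivariogram estimator for (x, value) samples on a line.
    Returns a list of (mean pair distance, gamma, pair count) per lag bin."""
    sums = [0.0] * n_lags
    dists = [0.0] * n_lags
    counts = [0] * n_lags
    for i in range(len(samples)):
        xi, zi = samples[i]
        for j in range(i + 1, len(samples)):
            xj, zj = samples[j]
            h = abs(xi - xj)
            k = int(h // lag)
            if k < n_lags:
                sums[k] += 0.5 * (zi - zj) ** 2
                dists[k] += h
                counts[k] += 1
    return [(dists[k] / counts[k], sums[k] / counts[k], counts[k])
            for k in range(n_lags) if counts[k] > 0]

rng = random.Random(7)                  # fixed seed, repeatable illustration
xs = [5.0 * i for i in range(80)]       # samples every 5 m from 0 to 395 m

# Two synthetic populations separated at x = 200 m (made-up grades).
domain_a = [(x, 1.0 + rng.gauss(0.0, 0.2)) for x in xs if x < 200.0]
domain_b = [(x, 3.0 + rng.gauss(0.0, 0.2)) for x in xs if x >= 200.0]
pooled = domain_a + domain_b            # the "one big domain" decision

vg_a = experimental_variogram(domain_a, lag=50.0, n_lags=6)
vg_pooled = experimental_variogram(pooled, lag=50.0, n_lags=6)

print("domain A only:")
for h, gamma, n in vg_a:
    print("  h =", round(h, 1), "gamma =", round(gamma, 3), "pairs =", n)
print("A and B pooled:")
for h, gamma, n in vg_pooled:
    print("  h =", round(h, 1), "gamma =", round(gamma, 3), "pairs =", n)
```

Domain A on its own gives a flat, low variogram. Pool the two populations and the variogram climbs with distance, because more and more pairs straddle the boundary. Same samples, different domain decision, different variogram model.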

Secondly, by allowing some samples to be selected during estimation and excluding others we are directly controlling the possible outcomes. Once again, change the domain and that will allow/exclude different data, changing the estimate. This aspect can commonly be observed in domains that are based solely on grade differentiation – the difference in grade across an interpreted domain boundary will be exaggerated during estimation, which may be an artefact rather than a realistic representation.
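One way to check whether a hard boundary is warranted is the boundary analysis mentioned earlier: average the grades in bins of distance either side of the contact and look at the shape of the profile. The sketch below is an illustration only. It assumes synthetic samples on a line with the contact at 200 m, 50 m distance bins, and two hypothetical helpers, `contact_profile` and `contact_jump`.

```python
import random
import statistics

def contact_profile(samples, boundary, width):
    """Mean grade by signed distance from the boundary, in bins of `width`.
    Bin -1 is the first bin on the low-x side, bin 0 the first on the high-x side."""
    bins = {}
    for x, grade in samples:
        k = int((x - boundary) // width)
        bins.setdefault(k, []).append(grade)
    return {k: statistics.mean(v) for k, v in sorted(bins.items())}

def contact_jump(profile):
    """Difference in mean grade between the two bins touching the boundary."""
    return profile[0] - profile[-1]

rng = random.Random(3)                        # fixed seed, repeatable illustration
xs = [2.5 + 5.0 * i for i in range(80)]       # samples at 2.5, 7.5, ..., 397.5 m

# Case 1: a sharp step in grade at the contact (made-up grades).
sharp = [(x, (1.0 if x < 200.0 else 3.0) + rng.gauss(0.0, 0.2)) for x in xs]
# Case 2: a gradational change spread across the whole line.
gradational = [(x, 1.0 + 0.005 * x + rng.gauss(0.0, 0.2)) for x in xs]

sharp_profile = contact_profile(sharp, boundary=200.0, width=50.0)
grad_profile = contact_profile(gradational, boundary=200.0, width=50.0)

print("sharp contact jump:", round(contact_jump(sharp_profile), 2))
print("gradational contact jump:", round(contact_jump(grad_profile), 2))
```

A large jump between the two bins touching the contact supports treating it as a hard boundary. A small jump relative to the overall change suggests that a hard boundary would manufacture a step that isn’t in the rocks.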

So, domains matter. Domains are not the same as the geological interpretation but they are typically based on the geology – plus an expert consideration of statistical similarity (i.e. stationarity). Before jumping straight from the geology to the estimate, take time to consider your domains. Ignore the basis of statistical inference at your peril.

##### Footnote…

Ordinary kriging is a special case when considering domains and stationarity. While the domain decision affects the quality of the variogram model and the samples that are selected, the search ellipse applied during ordinary kriging mitigates poor domain decisions. In ordinary kriging, a subset of the entire domain (as defined by the search) is assigned kriging weights. Thus, during estimation, the stationarity decision applies to the samples within the search ellipse, which is commonly a smaller volume than the entire domain. This is commonly called ‘quasi-stationarity’. This is not the case with simple kriging or most conditional simulation algorithms, where a much stronger stationarity assumption is required. Think about it: in simple kriging a weight is applied to the average grade of the domain. If the domain is not stationary (e.g., there’s a trend) that domain average will be incorrect and will impact on every single block in the model…
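For those who like to see it written down, the two estimators can be compared side by side. The notation here is my own shorthand rather than anything defined above: $Z(u_i)$ are the $n$ samples found by the search, $\lambda_i$ are the kriging weights (different for each method), and $m$ is the domain mean assumed by simple kriging.

$$
Z^{*}_{SK}(u) = \sum_{i=1}^{n} \lambda_i^{SK} Z(u_i) + \Big(1 - \sum_{i=1}^{n} \lambda_i^{SK}\Big)\, m
$$

$$
Z^{*}_{OK}(u) = \sum_{i=1}^{n} \lambda_i^{OK} Z(u_i), \qquad \sum_{i=1}^{n} \lambda_i^{OK} = 1
$$

In simple kriging, whatever weight is not given to the samples goes to $m$, so a poor domain mean leaks into every block. In ordinary kriging the weights are forced to sum to one, $m$ drops out of the estimator, and the mean is in effect re-estimated from the samples inside each search neighbourhood.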

Interesting. I would like to see more on resource estimation/evaluation

I’ve often been confused by the combination of declustering and stationarity. If your samples across the domain are considered to be all samples from the same random variable, what does it matter if you use two that are spatially close together? So why the need for declustering?