Benefits and challenges of using big data in resource estimation
To estimate the properties of a mineral deposit as reliably as possible — especially as economic orebodies around the world become increasingly complex — a geologist must thoroughly understand the deposit as well as the method of emplacement/mineralisation.
And the only way geologists can do that is with sound, dependable data.
Yet traditionally, mining companies have relied solely on two exploratory drilling methods to obtain the physical samples (typically the only working data) that resource geologists use to model and estimate mineralisation:
- diamond drilling, which involves extracting small-diameter cylinders of core rock for analysis
- reverse-circulation drilling, which involves collecting crushed rock cuttings for analysis.
The result is that billion-dollar decisions are based on the physical analysis of a very small amount of material while the bulk of the material to be mined, both overburden/waste and the mineralised orebody itself, remains unexamined.
To ensure higher quality resource estimates, geologists need more data.
The good news
The base data for geological modelling and resource estimation can be classified as either hard data (data that is directly observed and measured), or soft data (which make up the bulk of what’s known as ‘big data’) from other sources.
The good news is that advances in data acquisition technologies mean that a whole new world of soft data – from downhole geophysics, multi- and hyper-spectral core scanning, and more routine collection of geometallurgical parameters – is now available to inform and enhance resource modelling and estimation.
For example, using soft data can help resource geologists detect correlations between variables that might not be immediately obvious from hard data alone, such as a subtle alteration pattern that is evident from hyper-spectral core scans but not in assay results. Including additional geometallurgical-related parameters, such as hardness or grindability, acid consumption, moisture content, or clay minerals, can also:
- highlight potential processing issues or abnormal values that wouldn’t be recognised with a more limited dataset
- help define trend surfaces, such as gradual changes in mean values that can be removed from the data to improve the quality of estimates
- identify variables to be estimated that might not normally be included in the block models that represent the material to be mined.
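As an illustration of the trend-surface point, the sketch below fits a simple linear trend of grade against a single coordinate by least squares and removes it, leaving residuals for estimation; the data and the single-coordinate simplification are purely illustrative.

```python
from statistics import mean

def detrend(coords, grades):
    """Fit a linear trend grade ~ a + b*coord by least squares and return
    (trend function, residuals). Estimation is then carried out on the
    residuals, and the trend is added back afterwards."""
    mc, mg = mean(coords), mean(grades)
    b = sum((c - mc) * (g - mg) for c, g in zip(coords, grades)) / \
        sum((c - mc) ** 2 for c in coords)
    a = mg - b * mc
    trend = lambda c: a + b * c
    residuals = [g - trend(c) for c, g in zip(coords, grades)]
    return trend, residuals
```

A real trend surface would be fitted in two or three dimensions, but the remove-estimate-restore workflow is the same.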
Using geometallurgical and other parameters – through using self-organising maps, for example – also contributes to better domaining of the mineralisation. This is because it allows geologists to consider many more characteristics as they define which volumes of material share similar characteristics and which are distinct.
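To make the self-organising map idea concrete, here is a minimal one-dimensional SOM in plain Python that groups samples by several attributes at once. The attribute values, node count, and learning schedule are all illustrative assumptions, not a production domaining tool.

```python
import math

def train_som(samples, n_nodes=4, epochs=100, lr0=0.5, sigma0=1.0):
    """Minimal 1-D self-organising map. Each node holds a weight vector in
    attribute space; every sample pulls its best-matching node (and that
    node's neighbours on the 1-D grid) towards itself, so neighbouring
    nodes come to represent similar material."""
    dim = len(samples[0])
    # spread the nodes evenly through attribute space to start
    nodes = [[i / (n_nodes - 1)] * dim for i in range(n_nodes)]
    for t in range(epochs):
        lr = lr0 * (1 - t / epochs)                   # decaying learning rate
        sigma = max(sigma0 * (1 - t / epochs), 0.01)  # shrinking neighbourhood
        for s in samples:
            bmu = min(range(n_nodes),
                      key=lambda i: sum((w - x) ** 2 for w, x in zip(nodes[i], s)))
            for i in range(n_nodes):
                h = math.exp(-((i - bmu) ** 2) / (2 * sigma ** 2))
                nodes[i] = [w + lr * h * (x - w) for w, x in zip(nodes[i], s)]
    return nodes

def assign_domain(nodes, sample):
    """Domain of a sample = index of its closest node."""
    return min(range(len(nodes)),
               key=lambda i: sum((w - x) ** 2 for w, x in zip(nodes[i], sample)))
```

Here each sample might be a vector of normalised grade, hardness, and clay content; samples assigned to the same node are candidates for the same domain.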
Adding big data — via techniques such as co-kriging using secondary variables — can help produce estimates that take more localised (at a selected mining unit scale) variations in the mineralisation into account, while still:
- achieving acceptable slope of regression (a standardised measure of the quality of the estimates)
- minimising conditional bias (true value is typically less than the estimate when the estimate is high, and the true value is greater than the estimate when the estimate is low).
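The slope of regression mentioned above can be checked with a few lines of code: regress the true values on the estimates and compare the slope to 1. The helper below is a sketch; in practice the "true" values would come from later, closer-spaced sampling or production reconciliation.

```python
from statistics import mean

def slope_of_regression(true_vals, estimates):
    """Slope of the linear regression of true values on estimates,
    cov(true, est) / var(est). A slope near 1 indicates estimates that are
    close to conditionally unbiased; a slope below 1 means high estimates
    tend to overstate, and low estimates understate, the true values."""
    mt, me = mean(true_vals), mean(estimates)
    n = len(estimates)
    cov = sum((t - mt) * (e - me) for t, e in zip(true_vals, estimates)) / n
    var = sum((e - me) ** 2 for e in estimates) / n
    return cov / var
```

Perfect estimates give a slope of exactly 1; estimates with artificially reduced spread push the slope away from 1.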
If, for reasons of time or money, a mining company chooses not to complete a full analysis of all the attributes of all physical samples, geologists can use big data to fill in (impute) missing values using estimation techniques, proxy formulas, or correlations. Once all desirable attributes are available for each sample, they can then return to more conventional techniques, such as kriging, to produce estimates or simulations — a set of equally probable realisations of the estimates — for all required parameters.
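A minimal sketch of that imputation step, assuming the missing attribute correlates linearly with a proxy measured on every sample. The record fields and the linear relationship are illustrative; a real workflow would also carry the imputed flag through QA/QC and reporting.

```python
from statistics import mean

def fit_proxy(xs, ys):
    """Least-squares fit y ~ a + b*x on samples where both values exist."""
    mx, my = mean(xs), mean(ys)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

def impute_missing(records, proxy, target):
    """Fill in None values of `target` from the fitted proxy relationship,
    flagging each imputed record so downstream users can tell estimated
    values apart from measured ones."""
    complete = [r for r in records if r[target] is not None]
    a, b = fit_proxy([r[proxy] for r in complete],
                     [r[target] for r in complete])
    for r in records:
        if r[target] is None:
            r[target] = a + b * r[proxy]
            r["imputed"] = True
    return records
```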
Incorporating big data into the resource modelling and estimation workflow increases resource geologists' ability to:
- highlight areas of higher risk (with, for example, elevated levels of deleterious elements or material with potential processing problems) that could be subject to additional environmental or social considerations
- adopt the industry best practice scorecard approach to the classification of the Mineral Resource estimates (from low to high confidence: Inferred, Indicated and Measured)
- improve mine site safety by identifying zones that might have poor ground conditions or otherwise require a change to standard mining practices (and so introduce non-standard or unexpected behaviour).
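A scorecard of the kind mentioned above can be as simple as averaging per-criterion confidence ratings and mapping the result to a category. The criteria names and cut-off values below are illustrative assumptions only, not a reporting-code standard:

```python
def classify_block(scores):
    """Hypothetical scorecard: `scores` holds 0-1 confidence ratings for
    criteria such as drill spacing, QA/QC and geological continuity. The
    average maps to a Mineral Resource confidence category; the cut-offs
    here are purely illustrative."""
    total = sum(scores.values()) / len(scores)
    if total >= 0.8:
        return "Measured"
    if total >= 0.5:
        return "Indicated"
    return "Inferred"
```

The value of the approach is that every rating, and hence the classification, is traceable back to the underlying data.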
The bad news
At the same time, however, all this additional information can result in an overload of big data, potentially many terabytes in size, that might also have varying degrees of accuracy and which must be separately validated before it can be used. That validation can add substantial time and effort, since the new, non-traditional data must be made to work with — and be stored and visualised alongside — the traditional physical drilling information, such as lithologies and assays, typically found in a geologist’s resource database. Finally, big data makes any move towards automatic modelling and simulation a complex, processing-intensive task.
So what should mining companies do?
With clear advantages to using big data, despite what it demands in time and effort, miners need to keep in mind that a poor dataset will always produce a poor estimation. A good dataset, taking into account all available data, will produce an estimation that is more statistically sound, with clearly defined reasoning behind each of the decisions made along the way.
To make the best use of all available data, mining companies should consider how they want to address four specific challenges:
1. Data storage
The process of acquiring, validating, and analysing the base data for resource estimation is time consuming and expensive, which means miners must consider the value of the information and knowledge derived from that data when determining how they will store it.
They must also decide how long to store it for: it may take years or even decades before a company makes the decision to mine, while the mining operation itself can take place over decades, so the lifecycle of the data is also long. But even data that is decades old can remain valid and useful for analysis/modelling if it is appropriately stored and, most importantly, still available.
Currently, however, geologists often store the initial data they collect during the exploration phase on a laptop, which both limits access to this data by other project teams and increases the risk that the data — and its potential value — could be lost at any time if a geologist changes roles or the device is retired.
2. Multiple sources
Geologists need to be able to retrieve and use information easily, but the sheer range of data now available for geological modelling and resource estimation can make that difficult.
Today, base data comes in a wide variety of types: lab results supplied directly from Laboratory Information Management Systems (LIMS); descriptions of the diamond drill core from which the physical samples were extracted; readings from hand-held/portable X-ray fluorescence (XRF) analysers; and data historians that record drilling penetration rates. On top of all this sits the metadata: additional details, such as the time of day the data was collected and the person, company, or piece of equipment that collected it, that are vital for confirming whether the data is in its original form, has been manipulated or adjusted, or is a calculated average.
This leads to a dataset made up of a diverse collection of text files, Excel spreadsheets, and resource models in proprietary binary formats, alongside data stored in geoscientific information management software packages and core scans that alone can run to terabytes, with much of it collected at different times and by different people and equipment.
3. Data lifecycle
Large amounts of data from multiple sources, acquired over many years, increase the challenges of both data domaining (dividing the rock mass into volumes with similar characteristics that are distinct from each other) and the Mineral Resource classification process.
Resource geologists must now consider the lifecycle of the data used in resource classification and find a way to accommodate and flag drilling results and other data with lower confidence (or which failed the validation test), while not losing portions of that dataset, such as lithological/structural interpretations, that could still be used for resource modelling purposes.
Also, as more data is collected, geologists may deem historical data with no or inappropriate quality assurance/quality control (QA/QC) less reliable for use in mineral resource estimation, and must have a way to incorporate this finding into the database to ensure only the highest quality data is used. For example, if newer, more accurate collar/downhole surveys or laboratory analysis methods identify weaknesses in previously collected data, that new data could make the use of historical data (such as lithological contact positions or assay information) inappropriate, depending on how the data is used in the resource definition and estimation process.
The same might happen with biased historical data. Bias usually only becomes apparent and downgrades confidence in the data after a considerable period of time. It is crucial to maintain all metadata so that the data does not have to be revalidated before it is used in each resource update cycle.
4. Database management
To properly manage a resource estimation dataset that includes an array of big data, resource geologists need to be able to:
- Discriminate between hard and soft data and any metadata that also needs to be included in the resource dataset, and to store their reasons for considering the data suitable for estimation or not.
- Maintain the integrity of the resource dataset to ensure that the level of confidence (low to high) in the data can be used to appropriately:
- classify the confidence level of the resource estimates
- determine the risk profile of the decisions based on those estimates.
- Control access to the database to:
- ensure that only validated and approved data (as opposed to raw data on which the QA/QC has not been verified) is used in the resource estimation process
- identify where other data has been confirmed as suitable only for modelling the geology (such as the extent of the mineralised lithologies) as opposed to estimating the mineral content or other properties of the material to be mined, including waste/overburden.
- Provide proof of a strong chain of custody for all data that will confirm, for example, that assay data has not been manipulated. This proof will increase confidence in the estimates during external independent reviews, and illustrate that the dataset is being well governed — a vital consideration for financing.
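One common way to demonstrate a chain of custody is a hash chain: each record's hash incorporates the previous record's hash, so any later edit invalidates everything downstream. A toy sketch, with purely illustrative record fields:

```python
import hashlib
import json

def build_chain(records):
    """Return one SHA-256 hash per record, where each hash covers the
    record's content plus the previous hash, forming a tamper-evident chain."""
    chain, prev = [], "0" * 64
    for rec in records:
        payload = json.dumps(rec, sort_keys=True) + prev
        prev = hashlib.sha256(payload.encode()).hexdigest()
        chain.append(prev)
    return chain

def verify_chain(records, chain):
    """True only if no record has been altered since the chain was built."""
    return build_chain(records) == chain
```

An external reviewer only needs the stored chain to confirm that, for example, assay values have not been changed since capture.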
The future is in the cloud
While these are significant challenges, they are not insurmountable, and the future for resource modelling and estimation is, in my opinion, in the cloud.
A cloud-based platform:
- removes storage limitations and allows for on-demand access to both data and processing power
- ensures high availability, which can replace current back-up and disaster recovery processes, except for those that are time-sensitive and can affect mining/production
- provides a central location to store and share all data, with sufficient on-demand computing resources available to accommodate repeatable workflows rather than a collection of independent, difficult-to-back-up processes run on separate devices with limited processing power
- offers the option of using standardised workflows to capture deep specialist knowledge, which then becomes permanently retained, role-based knowledge
- enables processes that depend on access to powerful computing resources to be run more efficiently, both in time and cost, than using local machines with limited capabilities — for example, with a cloud-based computing platform, it becomes much more feasible to routinely undertake valuable studies when new data becomes available, such as simulating variations:
- in the geology model and the resource estimates, and then building multiple mine plans based on these variations, or
- in the beneficiation process when handling ore with differing chemical characteristics or ratios of ore types
- makes it possible to:
- quickly incorporate artificial intelligence and machine learning techniques in workflows to automate a number of time-consuming and/or repetitive tasks
- construct workflows to produce financial models that incorporate much more of the inherent variability of the mineralisation than those based on average assumptions, making true risk-based decisions possible through robust confidence intervals on key metrics such as Net Present Value or Internal Rate of Return.
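As a sketch of that last point, the snippet below simulates equally probable grade realisations, converts each into yearly cashflows, and reports a 5th to 95th percentile interval on NPV rather than a single average-case number. Every figure used (grade distribution, revenue factor, costs, discount rate) is a made-up illustration:

```python
import random

def npv(cashflows, rate):
    """Net present value of yearly cashflows at the given discount rate."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cashflows, start=1))

def npv_interval(n_sims=2000, seed=1):
    """Monte Carlo over hypothetical grade realisations: each run draws a
    grade (g/t) for each of five years, turns it into a cashflow ($M),
    and discounts at 8%; the 5th and 95th percentile NPVs bound the
    likely range."""
    rng = random.Random(seed)
    results = sorted(
        npv([rng.gauss(1.5, 0.3) * 40 - 35 for _ in range(5)], 0.08)
        for _ in range(n_sims)
    )
    return results[int(0.05 * n_sims)], results[int(0.95 * n_sims)]
```

The width of the resulting interval, not just its midpoint, is what supports a genuinely risk-based investment decision.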
In short, by being able to store, process, integrate, share, and display all available data types required for high-quality resource modelling and estimation, a cloud-based platform will contribute to overall improved orebody knowledge and understanding of the controls on mineralisation.
This in turn will result in significant downstream benefits, including better blending of material for processing, more consistent plant throughput, and ultimately, most importantly, higher product quality and increased profit.
Find out more
Join the GEOVIA User Community, by Dassault Systèmes, to read about industry topics from experts, be the first to know about new product releases and product tips and tricks, and share information and questions with your peers. All industry professionals are welcome to learn, engage, discover and share knowledge to shape a sustainable future of mining.