University of Nottingham School of Chemical, Environmental, and Mining Engineering
APPLICATION OF ARTIFICIAL NEURAL NETWORK SYSTEMS TO GRADE ESTIMATION
FROM EXPLORATION DATA
by
Ioannis K. Kapageridis M.Sc.
Thesis submitted to the University of Nottingham for the Degree of Doctor of Philosophy
October 1999
Abstract
Artificial Neural Networks (ANNs) have become increasingly popular within the resources
industry. ANN technology offers solutions to problems characterised by a shortage
or poor quality of input data. A purpose of this research work is to show that the
estimation of ore grades within a mineral deposit is one such problem, and one to which
ANNs can be applied successfully.
Ore grade is one of the main variables that characterise an orebody. Almost
every mining project begins with the determination of ore grade distribution in three-
dimensional space, a problem often reduced to modelling the spatial variability of ore
grade values. At the early stages of a mining project, the distribution of ore grades
has to be determined to enable the calculation of ore reserves within the deposit and
to aid the planning of mining operations throughout the entire life of a mine. The
estimation of ore grades/reserves is a very important and costly stage of a mining
project, and the profitability of the project often depends on the results of
grade estimation.
For the last three decades the mining industry has adopted and applied
geostatistics as the main solution to problems of evaluation of mineral deposits.
Geostatistics provides powerful tools for modelling most aspects of an ore
deposit. However, geostatistics and other, more conventional methods require many
assumptions and considerable knowledge, skill and time to be applied effectively,
while their results are not always easy to justify.
The work undertaken in the AIMS Research Unit at the University of Nottingham
aimed at assessing the suitability of ANN systems for the problem of ore grade
estimation and at developing a complete ANN-based system that handles real
exploration data in order to provide ore grade estimates. GEMNET II is a modular
neural network system designed and developed by the author to receive 3D
exploration data from an orebody and perform ore grade estimation on a block
model basis. The aims of the system are to provide a valid alternative to
conventional grade estimation techniques while considerably reducing the time
and knowledge required for development and application.
Affirmation

The following papers have been published based on the research presented in this thesis:

Kapageridis I., Denby B. Ore grade estimation with modular neural network systems – a case study. In: Panagiotou G (ed) Information technology in the minerals industry (MineIT ’97). AA Balkema, Rotterdam, 1998.

Kapageridis I., Denby B. Neural network modelling of ore grade spatial variability. In: Proceedings of the International Conference on Artificial Neural Networks (ICANN 98), Vol. 1, pp 209–214, Springer-Verlag, Skovde, 1998.

Kapageridis I., Denby B., Hunter G. Integration of a neural ore grade estimation tool in a 3D resource modeling package. In: Proceedings of the International Joint Conference on Neural Networks (IJCNN ’99), International Neural Network Society and the Neural Networks Council of IEEE, Washington D.C., 1999.

Kapageridis I., Denby B., Schofield D., Hunter G. GEMNET II – a neural ore grade estimation system. In: 29th International Symposium on the Application of Computers and Operations Research in the Minerals Industries (APCOM ’99), Denver, Colorado, 1999.

Kapageridis I., Denby B., Hunter G. Ore grade estimation and artificial neural networks. Mineral Wealth Journal, Jul.–Sep. 99, No. 112, The Scientific Society of the Mineral Wealth Technologists, Athens.

Kapageridis I., Denby B. Ore grade estimation using artificial neural networks. In: 2nd Regional VULCAN Conference, Maptek/KRJA Systems, Nice, 1999.
Acknowledgements

I would like to thank Professor Bryan Denby for his guidance and help through the
duration of my studies at the University of Nottingham. I would also like to thank
him for introducing me to the exciting world of the AIMS Research Unit.
Thanks should go to everyone at the AIMS Research Unit, people who have
been there and others who still are, and who made it all so much easier. Special
thanks to Dr. Damian Schofield for being such a good friend and teacher, and also for
sharing his music CD collection with me.
A big thank you goes to the State Scholarships Foundation of Greece for
making it all possible. Their investment in me was most appreciated.
Many thanks to everyone at the Nottingham office of Maptek/KRJA Systems
for the help and support over the last year of my studies. In particular, I would like to
thank Dr. Graham Hunter, David Muller, and Les Neilson for their help and advice.
Finally, I would like to thank all my friends and in particular David Newton,
Marina Lisurenko, and Stefanos Gazeas for their support and for some unforgettable
times in Nottingham.
Contents

ABSTRACT i
AFFIRMATION iii
ACKNOWLEDGEMENTS iv
CONTENTS v
LIST OF FIGURES viii
LIST OF TABLES xiii

1. INTRODUCTION 1
1.1 THE PROBLEM OF GRADE ESTIMATION 1
1.2 GRADE DATA FROM EXPLORATION PROGRAMS 3
1.3 EXISTING METHODS FOR GRADE ESTIMATION 7
1.3.1 General 7
1.3.2 Geometrical Methods 7
1.3.3 Inverse Distance Method 10
1.3.4 Geostatistics 12
1.3.5 Conclusions 15
1.4 BLOCK MODELLING & GRID MODELLING IN GRADE ESTIMATION 16
1.5 ARTIFICIAL NEURAL NETWORKS FOR GRADE ESTIMATION 18
1.6 RESEARCH OBJECTIVES 19
1.7 THESIS OVERVIEW 20

2. ARTIFICIAL NEURAL NETWORKS THEORY 23
2.1 INTRODUCTION 23
2.1.1 Biological Background 23
2.1.2 Statistical Background 25
2.1.3 History 27
2.2 BASIC STRUCTURE – PRINCIPLES 29
2.2.1 The Artificial Neuron – the Processing Element 29
2.2.2 The Artificial Neural Network 31
2.3 LEARNING ALGORITHMS 33
2.3.1 Overview 33
2.3.2 Error Correction Learning 33
2.3.3 Memory Based Learning 35
2.3.4 Hebbian Learning 35
2.3.5 Competitive Learning 36
2.3.6 Boltzmann Learning 37
2.3.7 Self-Organized Learning 39
2.3.8 Reinforcement Learning 40
2.4 MAJOR TYPES OF ARTIFICIAL NEURAL NETWORKS 40
2.4.1 Feedforward Networks 40
2.4.2 Recurrent Networks 42
2.4.3 Self-Organizing Networks 43
2.4.4 Radial Basis Function Networks and Time Delay Neural Networks 44
2.4.5 Fuzzy Neural Networks 46
2.5 CONCLUSIONS 48

3. RADIAL BASIS FUNCTION NETWORKS 23
3.1 INTRODUCTION 23
3.2 RADIAL BASIS FUNCTION NETWORKS – THEORETICAL FOUNDATIONS 24
3.2.1 Overview 24
3.2.2 Multivariable Interpolation 24
3.2.3 The Hyper-Surface Reconstruction Problem 26
3.2.4 Regularisation 28
3.3 RADIAL BASIS FUNCTION NETWORKS 31
3.3.1 General 31
3.3.2 RBF Structure 31
3.3.3 RBF Initialisation and Learning 32
3.4 FUNCTION APPROXIMATION WITH RBFNS 39
3.4.1 General 39
3.4.2 Universal Approximation 39
3.4.3 Input Dimensionality 40
3.4.4 Comparison of RBFNs and Multi-Layer Perceptrons 41
3.5 SUITABILITY OF RBFNS FOR GRADE ESTIMATION 42

4. MINING APPLICATIONS OF ARTIFICIAL NEURAL NETWORKS 71
4.1 OVERVIEW 71
4.2 ANN SYSTEMS FOR EXPLORATION AND RESOURCE ESTIMATION 72
4.2.1 General 72
4.2.2 Sample Location Based Systems 73
POPULATIONS 79
4.2.3 Sample Neighborhood Based Systems 80
4.2.4 Conclusions 85
4.3 ANN SYSTEMS FOR OTHER MINING APPLICATIONS 86
4.3.1 Overview 86
4.3.2 Geophysics 86
4.3.3 Rock Engineering 89
4.3.4 Mineral Processing 89
4.3.5 Remote Sensing 91
4.3.6 Process Control-Optimisation and Equipment Selection 93
4.4 CONCLUSIONS 94

5. DEVELOPMENT OF A MODULAR NEURAL NETWORK SYSTEM FOR GRADE ESTIMATION 96
5.1 INTRODUCTION 96
5.2 FORMING THE INPUT SPACE FROM 2D SAMPLES 98
5.3 DEVELOPMENT OF THE NEURAL NETWORK TOPOLOGIES 106
5.3.1 Overview 106
5.3.2 The Hidden Layer 107
5.3.3 Final Weights and Output 110
5.4 LEARNING FROM 2D SAMPLES 111
5.4.1 Overview 111
5.4.2 Module 1 – Learning from Octants 112
5.4.3 Module 2 – Learning from Quadrants 115
5.4.4 Module 3 – Learning from Sample 2D Co-ordinates 117
5.5 TRANSITION FROM 2D TO 3D DATA 120
5.5.1 General 120
5.5.2 Input Space: Adding the Third Co-ordinate 121
5.5.3 Input Space: Adding the Sample Volume 122
5.5.4 Search Method: Expanding to Three Dimensions 123
5.6 COMPLETE PROTOTYPE OF THE MNNS 126
5.7 CONCLUSIONS 129

6. CASE STUDIES OF THE PROTOTYPE MODULAR NEURAL NETWORK SYSTEM 131
6.1 OVERVIEW 131
6.2 CASE STUDY 1 – 2D IRON ORE DEPOSIT 133
6.3 CASE STUDY 2 – 2D COPPER DEPOSIT 136
6.4 CASE STUDY 3 – 3D GOLD DEPOSIT 140
6.5 CASE STUDY 4 – 3D CHROMITE DEPOSIT 146
6.6 CONCLUSIONS 149

7. GEMNET II – AN INTEGRATED SYSTEM FOR GRADE ESTIMATION 150
7.1 OVERVIEW 150
7.2 CORE ARCHITECTURE AND OPERATION 152
7.2.1 Exploration Data Processing and Control Module 152
7.2.2 Module Two – Modeling Grade’s Spatial Distribution 159
7.2.3 Module One – Modelling Grade’s Spatial Variability 162
7.2.4 Final Module – Providing a Single Grade Estimate 164
7.3 VALIDATION 167
7.3.1 Training and Validation Errors 167
7.3.2 Reliability Indicator 168
7.3.3 Module Index 170
7.3.4 RBF Centres Visualisation 171
7.4 INTEGRATION 172
7.4.1 Neural Network Simulator 172
7.4.2 Interface with VULCAN – 3D Visualization 176
7.5 CONCLUSIONS 182

8. GEMNET II APPLICATION – CASE STUDIES 185
8.1 OVERVIEW 185
8.2 CASE STUDY 1 – COPPER/GOLD DEPOSIT 1 188
8.3 CASE STUDY 2 – COPPER/GOLD DEPOSIT 2 197
8.4 CASE STUDY 3 – COPPER/GOLD DEPOSIT 3 209
8.5 CASE STUDY 4 – COPPER/GOLD DEPOSIT 4 220
8.6 CONCLUSIONS 226

9. CONCLUSIONS AND FURTHER RESEARCH 185
9.1 CONCLUSIONS 185
9.2 FURTHER RESEARCH 188

APPENDIX A – FILE STRUCTURES 239
A1. SNNS NETWORK DESCRIPTION FILE 239
A2. SNNS NETWORK PATTERN FILE 241
A3. BATCHMAN NETWORK DEVELOPMENT SCRIPT 242
A4. SNNS2C NETWORK C CODE EXTRACT 243
A5. VULCAN COMPOSITES FILE 247

APPENDIX B – CASE STUDY DATA 254
B1. CASE STUDY 1 – 2D IRON ORE DEPOSIT 254
B2. CASE STUDY 2 – 2D COPPER DEPOSIT 254
B3. CASE STUDY 3 – 3D GOLD DEPOSIT 246
B4. CASE STUDY 4 – 3D CHROME DEPOSIT 246

REFERENCES 253
List of Figures

Chapter 1
Figure 1.1: Drillholes from exploration programme and development, intersecting the orebody
(coloured by gold assays – screenshot from VULCAN Envisage). 4
Figure 1.2: Compositing of drillhole samples using interval equal to sample length. 6
Figure 1.3: Polygonal method of ore grade estimation. 8
Figure 1.4: Triangular method of ore grade estimation. 9
Figure 1.5: Search ellipse used during selection of samples for ore grade estimation. 12
Figure 1.6: Frequency histogram (left) and variogram (right) of copper grades (percentages). 15
Figure 1.7: Grid modeling as visualised in an advanced 3D graphics environment. 17
Figure 1.8: Sections through a block model intersecting the orebody. 18
Chapter 2
Figure 2.1: Illustration of a typical neuron [13]. 25
Figure 2.2: Propagation of an action potential through a neuron’s axon [13]. 26
Figure 2.3: The five major models of computation as they were presented six decades ago [18]. 29
Figure 2.4: Structure of the processing element [32]. 30
Figure 2.5: Effect of bias on the input to the activation function (induced local field) [32]. 31
Figure 2.6: Common activation functions: (a) unipolar threshold, (b) bipolar threshold, (c)
unipolar sigmoid, and (d) bipolar sigmoid [33]. 32
Figure 2.7: Basic structure of a layered ANN [32]. 33
Figure 2.8: Structure of the feedforward artificial neural network. There can be more than one
middle or hidden layers [33]. 42
Figure 2.9: a) Recurrent network without self-feedback connections, b) recurrent network
with self-feedback connections [32]. 44
Figure 2.10: Structure of a two-dimensional Self-Organising Map [32]. 45
Figure 2.11: Basic structure of the Radial Basis Function Network [33]. 46
Figure 2.12: The concept of Time Delay Neural Networks for speech recognition [40]. 47
Figure 2.13: An approach to FNN implementation [44]. 49
Chapter 3
Figure 3.1: Regularisation network [32]. 58
Figure 3.2: Structure of generalised RBF network [32]. 61
Figure 3.3: Illustration of input space dissection performed by the RBF and MLP networks [69]. 70
Chapter 4
Figure 4.1: ANN for ore grade/reserve estimation by Wu and Zhou [73]. 77
Figure 4.2: General structure of the AMAN neural system. 80
Figure 4.3: Back-propagation network used in the NNRK hybrid system. 82
Figure 4.4: Drillhole data used for testing the performance of the NNRK system. 83
Figure 4.5: 2D approach of learning from neighbour samples arranged on a regular grid. 85
Figure 4.6: Modular network approach implemented in the GEMNet system [84]. 86
Figure 4.7: Scatter diagram of GEMNet estimates on a copper deposit [84]. 87
Figure 4.8: Contour maps of GEMNet reliability indicator and grade estimates of a copper
deposit [84]. 88
Figure 4.9: Back-propagation network used for lateral log inversion [86]. Connections between
layers are not shown. 91
Figure 4.10: Estimated grades and assays (red and blue) vs. actual (black) [89]. 92
Chapter 5
Figure 5.1: Illustration of quadrant and octant search method (special case where only one
sample is allowed per sector). Respective grid nodes are also shown. 104
Figure 5.2: Estimation results from neural network architecture developed for use with gridded
data. The use of irregular data has an obvious effect in the performance of the system. 105
Figure 5.3: Neural network architectures receiving inputs from a quadrant search (left) and from
an octant search (right). 106
Figure 5.4: Improvement in estimation by the introduction of the neighbour sample distance in
the input vector. 108
Figure 5.5: Modular neural network architecture developed for ore grade estimation from 2D
samples [113]. 110
Figure 5.6: Partitioning of the original dataset into three parts each one targeted at a different
module of the MNNS. 115
Figure 5.7: RBF network used as part of module 1 in MNNS. Training patterns from an octant
search were used to train the network. 117
Figure 5.8: Posting of the basis function centres from the RBF network of Fig. 5.7 in the
normalised input space (X-Grade, Y-Distance). 118
Figure 5.9: Graph showing the learned relationship between the network’s inputs (grade and
distance of neighbour sample) and the network’s output (target grade) for the RBF
network of Fig. 5.7. 119
Figure 5.10: Example of an RBF network from Module 2. 120
Figure 5.11: Posting of the basis function centres from the RBF network of Fig. 5.10 in the
normalised input space (X-Grade, Y-Distance). 121
Figure 5.12: Graph showing the learned relationship between the network’s inputs (grade and
distance of neighbour sample) and the network’s output (target grade) for the RBF
network of Fig. 5.10. 121
Figure 5.13: Module 3 MLP network trained on sample co-ordinates. 122
Figure 5.14: Learned mapping between sample co-ordinates (easting and northing) and sample
ore grade for MLP network of Module 3. 124
Figure 5.15: 3D version of quadrant search. 127
Figure 5.16: 3D version of octant search. 128
Figure 5.17: Simplified 3D search method used in the MNNS for sample selection. 129
Figure 5.18: Diagram showing the structure of the MNNS for 3D data (units are the neural
network modules). 130
Figure 5.19: Learned weighting of outputs from module one RBF networks by the RBF of
module two. 131
Figure 5.20: Learned relationships between sample co-ordinates, length (inputs) and sample
grade (output) from the RBF network of module three. 133
Chapter 6
Figure 6.1: Posting of input/training samples (blue) and test samples (red) from the iron ore
deposit. 138
Figure 6.2: Scatter diagram of actual vs. estimated iron ore grades. 139
Figure 6.3: Iron ore grade distributions – actual and estimated. 140
Figure 6.4: Contour maps of iron ore actual and estimated grades. 141
Figure 6.5: Posting of input/training samples (blue) and test samples (red) from the copper
deposit. 142
Figure 6.6: Scatter diagram of actual vs. estimated copper grades. 143
Figure 6.7: Copper grade distributions – actual and estimated. 144
Figure 6.8: Contour maps of copper actual and estimated grades. 145
Figure 6.9: 3D view of the orebody and drillhole samples used in the 3D gold deposit study. 147
Figure 6.10: Scatter diagram of actual vs. estimated gold grades. 148
Figure 6.11: Gold grade distributions – actual and estimated. 149
Figure 6.12: Gold grades distribution of the complete dataset. 150
Figure 6.13: Drillholes from a 3D chromite deposit. 151
Figure 6.14: Scatter diagram of actual vs. estimated chromite grades. 153
Figure 6.15: Chromite grade distributions – actual and estimated. 153
Chapter 7
Figure 7.1: Simplified block diagram showing the operational steps of the data processing and
control module in GEMNET II. 159
Figure 7.2: Normalisation information panel. 160
Figure 7.3: Interaction between GEMNET II and other parts of the integrated system during
operation of the data processing and control module. 165
Figure 7.4: RBF centres from second module located in 3D space. Drillholes and modelled
orebody are also shown. 168
Figure 7.5: RBF centres of west sector RBF network and respective training samples in the
input pattern hyperspace (X-Grade, Y-Distance, Z-Length). 170
Figure 7.6: Final module’s RBF network. 172
Figure 7.7: Block model coloured by the reliability indicator in GEMNET II. 176
Figure 7.8: Block model coloured by module index in GEMNET II. Cyan blocks represent first
module estimates while red blocks represent second module estimates. 177
Figure 7.9: First module RBF centres visualisation in GEMNET II. Drillholes and orebody
model are also shown. 178
Figure 7.10: Diagram of the main components of SNNS. 179
Figure 7.11: Modules and extensions of VULCAN. 185
Figure 7.12: Menu structure of GEMNET II in Envisage. 186
Figure 7.13: GEMNET II panels in Envisage. 187
Figure 7.14: Console window with messages from GEMNET II operation. 188
Figure 7.15: GEMNET II online help. 189
Chapter 8
Figure 8.1: Orebody and drillholes from copper/gold deposit 1. 195
Figure 8.2: Scatter diagram of actual vs. estimated copper grades from copper/gold deposit 1. 196
Figure 8.3: Copper grade distributions from copper/gold deposit 1. 197
Figure 8.4: Scatter diagram of actual vs. estimated gold grades from copper/gold deposit 1. 198
Figure 8.5: Gold grade distributions from copper/gold deposit 1. 199
Figure 8.6: Plan section (top) and cross section (bottom) of block model coloured by reliability
indicator values for the gold grade estimation of copper/gold deposit 1. 200
Figure 8.7: Plan section (top) and cross section (bottom) of block model coloured by module
index for gold and copper grade estimation of copper/gold deposit 1. 201
Figure 8.8: RBF centre locations and training patterns from module 1 networks, north (top)
and east (bottom). 202
Figure 8.9: Plan section (top) and cross section (bottom) of block model coloured by gold grade
estimates for the copper/gold deposit 1. 203
Figure 8.10: Orebodies and drillholes from copper/gold deposit 2. 205
Figure 8.11: Scatter diagram of actual vs. estimated gold grades from zone TQ1 of copper/gold
deposit 2. 208
Figure 8.12: Gold grade distributions from zone TQ1 of copper/gold deposit 2. 208
Figure 8.13: Scatter diagram of actual vs. estimated gold grades from zone TQ1A of copper/
gold deposit 2. 209
Figure 8.14: Gold grade distributions from zone TQ1A of copper/gold deposit 2. 209
Figure 8.15: Scatter diagram of actual vs. estimated gold grades from zone TQ2 of copper/gold
deposit 2. 210
Figure 8.16: Gold grade distributions from zone TQ2 of copper/gold deposit 2. 210
Figure 8.17: Scatter diagram of actual vs. estimated gold grades from zone TQ3 of copper/gold
deposit 2. 211
Figure 8.18: Gold grade distributions from zone TQ3 of copper/gold deposit 2. 211
Figure 8.19: Plan section (top) and cross section (bottom) of block model coloured by reliability
indicator values for the gold grade estimation of copper/gold deposit 2. 213
Figure 8.20: Plan section (top) and cross section (bottom) of block model coloured by module
index for gold and copper grade estimation of copper/gold deposit 2. 214
Figure 8.21: RBF centre locations and training patterns from module 1 network north (top)
and module 2 network (bottom) in copper/gold deposit 2. 215
Figure 8.22: Plan section (top) and cross section (bottom) of block model coloured by gold
grade estimates for the copper/gold deposit 2. 216
List of Tables
Chapter 4
Table 4.1: Comparison of NNRK, ANN, and kriging estimates. 83
Chapter 5
Table 5.1: Learning strategy for Module 3 MLP network. 123
Chapter 6
Table 6.1: Characteristics of datasets from the MNNS case studies. 137
Table 6.2: Mean absolute errors from case study 1. 139
Table 6.3: Mean absolute errors from case study 2. 143
Table 6.4: Actual and estimated average gold grades. 147
Table 6.5: Mean absolute errors from case study 3. 148
Table 6.6: Actual and estimated average chromite grades. 152
Table 6.7: Mean absolute errors from case study 4. 152
Chapter 7
Table 7.1: System variables available in BATCHMAN. 181
Chapter 8
Table 8.1: Main characteristics of the four deposits used for testing the final GEMNET II
architecture. 193
Table 8.2: Statistics of data from copper/gold deposit 1. 196
Table 8.3: Actual and estimated average copper and gold grades from copper/gold deposit 1. 199
Table 8.4: Samples and block model file information and training pattern generation results for
copper/gold deposit 2. 206
Table 8.5: Statistics from copper/gold deposit 2 and estimation performance results. 207
1. Introduction
1.1 The Problem of Grade Estimation

Grade estimation is one of the most complicated aspects of mining. It also happens to
be one of the most important. The complexity of grade estimation originates from
scientific uncertainty, common to similar engineering problems, and the necessity for
human intervention. The combination of scientific uncertainty and human judgement
is common to all grade estimation procedures regardless of the chosen methodology.
In statistical terms, grade estimation is a problem of prediction. Geoscientists
are given a set of samples from which they need to construct a quantitative model of
an orebody’s grade by interpolating and extrapolating between those samples. These
geoscientists may come from very different fields, such as geology, mathematics
and statistics. The quantitative model they construct should ideally take into
consideration the qualitative model of the orebody built by the geologists
interpreting the exploration data.
The amount of data available to support the grade estimation process is
usually very small compared with the amount of information that has to be extracted
from it. The data also occupy a very small volume in 3D space compared with the
volume of the orebody that undergoes grade estimation. Their quality depends on a
number of processes that involve human interaction and allow measurement errors to
be introduced at the early stages of sampling, analysing and logging. It should also
be noted that exploration data are usually very expensive.
Various methods have been developed for performing grade estimation.
Generally, these methods can be classified into three categories: geometrical,
distance-based and geostatistical. Certain assumptions are inherent to each of these
methods, while most of them depend on human judgement and allow for the
introduction of human errors. These assumptions mainly concern the spatial
distribution characteristics of grade, such as its continuity in different directions in
space. It would be an understatement to say that a large proportion of the people who
apply these methods do not understand or take these assumptions into consideration.
Especially in the case of geostatistics, because of the built-in complexity of the
methodology, people tend to overlook the significance of these assumptions or
underestimate the negative effects that any misjudgements might have. As a result,
mining projects often begin with ‘great expectations’ that may never become reality.
Over- or underestimation of grades is only one of many unforgiving consequences of a
wrong choice and application of grade estimation methods.
In recent years, many researchers in the field of grade/reserve estimation
have noticed these problems and tried to suggest possible alternatives. Some
have tried to prove that the assumptions inherent in geostatistics cannot be valid most
of the time and that other methods should therefore be considered. However, these
discussions commonly concentrate more on discrediting geostatistics and other
established methodologies than on progressing towards a new and valid method.

It seems to be a common belief that the geostatistical methodology has created
a special league of people who understand the underlying mechanisms and theory.
Unfortunately, these people are a minority of the scientists and engineers who are
asked to provide the grade estimates on which large amounts of investment
money will be spent. In most cases people misuse geostatistics, or avoid them
completely even though they could benefit from their use. Many geologists build
their own picture of the orebody in their minds using their experience and even their
instincts. They ‘develop’ their own methods of estimation by adjusting less advanced
methods to the exploration data at the early stages of a mining project. What is even
more unfortunate is that they continue to build confidence in those early models of
the orebody, which inevitably leads them to the difficult position of not being
able to fit new data coming from the mine into their model.
There are too many examples of successful application of geostatistics and
other existing methods for one to disregard them completely. In the case of
geostatistics specifically, this success cannot be credited to luck because, as will be
discussed later, it is a painstaking and time-consuming process that leaves no room
for mistakes or misjudgements. Careful choice of a method and careful application of
that method to exploration data can therefore produce reliable results. As already
discussed, though, the current methods for grade estimation, and particularly
geostatistics, require a large amount of knowledge and skill to be applied effectively.
They can be very time consuming and difficult to explain to the people who make
investment decisions. Their results depend on the skills and experience of the
modeller and on the quality of the exploration data, and they can be prone to errors
when handling data that do not follow the necessary assumptions. The next section
briefly discusses the exploration data used during grade estimation in order to
explain the potential problems they can cause in this process.
1.2 Grade Data from Exploration Programs

Drilling is the most common way of entering the 3D space beneath the ground surface
to extract samples from the underlying rock, although other methods exist, such as
the construction of shafts and tunnels.
Based on the samples obtained, the geologist will draw conclusions as to the presence
of a mineralised body. Economics usually dictates the maximum number of drillholes,
although this is also controlled by the complexity of the geological environment.
There are many types of drilling equipment, and the layout of a drilling programme does
not follow specific rules. Figure 1.1 shows a set of drillholes from a copper/gold
orebody. Hole spacing and size depend largely on the characteristics of the orebody.
This is a major source of complication when it comes to developing a grade
estimation technique.
Figure 1.1: Drillholes from exploration programme and development, intersecting the orebody
(coloured by gold assays – screenshot from VULCAN Envisage).
Once the samples are obtained and logged, the mineralised parts are prepared
for assaying. Computers are used extensively during this process for logging and storage
of the samples. The outcome of the exploration programme and post-processing is a
series of files containing records of drillhole samples. There are usually three files
describing the contents and the position of the samples in 3D space. These files are:
• Collar table file: this file contains the co-ordinates of the drillhole collars
and the overall geometry of the drillhole.
• Survey table file: this file provides all the necessary information to derive
the co-ordinates of individual samples in space. The combination of the
survey and collar tables is necessary in order to visualise drillholes
correctly using 3D computer graphics and enable the development of a
drillhole database.
• Assay table file: the results of the assay analysis are stored per sample
in this file. When combined with the previous two files, this leads to the
completion of the drillhole database. This database is the source of input
data for the process of grade estimation.
Following the development of a drillhole database comes the compositing of drillholes
into intervals. These intervals refer to drillhole length and can either be fixed or be
derived from the sample lengths. In the first case, if the interval is greater than the
length of the samples, more than one sample is used to provide the assay value
for that interval. Figure 1.2 illustrates the process of compositing. Compositing is
usually a length-weighted average, except in the case of extremely variable density,
where compositing must be weighted by length times density [71]. When the
intervals are derived directly from the sample lengths, the number of composites
equals the number of samples in the database, and the compositing procedure is
reduced to a reconstruction of the database into a single file containing all the
information. This type of compositing is used throughout this thesis to
provide the input data files for the various case studies.
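As an illustration of the length-weighted average (a minimal sketch; the dictionary field names are mine, not from any particular mining package), compositing a set of samples into one interval can be written as:

```python
def composite_grade(samples, density_weighted=False):
    """Length-weighted average grade of the samples falling in one interval.

    Each sample is a dict with 'grade', 'length' and, when needed, 'density'.
    With extremely variable density, the weights become length * density [71].
    """
    weights = [s["length"] * (s["density"] if density_weighted else 1.0)
               for s in samples]
    total_weight = sum(weights)
    return sum(w * s["grade"] for w, s in zip(weights, samples)) / total_weight

# Two 1.0 m samples and one 0.5 m sample combined into a single composite:
print(composite_grade([{"grade": 1.2, "length": 1.0},
                       {"grade": 0.8, "length": 1.0},
                       {"grade": 2.0, "length": 0.5}]))
# (1.2*1.0 + 0.8*1.0 + 2.0*0.5) / 2.5 = 1.2
```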
Figure 1.2: Compositing of drillhole samples using interval equal to sample length.
A typical composites file starts with a header describing the structure of the
file and the format used for reporting the values of the various parameters. After the
header follows the main part of the file consisting of the sample records. Records
typically contain the following parameters:
Sample id, top xyz, bottom xyz, middle xyz, length, from, to, geocode, assay values
The top, bottom, and middle co-ordinates are derived from the survey and collar
tables as explained above. The from and to fields refer to the distance from the
drillhole collar to the beginning and end of the sample respectively. There can be a
number of codes describing geology, lithology, etc. These parameters allow the
interaction between the qualitative model of the orebody, built by the geologist, and
the quantitative model, which will be developed after grade estimation. Finally, there
can be more than one variable reported for every composite, e.g. gold and
copper grades.
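A hypothetical reader for records of this form might look like the sketch below. The exact header and field layout differ between packages (Appendix A5 shows an actual VULCAN composites file), so the comma-separated layout and the assay names used here are assumptions for illustration only.

```python
from dataclasses import dataclass, field

@dataclass
class Composite:
    sample_id: str
    top: tuple         # (x, y, z) at the top of the composite
    bottom: tuple      # (x, y, z) at the bottom of the composite
    middle: tuple      # (x, y, z) at the mid-point, used as the sample location
    length: float
    from_depth: float  # distance from the collar to the start of the sample
    to_depth: float    # distance from the collar to the end of the sample
    geocode: str       # geology/lithology code linking to the qualitative model
    assays: dict = field(default_factory=dict)  # e.g. {"AU": 1.30, "CU": 0.42}

def parse_composite(line, assay_names=("AU", "CU")):
    f = line.strip().split(",")
    return Composite(sample_id=f[0],
                     top=tuple(map(float, f[1:4])),
                     bottom=tuple(map(float, f[4:7])),
                     middle=tuple(map(float, f[7:10])),
                     length=float(f[10]),
                     from_depth=float(f[11]),
                     to_depth=float(f[12]),
                     geocode=f[13],
                     assays=dict(zip(assay_names, map(float, f[14:]))))
```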
The irregularities of the drilling scheme, the limited number of drillholes
that are economically feasible, and the complex procedures necessary for the
analysis of the obtained samples account for many of the problems encountered during
grade estimation. Additionally, the grades themselves often exhibit behaviour that is
very difficult to model using the information available from an exploration
programme. The people responsible for an exploration programme always face the
questions of how much an extra drillhole would add to the samples database,
whether its cost is justified by the derived benefits and, naturally, where to drill
in the given area.
1.3 Existing Methods for Grade Estimation
1.3.1 General

In the following paragraphs, several of the most common existing methods for
grade estimation are discussed briefly, with attention given to their specific
areas of application. Every method has special characteristics that make it more
applicable to certain types of deposit. There is no such thing as a universally
applicable method for grade estimation; the selection of a method for a particular
deposit depends on the latter's geological and engineering attributes.
1.3.2 Geometrical Methods

Before computers dominated the field of grade estimation, the geometrical methods
were those most often employed [81], and they are still used for quick evaluations of
reserves. These methods include the polygonal (Fig. 1.3) and triangular (Fig. 1.4)
methods and the method of sections.
Figure 1.3: Polygonal method of grade estimation.
The polygonal method is very often used with drillhole data. It can be applied on
plans, cross sections and longitudinal sections. The grade of the sample
inside each polygon is assigned to the entire polygon and provides the grade estimate
for the area of the polygon. The thickness of the mineralisation in the sample is also
applied to the polygon to provide a volume for the reserve estimate. The assumption
here is that the area of influence of any sample extends halfway to the adjacent
sample points. The polygons are constructed by joining the perpendicular bisectors of
the lines connecting these sample points. The polygonal method is applied to deposits
of simple to moderate geometry with low to medium grade variability (e.g. coal,
sedimentary iron, limestone, evaporites).
Figure 1.4: Triangular method of grade estimation.
The triangular method is a slightly more advanced method than the polygonal one. In
this method, the triangular area between three adjacent drillholes receives the average
grade of the three samples involved. In computational terms, the triangular method is
much faster, since the areas are easy to calculate from the co-ordinates of the three
points. This method can be applied to the same cases as the polygonal method.
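As a sketch of the computation involved (not code from the thesis), the triangular method reduces to a shoelace-formula area and a simple average of the three grades:

```python
def triangle_area(p1, p2, p3):
    """Area of the triangle formed by three drillhole positions (shoelace formula)."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    return abs(x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2)) / 2.0

def triangular_estimate(holes):
    """Average grade of three adjacent holes, assigned to the triangle between them.

    `holes` is a sequence of ((x, y), grade) tuples; returns (area, grade estimate).
    """
    points = [position for position, _ in holes]
    grades = [grade for _, grade in holes]
    return triangle_area(*points), sum(grades) / 3.0
```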
The last of the three geometrical methods to be mentioned in this thesis, the
method of sections, is the most manual one and requires a great deal of time and
patience. The areas of influence of the drillhole samples extend halfway to adjacent
sections and to adjacent drillholes in the same section. The grades of the samples are
assigned to their areas of influence. The method of sections is usually applied in
deposits with very complex geometry, where the other methods present difficulties.
The geometrical methods suffer from problems concerning the predicted
distribution of grades. Depending on the average grade of the deposit and the cutoff
grade, they can lead to over- or underestimation of grades.
1.3.3 Inverse Distance Method

Inverse distance weighting, like kriging (the geostatistical interpolation tool),
belongs to the class of moving average methods. Both are based on repetitive
calculations and therefore require the use of computers. Inverse distance weighting
consists of searching the database for the samples surrounding a point (or a block)
and computing the weighted average of those samples’ grades. This average is
calculated using the equation below:

$$g^* = \sum_{i=1}^{n} w_i g_i \qquad (1.1)$$
where $g^*$ is the grade estimate, $g_i$ is the grade of sample $i$, $w_i$ is the weight for
sample $i$, and $n$ is the number of samples. The difference between inverse distance
weighting and kriging lies in the way the weights $w_i$ are calculated. In the case of
inverse distance, the weights are calculated as an inverse power of distance:

$$w_i = \frac{d_i^{-power}}{\sum_{j=1}^{n} d_j^{-power}}, \qquad i = 1, \ldots, n \qquad (1.2)$$

where $w_i$ is the weight for sample $i$, $d_i$ is the distance between sample $i$ and the
estimated point, and $power$ is the inverse distance weighting power. The sample
selection strategy is as important as the weighting power. The following
guidelines can be used during sample selection [71]:
• Samples should be chosen from the estimate point’s geologic domain;
• The search distance should be at least equal to the distance between samples;
• There should be a maximum number of samples to be selected;
• At least one sample must lie within a specified minimum distance of the estimate
point to prevent excessive extrapolation;
• Trends in the grade should be accounted for by the use of a search ellipse.
Modelling of the grade’s range of continuity in various directions is necessary to
provide the axes of the search ellipse (Fig. 1.5). This is commonly achieved using
variogram modelling (see Section 1.3.4);
• The number of samples from any drillhole should be kept to a maximum of
three; more samples lead to redundant data and can cause problems, especially if
kriging is used as the interpolation method;
• Quadrant or octant search schemes may be used in the case of clustered data to
improve the estimation results [71].
The weighting power, as well as the search radius and the number of samples used,
affects the degree of smoothing. Unfortunately, suitable values can only be found
through trial and error, aiming to honour the trends in the grade, match production
results, or even follow the geologist's ideas about the deposit.
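A minimal inverse distance sketch following equations (1.1) and (1.2) is shown below; the search logic is reduced to a radius and a maximum sample count, whereas a full implementation would also enforce the domain, per-hole and sector rules listed above.

```python
import math

def idw_estimate(point, samples, power=2.0, search_radius=100.0, max_samples=12):
    """Inverse distance weighted grade estimate at `point` (equations 1.1 and 1.2).

    `samples` is a list of ((x, y, z), grade) tuples. Samples beyond the search
    radius are ignored and only the closest `max_samples` are kept. A sample
    coinciding with the estimated point returns its own grade directly.
    """
    candidates = []
    for location, grade in samples:
        d = math.dist(point, location)
        if d == 0.0:
            return grade
        if d <= search_radius:
            candidates.append((d, grade))
    if not candidates:
        return None  # nothing within the search radius
    candidates.sort()
    candidates = candidates[:max_samples]
    weights = [d ** -power for d, _ in candidates]
    return sum(w * g for w, (_, g) in zip(weights, candidates)) / sum(weights)
```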
Figure 1.5: Search ellipse used during the selection of samples for grade estimation. The ellipse is
divided into quadrants and a maximum number of points is selected from each of them.
Inverse distance weighting can be applied to deposits with simple to moderate
geometry and with low to high grade variability (e.g. all the types mentioned for the
polygonal method, plus bauxite, lateritic nickel, porphyry copper, gold veins, gold
placers, alluvial diamond, stockwork) [71].
1.3.4 Geostatistics

The work of G. Matheron and D. Krige in the early 1960s led to the development of
an ore reserve estimation methodology known as geostatistics. The theory
of geostatistics combines aspects of different sciences such as geology, statistics
and probability theory. It is a highly complex methodology whose main purpose is
the best possible estimation of ore grades within a deposit given a certain
amount of information. Geostatistics, like any other method, will not improve on the
quantity and quality of the input data.
Matheron’s theory of regionalised variables [58] forms the basis of
geostatistical methodology. In brief, according to this theory, any mineralisation can
be characterised by the spatial distribution of a certain number of measurable
quantities (regionalised variables) [38]. Geostatistics follows from the observation that
samples within an ore deposit are spatially correlated with one another. Attention is
also given to the relationship between sample variance and sample size.
Every geostatistical study begins with the process of structural analysis,
which is by far the most important step of the methodology. Structural analysis
examines the structure of the spatial distribution of ore grades via the development
of variograms. The variogram utilises all the available structural information to
provide a model of the spatial correlation, or continuity, of ore grades. The calculation
of a variogram should be based on data from similar geological domains. The
variogram function is as follows:
$$\gamma(h) = \frac{1}{2n}\sum_{i=1}^{n}\left[g(x_i) - g(x_i + h)\right]^2 \qquad (1.3)$$

where $g(x_i)$ is the grade at point $x_i$, $g(x_i + h)$ is the grade of a point at distance $h$ from
point $x_i$, and $n$ is the number of sample pairs. Sample pairs are oriented in the same
direction and separated by the distance $h$. Their volume should also be constant; this
is taken into consideration during the compositing of drillholes. For the purposes of
constant semi-variogram support, compositing should be performed on a constant interval.
The variogram function is calculated for different values of the distance h. The
resulting graph is known as the experimental variogram. As shown in Figure 1.6, the
variogram usually increases with increasing distance until it reaches a plateau. The
distance h at which the variogram stops increasing and becomes more or less level is
called the range of the variogram. The value of the variogram at this distance is called
the sill of the variogram ($C + C_0$). Finally, the value of the variogram at distance
h = 0 is called the nugget effect ($C_0$). A number of different meanings are given to a
nugget effect that is high in comparison to the sill, such as low-quality samples or a
non-homogeneous sampling zone. Most of the time it is fairly difficult to identify
these three parameters from the experimental variogram graph, and it therefore
becomes difficult to fit one of the available models. It is a process that requires skill,
experience and large amounts of time. It is also a point where mistakes are
made, undermining the entire process of grade estimation.
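The experimental variogram of equation (1.3) can be sketched as follows, assuming sample positions along a single direction and a fixed lag tolerance (a 3D directional search adds bookkeeping but no new ideas):

```python
def experimental_variogram(positions, grades, lags, tol):
    """Experimental semi-variogram gamma(h) following equation (1.3).

    `positions` and `grades` are parallel lists of 1D sample locations along one
    direction and their grades; `lags` lists the separations h to evaluate and
    `tol` is the tolerance used to bin sample pairs into each lag.
    """
    gamma = []
    for h in lags:
        squared_diffs = [(grades[i] - grades[j]) ** 2
                         for i in range(len(positions))
                         for j in range(i + 1, len(positions))
                         if abs(abs(positions[i] - positions[j]) - h) <= tol]
        n = len(squared_diffs)
        gamma.append(sum(squared_diffs) / (2.0 * n) if n else None)
    return gamma
```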
The variogram modelling is followed by the geostatistical method for grade
interpolation, called kriging. Kriging is a linear estimation method based on
the positions of the samples and on the continuity of grades as shown by the variograms.
The method finds the optimal weights $w_i$ for equation (1.1) by minimising the
estimation variance derived from the calculated variograms. Kriging is therefore not
based only on distance, as the inverse distance method is.
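To make the contrast with inverse distance concrete, the sketch below solves the standard ordinary kriging system for the weights of equation (1.1), assuming a variogram model has already been fitted; it is illustrative only, not the implementation used later in this thesis.

```python
import numpy as np

def ordinary_kriging_weights(sample_points, estimate_point, gamma):
    """Weights w_i of equation (1.1) from the ordinary kriging system.

    `sample_points` is an (n, 3) array of sample locations, `estimate_point` a
    length-3 array and `gamma` a fitted variogram model gamma(h). The extra
    unknown is the Lagrange multiplier forcing the weights to sum to one.
    """
    n = len(sample_points)
    A = np.ones((n + 1, n + 1))
    A[n, n] = 0.0
    for i in range(n):
        for j in range(n):
            A[i, j] = gamma(np.linalg.norm(sample_points[i] - sample_points[j]))
    b = np.ones(n + 1)
    b[:n] = [gamma(np.linalg.norm(p - estimate_point)) for p in sample_points]
    return np.linalg.solve(A, b)[:n]

def spherical(h, a=150.0, c=1.0, c0=0.1):
    """Spherical variogram model with range a, sill c0 + c and nugget c0."""
    if h == 0.0:
        return 0.0
    if h >= a:
        return c0 + c
    return c0 + c * (1.5 * h / a - 0.5 * (h / a) ** 3)
```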
Figure 1.6: Frequency histogram (left) and variogram (right) of copper grades (percentages).
There are a number of variations of kriging, each suited to different types of
deposit and sampling scheme. The geostatistical methodology is very well
documented and there are many good publications in this field [38,20,37,17].
Non-linear variants of kriging have also been developed, such as log-normal and
disjunctive kriging [21,49], which are far more advanced than linear kriging but also
far more complicated.
Generally, it is difficult to argue with the efficiency and reliability of a
properly developed geostatistical study. However, there is always the issue of
justifying the extra complexity and cost of geostatistics especially at the beginning of
a mining project when there are no actual values to compare with.
1.3.5 Conclusions

From the very brief discussion above, it becomes clear that there is still a need for a
fast and reliable method of ore grade estimation whose results depend only on the
complexity and variability of the given deposit, and not so much on the quality and
quantity of the given data. The required method should also not depend on the skills
and knowledge of the person applying it, while remaining easy to understand and
apply.

The methods developed so far suffer either from over-simplification of the ore
grade estimation process, as in the case of the geometrical methods, or from
over-sophistication, as in the case of geostatistics. Choosing one of the available
methods is usually a compromise between speed and reliability, cost and attention to
detail. This is a compromise very few mining companies are willing to make, but
many have to because of the resources available to them.
1.4 Block Modelling & Grid Modelling in Grade Estimation Grade estimation usually involves interpolation between known samples, which
become available from an exploration program or from the development of the mine.
The interpolation process is based on locations commonly arranged on a regular
geometric structure designed to provide for the necessary detail and cover the
volume/area of interest. Block and grid models are the main structures used during grade estimation and deposit modelling. The choice between them depends on the type and complexity of the deposit and the value of interest [5].
Figure 1.8: Grid modeling as visualised in an advanced 3D graphics environment.
Grid models (Fig. 1.8) consist of a series of two-dimensional computer matrices. These matrices may contain estimates of different parameters such as grades, thickness, structures and other values. A grid is usually defined by its origin
in space, i.e. the easting, northing, and elevation of its starting position, the distance
between its nodes in both directions, and its dimension in these directions, i.e. the
number of nodes. This structure dramatically reduces the amount of information
necessary to represent a complete model of the deposit and has the additional
advantage of allowing easy manipulation of the various parameters included by
performing simple calculations between the grids. Grid modeling is best suited for
deposits with two of their dimensions being significantly greater than the third.
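A minimal sketch of how such a grid definition might be held in code is given below. The field names and the thickness example are the Author's illustration and do not follow any particular mining package.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class GridModel:
    """Regular 2D grid: origin, node spacing and node counts fix every node position."""
    x0: float   # easting of the origin node
    y0: float   # northing of the origin node
    dx: float   # node spacing along the easting direction
    dy: float   # node spacing along the northing direction
    nx: int     # number of nodes along easting
    ny: int     # number of nodes along northing

    def node_position(self, i, j):
        return self.x0 + i * self.dx, self.y0 + j * self.dy

grid = GridModel(x0=1000.0, y0=2000.0, dx=25.0, dy=25.0, nx=40, ny=60)
thickness = np.zeros((grid.nx, grid.ny))    # one matrix per stored parameter
density = np.full((grid.nx, grid.ny), 2.7)
tonnes_per_m2 = thickness * density         # grid-to-grid calculation
```

The simple calculations between grids mentioned above reduce to element-wise matrix operations, as in the last line.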
Block models are far more complex structures, being three-dimensional and allowing the storage of more than one parameter. Figure 1.9 shows two sections through a block model. The volume including the deposit of interest is divided into blocks with a specific volume associated with them. These blocks are defined by the X, Y, and Z co-ordinates of their centroids relative to the origin of the model. Their dimensions can vary from block to block – usually decreasing close to geologic structures and other features that require more detail. More than one variable can be associated with every block, some estimated and others derived. Grade estimation on a block model basis means the extension of point samples to block estimates with volume.
Figure 1.9: Sections through a block model intersecting the orebody. A surface topography model has
limited the block model.
Block models allow the modelling of deposits with very complex geometry. They do, however, require considerable computational power, and they tend to become more demanding as the number of variables stored increases. They are also more difficult to visualise as they are three-dimensional, and they can only be effectively plotted in sections.
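In the same illustrative spirit, a block model might be sketched as follows; a constant block size is assumed for simplicity, although, as noted above, real models can vary block dimensions locally. All names are hypothetical.

```python
import numpy as np

class BlockModel:
    """Regular 3D block model: blocks are addressed by centroid offsets from
    the model origin, and several variables can be stored per block."""
    def __init__(self, origin, block_size, shape):
        self.origin = np.asarray(origin, dtype=float)    # X, Y, Z of model origin
        self.size = np.asarray(block_size, dtype=float)  # block dimensions
        self.shape = shape                               # block counts along X, Y, Z
        self.variables = {name: np.full(shape, np.nan)   # estimated and derived values
                          for name in ("cu_grade", "density")}

    def centroid(self, i, j, k):
        # Centroid = origin + (index + 0.5) * block size along each axis.
        return self.origin + (np.array([i, j, k]) + 0.5) * self.size

model = BlockModel(origin=(0, 0, -200), block_size=(10, 10, 5), shape=(50, 50, 40))
print(model.centroid(0, 0, 0))   # -> [  5.    5.  -197.5]
```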
1.5 Artificial Neural Networks for Grade Estimation
Artificial neural networks (ANNs) are the result of decades of research for a biologically motivated computing paradigm. There are many different opinions as to their definition and applicability to technological problems. It is a common belief, though, that ANNs present an alternative to the concept of programmed or hard computing. ANN technology brought about the concept of neural computing, which is finding its way more and more into real engineering problems. ANNs
are parallel computing structures, which replace program development with learning
[92].
There have been many cases of successful application of ANNs to function approximation, prediction and pattern recognition problems in the past. This fact, as well as the special characteristics of ANNs that will be discussed in the next chapter, makes them a natural choice for the problem of grade estimation. As discussed in the previous paragraphs, grade estimation is commonly reduced to a problem of function approximation. ANNs, and specifically the chosen type of ANN, can provide, as this thesis will try to prove, a valid methodology for grade estimation.
1.6 Research Objectives
Disregarding the existing methodologies for grade estimation is definitely not one of the aims of this thesis. The GEMNet II system described was developed to provide a flexible but complete alternative method, which takes into consideration the theory behind deposit formation while minimising the dependence on certain assumptions.
The main objectives of the development of GEMNet II can be identified as follows:
• To find a suitable neural network architecture for the problem of grade
estimation.
• To take advantage of the function approximation properties of ANNs.
• To break down the problem of grade estimation into less complex functions that
can be modelled using these properties.
• To integrate the developed neural network architecture in a system which will be
user-friendly and flexible.
• To provide means of validating the results of this system.
• To minimise the knowledge required for using the system.
• To compare the performance of the system with existing grade estimation
techniques on the basis of estimation properties, usability and time requirements.
1.7 Thesis Overview
Given below is a description of the chapters included in this thesis:
• Chapter 2 - Artificial Neural Networks Theory
Gives a brief discussion on the theory behind ANNs, the main ANN architectures
and their main application areas.
• Chapter 3 - Radial Basis Function Networks
Examines a special type of ANN architecture, which will form the basis of the
GEMNet II system. An in-depth analysis of Radial Basis Function Networks is
presented in order to provide a better understanding of their operation and their
suitability to the problem of grade estimation.
• Chapter 4 – Mining Applications of Artificial Neural Networks
Discusses a number of examples of the application of ANNs to grade/reserves estimation. Examples of similar applications from non-mining areas are also given. Presents a number of reported uses of ANN systems in mining and shows how this technology is beginning to gain ground in the mining industry.
• Chapter 5 - Development of a Modular Neural Network System for Grade
Estimation
Describes the development of prototype modular neural network systems for use
with 2D and 3D exploration data. The transition from two to three dimensions is
discussed.
• Chapter 6 - Case Studies of the Prototype Modular Neural Network System
Presents a number of case studies, which were used to guide the development of
the prototype system. These case studies were also used to validate the overall
approach.
• Chapter 7 - GEMNET II – An Integrated Modular System for Grade
Estimation
Explains the design and development of the GEMNet II system. The system architecture as well as its application is analysed. The integration of the system in an advanced 3D resource-modelling environment is also discussed.
• Chapter 8 - GEMNet II Application – Case Studies
Contains several examples of the application of GEMNet II to real deposits with
real sampling schemes. The case studies are presented in order of increasing
complexity. Other techniques are applied to the same data in order to provide a basis for comparison and evaluation of the GEMNet II system's performance.
• Chapter 9 - Conclusions – Further Research
Gives a discussion on the conclusions from the research described and the
potential areas for further research and development.
2. Artificial Neural Networks Theory
2.1 Introduction
2.1.1 Biological Background
The human brain, and the mammalian nervous system in general, has been the source of inspiration for decades of research for a computational model which is based not on hard-coded programming but on learning from experience. The human brain,
central to the human nervous system, is generally understood not as a single neural network but as a network of neural networks, each having its own architecture, learning strategy, and objectives. The massive parallelism of the human brain and the advantages deriving from this structure have always attracted the attention of scientists, especially in the field of computing.
Biological neural networks, regardless of their function and complexity, are
composed of building blocks known as neurons (Fig. 2.1). The minimal structure of
a neuron consists of four elements: dendrites, synapses, cell body, and axon.
Dendrites are the transmission channels for information coming into the neuron. The
signals, which propagate through the dendrites, originate from the synapses, which
form the input contact points with other neurons. Synapses are also centres of
information storage in biological neural networks. There are however other storage
mechanisms inside the biological neurons, which are still not very well understood
and extend outside the four-element neuron model described here. The axon is
responsible for transmitting the output of the neuron. There is only one axon per neuron, but axons can have more than one branch, the tips of which form synapses upon other neurons [3]. The cell body of the neuron is where most of the processing
takes place. The cell body also provides the necessary chemicals and energy for its
operation.
Figure 2.1: Illustration of a typical neuron [100].
Transmission of information within biological neural networks is achieved by
means of ions, semi-permeable membranes and action potentials as opposed to simple
electronic transport in metallic cables [87]. Neural signals produced at the neuron
travel through the axon in the form of ions, which in the case of neurons are called
neurotransmitters. The neuron is constantly trying to keep a balanced electrical
system by transporting excess positive ions out of the cell while holding negative ions
inside. These movements of ions through the neuron are known as depolarisation
waves or action potentials (Fig. 2.2).
The information transmitted between neurons is processed using a number of
electrical and chemical processes. The synapses play a leading role in the regulation
of these processes. Synapses direct the transmission of information and control the
flow of neurotransmitters. The cell body integrates incoming signals and when these
reach a certain level the activation threshold is reached and the neuron generates an
action potential, which propagates through the neuron’s axon.
Synapses, as already mentioned, are the centres of information storage. The
synapses store information by modifying the permeability of the cell to different
kinds of neurotransmitters, thereby altering their effect on the neuron's
activation. This information needs to be refreshed periodically in order to maintain
the optimal behaviour of the neuron. This form of information storage is also known
as synaptic efficiency, which represents the ability of a particular synapse to evoke the
depolarisation of the cell body.
Figure 2.2: Propagation of an action potential through a neuron’s axon [100].
All the above knowledge of the way neurons transmit, store, and process information is far from complete, and therefore any derived artificial model cannot be considered anywhere near as complex as its biological counterpart, at the level of both neurons and neural networks. ANNs follow the simple
four-element model of the biological neuron in the definition of their building block,
the artificial neuron or processing element.
2.1.2 Statistical Background
The study of the human brain and other biological nervous structures is not the only
source of inspiration and formalisation for the development of artificial neural
network models. ANNs are commonly treated as fine-grained parallel
implementations of non-linear static or dynamic systems [31]. The biological
structures when simplified to an artificial model become a system that can be best
described by a traditional mathematical or statistical model such as non-parametric
pattern classifiers, clustering algorithms, non-linear filters, and statistical regression
models rather than a true biological model. These statistical models are either
parametric with a small number of parameters, or non-parametric and completely
flexible. Artificial neural network methods cover the area in between with models of
large but not unlimited flexibility given by a large number of parameters as required
in large-scale practical problems [82].
The behaviour and dynamics of the structure of artificial networks can be
shown to implement the operation of classical mathematical estimators and optimal
discriminators [47]. It is generally accepted that the earlier models of artificial neurons and neural networks in the 1940s and '50s tried to imitate the biological model as closely as possible, while more recent models have been elaborated for new
generations of information-processing devices. In most cases of ANNs it is almost
impossible to get any agreement between their behaviour and experimental
neurophysiological measurements. This results from the over-simplification of the
biological nervous systems, which is dictated by the incomplete understanding of the
numerous chemical and electrical processes involved.
Understanding the operation properties of ANNs can be approached by a
number of different methods. Statistical mechanics is a very important tool for
analysing the learning ability of a neural network. Statistical mechanics provides a
description of the collective properties of complex systems consisting of many
interacting elements on the basis of the individual behaviour and mutual interaction
of these elements [118]. Within this approach, ANNs are defined as ensembles of
neurons with certain activity, which interact through synaptic couplings. Both the
activities and synaptic couplings are assumed to evolve dynamically.
In the following paragraphs, a discussion on various aspects of ANNs will be
given which will show to a greater extent the strong connection between statistics and
neural computing.
2.1.3 History
Almost every introduction to ANNs begins with a brief presentation of the historical
development of ANNs and neural computation in general. There are many good
reasons for discussing the history of ANNs. The brief discussion in this paragraph
will show how this multi-science field of computing evolved through time. This
historical analysis will help to assess the growth and potential of ANNs as an
approach to the problem of computing.
ANNs are the realisation of one of the first formal definitions of
computability, namely the biological model. In the 1930s and ‘40s there were at least
five alternative models of computation (Figure 2.3) [86]:
1. mathematical model
2. logic-operational model (Turing machines)
3. computer model
4. cellular automata
5. biological model (Neural Networks)
Figure 2.3: The five major models of computation as they were presented six decades ago [86].
The computer model of von Neumann became the most popular and most widely used one, but this did not mean the dismissal of the other approaches. In fact, John von Neumann himself participated in the development of other models such as the first ANNs [69]. In 1943 Warren McCulloch and Walter Pitts introduced the first models
of artificial neurons [60]. Donald Hebb in his book entitled The Organisation of
Behaviour [33] tried to build a qualitative explanation of experimental results from
psychology using a specific learning law for the synapses of neurons that he
proposed.
The first hardware implementations of ANNs included the Snark by Marvin
Minsky [64], the Mark I Perceptron by Frank Rosenblatt and others [88], the
ADALINE by Bernard Widrow [109], and the Lernmatrix by Karl Steinbuch [98].
After a quiet period in the late 1960s and early 1970s, the field of neural computing became once again the centre of research activity. Researchers such as Teuvo
Kohonen [46], James Anderson [2], Stephen Grossberg [30], and Shun-ichi Amari [1]
brought back the interest in the field and by the 1980s the first neural network
applications became a reality. John Hopfield [34] was another example of an established scientist who helped to raise worldwide awareness of the neural computing field. By the late 1980s the field was very well established through research groups in most of the major universities and research institutions around the world. David Rumelhart and James McClelland [89] are also worth mentioning for their contribution to the field through the publication of the Parallel Distributed Processing volumes, which are considered major references of neural computing.
2.2 Basic Structure – Principles
2.2.1 The Artificial Neuron – the Processing Element
The artificial neuron or processing element (PE) is the basic unit of an ANN. It is a
simplified version of the four-element model described in Paragraph 2.1.1. There are
both software and hardware implementations of PEs. Their basic structure is
illustrated in Figure 2.4.
Figure 2.4: Structure of the processing element [32].
The PE k includes a set of synapses each being identified by a weight w. Each input
signal xj to the PE k is multiplied by the synaptic weight wkj. The weighted input
signals are summed by the adder of the PE (linear combiner). The outcome of the
summation is passed to an activation function also known as squashing function
because it squashes (i.e. limits) the amplitude range of the PE’s output to a finite
value [32]. The bias bk is applied to the adder and has the effect of increasing or
decreasing the output of the latter. Figure 2.5 shows the effect of the bias on the
output of the linear combiner.
Figure 2.5: Effect of bias on the input to the activation function (induced local field) [32].
The following equations describe the model of the PE in mathematical terms:

$$\upsilon_k = \sum_{j=1}^{m} w_{kj} x_j \qquad (2.1)$$

and

$$y_k = \varphi(\upsilon_k) \qquad (2.2)$$
where x1, x2, …, xm are the input signals which are multiplied by the synaptic weights wk1, wk2, …, wkm and then added to give the linear combiner output υk. The bias bk is applied to υk to provide the input to the activation function ϕ( . ). Finally, the output
of the activation function gives the output of the neuron yk. Figure 2.6 illustrates the
most common activation functions used in modern PEs.
Figure 2.6: Common activation functions: (a) unipolar threshold, (b) bipolar threshold, (c) unipolar
sigmoid, and (d) bipolar sigmoid [53].
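Equations (2.1) and (2.2) translate directly into code. The short sketch below computes a PE output using a bias and the unipolar sigmoid of Figure 2.6; the weights and inputs are invented for the example.

```python
import numpy as np

def sigmoid(v):
    """Unipolar sigmoid squashing function: limits the output to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-v))

def pe_output(x, w, b, activation=sigmoid):
    """Weighted sum (linear combiner) plus bias, passed through the
    activation function (equations 2.1 and 2.2)."""
    v = np.dot(w, x) + b          # induced local field
    return activation(v)

x = np.array([0.5, -1.2, 0.3])    # input signals x_1..x_m
w = np.array([0.8, 0.1, -0.4])    # synaptic weights w_k1..w_km
print(pe_output(x, w, b=0.2))
```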
2.2.2 The Artificial Neural Network
The model of the artificial neuron or processing element described above forms the
basis of the artificial neural network (ANN) structure. ANNs consist of layers of
interconnected PEs as shown in Fig. 2.7. This layered structure is the most common
in ANNs and is usually called the fully connected feedforward or acyclic network.
However, there are ANNs that do not adopt this structure as will be discussed in
Section 2.4.
The starting point of the ANN structure is a layer of input units that allows information to enter the network. The input units cannot be considered as PEs, mainly because no processing of information takes place at them, with the exception of normalisation (when required). Normalisation is the process of
equalising the signal range (commonly to a range between 0.1 and 0.9) of different
inputs. Normalisation ensures that changes in the signals of different inputs have the
same effect on the network’s behaviour regardless of their magnitude.
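A linear rescaling to the commonly quoted 0.1 to 0.9 range might be sketched as follows; keeping the original minimum and maximum allows the inverse transform required at the output layer, as discussed below.

```python
import numpy as np

def normalise(x, lo=0.1, hi=0.9):
    """Rescale a signal linearly to [lo, hi]; also return the original
    bounds needed to invert the transform later."""
    xmin, xmax = x.min(), x.max()
    return lo + (hi - lo) * (x - xmin) / (xmax - xmin), (xmin, xmax)

def denormalise(y, bounds, lo=0.1, hi=0.9):
    xmin, xmax = bounds
    return xmin + (xmax - xmin) * (y - lo) / (hi - lo)

grades = np.array([0.4, 1.2, 2.5, 0.9])      # illustrative values
scaled, bounds = normalise(grades)
assert np.allclose(denormalise(scaled, bounds), grades)
```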
Figure 2.7: Basic structure of a layered ANN [32].
Following the input layer is one or more internal or hidden layers. The use of
the word hidden is mainly due to the fact that they are not accessible from outside the
ANN. The first hidden layer is fully interconnected with the units of the input layer.
In other words, all PEs of the hidden layer receive the signal from each input unit.
The signals are multiplied by a weight, which is different for every connection. In the
case of more than one hidden layer, there will be full interconnection between
subsequent layers as in the case of the input and first hidden layer.
The final part of the ANN structure is the output layer. The units of this layer
are also PEs, which receive the signals from the last hidden layer and perform similar
processing to that of the hidden PEs. If normalisation is used in the input layer, then
the outputs of the output PEs have to be transformed back to the range of the original
data to get sensible results. This is required normally when the ANN is used for
function approximation.
2.3 Learning Algorithms
2.3.1 Overview
Learning from examples is the main operation of any ANN. Learning in this case
means the ability of an ANN to improve its performance through an iterative process of adjusting its free parameters. The adjustment of an ANN's free parameters is stimulated by a set of examples presented to the network during the application of a learning algorithm, a set of well-defined rules for improving the network's performance.
There are many different learning algorithms for ANNs, each with a different way of
adjusting the connection weights of PEs and different way of formalising the
measurement of the ANN’s performance. They are generally grouped into supervised
and unsupervised algorithms. Supervised algorithms are applied when the required
ANN outputs are known in advance, while unsupervised algorithms are applied when
the correct outputs are not known and need to be found. Over the next paragraphs of
this section, the main learning processes and algorithms will be discussed briefly.
2.3.2 Error Correction Learning
In order to explain the error correction learning algorithm, the basic structure of any
ANN, the PE, will be examined. The example is based on the assumption that the PE
is the only unit of the output layer of a feedforward ANN. As in any learning
algorithm, adjusting the synaptic weights of the PEs is an iterative process involving
a number of time steps.
The PE k is presented with an input signal vector x(n) at time step n. This signal
vector is produced by the units of the previous layer - the last hidden layer in this
case. The output signal yk(n) of the ANN’s only output is compared to a target output
dk(n), which produces an error signal ek(n):
ek(n) = dk(n) – yk(n) (2.3)
The production of the error signal activates a corrective mechanism – a sequence of
corrective adjustments to the synaptic weights of the PE that bring the output signal
closer to the target output. A cost function or index of performance is defined based
on the error signal as follows [32]:
E(n) = ek²(n) / 2 (2.4)
Eventually the process of adjusting the synaptic weights of the PE reaches a stabilised
weight state and learning terminates. This learning process of cost function
minimisation is also known as the delta rule or Widrow-Hoff rule [99]. The
adjustment Δwkj(n) of the synaptic weight wkj at time step n is given by:
Δwkj(n) = ηek(n)xj(n) (2.5)

where η is the learning-rate parameter. The new value of the synaptic weight at time
step n+1 will be:
wkj(n+1) = wkj(n) + Δwkj(n) (2.6)
The correct choice of the learning-rate parameter is very important for the overall
performance of the ANN.
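A minimal sketch of the delta rule of equations (2.3) to (2.6), training a single linear output PE, is given below; the learning rate and the data are illustrative.

```python
import numpy as np

def delta_rule_train(X, d, eta=0.05, epochs=200):
    """Adjust weights by delta_w = eta * e_k(n) * x(n), equations (2.3) to (2.6)."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x, target in zip(X, d):
            y = np.dot(w, x)         # linear PE output
            e = target - y           # error signal, equation (2.3)
            w += eta * e * x         # weight update, equations (2.5) and (2.6)
    return w

# Learn the mapping d = 2*x1 - x2 from noisy examples.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
d = 2 * X[:, 0] - X[:, 1] + rng.normal(scale=0.01, size=100)
print(delta_rule_train(X, d))        # converges close to [2, -1]
```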
2.3.3 Memory Based Learning
Memory based learning is mainly used for pattern classification purposes. Learning
takes the form of past experiences stored in a memory of classified input-output examples $\{(x_i, d_i)\}_{i=1}^{N}$, where xi is the input vector, di the target output, and N the number of patterns [32]. In the case of a new vector xnew presented to the network, the
algorithm will try to classify it by looking at the training data in a local
neighbourhood of xnew. There are a number of different algorithms for memory based
learning, which differ in the way they define two major aspects:
• the local neighbourhood of the new vector xnew
• the learning rule applied to training data in the local neighbourhood of xnew.
In Chapter 3 an in-depth discussion of a very important type of memory-based
classifier will be given, namely the radial basis function network.
2.3.4 Hebbian Learning
The oldest of the learning rules is Hebb's postulate of learning [33]. Hebb, in his
book The Organisation of Behaviour, made the following statement as the basis for
associative learning:
When an axon of cell A is near enough to excite a cell B and repeatedly or
persistently takes part in firing it, some growth process or metabolic changes take
place in one or both cells such that A’s efficiency as one of the cells firing B, is
increased [33, p.62].
Transferring this statement from the neurobiological context into a more algorithmic language yields the following two-part rule [99]:
1. If two neurons on either side of a synapse are activated simultaneously, then the
strength of that synapse is selectively increased.
2. If two neurons on either side of a synapse are activated asynchronously, then that
synapse is selectively weakened or eliminated.
The second part of the rule was not included in Hebb's original rule but was added for consistency reasons. The mathematical formulation of Hebbian learning is
given by the following equation of the synaptic weight wkj adjustment Δwkj(n):
Δwkj(n) = ηyk(n)xj(n) (2.7)
where xj and yk are the presynaptic and postsynaptic signals at time step n, and η is
the learning rate parameter. Hebbian learning is strongly supported by physiological
evidence in the area of the brain called the hippocampus.
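In code, the Hebbian adjustment of equation (2.7) is a single outer-product update; the sketch below is purely illustrative.

```python
import numpy as np

def hebbian_update(W, x, y, eta=0.01):
    """Strengthen w_kj in proportion to postsynaptic y_k times presynaptic x_j,
    equation (2.7)."""
    return W + eta * np.outer(y, x)

W = np.zeros((2, 3))              # 2 postsynaptic by 3 presynaptic units
x = np.array([1.0, 0.0, 1.0])     # presynaptic activity
y = np.array([0.5, -0.5])         # postsynaptic activity
W = hebbian_update(W, x, y)
print(W)   # weights grow where pre- and postsynaptic activity agree in sign
```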
2.3.5 Competitive Learning
Competitive learning is one of the major types of unsupervised learning. In
competitive learning the output PEs of an ANN compete to become active when an
input signal is presented. In other words the output PEs try to provide the output
associated with an input vector. Competitive learning is based on three elements [90]:
1. A number of similar PEs, which may however have some randomly distributed synaptic weights, causing a different response to a given set of input vectors.
2. A limited strength for each PE.
3. A competition mechanism for the PEs to gain the right to respond to a given
input. The mechanism must ensure that only one output PE responds at a time –
that PE is called the winner-takes-all neuron.
The winning PE is the one with the largest induced local field υk for an input pattern
x. The output of the winning PE is set to one while the outputs of all other PEs are set to zero. The adjustment of the synaptic weight wkj for the winning PE is given by the
following equation:
Δwkj = η(xj – wkj) (2.8)
while for the losing PEs:
Δwkj = 0 (2.9)
This leads to moving the synaptic weight vector of the winning PE towards the input
vector.
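A sketch of the winner-takes-all update of equations (2.8) and (2.9) follows; the weight rows are normalised at the start so that the PEs begin with comparable strengths, and the data are illustrative.

```python
import numpy as np

def competitive_step(W, x, eta=0.1):
    """Only the winning PE (largest induced local field) moves its weight
    vector towards the input, equations (2.8) and (2.9)."""
    winner = np.argmax(W @ x)
    W[winner] += eta * (x - W[winner])
    return winner

rng = np.random.default_rng(1)
W = rng.normal(size=(4, 2))
W /= np.linalg.norm(W, axis=1, keepdims=True)   # comparable initial strengths
for x in rng.normal(size=(200, 2)):
    competitive_step(W, x)
print(W)   # each row has drifted towards a region of the input space
```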
2.3.6 Boltzmann Learning
Named in honour of Ludwig Boltzmann, the Boltzmann learning rule is a stochastic
learning algorithm based on statistical mechanics [32]. An ANN designed to follow
the Boltzmann learning algorithm is called a Boltzmann Machine (BM). A BM
implements a stochastic response function to characterise the transitions of individual
PEs between different states. There are two possible states for BM units: on state
denoted by +1 or off state denoted by –1. A BM is characterised by an energy
function, E, which depends on the states of the BM units:
$$E(x) = -\frac{1}{2}\sum_{i}\sum_{j} w_{ji} x_i x_j \qquad (2.10)$$

where xj is the state of PE j, and wji is the synaptic weight between PEs i and j. There are no weights between a PE and itself (i ≠ j in the sums, or wjj = 0); in other words, none of the
PEs has self-feedback. During a BM’s operation a PE is chosen at random. Its output
is characterised in terms of a state transition function:
$$P(x_j \to -x_j) = \frac{1}{1 + e^{-\Delta E_j / T}} \qquad (2.11)$$
where ΔEj is the change in the energy function of the BM as a result of the state
transition and T is the pseudotemperature. The PEs in a BM fall into two categories:
visible and hidden. The visible units form the connection of the network with its
environment. These units have two modes of operation: clamped and unclamped or
free running. In clamped mode, the visible units are clamped onto specific states
while in unclamped mode they operate freely. The hidden units always operate freely.
The adjustment of the synaptic weight is defined by [45]:
$$\Delta w_{ji} = \eta\left(\rho_{ji}^{+} - \rho_{ji}^{-}\right) \qquad (2.12)$$
where ρ+ji is the correlation between the states of PEs i and j in clamped mode, and ρ−ji is the correlation between the states of PEs i and j in free-running mode; both range from –1 to +1.
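The stochastic state update of equations (2.10) and (2.11) might be sketched as below, assuming a small symmetric weight matrix with zero diagonal; the temperature and weights are illustrative.

```python
import numpy as np

def boltzmann_step(x, W, T, rng):
    """Flip a randomly chosen unit with probability 1/(1 + exp(-dE/T)),
    equation (2.11)."""
    j = rng.integers(len(x))
    dE = 2.0 * x[j] * (W[j] @ x)      # energy change of flipping x_j in eq. (2.10)
    if rng.random() < 1.0 / (1.0 + np.exp(-dE / T)):
        x[j] = -x[j]
    return x

rng = np.random.default_rng(2)
W = rng.normal(size=(5, 5))
W = (W + W.T) / 2.0                   # symmetric couplings
np.fill_diagonal(W, 0.0)              # no self-feedback
x = rng.choice([-1.0, 1.0], size=5)   # on/off states
for _ in range(1000):
    boltzmann_step(x, W, T=1.0, rng=rng)
print(x)
```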
2.3.7 Self-Organized Learning
Self-organising learning is usually considered to be just another way to describe
unsupervised learning. Self-organising learning is however a member of the group of
unsupervised learning algorithms together with competitive and reinforcement
learning. The most widely known model of self-organising networks is that of the
Self-Organising Maps or Kohonen Networks proposed by Teuvo Kohonen [45] as a
realisation of the ideas developed by Rosenblatt, von der Malsburg, and other
researchers.
A Kohonen network is an arrangement of PEs in a multi-dimensional lattice
(Par. 2.4.3). This structure enables the identification of the immediate neighbourhood
of every PE. Kohonen learning is based on a neighbourhood function φ(i,k)
representing the strength of the coupling between PE i and k during the training
process. The neighbourhood function is defined to equal one for all units i inside a neighbourhood of radius r around unit k, and zero for all other units. The adjustment
of the weight vectors follows the rule below:
Δwi = ηφ(i,k)(ξ – wi), for i = 1, …, m (2.13)
where m is the total number of PEs, η is a learning constant, and ξ is an input vector
selected using the desired probability distribution over the input space. The learning
process is repeated several times with the neighbourhood radius and the learning
constant being reduced according to a schedule. The value of the neighbourhood
function also decreases so that the influence of each PE upon its neighbours is
reduced. The effect of the schedule is to accelerate learning at the beginning of the
learning process and produce smaller corrections towards the end. The overall result
of Kohonen’s learning algorithm is that each PE learns to specialise on different
regions of input space and learns to produce the highest output for an input from such
a region.
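The Kohonen update of equation (2.13), for a one-dimensional lattice and the hard radius-r neighbourhood described above, might be sketched as follows; the schedule constants are illustrative.

```python
import numpy as np

def som_train(data, m=10, epochs=50, seed=3):
    """1D Kohonen map: the winner and its lattice neighbours move towards
    each input, equation (2.13) with phi(i,k) = 1 inside radius r."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(m, data.shape[1]))
    for epoch in range(epochs):
        eta = 0.5 * (1.0 - epoch / epochs)               # shrinking learning constant
        r = max(1, int(m / 2 * (1.0 - epoch / epochs)))  # shrinking radius
        for xi in rng.permutation(data):
            k = np.argmin(np.linalg.norm(W - xi, axis=1))   # best-matching PE
            for i in range(max(0, k - r), min(m, k + r + 1)):
                W[i] += eta * (xi - W[i])
    return W

data = np.random.default_rng(4).uniform(size=(300, 2))
print(som_train(data))   # weight vectors spread over the sampled input region
```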
2.3.8 Reinforcement Learning
Reinforcement learning is another important member of the group of unsupervised
learning algorithms. It is closely related to dynamic programming which is why it is
sometimes referred to as neurodynamic programming.
Reinforcement learning is, in essence, an input-output mapping achieved
through interaction with the environment (input space) in order to minimise a scalar
index of performance [7]. Unlike other learning processes, reinforcement learning aims at minimising a cost-to-go function, defined as the cumulative cost of actions
taken over a sequence of steps instead of the immediate cost. The function of the
network is to find these actions and feed them back to the environment.
Reinforcement learning is very appealing since it allows the network to interact
with its environment and develop the ability to increase its performance on the basis
of the outcomes of its experience from this interaction.
2.4 Major Types of Artificial Neural Networks
2.4.1 Feedforward Networks
Beyond any doubt the most popular and widely used ANN structure, the feedforward
network is a hierarchical design consisting of fully interconnected layers of PEs.
Generally, the operation of this network is mapping an n-dimensional input to an m-dimensional output; in other words, modelling a function F : ℝⁿ → ℝᵐ. This is
achieved by means of training on examples (x1,y1), (x2,y2), …,(xk,yk) of the mapping,
where yk = f(xk).
Figure 2.8: Structure of the feedforward artificial neural network. There can be more than one middle
or hidden layer [53].
The feedforward network is commonly used together with an error correction
algorithm such as backpropagation, gradient descent, or conjugate gradient descent.
The structure of the feedforward network, as shown in Fig. 2.8 comprises a number
of layers of PEs. There are three types of layers depending on their location and
function: input, hidden, and output. The connections between the layers are generally
feedforward during presentation of an input signal. However, during training the
network allows the backpropagation of error signals to the hidden units in order to
adjust the connection weights. Feedforward networks may have more than one hidden layer. Extra hidden layers allow more complex mappings but also require more information for training of the network. The choice is usually between an excessive number of PEs in one hidden layer and a smaller number of PEs spread over more than one hidden layer.
2.4.2 Recurrent Networks
The main difference between the ANN structure described above and that of the
recurrent networks is in the presence of feedback loops. A recurrent network may or
may not have input and output units since the outputs of a single layer of units can be
directed back to the inputs of the same units, i.e. every PE branches its output to the
inputs of all other units in the layer. Figures 2.9a and 2.9b show examples of
recurrent networks with and without input and output units. The other difference
between these two examples is in the presence of self-feedback loops . In Fig. 2.9a
each PE sends its output to the input of every other PE while in Fig. 2.9b the PEs also
receive their own outputs as inputs. Feedback loops usually pass through unit-delay elements, leading to nonlinear dynamical behaviour [32].
A particular example of a recurrent network is the Amari-Hopfield model or
Hopfield network [35]. The Hopfield network consists of a single layer of PEs which
receive an initial input vector. This input vector consists of component values which
may be either 1 or –1 and are fed one per PE. The initial output from each PE is fed
back to a branching node which fans out to every PE except the one from which the
output signal originates. The branching connections to every PE are weighted by N-1
weights, N being the total number of PEs in the network. The weighted signals are
summed and passed through a threshold activation function resulting in an updated
output. Hopfield networks normally operate asynchronously, i.e. the PEs are activated
one at a time and therefore a single updated output is produced at any given time,
there is a random input added to the weighted signal sum, and the new updated output
is held and used to update each future asynchronous activation of any PE [53].
Figure 2.9: a) Recurrent network without self-feedback connections, b) recurrent
network with self-feedback connections [32].
2.4.3 Self-Organizing Networks
Self-organizing networks or self-organizing maps (SOMs) are a special class of the
unsupervised ANNs group. SOMs were developed by Teuvo Kohonen [45]. The
learning process applied to these networks, as was described in a previous paragraph,
follows the competitive learning paradigm. SOMs construct topology-preserving
mappings of the input data in a way that the location of a PE carries semantic
information. The SOM can be considered as a specific type of clustering algorithm. A
large number of clusters are chosen and arranged on a square or hexagonal grid in one or two dimensions. This grid is in essence a lattice of PEs forming the SOM's single computational layer. Input patterns representing similar examples are mapped to
nearby nodes of the grid. Figure 2.10 illustrates the basic SOM structure.
Figure 2.10: Structure of a two-dimensional Self-Organising Map [32].
2.4.4 Radial Basis Function Networks and Time Delay Neural Networks
Radial Basis Function Networks (RBFNs) and Time Delay Neural Networks
(TDNNs) are two different ANN topologies with characteristics which separate them
from other classes of ANNs. The RBFNs are powerful network structures which
construct global approximations to functions using combinations of Radial Basis
Functions (RBFs) centred around weight vectors [54]. The basic RBFN structure is
shown in Fig. 2.11. A non-linear basis function is centred around each hidden node
weight vector. Hidden nodes have an adaptable range of influence or receptive field.
The output of the hidden nodes is a radial function of the distance between each
pattern vector and each hidden node weight vector.
The RBFN structure’s original motivation was in terms of functional
approximation techniques, regression and regularisation, and biological pattern
formation. The RBFN structure was chosen after a series of tests as the basic ANN
structure for the GEMNet II system for ore grade estimation. Chapter 3 gives a more in-depth discussion of RBFNs and the reasons behind their choice as the building block of the GEMNet II system.
Figure 2.11: Basic structure of the Radial Basis Function Network [53].
TDNNs are based on ordinary time delays to perform temporal processing
[50, 105]. The TDNN (Fig. 2.12a) is a multi-layered feedforward ANN whose PEs
are replicated across time. The building block of a TDNN is a PE whose inputs are
delayed in time. The activation of a PE is computed by passing the weighted sum of
its inputs through an activation function like a threshold or sigmoid function. The
overall behaviour of the network is modified through the introduction of delays. The
M inputs of a PE are each delayed by N time steps. Hidden PEs receive M * N delayed inputs plus M “undelayed” inputs, a total of M * (N+1) inputs. However, only the
hidden PEs activated at any given time step have connections to the inputs with all
the other units having the same connection pattern but shifted to a later point in time
according to their delay position in time.
TDNNs are used for position-independent recognition of features within larger patterns. TDNNs are trained on time-position independent detection of sub-patterns, a feature that makes them independent of error-prone pre-processing algorithms for time alignment. They are used to capture the concept of time symmetry as encountered in the recognition of phonemes using frequency-time images known as spectrograms (Fig. 2.12b).
Figure 2.12: The concept of Time Delay Neural Networks for speech recognition [50].
2.4.5 Fuzzy Neural Networks
Fuzzy logic and systems can be used in conjunction with ANNs in more than one
way to provide solutions for control problems, decision making, and pattern
recognition. The most common way of integrating the two technologies is the fuzzy
logic implementation by ANNs leading to neuro-fuzzy systems.
Fuzzy systems provide means of capturing uncertainty. Uncertainty is
inherent in almost every real-world problem. The essential characteristics of fuzzy
logic are as follows [117]:
• Exact reasoning is viewed as a limiting case.
• Everything is a matter of degree.
• Inference is viewed as the process of propagation of elastic constraints.
• Any logical system can be fuzzified.
The integration of ANNs with fuzzy systems results in a Fuzzy Neural Network (FNN) of one of the following types [93]:
• FNN with crisp number of inputs and fuzzy weights.
• FNN with fuzzy set input signals and crisp weights.
• FNN with both fuzzy input signals and fuzzy weights.
The building block of an FNN is a fuzzy version of the PE described in Paragraph
2.2.1. A possible FNN structure consists of a layered net with an input layer
implementing membership functions, a first hidden layer implementing fuzzy rules
and combining membership functions, a second hidden layer combining fuzzy values,
and an output layer providing defuzzification. Figure 2.13 illustrates an approach to FNN implementation.
Figure 2.13: An approach to FNN implementation.
2.5 Conclusions
The discussion given in this chapter covered the basic concepts of artificial neural
networks as well as major types of ANN learning and architecture. The potential of
this technology became clear through examples of ANNs presenting special
characteristics and areas of application. The ever-increasing research activity in this field has also been discussed, showing that ANNs are becoming more and more popular as tools for solving an increasing number of real-world problems. ANN technology is finding its way into a number of diverse engineering and decision-making problems in the mining industry, as will be demonstrated in Chapter 4 through a set of successful examples.
3. Radial Basis Function Networks
3.1 Introduction
In this chapter the discussion continues with an analysis of a unique type of ANN, the Radial Basis Function Network (RBFN). Radial Basis Functions (RBFs) were initially used for
solving problems of real multivariate interpolation. Work on this subject has been
extensively surveyed by Powell [79]. The theory of RBFs is one of the main fields of
study in numerical analysis [96, 80].
RBFNs are very simple structures. Their design is in essence a problem of
curve fitting in a high-dimensional space. Learning in RBFNs means finding the
hyper-surface in multi-dimensional space that fits the training data in the best
possible way. This is clearly different from most of the ANN design principles
discussed in the previous chapter. Function approximation and pattern classification
are the main areas of RBFNs application. One of the main advantages of RBFNs lies
in their strong scientific foundation. RBFs have been motivated by statistical pattern
processing theory, regression and regularisation, biological pattern formation, and
mapping in the presence of noisy data [96]. Therefore, RBFNs have inherited a wide
range of useful theoretical properties, which have been used to provide solutions to a
much wider range of problems than the RBFs themselves.
The choice of RBFNs in the development of GEMNet II was based on these
theoretical properties, which will be further discussed over the next paragraphs, but
also on results from experiments carried out using data from real mineral deposits.
The use of RBFNs also helped achieve one of the main aims of GEMNet II, which is to provide a fast alternative to existing grade estimation techniques. In the tests carried out at the beginning of the project, the speed of development of RBFNs was unmatched by any other architecture tested.
3.2 Radial Basis Function Networks – Theoretical Foundations
3.2.1 Overview
The basic principles of RBFs and of the networks derived from them will be discussed in this section. For the purposes of this thesis, the discussion will concentrate on the theory behind the use of RBFs for interpolation problems and not for pattern classification.
The transition from the original RBF methods for interpolation to RBFNs will also be
analysed.
3.2.2 Multivariable Interpolation
RBFs were first introduced to the problem of multivariable interpolation as an
approach to dealing with irregularly positioned data points. The problem of
multivariable interpolation is as follows [79]:
Given m different points $\{x_i;\ i = 1, 2, \ldots, m\}$ in $\mathbb{R}^n$, and m real numbers $\{f_i;\ i = 1, 2, \ldots, m\}$, one has to calculate a function s from $\mathbb{R}^n$ to $\mathbb{R}$ that satisfies the interpolation conditions

$$s(x_i) = f_i, \qquad i = 1, 2, \ldots, m. \qquad (3.1)$$

The choice of s from a linear space that depends on the positions of the data points forms the approach of RBFs. RBFs have the general form:

$$\phi(\lVert x - x_i \rVert), \qquad x \in \mathbb{R}^n, \quad i = 1, 2, \ldots, m \qquad (3.2)$$

where φ is the basis function from $\mathbb{R}^+$ to $\mathbb{R}$ and the norm of $\mathbb{R}^n$ is Euclidean. Several interpolation methods have been considered in which s has the form:

$$s(x) = \sum_{i=1}^{m} \lambda_i\, \phi(\lVert x - x_i \rVert), \qquad x \in \mathbb{R}^n. \qquad (3.3)$$
With the condition of the matrix

$$A_{ij} = \phi(\lVert x_i - x_j \rVert), \qquad i, j = 1, 2, \ldots, m, \qquad (3.4)$$

being non-singular, condition (3.1) defines the coefficients $\{\lambda_i;\ i = 1, 2, \ldots, m\}$ uniquely. The matrix A is normally called the interpolation matrix. These methods have a very useful property, proved by Micchelli [62]: if the data points are all different then, for all positive integers m and n, A is always non-singular. This theory applies to many choices of φ. However, in the case of basis functions of the form

$$\phi(r) = r^l, \qquad r \geq 0, \qquad (3.5)$$

the theory applies only under conditions concerning the degree l and the dimension of the input space m0. The class of RBFs covered by Micchelli's theorem includes the following functions:

1. Multiquadratics:

$$\phi(r) = (r^2 + c^2)^{1/2} \quad \text{for some } c > 0 \text{ and } r \in \mathbb{R} \qquad (3.6)$$
2. Inverse Multiquadratics:

$$\phi(r) = \frac{1}{(r^2 + c^2)^{1/2}} \quad \text{for some } c > 0 \text{ and } r \in \mathbb{R} \qquad (3.7)$$

3. Gaussian Functions:

$$\phi(r) = \exp\!\left(-\frac{r^2}{2\sigma^2}\right) \quad \text{for some } \sigma > 0 \text{ and } r \in \mathbb{R} \qquad (3.8)$$

4. Thin Plate Splines:

$$\phi(r) = r^2 \ln(r), \qquad r \in \mathbb{R} \qquad (3.9)$$
It should be noted that multiquadratics and thin plate splines increase when moving away from the centre of the basis function, while Gaussian functions and inverse multiquadratics decrease. The thin plate splines are interpolating functions derived by
variational methods [22, 61].
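The four basis functions of equations (3.6) to (3.9) are straightforward to express in code, as the sketch below shows; the parameter values are arbitrary.

```python
import numpy as np

def multiquadric(r, c=1.0):            # equation (3.6): grows with r
    return np.sqrt(r**2 + c**2)

def inverse_multiquadric(r, c=1.0):    # equation (3.7): decays with r
    return 1.0 / np.sqrt(r**2 + c**2)

def gaussian(r, sigma=1.0):            # equation (3.8): decays with r
    return np.exp(-r**2 / (2.0 * sigma**2))

def thin_plate_spline(r):              # equation (3.9): r^2 ln(r), taken as 0 at r = 0
    return np.where(r > 0, r**2 * np.log(np.maximum(r, 1e-300)), 0.0)

r = np.linspace(0.0, 3.0, 7)
for phi in (multiquadric, inverse_multiquadric, gaussian, thin_plate_spline):
    print(phi.__name__, np.round(phi(r), 3))
```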
3.2.3 The Hyper-Surface Reconstruction Problem
The interpolation technique described above suffers from a very serious problem: If
the number of data points in the training sample is greater than the number of degrees
of freedom of the underlying physical process, then fitting as many RBFs as the
number of data points leads to over-determination of the hyper-surface reconstruction
problem [11]. This is known in neural network terms as overfitting or overtraining.
Allowing an RBFN to reach this stage means degradation of its generalisation performance.
The problem of learning the hyper-surface defining the output in terms of the
input can be either well-posed or ill-posed. These terms have been in use in applied
mathematics for over a century. An unknown mapping f between a domain X and an
output range Y (both taken as metric spaces) is considered. Reconstructing this
mapping f is said to be well-posed when the following three conditions are satisfied
[101, 66, and 44]:
1. Existence: for every input vector $x \in X$ there is an output y = f(x), where $y \in Y$.
2. Uniqueness: for any pair of input vectors x, t ∈ X, f(x) = f(t) only if x = t.
3. Continuity: also referred to as stability, continuity requires that for any ε > 0 there exists δ = δ(ε) such that if ρX(x, t) < δ then ρY(f(x), f(t)) < ε, where ρ(⋅,⋅) is the distance between the two arguments in their respective spaces [32].
A problem is ill-posed when any of these conditions is not satisfied. Normally, a
physical phenomenon, such as orebody deposition, is a well-posed problem.
Learning from drillhole data is, however, an ill-posed problem because:
• For any pair of input vectors x, t there can be f(x) = f(t) even when x ≠ t.
• It is well known that drillhole and other physical samples from mineral deposits
contain physical sampling errors leading to the possibility for the neural network
to produce an output outside the range Y for a specified input. That means
violation of the continuity criterion.
The second of these reasons has a more serious impact on solving the problem, as lack
of continuity means that the computed input-output mapping does not represent the
true solution.
The issues of hyper-surface reconstruction with RBFs being an ill-posed
problem and leading to overfitting need to be addressed. A number of methods have
been developed for making an ill-posed problem into a well-posed one, as well as
preventing overfitting. The most important one, regularisation, will be discussed in
the following paragraph.
3.2.4 Regularisation
Regularisation is a method developed by Tikhonov in 1963 [102] for solving ill-
posed problems. Its use has been mostly explored in approximation theory.
Regularisation aims at overcoming the lack of continuity of an ill-posed problem by
means of an auxiliary nonnegative functional embedding prior information about the
solution. Such information is commonly the assumption that similar inputs
correspond to similar outputs. Tikhonov’s theory involves two terms:
1. Standard Error Term: denoted by Es(F), represents the standard error or distance between the desired response (target output) di and the actual response yi for the training examples i = 1, 2, …, N. The standard error term is defined as:

$$E_s(F) = \frac{1}{2}\sum_{i=1}^{N}(d_i - y_i)^2 = \frac{1}{2}\sum_{i=1}^{N}\left[d_i - F(x_i)\right]^2 \qquad (3.10)$$
2. Regularising Term: denoted by Ec(F), provides the means for embedding geometrical information about the approximating function F(x) into the solution. This term is defined as:

$$E_c(F) = \frac{1}{2}\,\lVert \mathbf{D}F \rVert^2 \qquad (3.11)$$
where D is a linear differential operator. It is in this operator that prior information
about the form of the solution is embedded and therefore its selection depends on the
problem at hand.
Regularisation provides a way of reducing the number of basis functions
when fitting RBFs by adding a penalty term described above as the regularising term
[83]. The principle of regularisation is the following:
Find the function Fλ(x) that minimises the Tikhonov functional E(F), defined by

E(F) = Es(F) + λEc(F)

where λ is a positive real number called the regularisation parameter. The choice of
λ is very crucial as it controls the balance of contribution from the sample data and
the prior information. It can also be seen as an indicator of the sufficiency of the
given data samples to specify the solution to the above minimisation problem.
The implementation of the regularisation theory leads to the regularisation
network [77]. As shown in Fig. 3.1, it consists of three layers. The first layer consists
of a number of input nodes equal to the dimension mo of the input vector x. The
second or hidden layer consists of non-linear nodes connected directly to all the input
nodes. The number of hidden nodes equals the number of samples N.
Figure 3.1: Regularisation network [32].
The activation function used in the hidden nodes is a Green's function G(x, xi). One of the most common Green's functions is the multivariate Gaussian function:

$$G(x, x_i) = \exp\!\left(-\frac{1}{2\sigma_i^2}\,\lVert x - x_i \rVert^2\right) \qquad (3.12)$$
where xi denotes the centre of the function and σi its width or receptive field. The unknown coefficients wi are defined as follows:

$$w_i = \frac{1}{\lambda}\left[d_i - F(x_i)\right], \qquad i = 1, 2, \ldots, N \qquad (3.13)$$

The minimising solution, denoted as Fλ(x), is given by:

$$F_\lambda(x) = \sum_{i=1}^{N} w_i\, G(x, x_i) \qquad (3.14)$$
The solution reached by the regularisation network exists in an N-dimensional
subspace of the space of smooth functions, the set of Green’s functions constituting
the basis for this subspace [77]. As Poggio and Girosi point out, the regularisation
network has three useful properties:
1. It is a universal approximator, as it can approximate arbitrarily well any multivariate continuous function, given a sufficient number of hidden nodes.
2. It has the best-approximation property, i.e. given an unknown non-linear function
f, there always exists a choice of coefficients that approximate f better than all
other choices.
3. It provides the optimal solution. In other words, the regularisation network
minimises a functional that measures the solution’s deviation from its true value
as represented by the training data.
3.3 Radial Basis Function Networks
3.3.1 General
The structure described above as the regularisation network has a very important weakness: as the number of basis functions initially depends on the number of training samples, the network produced can be very expensive in computational terms.
can be easily understood by considering the computation of the network’s linear
weights, which requires inversion of a very large matrix. Therefore there is a need for
reducing the complexity of the network leading to an approximation of the
regularised solution.
This is achieved by the introduction of a simplified version of the
regularisation network, the generalised radial basis function network. From this point
on, it will be assumed that RBFNs are generalised RBFNs. RBFNs involve searching
for a sub-optimal solution in a lower-dimensional space. This solution approximates
the regularised solution discussed before.
3.3.2 RBF Structure
Figure 3.2 illustrates the basic structure of the (generalised) RBFN. The first obvious
difference between this network and that of Fig. 3.1 is in the number of hidden layer
basis functions. In the RBFN there are m1 RBFs, typically fewer than the number of training samples, while in the regularisation network there were N RBFs, with N
equal to the number of training samples. Other structural differences include the
number of weights being also reduced to m1, and the introduction of a bias applied to
the output unit.
Figure 3.2: Structure of generalised RBFN [32].
Significant differences, not so obvious from the figures, concern the centre
positions and receptive fields of the RBFs as well as the linear weights associated
with the output layer. These are all unknown parameters and have to be learned by
the RBFN during training. In the regularisation network, only the linear weights are
unknown and require training. In the next paragraph, the function of the RBFN will
be further analysed. Special attention is given to the way of initially positioning the
RBF centres during initialisation and the RBF learning algorithms.
3.3.3 RBF Initialisation and Learning
For an RBFN to be able to receive training samples and function as a hyper-surface
reconstruction network, a number of its parameters need to be calculated. These
parameters include:
• The linear weights between hidden and output layer.
• The bias to the output units.
• The centres of the hidden layer RBFs.
There are a number of methods for RBFN initialisation and learning. The most
common methods are:
1. Random Centre Selection: it is the simplest of the methods. The centres are
randomly chosen from the training data set. It is a common method used when the
training data represent well the problem at hand. Learning using this approach is
concentrated in adjusting the linear weights between the hidden and output layer.
This is achieved using the pseudoinverse method [11]. The weights are calculated
using the formula below:
$$w = G^{+} d \qquad (3.15)$$

where d represents the target output vector of the training data set and G+ is the pseudoinverse of the matrix G, defined as

$$G = \{g_{i,j}\} \qquad (3.16)$$
where gi,j is the output of RBF i when presented with input vector j. Golub and Van Loan [28] provide an in-depth discussion of the computation of a pseudoinverse matrix.
2. Self-Organised Centre Selection: the learning method described above requires a data set representative of the problem at hand. There is no guarantee that the randomly selected centres reflect accurately the distribution of the data points. To overcome this problem, a clustering algorithm is used that creates homogeneous groups of data from the given data set. There are a number of clustering algorithms; however, in the case of RBFNs, the k-means clustering algorithm is the most commonly used [23]. Moody and Darken [65] describe the use of the k-means clustering algorithm. The number of centres k is set in advance. With the number of centres set, the algorithm proceeds with the following steps [9]:
I. The values of the initial RBF centres tk(0) are set randomly. These values need to be different from each other.
II. A vector x is selected from the data set and passed to the algorithm. The
index k(x) of the best-matching centre for the vector is calculated using the
minimum-distance Euclidean criterion:
$$k(x) = \arg\min_{k} \lVert x(n) - t_k(n) \rVert, \qquad k = 1, 2, \ldots, m_1 \qquad (3.17)$$
where tk(n) is the kth centre at iteration n.
III. The RBF centres are adjusted using the following rule:
$$t_k(n+1) = \begin{cases} t_k(n) + \eta\,[x(n) - t_k(n)], & k = k(x) \\ t_k(n), & \text{otherwise} \end{cases} \qquad (3.18)$$
where η is the learning-rate parameter receiving values between 0 and 1.
This parameter controls the speed of learning, i.e. the degree of
adjustment on the particular network parameter, in this case, the RBF
centres.
IV. The iteration pointer n is increased by 1 and the algorithm loops back to
step II. This process continues until the centres become stable.
The self-organised stage described above is followed by a supervised learning
stage, which allows the calculation of the linear weights between the hidden and
output layer. The overall approach depends largely on the initial selection of
centres. Several enhancements to the initial centre selection have been introduced
in order to avoid the situation where some initial centres get trapped in regions of
the input space with a low density of data points [14, 15]. An advanced version
of this learning method is used in the development stages of the GEMNET II
system.
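Steps I to IV amount to sequential k-means, which can be sketched as follows (Python/NumPy; the fixed learning rate and the names are assumptions of the sketch, whereas in practice η would be decayed with n so that the centres actually stabilise):

    import numpy as np

    def kmeans_centres(X, k, eta=0.1, n_iter=5000, seed=0):
        rng = np.random.default_rng(seed)
        t = X[rng.choice(len(X), size=k, replace=False)].astype(float)  # step I
        for n in range(n_iter):
            x = X[rng.integers(len(X))]                    # step II: present a sample
            kx = np.argmin(np.linalg.norm(x - t, axis=1))  # Eq. 3.17: best match
            t[kx] += eta * (x - t[kx])                     # Eq. 3.18: move the winner
        return t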
3. Orthogonal Least Squares: the OLS algorithm involves sequential addition of
new RBFs to a network, which starts with a single basis function. Each new RBF
is positioned in turn at each candidate data point and the linear weights are calculated
for each position. The centre that gives the smallest residual error is retained. This way the
number of RBFs increases step by step. The selection of a candidate data point for
centre positioning is done by constructing a set of orthogonal vectors in the space
S spanned by the hidden unit activation vectors for each training pattern. The data
point that produces the greatest reduction in the residual error is chosen as the
location of the new RBF centre. It is important to stop the algorithm well before
every data point is selected to ensure good generalisation.
4. Supervised Centre Selection: the basis of this method is the least-mean-square
algorithm (LMS). A supervised learning process based on the LMS algorithm sets
all the free parameters of the RBFN. The LMS algorithm takes the form of a
gradient descent procedure. Initially, a cost function is defined as follows:
$E = \frac{1}{2} \sum_{j=1}^{N} e_j^{2}$ (3.19)
where N is the number of training samples, and $e_j$ is the error defined as:
$e_j = d_j - F^{*}(x_j) = d_j - \sum_{i=1}^{M} w_i \, G\left( \| x_j - t_i \|_{C_i} \right)$ (3.20)
where $C_i$ is the norm-weighting matrix. The method aims at minimising E by
adjusting the free parameters of the network: the weights $w_i$, the centres $t_i$, and
the receptive fields $\Sigma_i^{-1}$. The adjustments to these three parameters are calculated
below [32]:
Linear Weights Adjustment:
$\frac{\partial E(n)}{\partial w_i(n)} = \sum_{j=1}^{N} e_j(n) \, G\left( \| x_j - t_i(n) \|_{C_i} \right)$ (3.21)
Centres Position Adjustment:
$\frac{\partial E(n)}{\partial t_i(n)} = 2 w_i(n) \sum_{j=1}^{N} e_j(n) \, G'\left( \| x_j - t_i(n) \|_{C_i} \right) \Sigma_i^{-1} \left[ x_j - t_i(n) \right]$ (3.22)
Receptive Fields Adjustment:
$\frac{\partial E(n)}{\partial \Sigma_i^{-1}(n)} = -w_i(n) \sum_{j=1}^{N} e_j(n) \, G'\left( \| x_j - t_i(n) \|_{C_i} \right) Q_{ji}(n)$ (3.23)
where $Q_{ji}(n) = [x_j - t_i(n)][x_j - t_i(n)]^{T}$.
The update rules for the three parameters, based on the three learning-rate
parameters $\eta_1$, $\eta_2$, and $\eta_3$, are given below:
Linear Weights Update Rule:
$w_i(n+1) = w_i(n) - \eta_1 \frac{\partial E(n)}{\partial w_i(n)}$ (3.24)
Centres Positions Update Rule:
$t_i(n+1) = t_i(n) - \eta_2 \frac{\partial E(n)}{\partial t_i(n)}$ (3.25)
Receptive Fields Update Rule:
$\Sigma_i^{-1}(n+1) = \Sigma_i^{-1}(n) - \eta_3 \frac{\partial E(n)}{\partial \Sigma_i^{-1}(n)}$ (3.26)
It should be noted that this gradient-descent procedure for RBFNs does not
involve error back-propagation.
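For illustration, the above update equations can be condensed into the following Python/NumPy sketch; it is deliberately simplified to isotropic Gaussian units with scalar widths instead of the full norm-weighting matrices, and it applies the true gradients of E, so it is an assumption-laden sketch rather than the exact procedure of [32]:

    import numpy as np

    def gradient_rbfn(X, d, t, sigma, w, etas=(0.01, 0.001, 0.001), n_epochs=500):
        # Jointly adapt weights, centres and widths by gradient descent on
        # E = 1/2 * sum_j e_j^2 (Eqs. 3.19-3.26, specialised to scalar widths).
        eta1, eta2, eta3 = etas
        for _ in range(n_epochs):
            diff = X[:, None, :] - t[None, :, :]       # x_j - t_i, shape (N, M, dim)
            sq = np.sum(diff**2, axis=2)               # squared distances (N, M)
            phi = np.exp(-sq / (2.0 * sigma**2))       # hidden unit outputs (N, M)
            e = d - phi @ w                            # errors e_j (Eq. 3.20)
            grad_w = -phi.T @ e                                      # dE/dw_i (cf. Eq. 3.21)
            grad_t = -((w * phi * e[:, None])[..., None]
                       * diff / sigma[:, None]**2).sum(axis=0)       # cf. Eq. 3.22
            grad_s = -w * (phi * e[:, None] * sq).sum(axis=0) / sigma**3  # cf. Eq. 3.23
            w -= eta1 * grad_w                         # Eq. 3.24
            t -= eta2 * grad_t                         # Eq. 3.25
            sigma -= eta3 * grad_s                     # Eq. 3.26
        return w, t, sigma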
5. Regularisation Based Learning: the final RBF learning method described is
based on regularisation theory. Yee [116] provides the justification for this RBF
design procedure, which is based on four main elements:
I. A radial-basis function, G, admissible as the kernel of a mean-square
consistent Nadaraya-Watson regression estimate (NWRE) [68, 108].
II. An input norm-weighting matrix $\Sigma^{-1}$, common to all centres, with entries
$\Sigma = \mathrm{diag}(h_1, h_2, \ldots, h_{m_0})$ (3.27)
where $h_1, h_2, \ldots, h_{m_0}$ are the bandwidths of a consistent NWRE kernel G
for each dimension of the input space. These bandwidths are given as the
product of the sample variance of the ith input variable estimated from the
available training data and a scale factor determined using a cross-
validation procedure.
III. Regularised strict interpolation for the training of the linear weights using
the following equation:
$\mathbf{w} = (\mathbf{G} + \lambda \mathbf{I})^{-1} \mathbf{d}$ (3.28)
where G is Green’s matrix and I is the N-by-N identity matrix (a short numerical sketch of this step is given after this list).
IV. The choice of the regularisation parameter λ and the scale factors is
achieved using a method such as ordinary cross-validation (CV).
Generally, larger values of λ correspond to a higher level of noise assumed
in the measured data. In a similar manner, the larger the value of a specific scale
factor, the less important is the associated input dimension for the
variation of the network output in relation to variations in the input. In
other words, the scale factors can be used for ranking the significance of
the input variables and can aid the reduction of the input space
dimensionality.
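A sketch of the regularised strict-interpolation solve of element III (Python/NumPy; the Gaussian Green's function with a single bandwidth and the function name are assumptions of the example) is given below; λ and the bandwidth would in practice be chosen by cross-validation, as stated in element IV:

    import numpy as np

    def regularised_interpolation(X, d, sigma, lam):
        # Every training sample acts as a centre; solve (G + lambda*I) w = d (Eq. 3.28).
        sq = np.sum((X[:, None, :] - X[None, :, :])**2, axis=2)
        G = np.exp(-sq / (2.0 * sigma**2))             # Green's matrix, N x N
        return np.linalg.solve(G + lam * np.eye(len(X)), d)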
3.4 Function Approximation with RBFNs
3.4.1 General
In this section, the discussion continues with an evaluation of the function
approximation capabilities of RBFNs. It will be shown that the family of functions
realisable by RBFNs is broad enough to uniformly approximate any continuous function.
input space dimension and the amount of input data on the RBFN approximation
properties will also be analysed.
3.4.2 Universal Approximation
The universal approximation theorem for RBFNs, as stated by Park and Sandberg
[74], opened the way for their use in function approximation problems, which were
commonly approached using Multi-Layered Perceptrons. The work of Park and
Sandberg [74, 73], Cybenko [19], and Poggio and Girosi [77] led to a new model for
function approximation based on generalised RBFNs. Specifically, the theorem can
be stated as follows:
Let $G: \mathbb{R}^{m_0} \rightarrow \mathbb{R}$ be an integrable bounded function such that G is continuous and
$\int_{\mathbb{R}^{m_0}} G(x) \, dx \neq 0$.
Let $\Im_G$ denote the family of RBFNs consisting of functions $F: \mathbb{R}^{m_0} \rightarrow \mathbb{R}$ represented by
$F(x) = \sum_{i=1}^{m_1} w_i \, G\left( \frac{x - t_i}{\sigma} \right)$
where $\sigma > 0$, $w_i \in \mathbb{R}$ and $t_i \in \mathbb{R}^{m_0}$ for $i = 1, 2, \ldots, m_1$. For any continuous input-output
mapping function $f(x)$ there is an RBFN with a set of centres $\{t_i\}_{i=1}^{m_1}$ and a common receptive
field $\sigma > 0$ such that the input-output mapping function $F(x)$ realised by the RBFN is close to
$f(x)$ in the $L_p$ norm, $p \in [1, \infty]$.
The universal approximation theorem provides the theoretical basis for the design of
RBFNs for practical applications.
3.4.3 Input Dimensionality
A very critical issue in the use of RBFNs as function approximators is the dimension
of the input space and its effect on the intrinsic complexity of the approximating
function(s). It is generally accepted that this complexity increases exponentially with the
ratio $m_0/s$, where $m_0$ is the input dimensionality and s is a smoothness index measuring
the number of constraints imposed on the approximating function. Therefore, for the
RBFN to be able to achieve a sensible rate of convergence, the smoothness index s
needs to be increased with the number of parameters in the approximating function.
However, the space of approximating functions attainable with RBFNs becomes
increasingly constrained as the input dimensionality is increased [32].
Increased dimensionality also has a great effect on the computational overhead
incurred during training of the RBFN. The dimension of the input space has a direct
control over the RBFN architecture – the number of input nodes, the number of
RBFs, and consequently, the number of linear weights between hidden and output
layer. Therefore, any increase in the input dimensionality causes an increase in
computer memory and power requirements, and an almost certain increase in
development time. The most common ways of addressing the high input
dimensionality for a given problem are to identify and ignore the inputs that do not
contribute considerably to the output or to try to combine inputs that present a high
correlation. Another way of reducing the input dimensionality, which is not always
applicable though, is to try and break a complex problem into a number of low
dimensionality problems that can be more effectively addressed using RBFNs.
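As a toy illustration of the first two strategies, candidate inputs could be screened as in the following sketch (Python/NumPy; the thresholds and function names are arbitrary assumptions, and linear correlation is only a crude proxy for input relevance):

    import numpy as np

    def prune_inputs(X, y, dup_thresh=0.95, rel_thresh=0.05):
        # Drop one input of every near-duplicate pair, then drop inputs whose
        # linear correlation with the target is negligible (crude screening).
        keep = list(range(X.shape[1]))
        corr = np.corrcoef(X, rowvar=False)
        for i in range(X.shape[1]):
            for j in range(i + 1, X.shape[1]):
                if i in keep and j in keep and abs(corr[i, j]) > dup_thresh:
                    keep.remove(j)
        return [i for i in keep
                if abs(np.corrcoef(X[:, i], y)[0, 1]) > rel_thresh]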
3.4.4 Comparison of RBFNs and Multi-Layer Perceptrons
Comparison of RBFNs with MLPs is inevitable since they are both used for similar
applications and both are universal approximators. This comparison also leads to
better understanding of these two ANN architectures. The differences between the
two architectures are both structural (concerning the topology of the network) and
functional (concerning the operation and use of the network):
Structural Differences:
• RBFNs have a single hidden layer. MLPs can have more than one hidden layer.
• The hidden units of an RBFN differ in nature from its output units, whereas MLP
hidden units follow the same neuron model as the output units.
Figure 3.3: Illustration of input space dissection performed by the RBF and MLP
networks [54].
Functional Differences:
• RBFNs construct local approximations to non-linear input-output mappings,
while MLPs construct global approximations.
• The output layer of an RBFN is always linear, while the MLP output layer can be
non-linear depending on the application.
• RBF hidden units calculate the Euclidean norm between the input vector and their
centre, while MLP hidden units compute the inner product of the input vector and
their synaptic weight vector.
• MLPs exploit the logistic non-linearity to create combinations of hyperplanes to
dissect pattern space into separable regions. RBFNs dissect pattern space by
modelling clusters of data directly and, therefore, are more concerned with data
distributions (Fig. 3.3) [54].
3.5 Suitability of RBFNs for Grade Estimation
RBFNs, like most ANN structures, have certain properties that establish them as
a natural choice for grade estimation. However, RBFNs also have a number of
additional useful properties that give them an advantage over other ANN
architectures for this specific problem.
The first of these properties, and possibly the most important one, is that RBFNs
construct local approximations to input-output mappings. It is well known that a
mineral ore deposit is a localised phenomenon. Modelling of a deposit’s grade in 3D
space using drillhole data can be considered to be a problem of hypersurface
reconstruction in 3D space, with this hypersurface consisting of a number of zones
that need to be locally approximated. Deposits commonly present localised
behaviour, i.e. points close to each other within one area of a deposit tend to have
similar grades. Clearly, this area very rarely extends to the entire deposit and,
therefore, the approach of fitting RBFs in specific locations can be advantageous.
These locations are found by clustering of the drillhole data in order to identify these
areas of similar ore grade behaviour.
RBFNs provide an approach to dealing with ill-posed problems due to the
properties that they inherit from regularisation theory. Grade estimation is an ill-
posed problem, even though the underlying phenomenon – the orebody deposition –
is well-posed. As was shown in Par. 3.2.3, reconstructing a deposit’s grade as a
hypersurface in the space derived from the drillhole data information, is an ill-posed
problem, hence RBFNs should be the choice of ANN for this task.
RBFNs also allow the calculation of reliability measures, such as the
extrapolation measure and confidence limit. Due to the localised nature of
approximation performed by RBFNs, it is possible to measure the local data density
for a given point x in the input space as an index of extrapolation [52]. Confidence
limits for the model prediction can also be calculated from the local confidence
intervals developed for each RBF unit using a weighted average of the latter. These
reliability measures were first introduced by Leonard et al. [71, 70] and incorporated in a
new ANN architecture that computes its own reliability, called the Validity Index
network (VI). Leonard et al. used a two-stage approach based on data densities
derived using Parzen windows [75], and an interpolation formula used for
determining the densities at arbitrary test points. These measures are now standard to
most of the commercial neural network simulators that provide RBFN development
options. Further examination of the use of reliability measures will be presented in
Chapter 7 with the discussion over the development of the GEMNET II system.
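As an illustration, an extrapolation index of the kind described can be approximated with a Parzen-window density estimate; in the sketch below (Python/NumPy; the Gaussian window, the bandwidth h and the names are assumptions) low values flag points far from the training data:

    import numpy as np

    def extrapolation_index(x, X_train, h=1.0):
        # Parzen-window estimate of the local training-data density at x;
        # low values flag extrapolation beyond the data (after Leonard et al.).
        sq = np.sum((X_train - x)**2, axis=1)
        dim = X_train.shape[1]
        return np.mean(np.exp(-sq / (2.0 * h**2))) / ((2 * np.pi)**(dim / 2) * h**dim)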
Finally, another advantage of RBFNs over other ANN architectures that is
derived from their theoretical properties, is their speed of development. In the case of
low input dimensionality, RBFN learning is expected to be much faster than that of
any other ANN architecture used for the same problem. The author approached ore grade
estimation using an input space of maximum four dimensions (Easting, Northing,
Elevation, and sample Length), a number low enough for the networks to be very fast
to develop.
In later chapters, the suitability of RBFNs for the problem of grade estimation
will be further demonstrated using experimental results on a large number of case
studies.
4. Mining Applications of Artificial Neural Networks
4.1 Overview
Artificial Intelligence (AI) tools have been in use for years in a number of mining
related applications. Expert and knowledge based systems, probably the most popular
AI tools, have found their way into a number of computer-based systems supporting
everyday mining operations as well as production of mining equipment. In recent
years, AI has provided tools for optimizing operations and equipment selection,
problems involving large amounts of information that humans cannot easily cope
with in the process of decision-making. These AI systems together with an ever-
increasing number of sophisticated purpose-built computer software packages have
created a very favorable environment for the introduction of yet another powerful AI
tool, the Artificial Neural Networks.
In the ‘90s the mining industry was introduced to a number of ANN
based systems, some of which found their way to fully commercialized products, as
will be illustrated by some examples in this chapter. It should be noted however that
these examples are very few considering the total number of applications at the
research level, and the overall research effort carried out at universities and research
institutes around the world.
The applications described in this chapter are divided into two groups. The first
group will include examples of ANN systems for Exploration and Resource
Estimation. These systems have many common points with the GEMNet II system
developed by the author, and more importantly share the same aims. The second
group of applications includes examples covering other mining problems.
This grouping does not mean in any way that Exploration and Resource Estimation is
the most important of the mining tasks or that there are more ANN systems targeted
to this field of mining. The grouping as well as the selection of the examples was
purely based on the relevance of the applications to the subject of this thesis.
4.2 ANN Systems for Exploration and Resource Estimation
4.2.1 General
Exploration and resource estimation commonly involves the prediction of various
parameters characterizing a mineral deposit or a reservoir. The input data usually
comes in the form of samples with known positions in 3D space. The majority of the
ANN systems developed for these predictive tasks are based on the relationship
between modelled parameters and sample locations. The most common practice when
developing the training patterns set for an ANN, is to generate input-output pairs with
the input being the sample location and the desired output being the value of the
modeled parameter at that location. In other words, most of the ANN systems treat
the modeling of the unknown parameters as a problem of function approximation in
the sample co-ordinates space.
Some other systems, like GEMNet II, go a step further to exploit information
hidden in the relationship between neighboring samples. The estimation of a
parameter at a specific location in 3D space is, in this case, depending on information
from samples around that location. In fact, GEMNet II is trying to use both this and
the above approach wherever possible.
Most of the systems described in the following paragraphs work in a 2D input space
(Easting, Northing). They also share similar ANN architectures, usually based on the
MLP or RBF network.
4.2.2 Sample Location Based Systems
The first example is an MLP based ANN for ore grade/resource estimation developed
by Wu and Zhou [112]. The network architecture, as shown in Fig. 4.1, is an MLP
with four layers: an input layer, two hidden layers, and one output layer. The network
receives two inputs, the Easting and Northing of samples. The two hidden layers are
identical and have 28 units each. It is a relatively large network considering the
dimension of the input space (2D). However, the developers have used a fast learning
algorithm called the Dynamic Quick-Propagation (DQP) [113] that is based on the
quick-propagation algorithm [24] and a system for the determination of the hidden
layer size called Dynamic Node Creation [4]. The size of the network was, therefore,
determined through a learning process and should not be a cause for concern.
Figure 4.1: ANN for ore grade/resource estimation by Wu and Zhou [112].
This ANN has been tested on assay composites from a copper deposit. A set of 51
drillhole composites has been used to train the network over an area of 3600 square
meters. The results of the trained network have been compared with results from the
polygonal method (manual and computer based), inverse distance, and kriging. These
results were based on Hughes, Davis, and Davey [36]. Unfortunately, there was no
comparison of the grade/resources estimates with actual values. This limitation tends
to be a very common problem in most of these studies.
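For orientation, a present-day equivalent of such a sample-location network can be put together in a few lines; the sketch below uses scikit-learn's MLPRegressor with two 28-unit hidden layers, and the library, the training algorithm and the dummy data are assumptions of the example, not the DQP algorithm used by Wu and Zhou:

    import numpy as np
    from sklearn.neural_network import MLPRegressor

    # Hypothetical composites: (Easting, Northing) -> grade
    coords = np.random.rand(51, 2) * 60.0      # 51 samples over a 60 m x 60 m area
    grades = np.random.rand(51)                # placeholder grade values

    net = MLPRegressor(hidden_layer_sizes=(28, 28), activation='logistic',
                       max_iter=5000, random_state=0)
    net.fit(coords, grades)
    block_grades = net.predict(coords)         # in practice: block-centroid coordinates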
Similar to the above network is the ANN developed more recently by Yama and
Lineberry [115], which is again based on the MLP architecture but uses the original
back-propagation learning algorithm. This network has one hidden layer with 50
hidden units instead of two. This difference brings back the question of network
complexity, i.e. whether to use a single but large hidden layer or multiple but small
layers. It seems that most of the researchers in the field choose a single hidden layer
mainly because of the reduced computational overhead as well as a reduction in the
required quantity of training samples.
Yama and Lineberry used sulphur data from 1152 samples from a 7315 x
4572-m coal property in northern West Virginia. It should be noted that the use of
real data in similar studies is very rare. The property was divided into 25 regions (914
x 914-m) due to computer memory limitations. For every region, a network was
trained using the Easting and Northing as inputs and the sulphur values as output. All
the data values were normalized before they were used for training and testing of the
networks. The data were normally distributed, a property that usually causes the
networks to give outputs close to the mean value. Presenting the tails of the
distribution more often to the network and with a higher learning rate has reduced this
effect. The results obtained from the ANNs were compared with results from kriging.
Clarici et al. [16] had earlier described a similar approach using a single hidden
layer network. In that study though, only one neural network was used for the entire
sampling area.
Moving from 2D to 3D input space, Caiti and Parisini [13] have used RBF
networks to interpolate geophysical properties of ocean sediments, e.g. porosity,
density, and grain size. The choice of RBF networks was based on their strong
theoretical foundation, especially in function approximation. GEMNet II is based on
RBF networks for very similar reasons to those discussed in the previous
chapter, as will be further analysed in the following chapters. Caiti and Parisini used the
Gaussian as the basis function of the interpolating network. They suggested, as many
others, that any discontinuities of the interpolated property can be handled by a
smooth, continuous approximation network provided with enough information close
to the discontinuity. The choice of RBF centers has been based on the number of
values on the z-axis. As they very logically identified, there are normally many
samples along the z-axis and fewer on the x-y plane due to the sampling techniques used.
In the case of a large number of samples on the z-axis, the RBF centers are mobile, in
other words their positions can change with learning. However, in the case of a small
number of samples on the z-axis, the RBF centers are fixed, i.e. their positions remain
unchanged during training and the network is updated by adding extra RBFs
whenever a new sample is presented.
Density data from cores in an area of the Tyrrhenian Abyssal Plain, in the
Mediterranean Sea have been used as input data for the training and testing of the
network. Part of the data has been kept out of the training procedure and then used to
test the trained network’s prediction accuracy.
One of the very few examples of an ANN system developed into a fully
commercial product is Neural Technologies’ Prospect Explorer. It is a complete
system offering data analysis, visualization, and detection of anomalies as well as
analysis of the relationships between them. The system is based on a neural structure
called AMAN (Advanced Modular Adaptive Network), shown in Fig. 4.2 [70].
AMAN is not a type of neural network. It is a complex system consisting of different
types of networks, which are trained in both supervised and unsupervised modes. The
user has a choice of networks and learning strategies depending on the problem at
hand. As shown in Fig. 4.2, AMAN can be described by the following:
• A set of hierarchically arranged networks: a problem is divided into sub-problems
and a network is assigned to each one of them.
• The type of the individual networks can be chosen to suit the nature of the
specific sub-problem.
• The controller, called ‘supervisor’, can then handle the outputs of the individual
networks to form a final result for the problem.
Figure 4.2: General structure of the AMAN neural system.
AMAN as part of the Prospect Explorer can help to automate the detection of
anomalies from large quantities of survey data. Prospect Explorer provides the
following functions:
• Anomaly Detection: an interpolated grid forms the basis of a color map showing
areas of potential anomalies. This map can be used as a guide for further analysis.
• Cluster Identification: regions of survey data sharing common types of survey
results are identified.
• Correlation Analysis: layers of interpolated data can be correlated to illustrate
the relationship between the values of different types of data.
• Fuzzy Search: pattern-searching tool to analyze how closely regions match a
search specification supplied by the user.
• Relationship Explorer: similar to correlation analysis, but performed at specific
geographic locations.
Prospect Explorer has been used with success in a reasonably complex
exploration task that took place in the Girliambone region in New South Wales,
Australia. This case study involved several layers of data from a copper mine area of
110 square kilometers. The system has successfully identified the already known
deposits in the area as well as some previously unknown ones.
Cortez et al. [18] presented a hybrid system combining ANN technology with
geostatistics for grade/resources estimation. Their system, called NNRK (‘Neural
Network estimation of the drift and Residuals’ Kriging’), is based on a network with
3 inputs (the sample’s X, Y, Z co-ordinates), 6 hidden units and one output, the
respective Zn assay [18]. As shown in Fig. 4.3, the chosen ANN is very simple
compared with the larger networks described in the previous examples. This ANN is
used for the identification of the underlying large-scale structure (trend modelling),
while residual analysis is performed at sampled locations by stationary geostatistical
methods that model local spatial correlations. Final estimates are given as a sum of
both estimations. The developers have chosen the use of geostatistics to support the
ANN estimations because of the results they obtained from a preliminary study
showing that the back-propagation network could not follow local variations of grade.
In the NNRK system, these are handled by ‘residual kriging’.
Figure 4.3: Back-propagation network used in the NNRK hybrid system.
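To make the combination concrete, the following hedged sketch (Python/NumPy; the trend_net object with a predict method and all other names are assumptions, and inverse distance weighting merely stands in for the residual kriging step) computes an NNRK-style estimate:

    import numpy as np

    def nnrk_estimate(x, trend_net, sample_coords, sample_grades, power=2.0):
        # ANN models the large-scale trend; residuals at the sample locations
        # are interpolated back (IDW here as a stand-in for residual kriging).
        residuals = sample_grades - trend_net.predict(sample_coords)
        dist = np.linalg.norm(sample_coords - x, axis=1)
        wts = 1.0 / np.maximum(dist, 1e-9)**power
        return trend_net.predict(x[None, :])[0] + np.sum(wts * residuals) / wts.sum()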
The hybrid system has been applied to a case study from a large Portuguese zinc
deposit. As shown in Fig. 4.4, the data are quite spread in 3D space, a situation very
common in case studies with real data. The data came from a drilling programme of
the Feitais deposit, a massive orebody belonging to the Aljustrel group of mines in
South Portugal. The dataset, consisting of 768 samples, was split into two parts. The
validation set included 160 samples, about 20% of the total. The rest was used for
training the ANN, a process that involved 3000 iterations.
Figure 4.4: Drillhole data used for testing the performance of the NNRK system [18].
The results obtained by the NNRK methodology are compared with those
produced by the ANN and kriging in the following table:
Table 4.1: Comparison of NNRK, ANN, and kriging estimates.
Populations     n     m (mean)   σ²      σ/m
Sampled data    238   3.314      3.988   0.603
NNRK estim.     238   3.516      2.347   0.436
ANN estim.      238   3.493      0.141   0.108
Krig. estim.    238   3.461      1.281   0.370
The results presented show that the combination of ANN and kriging can improve
considerably over the results that can possibly be obtained from each one of the
methodologies individually. It should be noted, though, that the back-propagation
network used in this study is only capable of performing global approximations
leading to smooth estimates. The number of hidden units in this network is also
surprisingly low considering the dimensionality of the input space.
4.2.3 Sample Neighborhood Based Systems
All of the systems above try to reconstruct the ore grade surface from the sample co-
ordinates. This strategy works very well when this surface is fairly continuous and
there are enough samples covering the considered area. It also works better when
done in 2D rather than 3D – a single network seems to be producing outputs close to
the average value when faced with a very complex deposit in 3D and sometimes even
in 2D. Wu and Zhou [112] created a large network (56 hidden units) to perform grade
estimation on a 2D dataset of a fairly continuous copper deposit.
Quite reasonably, some researchers tried to take advantage of the information
hidden in the relationship between neighboring samples. This approach is followed in
general terms by the most advanced existing methods for grade estimation like
inverse distance weighting and kriging. Most of the examples following this approach
choose as neighbours the samples closest to the estimation point and treat the
problem of ore grade estimation as a mapping between the surrounding grades and
the grade at the estimation point (Fig. 4.5). The samples are normally arranged on a
grid, and the inputs are formed from the eight nodes surrounding the estimation point.
A very good example of this technique is given by Williams [110]. The main
assumption made in this example is that there is a strong correlation between gold
grades and magnetic data. The technique was applied in 2D space. Naturally, building
the grid of magnetic data from scattered samples by interpolation can introduce errors
due to smoothing. This is a serious disadvantage of all methods that require data
arranged in grids.
Figure 4.5: 2D approach of learning from neighbour samples arranged on a regular grid.
This single network approach has an additional limitation: the use of a single network
over the entire area of interest leads to the assumption that the learnt mapping
between the neighbour samples and the grade can be applied globally. In other words,
the method leads to a global approximation of ore grades.
Going a step further, some researchers have introduced multiple networks to
overcome these limitations. These modular systems consist of more than one network
each responsible for learning a different area of the deposit. The GEMNet system
developed by Burnett [12] is a very good example of a modular neural network
system for grade/resource estimation. Figure 4.6 illustrates the principle of
GEMNet’s operation. The deposit is divided into overlapping zones. The selection of
zones was arbitrary, which is a point where improvement could be made. In each
zone, a different network was trained and the final estimate for every point was taken
as the average of the networks trained in the specific area. As zones were
overlapping, there was almost always more than one network giving estimates.
Having more than one estimate led to the introduction of a reliability measure based
on the variance of the individual estimates – an indicator that can be used as a guide
for the reliability of the final estimate. This indicator was also used in the GEMNet II
system with minor changes (Chapter 7).
Figure 4.6: Modular network approach implemented in the GEMNet system [12].
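The combination rule itself is simple enough to state in a couple of lines; the sketch below (names are illustrative) averages the overlapping zone networks and uses their spread as the reliability indicator:

    import numpy as np

    def combine_zone_estimates(zone_estimates):
        # One estimate per zone network covering the point; the mean is the
        # final grade and the variance is the reliability indicator
        # (low variance = high reliability).
        est = np.asarray(zone_estimates)
        return est.mean(), est.var()

For a point covered by three zone networks, combine_zone_estimates([3.1, 3.3, 3.0]) would return the averaged grade together with the variance serving as the reliability indicator.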
A similar modular approach has been introduced by Geva et al. [27] for
function approximation. In both cases, the developers used multiple MLP networks
acting in a very similar manner to the RBFs of a single RBF network. The results
obtained by both Burnett and Geva have supported the choice of RBF networks as the
building unit for the GEMNet II system. However, GEMNet II is quite different in
the way these networks are used to provide a combined grade estimate, as it will be
shown in Chapter 7.
Figure 4.7: Scatter diagram of GEMNet estimates on a copper deposit [12].
GEMNet has been tested on simple function approximation problems, as well
as simulated ore deposits. Even though most of the case studies were using 2D data,
the results obtained were very encouraging and suggested that further research work
should be carried out to assess the effectiveness of the modular approach. Figures 4.7
and 4.8 show the results from a 2D study with GEMNet.
Figure 4.8: Contour maps of GEMNet reliability indicator and grade estimates of a copper
deposit [12].
4.2.4 Conclusions
The discussion in this section has examined some of the most important examples of
neural network based systems for ore grade/resource estimation. A number of
techniques have been developed that differ mostly in the number of networks used
and the way data is presented to them.
As with the conventional methods for ore grade estimation, it is fairly safe to
say that there is no universally applicable solution to the problem. This is particularly
true when the neural network system is based on a single network. These systems
varied considerably in their architecture from one study to the other. The number of
hidden units changed even though the dimensionality of the problem remained
constant. Systems with modular structure, i.e. multiple networks, are more flexible in
the way they adjust to a specific deposit.
Both the sample co-ordinates and the sample neighbourhood based systems
can have their advantages and disadvantages depending on the deposit at hand. One
would expect the first to be better suited to continuous deposits where the grade can be
considered to be a hypersurface in the sample co-ordinates input space (a simple
surface in the case of 2D samples). The results obtained from the described studies
support this to a certain degree.
On the other hand, complex deposits presenting a localised behaviour cannot
be modelled well by systems producing global approximations unless there are large
amounts of data to describe the local variations, a case that is very rare. These
deposits call for more flexible structures that can construct local approximations of
grade. Therefore, modular systems can be the choice for modelling complex deposits
using 2D or 3D data.
4.3 ANN Systems for Other Mining Applications
4.3.1 Overview
A number of other mining related problems have been approached using ANN
technology. These problems commonly relate to pattern classification, prediction and
optimisation. ANNs have been successfully applied to these areas and are therefore
suitable for similar mining problems.
In the following paragraphs, such problems and their ANN solutions are
briefly described. The applications shown range from geophysics to plant
optimisation and illustrate the fact that ANN systems can be useful to a very large
number of problems.
4.3.2 Geophysics
Geophysics is a relatively new area for ANN systems. However, in the last few years
ANNs have become a very popular tool in the interpretation of seismic and
geophysical data from various sources.
Garcia et al. [26] have used an MLP (Fig. 4.9) trained using back-propagation
for the inversion of lateral electrode well logs. Inversion represents the process of
constructing an earth model from the log data. The data used for training the network
were derived from a finite difference method that simulated the lateral log. The
trained network was tested using real data and the results were compared with those
from an automated inversion model. The study has shown promising results and has
presented the advantages of the use of ANN for the specific problem.
Figure 4.9: Back-propagation network used for lateral log inversion [26]: 51-node input and
output layers (one node for every 2 ft interval of a 100 ft log) and a 40-node hidden layer.
Connections between layers are not shown.
In a similar fashion, Rogers et al. [85] used an MLP network for the prediction
of lithology from well logs. Malki and Baldwin [56] compared the results produced
by neural networks trained using well logs from different service companies. More
specifically, networks were trained using data from one service company and tested
on data from another, and the study was repeated using training data from both
companies and tested on data from each one individually. The results have shown that
better performance is obtained when using data from both service companies.
Wanstedt et al. [107] applied neural networks to the interpretation of
geophysical logs for orebody delineation. The data used for the development and
testing of their approach were taken from the Zinkgruvan mine in Sweden. The
network used was quite small – three layers with 3 inputs, 7 hidden units, and 1
output. The inputs were the gamma-ray, density, and susceptibility, and the output
was the ore grade (Zn, Pb, or Ag). The study reports good results in estimating the
grades and consequently interpreting the lithology (Fig. 4.10). Unfortunately no
numerical measurement of the network’s performance is provided.
Figure 4.10: Estimated grades and assays (red and blue) vs. actual (black) [107].
Murat et al. [67] used an MLP for the identification of the first arrival on a
seismogram. Roessler [84] used NETS, a neural network simulator written at
NASA/Johnson Space Center to develop a neural network for analysing wave arrivals
from seismic waves transmitted from one borehole and received from another. The
network was trained on a binary pixel image of the seismic trace data. The input layer
consisted of a large array (97 x 41 = 3977) of input nodes, the hidden layer had 50
units, and the output layer had two units. The network was trained to produce a
binary pattern in its outputs, i.e. the outputs were either 1 or 0. The different
combinations of outputs were indicative of the relative position of the first arrival to
the current positive lobe. Once again, no numerical measurement of the network’s
performance during training and testing was provided in the study.
Barhen and Reister [6] developed DeepNet, a system based on the MLP that
predicts well pseudo logs from seismic data across an oil field. DeepNet combines a
very fast learning algorithm, systematic incorporation of uncertainties in the learning
process, and a global optimisation algorithm that addresses the optimality of the
learning process. The system has been successfully applied in the Pompano field in
the Gulf of Mexico.
4.3.3 Rock Engineering
King et al. [43] have developed an unsupervised neural network for the discovery
of patterns in roof bolter drill data. The network successfully classified 617 drill
patterns into just 9 or 16 unique features representing major geologic features of a mine
roof. The patterns consisted of the penetration rate, thrust, drill speed, and torque. A
system consisting of this network and an expert system was developed for the
evaluation of coal mine roof supports [95].
Millar et al. [63] used self-organising networks to model the complex
behaviour of rock masses by classifying input variables related to the rock stability
into two groups: failure or stability.
Walter [106] used Kohonen networks for the classification of mine roof strata
into one of 32 strength classes. The developed system can provide an estimate of
strength within two seconds giving the drill operator a warning almost in real time
when a potentially dangerous layer is reached.
4.3.4 Mineral Processing
Neural networks have been successfully applied to a number of pattern classification
problems. Particle shape and size analysis seems to be a natural field of application
for ANNs and especially for unsupervised techniques.
Maxwell et al. [59] developed an ANN based system for particle size analysis
based on video images. The system analyses images from material on a conveyor and
predicts the particle size distribution.
Oja and Nyström [72] applied self-organising maps for particle shape
quantification. Image analysis is performed on mineral slurry particles by use of a
SOM which extracts the features affecting the behaviour of powders and slurries. The
training data set consisted of 3000 binary images of 500 particles. The produced map
size was 12 x 10. The developed SOM was tested on 360 particle images with
success. The test showed that the SOM was capable of separating into different
clusters even minerals that did not have strong shape features.
Deventer et al. [104] again used the SOM for the on-line visualisation of flotation
performance. The structure of the froth is quantified by the neighbouring grey level
dependence matrix. The SOM had a map size of 20 x 20 and there were three
classifications of Zn grade peaks as being positive (Class_+1), zero (Class_0), or
negative (Class_-1) for each of the image features. The classification was based on a
number of image features. The developed SOM was to be used as part of an
automated computer vision system for the control of flotation circuits.
Petersen and Lorenzen [76] applied the SOM to the modelling of gold
liberation from diagnostic leaching data. The data came from seven different gold
mines in South Africa. The ores from the mines were fed to mills and the ore samples
were screened into three size intervals. One of the fractions was further screened into
six size fractions giving a total of eight fractions. Representative samples were then
fed to a ball mill, and the product was screened into the same six size fractions. On
each of the fractions, diagnostic leaching was performed for each of the ore types.
The percentage of gold deportment and percentage of gangue, the percentage of free
gold in each fraction, the head grade, and the mass distribution were projected to a 10
x 10 map. The clustering produced was well defined for the different sample sources
(gold mines).
4.3.5 Remote Sensing
Probably one of the most popular areas of neural network application, remote sensing
presents problems which are ideal for architectures such as the SOM, the LVQ, or
even the standard MLP. The examples given here, even though not directly linked to
mining activities, demonstrate the potential of ANNs in this field.
Bischof et al. [8] used an MLP for the multispectral classification of Landsat
images. These images came from a Landsat Thematic Mapper (TM) and were 512 x
512 pixels in size. They were also analysed into 7 spectral channels (bands) which
were used as the inputs to the network (13 units for each band representing different
intervals from 0 to 255). The network then had to learn to classify the 7 band values
to one of four types of land (built-up land, forest, water, and agricultural land), each
represented by an output of the network. Even though this architecture gave good
results, the developers extended the network to include a 7 x 7 pixel map of texture
from band 5. Naturally the number of hidden units was increased from 5 to 8 units.
The results from this extended architecture were better than the non-extended one in
all types of land.
Gopal and Woodcock [29] used an MLP for the detection of forest change from
Landsat TM images between 1988 and 1991. A 10-input vector of 10 TM bands (5
from 1988 and 5 from 1991) is used with the single output being the absolute or the
relative change. The results obtained with the developed MLP were better than those
obtained with the conventional method for this task.
Poulton and Zaverton [78] give a comparative study between different neural
network architectures used for classification of TM images. The architectures
compared were the back-propagation network, LVQ, counter-propagation network,
functional link, probabilistic network, and the SOM. From the tests performed, they
concluded that the LVQ architecture was the most flexible and robust one. They also
suggested the use of ANNs for the analysis of geochemical and geophysical data,
location of favorable prospects using GIS data, lithologic mapping from remote
sensing data, and estimation of parameters in a similar way with kriging.
Krasnopolsky [48] used an MLP for the retrieval of multiple geophysical
parameters from satellite data. These parameters were the surface wind speed,
columnar water vapor, columnar liquid water, and sea surface temperature (the four
outputs of the MLP). The MLP had five inputs taken from five Special Sensor
Microwave Imager brightness temperatures. The hidden layer had 12 units. The
simultaneous retrieval of multiple parameters improved the retrieval of each one
individually allowing physically coherent and consistent geophysical fields to be
produced.
Xiao and Chandrasekar [114] used an MLP for rainfall estimation from radar
observations. More specifically, two networks have been developed, one using
reflectivity as the only input, and the other using both reflectivity and differential
reflectivity as the inputs. The networks were trained on data obtained from a multi-
parameter radar and raingages from the Kennedy Space Center. The trained networks
were then used to estimate rainfall for four days during the summer of 1991. The
training patterns consisted of a square grid (3 x 3 km) of reflectivity values as well as
distances from the grid nodes to the point of estimation. The raingage values were
used as the target outputs. The trained network estimates and raingage values have
shown good agreement at all sites.
4.3.6 Process Control-Optimisation and Equipment Selection
Process control and optimisation tends to be a tedious task involving large amounts of
data from very different sources. ANNs are ideal for handling such tasks and this is
why many researchers in the field of process control turned to them for developing
solutions. Process control and optimisation of mineral processing plants as well as the
mining process itself are a special case of these tasks and can therefore be approached
by neural networks.
Van der Walt et al. [103] used the MLP for the simulation of the resin-in-pulp
process for gold recovery. Flament et al. [25] used the MLP for the identification of
the dynamics of a mineral grinding circuit and the development of a control strategy.
Bradford [10] used neural networks in a number of studies modelling the behaviour
of different parts of a mineral processing plant.
Ryman-Tubb and Bolt of Neural Mining Solutions Pty Ltd [91] describe the
use of the AMAN architecture (described before) for integrated process system
modelling and optimisation. The suggested areas of application include froth
flotation, carbon-in-pulp (CIP), milling, and others. Their case study presented a real-
life example based on a multi-stage copper extraction process. The trained networks
(MLPs) were used for the following:
• Prediction of stripped copper cathode from electrowinning
• Prediction of raw material usage
• Identification of key plant parameters
• Analysis of the effect of plant input parameters
• Economic optimisation to determine cost-effective control settings
The developers claimed the following benefits from the ANN approach:
• Decreased raw material costs
• Increased copper production
• Optimised planning of new and existing heap operations
• Ability to implement “Just-in-time” purchasing policy
• Planning of new heaps
• Reduced reliance on individual human operators
Finally, Schofield [94] investigated the use of neural networks as well as other AI
tools for the selection of surface mining equipment.
4.4 Conclusions
Quite clearly, the spectrum of neural network applications in mining is very wide.
This is demonstrated by a number of exciting and very promising studies by a number
of people from different scientific fields. The examples presented in this chapter
support the choice of ANNs as the basis for developing solutions to mining problems
where conventional techniques fail in one way or another. Mining is always about time
and money, and so far neural networks have shown that they can perform well on both
counts. The systems described in the above examples were fast, reliable and, most of
the time, rested on a very stable theoretical background on which the validity of the
proposed solution is based.
The general trend in the mining industry for automation to the greatest degree
calls for technologies such as the ANNs that can utilise large amounts of data for the
development of models which otherwise are very difficult or sometimes even
impossible to identify. The speed of ANNs – at least in application mode – also
allows the development of real- or near real-time systems, which can quickly
recognize potential problems or even dangers during a certain process.
Another advantage of ANNs is in the minimisation of the necessary
assumptions for a given problem. Especially in the case of grade estimation, this
attribute proves very valuable. The examples of ANN application to grade estimation
given earlier in this chapter supported this and other advantages of neural networks.
The ambition of the author is to incorporate these advantages into an integrated neural
network system for grade estimation.
5. Development of a Modular Neural Network System for Grade Estimation
5.1 Introduction
Before moving into the in-depth analysis of the integrated GEMNet II system for
grade estimation, it is necessary to go through the development steps that led to the
final architecture. Many things have changed in the developed architecture since the
beginning of this project. The number of networks, their topological characteristics,
the learning algorithm, the error measures, and even the inputs and dimensionality of
the input space were changing or, one could say, evolving as more tests were run and
the author gained more insight to the numerous algorithms and developments in the
field of artificial neural networks. Going through these steps helps to understand the
reasoning behind the developed system and how the original aims were met.
GEMNet II was named as the successor to the original GEMNet system [12]
also developed at the AIMS Research Unit. The author was very fortunate to have a
starting point well ahead of any research carried out elsewhere, something that
inevitably set the aims for the development of GEMNet II at quite a high level.
GEMNet II was developed with real-life situations in mind from day one. The
main aim was to find a reliable and robust architecture that required no significant
interaction with the user in order to provide accurate grade estimation results. After
the identification of this architecture and the proof of its validity through a number of
case studies, the next aim would be to integrate the architecture in a user-friendly
system that would allow straightforward application with no important parameters to
be set by the user. The system should also be capable of removing the ‘black box’
attribute neural networks are famous for, an attribute completely unacceptable in the
mining industry especially when it comes to grade estimates on which decisions
involving large amounts of financial resources will be based.
In this chapter, the development of the modular neural network architecture
for grade estimation will be described. Mathematica from Wolfram Research [111]
was used for the development of all prototype systems as it was found to be a very
resourceful environment providing all the necessary tools for understanding and
validating different neural network architectures.
Two main hypotheses have been accepted during the development
of the system: grade estimation can be approached as a hypersurface reconstruction
problem in the spatial co-ordinates input vector space, and grades are the numerical
representation of a localised phenomenon (the deposit) and themselves present
localised behaviour. As will be seen later, there are a number of implications brought
by these hypotheses that have a great effect on the design of GEMNet II.
The author has carried out a large number of preliminary tests on various
neural network architectures and learning algorithms as part of his MSc project [42].
These tests were based entirely on simulated 2D data arranged on a square grid. The
networks were trained on the grid nodes using the grade at a given node as the
required output and the grade at the eight (or the four closest) surrounding nodes as
inputs. There was no information provided about the spatial location of the input
samples or even the location of the required output. All together, the approach was
very similar to image analysis techniques using computer vision with the image being
in this case the grade surface. The results of these case studies and their comparison
with results from kriging showed great promise – in fact, the developed neural
networks performed much better than kriging in most of the cases. However, there
was no guarantee that this would happen with real data and of course the whole
approach was not at all applicable to real data due to the inflexible arrangement of the
inputs (fixed on a regular grid).
The most important issue raised by the above project concerned the formation
of the input space, i.e. which input parameters should be used and how the task
of grade estimation should be decomposed into smaller tasks that would be easier to approach
using neural networks. In the next paragraph, the shift from fixed-on-a-grid inputs to
completely floating-in-space sampling inputs will be described in two-dimensional
sampling space.
5.2 Forming the Input Space from 2D Samples
It is generally accepted that the input space characteristics as well as its components
play a very important role in the performance of neural networks. The input
dimensionality, as was discussed in earlier chapters, controls to a great extent the
overall complexity of the neural network topology as well as the amount of training
data required to bring the network performance to acceptable levels. Therefore it is
very important to select the inputs from the available data in a way that will help
reduce the complexity of the network and at the same time provide the right
information for the network to be trained on.
The input space also defines the way of approaching the required task, in this
case grade estimation. Using the sample co-ordinates, for example, in two dimensions
(easting and northing) as inputs to a network with the output being the grade of the
sample means that grade is treated as a surface in the co-ordinate space. This approach
seems to be the most popular among researchers dealing with this problem.
As explained in the previous chapter, another approach is to use samples close
to the estimation point as the source of grade input data. Usually, the samples are
arranged on a regular grid, which makes things a lot easier. If they are not arranged on
a grid, then the grid is constructed by applying a polygonal or inverse distance
calculation on the original data, which naturally introduces smoothing errors. The
inputs are in this case the grades of the neighbour nodes and the output is the grade at
the point of estimation (also on the grid). Neighbour nodes can be considered to be the
eight nodes surrounding the estimation point or the four that belong to the same grid
lines passing from the estimation point.
The above approach gives very good results on simulated data and regular
sampling schemes where the smoothing errors introduced from gridding original data
are relatively low. Applying this approach though directly to real data normally
obtained with an irregular sampling scheme leads to the network learning a very
smooth distribution (the distribution of the polygonal or inverse distance grid nodes)
of grades that does not represent reality. It should be noted that the polygonal
and the inverse distance methods assume that the modelled surface is continuous.
Clearly, there is a need to develop a way of presenting to the networks
information from neighbour samples that honours their relative location to the point of
estimation. In other words, the aim is to form the input space in a way that includes
both the surrounding grade values and their relative position in space.
A very common way of choosing samples surrounding the point of estimation
used by most of the conventional methods is to use octant or quadrant search (Fig.
5.1). The area surrounding the point of estimation is divided into eight (or four)
sectors and a number of samples is chosen from each one of them. This technique
ensures that samples are selected from all directions in 2D space and not only from
the direction where samples are denser and closer to the estimation point.
Figure 5.1: Illustration of quadrant and octant search method (special case where only one
sample is allowed per sector). Respective grid nodes are also shown.
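A minimal 2D implementation of such a search (Python/NumPy; the names and the one-sample-per-sector rule are assumptions of the sketch) could be:

    import numpy as np

    def octant_search(point, samples, grades, n_sectors=8):
        # Assign each sample to the angular sector it falls in around the
        # estimation point and keep the nearest sample of every sector.
        d = samples - point
        angle = np.mod(np.arctan2(d[:, 1], d[:, 0]), 2 * np.pi)
        sector = (angle // (2 * np.pi / n_sectors)).astype(int)
        dist = np.hypot(d[:, 0], d[:, 1])
        chosen = []
        for s in range(n_sectors):
            idx = np.where(sector == s)[0]
            if idx.size:
                chosen.append(idx[np.argmin(dist[idx])])   # nearest in sector s
        return [(grades[i], dist[i]) for i in chosen]      # (grade, distance) pairs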
Dividing the area around the estimation point into octants (or quadrants)
provides a way to expand the inflexible input scheme based on grid nodes to a scheme
that can accept samples floating in 2D space. The inputs are now the grades of
neighbour samples at any distance from the estimation point, and not those at the
surrounding grid nodes (Fig. 5.1). There is no need to grid the original data using an
interpolation method that normally introduces errors; the network therefore models
the original distribution of grades. The use of octant (or quadrant) search also allows
the use of the same neural network architecture as in the case of gridded samples.
There is however one fundamental difference between the two approaches. In
the case of samples arranged on a regular grid, the distance of the inputs from the
point of estimation remains constant throughout the sampling area. Using an octant
search means that the samples are now at a varying distance from the estimation point
and therefore there is a need to include distance information as part of the input space.
This requirement is also derived from the hypothesis that grades present localised
behaviour. Therefore it is necessary for the neural network to ‘know’ the distance of
any input sample relative to the point of estimation.
[Contour map panels: a. Actual Grades, b. MNN Grades; 400 m x 400 m area, grade scale 26.00 - 39.00]
Figure 5.2: Estimation results from the neural network architecture developed for use with
gridded data. The use of irregular data has an obvious effect on the performance of the system.
The author initially tested the neural network architecture used for gridded
data directly on original data arranged irregularly in 2D space. The results were, as
expected, not as good as when using gridded data. Clearly, learning a distribution
based on inverse distance estimates arranged on a grid is far easier than trying to learn
the original data distribution. Figure 5.2 shows contour maps from this test using data
from an iron ore deposit. The results from this test have shown clearly that it is
necessary to provide distance information to the network in order to improve its
modelling capacity in the case of irregular data. It should be noted that at this stage
the problem of grade estimation is still approached by the use of a single network with
multiple inputs (eight or four) depending on the search method used – octant or
quadrant.
In order to provide distance information to the network, one input is added per
sample, i.e. for each of the eight octants (or four quadrants) there are two inputs: the
neighbour sample grade and its distance from the estimation point. This leads to a
total of 16 inputs (or eight for quadrant search). The increase in the number of inputs
inevitably leads to an increase of the number of hidden units required to handle the
complexity of the input space. Figure 5.3 shows two neural networks with 16 and 8
inputs used to accept data from an octant and quadrant search respectively.
Figure 5.3: Neural network architectures receiving inputs from a quadrant search (left) and
from an octant search (right). The number of hidden units in the right network is lower than in
the left because each hidden unit carries a higher number of weights.
The idea behind the use of two networks with different input dimensionality
was based on the fact that not all estimation points have an adequate number of
neighbour samples to complete the training patterns when using octant search. In
other words, when there are fewer than eight neighbour samples around the estimation
point, quadrant search and the smaller network are used for the estimation. Naturally,
the quadrant-search-based network can be trained at all locations where the octant-
search-based network is trained.
In order to get even closer to a real situation, the developed architecture should
be able to handle estimation points at the edges or even outside the sampling area. In
these areas there is not enough information to generate complete patterns for either of
the two networks, i.e. both octant and quadrant search fail to find any neighbour
samples. For this reason, a third neural network is introduced to provide estimates at
these points. This network can only depend on data at the point of estimation and
therefore the commonly used input scheme of sample easting and northing is used.
At this stage, the developed neural network architecture for grade estimation
has become modular, in the sense that there are multiple networks providing estimates,
each at different estimation points from the others. These three networks are in
essence trying to reconstruct the grade hypersurface in their own input vector space.
In other words, even though each is used only at specific estimation points, they are
still trained on the entire sampling area, or at least the part of it that provides enough
information for their training patterns.
As shown in Figure 5.4, and compared with the results shown in the previous
figure, this architecture provides considerably better estimation performance. The next
question is naturally whether this performance can be further improved. As
mentioned earlier in this chapter, grades tend to present localised behaviour, i.e.
samples close to each other tend to have similar grade values. This similarity normally
decreases with the distance between the samples. The effect of this is that it is
very difficult to approach grade estimation as a global approximation problem. For
this reason a number of researchers have been led to the use of modular neural
networks that construct local approximations of grade. The architecture described so
far in this chapter, even though modular, still tries to approximate the entire
distribution of grades through each network. It should be noted at this point that in
the case of radial basis function networks, the modelled surface is reconstructed
by a series of locally trained basis functions, which gives an answer to this problem.
[Contour map panels: a. Actual Grades, b. MNN Grades; 400 m x 400 m area, grade scale 26.00 - 39.00]
Figure 5.4: Improvement in estimation by the introduction of the neighbour sample distance
in the input vector.
The author carried the solution even further by breaking the problem of grade
surface reconstruction from neighbour points into smaller tasks that are easier to
approach with a single neural network. More specifically, structural analysis in
geostatistics has been the paradigm for this problem decomposition. In structural
analysis, one tries to find a model of grade variability in certain directions in space.
The derived models are then used to modify the interpolation method and the sample
selection routine. Unfortunately, this is where one of the main disadvantages of
geostatistics appears, as structural analysis, and more specifically variography,
requires skills and time and also depends on knowledge of the modelled
parameter. The author aimed at overcoming these problems, while still taking
advantage of the benefits of structural analysis, by employing neural networks to learn
the spatial variability from exploration data.
In order to learn the spatial variability of grade, the two networks with the
inputs receiving information from neighbour samples were replaced by a number of
networks trained on neighbour samples coming from a single direction in space. In
other words, there are eight networks with two inputs (neighbour grade and distance
from estimation point) where there was one with 16 inputs, or four networks with two
inputs where there was one with eight. There is now one network per sector (octant or
quadrant) learning the variability of grade in that direction. As expected, it is far
easier for a single network to learn the variability in one direction than in all
directions. It is also easier to control the learning process and to monitor the results of
training.
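As a minimal sketch of how the per-sector estimates could be combined, assuming each trained sector network is available as a callable, the simple averaging described later in this chapter might look as follows:

def module_estimate(sector_nets, sector_inputs):
    """Average the grade estimates of the per-sector networks to give
    the module's single output (the MNNS only applies a module where
    every one of its sectors found a neighbour sample).

    sector_nets:   {sector: trained network, callable on (grade, distance)}
    sector_inputs: {sector: (neighbour grade, distance)} from the search
    """
    estimates = [net(sector_inputs[s]) for s, net in sector_nets.items()]
    return sum(estimates) / len(estimates)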
The results obtained with this architecture are very promising [41]. This is the
final architecture developed by the author to handle exploration data from a two-
dimensional sampling scheme (Fig. 5.5). It became part of the Modular Neural
Network System (MNNS) described in later paragraphs of this chapter. The MNNS
could be considered the prototype version of GEMNet II. Several case studies were
run using the MNNS with data from simulated and real deposits. These are
discussed in detail in Chapter 6.
[Diagram: Modular Neural Network System for Ore Grade Estimation - the I/O data set used for training, validation and testing feeds an octant search, a quadrant search, and the sample co-ordinates directly; the Octant RBFN Module (16 inputs, 8 RBFNs, 1 output), the Quadrant RBFN Module (8 inputs, 4 RBFNs, 1 output), and the X-Y-Grade MLP Module (2 inputs, 1 MLP, 1 output) each contribute to the output]
Figure 5.5: Modular neural network architecture developed for grade estimation from 2D
samples [41].
The design of the input space is followed by the development of the neural
network topology and the learning algorithm. These are explained in the
following paragraphs.
5.3 Development of the Neural Network Topologies
5.3.1 Overview
From the discussion in the previous paragraph it becomes clear that the topology of
the neural networks used in the developing stages of MNNS has gone through many
changes. Apart from the input layers already discussed, the hidden layer has also
changed: the number of hidden units, the type of hidden units, and their activation
and output functions. Different error measures were also tested. Overall, only one
aspect of the neural networks did not change and that is the number of output units.
The output layer of all neural networks developed had one unit providing the grade
estimate.
There are two types of neural networks that were predominantly tested during
the development of the MNNS: the Multi-Layered Perceptron (MLP) and
the Radial Basis Function (RBF) network. The choice between them was not easy, as
the MLP is very popular in function approximation problems and there is a very good
background of theory and practical examples. In theory, both architectures can
produce very good results given time and training information. However, the RBFN
has a great advantage over the MLP in terms of speed of development, which was
more than verified during testing. Also, the MLP produces global approximations, and
in order to obtain the same effect as the local approximations of the RBFN it is
necessary to complicate the overall architecture by introducing a number of MLPs
trained on localised data. This approach was implemented in the original GEMNet
and seemed to produce good results in small to medium 2D deposits.
The RBFN was chosen as the building unit of the MNNS after a number of
tests on both architectures. However the MLP was still used occasionally as an
averaging network for the estimates produced by the various RBFNs, as will be
discussed later.
5.3.2 The Hidden Layer
Designing the hidden layer of an RBFN is a very complex task and also a very
important one for the overall performance of the network. The number of hidden units
depends on the training data and the modelled parameter, and can therefore vary from
one dataset to the next. In the case of the MNNS architecture described here, the
problem becomes even more complex as the original drillhole samples dataset is
processed and presented in three different ways. There is training data for the
octant search networks, training data for the quadrant search networks and, finally,
training data for the network trained on the samples' spatial co-ordinates. In addition
to the original dataset, there are also the patterns consisting of the outputs of all these
networks, which become the inputs to the final averaging network.
In the case of drillhole data, the optimum number of hidden units can only be
found during training, by applying one of the automated node generation or
destruction algorithms. There are a number of algorithms for adjusting the number of
hidden units through training. In the case of the MNNS, a simple training algorithm
was employed for adding hidden units (or RBF centres). The basic steps of the
algorithm are as follows:
1. Start with a minimum number of RBF centres;
2. Train the network and calculate the validation error;
3. Add one centre;
4. Repeat step two and compare with previous validation error;
5. If the change in error is too small then stop training;
6. If the change in error is significant and the maximum number of centres has not been
reached, go to step 3;
7. When the number of centres reaches the maximum, exit the algorithm and save the
architecture with the smallest validation error.
Altogether this algorithm finds the number of hidden units that would produce the
minimum validation error and uses the respective topology during estimation. This
algorithm should not be confused with the learning algorithm used for training the
various topologies.
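A sketch of the centre-addition loop in Python is given below, assuming a helper train_and_validate(k) that trains a k-centre RBFN and returns its validation error and the trained network; the helper and the relative tolerance (set here to the 0.001% figure quoted in paragraph 5.4.2) are illustrative, not the exact implementation.

def grow_rbf_centres(train_and_validate, k_min, k_max, tol=1e-5):
    """Add RBF centres one at a time, keeping the topology with the
    lowest validation error (a sketch of steps 1-7 above).

    train_and_validate(k): assumed helper that trains a k-centre RBFN
    and returns (validation_error, trained_network).
    """
    best_err, best_net = train_and_validate(k_min)       # steps 1-2
    prev_err = best_err
    for k in range(k_min + 1, k_max + 1):                # steps 3-4
        err, net = train_and_validate(k)
        if err < best_err:
            best_err, best_net = err, net
        if abs(prev_err - err) < tol * prev_err:         # step 5: change too small
            break
        prev_err = err
    return best_net, best_err                            # step 7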
Another very important issue concerning the hidden layer in any RBFN is the
positioning of the basis function centres (weights between input and hidden layer).
This normally takes place at the initialisation stage of the learning process, where
some of the network’s free parameters are set to give learning the best start possible.
The initial positioning of the centres is crucial. Thinking of the error as a
hypersurface in the weights vector space – in this case the centres vector space – it is
fairly easy to understand the importance of starting from a good point on this
hypersurface, as it helps find the weights that produce the minimum error. A number of
centre positioning algorithms are available, including Kohonen learning, random
positioning, k-means clustering, and positioning on samples. After rigorous testing, it
was found that, for the available testing data, random positioning of the centres
in the input vector space led to better performance than any other positioning
algorithm. Again, it should be noted that this depended heavily on the data
used for the studies; indeed, as will be seen later when using data from a three-
dimensional sampling scheme, random positioning was found to be inadequate for
more complex data. The author believes that random positioning is ideal for two-
dimensional datasets with a relatively low number of training patterns. Random
positioning is not expected to perform well when the number of centres to be fitted is
considerably lower than the total number of training patterns available.
A more difficult choice was that of the basis function. There was almost no
agreement between the studies with two-dimensional data as to which basis function
helps produce better results. However, the multiquadric and the thin-plate spline
seemed to produce better results consistently, which convinced the author to use them
in further studies. It should be noted that the choice of basis function is
not as crucial for the problem at hand as is the smoothing parameter of the function,
which carries information about the problem.
The smoothing parameter can only be set through experimentation with the
training data and is unique to every study. This is one of the points where user
intervention is required for optimum results. There are no rules of thumb for this
problem, and it is therefore necessary to perform a number of test runs in order to set
the smoothing parameter to its ideal value. Generally this is a very quick process and
only needs to take place once per study – if more training patterns become available,
there is normally no need to change the smoothing parameter. It is also possible to use
only a representative part of the dataset for this process if the number of training
samples is too large and training is time-consuming.
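For illustration, the two basis functions mentioned above could take the following form; the exact parameterisation used in the MNNS software is not given here, so the placement of the smoothing parameter sigma follows one common convention and should be treated as an assumption.

import numpy as np

def multiquadric(r, sigma=1.0):
    """Multiquadric basis; sigma is the smoothing parameter."""
    return np.sqrt(r ** 2 + sigma ** 2)

def thin_plate_spline(r):
    """Thin-plate spline basis r^2 log(r), taken as 0 at r = 0."""
    r = np.asarray(r, dtype=float)
    safe = np.maximum(r, 1e-12)              # avoid log(0)
    return np.where(r > 0, r ** 2 * np.log(safe), 0.0)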
The final aspect of the hidden units, and by far the most important one, is the
bias. The bias of every unit is normally set to 1 and then adjusted through training. It
is very important as it completely changes the behaviour of the unit when presented
with data inside its receptive field. Generally, as the bias moves away from the value
of 1 (gets smaller), more hidden units become activated than just the unit whose centre
location corresponds to the current training pattern.
5.3.3 Final Weights and Output
In order to complete the RBFN architecture, the weights between the hidden layer and
the network’s single output need to be set. This is achieved by a gradient descent
method similar to that used in the MLPs. The RBFNs used in the MNNS are fully
interconnected, i.e. units from one layer branch out to every unit of the next layer. As
there is only one output unit, the number of weights between hidden and output layer
equals the number of RBF centres in the network. The single output unit simply
performs the summation of the hidden units’ weighted outputs and passes the result
through an activation function (such as the logistic sigmoid) that also takes the bias of
the hidden units into consideration.
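A sketch of this forward pass is given below; the multiplicative role given to the bias and the default multiquadric basis are one reading of the description above, not the exact implementation.

import numpy as np

def rbfn_forward(x, centres, biases, weights,
                 basis=lambda r: np.sqrt(r ** 2 + 1.0)):
    """Forward pass of a single-output RBFN as described above.

    x:       normalised input vector (e.g. [grade, distance])
    centres: (k, d) array of basis function centres
    biases:  (k,) per-unit biases; values below 1 are read here as
             widening the receptive field so more units respond
    weights: (k,) hidden-to-output weights
    basis:   radial basis function (multiquadric by default)
    """
    r = np.linalg.norm(centres - x, axis=1)   # Euclidean distances
    h = basis(biases * r)                     # hidden unit responses
    s = np.dot(weights, h)                    # weighted summation
    return 1.0 / (1.0 + np.exp(-s))           # logistic output activation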
5.4 Learning from 2D Samples
5.4.1 Overview
Learning in RBFNs has been discussed in detail in Chapter 3. In this paragraph, the
details of the learning algorithm used in MNNS will be discussed. Attention will be
given to the effects of the problem characteristics on the learning parameters, i.e. how
the learning algorithm is adjusted to perform better with exploration data.
In the MNNS architecture there are three neural network modules each trained
on different patterns derived from the same data (Fig. 5.6). It is therefore necessary to
describe the learning process for each one of the modules individually, as there are
significant differences. The discussion begins with the RBFNs trained using the
patterns formed by an octant search.
Figure 5.6: Partitioning of the original dataset into three parts, each one targeted at a different
module of the MNNS.
5.4.2 Module 1 – Learning from Octants
Module 1 has eight RBFNs each with two inputs (neighbour sample grade and
distance from estimation point), one output (grade at estimation point) and a varying
number of hidden units (RBF centres). Figure 5.7 shows one of these networks.
Module 1 can be seen as a modular network with 16 inputs and eight outputs. These
outputs are averaged to provide a single grade estimate for the module.
Input vectors are normalised, i.e. reduced to vectors of equal length. This is
necessary to ensure that changes of equal scale in different inputs have the same effect
on the network’s performance. The outputs are denormalised to give an estimate in
the original range of values.
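The exact scaling formula is not given here; the sketch below assumes simple z-score scaling, which is broadly consistent with the roughly ±2.5 range of the normalised input space seen in Fig. 5.8, and should be treated as an assumption.

import numpy as np

def normalise(patterns):
    """Z-score scaling of the pattern matrix (rows = patterns,
    columns = inputs/outputs); returns the scaled data together with
    the statistics needed to denormalise network outputs later."""
    mean = patterns.mean(axis=0)
    std = patterns.std(axis=0)
    std[std == 0] = 1.0            # guard against constant columns
    return (patterns - mean) / std, mean, std

def denormalise(values, mean, std):
    """Map normalised network outputs back to the original grade range."""
    return values * std + mean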
Figure 5.7: RBFN used as part of module 1 in MNNS. Training patterns from an octant
search were used to train the network.
The learning process begins with the initialisation of the RBF centres. This
process involves positioning the centres and setting the bias of the basis functions. As
already explained, the centres were chosen randomly in the input space and the bias
was usually set to an initial value of 1. The initial centre positions and bias values can
be further optimised during the learning process. However, as found during
testing, it is very difficult to train the networks by adjusting all the free parameters
simultaneously. Therefore, training in the MNNS concentrated on one parameter at a
time.
The number of centres, as already discussed, was set by another process,
nesting the RBF learning algorithm. The first parameters to be set by the learning
algorithm are the weights between the hidden and output layer. These are found by
solving a least-squares problem using the known outputs and the outputs of the
network. The rest of the network's free parameters (centre location and bias) were set
one at a time by a gradient descent method. Figure 5.8 shows the location of the basis
function centres in the input space. The distance between the current input vector and
the vector of the basis function centres was measured using the Euclidean distance
measure.
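The least-squares step could be sketched as follows, assuming the hidden unit responses to the training patterns have been collected into a matrix and, for simplicity, ignoring the output activation described in paragraph 5.3.3.

import numpy as np

def solve_output_weights(H, targets):
    """Least-squares fit of the hidden-to-output weights.

    H:       (n_patterns, k) responses of the k hidden units to the
             training patterns (centres and biases held fixed)
    targets: (n_patterns,) known grades at the training locations
    """
    weights, *_ = np.linalg.lstsq(H, targets, rcond=None)
    return weights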
Figure 5.8: Posting of the basis function centres from the RBFN of Fig. 5.7 in the normalised
input space (X-Grade, Y-Distance).
The training pattern set was split into three parts: one third was used for
training, the second for validation, and the third for testing. The patterns were
assigned to each part at random. The learning process stopped when the
maximum number of centres was reached or when the change in the validation error
was less than 0.001%. The architecture with the lowest validation error was saved to
be used further during testing and application. In contrast with the networks shown in
Fig. 5.3, the networks trained using the octant search had more hidden units than their
counterparts in the quadrant search. One explanation for this is that the input
dimensionality is the same in both cases, but the problem in the case of octant search is
more difficult because octant search yields fewer training patterns than quadrant
search. The RBFNs in octant search are required to minimise the validation error for
the same mapping but with less data, and therefore more basis functions are needed to
achieve the mapping. The number of centres varied between 5 and 21 throughout the
case studies.
Figure 5.9: Graph showing the learned relationship between the network’s inputs (grade and
distance of neighbour sample) and the network’s output (target grade) for the RBFN of Fig.
5.7.
After training and validation of the networks, testing took place to measure the
generalisation performance and to provide the basis for comparison with other grade
estimation techniques. Figure 5.9 shows an example of a network’s learned mapping
between neighbour sample grade and distance, and grade at point of estimation.
5.4.3 Module 2 – Learning from Quadrants
Module 2 has four RBFNs each with two inputs (neighbour grade and distance) and
one output (grade at estimation point). As in the case of Module 1, this module can be
considered as a modular neural network with 8 inputs and 4 outputs. The outputs from
the four networks are averaged to provide a single output. Figure 5.10 shows one of
these networks. The number of basis functions was less than in Module 1 networks
because quadrant search produces more training patterns than octant search from the
same dataset, and therefore it is easier for the RBFNs of Module 2 to produce the same
mapping with fewer hidden units.
Figure 5.10: Example of an RBFN from Module 2.
The number of basis functions varied between 2 and 17 throughout the case studies.
The learning process was identical to the one used in Module 1. Figure 5.11 shows
how the centres of the RBFs were located in the normalised input space for the
network in Fig. 5.10 and for a specific case study. Figure 5.12 shows the learned
mapping for the same network, i.e. the learned relationship between the inputs (grade
and distance of the neighbour samples) and the output (grade at estimation point). It
can be seen that generally the network’s output increases with increasing neighbour
grade and decreases with increasing distance of neighbour sample.
Figure 5.11: Posting of the basis function centres from the RBFN of Fig. 5.10 in the
normalised input space (X-Grade, Y-Distance).
Figure 5.12: Graph showing the learned relationship between the network’s inputs (grade and
distance of neighbour sample) and the network’s output (target grade) for the RBFN of Fig.
5.10.
5.4.4 Module 3 – Learning from Sample 2D Co-ordinates
The single network of this module is a Multi-Layer Perceptron with two inputs
(easting and northing of samples) and one output (sample grade). The number of
hidden units, as shown in Fig. 5.13, was 14, but this changed from one study to
another to achieve better results. The activation function of the hidden units was the
bipolar sigmoid (tanh).
Figure 5.13: Module 3 MLP network trained on sample co-ordinates.
Learning was based on the steepest descent algorithm. The steepest descent
method measures the gradient of the error surface after each complete cycle and
changes the weights in the direction of the steepest gradient. When a minimum is
reached, a new gradient is measured and the weights are changed in the new direction.
The method is improved by the use of the momentum coefficient and the learning
coefficient. The learning coefficient weights the change in the connections. The
momentum coefficient is a term which tends to alter the change in the connections in
the direction of the average gradient. This can prevent the learning algorithm from
stopping in a local minimum rather than the global minimum. In the MNNS the
learning process is split into four periods, each with a different number of training
cycles and different learning and momentum coefficients. Table 5.1 shows how these
coefficients are chosen during training.
Table 5.1: Learning strategy for Module 3 MLP network.

Period          1      2      3      4
Learning Cf.    0.9    0.7    0.5    0.4
Momentum Cf.    0.1    0.4    0.5    0.6
Cycles          1000   100    100    10000
From the table it is clear that the change in the weights is more rapid at the beginning
of training and it is reduced from one period to the next. In most cases the learning
process is stopped well before the end of the last period. For example, in the case of
the iron ore data discussed before, learning was stopped at period 4, cycle 156.
Generally there are no rules for choosing these coefficients and one has to experiment
in order to find the best strategy for training.
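The update rule and the four-period schedule of Table 5.1 could be sketched as follows; the error_gradient function is assumed to be provided by the MLP implementation, and the validation-based early stopping described below is omitted for brevity.

# Training periods from Table 5.1: (learning cf., momentum cf., cycles)
SCHEDULE = [(0.9, 0.1, 1000), (0.7, 0.4, 100),
            (0.5, 0.5, 100), (0.4, 0.6, 10000)]

def train_mlp(weights, error_gradient, schedule=SCHEDULE):
    """Steepest descent with momentum over the scheduled periods.
    error_gradient(w) is assumed to return the gradient of the error
    surface at weights w (a numpy array)."""
    delta = 0.0
    for learning_cf, momentum_cf, cycles in schedule:
        for _ in range(cycles):
            grad = error_gradient(weights)
            # the momentum term biases the step towards the average gradient
            delta = -learning_cf * grad + momentum_cf * delta
            weights = weights + delta
    return weights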
The training patterns were split into three parts for training (55%), validation
(30%), and testing (15%). The validation set was used for guiding the learning
process, i.e. the process stopped when there was no significant change in the
validation error. At that point the topology was saved to be used for testing and
application. Figure 5.14 shows what the network has learned in the case of the iron
ore data. It should be noted again that this network is used only for estimating
grades at locations where the previous two modules cannot, due to lack of data.
Figure 5.14: Learned mapping between sample co-ordinates (easting and northing) and
sample grade for MLP network of Module 3.
5.5 Transition from 2D to 3D Data
5.5.1 General
Having described the modular neural network architecture for use with two-
dimensional data, it is now necessary to examine how this architecture can be
modified or expanded to accept data from real 3D sampling schemes such as drillhole
data. There are certain issues that need to be considered during this expansion. The
most obvious is the added dimensionality of the samples. There are now three co-
ordinates defining the location of samples in space: easting, northing, and elevation.
Perhaps more interesting is the fact that samples now have a volume associated with
them. As the assaying procedure is carried out on different drilling core lengths, the
samples come in all sorts of lengths and therefore different volumes. This extra
information needs to be considered in the input space of the estimating architecture.
The fact that each drillhole can give more than one sample complicates things
even further. The neighbour sample search methods have to take this into
consideration to avoid choosing too many samples from the same drillhole. The
search methods described before are also purely 2D and cannot be considered an
option with 3D data, especially where the orebody does not follow a
specific 2D plane in space. A fully 3D search method therefore needs to be
developed.
These issues as well as other minor ones will be discussed over the next
paragraphs of this section.
5.5.2 Input Space: Adding the Third Co-ordinate
In three-dimensional sampling schemes commonly used in exploration programmes,
samples are located in space by three co-ordinates: easting, northing, and elevation.
As was explained in previous paragraphs, one of the modules of MNNS is an MLP
network trained on the 2D co-ordinates of samples. The same network now needs to
increase its input dimensionality to accommodate the elevation co-ordinate of each
sample. The inputs of the network change from two to three. This obviously affects
the number of weights necessary, i.e. the number of hidden units has to increase.
The networks in the other two modules have the distance of the neighbour
samples as an input. This distance was calculated in 2D space. Now the distance is
calculated in 3D space. The centres of the basis functions were initially positioned
randomly in the input space. This is inadequate in the case of three-dimensional
samples, as was found during testing. The more complex distribution of neighbour
sample distances is responsible for this fact. Therefore a different way of centre
positioning needs to be employed.
5.5.3 Input Space: Adding the Sample Volume
The sample volume defines what people in geostatistics would call the support of a
particular sample. In drillhole data, samples have a certain length along the drillhole
itself. In order to cope with the variations in the support of samples, it is
necessary to pass the samples through compositing and use composites of equal
length in the estimation procedure. This is the case for most of the conventional
methods of estimation, including geostatistics.
In the case of the MNNS approach, there is no need to composite the samples
into equal length composites. The architecture is modified to accept the length of the
samples as an extra input to all neural networks involved. Specifically the network
trained on the sample co-ordinates now also accepts the length of the samples - the
inputs increase to four (easting, northing, elevation, and length). The networks trained
on neighbour samples now receive the neighbour sample length as well as its grade
and distance from the estimation point.
A complication of the transition to 3D data relative to the sample volume is
the fact that the estimation is now taking place in 3D as well. Block modelling is the
norm for 3D grade estimation. As was described before, block modelling is based on
blocks with an associated volume. This volume needs to be considered during
estimation for the same reasons that sample length is considered during training and
estimation. The extra input added to the neural networks enables the introduction of
the block volumes during estimation.
5.5.4 Search Method: Expanding to Three Dimensions
The search methods used in the case of 2D data cannot be used with 3D data because
they take no account of the third dimension (elevation), which is necessary to
fully define the location of samples in space. The quadrant and octant methods, as
shown in Fig. 5.1, select samples from a plane rather than a 3D sample space. Even if
this plane is rotated about any of the three axes (easting, northing, elevation), these
methods would only be adequate for flat orebodies with little grade variation in
one of the three dimensions. It is therefore necessary to expand these search methods
to three dimensions.
The author first tried to achieve this by applying the quadrant and octant
search in all three planes defined by the three axes: the XY, XZ, and YZ plane.
Figures 5.15 and 5.16 illustrate how the quadrant and octant search would divide 3D
space into sectors.
Figure 5.15: 3D version of quadrant search.
Figure 5.16: 3D version of octant search.
From the figures it becomes clear that the resultant search methods are very
complex and very difficult to comprehend in three dimensions. The total number of
sectors produced is 64 for quadrant and 512 for octant search. This means that the
MNNS would need 64 networks trained on quadrant search data and 512 networks
trained on octant data. Even if this were possible in computational terms, there would
not be enough samples to fill each sector and provide training patterns for every
network.
It is therefore necessary to simplify these search methods in order to cope
with the geometrical characteristics of exploration sampling schemes. After
considering a number of schemes, the author decided to use the simple search method
shown in Fig. 5.17. There are only six sectors in this scheme: upper, lower, north,
south, east, and west. These sectors are defined by the intersection of four planes: two
planes perpendicular to the XZ plane at ±45° dip, and two planes perpendicular to the YZ plane at
±45° dip. In other words, these sectors look like pyramids with a square base and their
apex at the estimation point.
Figure 5.17: Simplified 3D search method used in the MNNS for sample selection.
The advantage of this search scheme is not just the fact that it is very simple and
computationally cheap. With this scheme, the drillhole to which the current
training point belongs always lies within two opposite sectors. This allows easier
control of the number of samples selected from that drillhole, which can help improve
the results of estimation. Another advantage of this scheme is that it can handle any
inclination of the orebody or the drilling scheme.
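A sketch of the sector classification implied by the ±45° planes is given below; the handling of vectors lying exactly on a bounding plane is an illustrative assumption.

def sector_3d(dx, dy, dz):
    """Classify the vector from the estimation point to a sample into
    one of the six pyramidal sectors bounded by the +/-45 degree
    planes (x = easting, y = northing, z = elevation)."""
    ax, ay, az = abs(dx), abs(dy), abs(dz)
    if az >= ax and az >= ay:                  # steeper than 45 degrees
        return 'upper' if dz >= 0 else 'lower'
    if ax >= ay:                               # flatter than 45 degrees
        return 'east' if dx >= 0 else 'west'
    return 'north' if dy >= 0 else 'south'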
The author decided to replace both 2D-search methods (quadrant and octant)
by this simplified 3D method, which means that the MNNS has now just two
modules: one trained on the sample co-ordinates and length and one trained using data
from the single search method. This also means that the second module now has only
six networks, one for every sector of this search scheme.
5.6 Complete Prototype of the MNNS
The complete modular neural network system for grade estimation using 3D data is
shown in Fig. 5.18. The system comprises three neural network modules responsible
for the estimation and a data processing and control module that generates the training
patterns for the networks by applying the search method described.
Figure 5.18: Diagram showing the structure of the MNNS for 3D data (units are the neural
network modules).
The second module or unit as shown in the figure is a single RBFN trained on the
outputs of the six RBFNs of the first module. This network replaced the simple
averaging of the RBFNs' outputs that was done previously. It was found necessary as
it became clear during testing that some of the RBFNs of the first module were
consistently producing estimates closer to the actual values while others were
consistently far from them. The learning process for this RBFN is identical to that
of the RBFNs in module one. The number of hidden units varied between six and
nine. Figure 5.19 shows an example of how this network's output varied depending on
the outputs of the RBFNs in module one.
Figure 5.19: Learned weighting of outputs from module one RBFNs by the RBFN of module
two.
The third module is the neural network modified for 3D data, with four inputs
(easting, northing, elevation, and length) and one output (target grade). Unlike the
case of 2D data, where the MLP architecture seemed to perform better, early tests run
using 3D data made it clear that the RBFN reduces the validation error even further
than the MLP; the third module is therefore based on a single RBFN and not on the
MLP as described before.
The data processing and control module accepts data in ASCII form and
creates training pattern files for the neural networks of the MNNS. The formation of
training patterns is based on the search method described. Basically, for every training
sample in the dataset, one neighbour sample is chosen from every sector – the one
closest to the training sample. The grade of the neighbour sample, its distance from
the training sample and its length are written as inputs in the training pattern file of
the network responsible for the specific sector, while the training sample grade is
written as the required output. Clearly, on some occasions there are no neighbour
samples in some of the sectors. In those cases, the training sample is marked for
estimation with module three, which is trained on the training sample co-ordinates.
The network of module 3 is however trained on all samples regardless of the results of
the search process. Figure 5.20 shows an example of this network’s output depending
on its inputs.
Figure 5.20: Learned relationships between sample co-ordinates, length (inputs) and sample
grade (output) from the RBFN of module three.
After training is stopped, the saved topologies are used for estimation. Initially this
was done on the basis of drillhole samples hidden from the training process for testing
reasons. Later, most of the drillhole samples were allocated to the training and
validation process. Cross-validation was used for testing the validity of the learned
mappings and for comparing with other grade estimation techniques. Studies carried
out with this architecture [40] supported most of the choices made during the
development process described in this chapter. Even at this prototype stage, the
system could perform reasonably well on a wide variety of data.
5.7 Conclusions
In this chapter the development of the modular neural network system (MNNS) for
grade estimation was described. This system, with some modifications, will become
the core of GEMNet II. As explained, the MNNS approaches grade estimation in two
different ways:
1. Using a sample’s co-ordinates and length to construct the picture of grade in 3D
space;
2. Using neighbour samples’ grade, distance and length to construct the picture of
grade in specific directions in space.
This approach ensures that there is a grade estimate even in places where
sampling density is very low. It also takes advantage of the information
hidden in the relationship between neighbour samples and takes the support of the
samples into consideration. Because of that, it can provide estimates that have a
volume associated with them, as opposed to point estimates.
The MNNS requires a minimum of human interaction – this interaction is
limited to a single parameter of the RBFNs and it does not require any particular
knowledge or skills from the user. The results depend solely on the data at hand – the
estimation process adjusts to the available data.
However, the described system, being in prototype form, is not very user-
friendly, and integration of its results in the process of reserve estimation is difficult.
Therefore it is necessary to integrate the MNNS into a complete resource-modelling
environment in order to get the most out of the system and realise its full potential.
This integration will also allow better comparison with existing methodologies.
In the next chapter this integration is described as well as a number of minor
modifications to the MNNS architecture. The targeted resource-modelling
environment was one of the leading mining software packages called VULCAN from
Maptek/KRJA Systems Ltd. The integration of the MNNS inside VULCAN led to the
development of GEMNet II.
6. Case Studies of the Prototype Modular Neural Network System
6.1 Overview
The case studies presented in this chapter were based on the prototype MNNS
architecture. In fact there were two versions of the prototype system, as described in
Chapter 5, one for 2D data and one for 3D. There are two case studies for each one of
them. More specifically, these case studies are:
• 2D iron ore deposit
• 2D copper deposit
• 3D gold deposit
• 3D chromite deposit
The 2D deposits have been extensively used in geostatistical as well as neural
network case studies and are ideal for comparison of different approaches. The 3D
deposits have never been used in a published study.
These studies are part of a larger set of tests run using the prototype MNNS
architecture. The purpose of those tests was to validate the approach and fine-tune the
architecture. As the 2D datasets were created specifically to demonstrate the validity
of the geostatistical approach, they were ideal for testing the MNNS and comparing its
results with those obtained using inverse distance and kriging. The datasets from the
four case studies presented here are given in Appendix B.
It should be noted that finding datasets from real deposits is fairly difficult, as
mining companies are quite reluctant to give information away. Both in the MNNS
studies of this chapter and the GEMNET II studies of the next, the most common type
of deposit is metallic.
The performance of the MNNS will be compared with inverse distance and
kriging as these are the most commonly used methods for ore grade estimation in
metal deposits. As the only known ore grade values are those provided in the samples,
a part of the dataset is kept out of the information provided to the various methods for
estimation. In other words, some of the samples become the testing points where the
performance of each method is tested. This clearly compromises the overall
performance of each method but unfortunately there is no other objective way of
testing.
The estimation performance will be expressed in terms of the mean absolute
error on the test set and also with graphs of actual vs. estimated (scatter), histograms
of grade distribution, and contour maps of ore grade.
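For reference, the two error measures could be computed as sketched below; the percentage form is assumed to be the mean of the absolute errors taken relative to the actual grades.

def mean_absolute_error(actual, estimated):
    """Mean absolute error over the test samples."""
    return sum(abs(a - e) for a, e in zip(actual, estimated)) / len(actual)

def mean_absolute_percent(actual, estimated):
    """Mean absolute error expressed as a percentage of the actual grades."""
    return 100.0 * sum(abs(a - e) / a
                       for a, e in zip(actual, estimated)) / len(actual)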
The datasets were of varying complexity and size and therefore presented a
varying difficulty to the estimation techniques used. Table 6.1 summarises the
characteristics of these datasets.
Table 6.1: Characteristics of datasets from the MNNS case studies.

                     2D Iron Ore   2D Copper     3D Gold          3D Chromite
Total Samples        91            51            112              94
Area/Volume          160,000 m2    360,000 m2    42,686,028 m3    70,010,800 m3
Standard Deviation   4.4798        0.3731        0.5521           7.0019
Average Grade        34.59% Fe     0.4658% Cu    0.9316 gr/t Au   15.7223%
Results from inverse distance and kriging were obtained using Surfer from Golden
Software in the 2D case studies, and VULCAN in the 3D case studies.
6.2 Case Study 1 – 2D Iron Ore Deposit
The dataset used in the first case study of the MNNS architecture is a simulated iron
ore deposit [41]. It is a low-grade sedimentary deposit with an average grade of
34.59% Fe. The 91 samples contained are in essence two groups of data: 50 of them
are samples taken at random over the 160,000 m2 (400 m x 400 m) sampling area and
the other 41 are taken on a regular 100 m grid (Fig. 6.1).
Figure 6.1: Posting of input/training samples (blue) and test samples (red) from the iron ore
deposit.
The 50 random samples were used for training and validation of the MNNS
networks. They were also used as input data for inverse distance and kriging. The 41
grid samples were used for testing all three approaches. The absolute errors produced
by the three methods were as follows:
Table 6.2: Mean absolute errors from case study 1.

Method                     Mean Absolute Error   Mean Absolute Error (%)
Inverse Distance Squared   2.77                  8.26
Kriging                    2.64                  7.90
MNNS                       2.60                  7.77
Figure 6.2 shows a scatter diagram of the actual vs. the estimated grades from the
various methods.
Figure 6.2: Scatter diagram of actual vs. estimated iron ore grades.
The MNNS slightly outperforms kriging and inverse distance on this dataset. It
should be noted once more, though, that this dataset was generated to suit a
geostatistical study, and kriging is therefore expected to give good results. It is quite
obvious that all methods tend to underestimate in high-grade areas. The reason for
this, at least in the case of the MNNS, is that these areas are close to the borders
of the deposit, where the MLP provides the estimates. The MLP module seems to
give estimates close to the average grade. The performance of the three methods
becomes even clearer by examining the grade distributions below (Fig. 6.3).
Figure 6.3: Iron ore grade distributions – actual and estimated.
From the above figure it seems that the MNNS generates a smooth distribution similar
to that of the inverse distance method. Kriging follows the shape of the actual
distribution better. Generally, all three methods perform well. The following contour
maps show exactly how close the methods were to the actual values and to each other.
Figure 6.4: Contour maps of iron ore actual and estimated grades (a. actual grades, b. MNNS grades, c. kriging grades, d. inverse distance grades; grades in %Fe).
Kriging and the MNNS seem to perform better in different regions, except for a part in
the southwest of the deposit where they both perform badly. A lack of training
samples is the main reason for the areas of high error. The MNNS seems to map the
low-grade area in the northwest region better, while kriging did better in the
southeast.
6.3 Case Study 2 – 2D Copper Deposit
The 2D copper deposit in this study is in essence a level from a theoretical open pit
copper mine [36]. It consists of 51 drillhole composites as shown in Fig. 6.5. These
composites cover an area of 360,000 m2 and are concentrated mainly in the central part
of that area. This data has been used by Hughes et al. [36], Wu and Zhou [112] and
Burnett [12] for testing different estimation methods.
Figure 6.5: Posting of input/training samples (blue) and test samples (red) from the copper
deposit.
The dataset was split in two parts: 30 composites were used for training the networks
and 21 for testing the performance of the MNNS as well as that of the other methods.
The inverse distance and kriging estimates were obtained using the same parameters
that Hughes et al. used in their study [36]. The performance of the three estimators in
terms of the mean absolute error on the test data is given below:
Table 6.3: Mean absolute errors from case study 2.

Method                     Mean Absolute Error   Mean Absolute Error (%)
Inverse Distance Squared   0.0226                8.21
Kriging                    0.0291                7.18
MNNS                       0.0258                4.81
Figure 6.6 shows a scatter diagram of the actual vs. the estimated copper grades from
the various methods.
Figure 6.6: Scatter diagram of actual vs. estimated copper grades.
Once again, the MNNS is performing well compared to the other two methods.
Inverse distance and kriging appear to have very similar performance with their
estimates being very close. Unfortunately, the locations used to test the performance
of the three methods are simply samples that would otherwise have been used as input
information. Unlike case study 1 where there was a good spread of the test samples, in
case study 2 and in most of the studies to follow, input data are used for testing, which
means that the spread of the test points is not always ideal. In these cases, testing
takes the form of cross-validation, where the estimator is trying to recreate sample
points from the remaining data set. The actual as well as the estimated copper grade
distributions are shown in Fig. 6.7.
Figure 6.7: Copper grade distributions – actual and estimated.
The MNNS in this study tends to slightly overestimate grades close to the average but
generally the estimates are well balanced. The other two methods are also performing
well. The contour maps in Fig. 6.8 illustrate the results of grade estimation. The actual
grade map is limited to the sampling area as there is no information outside it. MNNS
is limited to the testing area. Inverse distance and kriging extend to the borders of the
map but comparison should be limited to the testing area.
Figure 6.8: Contour maps of copper actual and estimated grades (actual, MNNS, kriging, and inverse distance panels; grades in % Cu).
Inverse distance is by far the worst method in this case. Kriging does better but
fails to split the high-grade area. The MNNS tends to underestimate the high-grade
area that kriging models very well on the right side of the map. However, the MNNS
is better at finding the shape of the high-grade area, as well as splitting it into its parts
as they appear in the actual grade map.
6.4 Case Study 3 – 3D Gold Deposit
With the third case study, the transition is made from 2D to 3D data. This transition
means that the 3D version of the MNNS architecture is now used. The sample search
methods are not the 2D octant and quadrant methods, but the 3D search scheme
developed specifically for the MNNS.
The data used in this case study is part of a larger dataset from a copper/gold
deposit. The original dataset consists of four orebodies developed along fractures in
metasomatised host rocks, which include gneissic granites, mica schists and
metasomatites. In this study, only one of the orebodies was used. The input and test
data were limited to the drillhole samples located inside this orebody (code named
TQ2). The total number of samples was 112.
As the dataset is now 3D, the visualisation of the results of estimation
becomes more difficult. Contour maps can only be used to show sections through the
estimated area. Normally, estimation in 3D deposits is made on a block model basis,
but as the actual grade values of the blocks are unknown, the estimation performance
can only be measured over a part of the input dataset.
The orebody model was created in VULCAN/Envisage during a
geological modelling study based on lithology. Fig. 6.9 shows a 3D view of the
orebody and drillholes (screenshot from Envisage). It should be noted that in this
study VULCAN is used to provide the inverse distance and kriging estimates and
not as an implementation environment for the MNNS. The same study, including the
complete dataset with four orebodies, is repeated in the next chapter using GEMNET
II, the system fully integrated in VULCAN.
Figure 6.9: 3D view of the orebody and drillhole samples used in the 3D gold deposit study.
From the 112 available samples, 42 (37.5%) were used for testing the
performance of the three estimation methods. This means that the MNNS had only 70
samples (62.5%) available to train the various networks. After testing with all three
methods, the actual and estimated average gold grades were:
Table 6.4: Actual and estimated average gold grades.

                 Actual   ID2      Kriging   MNNS
Average (gr/t)   0.9316   0.6524   0.6581    0.7420
The mean absolute error was quite high in comparison with the previous two studies.
Clearly, a three-dimensional orebody is far more challenging and demanding than a
two-dimensional one. The mean absolute errors for the three methods are given
below:
Table 6.5: Mean absolute errors from case study 3.

                 ID2      Kriging   MNNS
Mean ABS Error   0.4242   0.3939    0.3162
Mean ABS %       44.10%   40.17%    31.60%
The results for inverse distance and kriging were obtained using cross-validation in
VULCAN. Cross-validation was limited to the 42 test samples used for testing the
MNNS. The following figure (Fig. 6.10) shows the data fit produced by the three
methods.
Figure 6.10: Scatter diagram of actual vs. estimated gold grades.
It is obvious that none of the methods performs very well. The MNNS, even though it
performs better than the other methods, tends to overestimate grades close to the
average value and underestimate the high-grade samples. This becomes clearer in the
next figure (Fig. 6.11) showing the actual and estimated distributions.
Figure 6.11: Gold grade distributions – actual and estimated.
The distribution shown in the above figure as the actual gold grade distribution refers
only to the test samples and not the entire dataset. However, as can be seen from the
following graph, this distribution is not very far from that of the entire dataset. The
main differences are in the low and high grade areas, where the test set had fewer and
more samples respectively. This could explain the relatively average performance of
all three methods.
Figure 6.12: Gold grades distribution of the complete dataset.
This study, the first using 3D data from a real deposit, shows how much more
difficult it is for the estimation methods to perform well than in the case of 2D data
from simulated deposits. The performance degradation, at least for the MNNS, can be
attributed to the higher input dimensionality and the higher complexity of
the required mapping.
This study also shows that the MNNS in its first application to 3D data has
outperformed both inverse distance and kriging. What is not clear from this study is
the time difference in applying these methods. MNNS required about an hour to
generate the training pattern files, train the networks, and provide estimates. Kriging
required a complete geostatistical study that, depending on how thorough one wants
to be, can take many hours.
6.5 Case Study 4 – 3D Chromite Deposit

The dataset used in the final study of the MNNS is taken from a larger sample
database of an undeveloped chromite deposit. There are 94 samples from 26 drillholes
in this dataset. There is no geological study and therefore the estimation is not
constrained by geology. Normally, there should be an orebody to limit the samples
used for the estimation as well as the locations where the estimation takes place, but in
this case the dataset is very small and the lack of geological modelling is not expected
to generate problems. The drillholes from the dataset are shown in Fig. 6.13.
Figure 6.13: Drillholes from a 3D chromite deposit.

From the 94 samples, 38 were used for testing the three methods while the remaining
56 were used for training the neural networks and as input information for inverse
distance and kriging. The actual and estimated average chromite grades were as
follows:
Table 6.6: Actual and estimated average chromite grades.

                              Actual    ID2       Kriging   MNNS
Average grade (% Chromite)    15.7639   14.7639   15.1511   16.3449
The estimation performance of all three methods was good considering the fact that
there was no limitation as to the samples used due to the lack of a geological model.
This means that the methods were able to estimate grades from samples that do not
necessarily belong to the same geological domain. The mean absolute errors are given
below:
Table 6.7: Mean absolute errors from case study 4.

                 ID2      Kriging   MNNS
Mean ABS Error   3.7687   3.3996    2.4536
Mean ABS %       21.83%   19.82%    16.19%
Once again, the MNNS is outperforming the other two methods but this time the
difference is clearer as they all perform well. The MNNS is closer to the actual
average chromite grade and produces the smallest absolute errors of the three
methods. This is verified by the data fit graph and grade distribution chart shown in
the following figures.
Figure 6.14: Scatter diagram of actual vs. estimated chromite grades.
Figure 6.15: Chromite grade distributions – actual and estimated.
From the above graphs it appears that kriging is doing better at the low to middle
grade samples while the MNNS is doing better at high-grade samples. Inverse
distance tends to overestimate low-grade samples and underestimate high-grade ones.
Generally, all three methods perform well.
6.6 Conclusions

The prototype 2D and 3D MNNS architectures were tested in this chapter in four very
different case studies. The datasets used came from both simulated and real
deposits. Each dataset had a different type of ore as its target quantity, the
common point being that all were metals. The number of samples in each
case study was relatively low. These were, however, case studies that aimed at the
development of the modular architecture and not at the establishment of the approach
as a valid ore grade estimation technique. Therefore the low number of samples
allowed easy monitoring of the system’s performance and fast development times.
The performance of MNNS, as measured by the produced absolute errors and
estimated grade distributions, compared very well to the performance of inverse
distance and kriging. MNNS seemed to perform well even on datasets that were
designed to demonstrate the validity of the geostatistical approach. Clearly though,
there could be plenty of room for improvement in the geostatistical studies. That,
of course, always comes at the expense of time and effort.
The speed of development and the independence of the approach from the
knowledge and skills of the user have been demonstrated by these case studies. The
quality of the estimates also has shown that the MNNS architecture is a step in the
right direction for ore grade estimation using artificial neural networks.
7. GEMNET II – An Integrated System for Grade Estimation
7.1 Overview

In this chapter the discussion continues with the analysis of GEMNET II, the
integrated system for grade estimation developed by the author and based on the
Modular Neural Network System described in the previous chapter. GEMNET II is
mainly written in C and uses parts of the SNNS, the Stuttgart Neural Network
Simulator from the University of Stuttgart, Germany [97]. The GEMNET II core
program is a data processing and control module written in C that processes the
samples file as well as the block model file. The core program also makes external
calls to parts of the SNNS simulator. These parts are the main simulator kernel, the
batch execution language (BATCHMAN), and the C code extraction program
(SNNS2C) that converts the trained neural network topologies to C functions. The
development of neural networks is controlled by a number of scripts written in the
SNNS batch language, which is very similar to AWK and C.
GEMNET II is integrated within VULCAN, a leading software package for
resource modelling. The control of the system is done through ENVISAGE,
VULCAN's graphical editor that provides the graphical user interface for GEMNET II.
The interface between GEMNET II and VULCAN is based on a number of scripts
written in a very popular scripting language called Perl. Specifically for VULCAN
there are a number of extensions to Perl, which are called Lava extensions. These give
access to graphical objects and routines in ENVISAGE, which are very useful for
integrating external programs like GEMNET II.
The integration of GEMNET II with SNNS and VULCAN provides the
following additional functionality that was missing from the MNNS as a standalone
system:
• Ability to try practically every neural network architecture without having to
modify the core of the system;
• Faster training and application of neural networks during the estimation process;
• A graphical user interface that is easy to learn and use;
• Direct access to an integrated modelling environment allowing the incorporation
of the estimation results in a larger scale modelling operation;
• Estimation based on the advanced block modelling that VULCAN provides;
• 3D visualisation of the drillhole samples and the targeted block model;
• 3D visualisation of the estimation results and validation of the training process;
• Straightforward comparison of GEMNET II with other estimation packages and
techniques incorporated in VULCAN, like the geostatistical packages GSLIB,
Geostokos, and ISATIS;
• Data management based on VULCAN's project file structure;
• Estimation reliability measures.
The MNNS core has also been modified to improve the estimation process and
provide a number of reliability measures. The next section gives the details of the core
architecture and shows how it was implemented using the SNNS simulator. It should
be noted that there were some changes in the names of modules in the MNNS.
7.2 Core Architecture and Operation
7.2.1 Exploration Data Processing and Control Module

This is the main part of GEMNET II. It is a program written entirely in C with the
code being compatible with both Microsoft Windows based PCs and UNIX based
workstations. It is responsible for processing the drillhole samples file and the block
model centroids file, normalisation of the data, generation of training patterns for the
various networks, and for making all the necessary external calls to the neural
network simulator (SNNS). Once the development of neural networks is completed
and the C code extracts have been compiled, this module carries on with the
estimation process. Figure 7.1 shows schematically the operation of this module.
The first operation of the data processing and control module is to read the
samples file and place the sample co-ordinates and assay values (grades) in a number
of arrays. This is done to increase the speed of the search process later on. The
samples file is normally a map file generated by VULCAN’s compositing function.
The map file contains a header describing the file structure and records consisting of
sample ids, sample co-ordinates, and assay values.
The values of the arrays are normalised so that all co-ordinates and assay
values vary between zero and one. As was explained before, this ensures that the
effects of the range of values are eliminated from the neural network training process.
The normalisation information (minimum, maximum, and range values) is stored in a
file that will be used later to restore the initial values and to ensure that the estimates
will also be in the correct range of values. Figure 7.2 shows the normalisation
information as reported by GEMNET II in VULCAN. The contents of the normalised
arrays (sample co-ordinates and grade) are written in a file used for training the
second module’s network. As mentioned before, this network is trained on the
entire dataset but is used to provide estimates only where there are not enough
neighbour samples for the networks of the first module.
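The normalisation step is simple enough to sketch in C. The following is a minimal illustration of the min-max scheme described above, assuming plain arrays; the function names and the struct holding the normalisation information are illustrative, not taken from the GEMNET II source.

    /* Minimal sketch of the min-max normalisation described above.
       Names are illustrative; GEMNET II stores the same information
       (minimum, maximum, range) in a file for later de-normalisation. */
    typedef struct { double min, max, range; } NormInfo;

    NormInfo normalise(double *values, int n)
    {
        NormInfo info;
        int i;
        info.min = info.max = values[0];
        for (i = 1; i < n; i++) {
            if (values[i] < info.min) info.min = values[i];
            if (values[i] > info.max) info.max = values[i];
        }
        info.range = info.max - info.min;
        if (info.range == 0.0) info.range = 1.0;  /* guard constant columns */
        for (i = 0; i < n; i++)                   /* map all values into [0, 1] */
            values[i] = (values[i] - info.min) / info.range;
        return info;
    }

    /* Restores an estimate to the original range of values. */
    double denormalise(double v, NormInfo info)
    {
        return v * info.range + info.min;
    }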
Figure 7.1: Simplified block diagram showing the operational steps of the data processing
and control module in GEMNET II.
Figure 7.2: Normalisation information panel.
The next step is the application of the search method. Each sample is taken as
the centre of the search scheme. The space around the centre sample is divided into
the six sectors described in the previous chapter. The centre sample is in essence the
training point for the RBF networks of the neural network modules. All the remaining
samples are assigned to one of the sectors depending on their relative location to the
centre of the search. It should be clear that as the discussion is about samples with an
associated volume, their location is identified as the centroid of the volume. The
normalised distance of each neighbour sample to the centre is calculated and stored
together with the normalised neighbour sample and centre sample grades in one of six
files, one for every sector.
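The exact sector geometry is defined in the previous chapter; purely as an illustration of the bookkeeping involved, the fragment below assigns a neighbour to one of six axis-aligned sectors by its dominant coordinate difference. This is a hedged sketch, not the actual GEMNET II search code.

    /* Hedged sketch: assign a neighbour to one of six sectors around the
       centre sample, taking the axis with the largest absolute coordinate
       difference. The real sector definition may differ. */
    #include <math.h>

    enum Sector { EAST, WEST, NORTH, SOUTH, UPPER, LOWER };

    enum Sector classify(const double centre[3], const double sample[3])
    {
        double dx = sample[0] - centre[0];   /* easting   */
        double dy = sample[1] - centre[1];   /* northing  */
        double dz = sample[2] - centre[2];   /* elevation */

        if (fabs(dx) >= fabs(dy) && fabs(dx) >= fabs(dz))
            return dx >= 0.0 ? EAST : WEST;
        if (fabs(dy) >= fabs(dz))
            return dy >= 0.0 ? NORTH : SOUTH;
        return dz >= 0.0 ? UPPER : LOWER;
    }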
At the end of the search process, there are six files each containing a different
number of samples depending on the geometrical characteristics of the drilling
(sampling) scheme. In fact the number of training patterns is equal between opposite
sectors, e.g. the north sector has equal patterns with the south sector, etc. Therefore
the networks of the second module in GEMNET II are trained on different number of
samples, while in the MNNS the number of samples was constant. This is because in
the MNNS only one neighbour was selected from every sector for every sample while
in GEMNET II the number of samples depends only on the available samples in a
sector. This is a fundamental difference between the two implementations of the
modular architecture. The networks in the MNNS were not provided with all the
available information on the effects of the sample distance as they were trained on
only one neighbour sample per sector. In GEMNET II the six RBF networks are
trained on all the information available in order to build a more complete model of the
distance-grade relationship.
However, the above change introduces one complication. The final
network trained on the outputs of the first module’s networks needs to be trained on
complete patterns. As each network is trained on different samples there is no
synchronisation in their training process, i.e. these networks are trained sequentially
and on different centre samples. This problem is rectified by the use of test files.
Together with the six training files produced by the search process, there are six test
files that contain patterns formed using the closest neighbour in each sector and only
for the centres where all sectors have at least one neighbour sample. This way the
trained networks can be synchronised and provide individual estimates for the same
centre samples, which can then be used for training the final module network.
All the pattern files created need to be converted into a format compatible with
the neural network simulator used (SNNS). This is fairly easy to do as SNNS reads
ASCII pattern files with a very straightforward structure. An example of a training
pattern file generated during a GEMNET II case study is given in Appendix A.
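For readers unfamiliar with that structure, a training pattern file follows the standard SNNS pattern definition header; the fragment below is an invented illustration (the values are not from any case study):

    SNNS pattern definition file V3.2
    generated at Mon Jan 01 00:00:00 1999

    No. of patterns : 2
    No. of input units : 3
    No. of output units : 1

    # Input pattern 1 (normalised grade, distance, length):
    0.42 0.17 0.80
    # Output pattern 1 (normalised centre sample grade):
    0.38
    # Input pattern 2:
    0.55 0.63 0.21
    # Output pattern 2:
    0.47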
The data processing module operation proceeds with the processing of the
block model centroids file. Again this file is generated in VULCAN using a block
model export option that calculates the centroid co-ordinates and volume of each
block. The centroid co-ordinates are real world co-ordinates and not relative to the
origin of the block model.
The block model centroids are normalised and passed one at a time to the
centre of the same search scheme used for the drillhole samples. This normalisation
uses the same parameters used for the normalisation of the drillhole samples to ensure
that their relative locations are preserved. The search process is exactly the same only
this time the place of the search centre has been taken by block centroids and only one
neighbour sample is selected – the nearest – from each sector. The neighbours are
again drillhole samples. Each block is flagged depending on the existence of a
neighbour sample in each sector. There is one flag for each sector, which is set to
one if there is a neighbour and zero if there is not. These flags are written to a file
sequentially and are used during the estimation process to control the usage of the
individual networks. This will be discussed later when the estimation process is
described.
The grade, distance, and length of the neighbour samples from the six sectors
are written in an input pattern file for the first module’s networks. The centroids of the
blocks are written in another input pattern file for the second module’s network. As
mentioned, the choice between modules one and two during estimation is
controlled by the file containing the flags from the search process.
After the processing of the block model is completed, the data processing and
control module continues with the most important aspect of the operation of GEMNET
II: the neural network development. The module makes a number of external calls to
the SNNS executables. These calls are arranged in command line batch files. The first
set of calls is targeted at BATCHMAN, the SNNS batch language for neural network
development. The calls include as arguments the name of the batch program to be
executed as well as a log file name where all the messages from the development
process are to be stored.
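As an illustration, such an external call could be issued from C as below; the -f and -l arguments name the batch program and log file in the standard BATCHMAN command line, while the wrapper function and file names are hypothetical.

    /* Hedged sketch of launching a BATCHMAN run from the control module.
       -f names the batch program, -l the log file; the call blocks until
       the training script finishes. */
    #include <stdio.h>
    #include <stdlib.h>

    int run_batch(const char *script, const char *logfile)
    {
        char cmd[512];
        int status;
        sprintf(cmd, "batchman -f %s -l %s", script, logfile);
        status = system(cmd);
        if (status != 0)
            fprintf(stderr, "BATCHMAN failed for %s (status %d)\n",
                    script, status);
        return status;
    }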
The batch programs are written in the SNNS batch language that is very
similar to AWK and C. The batch language provides access to every function of the
SNNS kernel: all the neural network architectures and learning algorithms. The batch
programs that come with GEMNET II control the development of all the employed
neural networks. The beauty of this approach is that by simply changing the batch
program, one has complete control over the learning process. As the batch program is
just a text file, an external process, such as VULCAN’s graphical user interface, can
easily alter it. This way the complete control of GEMNET II neural network
development is passed to the interface with VULCAN. An example of a batch
program from GEMNET II is given in Appendix A.
The first batch programs train the networks of module one and two using the
training patterns. The log files are written for these networks during this process.
After training terminates, the test patterns are presented to these networks to provide
synchronised outputs, i.e. individual estimates for the same samples. These outputs
are written into ‘results’ files, which are subsequently used for generating the training
patterns for the final module network. The last of the batch programs trains the final
module network. Once this process terminates, the neural network development is
complete.
The trained networks at this stage are in the form of SNNS network files –
ASCII files containing the network topology and the weights and biases after training.
These networks now need to be converted into C functions to be used during the
estimation process. The data processing and control module makes the necessary calls
to the C code extraction utility provided with SNNS, the SNNS2C. This utility creates
both the header (.h) file and the code (.c) file from the network file. Examples of a
trained file as well as the respective header and C code file are given in Appendix A.
The module calls the SNNS2C to convert all the trained networks to C code. All that
is left then in order to use the networks is to compile them and link the headers with
the application. Upon completion of this process, GEMNET II is ready to provide
grade estimates in unknown locations.
The final operation of the data processing and control module is grade
estimation on a block model basis. The program reads the flags file and uses the first
module networks whenever a sector flag is one and module two whenever it is zero. The input
pattern files generated during the block model processing described above provide the
input values for the network function calls. The final module network is then called
using the outputs of module one and two network functions. The data processing and
control module de-normalises the final estimate and the block model centroids and
writes them in an estimates file. Together with the centroid co-ordinates and grade
estimate, the module also writes the variance of the individual estimates from module
one and two networks as well as the flags showing which networks are responsible for
the estimate. These extra parameters are used to validate the estimation process and
identify any problematic areas or networks.
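A hedged sketch of this flag-driven logic is given below. It assumes the int name(float *in, float *out, int init) prototype that SNNS2C produces, illustrative network names, and one plausible arrangement in which the module two estimate stands in for sectors without neighbours; the real GEMNET II code may differ in detail.

    /* Hedged sketch of the flag-driven choice between modules during block
       estimation. Network names, array sizes, and the fallback arrangement
       are illustrative assumptions. */
    typedef int (*NetFn)(float *in, float *out, int init);

    extern int east(float *, float *, int), west(float *, float *, int),
               north(float *, float *, int), south(float *, float *, int),
               upper(float *, float *, int), lower(float *, float *, int);
    extern int spatial(float *, float *, int);    /* module two network   */
    extern int weighting(float *, float *, int);  /* final module network */

    float estimate_block(float sector_in[6][3], const int flags[6],
                         float centroid[3])
    {
        NetFn sector_net[6] = { east, west, north, south, upper, lower };
        float final_in[6], out[1];
        int s;

        for (s = 0; s < 6; s++) {
            if (flags[s])
                sector_net[s](sector_in[s], out, 0);  /* module one network */
            else
                spatial(centroid, out, 0);            /* module two network */
            final_in[s] = out[0];
        }
        weighting(final_in, out, 0);  /* weight the individual estimates */
        return out[0];                /* still normalised at this point  */
    }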
After the estimation process is complete the data processing and control
module terminates. The main and most important part of GEMNET II operation is
complete. The main contributing parts of this operation are shown in Fig. 7.3.
Notably, this operation requires only a minimum of human interaction.
Figure 7.3: Interaction between GEMNET II and other parts of the integrated system
during operation of the data processing and control module.
7.2.2 Module Two – Modeling Grade’s Spatial Distribution

The second neural network module in GEMNET II consists of the RBF network, as
described in the MNNS architecture, as well as the batch program that controls its
learning process. It is presented before the first module for consistency reasons. In
contrast with MNNS, the learning process is not part of the main program but it is
implemented in the SNNS batch language.
There is no difference in the RBF network topology for this module between
the MNNS and GEMNET II. However, there are major differences in the learning
process for this network. Most of the case studies run using GEMNET II involved
considerably larger datasets – more than a thousand samples. The learning process had
to be improved to cope with the abundance of training data.
One of the most important changes was in the initialisation of the network. In
MNNS, this was simply done by randomly placing the RBF centres in the input space.
In GEMNET II this was found to be inadequate due to the large number of samples
defining the input space. A more ‘intelligent’ way of locating the centres has been
employed: Kohonen learning. Before the network is trained and its weights adjusted,
the input patterns are clustered using a process of self-organisation known as
Kohonen learning (Chapter 2). This process ensures that the input samples are
clustered according to their statistical properties and an RBF centre is allocated to
each cluster. The random positioning of the centres is still taking place right before
this process to accelerate the initialisation stage of the development. The clustering
process is accelerated, as its starting point is a random spread of centres in the input
space.
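In its simplest, winner-take-all form (no neighbourhood function), this clustering step can be sketched in C as follows; the learning rate, its decay, and the cycle count are illustrative values, not the experimentally tuned parameters used in GEMNET II.

    /* Simplified winner-take-all Kohonen learning for placing RBF centres:
       each pattern pulls its nearest centre towards itself, so the centres
       drift into the clusters formed by the data. */
    #include <float.h>

    void kohonen_init(double centres[][3], int n_centres,
                      double patterns[][3], int n_patterns,
                      int cycles, double rate)
    {
        int cyc, p, c, d, best;
        for (cyc = 0; cyc < cycles; cyc++) {
            for (p = 0; p < n_patterns; p++) {
                double best_d = DBL_MAX;
                best = 0;
                for (c = 0; c < n_centres; c++) {   /* find the winning centre */
                    double dist = 0.0;
                    for (d = 0; d < 3; d++) {
                        double diff = patterns[p][d] - centres[c][d];
                        dist += diff * diff;
                    }
                    if (dist < best_d) { best_d = dist; best = c; }
                }
                for (d = 0; d < 3; d++)     /* move winner towards the pattern */
                    centres[best][d] += rate * (patterns[p][d] - centres[best][d]);
            }
            rate *= 0.95;                   /* decay the learning rate */
        }
    }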
Initialisation continues with the weights between hidden and output layer as
well as the bias of the hidden units. The initialised network topology is saved in a
network file for further examination in the validation stage.
Following the initialisation of the network’s input-hidden layer weights (centre
positioning), two learning stages take place. As was mentioned before, RBF learning
has to concentrate on one free parameter at a time. The learning process becomes
unstable if more than one parameter is allowed to change. Therefore, a separate
learning process is allocated for the hidden-output layer weights and the bias of the
hidden units. The learning parameters are set to experimental values that were found
after a large number of tests. The learning process for these two parameters is
identical to the one used in MNNS. Training is stopped again when the change in the
network’s output error becomes very small. The trained network topology is saved in
a network file.
The final operation of the batch program is to pass the test pattern file through
the network and write the results in a text file. This file can be used for generating a
scatter plot of actual vs. estimated grade for the specific network. This will be shown
later when the validation tools provided by GEMNET II are described.
During this development process all the messages coming from the simulator
are stored in a log file that can be opened with a text editor for examination.
Examining the log file as well as the initialised and trained network files can yield
useful conclusions about the effectiveness of the training process. The author used
these files as a guide for setting the learning parameters and the required number of
cycles. The network files provide a very useful piece of information: the location of
the RBF centres in the normalised input space (Fig. 7.4). This will prove to be very
important for validating the network’s learning and estimation performance.
Figure 7.4: RBF centres from second module located in 3D space. Drillholes and
modelled orebody are also shown.
7.2.3 Module One – Modelling Grade’s Spatial Variability

The changes in the learning process for the RBF networks of the first module are
exactly the same with the second module. The initialisation procedure makes use of
Kohonen learning for locating the RBF centres in the input space. From the discussion
on the data processing and control module it is clear that the six RBF networks of
module one are trained separately and in sequence. The learning procedure is
identical. However, there is one problem that became clear during testing. Because of
the geometry commonly found in most sampling schemes, the drillholes are arranged
in sections typically perpendicular to the orebody. This can lead to some sectors of the
search scheme being overcrowded while others have only a low number of samples. As
there is no way of knowing in advance which sectors will be overcrowded and which
will not, the training of the networks can be unbalanced, i.e. some networks have many
samples to learn but the same number of training cycles as others that have
only a few training samples.
The solution to this problem is a number of filters introduced between the first
module networks and the data processing and control module. These filters allow samples
inside a distance range to pass as training patterns to the networks, while they hold
back samples that are further away than a certain range. It should be noted that the criterion is the
distance range, i.e. a percentage of the maximum distance between samples, and not
an absolute distance. By adjusting the search range, the number of samples can be limited
and the networks can be trained on similar numbers of training samples.
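The filter itself reduces to a single comparison; a hedged sketch, with the range parameter expressed as a fraction of the maximum inter-sample distance as described above:

    /* Distance-range filter: a neighbour passes into the training set only
       if its distance is within the given fraction of the maximum distance
       between samples. The name and the example value are illustrative. */
    int passes_filter(double distance, double max_distance, double range)
    {
        return distance <= range * max_distance;   /* e.g. range = 0.25 */
    }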
A very interesting issue with the first module networks is the visualisation of
the RBF centres. In the second module, the input space is the ‘real’ 3D space defined
by the drillhole samples’ co-ordinates and therefore visualisation of the RBF centres
is straightforward. In the first module networks though, the input space is not the 3D
space of real world co-ordinates, but the hyperspace defined by the distance, grade,
and length of neighbour samples. In order to visualise the RBF centres, this space is
constructed in Envisage using the training input patterns. A new mapping window is
constructed by substituting the three co-ordinates (easting, northing, and elevation)
with the grade, distance, and length of samples. The training samples and RBF centres
can then be visualised in this hyperspace (Fig.7.5).
Figure 7.5: RBF centres of west sector RBF network and respective training samples in the input pattern hyperspace (X-Grade, Y-Distance, Z-Length).
It is somewhat difficult to understand the way samples are placed in this
hyperspace as well as how the RBF centres are located. However, after careful
examination of images like the one in Fig. 7.5, the distribution of samples becomes
clearer. A very interesting finding is that samples being chosen as neighbours in a
specific direction appear to form lines of constant X-Grade and varying Y-Distance.
This of course should have been expected, but pictures like this help to understand
even further the characteristics of the input space.
7.2.4 Final Module – Providing a Single Grade Estimate

The final module consists of a single RBF network responsible for weighting
the individual estimates of the first and second module networks. This network does
not model the grade in an input vector space. It simply tries to model the relationship
between the responses of the first and second module networks and the actual grade
values. This network is completely ‘unaware’ of sample co-ordinates or neighbour
sample grades, distances, and lengths. The only information provided to this network
is the required output (actual grade at estimation point) and the estimates of the
individual networks.
The purpose of this network is to replace the simple averaging that was the
way of providing a single estimate from the various networks in the earlier
architectures. During testing it was found that the final estimate can be brought even
closer to the actual value by weighting the individual network estimates. One could
argue about the use of an artificial neural network for this task, and in fact the author
had many recommendations by other researchers in the field of AI that did not suggest
the use of an ANN or specifically an RBF network. However, the RBF network of the
final module proved to be at least good enough for this weighting task and with this
project being dominated by the use of ANNs, the author did not look any further. It
should be noted though that different ANN architectures were tested.
The RBF network of the final module is shown in Fig. 7.6. It is a simple 3D
representation of this network and the location of the RBF hidden units has nothing to
do with the positioning of the RBF centres before or after training.
Figure 7.6: Final module’s RBF network.
A training process very similar to that of the other neural modules determined
the number of RBF centres and their location in the input space. Unfortunately, due to
the high dimensionality of this network’s input space (6D) it is not possible to use
Envisage or any other graphical environment for the direct visualisation of the RBF
centres and training samples in the correct input space. It is only possible to examine
the learned model using any three of the six inputs at a time.
The training process for this network involves the results of the previous
networks on the test samples and not on their training samples. This was necessary to
allow complete freedom in the number of samples used for training the first module’s
networks. However, the author believes that this could be a source of inefficiency for
the complete architecture as this is the final RBF network that controls the final
estimate produced. If the test samples are not representative of the dataset then the
RBF network of the final module could have difficulties in providing reliable results.
This is an aspect of GEMNET II’s operation that needs monitoring. The author suggests
that the distributions of grade estimates from the various first and second module
networks be compared with the final module network estimates.
The validation of the system’s operation during neural network development
as well as during grade estimation has been a consideration of the author since the
beginning of GEMNET II development. This fact led to the development of validation
tools specific to GEMNET II and implemented using VULCAN’s graphical
capabilities. These are the subject of the next section of this chapter.
7.3 Validation
7.3.1 Training and Validation Errors

The first and most common way of measuring a neural network’s performance
is by calculating its estimation error on the training or validation pattern set. The
training error is less important as it reflects the performance of the network on
samples on which it was trained to perform well. In other words, the training error is not a
good measure of a network’s performance. However, the training error can indicate
problems in the learning process that can be due to an inadequate number of samples or
training cycles, or both. If a network cannot reach an acceptable error level regardless
of the number of training cycles, then the learning algorithm needs to be modified or
the number of samples increased. One has to monitor the progress of the training
error curve cycle after cycle in order to conclude as to the origin of high training
errors.
A more representative and reliable measure of a network’s performance is the
validation error. A good learning algorithm should normally be based on the
validation error to guide the weight changes but even if this is not the case, a
validation pattern set can help build confidence in the learned mappings. In the case
of GEMNET II and samples from drillholes, generating a validation set and using it
for measuring its performance is not an easy task. In geostatistics and other more
conventional methods the developed estimation technique is validated using the
process of cross-validation. Cross-validation is in essence the regeneration of the
samples by hiding one at a time and trying to estimate it using the remaining samples.
In the case of the neural networks in GEMNET II, this is what the training process
does. In other words, cross-validation is not applicable in the case of GEMNET II
because it can give very misleading results.
On the other hand, by hiding samples from the training process to use them as
a validation set automatically means that GEMNET II has fewer samples to train the
networks and therefore less chance of producing good results on the validation set.
This is especially applicable when the system is dealing with a very complex orebody
that requires as many samples as possible to describe its grade behaviour in space.
With this consideration in mind, the author suggests that a validation set be
generated at first to measure the networks’ generalisation performance. If the
validation errors are acceptable then the networks should be retrained using the same
training process but including the samples of the validation set to ensure that the best
possible mappings are generated.
7.3.2 Reliability Indicator

The learning process in GEMNET II is relatively more complex than in other
systems as it involves a very modular neural network structure. It is important to see
the final estimate produced as the result of the weighting of individual estimates.
Therefore by measuring the variance of these estimates one can conclude as to the
reliability of the final estimate. In other words, the higher the agreement between the
individual estimates the higher the reliability of the final estimate and vice versa. The
variance of the individual estimates will be mirrored by the weight values of the final
network. A combination of very high and very low weight values in the final network
expresses the difficulty of the final network in getting close to the actual grades.
The variance of the first and second module networks’ estimates has been used
as the basis of a reliability measure or reliability indicator. This is calculated during
the estimation process. In VULCAN, the user has to add an extra variable to the block
model to be used by GEMNET II for storing the reliability indicator for each block
estimated. After the estimation process, the block model can be visualised in 3D or in
sections with a colour scheme based on the reliability indicator (Fig. 7.7). This way it
is possible to identify areas where GEMNET II has difficulty providing an estimate.
The reliability indicator though cannot lead by itself to the origin of the problem or
even quantify it. It is strictly an indicator, i.e. a guide that can help identify problems.
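Since the thesis does not reproduce the exact formula, the sketch below assumes the indicator is simply the variance of the individual network estimates for a block, which matches the description above:

    /* Hedged sketch of a variance-based reliability indicator: the higher
       the disagreement between the individual estimates for a block, the
       lower the reliability of the final estimate. */
    double reliability_indicator(const double est[], int n)
    {
        double mean = 0.0, var = 0.0;
        int i;
        for (i = 0; i < n; i++) mean += est[i];
        mean /= n;
        for (i = 0; i < n; i++) var += (est[i] - mean) * (est[i] - mean);
        return var / n;   /* population variance of the individual estimates */
    }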
Figure 7.7: Block model coloured by the reliability indicator in GEMNET II.
7.3.3 Module Index

Another useful source of information is the flags stored in the flags file during the
processing of the block model centroids file by the data processing and control
module. This file consists of records with six flags each, one for every sector. The flag
values are one for sectors with neighbour samples and zero for empty sectors. These
values are used during estimation for choosing between the first (sector flag = 1) or
the second module networks (sector flag = 0).
These flags are stored in the block model. Specific variables have to be set in
the model to contain the flag values. The block model can then be visualised in
Envisage using a colour scheme that depends on the flag values or module index (Fig.
7.8). By combining the module index and the reliability indicator, it is easy to identify
the networks that can present problems during estimation.
Figure 7.8: Block model coloured by module index in GEMNET II. Cyan blocks represent
first module estimates while red blocks represent second module estimates.
7.3.4 RBF Centres Visualisation

The location of the RBF centres in the input vector space is absolutely crucial to the
performance of an RBF network. The RBF centres visualisation tool has been
developed specifically for GEMNET II in Envisage and allows the displaying of both
the centres and the training samples of any RBF network from the modular
architecture (Fig. 7.9). This option displays the RBF centres using a special symbol on
the screen and the training samples as crosses. The correct input space is used,
i.e. the 3D real world co-ordinates space for the second module and the neighbour
sample grade, distance, and length input space for the first module.
Figure 7.9: First module RBF centres visualisation in GEMNET II. Drillholes and orebody
model are also shown.
Clearly this is an option for the users who will know the basics of the system’s
operation; otherwise it will not be very useful. By looking at the positions of the RBF
centres, one can decide whether the network initialisation procedure is efficient and
whether the learned mapping is reliable. A well spread distribution of centres in the
input space with a high density of centres in areas where grade seems to present a
complex behaviour suggests that the network has been properly developed. High
density of centres in areas with very few or even no samples means that the
initialisation and training process needs to be modified. Usually an increase in the
number of initialisation or training cycles is required, or an increase in the learning
parameters.
7.4 Integration
7.4.1 Neural Network Simulator

Development of neural networks in GEMNET II is based on the Stuttgart Neural
Network Simulator (SNNS) developed at the Institute for Parallel and Distributed
High Performance Systems (IPVR) at the University of Stuttgart, Germany. SNNS
was originally developed for the UNIX operating system but was recently ported to
the Microsoft Windows 95/NT environment. It is still based on X Windows and
requires an X Server in Windows 95/NT for the graphical user interface. Figure 7.10
shows a schematic diagram of its main components.
Figure 7.10: Diagram of the main components of SNNS.

The four main components of SNNS are the simulator kernel, graphical user
interface, batch execution language (BATCHMAN), and network C code extraction
tool (SNNS2C). The graphical user interface is not used in GEMNET II as this is
provided by Envisage in VULCAN. The other three parts - mainly BATCHMAN and
SNNS2C - are extensively used. The simulator kernel includes a number of functions
for:
• Network manipulation
• Network structure definition
• Cell (processing element) definition and manipulation
• Learning
• Pattern manipulation
• Pattern propagation
• Network and pattern file handling
• Error calculations
• Memory management
The batch execution language in SNNS, BATCHMAN, has been modelled after
languages such as AWK, Pascal, Modula2 and C. BATCHMAN provides a
command line or scripting interface to the simulator kernel. It is possible to send
commands directly in interactive mode using the interpreter or execute complete
batch scripts by calling BATCHMAN with the batch script file name as an
argument. The structure of the batch scripts or programs is not predetermined.
There are a number of system variables available for monitoring the development
of the networks. These can be used during training to create more advanced
training algorithms. The available system variables are:
Table 7.1: System variables available in BATCHMAN.

SSE         Sum of squared differences of each output neuron
MSE         SSE divided by the number of training patterns
SSEPU       SSE divided by the number of output neurons
CYCLES      Number of cycles passed
PAT         Number of patterns in the current pattern set
EXIT_CODE   Exit status of an external call
SIGNAL      Integer value of a caught signal during execution
There is a total of eight batch programs in GEMNET II for the development of the
eight RBF networks. These programs are very similar to each other and generally
follow the same steps:
1. Load untrained network file and training and testing pattern files
2. Initialise the network using Kohonen learning
3. Write the initialised network to a file
4. Train the network’s hidden-output layer weights
5. Train the network’s hidden units’ bias
6. Write the trained network to a file
7. Test the network using the test pattern file and write the results to a file
BATCHMAN is called from the data processing and control module using the scripts and a name for the training log file. BATCHMAN runs the scripts and writes all the messages during the steps described above to the log file. After all eight scripts have been executed, control is passed back to the data processing and control module. The user can open the log files with a text editor to get more information about any possible problems as well as the training and validation errors. From the execution of each script the following files are created:
<network name>ini.net: initialised topology (e.g. eastini.net)
<network name>tr.net: trained topology (e.g. northtr.net)
<network name>.log: training log file (e.g. east.log)
<network name>.res: results of testing (e.g. east.res)
The other SNNS tool used in GEMNET II is the network compiler SNNS2C.
This tool compiles a network file into compilable C source code. There are
limitations as to the network types and other SNNS features supported by SNNS2C,
but fortunately none of them causes any problems to the GEMNET II modular
network architecture. SNNS2C supports all the necessary features for GEMNET II.
The input to SNNS2C is the trained network file as created from
BATCHMAN after executing the batch scripts of GEMNET II. SNNS2C generates
ANSI-C source code and header files. The generated code is compiled separately. The
header files are linked to the data processing and control module. This way the
produced network C functions are linked to GEMNET II and can be called during
grade estimation. During network compilation SNNS2C goes through the following
steps:
1. Network loading: the network file is loaded with the function from the
simulator kernel.
2. Dividing network into layers: individual units are grouped into layers with
the same type and activation function.
3. Layers sorting: the layers are sorted in topological order.
4. Network writing: the generated network structure, activation functions and
pattern propagation code are written to the C source file.
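For illustration, using one of the generated networks then amounts to including its header and calling the function; the fragment below assumes SNNS2C's int name(float *in, float *out, int init) prototype and a hypothetical network called east.

    /* Hedged example of calling an SNNS2C-generated network function.
       The header name, network name, and array sizes are hypothetical. */
    #include <stdio.h>
    #include "east.h"   /* header generated by SNNS2C from east.c */

    int main(void)
    {
        float in[3] = { 0.42f, 0.17f, 0.80f };  /* normalised inputs */
        float out[1];                           /* normalised output */
        east(in, out, 0);                       /* init flag per SNNS2C */
        printf("estimate: %f\n", out[0]);
        return 0;
    }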
Altogether, SNNS proved to be very useful for the development of neural
networks in GEMNET II. The flexibility provided by the batch execution language
and the very large library of network types, activation functions, and learning
algorithms provided by the simulator kernel allowed quick and easy testing of
different learning strategies and network architectures. It would be very time
consuming, if not impossible, to do the same development and testing without the
simulator, using hard-coded neural networks and learning algorithms.
7.4.2 Interface with VULCAN – 3D Visualization

Grade estimation is part of a much larger process that involves other tasks such as
geological modelling and reserves estimation. In order to exploit the full potential of
GEMNET II, it has to be integrated in this larger process of mineral deposit evaluation
[39]. This was achieved using VULCAN, one of the leading earth resources
modelling packages available for the mining industry.
VULCAN is a modular package, i.e. it consists of a core module (VULCAN
Modeller) and a number of specialised modules like the MineModellers,
GeoModellers, SurveyModeller, and Chronos (scheduler) (Fig. 7.11). VULCAN can
be customised to include the functionality required by specific projects and for that
reason this system has all the necessary features that allow third party software to be
interfaced to it.
VULCAN’s user interface, Envisage, is an advanced 3D modelling
environment that provides advanced 3D CAD and visualisation as well as
triangulation modelling, grid mesh modelling, and contouring [57].
VULCAN’s GeoModellers provide functions for drilling, borehole
visualisation, channel sampling, geological modelling, geostatistics, block and grid
modelling, stratigraphic modelling, and other tasks. For geostatistics, the
GeostatModeller can be interfaced to the GSLIB, Geostokos, and ISATIS
geostatistical packages. Block models can be visualised in 3D and manipulated in
many different ways. GEMNET II relies on the importing and exporting functions
available for block models in VULCAN as well as the drillhole compositing
functions.
Envisage provides customised user menus, i.e. users can create their own
menus that look and act exactly like the rest of the GUI and can provide the functions
that the user wants. These functions can be directly linked to a Perl script
(VULCAN’s supported scripting language), which means that users can add
functionality to the system. GEMNET II is interfaced to VULCAN by a number of
scripts written in Perl and utilising the extensions for VULCAN, called Lava.
Figure 7.11: Modules and extensions of VULCAN.
The structure of the user interface is shown in Fig. 7.12. The menu for
GEMNET II includes options for setting the estimation parameters, network
topologies and learning, and validation.
Figure 7.12: Menu structure of GEMNET II in Envisage.
There is a main menu and two sub-menus for the setup and validation. All options
lead to panels that accept user input from the keyboard. These panels (Fig. 7.13)
access the options available with GEMNET II and allow the user to do the following
things:
1. Select samples file and block model
2. Modify the learning method and network topologies
3. Run GEMNET II with the saved specifications
4. Display the block model using the reliability indicator or the module index
5. Display the input samples and RBF centres in the correct input space
GEMNET II also requires functions already built into Envisage. These include:
1. Drillhole compositing
2. Block model ASCII import/export functions
3. Block model display functions
Figure 7.13: GEMNET II panels in Envisage.
After the user selects the input and output files for the estimation process,
GEMNET II can start the network development. The data processing and control
module is called using the Run option from the main menu. A console window is
opened and GEMNET II begins with the processing of the samples and the generation
of the training pattern files (Fig. 7.14).
Figure 7.14: Console window with messages from GEMNET II operation.
The data processing and control module continues its operation in the
background while the user can carry on using Envisage. Once the network
development is complete and the networks are compiled, grade estimation takes place.
The results are written to a file selected by the user. This file can then be imported to
the block model. The user can then validate the estimation process using the tools
described and compare the results with other studies using geostatistics within the
Envisage environment.
VULCAN’s online help is based on a web browser and HTML files for each
and every option. A number of pages were added to provide help for GEMNET II. The
help is context based, i.e. it depends on the function that the user is trying to access
(Fig. 7.15).
Figure 7.15: GEMNET II online help.
The system operates in a very similar manner to other functions in Envisage,
which means that users can become familiar with GEMNET II in a very short period of
time.
7.5 Conclusions

In this chapter an in-depth discussion was given on GEMNET II, the integrated system
for grade estimation based on artificial neural networks. The benefits of the approach
were explained and in particular the advantages of the integration with the neural
network simulator, SNNS, and the resources modelling package, VULCAN.
Even though GEMNET II is based on the basic MNNS architecture described
in the previous chapter, there are many improvements that make GEMNET II a
much more usable system.
The system has many advanced features that can establish it as a commercial
product. It provides validation tools that can help build confidence in the estimates
while it removes most of the problems found in other grade estimation techniques.
GEMNET II makes very few assumptions about the grade distribution. Its operation
does not depend on the user’s knowledge of geology, geostatistics, or even neural
networks. It should be noted though that knowledge of neural networks can sometimes
improve the results, but not significantly. Generally, the system adjusts to the data
presented to it to achieve the best possible estimation.
Even though it is based on artificial neural networks, GEMNET II is not a
‘black box’ approach. The technique is fairly understandable as it is based on
established principles of grade spatial behaviour. The validation tools provided with
GEMNET II and the exhaustive monitoring of the network development also help the
user to understand how it works and why. In the next chapter the validity of the
approach will be proved through a number of case studies using real 3D data from
different deposits around the world.
8. GEMNET II Application – Case Studies
8.1 Overview

The case studies presented in this chapter are the final tests of the GEMNET II
architecture. Their purpose was to demonstrate the full potential of the approach and
provide a complete comparison with other estimation techniques. They are presented
in order of increasing complexity and difficulty. The number of available samples
increases as well as the structural complexity of the deposits.
The data used in these case studies come from real deposits. In some of them
the 3D co-ordinates of the samples have been changed without affecting their relative
locations for confidentiality purposes. The number of case studies was limited to four
as in the previous chapter. The selected case studies are the most representative of
GEMNET II performance while being quite different from each other. These studies
are also ideal for geostatistics and in fact have been used for demonstrating
grade/reserves estimation using computer software. However, no results have ever
been published using this data other than the papers written by the author during this
project.
The deposits in the four case studies that follow present a complex 3D
structure. They all come with a complex geological model, which is used for
constraining the estimation process. The geological model in some cases becomes
more complicated by the presence of faults and other discontinuities. This factor
makes grade estimation an even more challenging task.
In all of the case studies, a complete geostatistical study has been performed,
the results of which are presented in this chapter together with the study of GEMNET
II application. Unfortunately, the author was not able to get authorization for
publishing results from case studies other than copper/gold deposits. There seems to
be an abundance of real copper/gold data available from fully exploited or
undeveloped deposits. The same does not apply for other metals and minerals.
The four copper/gold deposits used for testing the estimation performance of
GEMNET II have very little in common. Except for the type and possibly the way
they were formed, these deposits present a very different 3D picture and a very
different estimation task. Their size and geometry varies significantly as does the
grade distribution suggested by the available samples.
The available samples for each of the four deposits vary in number
considerably. The drilling geometry is also different as is the assaying procedure.
These differences ensured that GEMNET II would be tested on very different
conditions and data and that the results would reflect its performance over a wide
range of problems. Table 8.1 gives the main characteristics of the four deposits used
in this chapter. The data from them are given in the accompanying CD-ROM.
Table 8.1: Main characteristics of the four deposits used for testing the final GEMNET II
architecture.
Name                  MAC_DEMO   THOR   SME      GEOST_GOLD
Number of Samples     1361       3612   10,656   30,211
Estimated Grades      Au, Cu     Au     Cu       Au
Number of Orebodies   1          4      5        1
As can be seen from the table, the deposits are given code names. These names are
used as a replacement of their original name and location for confidentiality purposes.
The same computer system has been used for all four case studies. It was a
Pentium II 300MHz with 128Mb RAM and 1Gb of virtual memory space running
under Microsoft Windows NT 4.0. The time required to complete each case study has
been affected by the specifications of the system and therefore comparison with other
similar studies should not be made unless these specifications are the same. The
geostatistical studies were also performed using the same computer.
GEMNET II was run from VULCAN/Envisage version 3.3. Geostatistics
were also run from the same environment using GSLIB. Therefore the same
computational overhead from VULCAN was present while the various
approaches were tested.
The measures of performance for the three approaches compared were the
mean absolute error, the data fit diagram (scatter plot), and the estimated vs. actual
grade distribution diagram. These performance measures were based on samples
taken out of each dataset that were not provided as input information for any of the
three techniques. In other words, these were unknown samples for the estimators but
not for the performance measures. This was considered by the author as a more
objective way of comparing the various techniques, as the actual values of those
samples were known as opposed to block estimates of unknown actual grade. For
GEMNET II, the reliability indicator values are also shown in slices through the
estimated block model. It should be noted again that the reliability indicator is only a
guide to the quality of the produced estimates from GEMNET II and not a precise
performance measure.
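Both error figures are straightforward to compute; the sketch below assumes that 'Mean ABS %' is the absolute error relative to the actual grade, averaged over the test samples, which is an interpretation rather than a definition taken from the thesis.

    /* Hedged sketch of the two error measures reported in the case studies. */
    #include <math.h>

    void error_measures(const double actual[], const double est[], int n,
                        double *mae, double *mape)
    {
        double sum_abs = 0.0, sum_pct = 0.0;
        int i;
        for (i = 0; i < n; i++) {
            double e = fabs(actual[i] - est[i]);
            sum_abs += e;
            sum_pct += e / actual[i];   /* assumes non-zero actual grades */
        }
        *mae  = sum_abs / n;            /* mean absolute error            */
        *mape = 100.0 * sum_pct / n;    /* mean absolute percentage error */
    }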
For the case studies where the deposit consists of more than one orebody, the
samples were split into groups, one for each orebody. The same was applied to the
block model. In each run only data from inside an orebody are used and only blocks
inside the same orebody are estimated. As a result, two of the case studies
(THOR and SME) are much more complicated and took a lot more time to complete.
Finally, the basis of comparison for the various approaches was the results on
the test set for GEMNET II described in Chapter 7, and cross-validation results for
inverse distance and kriging on the same test set. Cross-validation was performed
again using GSLIB inside VULCAN.
8.2 Case Study 1 – Copper/Gold Deposit 1

The dataset from the first copper/gold deposit consists of 44 drillholes containing
1361 samples in total. From these only 227 samples are within the geological model
of the orebody as can be seen in Fig. 8.1 below. These samples are used for the
estimation process.
Figure 8.1: Orebody and drillholes from copper/gold deposit 1.
The number of samples is quite small and the 3D model of the orebody fairly simple
making this case study a relatively easy task. The following table gives the statistics
for the data used in this case study:
Table 8.2: Statistics of data from copper/gold deposit 1.

     Number of   Average    Standard    Coefficient    Number of
     Samples     Grade      Deviation   of Variation   Estimated Blocks
Au   227         2.34 g/t   3.1758      1.0339         5,698
Cu   227         4.01 %     4.7184      0.9424         5,698
The co-ordinates of the samples have been transformed from their original values for
confidentiality purposes. The relative positions of the samples have not been changed.
The data processing and control module of GEMNET II generated 1109, 2156, and
1905 patterns in the west-east, north-south, and upper-lower sectors respectively.
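The sketch below illustrates, in simplified form, how such sector patterns can be generated: for each target sample, neighbouring samples lying mainly along a sector axis contribute patterns of the form (grade, distance, length) -> target grade, matching the input and output units of the networks listed in Appendix A. The exact sector geometry and normalisation used by GEMNET II are described in Chapter 7 and are simplified here.

#include <stdio.h>
#include <math.h>

/* Simplified sketch of sector-based training pattern generation.
   For each target sample, every other sample whose direction lies
   mainly along the sector axis (here: west-east) contributes one
   pattern (grade, distance, length) -> target grade, matching the
   input/output units of the networks in Appendix A. The real
   module normalises the values and applies the sector geometry
   of Chapter 7; both are simplified here. */
typedef struct { double x, y, z, grade, length; } Sample;

int main(void)
{
    Sample s[] = {   /* hypothetical composite samples */
        { 100, 200, 50, 1.2, 1.0 },
        { 140, 205, 52, 0.9, 1.0 },
        { 102, 260, 48, 1.6, 1.5 },
    };
    int n = sizeof s / sizeof s[0];
    for (int t = 0; t < n; t++) {
        for (int i = 0; i < n; i++) {
            if (i == t) continue;
            double dx = s[i].x - s[t].x, dy = s[i].y - s[t].y, dz = s[i].z - s[t].z;
            double dist = sqrt(dx*dx + dy*dy + dz*dz);
            /* keep neighbours lying mainly west-east of the target */
            if (fabs(dx) > fabs(dy) && fabs(dx) > fabs(dz))
                printf("pattern: %.3f %.3f %.3f -> %.3f\n",
                       s[i].grade, dist, s[i].length, s[t].grade);
        }
    }
    return 0;
}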
[Chart: Copper Grade Values Data Fit – estimated vs. actual grade scatter; series: ID2, Kriging, GEMNet II]
Figure 8.2: Scatter diagram of actual vs. estimated copper grades from copper/gold deposit 1.
First the three methods were tested using the copper grade data. The mean
absolute errors produced were 18.9% for GEMNET II, 20.06% for inverse distance
squared, and 19.68% for spherical kriging. Clearly, there was not much difference in
this case between the different approaches. The data fit diagram of Fig. 8.2 shows
exactly how close they were.
[Chart: Copper Grade Distribution – frequency per grade bin; series: Actual, ID2, Kriging, GEMNet II]
Figure 8.3: Copper grade distributions from copper/gold deposit 1.
Unlike the absolute errors, which suggest that GEMNET II is doing slightly better than
kriging and inverse distance, the estimated distributions shown in Fig. 8.3 show that
inverse distance follows the actual distribution of copper grades more closely, with
GEMNET II and kriging presenting very similar distributions. GEMNET II tends to
underestimate high-grade samples but the overall estimation is not biased. On the
other hand, inverse distance seems to overestimate low-grade samples.
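The distribution diagrams are ordinary binned frequency counts; a minimal sketch of the binning is shown below, with arbitrary example bin edges.

#include <stdio.h>

/* Sketch of the binned frequency counts behind the distribution
   diagrams: actual and estimated grades are counted into the same
   bins, with a final "More" bin for values above the last edge.
   The bin edges here are arbitrary examples. */
void histogram(const double *v, int n, const double *edges, int nb, int *count)
{
    for (int i = 0; i <= nb; i++) count[i] = 0;
    for (int i = 0; i < n; i++) {
        int b = 0;
        while (b < nb && v[i] > edges[b]) b++;
        count[b]++;               /* count[nb] is the "More" bin */
    }
}

int main(void)
{
    double edges[]  = { 2, 4, 6, 8, 10, 12, 14 };
    double actual[] = { 1.5, 3.2, 5.1, 9.8, 15.0 };
    int count[8];
    histogram(actual, 5, edges, 7, count);
    for (int b = 0; b < 8; b++) printf("bin %d: %d\n", b, count[b]);
    return 0;
}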
The time requirements for the application of the three methods were quite
different, even though the geostatistics were fairly straightforward in this case.
GEMNET II required 50 minutes to process the samples and block model centroids,
develop the networks, and perform grade estimation. The geostatistical study
required about 3 hours to complete. The time spent on grade estimation using
inverse distance and kriging, once the geostatistical study was complete, was about
15 minutes. Even though there is a difference, this study is not ideal for
demonstrating the speed benefits of GEMNET II. The difference in time requirements
between geostatistics and GEMNET II will be demonstrated in the following case
studies, which present a much more complicated structural picture.
In the second part of the study, the techniques were tested using gold grades
from the same samples. The time requirements were identical to the first part. The
errors produced were quite similar as well: 18.78% for GEMNET II, 22.47% for
inverse distance squared, and 20.47% for spherical kriging. Figure 8.4 shows the data
fit diagram of the estimates and Figure 8.5 the estimated and actual gold grade
distributions.
[Chart: Gold Grade Values Data Fit – estimated vs. actual grade scatter; series: ID2, Kriging, GEMNet II]
Figure 8.4: Scatter diagram of actual vs. estimated gold grades from copper/gold deposit 1.
[Chart: Gold Grade Distribution – frequency per grade bin; series: Actual, ID2, Kriging, GEMNet II]
Figure 8.5: Gold grade distributions from copper/gold deposit 1.
Quite clearly, GEMNET II tends to underestimate high-grade samples once again,
even though this time it seems to be doing a bit better than in the case of copper
grades. Generally, the behaviour of the three estimators is very similar for both
estimated grades, copper and gold. The following table shows the estimated average
grades for gold and copper.
Table 8.3: Actual and estimated average copper and gold grades from copper/gold deposit 1.

           Actual  ID2   Kriging  GEMNet II
Au (g/t)   2.34    2.54  2.26     1.96
Cu (%)     4.01    3.69  3.72     3.41
The estimation performance of GEMNET II can be monitored through the
validation tools developed within VULCAN, mainly the reliability indicator
and the module index. The RBF centers visualization tool also provides some insight
into the process of neural network development for grade estimation in GEMNET II.
The following figures illustrate sections through the block model of the copper/gold
deposit in this case study, coloured according to the reliability indicator (Fig. 8.6)
and the module index (Fig. 8.7). Figure 8.8 also illustrates the positions of RBF
centers from various networks in their respective input pattern space.
Figure 8.6: Plan section (top) and cross section (bottom) of block model coloured by reliability
indicator values for the gold grade estimation of copper/gold deposit 1.
Figure 8.7: Plan section (top) and cross section (bottom) of block model coloured by module
index for gold and copper grade estimation of copper/gold deposit 1.
From sections like those in Fig. 8.6 one can identify areas where the
estimation process with GEMNET II is problematic. These areas are usually close to
the edges of the modelled orebody or around faults and other discontinuities. In this
case, the low reliability area is indicated at the middle part of the orebody. This was
expected before the estimation process due to a dyke that intersects the orebody
at exactly that location. The sections of Fig. 8.7 show which module is
responsible for providing each estimate and, in conjunction with the reliability
indicator sections, can help optimise the estimation process.
Figure 8.8: RBF centers locations and training patterns from module 1 networks, north (top)
and east (bottom).
The visualization of RBF centers from various networks helps in understanding how
the system performs grade estimation and, in particular, how it clusters the training
patterns. A good spread of the centers in the input space, as in this case study, means
that the neural network development is responding properly to the data at hand.
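One simple way to quantify this spread is sketched below: the mean distance from each training pattern to its nearest centre. This metric is an illustration only and is not part of GEMNET II.

#include <stdio.h>
#include <math.h>

/* One way to quantify the "spread" judged visually in the RBF
   centre plots: the mean distance from each training pattern to
   its nearest centre. A small value means the centres cover the
   pattern cloud well. This metric is an illustration only, not
   part of GEMNET II. */
double mean_nearest_centre(const double p[][3], int np,
                           const double c[][3], int nc)
{
    double total = 0.0;
    for (int i = 0; i < np; i++) {
        double best = 1e30;
        for (int j = 0; j < nc; j++) {
            double dx = p[i][0]-c[j][0], dy = p[i][1]-c[j][1], dz = p[i][2]-c[j][2];
            double d = sqrt(dx*dx + dy*dy + dz*dz);
            if (d < best) best = d;
        }
        total += best;
    }
    return total / np;
}

int main(void)
{
    double patterns[][3] = {{0,0,0},{1,0,0},{0,1,0},{1,1,1}};
    double centres[][3]  = {{0.2,0.2,0.1},{0.9,0.8,0.7}};
    printf("mean nearest-centre distance: %.3f\n",
           mean_nearest_centre(patterns, 4, centres, 2));
    return 0;
}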
Figure 8.9: Plan section (top) and cross section (bottom) of block model coloured by gold
grade estimates for the copper/gold deposit 1.
The results from grade estimation are shown in Fig. 8.9 as sections through
the estimated block model. It should be noted that the real grade values for the blocks
are unknown and it is therefore not possible to compare the estimated values with the
actual ones. It is also of little use to compare the block estimates from the three approaches.
8.3 Case Study 2 – Copper/Gold Deposit 2

The dataset of this case study is a superset of that used in the third case study
described in Chapter 6. It is a public domain set from a large undeveloped copper/gold
deposit. It consists of four orebodies, as shown in Fig. 8.10. These orebodies occur in
the form of chains of lenses (fractions of the deposit) developed along shear fractures
in metasomatised host rocks, which include gneissic granites, mica schists and
metasomatites. The set contains 77 drillholes providing a total of 3600 observations
on lithology, bleaching, structure and assays. Figure 8.10 shows the drillholes
together with the lenses in the area. The networks in GEMNET II are trained and
tested on each lens individually, i.e. only samples inside the volume of a single lens
are used to train and test the networks each time.
Figure 8.10: Orebodies and drillholes from copper/gold deposit 2.
The data processing and control module searched the dataset for each orebody and
formed training patterns, the number of which varies from one orebody to the next.
The results of the training pattern generation as well as other information about the
data used are shown in Table 8.4 below:
Table 8.4: Samples and block model file information and training pattern generation results for copper/gold deposit 2.

Orebody           TQ1     TQ1A   TQ2    TQ3
Samples Included  689     382    133    484
Blocks Included   8,003   8,188  2,040  16,596
Patterns – WE     38,023  3,086  649    7,842
Patterns – NS     16,117  2,829  283    99
Patterns – UL     9,514   7,182  897    12,342
It becomes clear from the table above that the higher the number of available
drillhole samples, the higher the number of training patterns produced. This, however,
depends on the sampling geometry and can vary between sectors. In the case of
orebody TQ1, for example, and specifically for the West-East sector, the number of
generated training patterns is fairly high (38,023). This inevitably leads to longer
training times for the specific networks. In fact, in some cases the time
requirements are so high that it is practically impossible to train the networks with all
of the available patterns. This is the reason why the data are filtered by a distance
criterion, i.e. a percentage of the maximum distance between samples. This criterion, as
introduced in the previous chapter, has nothing in common with the structural analysis
and the range values in variography. The maximum distance ranges in GEMNET II
are set to limit the number of training patterns per network, depending only on the
hardware limitations.
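A minimal sketch of such a distance filter is given below: pairs of samples further apart than a chosen percentage of the maximum inter-sample distance are simply not turned into patterns (10% is the value used in the next case study). The sample coordinates are hypothetical.

#include <stdio.h>
#include <math.h>

/* Sketch of the distance filter that caps the number of training
   patterns: pairs of samples further apart than a percentage of
   the maximum inter-sample distance are not turned into patterns.
   The percentage reflects hardware limits only. */
#define N 4

static double dist(const double a[3], const double b[3])
{
    double dx = a[0]-b[0], dy = a[1]-b[1], dz = a[2]-b[2];
    return sqrt(dx*dx + dy*dy + dz*dz);
}

int main(void)
{
    double s[N][3] = {{0,0,0},{50,0,0},{0,400,0},{30,20,10}};
    double maxd = 0.0, pct = 0.10;
    for (int i = 0; i < N; i++)
        for (int j = i + 1; j < N; j++)
            if (dist(s[i], s[j]) > maxd) maxd = dist(s[i], s[j]);
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            if (i != j && dist(s[i], s[j]) <= pct * maxd)
                printf("keep pair (%d,%d), d = %.1f\n", i, j, dist(s[i], s[j]));
    return 0;
}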
GEMNET II was applied to each orebody individually. The required
development and application time varied between orebodies as did the produced mean
absolute errors on the test set. The following table shows statistical information on the
four orebodies as well as the estimation performance results from the three estimators.
Table 8.5: Statistics from copper/gold deposit 2 and estimation performance results.

Orebody                  TQ1         TQ1A        TQ2         TQ3
Coefficient of Variance  1.0612      1.0612      1.0615      1.0611
Actual Avg. Grade        0.9109 g/t  0.9272 g/t  0.7339 g/t  1.1354 g/t
ID2 Avg. Grade           0.8571 g/t  0.8610 g/t  0.6843 g/t  1.0719 g/t
Kriging Avg. Grade       0.8577 g/t  0.8683 g/t  0.6794 g/t  1.0587 g/t
GEMNET II Avg. Grade     0.8374 g/t  0.8273 g/t  0.6271 g/t  1.0245 g/t
ID2 ABS %                22.40 %     20.68 %     31.69 %     19.85 %
Kriging ABS %            18.61 %     16.92 %     25.30 %     17.83 %
GEMNET II ABS %          15.64 %     16.73 %     25.51 %     14.92 %
Grade estimation by GEMNET II lasted over an hour for the first orebody, and similar
times for the other three. The time spent on the geostatistical study is harder to
quantify, as the author spent days completing the variography and performing
kriging and inverse distance. The geostatistical study was carried out once for the
entire deposit. The results of estimation from the three methods are shown in the
following figures (Fig. 8.11 to 8.18).
[Chart: Gold Grades Data Fit (TQ1) – estimated vs. actual grade scatter; series: ID2, Kriging, GEMNet II]
Figure 8.11: Scatter diagram of actual vs. estimated gold grades from zone TQ1 of
copper/gold deposit 2.
[Chart: Gold Grade Distribution (TQ1) – frequency per grade bin; series: Actual, ID2, Kriging, GEMNet II]
Figure 8.12: Gold grade distributions from zone TQ1 of copper/gold deposit 2.
All three methods perform well. Inverse distance produces a very smooth
distribution of grades, while kriging and GEMNET II follow the peaks a bit
better. GEMNET II also tends to underestimate high-grade samples, something that
has been quite consistent throughout the various studies.
[Chart: Gold Grades Data Fit (TQ1A) – estimated vs. actual grade scatter; series: ID2, Kriging, GEMNet II]
Figure 8.13: Scatter diagram of actual vs. estimated gold grades from zone TQ1A of
copper/gold deposit 2.
[Chart: Gold Grade Distribution (TQ1A) – frequency per grade bin; series: Actual, ID2, Kriging, GEMNet II]
Figure 8.14: Gold grade distributions from zone TQ1A of copper/gold deposit 2.
[Chart: Gold Grades Data Fit (TQ2) – estimated vs. actual grade scatter; series: ID2, Kriging, GEMNet II]
Figure 8.15: Scatter diagram of actual vs. estimated gold grades from zone TQ2 of
copper/gold deposit 2.
[Chart: Gold Grade Distributions (TQ2) – frequency per grade bin; series: Actual, ID2, Kriging, GEMNet II]
Figure 8.16: Gold grade distributions from zone TQ2 of copper/gold deposit 2.
In the TQ2 zone, GEMNET II presents severe underestimation of high-grade samples
and overestimation of average-grade samples. Inverse distance fails to follow the
actual distribution, while kriging seems to perform better overall. This zone is
quite different from the other three in that it has very few samples and a low average
grade.
[Chart: Gold Grade Estimates Fit (TQ3) – estimated vs. actual grade scatter; series: ID2, Kriging, GEMNet II]
Figure 8.17: Scatter diagram of actual vs. estimated gold grades from zone TQ3 of
copper/gold deposit 2.
[Chart: Gold Grade Distribution (TQ3) – frequency per grade bin; series: Actual, ID2, Kriging, GEMNet II]
Figure 8.18: Gold grade distributions from zone TQ3 of copper/gold deposit 2.
It is quite notable from the diagrams presented that all four zones have a very
difficult distribution of gold grades. The distribution graphs are all split in two areas,
with a very low population around 1.2 g/t. This could be due to the geological
modelling, as the samples are selected within the modelled orebodies. If these extend
across two or more actual geological zones, then each of the four datasets could
include samples from two or more different populations. As an effect of this, the
expected good performance from all three approaches is not realised and the produced
absolute errors are quite high.
The estimation process with GEMNET II was validated using the same tools as
in the previous case study. The following figures show slices through the block model
coloured according to the reliability indicator, module index, and estimated grades.
Screenshots from the RBF centre location tool are also shown. It should be noted that
the block model shown includes all four zones, which can be identified by the sub-blocking.
Figure 8.19: Plan section (top) and cross section (bottom) of block model coloured by
reliability indicator values for the gold grade estimation of copper/gold deposit 2.
Figure 8.20: Plan section (top) and cross section (bottom) of block model coloured by
module index for gold and copper grade estimation of copper/gold deposit 2.
Figure 8.21: RBF centers locations and training patterns from module 1 network north (top)
and module 2 network (bottom) in copper/gold deposit 2.
Figure 8.22: Plan section (top) and cross section (bottom) of block model coloured by gold
grade estimates for the copper/gold deposit 2.
The results from grade estimation are shown in Fig. 8.22 as sections through
the estimated block model. Once again, the real grade values for the blocks are
unknown and it is therefore not possible to compare the estimated values with the
actual ones. The time requirements for the three approaches were significantly
different in this case study. The complete geostatistical study, including all four
zones, lasted over a week. During this time, the author was driving the software and
examining the results. On the other hand, GEMNET II required a total of about eight
hours to develop the networks and complete the grade estimation. Quite clearly, the
advantage of GEMNET II in time requirements is significant and, more importantly,
the results from GEMNET II did not depend on the author's knowledge of the given
dataset.
This case study has also shown the importance of geological modelling in the
process of grade estimation. If the samples selected as input information for the
estimation process are not part of the same geological domain, none of the techniques
will be able to perform well. GEMNET II is not meant to replace the very important
stage of geological modelling.
8.4 Case Study 3 – Copper/Gold Deposit 3

This case study is very similar to the previous one in that the deposit consists of
several (five) orebodies. These orebodies come in the form of almost parallel veins.
The models of the orebodies have been constructed in VULCAN as part of a
geological study. The five orebodies and the associated drillholes are shown in Fig.
8.23.
Figure 8.23: Plan and side views of copper/gold deposit 3 orebodies. Drillholes and extents
of block model are also shown.
The pattern generation process for the development of neural networks had to be
adjusted for the elongated shape of the orebodies and the drilling scheme.
Specifically, the patterns for module two networks in the east-west direction had to be
limited to those within 10% of the maximum distance between samples. This was
necessary, as the total number of possible patterns was too high for the
hardware-software combination (more than 100,000 patterns). The following table gives
information about the pattern generation process for the five zones.
Table 8.6: Samples and block model file information and training pattern generation results
for copper/gold deposit 3.

Zone         TQ1    TQ1A   TQ3    TQ4    TQ7
Samples      1,912  829    1,144  534    330
Blocks       4,280  2,425  4,291  344    2,254
Patterns WE  6,018  1,013  100    94     744
Patterns NS  9,244  1,573  239    335    348
Patterns UL  4,945  1,865  4,908  1,288  585
All three methods were tested on the copper grade data from the five zones. It was not
possible to test their performance on gold grades due to problems with the specific
drillhole database. In the graphs below, the data fit and estimated distributions are
shown as before. The output of the validation tools for GEMNET II is given at the end
of the case study, for the entire block model. The results from the five zones are given
in the following table. Again, the geostatistical study for the entire deposit took at
least a week to complete, while GEMNET II required about 12 hours to complete the
estimation of copper grades.
Table 8.7: Statistics from copper/gold deposit 3 and estimation performance results.

Orebody                  TQ1      TQ1A     TQ3      TQ4      TQ7
Actual Avg. Cu Grade     1.0187   1.0309   1.1006   0.7798   0.6468
ID2 Avg. Cu Grade        0.9963   1.0327   1.0451   0.7580   0.6083
Kriging Avg. Cu Grade    1.0096   0.9623   1.0540   0.7487   0.5744
GEMNET II Avg. Cu Grade  1.0196   0.9498   1.0072   0.7654   0.6594
ID2 ABS % (Cu)           17.96 %  16.12 %  17.70 %  23.64 %  16.23 %
Kriging ABS % (Cu)       14.73 %  15.00 %  14.77 %  14.74 %  14.61 %
GEMNET II ABS % (Cu)     16.32 %  17.16 %  14.67 %  12.80 %  12.39 %
From the above table it is clear that the three techniques perform better than in
the previous case study, even though there are certain similarities between the two,
especially in the geological modelling and drilling scheme. The improvement in
performance can be associated with a much better geological model, which means
a better separation of the sample groups between the five zones.

The following graphs and slices through the deposit's block model are
grouped by zone, starting with zone TQ1. As before, the data fit and distribution
graphs are given first, followed by the block model slices showing the reliability
indicator, module index, and estimated grade values.
[Chart: Copper Grades Data Fit – TQ1 – estimated vs. actual grade scatter; series: ID2, Kriging, GEMNet II]
Figure 8.24: Scatter diagram of actual vs. estimated copper grades from zone TQ1 of
copper/gold deposit 3.
[Chart: Copper Grade Distribution – TQ1 – frequency per grade bin; series: Actual, ID2, Kriging, GEMNet II]
Figure 8.25: Copper grade distributions from zone TQ1 of copper/gold deposit 3.
[Chart: Copper Grades Data Fit – TQ1A – estimated vs. actual grade scatter; series: ID2, Kriging, GEMNet II]
Figure 8.26: Scatter diagram of actual vs. estimated copper grades from zone TQ1A of copper/gold deposit 3.
[Chart: Copper Grade Distribution (TQ1A) – frequency per grade bin; series: Actual, ID2, Kriging, GEMNet II]
Figure 8.27: Copper grade distributions from zone TQ1A of copper/gold deposit 3.
[Chart: Copper Grades Data Fit (TQ3) – estimated vs. actual grade scatter; series: ID2, Kriging, GEMNet II]
Figure 8.28: Scatter diagram of actual vs. estimated copper grades from zone TQ3 of
copper/gold deposit 3.
[Chart: Copper Grade Distribution (TQ3) – frequency per grade bin; series: Actual, ID2, Kriging, GEMNet II]
Figure 8.29: Copper grade distributions from zone TQ3 of copper/gold deposit 3.
[Chart: Copper Grades Data Fit (TQ4) – estimated vs. actual grade scatter; series: ID2, Kriging, GEMNet II]
Figure 8.30: Scatter diagram of actual vs. estimated copper grades from zone TQ4 of
copper/gold deposit 3.
[Chart: Copper Grade Distribution (TQ4) – frequency per grade bin; series: Actual, ID2, Kriging, GEMNet II]
Figure 8.31: Copper grade distributions from zone TQ4 of copper/gold deposit 3.
[Chart: Copper Grades Data Fit (TQ7) – estimated vs. actual grade scatter; series: ID2, Kriging, GEMNet II]
Figure 8.32: Scatter diagram of actual vs. estimated copper grades from zone TQ7 of
copper/gold deposit 3.
[Chart: Copper Grade Distribution (TQ7) – frequency per grade bin; series: Actual, ID2, Kriging, GEMNet II]
Figure 8.33: Copper grade distributions from zone TQ7 of copper/gold deposit 3.
From the above graphs it is clear that GEMNET II tends to underestimate high-
grade samples in the three high-grade zones (TQ1, TQ1A, TQ3), while it shows some
overestimation of the low-grade samples in the low-grade zone TQ7. Its performance
is consistent through the rest of the distribution. Generally, it appears to be less
affected by high-grade samples than the other two techniques, which can prove very
useful. GEMNET II is meant to be a robust technique that can accept data of unknown
quality and still provide sensible results. Its performance is verified by the absolute
errors, which were always close to, if not better than, those of kriging.
The underestimation of high-grade samples can also be explained by the
geometry of the zones. This geometry controls the number of samples selected
as neighbours for the networks of module two in GEMNET II. As the zones are fairly
narrow and long, some of the sectors are consistently empty and the network of
module one provides the estimate for those. This network, as explained before,
tends to give estimates close to the overall average grade, hence the underestimation
of high-grade samples.
Inverse distance weighting performed exceptionally well in this case study
compared to both kriging and GEMNET II, considering how simple the method really
is. However, it benefited from a complete geostatistical study that improved the
search method for the sample selection. The performance of kriging was once again
very good, as in the previous studies.
The following figures show slices through the block model of the copper/gold
deposit 3 coloured by the reliability indicator, module index, and estimated copper
grade values.
Figure 8.34: Plan section (top) and cross section (bottom) of block model coloured by
reliability indicator values for the copper grade estimates of copper/gold deposit 3.
It should be noted that the block model was modified to reflect the geological
environment of the deposit as modelled by the geologist (not the author!). For this
reason, there are blocks that have been deleted from the model, as shown in the figures.
The block model consists of major blocks and sub-blocks inside them that follow the
zones and other geological entities more closely. As the estimation only takes place
inside the zones, the blocks outside them retain the default values, which is why the
majority of the blocks in the slices have the same colour.
Figure 8.35: Plan section (top) and cross section (bottom) of block model coloured by
module index values for the copper grade estimates of copper/gold deposit 3.
As explained before, the zones are very narrow and the system chooses the module
one network (red blocks) in most cases. In the cases where module two networks are
also used, the reliability indicator shows disagreement between the individual
estimates. This is caused mainly by the module one network, which still contributes to
the final estimate by filling empty sectors.
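The sketch below illustrates one such disagreement measure: the relative spread of the individual module estimates for a block. The actual reliability indicator is defined in Chapter 7; this form is only an illustration of the idea.

#include <stdio.h>
#include <math.h>

/* Sketch of a disagreement measure between the individual module
   estimates for one block: the standard deviation of the estimates
   relative to their mean. The actual reliability indicator is
   defined in Chapter 7; this relative-spread form is only an
   illustration of the idea. */
double disagreement(const double *est, int n)
{
    double mean = 0.0, var = 0.0;
    for (int i = 0; i < n; i++) mean += est[i];
    mean /= n;
    for (int i = 0; i < n; i++) var += (est[i]-mean)*(est[i]-mean);
    return sqrt(var / n) / mean;   /* large value = low reliability */
}

int main(void)
{
    double est[] = { 0.95, 1.02, 0.61 };   /* hypothetical module outputs */
    printf("disagreement = %.3f\n", disagreement(est, 3));
    return 0;
}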
Figure 8.36: Plan section (top) and cross section (bottom) of block model coloured by copper
grade estimates for the copper/gold deposit 3.
8.5 Case Study 4 – Copper/Gold Deposit 4

The final case study for GEMNET II tested its limits in terms of speed and
computational overhead. The dataset used is relatively large, at least for a case study
(over 30,000 samples!). It is very different from the previous three, not only in the
number of samples but also in the sampling density and the complexity of the
orebody. As shown in Fig. 8.37, it is a massive copper/gold deposit that has
undergone an extensive exploration programme.
Figure 8.37: Orebody, drillholes, and block model extents from copper/gold deposit 4.
The dataset used in this case study consists of only the underground drillholes, as
these intersect the orebody. There are over 300 underground drillholes from
existing underground workings that were used to delineate the orebody and prove the
reserves.
As expected, this was the longest case study in terms of the time required by
GEMNET II to complete the estimation process. More specifically, GEMNET II
required a total of 14 hours for the entire process. It is quite interesting, though, to
give the breakdown of this time over the various processes involved. The generation
of training patterns for the neural networks took most of this time (10 hours!).
Processing of the block model required around one and a half hours, and the actual
estimation process only 30 minutes. The time requirements in this case study are very
similar to those from other neural network applications that involve large amounts of
data.
The author did not perform the geostatistical study; it was performed by a
geologist at Maptek, who has certainly done a better job than the author could have
done himself. Unfortunately, there is no information on the time spent on the
geostatistical study, but the author believes it would be at least a matter of days.

The following table summarises the results from the application of all three
techniques to the data of this case study. Once again, it should be noted that inverse
distance weighting benefited from the geostatistical study, which improved
significantly the results obtained with this technique.
Table 8.8: Summary of estimation results from copper/gold deposit 4.

                            Average Grade  ABS Error %
Actual                      4.1316         –
Inverse Distance Weighting  3.9264         19.78 %
Kriging                     3.9014         14.46 %
GEMNet II                   3.8907         15.04 %
The performance of the three estimators becomes clearer by examining the
data fit and distribution graphs given in the following figures. All three techniques
performed well.
[Chart: Gold Grades Data Fit – estimated vs. actual grade scatter; series: ID2, Kriging, GEMNet II]
Figure 8.38: Scatter diagram of actual vs. estimated gold grades from copper/gold deposit 4.
[Chart: Gold Grade Distribution – frequency per grade bin; series: Actual, ID2, Kriging, GEMNet II]
Figure 8.39: Gold grade distributions from copper/gold deposit 4.
Quite clearly, GEMNET II performs well at low and average grades, with some
underestimation of high-grade samples, which however is less pronounced than in the
previous cases.
Unlike the previous case studies, slices through the block model are given in
3D view, illustrating the capabilities of the graphical environment (Envisage) and the
benefits of the integration of GEMNET II. As usual, the block model slices are
coloured according to the reliability indicator, module index, and estimated gold
grade values.
Figure 8.40: 3D view of sections through the block model coloured by the reliability
indicator values from copper/gold deposit 4. Orebody model is also shown.
Figure 8.41: 3D view of block model sections coloured by module index values from
copper/gold deposit 4.
Figure 8.42: 3D view of block model sections coloured by estimated gold grades from
copper/gold deposit 4.
8.6 Conclusions

The case studies in this chapter have demonstrated the use of GEMNET II as an
integrated grade estimation system. The case studies were presented in order of
increasing complexity and were chosen to illustrate the usability as well as the
performance of GEMNET II. They were also chosen to provide the basis for
comparison with the two most popular advanced estimation techniques, inverse
distance weighting and kriging.
The four case studies were completed without major problems, especially in
the application of GEMNET II. The author did not use any additional information or
knowledge for its application other than the grades provided by the drillhole samples.
The same cannot be said for the other two methods, which required a geostatistical
study that normally took days to complete. On the other hand, the author did make use
of the validation tools developed for GEMNET II to examine the system's operation
and draw conclusions as to potential problems. These validation tools were used to
fine-tune the modular neural network architecture that comprises the core of GEMNET II.
The benefits of integration with VULCAN, the resource-modelling package,
were also demonstrated. The advanced graphical functions provided by its graphical
environment, Envisage, allowed the visualisation of the results from GEMNET II and
the development and use of specialised validation tools.
In all four case studies, GEMNET II performed very well, even in comparison
with the already established techniques. The results obtained have shown that it
is a reliable and fast grade estimation system. GEMNET II has shown its potential as a
valid alternative that can handle large amounts of data quickly and without being
unduly influenced by extreme values. However, these case studies have also clearly
demonstrated that GEMNET II, like any other advanced grade estimation system,
still depends on the results of geological modelling.
Finally, it should be noted once again that the case studies presented have been
limited by the fact that most of the data available to the author were confidential.
GEMNET II has been applied with success to a number of other deposits, including
potash, zinc, and iron ore deposits. Unfortunately, the author did not have the right to
publish the results from these studies.
9. Conclusions and Further Research
9.1 Conclusions

Grade estimation is the most computationally intensive stage of a mineral deposit
evaluation. It is also one of the most critical, as the results obtained at this stage
will determine to a great extent the profitability of a mining project. In other words,
decisions that involve large amounts of financial resources depend on the
results of the grade estimation process.
Grade estimation is mainly a process of interpolation from exploration data.
There is a high cost associated with the exploration of mineral deposits and, for this
reason, the amount of available data is usually small in comparison to the area that
has to be estimated.
Depending on the complexity of the given deposit and the required accuracy
of the estimates, different techniques are currently used with the most advanced being
the techniques provided by the geostatistical methodology. These techniques have
been developed to reflect the geological picture of the deposit in space. When
effectively applied, these techniques can provide very accurate results.
However, the geostatistical methodology, being very complex, requires
knowledge and expertise to be effectively applied. This knowledge and expertise is
often not present and in some cases the people who apply geostatistics have
insufficient experience in the field. As a result, the grade estimates produced are not
accurate and the mining industry very often doubts the reliability of the method. It is
generally accepted by geostatisticians that given the same data, different people will
almost certainly produce different estimates using geostatistics.
Geostatistics is also based on assumptions about the distribution of grades, which
in many cases are acceptable. There are, however, deposits where the required
assumptions cannot be made. Unfortunately, there are cases where people who apply
geostatistics do not consider this fact. In some cases it is also very difficult to
determine whether the required assumptions are valid for the given deposit. The above
problems have led scientists to search for alternative methods.
In recent years the application of Artificial Intelligence (AI) tools in the
mining industry has become more common, especially in system control applications
and decision-making. One of the most important AI tools, Artificial Neural Networks
(ANNs), has been applied with success to problems that involve large amounts of data
of unknown quality.
ANNs are computing structures based on simplified models of biological
neural systems such as the human brain. They develop solutions to problems by
'learning' a required response from examples. One of the problems for which ANNs
are very successful in providing solutions is function approximation. Grade estimation
can be considered a problem of approximating an unknown function from examples
provided by exploration data.
There are different ways of forming examples for training ANNs from
exploration data. Examples usually come in the form of input-output patterns, with the
output being the modelled variable. In the case of grade estimation, the inputs can be
the sample co-ordinates in space, or other measurements, and the output is normally
the grade. The choice of input parameters dictates the vector space in which the grade
will be approximated. This choice is essential for the estimation process using ANNs.
The choice of a type of ANN for grade estimation is limited to those following
the supervised paradigm explained above. There are two main candidate types of
ANNs for function approximation problems: the Multi-Layered Perceptron
(MLP) and the Radial Basis Function network (RBF). The RBF network seems to be
the better choice for grade estimation, as it constructs local approximations as opposed
to the global approximations of the MLP. It is generally accepted that grade is a localised
variable and therefore RBF networks are ideal for its estimation.
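A minimal forward pass of a Gaussian RBF network of this kind is sketched below in C; it mirrors the structure of the SNNS-generated code listed in Appendix A4, with example sizes and values.

#include <stdio.h>
#include <math.h>

/* Minimal forward pass of a Gaussian RBF network of the kind used
   in GEMNET II (compare the SNNS-generated code in Appendix A4):
   each hidden unit responds to the squared distance between the
   input and its centre, and the output unit applies a logistic
   function to the weighted sum. Sizes and values are examples. */
#define NIN 3
#define NHID 2

int main(void)
{
    double centre[NHID][NIN] = {{0.1,0.6,0.0},{0.8,0.2,0.5}};
    double bias[NHID] = { 0.9, 0.7 };          /* widths (as in SNNS) */
    double w[NHID] = { 1.5, -0.8 };            /* hidden-to-output weights */
    double out_bias = 0.4;
    double in[NIN] = { 0.2, 0.5, 0.1 };        /* e.g. grade, distance, length */

    double sum = 0.0;
    for (int h = 0; h < NHID; h++) {
        double d2 = 0.0;
        for (int i = 0; i < NIN; i++) {
            double diff = in[i] - centre[h][i];
            d2 += diff * diff;
        }
        sum += w[h] * exp(-d2 * bias[h]);      /* Gaussian activation */
    }
    printf("estimate = %f\n", 1.0 / (1.0 + exp(-sum - out_bias)));
    return 0;
}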
The objectives of the research presented in this thesis were to take the
development of neural network based estimation techniques a step further than other
researchers have done in the past. More specifically, the developed ANN based
system for grade estimation should be able to handle 3D exploration data from real
deposits and perform estimation on a 3D block model basis. The estimation process
itself should honour the distribution of grades in 3D space and take into account the
spatial variability of grades in different directions in space.
The developed system, GEMNET II, is integrated with one of the leading
packages for earth resources modelling, VULCAN. The potential benefits of
integration were exploited to the maximum extent. GEMNET II takes advantage of
VULCAN’s graphical environment and capabilities to present its estimation results in
3D. A number of validation tools measuring the reliability of the produced estimates
as well as showing useful information on the estimation process itself have been
developed using these graphical capabilities. As a result, GEMNET II is not just an
ANN based interpolator but a complete system for grade estimation that can be
integrated in the larger mine planning and design process.
The reliability and estimation performance of GEMNET II has been verified by
a number of case studies, some of them presented in this thesis. From the results
obtained in most of these studies, and in comparison to results obtained using
geostatistics, it becomes clear that GEMNET II is a valid alternative that can turn the
great potential of ANNs in the field of grade estimation into a complete system.
GEMNET II is user-independent, i.e. its results do not depend on user input or
modifications of the estimation technique. This does not mean, however, that the user
has no control over the estimation process, or that GEMNET II can override the
geological modelling that should precede any grade estimation. GEMNET II still
depends on the given data – in fact, its performance depends solely on them. The user
can therefore improve its performance by controlling the data used to build the
examples for training the networks in GEMNET II. The validation tools provided by
the system can aid the user in this task by indicating areas where GEMNET II is
facing difficulties in giving accurate results.
The work presented in this thesis shows that ANNs can be used to develop
solutions for grade estimation problems and that ANNs as approximators do not lack
a mathematical background, a misconception held by many people in the mining
industry. ANNs have a very rich theoretical background that spans many
different scientific fields. Their application to a problem like grade estimation is not a
'black box' approach, i.e. their results and overall operation can be validated and
justified. GEMNET II is a good example of how ANNs can be successfully used to
develop a grade estimation solution.
9.2 Further Research

Artificial Neural Networks are a rapidly evolving field, which means that there is an
almost constant development of new architectures and learning algorithms. Therefore,
there will always be new ANNs to try on the problem of grade estimation. Regarding
GEMNET II, there have been many improvements to the standard Radial Basis
Function network since the beginning of the work presented in this thesis. The most
important ones concern the intelligent control of the number of RBFs required for a
given problem, as well as the design and adaptation of their shape. There is no reason
why the hyper-spherical shape of the RBFs should be ideal for all problems. Many
researchers have tried other basis functions, such as rectangular basis functions.
As the architecture of GEMNET II is modular, i.e. it consists of several
neural networks, there will always be room for improvement. The number of
networks, the 3D search method, and, most importantly, the way the individual
estimates are combined into a single estimate can all be areas of further work. The author
suggests that a more flexible search method that adapts to the given sampling
geometry could improve the estimation performance as well as the speed of training
pattern generation. A new search method would, of course, result in a change in the
number of networks: a varying number of sectors leads to a varying number of
networks trained on sector data.
The way the individual estimates are combined into a single estimate for
each point can also be further optimised. In GEMNET II, an RBF network is
responsible for averaging the individual estimates, but this is not necessarily the only
way of achieving this. New networks can be tested, and perhaps another solution can
be found that is not based on ANNs.
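As one example of a non-ANN alternative, the following sketch combines the module estimates with a weighted average, here weighted by the number of samples each module saw. This is only a suggestion in the spirit of the paragraph above, not part of the current GEMNET II.

#include <stdio.h>

/* Sketch of one non-ANN alternative for combining the individual
   module estimates: a weighted average, here weighted by the number
   of samples each module saw. A suggestion only, not part of the
   current GEMNET II. */
double combine(const double *est, const int *nsamples, int nmod)
{
    double wsum = 0.0, esum = 0.0;
    for (int m = 0; m < nmod; m++) {
        wsum += nsamples[m];
        esum += nsamples[m] * est[m];
    }
    return esum / wsum;
}

int main(void)
{
    double est[] = { 0.92, 1.10, 0.85 };
    int    ns[]  = { 12, 4, 20 };
    printf("combined estimate = %.3f\n", combine(est, ns, 3));
    return 0;
}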
The effect of using various ANN modules on the block model estimates needs
to be examined. The use of different ANNs for different blocks can be a source of
inconsistencies in the estimates produced and can possibly introduce a bias.
The author believes that using GEMNET II, and especially the validation tools
provided, can help in investigating ways of improving the system. The integration with
VULCAN can be taken even further. Direct access to the block model, and possibly
allowing the use of grid models as the basis of grade estimation, would significantly
increase the speed of the system.
The use of a neural network simulator like SNNS helped in the development
stages of GEMNET II. Once the architecture is finalised, there is no reason why the
system should still depend on a simulator for the development of the neural networks.
Including the network code in the core of the system would significantly increase
the speed of training and application of the networks. However, this should not be
done at the expense of the flexibility to change critical parameters of the learning
algorithm or the RBFs.
The validation tools can be further developed to include options for more
accurate measurement of the reliability of the estimates, such as confidence intervals. An
indication of when the networks are extrapolating would also be useful, to flag
areas where the sampling is insufficient.
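A very simple form of such an indication is sketched below: a block centroid with no sample within a chosen radius is flagged as extrapolated. The radius is a user choice and not a GEMNET II parameter.

#include <stdio.h>
#include <math.h>

/* Sketch of a simple extrapolation flag of the kind suggested
   above: a block centroid further from its nearest sample than a
   chosen radius is marked as an extrapolated (poorly sampled)
   estimate. The radius is a user choice, not a GEMNET II value. */
int is_extrapolating(const double p[3], const double s[][3], int n, double radius)
{
    for (int i = 0; i < n; i++) {
        double dx = p[0]-s[i][0], dy = p[1]-s[i][1], dz = p[2]-s[i][2];
        if (sqrt(dx*dx + dy*dy + dz*dz) <= radius)
            return 0;   /* at least one sample nearby: interpolating */
    }
    return 1;           /* no sample within the radius: extrapolating */
}

int main(void)
{
    double samples[][3] = {{0,0,0},{100,0,0}};
    double block[3] = { 300, 300, 50 };
    printf("extrapolating: %d\n", is_extrapolating(block, samples, 2, 150.0));
    return 0;
}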
The performance of GEMNET II in terms of the block model estimates needs
to be investigated. As it is very difficult to obtain the actual block grades from
real deposits, other cases should be examined, such as simulated deposits, in order to
study the behaviour of the system while estimating volumes larger than those of
drillhole samples. The effect of the sample support input to the system needs further
investigation.
Finally, it should be noted that a system such as GEMNET II, based on artificial
neural networks, will require time to gain the acceptance of the mining industry. One
should not forget how difficult it was, and how much time it took, for geostatistics
to become established and widely used three decades ago. Allowing as many people as
possible to experience the use of GEMNET II and draw their own conclusions is the
only way to establish it as a valid alternative method for grade estimation and
probably the best way towards further improvements.
Appendix A – File Structures
A1. SNNS Network Description File

SNNS network definition file V1.4-3D
generated at Tue Sep 28 11:56:29 1999

network name : east
source files :
no. of units : 44
no. of connections : 160
no. of unit types : 0
no. of site types : 0

learning function : RadialBasisLearning
update function   : Topological_Order

unit default section :

act      | bias     | st | subnet | layer | act func         | out func
---------|----------|----|--------|-------|------------------|-------------
 0.00000 |  0.00000 | h  |      0 |     1 | Act_RBF_Gaussian | Out_Identity
---------|----------|----|--------|-------|------------------|-------------

unit definition section :

no. | typeName | unitName | act      | bias     | st | position | act func     | out func | sites
----|----------|----------|----------|----------|----|----------|--------------|----------|-------
  1 |          | Grade    |  0.02936 |  0.00000 | i  |  2, 2,72 | Act_Identity |          |
  2 |          | Distance |  0.44031 |  0.00000 | i  |  3, 2,72 | Act_Identity |          |
  3 |          | Length   |  0.00646 |  0.00000 | i  |  4, 2,72 | Act_Identity |          |
  4 |          | c1       |  0.97190 |  0.93847 | h  |  1, 7,68 |||
  5 |          | c2       |  0.87191 |  0.85472 | h  |  2, 7,68 |||
  6 |          | c3       |  0.96511 |  0.70236 | h  |  3, 7,68 |||
  7 |          | c4       |  0.85429 |  0.89278 | h  |  4, 7,68 |||
  8 |          | c5       |  0.87341 |  0.91482 | h  |  5, 7,68 |||
  9 |          | c6       |  0.83433 |  0.94826 | h  |  1, 7,69 |||
 10 |          | c7       |  0.85495 |  0.89164 | h  |  2, 7,69 |||
 11 |          | c8       |  0.85175 |  0.90161 | h  |  3, 7,69 |||
 12 |          | c9       |  0.85231 |  0.88927 | h  |  4, 7,69 |||
 13 |          | c10      |  0.82321 |  0.95233 | h  |  5, 7,69 |||
 14 |          | c11      |  0.84729 |  0.88713 | h  |  1, 7,70 |||
 15 |          | c12      |  0.85888 |  0.90323 | h  |  2, 7,70 |||
 16 |          | c13      |  0.83049 |  0.85461 | h  |  3, 7,70 |||
 17 |          | c14      |  0.86441 |  0.89943 | h  |  4, 7,70 |||
 18 |          | c15      |  0.85100 |  0.88699 | h  |  5, 7,70 |||
 19 |          | c16      |  0.84793 |  0.92424 | h  |  1, 7,71 |||
 20 |          | c17      |  0.84574 |  0.86460 | h  |  2, 7,71 |||
 21 |          | c18      |  0.85321 |  0.89859 | h  |  3, 7,71 |||
 22 |          | c19      |  0.81511 |  0.96778 | h  |  4, 7,71 |||
 23 |          | c20      |  0.85326 |  0.90119 | h  |  5, 7,71 |||
 24 |          | c21      |  0.86209 |  0.88879 | h  |  1, 7,72 |||
 25 |          | c22      |  0.86017 |  0.88878 | h  |  2, 7,72 |||
 26 |          | c23      |  0.85986 |  0.89859 | h  |  3, 7,72 |||
 27 |          | c24      |  0.86807 |  0.93349 | h  |  4, 7,72 |||
 28 |          | c25      |  0.86555 |  0.89128 | h  |  5, 7,72 |||
 29 |          | c26      |  0.87574 |  0.92786 | h  |  1, 7,73 |||
 30 |          | c27      |  0.86566 |  0.90208 | h  |  2, 7,73 |||
 31 |          | c28      |  0.87349 |  0.87303 | h  |  3, 7,73 |||
 32 |          | c29      |  0.95634 |  0.91290 | h  |  4, 7,73 |||
 33 |          | c30      |  0.99132 |  0.96239 | h  |  5, 7,73 |||
 34 |          | c31      |  0.54082 |  0.81074 | h  |  1, 7,74 |||
 35 |          | c32      |  0.85517 |  0.90117 | h  |  2, 7,74 |||
 36 |          | c33      |  0.82977 |  0.97302 | h  |  3, 7,74 |||
 37 |          | c34      |  0.86362 |  0.92169 | h  |  4, 7,74 |||
 38 |          | c35      |  0.87332 |  0.93760 | h  |  5, 7,74 |||
 39 |          | c36      |  0.84698 |  0.92475 | h  |  1, 7,75 |||
 40 |          | c37      |  0.89793 |  0.92879 | h  |  2, 7,75 |||
 41 |          | c38      |  0.92686 |  0.90759 | h  |  3, 7,75 |||
 42 |          | c39      |  0.84971 |  0.90679 | h  |  4, 7,75 |||
 43 |          | c40      |  1.00000 |  0.67025 | h  |  5, 7,75 |||
 44 |          | Target   |  0.10965 |  0.42555 | o  |  3,12,72 | Act_Logistic |          |
----|----------|----------|----------|----------|----|----------|--------------|----------|-------

connection definition section :

target | site | source:weight
-------|------|----------------------------------------------------------------
  4    |      | 1: 0.10449, 2: 0.59754, 3: 0.00881
  5    |      | 1: 0.07599, 2: 0.04265, 3: 0.01435
  6    |      | 1: 0.06131, 2: 0.21773, 3: 0.00430
  7    |      | 1: 0.03022, 2: 0.02039, 3: 0.01435
  8    |      | 1: 0.00691, 2: 0.05634, 3: 0.00287
  9    |      | 1: 0.15199, 2: 0.02084, 3: 0.01076
 10    |      | 1: 0.04750, 2: 0.02152, 3: 0.00001
 11    |      | 1: 0.06909, 2: 0.02042, 3: 0.01578
 12    |      | 1: 0.03022, 2: 0.01647, 3: 0.01435
 13    |      | 1: 0.18998, 2: 0.01791, 3: 0.01435
 14    |      | 1: 0.17349, 2: 0.03287, 3: 0.01004
 15    |      | 1: 0.03800, 2: 0.03009, 3: 0.01435
 16    |      | 1: 0.23143, 2: 0.02020, 3: 0.00861
 17    |      | 1: 0.17349, 2: 0.06453, 3: 0.01004
 18    |      | 1: 0.11744, 2: 0.02308, 3: 0.01435
 19    |      | 1: 0.01036, 2: 0.01827, 3: 0.00646
 20    |      | 1: 0.15026, 2: 0.01704, 3: 0.00574
 21    |      | 1: 0.05354, 2: 0.02071, 3: 0.01004
 22    |      | 1: 0.20812, 2: 0.01691, 3: 0.00359
 23    |      | 1: 0.04836, 2: 0.02118, 3: 0.01435
 24    |      | 1: 0.03627, 2: 0.03181, 3: 0.00003
 25    |      | 1: 0.08981, 2: 0.03318, 3: 0.01435
 26    |      | 1: 0.18048, 2: 0.05929, 3: 0.01004
 27    |      | 1: 0.13126, 2: 0.06458, 3: 0.00859
 28    |      | 1: 0.02159, 2: 0.04027, 3: 0.05021
 29    |      | 1: 0.02159, 2: 0.06224, 3: 0.00574
 30    |      | 1: 0.05354, 2: 0.04115, 3: 0.00716
 31    |      | 1: 0.15026, 2: 0.06573, 3: 0.00574
 32    |      | 1: 0.11744, 2: 0.23762, 3: 0.01435
 33    |      | 1: 0.09499, 2: 0.37245, 3: 0.01865
 34    |      | 1: 0.90000, 2: 0.44864, 3: 0.01548
 35    |      | 1: 0.12867, 2: 0.03576, 3: 0.01578
 36    |      | 1: 0.15285, 2: 0.02016, 3: 0.00359
 37    |      | 1: 0.15976, 2: 0.06343, 3: 0.01291
 38    |      | 1: 0.11054, 2: 0.06900, 3: 0.00717
 39    |      | 1: 0.22884, 2: 0.06650, 3: 0.01433
 40    |      | 1: 0.18998, 2: 0.14022, 3: 0.01435
 41    |      | 1: 0.06304, 2: 0.15300, 3: 0.00430
 42    |      | 1: 0.10190, 2: 0.02282, 3: 0.00001
 43    |      | 1: 0.02936, 2: 0.44031, 3: 0.00646
 44    |      | 4: 1.97859, 5:49.71570, 6:59.90763, 7:-37.35740, 8:-16.30947,
                9:37.28542, 10:-39.35506, 11:12.70683, 12:-30.57939, 13:22.76179,
                14:-13.37884, 15:-13.06231, 16:-15.15001, 17: 5.64048, 18:-39.55381,
                19:50.59846, 20:-33.14001, 21:-9.88406, 22:22.56762, 23:14.93855,
                24:39.54713, 25:32.85326, 26:-6.63768, 27:-38.57726, 28:12.00233,
                29:-22.77133, 30:-3.05694, 31:42.40548, 32:-5.19528, 33:27.19900,
                34:-2.25054, 35:-47.43574, 36:48.85722, 37:-50.62162, 38:-30.12116,
                39:16.00671, 40:-21.60845, 41:-2.33463, 42:17.54285, 43:-39.69540
-------|------|----------------------------------------------------------------
A2. SNNS Network Pattern File

SNNS pattern definition file V3.2
generated at Tue Jun 16 11:15:00 1998

No. of patterns : 8618
No. of input units : 3
No. of output units : 1

0.001727 0.021182 0.002740 0.041451
0.063040 0.021057 0.010760 0.041451
0.158895 0.020829 0.014347 0.041451
0.140760 0.020625 0.009326 0.041451
0.293610 0.019989 0.012195 0.041451
0.014680 0.019852 0.008608 0.041451
0.162349 0.019708 0.014347 0.041451
0.071675 0.019544 0.014347 0.041451
0.004318 0.019400 0.014347 0.041451
0.008636 0.019272 0.014347 0.041451
0.075993 0.019161 0.014347 0.041451
0.001727 0.021308 0.002740 0.054404
0.063040 0.021180 0.010760 0.054404
0.158895 0.020946 0.014347 0.054404
0.140760 0.020736 0.009326 0.054404
0.293610 0.020079 0.012195 0.054404
0.014680 0.019935 0.008608 0.054404
0.162349 0.019785 0.014347 0.054404
0.071675 0.019613 0.014347 0.054404
A3. BATCHMAN Network Development Script

# GEMNet II
# East Module Training Procedure
# Optimized 29/7/1999
# Ioannis Kapageridis 1999

print("GEMNet II - Neural Network Development")
print("Module 1 - East Network")

loadNet ("east\eastut.net")
loadPattern("east\east.pat")
loadPattern("east\eastx.pat")
setPattern("east\eastx.pat")
print ("Number of patterns :",PAT)

trainNet()

setInitFunc("Randomize_Weights")
initNet()
setInitFunc("RBF_Weights_Kohonen",1000.0,0.4,0.0)
initNet()
setInitFunc("RBF_Weights",-0.8,0.8,0.2,0.9,0.0)
initNet()
saveNet ("east\eastini.net")
print("SSE = ",SSE)

# hidden unit bias training
setLearnFunc("RadialBasisLearning",0.0,0.001,0.0,0.01,0.6)
while CYCLES < 500 do
  trainNet()
endwhile
print("SSE = ",SSE)

# RBF centres training
setLearnFunc("RadialBasisLearning",0.001,0.0,0.0,0.01,0.6)
while CYCLES < 1000 do
  trainNet()
endwhile
print("SSE = ",SSE)

# hidden-output layer weights training
setLearnFunc("RadialBasisLearning",0.0,0.0,0.001,0.01,0.6)
while CYCLES < 1250 do
  trainNet()
endwhile
print ("SSE = ",SSE, " MSE = ", MSE)

loadPattern("east\eastx.pat")
setPattern("east\eastx.pat")
saveResult("east\east.res",1,PAT,FALSE,TRUE,"create")
saveNet("east\easttr.net")
A4. SNNS2C Network C Code Extract

/*********************************************************
  d:\gemnns\east\east.c
  --------------------------------------------------------
  generated at Tue Sep 28 12:33:23 1999
  by snns2c ( Bernward Kett 1995 )
*********************************************************/

#include <math.h>

#define Act_Logistic(sum, bias)      ( (sum+bias<10000.0) ? ( 1.0/(1.0 + exp(-sum-bias) ) ) : 0.0 )
#define Act_Identity(sum, bias)      ( sum )
#define Act_RBF_Gaussian(sum2, bias) (exp(-sum2 * bias) )
#define NULL (void *)0

typedef struct UT {
         float  act;          /* Activation */
         float  Bias;         /* Bias of the Unit */
         int    NoOfSources;  /* Number of predecessor units */
  struct UT   **sources;      /* predecessor units */
         float *weights;      /* weights from predecessor units */
} UnitType, *pUnit;

/* Forward Declaration for all unit types */
static UnitType Units[45];

/* Sources definition section */
static pUnit Sources[] = {
  Units + 1, Units + 2, Units + 3, Units + 1, Units + 2, Units + 3,
  Units + 1, Units + 2, Units + 3, Units + 1, Units + 2, Units + 3,
  Units + 1, Units + 2, Units + 3, Units + 1, Units + 2, Units + 3,
  Units + 1, Units + 2, Units + 3, Units + 1, Units + 2, Units + 3,
  Units + 1, Units + 2, Units + 3, Units + 1, Units + 2, Units + 3,
  Units + 1, Units + 2, Units + 3, Units + 1, Units + 2, Units + 3,
  Units + 1, Units + 2, Units + 3, Units + 1, Units + 2, Units + 3,
  Units + 1, Units + 2, Units + 3,
  Units + 4, Units + 5, Units + 6, Units + 7, Units + 8, Units + 9,
  Units + 10, Units + 11, Units + 12, Units + 13, Units + 14, Units + 15,
  Units + 16, Units + 17, Units + 18, Units + 19, Units + 20, Units + 21,
  Units + 22, Units + 23, Units + 24, Units + 25, Units + 26, Units + 27,
  Units + 28, Units + 29, Units + 30, Units + 31, Units + 32, Units + 33,
  Units + 34, Units + 35, Units + 36, Units + 37, Units + 38, Units + 39,
  Units + 40, Units + 41, Units + 42, Units + 43,
};

/* Weigths definition section */
static float Weights[] = {
  0.104490, 0.597540, 0.008810,
  0.075990, 0.042650, 0.014350,
  0.061310, 0.217730, 0.004300,
  0.030220, 0.020390, 0.014350,
  0.006910, 0.056340, 0.002870,
  0.151990, 0.020840, 0.010760,
  0.047500, 0.021520, 0.000010,
  0.069090, 0.020420, 0.015780,
  0.030220, 0.016470, 0.014350,
  0.189980, 0.017910, 0.014350,
  0.173490, 0.032870, 0.010040,
  0.038000, 0.030090, 0.014350,
  0.231430, 0.020200, 0.008610,
  0.173490, 0.064530, 0.010040,
  1.978590, 49.715698, 59.907631, -37.357399, -16.309469, 37.285419,
  -39.355061, 12.706830, -30.579390, 22.761789, -13.378840, -13.062310,
  -15.150010, 5.640480, -39.553810, 50.598461, -33.140011, -9.884060,
  22.567619, 14.938550, 39.547131, 32.853260, -6.637680, -38.577259,
  12.002330, -22.771330, -3.056940, 42.405479, -5.195280, 27.198999,
  -2.250540, -47.435741, 48.857220, -50.621620, -30.121161, 16.006710,
  -21.608450, -2.334630, 17.542850, -39.695400,
};

/* unit definition section (see also UnitType) */
static UnitType Units[45] = {
  { 0.0, 0.0, 0, NULL , NULL },
  { /* unit 1 (Grade) */
    0.0, 0.000000, 0, &Sources[0] , &Weights[0] , },
  { /* unit 2 (Distance) */
    0.0, 0.000000, 0, &Sources[0] , &Weights[0] , },
  { /* unit 3 (Length) */
    0.0, 0.000000, 0, &Sources[0] , &Weights[0] , },
  { /* unit 4 (c1) */
    0.0, 0.938470, 3, &Sources[0] , &Weights[0] , },
  { /* unit 5 (c2) */
    0.0, 0.854720, 3, &Sources[3] , &Weights[3] , },
  { /* unit 6 (c3) */
    0.0, 0.702360, 3, &Sources[6] , &Weights[6] , },
  { /* unit 7 (c4) */
    0.0, 0.892780, 3, &Sources[9] , &Weights[9] , },
  { /* unit 8 (c5) */
    0.0, 0.914820, 3, &Sources[12] , &Weights[12] , },
  { /* unit 9 (c6) */
    0.0, 0.948260, 3, &Sources[15] , &Weights[15] , },
  { /* unit 10 (c7) */
    0.0, 0.891640, 3, &Sources[18] , &Weights[18] , },
  { /* unit 11 (c8) */
    0.0, 0.901610, 3, &Sources[21] , &Weights[21] , },
  { /* unit 12 (c9) */
    0.0, 0.889270, 3, &Sources[24] , &Weights[24] , },
  { /* unit 13 (c10) */
    0.0, 0.952330, 3, &Sources[27] , &Weights[27] , },
  { /* unit 14 (c11) */
    0.0, 0.887130, 3, &Sources[30] , &Weights[30] , },
  { /* unit 15 (c12) */
    0.0, 0.903230, 3, &Sources[33] , &Weights[33] , },
  { /* unit 16 (c13) */
    0.0, 0.854610, 3, &Sources[36] , &Weights[36] , },
  { /* unit 17 (c14) */
    0.0, 0.899430, 3, &Sources[39] , &Weights[39] , },
  { /* unit 18 (c15) */
    0.0, 0.886990, 3, &Sources[42] , &Weights[42] , },
  { /* unit 19 (c16) */
    0.0, 0.924240, 3, &Sources[45] , &Weights[45] , },
  { /* unit 20 (c17) */
    0.0, 0.864600, 3, &Sources[48] , &Weights[48] , },

int east(float *in, float *out, int init)
{
  int member, source;
  float sum;
  enum{OK, Error, Not_Valid};
  pUnit unit;

  /* layer definition section (names & member units) */

  static pUnit Input[3] = {Units + 1, Units + 2, Units + 3}; /* members */

  static pUnit Hidden1[40] = {Units + 4, Units + 5, Units + 6, Units + 7,
    Units + 8, Units + 9, Units + 10, Units + 11, Units + 12, Units + 13,
    Units + 14, Units + 15, Units + 16, Units + 17, Units + 18, Units + 19,
    Units + 20, Units + 21, Units + 22, Units + 23, Units + 24, Units + 25,
    Units + 26, Units + 27, Units + 28, Units + 29, Units + 30, Units + 31,
    Units + 32, Units + 33, Units + 34, Units + 35, Units + 36, Units + 37,
    Units + 38, Units + 39, Units + 40, Units + 41, Units + 42, Units + 43}; /* members */

  static pUnit Output1[1] = {Units + 44}; /* members */

  static int Output[1] = {44};

  for(member = 0; member < 3; member++) {
    Input[member]->act = in[member];
  }

  for (member = 0; member < 40; member++) {
    unit = Hidden1[member];
    sum = 0.0;
    for (source = 0; source < unit->NoOfSources; source++) {
      static float diff;
      diff = unit->sources[source]->act - unit->weights[source];
      sum += diff * diff;
    }
    unit->act = Act_RBF_Gaussian(sum, unit->Bias);
  };

  for (member = 0; member < 1; member++) {
    unit = Output1[member];
    sum = 0.0;
    for (source = 0; source < unit->NoOfSources; source++) {
      sum += unit->sources[source]->act * unit->weights[source];
    }
    unit->act = Act_Logistic(sum, unit->Bias);
  };

  for(member = 0; member < 1; member++) {
    out[member] = Units[Output[member]].act;
  }

  return(OK);
}
A5. VULCAN Composites File

*
* DEFINITION
* HEADER_VARIABLES 5
* COMPID C 16 0 key
* CTYPE  C 12 0
* DATE   C 12 0
* TIME   C 12 0
* DESCRP C 80 0
*
* VARIABLES 17
* DHID   C 12 0
* MIDX   F 12 3
* MIDY   F 12 3
* MIDZ   F 12 3
* TOPX   F 12 3
* TOPY   F 12 3
* TOPZ   F 12 3
* BOTX   F 12 3
* BOTY   F 12 3
* BOTZ   F 12 3
* LENGTH F 12 3
* FROM   F 12 3
* TO     F 12 3
* GEOCOD C 12 0
* BOUND  C 12 0
* AU     F 12 3
* ORE    F  2 0
*
* HEADER:GOLD STRAIGHT 23-Oct-98 16:40:47 Compositing Run

DDFD/A7 1910.318 2088.532 1013.920 1910.649 2088.666 1014.277 1909.987 2088.398 1013.563 1.010 144.720 145.730 NONE 0 1.550 0
DDFD/A7 1909.656 2088.264 1013.206 1909.987 2088.398 1013.563 1909.325 2088.131 1012.848 1.010 145.730 146.740 NONE 0 3.930 0
DDFD/A7 1908.994 2087.997 1012.491 1909.325 2088.131 1012.848 1908.662 2087.863 1012.134 1.010 146.740 147.750 NONE 0 1.370 0
DDFD/A7 1908.331 2087.729 1011.777 1908.662 2087.863 1012.134 1908.000 2087.595 1011.420 1.010 147.750 148.760 NONE 0 2.990 0
DDFD/A7 1907.669 2087.462 1011.063 1908.000 2087.595 1011.420 1907.338 2087.328 1010.706 1.010 148.760 149.770 NONE 0 1.650 0
DDFD/A7 1907.007 2087.194 1010.349 1907.338 2087.328 1010.706 1906.676 2087.060 1009.992 1.010 149.770 150.780 NONE 0 1.070 0
DDFD/A7 1906.345 2086.927 1009.635 1906.676 2087.060 1009.992 1906.014 2086.793 1009.278 1.010 150.780 151.790 NONE 0 1.620 0
DDFD/A7 1905.683 2086.659 1008.920 1906.014 2086.793 1009.278 1905.352 2086.525 1008.563 1.010 151.790 152.800 NONE 0 2.690 0
DDFD/A7 1905.020 2086.392 1008.206 1905.352 2086.525 1008.563 1904.689 2086.258 1007.849 1.010 152.800 153.810 NONE 0 4.230 0
DDFD/A7 1904.358 2086.124 1007.492 1904.689 2086.258 1007.849 1904.027 2085.990 1007.135 1.010 153.810 154.820 NONE 0 1.550 0
DDFD/A7 1903.696 2085.856 1006.778 1904.027 2085.990 1007.135 1903.365 2085.723 1006.421 1.010 154.820 155.830 NONE 0 9.120 0
Appendix B – Case Study Data
B1. Case Study 1 – 2D Iron Ore Deposit

Easting Northing % Fe
0 170 34.3
10 40 35.5
15 135 28.6
55 145 29.4
125 20 41.5
175 50 36.8
120 180 33.4
160 175 36
240 185 30.2
260 115 33.2
235 15 33.7
365 60 34.3
285 110 35.3
345 115 31
335 170 27.4
325 195 33.9
350 235 37.6
290 230 39.9
10 390 27.2
85 380 34.2
50 270 30.2
200 280 30.4
400 355 39.9
360 335 40
335 310 40.6
5 195 33.9
20 105 32.5
25 155 29.6
50 40 30.6
155 15 40.4
145 125 30.1
130 185 35.3
175 185 41.4
220 90 28.5
205 0 40.1
265 65 24.4
390 65 31.6
325 105 39.5
310 150 34.8
385 165 29.9
325 220 37.8
375 215 29.8
200 230 37.4
55 375 27.4
395 245 36.5
165 355 40.8
270 285 32.9
365 340 40
330 320 44.1
330 290 41.4
0 0 45.3
100 0 30.7
200 0 40
300 0 33.3
400 0 33.5
50 50 30.4
150 50 36.7
250 50 27.6
350 50 34.7
0 100 37.9
100 100 40.5
200 100 31.8
300 100 39.8
400 100 35.4
50 150 32.4
150 150 34.7
250 150 34.4
350 150 28.9
0 200 34.1
100 200 31.5
200 200 39.1
300 200 35.5
400 200 34.9
50 250 33.7
150 250 35.4
250 250 36.3
350 250 34.5
0 300 34.9
100 300 27.4
200 300 27.5
300 300 39
400 300 32.4
50 350 26.2
150 350 40
250 350 29.1
350 350 39.3
0 400 36.6
100 400 34.6
200 400 38.9
300 400 37.9
400 400 35.4
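Before samples such as these are presented to a neural network, each variable is normally scaled to a common range. The sketch below shows a plain min-max normalisation of the easting column; it is a generic illustration of this preprocessing step, not code from GEMNET II.

/* Generic min-max scaling of a data column to [0, 1]; illustrative
 * preprocessing sketch only, not code from GEMNET II. */
#include <stdio.h>

static void minmax_scale(const float *x, float *y, int n)
{
  float lo = x[0], hi = x[0];
  int i;

  for (i = 1; i < n; i++) {
    if (x[i] < lo) lo = x[i];
    if (x[i] > hi) hi = x[i];
  }
  for (i = 0; i < n; i++)
    y[i] = (hi > lo) ? (x[i] - lo) / (hi - lo) : 0.0f;
}

int main(void)
{
  /* first few easting values from the table above */
  float east[5] = {0.0f, 10.0f, 15.0f, 55.0f, 125.0f};
  float scaled[5];
  int i;

  minmax_scale(east, scaled, 5);
  for (i = 0; i < 5; i++)
    printf("%6.1f -> %.3f\n", east[i], scaled[i]);
  return 0;
}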
B2. Case Study 2 – 2D Copper Deposit

Easting Northing Cu
182.88 579.12 0.175
243.84 548.64 0.417
335.28 548.64 0.489
67.06 487.68 0.215
152.4 487.68 0.396
213.36 487.68 0.685
274.32 487.68 0.377
335.28 487.68 0.427
457.2 487.68 0.14
91.44 426.72 0.392
152.4 426.72 0.32
210.31 426.72 0.717
274.32 426.72 0.806
335.28 426.72 0.889
396.24 426.72 0.475
152.4 365.76 0.23
243.84 365.76 0.833
274.32 365.76 0.453
335.28 365.76 0.719
396.24 365.76 1.009
457.2 365.76 0.893
518.16 365.76 0.089
579.12 365.76 0.092
121.92 335.28 0.102
335.28 304.8 0.915
396.24 304.8 1.335
457.2 304.8 0.519
518.16 304.8 0.072
579.12 304.8 0.042
84.68 304.8 1.365
220.68 304.8 0.023
152.4 274.32 0.644
274.32 243.84 0.258
335.28 243.84 0.638
396.24 243.84 1.615
457.2 243.84 0.765
518.16 243.84 0.465
579.12 243.84 0.034
115.82 219.46 0.476
182.88 213.36 0.409
274.32 182.88 0.165
335.28 182.88 0.063
396.24 182.88 0.406
457.2 182.88 0.909
518.16 182.88 0.012
152.4 152.4 0.228
274.32 121.92 0.224
335.28 121.92 0.188
396.24 121.92 0.027
457.2 121.92 0.395
335.28 64.01 0.225
B3. Case Study 3 – 3D Gold Deposit

Easting Northing Elevation Length Au
78303.29 4776.742 120.257 0.435 0.028
78017.81 4631.307 93.487 0.682 0.045
78303.38 4776.22 118.688 0.02 0.089
78303.09 4777.861 123.62 0.02 0.104
77902.53 4564.935 74.358 0.199 0.123
78263.42 4744.765 95.17 0.343 0.141
77902.5 4558.01 79.926 0.589 0.155
78303.34 4776.443 119.357 0.03 0.159
77902.73 4564.614 72.496 0.189 0.163
78018.33 4630.683 90.6 0.199 0.172
78018.14 4630.912 91.658 0.02 0.18
78018.53 4630.444 89.493 0.305 0.181
78299.67 4784.724 92.872 0.257 0.199
78303.2 4777.247 121.773 0.03 0.201
78299.8 4784.294 91.44 0.218 0.203
77902.63 4564.766 73.378 0.06 0.207
78265.54 4740.429 95.17 0.857 0.211
77903.05 4564.098 69.507 0.38 0.225
77903.25 4563.777 67.645 0.444 0.227
78264.56 4742.429 95.17 0.644 0.234
77902.54 4557.89 78.833 0.208 0.237
78299.74 4784.509 92.156 0.179 0.238
77902.84 4564.445 71.516 0.199 0.251
77902.94 4564.276 70.536 0.343 0.254
77902.58 4557.775 77.79 0.462 0.256
78299.52 4785.227 94.594 0.305 0.266
78014.15 4644.552 62.082 0.1 0.274
78014.21 4644.394 61.507 0.267 0.277
78264.09 4743.395 95.17 0.218 0.278
78263.03 4745.574 95.17 0.238 0.281
78299.59 4785.007 93.828 0.159 0.285
78220.04 4727.461 95.17 0.961 0.287
78299.46 4785.448 95.36 0.371 0.29
78014.27 4644.235 60.931 2.179 0.305
78102.56 4676.925 153.74 0.946 0.305
78263.72 4744.159 95.17 0.02 0.313
77903.98 4562.514 60.492 0.904 0.321
77906.55 4550.72 95.17 0.497 0.325
78299.38 4785.71 96.27 0.333 0.326
78014.19 4644.46 61.747 0.02 0.327
77903.37 4563.583 66.518 0.333 0.336
78017.97 4631.109 92.573 0.01 0.337
77903.7 4563.003 63.236 0.54 0.349
77902.64 4557.615 76.25 0.343 0.361
77905.67 4552.12 95.17 0.54 0.364
78024.59 4634.531 95.17 0.514 0.364
78174.45 4716.776 88.219 0.286 0.367
77902.52 4557.944 79.33 0.11 0.368
78219.09 4730.069 95.17 0.389 0.368
77903.5 4563.378 65.343 0.286 0.371
77903.15 4563.938 68.576 0.08 0.381
78174.1 4717.228 89.604 0.333 0.381
77903.8 4562.828 62.256 0.343 0.386
78175 4716.042 86.053 0.228 0.398
78264.93 4741.665 95.17 0.199 0.411
78100.83 4680.052 153.74 1.55 0.421
78219.46 4729.059 95.17 0.286 0.438
78220.51 4726.169 95.17 0.659 0.441
78014.1 4644.671 62.514 0.159 0.457
77907.08 4549.872 95.17 0.218 0.475
78014.03 4644.869 63.233 1.513 0.497
78024.98 4633.71 95.17 0.371 0.499
77906.02 4551.568 95.17 0.352 0.506
78265.25 4741.013 95.17 0.179 0.547
78220.76 4725.465 95.17 0.589 0.571
77903.89 4562.671 61.374 0.13 0.596
78013.93 4645.132 64.193 1.549 0.612
78365.7 4791.595 153.84 0.847 0.618
78014.6 4643.391 57.862 0.352 0.623
78014.52 4643.589 58.581 2.154 0.646
78103.53 4675.176 153.74 1.56 0.646
77904.08 4562.323 59.414 0.565 0.656
78014.65 4643.259 57.382 1.527 0.657
78219.82 4728.049 95.17 0.296 0.673
78102 4677.931 153.74 1.513 0.695
78366.26 4789.672 153.84 0.942 0.706
78102.83 4676.444 153.74 0.882 0.714
78220.25 4726.874 95.17 0.169 0.733
78102.24 4677.494 153.74 0.324 0.74
78014.24 4644.328 61.267 0.169 0.741
78172.44 4712.58 95.17 0.745 0.746
78013.66 4645.782 66.644 1.559 0.78
78014.74 4643.008 56.471 1.493 0.782
78101.7 4678.478 153.74 1.558 0.787
78172.1 4713.573 95.17 1.559 0.805
78014.57 4643.457 58.102 0.169 0.81
78365.98 4790.633 153.84 0.802 0.812
78014.38 4643.945 59.876 2.239 0.815
78102.44 4677.144 153.74 0.07 0.822
78365.44 4792.508 153.84 0.38 0.837
78315.56 4779.559 153.84 1.872 0.852
78061.19 4660.662 129.789 0.218 0.853
78172.68 4711.895 95.17 0.333 0.854
78366.52 4788.759 153.84 0.621 0.887
78101.42 4678.981 153.74 0.471 0.912
78013.82 4645.391 65.153 0.13 0.919
78013.32 4646.565 69.625 0.862 0.926
78061.5 4659.817 129.789 0.589 0.937
78315.08 4780.977 153.84 0.637 0.947
78101.14 4679.484 153.74 0.159 0.964
78013.21 4646.83 70.634 0.597 0.966
78061.83 4658.924 129.789 1.547 0.967
78171.71 4714.707 95.17 0.841 0.969
78012.63 4627.031 129.789 1.557 0.981
78171.87 4714.235 95.17 0.904 1.042
78103.19 4675.788 153.74 0.573 1.072
78013.5 4646.161 68.086 1.518 1.076
78100.45 4680.73 153.74 0.435 1.136
78316.04 4778.188 153.84 0.751 1.136
78013.73 4645.618 66.019 0.149 1.14
78013.42 4646.35 68.807 0.913 1.178
78315.32 4780.268 153.84 0.724 1.196
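Intervals of unequal length, such as those listed above, are normally combined by weighting each grade by its sample length. The sketch below shows that calculation; it is a generic illustration of length-weighted compositing, not code from GEMNET II or VULCAN.

/* Length-weighted mean grade of a set of sample intervals;
 * illustrative sketch of standard compositing arithmetic. */
#include <stdio.h>

static double weighted_mean_grade(const double *len, const double *au, int n)
{
  double sum_la = 0.0, sum_l = 0.0;
  int i;

  for (i = 0; i < n; i++) {
    sum_la += len[i] * au[i];
    sum_l  += len[i];
  }
  return (sum_l > 0.0) ? sum_la / sum_l : 0.0;
}

int main(void)
{
  /* first three intervals from the table above (length, Au) */
  double len[3] = {0.435, 0.682, 0.02};
  double au[3]  = {0.028, 0.045, 0.089};

  printf("length-weighted Au: %.3f\n", weighted_mean_grade(len, au, 3));
  return 0;
}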
B4. Case Study 4 – 3D Chrome Deposit

Easting Northing Elevation Chromite
13384.18 22298.82 663.02 7.7
13197.41 22053.74 702.095 23.8
13311.75 22093.95 715.03 20.31
13311.75 22093.95 705.68 18.17
13311.75 22093.95 706.88 28.86
13311.75 22093.95 712.23 9.27
13311.75 22093.95 708.98 26.03
13382.35 22123.39 691.42 15.05
13297.75 22223.5 695.37 15.63
13352 22301.75 683.46 1.01
13352 22301.75 681.86 13.74
13352 22301.75 713.61 7.2
13352 22301.75 686.16 10.96
13352 22301.75 690.66 8.04
13352 22301.75 692.06 16.8
13323.25 22291.75 680.34 7.22
13328.31 22289.96 706.817 11.2
13333.9 22305.12 702.84 26.5
13333.9 22302.75 700.472 26.9
13333.9 22306.71 704.431 12.2
13323.25 22291.75 690.29 15.01
13383.08 22294.74 670.338 22.5
13383.71 22297.08 666.137 15.7
13382.4 22292.2 674.884 16.6
13381.8 22289.96 678.911 22.9
13386.71 22315.73 701.553 14.61
13361.12 22392.97 710.023 9.55
13311.75 22093.95 743.53 16.21
13311.75 22093.95 730.68 24.26
13311.75 22093.95 751.58 14.95
13311.75 22093.95 745.08 21.62
13330.52 22103.94 741.882 28.8
13314.85 22125.05 752.94 9
13313.25 22119.07 759.127 33.1
13318.21 22137.58 739.964 37
13315.14 22126.14 751.808 15.8
13337.66 22146.56 735.332 20.4
13318.38 22138.23 739.292 21.6
13316.42 22139.87 755.798 17.54
13316.27 22139.32 756.364 10.81
13337.04 22128.29 716.674 14.19
13316.98 22141.98 753.606 24.26
13338.19 22148.54 733.281 15.7
13337.92 22147.55 734.306 19.2
13314.53 22172.26 750.899 7.56
13314.2 22170.72 752.776 7.81
13316.2 22180.15 741.285 15.58
13316.9 22183.42 737.302 23.69
13341.15 22162.6 733.222 15
13340.76 22161.13 734.743 17
13341.41 22163.56 732.233 15.4
13297.75 22223.5 758.52 9.21
13297.75 22223.5 762.92 14.65
13297.75 22223.5 724.32 7.57
13297.75 22223.5 727.52 6.14
13346.36 22221.91 722.733 11.4
13347.4 22225.79 717.945 6.3
13320 22262.75 727.66 6.3
13330.32 22301.27 732.779 8.19
13320 22262.75 730.26 10.71
13326.77 22288.02 746.497 17.55
13327.95 22292.43 741.936 19.2
13384.06 22296.62 755.765 12.37
13385.94 22303.66 745.361 15.71
13333.9 22353.24 750.959 8.3
13330.49 22318.76 742.864 15.39
13332.58 22326.58 734.768 10.14
13331.03 22320.78 740.778 7.05
13333.74 22314.04 719.556 26.35
13334.17 22315.65 717.895 12.08
13388.94 22314.85 728.815 18.88
13249.5 22380.69 739.71 29.52
13250.75 22410.38 739.822 14.2
13249.5 22374.89 733.912 26.6
13250.75 22411.83 741.272 15.78
13249.5 22377.69 736.705 23.8
13327.27 22366 750.57 10.2
13358.8 22384.3 719.003 7.89
13357.78 22380.51 722.928 8.62
13357.41 22379.11 724.377 7.01
13335.09 22395.16 720.376 15.33
13420.48 22395.98 757.862 30.4
13396.5 22374 740.55 12.52
13396.5 22374 754.35 17.48
13396.5 22374 742.1 17.19
13396.5 22374 748.05 17.02
13396.5 22374 749.35 17.95
13396.5 22374 751.1 18.31
13429.45 22400.55 731.145 8.9
13428.73 22401.28 730.12 7.5
13432.34 23380.89 731.634 16.3
13431.73 23378.61 734.003 14.6
13432.71 23382.26 730.22 11.6
13297.75 22223.5 767.12 15.75
References
1. Amari, S., Learning Patterns and Pattern Sequences by Self-Organising Nets of Threshold
Elements, IEEE Trans. Computers, C-21 (11), 1197-1206, November 1972.
2. Anderson, J.A., Cognitive Capabilities of a Parallel System. In: Bienenstock, E., et al [eds],
Disordered Systems and Biological Organisation, NATO ASI Series, F20, Springer-Verlag,
New York, 1986.
3. Arbib, M.A. (ed), The Handbook of Brain Theory and Neural Networks. MIT Press,
Cambridge, 1995.
4. Ash, T., Dynamic Node Creation in Backpropagation Networks. ICS Report 8901, Institute of
Cognitive Science, University of California, San Diego, California, 1989.
5. Badiozamani, K., Computer Methods. In: Mining Engineering Handbook, SME.
6. Barhen, J, and Reister, D., DeepNet: an Ultrafast Neural Learning Code for Seismic
Imaging. In: International Joint Conference on Neural Networks (IJCNN ’99), International
Neural Network Society and The Neural Networks Council of IEEE, Washington DC, USA,
1999.
7. Barto, A.G., Reinforcement Learning and Adaptive Critic Methods. In: White, D.A., and
Sofge, D.A., (eds), Handbook of Intelligent Control, pp. 469-491, Van Nostrand Reinhold,
New York, 1992.
8. Bischof, H., Schneider, W., and Pinz, A.J., Multispectral Classification of Landsat Images
Using Neural Networks. IEEE Transactions on Geoscience and Remote Sensing, Vol. 30, No.
3, 1992.
9. Bishop, C.M., Neural Networks for Pattern Recognition. Clarendon Press, Oxford, 1995.
10. Bradford, S.H., The Application of Artificial Intelligence to Mineral Processing Control.
Ph.D. Thesis, Department of Mineral Resources Engineering, University of Nottingham, 1994.
11. Broomhead, D.S., and Lowe, D., Multivariable Functional Interpolation and Adaptive
Networks. Complex Systems, Vol. 2, pp 321-355, 1988.
12. Burnett, C.C.H., Application of Neural Networks to Mineral Reserve Estimation. Ph.D.
Thesis, Department of Mineral Resources Engineering, University of Nottingham, 1995.
13. Caiti, A., and Parisini, T., Mapping of Ocean Sediments by Networks of Parallel
Interpolating Units. IEEE Conference on Neural Networks for Ocean Engineering, pp 231-
238, Washington DC, USA, 1991.
14. Chen, S., Nonlinear Time Series Modelling and Prediction Using Gaussian RBF networks
with Enhanced Clustering and RLS Learning. Electronics Letters, Vol. 31, No. 2, pp 117-118,
1995.
15. Chinrungrueng, C., and Séquin, C.H., Optimal Adaptive k-means Algorithm with Dynamic
Adjustment of Learning Rate. IEEE Trans. on Neural Networks, Vol. 6, pp 157-169, 1994.
16. Clarici, E., Owen, D., Durucan, S., and Ravenscroft, P., Recoverable Reserve Estimation
Using a Neural Network. 24th International Symposium on the Application of Computers and
Operations Research in the Minerals Industries (APCOM), Montreal, Quebec, Canada, 1993.
17. Clark, I., Practical Geostatistics. Elsevier, Amsterdam, 1979.
18. Cortez, L.P., Sousa, A.J., and Durao, F.O., Mineral Resources Estimation Using Neural
Networks and Geostatistical Techniques. 27th International Symposium on the Application of
Computers and Operations Research in the Minerals Industries (APCOM), The Institution of
Mining and Metallurgy (IMM), London, 1998.
19. Cybenko, G., Approximation by Superpositions of a Sigmoidal Function. Mathematics of
Control, Signals, and Systems, Vol. 2, pp 303-314, 1989.
20. David, M., Geostatistical Ore Reserve Estimation. Elsevier, Amsterdam, 1977.
21. David, M., Handbook of Applied Advanced Geostatistical Ore Reserve Estimation. Elsevier,
Amsterdam, 1988.
22. Duchon, J., Spline Minimising Rotation-Invariant Semi-norms in Sobolev Spaces. In:
Schempp W., and Zeller, K., (eds), Constructive Theory of Functions of Several Variables,
Lecture Notes in Mathematics, pp 85-100, 1977.
23. Duda, R.O., and Hart, P.E., Pattern Classification and Scene Analysis. Wiley, New York,
1973.
24. Fahlman, S.E., Fast Learning Variations on Backpropagation: An Empirical Study. In:
Touretzky, D.S., Hinton, G., and Sejnowski, T., (eds), Proceedings of 1988 Connectionist
Models Summer School, Morgan Kaufmann Publishers, San Mateo, California, 1988.
25. Flament, F., Thibault, J., and Hodouin, D., Neural Network Based Control of Mineral
Grinding Plants. Minerals Engineering, Vol. 6, No. 3, pp 235-249, 1993.
26. Garcia, G., and Whitman, W.W., Inversion of a Lateral Log Using Neural Networks. SPE
24454, Society of Petroleum Engineers, 1992.
27. Geva, S., and Sitte, J., A Constructive Method for Multivariate Function Approximation by
Multilayer Perceptrons. IEEE Transactions on Neural Networks, Vol. 3, No. 4, 1992.
28. Golub, G.H., and Van Loan, C.F., Matrix Computations, 3rd Edition. Johns Hopkins
University Press, Baltimore, 1996.
29. Gopal, S., and Woodcock, C., Remote Sensing of Forest Change Using Artificial Neural
Networks. IEEE Transactions on Geoscience and Remote Sensing, Vol. 34, No. 2, 1996.
30. Grossberg, S., Studies of Mind and Brain: Neural Principles of Learning, Perception,
Development, Cognition, and Motor Control. Reidel Press, Boston, 1982.
31. Hassoun, M.H., Fundamentals of Artificial Neural Networks. MIT Press, Cambridge, 1995.
32. Haykin, S., Neural Networks – A Comprehensive Foundation. Prentice Hall, New Jersey,
1999.
33. Hebb, D., The Organisation of Behaviour. John Wiley, New York, 1949.
34. Hopfield, J.J., Neural Networks and Physical Systems with Emergent Collective
Computational Abilities. Proc. National Acad. Sci.,79, pp 2554-2558, 1982.
35. Hopfield, J.J., Neurons with Graded Response Have Collective Computational Properties
Like those of Two-State Neurons. Proc. National Acad. Sci., USA, Vol. 81, pp 3088-3092, 1984.
36. Hughes, W.E., Davis, F.B., and Darey, R.K., Drillhole Interpolation: Mineralised
Interpolation Techniques. In: Crawford, J.T., and Hustrulid, W. A., (eds), Open-Pit Mine
Planning and Design, AIME, New York, pp 51-64, 1979.
37. Isaaks, E.H., and Srivastava, R.M., Applied Geostatistics. Oxford University Press, New
York, 1989.
38. Journel, A.G., and Huijbregts, Ch.J., Mining Geostatistics. Academic Press, London, 1978.
39. Kapageridis I., Denby B., and Hunter G., Integration of a Neural Ore Grade Estimation
Tool In a 3D Resource Modeling Package. In: Proceedings of the International Joint
Conference on Neural Networks (IJCNN ’99), International Neural Network Society and The
Neural Networks Council of IEEE, Washington D.C., 1999.
40. Kapageridis I., Denby B., Neural Network Modelling of Ore Grade Spatial Variability. In:
Proceedings of the International Conference for Artificial Neural Networks (ICANN 98), Vol.
1, pp 209 – 214, Springer-Verlag, Skovde, 1998.
41. Kapageridis I., Denby B., Ore Grade Estimation with Modular Neural Network Systems – a
Case Study. In: Panagiotou G (ed) Information technology in the minerals industry (MineIT
’97). AA Balkema, Rotterdam, 1998.
42. Kapageridis, I.K., Assessment of Neural Network Prediction Techniques for Grade
Estimation. MSc Thesis, AIMS Research Unit, Department of Mineral Resources
Engineering, University of Nottingham, 1996.
43. King, R.L., Hicks, M.A., and Signer, S.P., Using Unsupervised Learning for Feature
Detection in a Coal Mine Roof. Engineering Applications of Artificial Intelligence, Vol. 6,
No. 6, pp 565-573, 1993.
44. Kirsch, A., An Introduction to the Mathematical Theory of Inverse Problems. Springer-
Verlag, New York, 1996.
45. Kohonen, T., Correlation Matrix Memories. IEEE Trans. Computers, Vol. C-21, pp 353-359,
1972.
46. Kohonen, T., Self-Organisation and Associative Memory. Springer-Verlag, Berlin, 1984.
47. Kohonen, T., Self-Organising Maps, 2nd Edition. Springer-Verlag, Berlin, 1995.
48. Krasnopolsky, V., Using NNs to Retrieve Multiple Geophysical Parameters from Satellite
Data. In: International Joint Conference on Neural Networks (IJCNN ’99), International
Neural Network Society and The Neural Networks Council of IEEE, Washington DC, USA,
1999.
49. Krige, D.G., Log-normal – de Wijsian Geostatistics for Ore Evaluation. South African
Institute of Mining and Metallurgy, Johannesburg, 1981.
50. Lang, K.J., and Hinton, G.E., The Development of the Time-Delay Neural Network
Architecture for Speech Recognition. Technical Report CMU-CS-88-152, Carnegie-Mellon
University, Pittsburgh PA, 1988.
51. Leonard, J.A., Kramer, M.A., and Ungar, L.H., A Neural Network Architecture that
Computes Its Own Reliability. Computers Chem. Engineering, Vol. 16, No. 9, pp 819-835,
1992.
52. Leonard, J.A., Kramer, M.A., and Ungar, L.H., Using Radial Basis Functions to
Approximate a Function and Its Error Bounds. IEEE Transactions on Neural Networks, Vol.
3, No. 4, pp 624-627, 1992.
53. Looney, C.G., Pattern Recognition Using Neural Networks: Theory and Algorithms for
Engineers and Scientists. Oxford University Press, New York, 1997.
54. Lowe, D., Novel ‘Topographic’ Nonlinear Feature Extraction Using Radial Basis Functions
for Concentration Coding in the ‘Artificial Nose’. In: Third IEE International Conference on
Artificial Neural Networks, Conference Publication 349, pp 95-99, Institute of Electrical
Engineers, 1993.
55. Lowe, D., Radial Basis Function Networks. In: Arbib, M.A.(ed), The handbook of Brain
Theory and Neural Networks, pp 930-934, MIT Press, Cambridge, 1995.
56. Malki, H.A., and Baldwin, J.L., On the Comparison Results of the Neural Networks Trained
Using Well-Logs from one Service Company and Tested on Another Service Company’s Data.
In: Simpson, P.K., (ed), Neural Networks Applications, IEEE Technology Update Series, pp
665-668, IEEE, New York, 1996.
57. Maptek, Envisage Core Reference Manual. Maptek/KRJA Systems Ltd, 1998.
58. Matheron, G., The Theory of Regionalised Variables and Its Applications. Les Cahiers du
Centre de Morphologie Mathematique de Fontainebleau, Ecole des Mines de Paris, 211p,
1971.
59. Maxwell, A.P., Denby, B., and Pitts, W., The Application of Neural Networks to Size
Analysis of Minerals on Conveyors. 25th International Symposium on the Application of
Computers and Operations Research in the Minerals Industries (APCOM), Brisbane,
Australia, 1995.
60. McCulloch, W., and Pitts, W., A Logical Calculus of the Ideas Immanent in Nervous
Activity. Bulletin of Mathematical Biophysics, 1943, Vol. 5, pp. 115-133.
61. Meinguet, J., Multivariate Interpolation at Arbitrary Points Made Simple. Journal of Applied
Mathematics and Physics (ZAMP), 30, pp 292-304, 1979.
62. Micchelli, C.A., Interpolation of Scattered Data: Distance Matrices and Conditionally
Positive Definite Functions. Constructive Approximation, Vol. 2, pp 11-22, 1986.
63. Millar, D.L., and Hudson, J.A., Rock Engineering System Performance Monitoring Using
Neural Networks. Preprints of the ‘Artificial Intelligence in the Minerals Sector’ (one day
symposium held at the University of Nottingham), 1993.
64. Minsky, M., Neural Nets and the Brain-Model Problem. Doctoral Dissertation, Princeton
University, Princeton NJ, 1954.
65. Moody, J., and Darken, C.J., Fast Learning in Networks of Locally-Tuned Processing Units.
Neural Computation, Vol. 1, pp 281-294, 1989.
66. Morozov, V.A., Regularisation Methods for Ill-Posed Problems. CRC Press, Boca Raton, FL,
1993.
67. Murat, M.E., and Rudman, A.J., Automated First Arrival Picking: A Neural Network
Approach. Geophysical Prospecting, Vol. 40, pp 587-604, 1992.
68. Nadaraya, E.A., On Estimating Regression. Theory of Probability and its Applications, Vol.
9, pp 141-142, 1964.
69. Neumann, J. von, Probabilistic Logic and the Synthesis of Reliable Organisms From
Unreliable Components. In: Shannon, C., and McCarthy, J. (eds), Automata Studies, Princeton
University Press, Princeton, 1956, pp. 43-98.
70. Neural Mining Solutions, Neural Computing in Mineral Exploration. White Paper, Neural
Technologies, 1996.
71. Noble, A.C., Ore Reserve/Resource Estimation. In: Mining Engineering Handbook, SME.
72. Oja, M., and Nystom, L., The Use of Self-Organising Maps in Particle Shape Quantification.
In: Hoberg, H., and von Blottnitz, H., (eds), Proceedings of the XX International Mineral
Processing Congress, Vol. 1, pp 141-150, Aachen, Germany, 1997.
73. Park, J., and Sandberg, I.W., Approximation and Radial Basis Function Networks. Neural
Computation, Vol. 5, pp 305-316, 1993.
74. Park, J., and Sandberg, I.W., Universal Approximation Using Radial Basis Function
Networks. Neural Computation, Vol. 3, pp 246-257, 1991.
75. Parzen, E., On Estimation of A Probability Density Function and Mode. Ann. Math. Statist.,
Vol. 33, pp 1065-1076, 1962.
76. Petersen, K.R.P., and Lorenzen, L., Gold Liberation Modelling Using Neural Network
Analysis of Diagnostic Leaching Data. In: Hoberg, H., and von Blottnitz, H., (eds),
Proceedings of the XX International Mineral Processing Congress, Vol. 1, pp 391-400,
Aachen, Germany, 1997.
77. Poggio, T., and Girosi, F., Regularisation Algorithms for Learning that Are Equivalent to
Multilayer Networks. Science, Vol. 247, pp 978-982, 1990.
78. Poulton, M., and Zaverton, K., Comparison of Neural Network Paradigms for Classification
of TM Images. 23rd International Symposium on the Application of Computers and Operations
Research in the Minerals Industries (APCOM), Arizona, USA, 1992.
79. Powell, M.J.D., Approximation Theory and Methods. Cambridge University Press, Cambridge,
1981.
80. Powell, M.J.D., The Theory of Radial Basis Function Approximation in 1990. In: Light, W.,
(ed.), Advances in Numerical Analysis Vol. II: Wavelets, Subdivision Algorithms, and Radial
Basis Functions, pp 105-210, Oxford Science Publications, Oxford, 1992.
81. Readdy, L.A., Bolin, D.S., and Mathieson, G.A., Ore Reserve Calculation – Underground
Mining Methods Handbook.
82. Ripley, B.D., Pattern Recognition and Neural Networks. Cambridge University Press,
Cambridge, 1996.
83. Ripley, B.D., Statistical Ideas for Selecting Network Architectures. In: Kappen, B., and
Gielen, S., (eds), Neural Networks: Artificial Intelligence and Industrial Applications,
Springer, London, 1995.
84. Roesler, K.S., Improved Geo-Sensing Using Artificial Intelligence Techniques for
Tomographic Interpretation. 23rd International Symposium on the Application of Computers
and Operations Research in the Minerals Industries (APCOM), Arizona, USA, 1992.
85. Rogers, S.J., Fang, J.H., Karr, C.L., and Stanley, D.A., Determination of Lithology from
Well Logs Using Neural Networks. Bulletin of the American Association of Petroleum
Geologists, pp 731-739, May 1992.
86. Rojas, R., A Graphical Proof of the Backpropagation Learning Algorithm. In: V. Malyshkin
(ed.), Parallel Computing Technologies (PaCT-93), Obninsk, Russia, 1993.
87. Rojas, R., Neural Networks – A Systematic Introduction. Springer-Verlag, Berlin, 1996.
88. Rosenblatt, F., The Perceptron: A Probabilistic Model for Information Storage and
Organisation in the Brain. Psychol. Rev., 65, 386-408, 1958.
89. Rumelhart, D., and McClelland, J., Parallel Distributed Processing. MIT Press, Cambridge
MA, 1986.
90. Rumelhart, D.E., and Zipser, D., Feature Discovery by Competitive Learning. Cognitive
Science, Vol. 9, pp.75-112, 1985.
91. Ryman-Tubb, N., and Bolt, G., The Use of Neural Techniques for Integrated Process System
Modelling and Optimisation. White Paper, Neural Technologies Limited, 1996.
92. Schalkoff, R.J., Artificial Neural Networks. McGraw-Hill, Computer Science Series, New
York, 1997.
93. Schalkoff, R.J., Digital Image Processing and Computer Vision. John Wiley & Sons, New
York, 1989.
94. Schofield, D., Surface Mine Design Using Intelligent Computing Techniques. Ph.D. Thesis,
Department of Mineral Resources Engineering, University of Nottingham, 1992.
95. Signer, S.P., and King, R.L., Evaluation of Coal Mine Roof Supports Using Artificial
Intelligence. In: 23rd International Symposium on the Application of Computers and
Operations Research in the Minerals Industries (APCOM), Arizona, USA, 1992.
96. Singh, S.P., (ed.), Approximation Theory, Spline Functions and Applications. Kluwer,
Dordrecht, The Netherlands, 1992.
97. SNNS, Stuttgart Neural Network Simulator Version 4.1 User’s Manual. Report No. 6/95,
Institute for Parallel and Distributed High Performance Systems (IPVR), University of
Stuttgart, 1996.
98. Steinbuch, K., Die Lernmatrix. Kybernetik (Biol. Cyber.), 1(1), 36-45, 1961.
99. Stent, G.S., A Physiological Mechanism for Hebb’s Postulate of Learning. Proceedings of the
National Academy of Sciences, USA, Vol. 70, pp. 997-1001, 1973.
100. Stevens, C., Die Nervenzelle. In: Gehirn und Nervensystem, 1988, pp. 2-13.
101. Tikhonov, A.N., and Arsenin, V.Y., Solutions to Ill-Posed Problems. W.H. Winston,
Washington, DC, 1977.
102. Tikhonov, A.N., On Solving Incorrectly Posed Problems and Method of Regularisation.
Doklady Akademii Nauk USSR, Vol. 151, pp 501-504, 1963.
103. Van der Walt, T.J., van Deventer, J.S.J., Barnard, E., and Oosthuizen, G.D., The
Simulation of Ill-Defined Processing Operations Using Connectionist Networks. 23rd
International Symposium on the Application of Computers and Operations Research in the
Minerals Industries (APCOM), Arizona, USA, pp 881-888, 1992.
104. Van Deventer, J.S.J., Bezuidenhout, M., and Moolman, D.W., On-Line Visualisation of
Flotation Performance Using Neural Computer Vision of the Froth Texture. In: Hoberg, H.,
and von Blottnitz, H., (eds), Proceedings of the XX International Mineral Processing
Congress, Vol. 1, pp 315-326, Aachen, Germany, 1997.
105. Waibel, A., Hanazawa, T., Hinton, G., Shikano, K., and Lang, K.J., Phoneme Recognition
Using Time-Delay Neural Networks. IEEE Transactions on Acoustics, Speech and Signal
Processing, Vol. ASSP-37, pp. 328-339, 1989.
106. Walter, K.U., Neural Network Technology for Strata Strength Characterisation. In:
International Joint Conference on Neural Networks (IJCNN ’99), International Neural
Network Society and The Neural Networks Council of IEEE, Washington DC, USA, 1999.
107. Wanstedt, S., Huang, Y., and Malmstrom, L., Using Neural Networks to Interpret
Geophysical Logs in the Zinkgruvan Mine. In: Panagiotou, G., (ed) Information technology in
the minerals industry (MineIT ’97). AA Balkema, Rotterdam, 1998.
108. Watson, G.S., Smooth Regression Analysis. Sankhya: The Indian Journal of Statistics, Series
A, Vol. 26, pp 359-372, 1964.
109. Widrow, B., Generalisation and Information Storage in Networks of ADALINE Neurons. In:
Yovits, G.T., (ed.), Self-Organising Systems, Spartan Books, Washington DC, 1962.
110. Williams, P.M., Image Compression for Neural Networks Using Chebyshev Polynomials. In:
Alexander, I., and Taylor, J., (eds), Artificial Neural Networks, pp 1139-1142, 1992.
111. Wolfram, S., Mathematica 2.1 User’s Manual. Wolfram Research, Cambridge University
Press, 1991.
112. Wu, X., and Zhou, Y., Reserve Estimation Using Neural Network Techniques. Computers &
Geosciences, Vol. 19, No. 4, pp 567-575, 1993.
113. Wu, X., Neural Network-Based Material Modelling. Ph.D. Thesis, Dept. Civil Engineering,
University of Illinois, Urbana, Illinois, 1991.
114. Xiao, R., and Chandrasekar, V., Development of a Neural Network Based Algorithm for
Rainfall Estimation from Radar Observations. IEEE Transactions on Geoscience and Remote
Sensing, Vol. 35, No. 1, 1997.
115. Yama, B.R., and Lineberry, G.T., Artificial Neural Network Application for a Predictive
Task in Mining. SME, Mining Engineering, February 1999, pp 59-64.
116. Yee, P.V., Regularised Radial Basis Function Networks: Theory and Applications to
Probability Estimation, Classification, and Time Series Prediction. Ph.D. Thesis, McMaster
University, Hamilton, Ontario, 1998.
117. Zadeh, L.A., Knowledge Representation in Fuzzy Logic. In: Yager, R.R., and Zadeh, L.A.,
eds, An Introduction to Fuzzy Logic Applications in Intelligent Systems, Kluwer Academic,
Boston, 1992.
118. Zippelius, A., and Engel, A., Statistical Mechanics of Neural Networks. In: Arbib, M.A.(ed),
The handbook of Brain Theory and Neural Networks, pp 930-934, MIT Press, Cambridge,
1995.