University of Nottingham School of Chemical, Environmental, and Mining Engineering
APPLICATION OF ARTIFICIAL NEURAL NETWORK SYSTEMS TO GRADE ESTIMATION
FROM EXPLORATION DATA
by
Ioannis K. Kapageridis M.Sc.
Thesis submitted to the University of Nottingham for the Degree of Doctor of Philosophy
October 1999
Abstract
Artificial Neural Networks (ANNs) have become increasingly popular within the resources
industry. ANN technology offers solutions to problems characterised by a shortage
or poor quality of input data. A purpose of this research work is to show that the
estimation of ore grades within a mineral deposit is one such problem, and one to which
ANNs can be applied successfully.
Ore grade is one of the main variables that characterise an orebody. Almost
every mining project begins with the determination of ore grade distribution in three-
dimensional space, a problem often reduced to modelling the spatial variability of ore
grade values. At the early stages of a mining project, the distribution of ore grades
has to be determined to enable the calculation of ore reserves within the deposit and
to aid the planning of mining operations throughout the entire life of a mine. The
estimation of ore grades/reserves is a very important and costly stage of a mining
project, and the profitability of the project often depends on the results of
grade estimation.
For the last three decades the mining industry has adopted and applied
geostatistics as the main solution to problems of evaluation of mineral deposits.
Geostatistics provides powerful tools for modelling most aspects of an ore
deposit. However, geostatistics and other, more conventional methods require many
assumptions and considerable knowledge, skill and time to be applied effectively,
while their results are not always easy to justify.
The work undertaken in the AIMS Research Unit at the University of Nottingham
aimed at assessing the suitability of ANN systems for the problem of ore grade
estimation and at developing a complete ANN-based system that handles real
exploration data in order to provide ore grade estimates. GEMNET II is a modular
neural network system designed and developed by the author to receive 3D
exploration data from an orebody and perform ore grade estimation on a block
model basis. The aims of the system are to provide a valid alternative to
conventional grade estimation techniques while considerably reducing the time
and knowledge required for development and application.
Affirmation

The following papers have been published based on the research presented in this thesis:

Kapageridis I., Denby B. Ore grade estimation with modular neural network systems – a case study. In: Panagiotou G (ed) Information technology in the minerals industry (MineIT ’97). AA Balkema, Rotterdam, 1998.

Kapageridis I., Denby B. Neural network modelling of ore grade spatial variability. In: Proceedings of the International Conference on Artificial Neural Networks (ICANN 98), Vol. 1, pp 209–214, Springer-Verlag, Skovde, 1998.

Kapageridis I., Denby B., Hunter G. Integration of a neural ore grade estimation tool in a 3D resource modeling package. In: Proceedings of the International Joint Conference on Neural Networks (IJCNN ’99), International Neural Network Society and the Neural Networks Council of IEEE, Washington D.C., 1999.

Kapageridis I., Denby B., Schofield D., Hunter G. GEMNET II – a neural ore grade estimation system. In: 29th International Symposium on the Application of Computers and Operations Research in the Minerals Industries (APCOM ’99), Denver, Colorado, 1999.

Kapageridis I., Denby B., Hunter G. Ore grade estimation and artificial neural networks. Mineral Wealth Journal, Jul.–Sep. 99, No. 112, The Scientific Society of the Mineral Wealth Technologists, Athens.

Kapageridis I., Denby B. Ore grade estimation using artificial neural networks. In: 2nd Regional VULCAN Conference, Maptek/KRJA Systems, Nice, 1999.
Acknowledgements

I would like to thank Professor Bryan Denby for his guidance and help through the
duration of my studies at the University of Nottingham. I would also like to thank
him for introducing me to the exciting world of the AIMS Research Unit.
Thanks should go to everyone at the AIMS Research Unit, people who have
been there and others who still are, and who made it all so much easier. Special
thanks to Dr. Damian Schofield for being such a good friend and teacher, and also for
sharing his music CD collection with me.
A big thank you goes to the State Scholarships Foundation of Greece for
making it all possible. Their investment in me was most appreciated.
Many thanks to everyone at the Nottingham office of Maptek/KRJA Systems
for the help and support over the last year of my studies. In particular, I would like to
thank Dr. Graham Hunter, David Muller, and Les Neilson for their help and advice.
Finally, I would like to thank all my friends and in particular David Newton,
Marina Lisurenko, and Stefanos Gazeas for their support and for some unforgettable
times in Nottingham.
Contents

ABSTRACT i
AFFIRMATION iii
ACKNOWLEDGEMENTS iv
CONTENTS v
LIST OF FIGURES viii
LIST OF TABLES xiii

1. INTRODUCTION 1
1.1 THE PROBLEM OF GRADE ESTIMATION 1
1.2 GRADE DATA FROM EXPLORATION PROGRAMS 3
1.3 EXISTING METHODS FOR GRADE ESTIMATION 7
1.3.1 General 7
1.3.2 Geometrical Methods 7
1.3.3 Inverse Distance Method 10
1.3.4 Geostatistics 12
1.3.5 Conclusions 15
1.4 BLOCK MODELLING & GRID MODELLING IN GRADE ESTIMATION 16
1.5 ARTIFICIAL NEURAL NETWORKS FOR GRADE ESTIMATION 18
1.6 RESEARCH OBJECTIVES 19
1.7 THESIS OVERVIEW 20

2. ARTIFICIAL NEURAL NETWORKS THEORY 23
2.1 INTRODUCTION 23
2.1.1 Biological Background 23
2.1.2 Statistical Background 25
2.1.3 History 27
2.2 BASIC STRUCTURE – PRINCIPLES 29
2.2.1 The Artificial Neuron – the Processing Element 29
2.2.2 The Artificial Neural Network 31
2.3 LEARNING ALGORITHMS 33
2.3.1 Overview 33
2.3.2 Error Correction Learning 33
2.3.3 Memory Based Learning 35
2.3.4 Hebbian Learning 35
2.3.5 Competitive Learning 36
2.3.6 Boltzmann Learning 37
2.3.7 Self-Organized Learning 39
2.3.8 Reinforcement Learning 40
2.4 MAJOR TYPES OF ARTIFICIAL NEURAL NETWORKS 40
2.4.1 Feedforward Networks 40
2.4.2 Recurrent Networks 42
2.4.3 Self-Organizing Networks 43
2.4.4 Radial Basis Function Networks and Time Delay Neural Networks 44
2.4.5 Fuzzy Neural Networks 46
2.5 CONCLUSIONS 48

3. RADIAL BASIS FUNCTION NETWORKS 23
3.1 INTRODUCTION 23
3.2 RADIAL BASIS FUNCTION NETWORKS – THEORETICAL FOUNDATIONS 24
3.2.1 Overview 24
3.2.2 Multivariable Interpolation 24
3.2.3 The Hyper-Surface Reconstruction Problem 26
3.2.4 Regularisation 28
3.3 RADIAL BASIS FUNCTION NETWORKS 31
3.3.1 General 31
3.3.2 RBF Structure 31
3.3.3 RBF Initialisation and Learning 32
3.4 FUNCTION APPROXIMATION WITH RBFNS 39
3.4.1 General 39
3.4.2 Universal Approximation 39
3.4.3 Input Dimensionality 40
3.4.4 Comparison of RBFNs and Multi-Layer Perceptrons 41
3.5 SUITABILITY OF RBFNS FOR GRADE ESTIMATION 42

4. MINING APPLICATIONS OF ARTIFICIAL NEURAL NETWORKS 71
4.1 OVERVIEW 71
4.2 ANN SYSTEMS FOR EXPLORATION AND RESOURCE ESTIMATION 72
4.2.1 General 72
4.2.2 Sample Location Based Systems 73
POPULATIONS 79
4.2.3 Sample Neighborhood Based Systems 80
4.2.4 Conclusions 85
4.3 ANN SYSTEMS FOR OTHER MINING APPLICATIONS 86
4.3.1 Overview 86
4.3.2 Geophysics 86
4.3.3 Rock Engineering 89
4.3.4 Mineral Processing 89
4.3.5 Remote Sensing 91
4.3.6 Process Control-Optimisation and Equipment Selection 93
4.4 CONCLUSIONS 94

5. DEVELOPMENT OF A MODULAR NEURAL NETWORK SYSTEM FOR GRADE ESTIMATION 96
5.1 INTRODUCTION 96
5.2 FORMING THE INPUT SPACE FROM 2D SAMPLES 98
5.3 DEVELOPMENT OF THE NEURAL NETWORK TOPOLOGIES 106
5.3.1 Overview 106
5.3.2 The Hidden Layer 107
5.3.3 Final Weights and Output 110
5.4 LEARNING FROM 2D SAMPLES 111
5.4.1 Overview 111
5.4.2 Module 1 – Learning from Octants 112
5.4.3 Module 2 – Learning from Quadrants 115
5.4.4 Module 3 – Learning from Sample 2D Co-ordinates 117
5.5 TRANSITION FROM 2D TO 3D DATA 120
5.5.1 General 120
5.5.2 Input Space: Adding the Third Co-ordinate 121
5.5.3 Input Space: Adding the Sample Volume 122
5.5.4 Search Method: Expanding to Three Dimensions 123
5.6 COMPLETE PROTOTYPE OF THE MNNS 126
5.7 CONCLUSIONS 129

6. CASE STUDIES OF THE PROTOTYPE MODULAR NEURAL NETWORK SYSTEM 131
6.1 OVERVIEW 131
6.2 CASE STUDY 1 – 2D IRON ORE DEPOSIT 133
6.3 CASE STUDY 2 – 2D COPPER DEPOSIT 136
6.4 CASE STUDY 3 – 3D GOLD DEPOSIT 140
6.5 CASE STUDY 4 – 3D CHROMITE DEPOSIT 146
6.6 CONCLUSIONS 149

7. GEMNET II – AN INTEGRATED SYSTEM FOR GRADE ESTIMATION 150
7.1 OVERVIEW 150
7.2 CORE ARCHITECTURE AND OPERATION 152
7.2.1 Exploration Data Processing and Control Module 152
7.2.2 Module Two – Modeling Grade’s Spatial Distribution 159
7.2.3 Module One – Modelling Grade’s Spatial Variability 162
7.2.4 Final Module – Providing a Single Grade Estimate 164
7.3 VALIDATION 167
7.3.1 Training and Validation Errors 167
7.3.2 Reliability Indicator 168
7.3.3 Module Index 170
7.3.4 RBF Centres Visualisation 171
7.4 INTEGRATION 172
7.4.1 Neural Network Simulator 172
7.4.2 Interface with VULCAN – 3D Visualization 176
7.5 CONCLUSIONS 182

8. GEMNET II APPLICATION – CASE STUDIES 185
8.1 OVERVIEW 185
8.2 CASE STUDY 1 – COPPER/GOLD DEPOSIT 1 188
8.3 CASE STUDY 2 – COPPER/GOLD DEPOSIT 2 197
8.4 CASE STUDY 3 – COPPER/GOLD DEPOSIT 3 209
8.5 CASE STUDY 4 – COPPER/GOLD DEPOSIT 4 220
8.6 CONCLUSIONS 226

9. CONCLUSIONS AND FURTHER RESEARCH 185
9.1 CONCLUSIONS 185
9.2 FURTHER RESEARCH 188

APPENDIX A – FILE STRUCTURES 239
A1. SNNS NETWORK DESCRIPTION FILE 239
A2. SNNS NETWORK PATTERN FILE 241
A3. BATCHMAN NETWORK DEVELOPMENT SCRIPT 242
A4. SNNS2C NETWORK C CODE EXTRACT 243
A5. VULCAN COMPOSITES FILE 247

APPENDIX B – CASE STUDY DATA 254
B1. CASE STUDY 1 – 2D IRON ORE DEPOSIT 254
B2. CASE STUDY 2 – 2D COPPER DEPOSIT 254
B3. CASE STUDY 3 – 3D GOLD DEPOSIT 246
B4. CASE STUDY 4 – 3D CHROME DEPOSIT 246

REFERENCES 253
List of Figures

Chapter 1
Figure 1.1: Drillholes from exploration programme and development, intersecting the orebody
(coloured by gold assays – screenshot from VULCAN Envisage). 4
Figure 1.2: Compositing of drillhole samples using interval equal to sample length. 6
Figure 1.3: Polygonal method of ore grade estimation. 8
Figure 1.4: Triangular method of ore grade estimation. 9
Figure 1.5: Search ellipse used during selection of samples for ore grade estimation. 12
Figure 1.6: Frequency histogram (left) and variogram (right) of copper grades (percentages). 15
Figure 1.7: Grid modeling as visualised in an advanced 3D graphics environment. 17
Figure 1.8: Sections through a block model intersecting the orebody. 18
Chapter 2
Figure 2.1: Illustration of a typical neuron [13]. 25
Figure 2.2: Propagation of an action potential through a neuron’s axon [13]. 26
Figure 2.3: The five major models of computation as they were presented six decades ago [18]. 29
Figure 2.4: Structure of the processing element [32]. 30
Figure 2.5: Effect of bias on the input to the activation function (induced local field) [32]. 31
Figure 2.6: Common activation functions: (a) unipolar threshold, (b) bipolar threshold, (c)
unipolar sigmoid, and (d) bipolar sigmoid [33]. 32
Figure 2.7: Basic structure of a layered ANN [32]. 33
Figure 2.8: Structure of the feedforward artificial neural network. There can be more than one
middle or hidden layers [33]. 42
Figure 2.9: a) Recurrent network without self-feedback connections, b) recurrent network
with self-feedback connections [32]. 44
Figure 2.10: Structure of a two-dimensional Self-Organising Map [32]. 45
Figure 2.11: Basic structure of the Radial Basis Function Network [33]. 46
Figure 2.12: The concept of Time Delay Neural Networks for speech recognition [40]. 47
Figure 2.13: An approach to FNN implementation [44]. 49
Chapter 3
Figure 3.1: Regularisation network [32]. 58
Figure 3.2: Structure of generalised RBF network [32]. 61
Figure 3.3: Illustration of input space dissection performed by the RBF and MLP networks [69]. 70
Chapter 4
Figure 4.1: ANN for ore grade/reserve estimation by Wu and Zhou [73]. 77
Figure 4.2: General structure of the AMAN neural system. 80
Figure 4.3: Back-propagation network used in the NNRK hybrid system. 82
Figure 4.4: Drillhole data used for testing the performance of the NNRK system. 83
Figure 4.5: 2D approach of learning from neighbour samples arranged on a regular grid. 85
Figure 4.6: Modular network approach implemented in the GEMNet system [84]. 86
Figure 4.7: Scatter diagram of GEMNet estimates on a copper deposit [84]. 87
Figure 4.8: Contour maps of GEMNet reliability indicator and grade estimates of a copper
deposit [84]. 88
Figure 4.9: Back-propagation network used for lateral log inversion [86]. Connections between
layers are not shown. 91
Figure 4.10: Estimated grades and assays (red and blue) vs. actual (black) [89]. 92
Chapter 5
Figure 5.1: Illustration of quadrant and octant search method (special case where only one
sample is allowed per sector). Respective grid nodes are also shown. 104
Figure 5.2: Estimation results from neural network architecture developed for use with gridded
data. The use of irregular data has an obvious effect in the performance of the system. 105
Figure 5.3: Neural network architectures receiving inputs from a quadrant search (left) and from
an octant search (right). 106
Figure 5.4: Improvement in estimation by the introduction of the neighbour sample distance in
the input vector. 108
Figure 5.5: Modular neural network architecture developed for ore grade estimation from 2D
samples [113]. 110
Figure 5.6: Partitioning of the original dataset into three parts each one targeted at a different
module of the MNNS. 115
Figure 5.7: RBF network used as part of module 1 in MNNS. Training patterns from an octant
search were used to train the network. 117
Figure 5.8: Posting of the basis function centres from the RBF network of Fig. 5.7 in the
normalised input space (X-Grade, Y-Distance). 118
Figure 5.9: Graph showing the learned relationship between the network’s inputs (grade and
distance of neighbour sample) and the network’s output (target grade) for the RBF
network of Fig. 5.7. 119
Figure 5.10: Example of an RBF network from Module 2. 120
Figure 5.11: Posting of the basis function centres from the RBF network of Fig. 5.10 in the
normalised input space (X-Grade, Y-Distance). 121
Figure 5.12: Graph showing the learned relationship between the network’s inputs (grade and
distance of neighbour sample) and the network’s output (target grade) for the RBF
network of Fig. 5.10. 121
Figure 5.13: Module 3 MLP network trained on sample co-ordinates. 122
Figure 5.14: Learned mapping between sample co-ordinates (easting and northing) and sample
ore grade for MLP network of Module 3. 124
Figure 5.15: 3D version of quadrant search. 127
Figure 5.16: 3D version of octant search. 128
Figure 5.17: Simplified 3D search method used in the MNNS for sample selection. 129
Figure 5.18: Diagram showing the structure of the MNNS for 3D data (units are the neural
network modules). 130
Figure 5.19: Learned weighting of outputs from module one RBF networks by the RBF of
module two. 131
Figure 5.20: Learned relationships between sample co-ordinates, length (inputs) and sample
grade (output) from the RBF network of module three. 133
Chapter 6
Figure 6.1: Posting of input/training samples (blue) and test samples (red) from the iron ore
deposit. 138
Figure 6.2: Scatter diagram of actual vs. estimated iron ore grades. 139
Figure 6.3: Iron ore grade distributions – actual and estimated. 140
Figure 6.4: Contour maps of iron ore actual and estimated grades. 141
Figure 6.5: Posting of input/training samples (blue) and test samples (red) from the copper
deposit. 142
Figure 6.6: Scatter diagram of actual vs. estimated copper grades. 143
Figure 6.7: Copper grade distributions – actual and estimated. 144
Figure 6.8: Contour maps of copper actual and estimated grades. 145
Figure 6.9: 3D view of the orebody and drillhole samples used in the 3D gold deposit study. 147
Figure 6.10: Scatter diagram of actual vs. estimated gold grades. 148
Figure 6.11: Gold grade distributions – actual and estimated. 149
Figure 6.12: Gold grades distribution of the complete dataset. 150
Figure 6.13: Drillholes from a 3D chromite deposit. 151
Figure 6.14: Scatter diagram of actual vs. estimated chromite grades. 153
Figure 6.15: Chromite grade distributions – actual and estimated. 153
Chapter 7
Figure 7.1: Simplified block diagram showing the operational steps of the data processing and
control module in GEMNET II. 159
Figure 7.2: Normalisation information panel. 160
Figure 7.3: Interaction between GEMNET II and other parts of the integrated system during
operation of the data processing and control module. 165
Figure 7.4: RBF centres from second module located in 3D space. Drillholes and modelled
orebody are also shown. 168
Figure 7.5: RBF centres of west sector RBF network and respective training samples in the
input pattern hyperspace (X-Grade, Y-Distance, Z-Length). 170
Figure 7.6: Final module’s RBF network. 172
Figure 7.7: Block model coloured by the reliability indicator in GEMNET II. 176
Figure 7.8: Block model coloured by module index in GEMNET II. Cyan blocks represent first
module estimates while red blocks represent second module estimates. 177
Figure 7.9: First module RBF centres visualisation in GEMNET II. Drillholes and orebody
model are also shown. 178
Figure 7.10: Diagram of the main components of SNNS. 179
Figure 7.11: Modules and extensions of VULCAN. 185
Figure 7.12: Menu structure of GEMNET II in Envisage. 186
Figure 7.13: GEMNET II panels in Envisage. 187
Figure 7.14: Console window with messages from GEMNET II operation. 188
Figure 7.15: GEMNET II online help. 189
Chapter 8
Figure 8.1: Orebody and drillholes from copper/gold deposit 1. 195
Figure 8.2: Scatter diagram of actual vs. estimated copper grades from copper/gold deposit 1. 196
Figure 8.3: Copper grade distributions from copper/gold deposit 1. 197
Figure 8.4: Scatter diagram of actual vs. estimated gold grades from copper/gold deposit 1. 198
Figure 8.5: Gold grade distributions from copper/gold deposit 1. 199
Figure 8.6: Plan section (top) and cross section (bottom) of block model coloured by reliability
indicator values for the gold grade estimation of copper/gold deposit 1. 200
Figure 8.7: Plan section (top) and cross section (bottom) of block model coloured by module
index for gold and copper grade estimation of copper/gold deposit 1. 201
Figure 8.8: RBF centre locations and training patterns from module 1 networks, north (top)
and east (bottom). 202
Figure 8.9: Plan section (top) and cross section (bottom) of block model coloured by gold grade
estimates for the copper/gold deposit 1. 203
Figure 8.10: Orebodies and drillholes from copper/gold deposit 2. 205
Figure 8.11: Scatter diagram of actual vs. estimated gold grades from zone TQ1 of copper/gold
deposit 2. 208
Figure 8.12: Gold grade distributions from zone TQ1 of copper/gold deposit 2. 208
Figure 8.13: Scatter diagram of actual vs. estimated gold grades from zone TQ1A of copper/
gold deposit 2. 209
Figure 8.14: Gold grade distributions from zone TQ1A of copper/gold deposit 2. 209
Figure 8.15: Scatter diagram of actual vs. estimated gold grades from zone TQ2 of copper/gold
deposit 2. 210
Figure 8.16: Gold grade distributions from zone TQ2 of copper/gold deposit 2. 210
Figure 8.17: Scatter diagram of actual vs. estimated gold grades from zone TQ3 of copper/gold
deposit 2. 211
Figure 8.18: Gold grade distributions from zone TQ3 of copper/gold deposit 2. 211
Figure 8.19: Plan section (top) and cross section (bottom) of block model coloured by reliability
indicator values for the gold grade estimation of copper/gold deposit 2. 213
Figure 8.20: Plan section (top) and cross section (bottom) of block model coloured by module
index for gold and copper grade estimation of copper/gold deposit 2. 214
Figure 8.21: RBF centre locations and training patterns from module 1 network north (top)
and module 2 network (bottom) in copper/gold deposit 2. 215
Figure 8.22: Plan section (top) and cross section (bottom) of block model coloured by gold
grade estimates for the copper/gold deposit 2. 216
List of Tables
Chapter 4
Table 4.1: Comparison of NNRK, ANN, and kriging estimates. 83
Chapter 5
Table 5.1: Learning strategy for Module 3 MLP network. 123
Chapter 6
Table 6.1: Characteristics of datasets from the MNNS case studies. 137
Table 6.2: Mean absolute errors from case study 1. 139
Table 6.3: Mean absolute errors from case study 2. 143
Table 6.4: Actual and estimated average gold grades. 147
Table 6.5: Mean absolute errors from case study 3. 148
Table 6.6: Actual and estimated average chromite grades. 152
Table 6.7: Mean absolute errors from case study 4. 152
Chapter 7
Table 7.1: System variables available in BATCHMAN. 181
Chapter 8
Table 8.1: Main characteristics of the four deposits used for testing the final GEMNET II
architecture. 193
Table 8.2: Statistics of data from copper/gold deposit 1. 196
Table 8.3: Actual and estimated average copper and gold grades from copper/gold deposit 1. 199
Table 8.4: Samples and block model file information and training pattern generation results for
copper/gold deposit 2. 206
Table 8.5: Statistics from copper/gold deposit 2 and estimation performance results. 207
1. Introduction
1.1 The Problem of Grade Estimation

Grade estimation is one of the most complicated aspects of mining. It also happens to
be one of the most important. The complexity of grade estimation originates from
scientific uncertainty, common to similar engineering problems, and the necessity for
human intervention. The combination of scientific uncertainty and human judgement
is common to all grade estimation procedures regardless of the chosen methodology.
In statistical terms, grade estimation is a problem of prediction. Geoscientists
are given a set of samples from which they need to construct a quantitative model of
an orebody’s grade by interpolating and extrapolating between those samples. These
geoscientists may come from very different fields, such as geology, mathematics
and statistics. The quantitative model they construct should ideally take into
consideration the qualitative model of the orebody built by the geologists
interpreting the exploration data.
The amount of data available to support the grade estimation process is
usually very small compared with the amount of information that has to be extracted
from it. The data also occupy a very small volume in 3D space compared with the
volume of the orebody that undergoes grade estimation. Their quality depends on a
number of processes that involve human interaction and allow measurement errors to
be introduced at the early stages of sampling, analysing and logging. It should also
be noted that exploration data are usually very expensive.
Various methods have been developed for performing grade estimation.
Generally, these methods can be classified into three categories: geometrical,
distance-based and geostatistical. Certain assumptions are inherent to each of these
methods, while most of them depend on human judgement and allow for the
introduction of human errors. These assumptions mainly concern the spatial
distribution characteristics of grade, such as its continuity in different directions in
space. It would be an understatement to say that a large proportion of the people who
apply these methods do not understand or take these assumptions into consideration.
Especially in the case of geostatistics, because of the built-in complexity of the
methodology, people tend to overlook the significance of these assumptions or
underestimate the negative effects that any misjudgements might have. As a result,
mining projects often begin with ‘great expectations’ that may never become reality.
Over- or underestimation of grades is only one of many unforgiving consequences of a
wrong choice and application of grade estimation methods.
In recent years, many researchers in the field of grade/reserve estimation
have noticed these problems and tried to suggest possible alternatives. Some
have tried to prove that the assumptions inherent in geostatistics cannot be valid most
of the time and that other methods should therefore be considered. However, these
discussions commonly concentrate more on discrediting geostatistics and other
established methodologies than on progressing towards a new and valid method.

It seems to be a common belief that the geostatistical methodology has created
a special league of people who understand the underlying mechanisms and theory.
Unfortunately, these people are a minority of the scientists and engineers who are
asked to provide the grade estimates on which large amounts of investment
money will be spent. In most cases people misuse geostatistics, or avoid them
completely even though they could benefit from their use. Many geologists build
their own picture of the orebody in their minds using their experience and even their
instincts. They ‘develop’ their own methods of estimation by adjusting less advanced
methods to the exploration data at the early stages of a mining project. What is even
more unfortunate is that they continue to build confidence in those early models of
the orebody, which inevitably leads them to the difficult position of not being
able to fit new data coming from the mine into their model.
There are too many examples of successful application of geostatistics and
other existing methods for one to disregard them completely. In the case of
geostatistics specifically, this success cannot be credited to luck because, as will be
discussed later, it is a painstaking and time-consuming process that leaves no room
for mistakes or misjudgements. Careful choice of a method and careful application of
that method to exploration data can therefore produce reliable results. As already
discussed, though, the current methods for grade estimation, and particularly
geostatistics, require a large amount of knowledge and skill to be applied effectively.
They can be very time consuming and difficult to explain to the people who make
investment decisions. Their results depend on the skills and experience of the
modeller and on the quality of the exploration data, and they can be prone to errors
when handling data that do not follow the necessary assumptions. The next section
briefly discusses the exploration data used during grade estimation in order to
explain the potential problems they can cause in this process.
1.2 Grade Data from Exploration Programs

Drilling is the most common way of entering the 3D space beneath the ground surface
to extract samples from the underlying rock, although other methods exist, such as
the construction of shafts and tunnels.
Based on the samples obtained, the geologist will draw conclusions as to the presence
of a mineralised body. Economics usually dictates the maximum number of drillholes,
although this is also controlled by the complexity of the geological environment.
There are many types of drilling equipment, and the layout of a drilling programme does
not follow specific rules. Figure 1.1 shows a set of drillholes from a copper/gold
orebody. Hole spacing and size depend largely on the characteristics of the orebody.
This is a major source of complication when it comes to developing a grade
estimation technique.
Figure 1.1: Drillholes from exploration programme and development, intersecting the orebody
(coloured by gold assays – screenshot from VULCAN Envisage).
Once the samples are obtained and logged, the mineralised parts are prepared
for assaying. Computers are used extensively during this process for logging and storage
of the samples. The outcome of the exploration programme and post-processing is a
series of files containing records of drillhole samples. There are usually three files
describing the contents and the position of the samples in 3D space. These files are:
• Collar table file: this file contains the co-ordinates of the drillhole collars
and the overall geometry of the drillhole.
• Survey table file: this file provides all the necessary information to derive
the co-ordinates of individual samples in space. The combination of the
survey and collar tables is necessary in order to visualise drillholes
correctly using 3D computer graphics and enable the development of a
drillhole database.
• Assay table file: the results of the assay analysis are stored per sample
in this file. When combined with the previous two files, this leads to the
completion of the drillhole database. This database is the source of input
data for the process of grade estimation.
Following the development of a drillhole database comes the compositing of drillholes
into intervals. These intervals refer to drillhole length and can either be fixed or be
derived from the sample lengths. In the first case, if the interval is greater than the
length of the samples, more than one sample is used to provide the assay value
for that interval. Figure 1.2 illustrates the process of compositing. Compositing is
usually a length-weighted average, except in the case of extremely variable density,
where compositing must be weighted by length times density [71]. When the
intervals are derived directly from the sample lengths, the number of composites
equals the number of samples in the database, and the compositing procedure is
reduced to a reconstruction of the database into a single file containing all the
information. This type of compositing is used throughout this thesis to
provide the input data files for the various case studies.
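As an illustration of the length-weighted average (a minimal sketch; the dictionary field names are mine, not from any particular mining package), compositing a set of samples into one interval can be written as:

```python
def composite_grade(samples, density_weighted=False):
    """Length-weighted average grade of the samples falling in one interval.

    Each sample is a dict with 'grade', 'length' and, when needed, 'density'.
    With extremely variable density, the weights become length * density [71].
    """
    weights = [s["length"] * (s["density"] if density_weighted else 1.0)
               for s in samples]
    total_weight = sum(weights)
    return sum(w * s["grade"] for w, s in zip(weights, samples)) / total_weight

# Two 1.0 m samples and one 0.5 m sample combined into a single composite:
print(composite_grade([{"grade": 1.2, "length": 1.0},
                       {"grade": 0.8, "length": 1.0},
                       {"grade": 2.0, "length": 0.5}]))
# (1.2*1.0 + 0.8*1.0 + 2.0*0.5) / 2.5 = 1.2
```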
Figure 1.2: Compositing of drillhole samples using interval equal to sample length.
A typical composites file starts with a header describing the structure of the
file and the format used for reporting the values of the various parameters. After the
header follows the main part of the file consisting of the sample records. Records
typically contain the following parameters:
Sample id, top xyz, bottom xyz, middle xyz, length, from, to, geocode, assay values
The top, bottom, and middle co-ordinates are derived from the survey and collar
tables as explained above. The from and to fields refer to the distance from the
drillhole collar to the beginning and end of the sample respectively. There can be a
number of codes describing geology, lithology, etc. These parameters allow the
interaction between the qualitative model of the orebody, built by the geologist, and
the quantitative model, which will be developed after grade estimation. Finally, there
can be more than one variable reported for every composite, e.g. gold and
copper grades.
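A hypothetical reader for records of this form might look like the sketch below. The exact header and field layout differ between packages (Appendix A5 shows an actual VULCAN composites file), so the comma-separated layout and the assay names used here are assumptions for illustration only.

```python
from dataclasses import dataclass, field

@dataclass
class Composite:
    sample_id: str
    top: tuple         # (x, y, z) at the top of the composite
    bottom: tuple      # (x, y, z) at the bottom of the composite
    middle: tuple      # (x, y, z) at the mid-point, used as the sample location
    length: float
    from_depth: float  # distance from the collar to the start of the sample
    to_depth: float    # distance from the collar to the end of the sample
    geocode: str       # geology/lithology code linking to the qualitative model
    assays: dict = field(default_factory=dict)  # e.g. {"AU": 1.30, "CU": 0.42}

def parse_composite(line, assay_names=("AU", "CU")):
    f = line.strip().split(",")
    return Composite(sample_id=f[0],
                     top=tuple(map(float, f[1:4])),
                     bottom=tuple(map(float, f[4:7])),
                     middle=tuple(map(float, f[7:10])),
                     length=float(f[10]),
                     from_depth=float(f[11]),
                     to_depth=float(f[12]),
                     geocode=f[13],
                     assays=dict(zip(assay_names, map(float, f[14:]))))
```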
The irregularities of the drilling scheme, the limited number of drillholes
that are economically feasible, and the complex procedures necessary for the
analysis of the obtained samples account for many of the problems encountered during
grade estimation. Additionally, the grades themselves often exhibit behaviour that is
very difficult to model using the information available from an exploration
programme. The people responsible for an exploration programme always face the
questions of how much an extra drillhole would add to the samples database,
whether its cost is justified by the derived benefits and, naturally, where to drill
in the given area.
1.3 Existing Methods for Grade Estimation
1.3.1 General

In the following paragraphs, several of the most common existing methods for
grade estimation are discussed briefly, with attention given to their specific
areas of application. Every method has special characteristics that make it more
applicable to certain types of deposit. There is no such thing as a universally
applicable method for grade estimation; the selection of a method for a particular
deposit depends on the latter's geological and engineering attributes.
1.3.2 Geometrical Methods

Before computers dominated the field of grade estimation, the geometrical methods
were those most often employed [81], and they are still used for quick evaluations of
reserves. These methods include the polygonal (Fig. 1.3) and triangular (Fig. 1.4)
methods and the method of sections.
Figure 1.3: Polygonal method of grade estimation.
The polygonal method is very often used with drillhole data. It can be applied on
plans, cross sections and longitudinal sections. The grade of the sample
inside each polygon is assigned to the entire polygon and provides the grade estimate
for the area of the polygon. The thickness of the mineralisation in the sample is also
applied to the polygon to provide a volume for the reserve estimate. The assumption
here is that the area of influence of any sample extends halfway to the adjacent
sample points. The polygons are constructed by joining the perpendicular bisectors of
the lines connecting these sample points. The polygonal method is applied to deposits
of simple to moderate geometry with low to medium grade variability (e.g. coal,
sedimentary iron, limestone, evaporites).
Figure 1.4: Triangular method of grade estimation.
The triangular method is a slightly more advanced method than the polygonal one. In
this method, the triangular area between three adjacent drillholes receives the average
grade of the three samples involved. In computational terms, the triangular method is
much faster, since the areas are easy to calculate from the co-ordinates of the three
points. This method can be applied to the same cases as the polygonal method.
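As a sketch of the computation involved (not code from the thesis), the triangular method reduces to a shoelace-formula area and a simple average of the three grades:

```python
def triangle_area(p1, p2, p3):
    """Area of the triangle formed by three drillhole positions (shoelace formula)."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    return abs(x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2)) / 2.0

def triangular_estimate(holes):
    """Average grade of three adjacent holes, assigned to the triangle between them.

    `holes` is a sequence of ((x, y), grade) tuples; returns (area, grade estimate).
    """
    points = [position for position, _ in holes]
    grades = [grade for _, grade in holes]
    return triangle_area(*points), sum(grades) / 3.0
```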
The last of the three geometrical methods to be mentioned in this thesis, the
method of sections, is the most manual one and requires a great deal of time and
patience. The areas of influence of the drillhole samples extend halfway to adjacent
sections and to adjacent drillholes in the same section. The grades of the samples are
assigned to their areas of influence. The method of sections is usually applied in
deposits with very complex geometry, where the other methods present difficulties.
The geometrical methods suffer from problems concerning the predicted
distribution of grades. Depending on the average grade of the deposit and the cutoff
grade, they can lead to over- or underestimation of grades.
1.3.3 Inverse Distance Method

Inverse distance weighting, like kriging (the geostatistical interpolation tool),
belongs to the class of moving average methods. Both are based on repetitive
calculations and therefore require the use of computers. Inverse distance weighting
consists of searching the database for the samples surrounding a point (or a block)
and computing the weighted average of those samples’ grades. This average is
calculated using the equation below:

$$g^* = \sum_{i=1}^{n} w_i g_i \qquad (1.1)$$
where $g^*$ is the grade estimate, $g_i$ is the grade of sample $i$, $w_i$ is the weight for
sample $i$, and $n$ is the number of samples. The difference between inverse distance
weighting and kriging lies in the way the weights $w_i$ are calculated. In the case of
inverse distance, the weights are calculated as an inverse power of distance:

$$w_i = \frac{d_i^{-power}}{\sum_{j=1}^{n} d_j^{-power}}, \qquad i = 1, \ldots, n \qquad (1.2)$$

where $w_i$ is the weight for sample $i$, $d_i$ is the distance between sample $i$ and the
estimated point, and $power$ is the inverse distance weighting power. The sample
selection strategy is as important as the weighting power. The following
guidelines can be used during sample selection [71]:
• Samples should be chosen from the estimate point’s geologic domain;
• The search distance should be at least equal to the distance between samples;
• There should be a maximum number of samples to be selected;
• At least one sample must lie within a specified minimum distance of the estimate
point to prevent excessive extrapolation;
• Trends in the grade should be accounted for by the use of a search ellipse.
Modelling of the grade’s range of continuity in various directions is necessary to
provide the axes of the search ellipse (Fig. 1.5). This is commonly achieved using
variogram modelling (see Section 1.3.4);
• The number of samples from any drillhole should be kept to a maximum of
three; more samples lead to redundant data and can cause problems, especially if
kriging is used as the interpolation method;
• Quadrant or octant search schemes may be used in the case of clustered data to
improve the estimation results [71].
The weighting power, as well as the search radius and the number of samples used,
affects the degree of smoothing. Unfortunately, suitable values can only be found
through trial and error, aiming to honour the trends in the grade, match production
results, or even follow the geologist's ideas about the deposit.
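A minimal inverse distance sketch following equations (1.1) and (1.2) is shown below; the search logic is reduced to a radius and a maximum sample count, whereas a full implementation would also enforce the domain, per-hole and sector rules listed above.

```python
import math

def idw_estimate(point, samples, power=2.0, search_radius=100.0, max_samples=12):
    """Inverse distance weighted grade estimate at `point` (equations 1.1 and 1.2).

    `samples` is a list of ((x, y, z), grade) tuples. Samples beyond the search
    radius are ignored and only the closest `max_samples` are kept. A sample
    coinciding with the estimated point returns its own grade directly.
    """
    candidates = []
    for location, grade in samples:
        d = math.dist(point, location)
        if d == 0.0:
            return grade
        if d <= search_radius:
            candidates.append((d, grade))
    if not candidates:
        return None  # nothing within the search radius
    candidates.sort()
    candidates = candidates[:max_samples]
    weights = [d ** -power for d, _ in candidates]
    return sum(w * g for w, (_, g) in zip(weights, candidates)) / sum(weights)
```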
Figure 1.5: Search ellipse used during the selection of samples for grade estimation. The ellipse is
divided into quadrants and a maximum number of points is selected from each of them.
Inverse distance weighting can be applied to deposits with simple to moderate
geometry and with low to high grade variability (e.g. all the types mentioned for the
polygonal method, plus bauxite, lateritic nickel, porphyry copper, gold veins, gold
placers, alluvial diamond, stockwork) [71].
1.3.4 Geostatistics

The work of G. Matheron and D. Krige in the early 1960s led to the development of
an ore reserve estimation methodology known as geostatistics. The theory
of geostatistics combines aspects of different sciences such as geology, statistics
and probability theory. It is a highly complex methodology whose main purpose is
the best possible estimation of ore grades within a deposit given a certain
amount of information. Geostatistics, like any other method, will not improve on the
quantity and quality of the input data.
Matheron’s theory of regionalised variables [58] forms the basis of
geostatistical methodology. In brief, according to this theory, any mineralisation can
be characterised by the spatial distribution of a certain number of measurable
quantities (regionalised variables) [38]. Geostatistics follows from the observation that
samples within an ore deposit are spatially correlated with one another. Attention is
also given to the relationship between sample variance and sample size.
Every geostatistical study begins with the process of structural analysis,
which is by far the most important step of the methodology. Structural analysis
examines the structure of the spatial distribution of ore grades via the development
of variograms. The variogram utilises all the available structural information to
provide a model of the spatial correlation, or continuity, of ore grades. The calculation
of a variogram should be based on data from similar geological domains. The
variogram function is as follows:
$$\gamma(h) = \frac{1}{2n}\sum_{i=1}^{n}\left[g(x_i) - g(x_i + h)\right]^2 \qquad (1.3)$$

where $g(x_i)$ is the grade at point $x_i$, $g(x_i + h)$ is the grade of a point at distance $h$ from
point $x_i$, and $n$ is the number of sample pairs. Sample pairs are oriented in the same
direction and separated by the distance $h$. Their volume should also be constant; this
is taken into consideration during the compositing of drillholes. For the purposes of
constant semi-variogram support, compositing should be performed on a constant interval.
The variogram function is calculated for different values of the distance h. The
resulting graph is known as the experimental variogram. As shown in Figure 1.6, the
variogram usually increases with increasing distance until it reaches a plateau. The
distance h at which the variogram stops increasing and becomes more or less level is
called the range of the variogram. The value of the variogram at this distance is called
the sill of the variogram ($C + C_0$). Finally, the value of the variogram at distance
h = 0 is called the nugget effect ($C_0$). A number of different meanings are given to a
nugget effect that is high in comparison to the sill, such as low-quality samples or a
non-homogeneous sampling zone. Most of the time it is fairly difficult to identify
these three parameters from the experimental variogram graph, and it therefore
becomes difficult to fit one of the available models. It is a process that requires skill,
experience and large amounts of time. It is also a point where mistakes are
made, undermining the entire process of grade estimation.
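The experimental variogram of equation (1.3) can be sketched as follows, assuming sample positions along a single direction and a fixed lag tolerance (a 3D directional search adds bookkeeping but no new ideas):

```python
def experimental_variogram(positions, grades, lags, tol):
    """Experimental semi-variogram gamma(h) following equation (1.3).

    `positions` and `grades` are parallel lists of 1D sample locations along one
    direction and their grades; `lags` lists the separations h to evaluate and
    `tol` is the tolerance used to bin sample pairs into each lag.
    """
    gamma = []
    for h in lags:
        squared_diffs = [(grades[i] - grades[j]) ** 2
                         for i in range(len(positions))
                         for j in range(i + 1, len(positions))
                         if abs(abs(positions[i] - positions[j]) - h) <= tol]
        n = len(squared_diffs)
        gamma.append(sum(squared_diffs) / (2.0 * n) if n else None)
    return gamma
```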
The variogram modelling is followed by the geostatistical method for grade
interpolation, called kriging. Kriging is a linear estimation method based on
the positions of the samples and on the continuity of grades as shown by the variograms.
The method finds the optimal weights $w_i$ for equation (1.1) by minimising the
estimation variance derived from the calculated variograms. Kriging is therefore not
based only on distance, as the inverse distance method is.
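To make the contrast with inverse distance concrete, the sketch below solves the standard ordinary kriging system for the weights of equation (1.1), assuming a variogram model has already been fitted; it is illustrative only, not the implementation used later in this thesis.

```python
import numpy as np

def ordinary_kriging_weights(sample_points, estimate_point, gamma):
    """Weights w_i of equation (1.1) from the ordinary kriging system.

    `sample_points` is an (n, 3) array of sample locations, `estimate_point` a
    length-3 array and `gamma` a fitted variogram model gamma(h). The extra
    unknown is the Lagrange multiplier forcing the weights to sum to one.
    """
    n = len(sample_points)
    A = np.ones((n + 1, n + 1))
    A[n, n] = 0.0
    for i in range(n):
        for j in range(n):
            A[i, j] = gamma(np.linalg.norm(sample_points[i] - sample_points[j]))
    b = np.ones(n + 1)
    b[:n] = [gamma(np.linalg.norm(p - estimate_point)) for p in sample_points]
    return np.linalg.solve(A, b)[:n]

def spherical(h, a=150.0, c=1.0, c0=0.1):
    """Spherical variogram model with range a, sill c0 + c and nugget c0."""
    if h == 0.0:
        return 0.0
    if h >= a:
        return c0 + c
    return c0 + c * (1.5 * h / a - 0.5 * (h / a) ** 3)
```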
Figure 1.6: Frequency histogram (left) and variogram (right) of copper grades (percentages).
There are a number of variations of kriging, each suited to different types of
deposit and sampling scheme. The geostatistical methodology is very well
documented and there are many good publications in this field [38,20,37,17].
Non-linear variants of kriging have also been developed, such as log-normal and
disjunctive kriging [21,49], which are far more advanced than linear kriging but also
far more complicated.
Generally, it is difficult to argue with the efficiency and reliability of a
properly developed geostatistical study. However, there is always the issue of
justifying the extra complexity and cost of geostatistics especially at the beginning of
a mining project when there are no actual values to compare with.
1.3.5 Conclusions

From the very brief discussion above, it becomes clear that there is still a need for a
fast and reliable method of ore grade estimation whose results depend only on the
complexity and variability of the given deposit, and not so much on the quality and
quantity of the given data. The required method should also not depend on the skills
and knowledge of the person applying it, while remaining easy to understand and
apply.

The methods developed so far suffer either from over-simplification of the ore
grade estimation process, as in the case of the geometrical methods, or from
over-sophistication, as in the case of geostatistics. Choosing one of the available
methods is usually a compromise between speed and reliability, cost and attention to
detail. This is a compromise very few mining companies are willing to make, but
many have to because of the resources available to them.
1.4 Block Modelling & Grid Modelling in Grade Estimation Grade estimation usually involves interpolation between known samples, which
become available from an exploration program or from the development of the mine.
The interpolation process is based on locations commonly arranged on a regular
geometric structure designed to provide for the necessary detail and cover the
volume/area of interest. Block and grid models are the main structures used during grade estimation and deposit modelling. The choice between them depends on the type and complexity of the deposit and the value of interest [5].
Figure 1.8: Grid modeling as visualised in an advanced 3D graphics environment.
Grid models (Fig. 1.8) consist of a series of two-dimensional computer matrices. These matrices may contain estimates of different parameters such as grades, thickness, structures and other values. A grid is usually defined by its origin
in space, i.e. the easting, northing, and elevation of its starting position, the distance
between its nodes in both directions, and its dimension in these directions, i.e. the
number of nodes. This structure dramatically reduces the amount of information
necessary to represent a complete model of the deposit and has the additional
advantage of allowing easy manipulation of the various parameters included by
performing simple calculations between the grids. Grid modeling is best suited for
deposits with two of their dimensions being significantly greater than the third.
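A minimal sketch of how such a grid definition might be held in code is given below. The field names and the thickness example are the Author's illustration and do not follow any particular mining package.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class GridModel:
    """Regular 2D grid: origin, node spacing and node counts fix every node position."""
    x0: float   # easting of the origin node
    y0: float   # northing of the origin node
    dx: float   # node spacing along the easting direction
    dy: float   # node spacing along the northing direction
    nx: int     # number of nodes along easting
    ny: int     # number of nodes along northing

    def node_position(self, i, j):
        return self.x0 + i * self.dx, self.y0 + j * self.dy

grid = GridModel(x0=1000.0, y0=2000.0, dx=25.0, dy=25.0, nx=40, ny=60)
thickness = np.zeros((grid.nx, grid.ny))    # one matrix per stored parameter
density = np.full((grid.nx, grid.ny), 2.7)
tonnes_per_m2 = thickness * density         # grid-to-grid calculation
```

The simple calculations between grids mentioned above reduce to element-wise matrix operations, as in the last line.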
Block models are far more complex structures, being three-dimensional and allowing the storage of more than one parameter. Figure 1.9 shows two sections through a block model. The volume including the deposit of interest is divided into blocks with a specific volume associated with them. These blocks are defined by the X, Y, and Z co-ordinates of their centroids relative to the origin of the model. Their dimensions can vary from block to block – usually decreasing close to geologic structures and other features that require more detail. More than one variable can be associated with every block, some estimated and others derived. Grade estimation on a block model basis means the extension of point samples to block estimates with volume.
Figure 1.9: Sections through a block model intersecting the orebody. A surface topography model has
limited the block model.
Block models allow the modelling of deposits with very complex geometry. They do, however, require considerable computational power, and they tend to become more demanding as the number of variables stored increases. They are also more difficult to visualise as they are three-dimensional, and they can only be effectively plotted in sections.
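In the same illustrative spirit, a block model might be sketched as follows; a constant block size is assumed for simplicity, although, as noted above, real models can vary block dimensions locally. All names are hypothetical.

```python
import numpy as np

class BlockModel:
    """Regular 3D block model: blocks are addressed by centroid offsets from
    the model origin, and several variables can be stored per block."""
    def __init__(self, origin, block_size, shape):
        self.origin = np.asarray(origin, dtype=float)    # X, Y, Z of model origin
        self.size = np.asarray(block_size, dtype=float)  # block dimensions
        self.shape = shape                               # block counts along X, Y, Z
        self.variables = {name: np.full(shape, np.nan)   # estimated and derived values
                          for name in ("cu_grade", "density")}

    def centroid(self, i, j, k):
        # Centroid = origin + (index + 0.5) * block size along each axis.
        return self.origin + (np.array([i, j, k]) + 0.5) * self.size

model = BlockModel(origin=(0, 0, -200), block_size=(10, 10, 5), shape=(50, 50, 40))
print(model.centroid(0, 0, 0))   # -> [  5.    5.  -197.5]
```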
1.5 Artificial Neural Networks for Grade Estimation
Artificial neural networks (ANNs) are the result of decades of research for a biologically motivated computing paradigm. There are many different opinions as to their definition and applicability to technological problems. It is a common belief, though, that ANNs present an alternative to the concept of programmed or hard computing. ANN technology brought about the concept of neural computing, which is finding its way more and more into real engineering problems. ANNs
are parallel computing structures, which replace program development with learning
[92].
There have been many cases of successful application of ANNs to function approximation, prediction and pattern recognition problems in the past. This fact, as well as the special characteristics of ANNs that will be discussed in the next chapter, makes them a natural choice for the problem of grade estimation. As discussed in the previous paragraphs, grade estimation is commonly reduced to a problem of function approximation. ANNs, and specifically the chosen type of ANN, can provide, as this thesis will try to prove, a valid methodology for grade estimation.
1.6 Research Objectives
Disregarding the existing methodologies for grade estimation is definitely not one of the aims of this thesis. The GEMNet II system described was developed to provide a flexible but complete alternative method, which takes into consideration the theory behind deposit formation while minimising the dependence on certain assumptions.
The main objectives of the development of GEMNet II can be identified as follows:
• To find a suitable neural network architecture for the problem of grade
estimation.
• To take advantage of the function approximation properties of ANNs.
• To break down the problem of grade estimation into less complex functions that
can be modelled using these properties.
• To integrate the developed neural network architecture in a system which will be
user-friendly and flexible.
• To provide means of validating the results of this system.
• To minimise the knowledge required for using the system.
• To compare the performance of the system with existing grade estimation
techniques on the basis of estimation properties, usability and time requirements.
1.7 Thesis Overview
Given below is a description of the chapters included in this thesis:
• Chapter 2 - Artificial Neural Networks Theory
Gives a brief discussion on the theory behind ANNs, the main ANN architectures
and their main application areas.
• Chapter 3 - Radial Basis Function Networks
Examines a special type of ANN architecture, which will form the basis of the
GEMNet II system. An in-depth analysis of Radial Basis Function Networks is
presented in order to provide a better understanding of their operation and their
suitability to the problem of grade estimation.
• Chapter 4 – Mining Applications of Artificial Neural Networks
Discusses a number of examples of the application of ANNs to grade/reserves estimation. Examples of similar applications from non-mining areas are also given. Presents a number of reported uses of ANN systems in mining and shows how this technology is beginning to gain ground in the mining industry.
• Chapter 5 - Development of a Modular Neural Network System for Grade
Estimation
Describes the development of prototype modular neural network systems for use
with 2D and 3D exploration data. The transition from two to three dimensions is
discussed.
• Chapter 6 - Case Studies of the Prototype Modular Neural Network System
Presents a number of case studies, which were used to guide the development of
the prototype system. These case studies were also used to validate the overall
approach.
• Chapter 7 - GEMNET II – An Integrated Modular System for Grade
Estimation
Explains the design and development of the GEMNet II system. The system architecture as well as its application is analysed. The integration of the system in an advanced 3D resource-modelling environment is also discussed.
• Chapter 8 - GEMNet II Application – Case Studies
Contains several examples of the application of GEMNet II to real deposits with
real sampling schemes. The case studies are presented in order of increasing
complexity. Other techniques are applied to the same data in order to provide a basis for comparison and evaluation of the GEMNet II system's performance.
• Chapter 9 - Conclusions – Further Research
Gives a discussion on the conclusions from the research described and the
potential areas for further research and development.
2. Artificial Neural Networks Theory
2.1 Introduction
2.1.1 Biological Background
The human brain, and the mammalian nervous system in general, has been the source of inspiration for decades of research for a computational model which is based not on hard-coded programming but on learning from experience. The human brain,
central to the human nervous system, is generally understood not as a single neural network but as a network of neural networks, each having its own architecture, learning strategy, and objectives. The massive parallelism of the human brain and the advantages deriving from this structure have always attracted the attention of scientists, especially in the field of computing.
Biological neural networks, regardless of their function and complexity, are
composed of building blocks known as neurons (Fig. 2.1). The minimal structure of
a neuron consists of four elements: dendrites, synapses, cell body, and axon.
Dendrites are the transmission channels for information coming into the neuron. The
signals, which propagate through the dendrites, originate from the synapses, which
form the input contact points with other neurons. Synapses are also centres of
information storage in biological neural networks. There are however other storage
mechanisms inside the biological neurons, which are still not very well understood
and extend outside the four-element neuron model described here. The axon is
responsible for transmitting the output of the neuron. There is only one axon per neuron, but axons can have more than one branch, the tips of which form synapses upon other neurons [3]. The cell body of the neuron is where most of the processing
takes place. The cell body also provides the necessary chemicals and energy for its
operation.
Figure 2.1: Illustration of a typical neuron [100].
Transmission of information within biological neural networks is achieved by
means of ions, semi-permeable membranes and action potentials as opposed to simple
electronic transport in metallic cables [87]. Neural signals produced at the neuron
travel through the axon in the form of ions, which in the case of neurons are called
neurotransmitters. The neuron is constantly trying to keep a balanced electrical
system by transporting excess positive ions out of the cell while holding negative ions
inside. These movements of ions through the neuron are known as depolarisation
waves or action potentials (Fig. 2.2).
The information transmitted between neurons is processed using a number of
electrical and chemical processes. The synapses play a leading role in the regulation
of these processes. Synapses direct the transmission of information and control the
flow of neurotransmitters. The cell body integrates incoming signals and when these
reach a certain level the activation threshold is reached and the neuron generates an
action potential, which propagates through the neuron’s axon.
Synapses, as already mentioned, are the centres of information storage. The
synapses store information by modifying the permeability of the cell to different
kinds of neurotransmitters, thereby altering their effect on the neuron's
activation. This information needs to be refreshed periodically in order to maintain
the optimal behaviour of the neuron. This form of information storage is also known
as synaptic efficiency, which represents the ability of a particular synapse to evoke the
depolarisation of the cell body.
Figure 2.2: Propagation of an action potential through a neuron’s axon [100].
All the above knowledge of the way neurons transmit, store, and process information is far from complete, and therefore any derived artificial model cannot be considered anywhere near as complex as its biological counterpart, at the level of both neurons and neural networks. ANNs follow the simple
four-element model of the biological neuron in the definition of their building block,
the artificial neuron or processing element.
2.1.2 Statistical Background
The study of the human brain and other biological nervous structures is not the only
source of inspiration and formalisation for the development of artificial neural
network models. ANNs are commonly treated as fine-grained parallel
implementations of non-linear static or dynamic systems [31]. The biological
structures when simplified to an artificial model become a system that can be best
described by a traditional mathematical or statistical model such as non-parametric
pattern classifiers, clustering algorithms, non-linear filters, and statistical regression
models rather than a true biological model. These statistical models are either
parametric with a small number of parameters, or non-parametric and completely
flexible. Artificial neural network methods cover the area in between with models of
large but not unlimited flexibility given by a large number of parameters as required
in large-scale practical problems [82].
The behaviour and dynamics of the structure of artificial networks can be
shown to implement the operation of classical mathematical estimators and optimal
discriminators [47]. It is generally accepted that the earlier models of artificial neurons and neural networks in the 1940s and '50s tried to imitate the biological model as closely as possible, while more recent models have been elaborated for new
generations of information-processing devices. In most cases of ANNs it is almost
impossible to get any agreement between their behaviour and experimental
neurophysiological measurements. This results from the over-simplification of the
biological nervous systems, which is dictated by the incomplete understanding of the
numerous chemical and electrical processes involved.
Understanding the operation properties of ANNs can be approached by a
number of different methods. Statistical mechanics is a very important tool for
analysing the learning ability of a neural network. Statistical mechanics provides a
description of the collective properties of complex systems consisting of many
interacting elements on the basis of the individual behaviour and mutual interaction
of these elements [118]. Within this approach, ANNs are defined as ensembles of
neurons with certain activity, which interact through synaptic couplings. Both the
activities and synaptic couplings are assumed to evolve dynamically.
In the following paragraphs, a discussion on various aspects of ANNs will be
given which will show to a greater extent the strong connection between statistics and
neural computing.
2.1.3 History
Almost every introduction to ANNs begins with a brief presentation of the historical
development of ANNs and neural computation in general. There are many good
reasons for discussing the history of ANNs. The brief discussion in this paragraph
will show how this multi-science field of computing evolved through time. This
historical analysis will help to assess the growth and potential of ANNs as an
approach to the problem of computing.
ANNs are the realisation of one of the first formal definitions of
computability, namely the biological model. In the 1930s and ‘40s there were at least
five alternative models of computation (Figure 2.3) [86]:
1. mathematical model
2. logic-operational model (Turing machines)
3. computer model
4. cellular automata
5. biological model (Neural Networks)
Figure 2.3: The five major models of computation as they were presented six decades ago [86].
The computer model of von Neumann became the most popular and most widely used one, but this did not mean the dismissal of the other approaches. In fact, John von Neumann himself participated in the development of other models such as the first ANNs [69]. In 1943 Warren McCulloch and Walter Pitts introduced the first models
of artificial neurons [60]. Donald Hebb in his book entitled The Organisation of
Behaviour [33] tried to build a qualitative explanation of experimental results from
psychology using a specific learning law for the synapses of neurons that he
proposed.
The first hardware implementations of ANNs included the Snark by Marvin
Minsky [64], the Mark I Perceptron by Frank Rosenblatt and others [88], the
ADALINE by Bernard Widrow [109], and the Lernmatrix by Karl Steinbuch [98].
After a quiet period in the late 1960s and early 1970s, the field of neural computing became once again the centre of research activity. Researchers such as Teuvo
Kohonen [46], James Anderson [2], Stephen Grossberg [30], and Shun-ichi Amari [1]
brought back the interest in the field and by the 1980s the first neural network
applications became a reality. John Hopfield [34] was another example of an established scientist who helped to raise worldwide awareness of the neural computing field. By the late 1980s the field was very well established through research groups in most of the major universities and research institutions around the world. David Rumelhart and James McClelland [89] are also worth mentioning for their contribution to the field through the publication of the Parallel Distributed Processing volumes, which are considered major references of neural computing.
2.2 Basic Structure – Principles
2.2.1 The Artificial Neuron – the Processing Element
The artificial neuron or processing element (PE) is the basic unit of an ANN. It is a
simplified version of the four-element model described in Paragraph 2.1.1. There are
both software and hardware implementations of PEs. Their basic structure is
illustrated in Figure 2.4.
Figure 2.4: Structure of the processing element [32].
The PE k includes a set of synapses each being identified by a weight w. Each input
signal xj to the PE k is multiplied by the synaptic weight wkj. The weighted input
signals are summed by the adder of the PE (linear combiner). The outcome of the
summation is passed to an activation function also known as squashing function
because it squashes (i.e. limits) the amplitude range of the PE’s output to a finite
value [32]. The bias bk is applied to the adder and has the effect of increasing or
decreasing the output of the latter. Figure 2.5 shows the effect of the bias on the
output of the linear combiner.
Figure 2.5: Effect of bias on the input to the activation function (induced local field) [32].
The following equations describe the model of the PE in mathematical terms:

$$\upsilon_k = \sum_{j=1}^{m} w_{kj} x_j \qquad (2.1)$$

and

$$y_k = \varphi(\upsilon_k) \qquad (2.2)$$
where x1, x2, …, xm are the input signals which are multiplied by the synaptic weights wk1, wk2, …, wkm and then added to give the linear combiner output υk. The bias bk is applied to υk to provide the input to the activation function ϕ( . ). Finally, the output
of the activation function gives the output of the neuron yk. Figure 2.6 illustrates the
most common activation functions used in modern PEs.
Figure 2.6: Common activation functions: (a) unipolar threshold, (b) bipolar threshold, (c) unipolar
sigmoid, and (d) bipolar sigmoid [53].
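Equations (2.1) and (2.2) translate directly into code. The short sketch below computes a PE output using a bias and the unipolar sigmoid of Figure 2.6; the weights and inputs are invented for the example.

```python
import numpy as np

def sigmoid(v):
    """Unipolar sigmoid squashing function: limits the output to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-v))

def pe_output(x, w, b, activation=sigmoid):
    """Weighted sum (linear combiner) plus bias, passed through the
    activation function (equations 2.1 and 2.2)."""
    v = np.dot(w, x) + b          # induced local field
    return activation(v)

x = np.array([0.5, -1.2, 0.3])    # input signals x_1..x_m
w = np.array([0.8, 0.1, -0.4])    # synaptic weights w_k1..w_km
print(pe_output(x, w, b=0.2))
```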
2.2.2 The Artificial Neural Network
The model of the artificial neuron or processing element described above forms the
basis of the artificial neural network (ANN) structure. ANNs consist of layers of
interconnected PEs as shown in Fig. 2.7. This layered structure is the most common
in ANNs and is usually called the fully connected feedforward or acyclic network.
However, there are ANNs that do not adopt this structure as will be discussed in
Section 2.4.
The starting point of the ANN structure is a layer of input units that allows information to enter the network. The input units cannot be considered as PEs, mainly because no processing of information takes place at them, with the exception of normalisation (when required). Normalisation is the process of
equalising the signal range (commonly to a range between 0.1 and 0.9) of different
inputs. Normalisation ensures that changes in the signals of different inputs have the
same effect on the network’s behaviour regardless of their magnitude.
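A linear rescaling to the commonly quoted 0.1 to 0.9 range might be sketched as follows; keeping the original minimum and maximum allows the inverse transform required at the output layer, as discussed below.

```python
import numpy as np

def normalise(x, lo=0.1, hi=0.9):
    """Rescale a signal linearly to [lo, hi]; also return the original
    bounds needed to invert the transform later."""
    xmin, xmax = x.min(), x.max()
    return lo + (hi - lo) * (x - xmin) / (xmax - xmin), (xmin, xmax)

def denormalise(y, bounds, lo=0.1, hi=0.9):
    xmin, xmax = bounds
    return xmin + (xmax - xmin) * (y - lo) / (hi - lo)

grades = np.array([0.4, 1.2, 2.5, 0.9])      # illustrative values
scaled, bounds = normalise(grades)
assert np.allclose(denormalise(scaled, bounds), grades)
```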
Figure 2.7: Basic structure of a layered ANN [32].
Following the input layer is one or more internal or hidden layers. The use of
the word hidden is mainly due to the fact that they are not accessible from outside the
ANN. The first hidden layer is fully interconnected with the units of the input layer.
In other words, all PEs of the hidden layer receive the signal from each input unit.
The signals are multiplied by a weight, which is different for every connection. In the
case of more than one hidden layer, there will be full interconnection between
subsequent layers as in the case of the input and first hidden layer.
The final part of the ANN structure is the output layer. The units of this layer
are also PEs, which receive the signals from the last hidden layer and perform similar
processing to that of the hidden PEs. If normalisation is used in the input layer, then
the outputs of the output PEs have to be transformed back to the range of the original
data to get sensible results. This is required normally when the ANN is used for
function approximation.
2.3 Learning Algorithms
2.3.1 Overview
Learning from examples is the main operation of any ANN. Learning in this case
means the ability of an ANN to improve its performance through an iterative process of adjusting its free parameters. The adjustment of an ANN's free parameters is stimulated by a set of examples presented to the network during the application of a learning algorithm, a set of well-defined rules for improving the network's performance.
There are many different learning algorithms for ANNs, each with a different way of
adjusting the connection weights of PEs and different way of formalising the
measurement of the ANN’s performance. They are generally grouped into supervised
and unsupervised algorithms. Supervised algorithms are applied when the required
ANN outputs are known in advance, while unsupervised algorithms are applied when
the correct outputs are not known and need to be found. Over the next paragraphs of
this section, the main learning processes and algorithms will be discussed briefly.
2.3.2 Error Correction Learning
In order to explain the error correction learning algorithm, the basic structure of any
ANN, the PE, will be examined. The example is based on the assumption that the PE
is the only unit of the output layer of a feedforward ANN. As in any learning
algorithm, adjusting the synaptic weights of the PEs is an iterative process involving
a number of time steps.
The PE k is presented with an input signal vector x(n) at time step n. This signal
vector is produced by the units of the previous layer - the last hidden layer in this
case. The output signal yk(n) of the ANN’s only output is compared to a target output
dk(n), which produces an error signal ek(n):
ek(n) = dk(n) – yk(n) (2.3)
The production of the error signal activates a corrective mechanism – a sequence of
corrective adjustments to the synaptic weights of the PE that bring the output signal
closer to the target output. A cost function or index of performance is defined based
on the error signal as follows [32]:
E(n) = ek²(n) / 2 (2.4)
Eventually the process of adjusting the synaptic weights of the PE reaches a stabilised
weight state and learning terminates. This learning process of cost function
minimisation is also known as the delta rule or Widrow-Hoff rule [99]. The
adjustment Δwkj(n) of the synaptic weight wkj at time step n is given by:
Δwkj(n) = ηek(n)xj(n) (2.5)

where η is the learning-rate parameter. The new value of the synaptic weight at time
step n+1 will be:
wkj(n+1) = wkj(n) + Δwkj(n) (2.6)
The correct choice of the learning-rate parameter is very important for the overall
performance of the ANN.
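A minimal sketch of the delta rule of equations (2.3) to (2.6), training a single linear output PE, is given below; the learning rate and the data are illustrative.

```python
import numpy as np

def delta_rule_train(X, d, eta=0.05, epochs=200):
    """Adjust weights by delta_w = eta * e_k(n) * x(n), equations (2.3) to (2.6)."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for x, target in zip(X, d):
            y = np.dot(w, x)         # linear PE output
            e = target - y           # error signal, equation (2.3)
            w += eta * e * x         # weight update, equations (2.5) and (2.6)
    return w

# Learn the mapping d = 2*x1 - x2 from noisy examples.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
d = 2 * X[:, 0] - X[:, 1] + rng.normal(scale=0.01, size=100)
print(delta_rule_train(X, d))        # converges close to [2, -1]
```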
2.3.3 Memory Based Learning
Memory based learning is mainly used for pattern classification purposes. Learning
takes the form of past experiences stored in a memory of classified input-output examples $\{(x_i, d_i)\}_{i=1}^{N}$, where xi is the input vector, di the target output, and N the number of patterns [32]. In the case of a new vector xnew presented to the network, the
algorithm will try to classify it by looking at the training data in a local
neighbourhood of xnew. There are a number of different algorithms for memory based
learning, which differ in the way they define two major aspects:
• the local neighbourhood of the new vector xnew
• the learning rule applied to training data in the local neighbourhood of xnew.
In Chapter 3 an in-depth discussion of a very important type of memory-based
classifier will be given, namely the radial basis function network.
2.3.4 Hebbian Learning
The oldest of the learning rules is Hebb's postulate of learning [33]. Hebb, in his
book The Organisation of Behaviour, made the following statement as the basis for
associative learning:
When an axon of cell A is near enough to excite a cell B and repeatedly or
persistently takes part in firing it, some growth process or metabolic changes take
place in one or both cells such that A’s efficiency as one of the cells firing B, is
increased [33, p.62].
Transferring this statement from the neurobiological context into a more algorithmic language yields the following two-part rule [99]:
1. If two neurons on either side of a synapse are activated simultaneously, then the
strength of that synapse is selectively increased.
2. If two neurons on either side of a synapse are activated asynchronously, then that
synapse is selectively weakened or eliminated.
The second part of the rule was not included in Hebb's original rule but was added for consistency reasons. The mathematical formulation of Hebbian learning is
given by the following equation of the synaptic weight wkj adjustment Δwkj(n):
Δwkj(n) = ηyk(n)xj(n) (2.7)
where xj and yk are the presynaptic and postsynaptic signals at time step n, and η is
the learning rate parameter. Hebbian learning is strongly supported by physiological
evidence in the area of the brain called the hippocampus.
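In code, the Hebbian adjustment of equation (2.7) is a single outer-product update; the sketch below is purely illustrative.

```python
import numpy as np

def hebbian_update(W, x, y, eta=0.01):
    """Strengthen w_kj in proportion to postsynaptic y_k times presynaptic x_j,
    equation (2.7)."""
    return W + eta * np.outer(y, x)

W = np.zeros((2, 3))              # 2 postsynaptic by 3 presynaptic units
x = np.array([1.0, 0.0, 1.0])     # presynaptic activity
y = np.array([0.5, -0.5])         # postsynaptic activity
W = hebbian_update(W, x, y)
print(W)   # weights grow where pre- and postsynaptic activity agree in sign
```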
2.3.5 Competitive Learning
Competitive learning is one of the major types of unsupervised learning. In
competitive learning the output PEs of an ANN compete to become active when an
input signal is presented. In other words the output PEs try to provide the output
associated with an input vector. Competitive learning is based on three elements [90]:
1. A number of similar PEs, which may however have some randomly distributed synaptic weights, causing a different response to a given set of input vectors.
2. A limited strength for each PE.
3. A competition mechanism for the PEs to gain the right to respond to a given
input. The mechanism must ensure that only one output PE responds at a time –
that PE is called the winner-takes-all neuron.
The winning PE is the one with the largest induced local field υk for an input pattern
x. The output of the winning PE is set to one while the outputs of all other PEs are set to zero. The adjustment of the synaptic weight wkj for the winning PE is given by the
following equation:
Δwkj = η(xj – wkj) (2.8)
while for the losing PEs:
Δwkj = 0 (2.9)
This leads to moving the synaptic weight vector of the winning PE towards the input
vector.
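A sketch of the winner-takes-all update of equations (2.8) and (2.9) follows; the weight rows are normalised at the start so that the PEs begin with comparable strengths, and the data are illustrative.

```python
import numpy as np

def competitive_step(W, x, eta=0.1):
    """Only the winning PE (largest induced local field) moves its weight
    vector towards the input, equations (2.8) and (2.9)."""
    winner = np.argmax(W @ x)
    W[winner] += eta * (x - W[winner])
    return winner

rng = np.random.default_rng(1)
W = rng.normal(size=(4, 2))
W /= np.linalg.norm(W, axis=1, keepdims=True)   # comparable initial strengths
for x in rng.normal(size=(200, 2)):
    competitive_step(W, x)
print(W)   # each row has drifted towards a region of the input space
```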
2.3.6 Boltzmann Learning
Named in honour of Ludwig Boltzmann, the Boltzmann learning rule is a stochastic
learning algorithm based on statistical mechanics [32]. An ANN designed to follow
the Boltzmann learning algorithm is called a Boltzmann Machine (BM). A BM
implements a stochastic response function to characterise the transitions of individual
PEs between different states. There are two possible states for BM units: on state
denoted by +1 or off state denoted by –1. A BM is characterised by an energy
function, E, which depends on the states of the BM units:
$$E(x) = -\frac{1}{2}\sum_{i}\sum_{j} w_{ji} x_i x_j \qquad (2.10)$$

where xj is the state of PE j, and wji is the synaptic weight between PEs i and j. There are no weights between a PE and itself (i ≠ j in the sums, or wjj = 0); in other words, none of the
PEs has self-feedback. During a BM’s operation a PE is chosen at random. Its output
is characterised in terms of a state transition function:
$$P(x_j \to -x_j) = \frac{1}{1 + e^{-\Delta E_j / T}} \qquad (2.11)$$
where ΔEj is the change in the energy function of the BM as a result of the state
transition and T is the pseudotemperature. The PEs in a BM fall into two categories:
visible and hidden. The visible units form the connection of the network with its
environment. These units have two modes of operation: clamped and unclamped or
free running. In clamped mode, the visible units are clamped onto specific states
while in unclamped mode they operate freely. The hidden units always operate freely.
The adjustment of the synaptic weight is defined by [45]:
$$\Delta w_{ji} = \eta\left(\rho_{ji}^{+} - \rho_{ji}^{-}\right) \qquad (2.12)$$
where ρ+ji is the correlation between the states of PEs i and j in clamped mode, and ρ−ji is the correlation between the states of PEs i and j in free-running mode; both range from –1 to +1.
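The stochastic state update of equations (2.10) and (2.11) might be sketched as below, assuming a small symmetric weight matrix with zero diagonal; the temperature and weights are illustrative.

```python
import numpy as np

def boltzmann_step(x, W, T, rng):
    """Flip a randomly chosen unit with probability 1/(1 + exp(-dE/T)),
    equation (2.11)."""
    j = rng.integers(len(x))
    dE = 2.0 * x[j] * (W[j] @ x)      # energy change of flipping x_j in eq. (2.10)
    if rng.random() < 1.0 / (1.0 + np.exp(-dE / T)):
        x[j] = -x[j]
    return x

rng = np.random.default_rng(2)
W = rng.normal(size=(5, 5))
W = (W + W.T) / 2.0                   # symmetric couplings
np.fill_diagonal(W, 0.0)              # no self-feedback
x = rng.choice([-1.0, 1.0], size=5)   # on/off states
for _ in range(1000):
    boltzmann_step(x, W, T=1.0, rng=rng)
print(x)
```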
2.3.7 Self-Organized Learning
Self-organising learning is usually considered to be just another way to describe
unsupervised learning. Self-organising learning is however a member of the group of
unsupervised learning algorithms together with competitive and reinforcement
learning. The most widely known model of self-organising networks is that of the
Self-Organising Maps or Kohonen Networks proposed by Teuvo Kohonen [45] as a
realisation of the ideas developed by Rosenblatt, von der Malsburg, and other
researchers.
A Kohonen network is an arrangement of PEs in a multi-dimensional lattice
(Par. 2.4.3). This structure enables the identification of the immediate neighbourhood
of every PE. Kohonen learning is based on a neighbourhood function φ(i,k)
representing the strength of the coupling between PE i and k during the training
process. The neighbourhood function is defined to equal one for all units i inside a neighbourhood of radius r around unit k, and zero for all other units. The adjustment
of the weight vectors follows the rule below:
Δwi = ηφ(i,k)(ξ – wi), for i = 1, …, m (2.13)
where m is the total number of PEs, η is a learning constant, and ξ is an input vector
selected using the desired probability distribution over the input space. The learning
process is repeated several times with the neighbourhood radius and the learning
constant being reduced according to a schedule. The value of the neighbourhood
function also decreases so that the influence of each PE upon its neighbours is
reduced. The effect of the schedule is to accelerate learning at the beginning of the
learning process and produce smaller corrections towards the end. The overall result
of Kohonen’s learning algorithm is that each PE learns to specialise on different
regions of input space and learns to produce the highest output for an input from such
a region.
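The Kohonen update of equation (2.13), for a one-dimensional lattice and the hard radius-r neighbourhood described above, might be sketched as follows; the schedule constants are illustrative.

```python
import numpy as np

def som_train(data, m=10, epochs=50, seed=3):
    """1D Kohonen map: the winner and its lattice neighbours move towards
    each input, equation (2.13) with phi(i,k) = 1 inside radius r."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(m, data.shape[1]))
    for epoch in range(epochs):
        eta = 0.5 * (1.0 - epoch / epochs)               # shrinking learning constant
        r = max(1, int(m / 2 * (1.0 - epoch / epochs)))  # shrinking radius
        for xi in rng.permutation(data):
            k = np.argmin(np.linalg.norm(W - xi, axis=1))   # best-matching PE
            for i in range(max(0, k - r), min(m, k + r + 1)):
                W[i] += eta * (xi - W[i])
    return W

data = np.random.default_rng(4).uniform(size=(300, 2))
print(som_train(data))   # weight vectors spread over the sampled input region
```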
2.3.8 Reinforcement Learning
Reinforcement learning is another important member of the group of unsupervised
learning algorithms. It is closely related to dynamic programming which is why it is
sometimes referred to as neurodynamic programming.
Reinforcement learning is, in essence, an input-output mapping achieved
through interaction with the environment (input space) in order to minimise a scalar
index of performance [7]. Unlike other learning processes, reinforcement learning aims at minimising a cost-to-go function, defined as the cumulative cost of actions
taken over a sequence of steps instead of the immediate cost. The function of the
network is to find these actions and feed them back to the environment.
Reinforcement learning is very appealing since it allows the network to interact
with its environment and develop the ability to increase its performance on the basis
of the outcomes of its experience from this interaction.
2.4 Major Types of Artificial Neural Networks
2.4.1 Feedforward Networks
Beyond any doubt the most popular and widely used ANN structure, the feedforward
network is a hierarchical design consisting of fully interconnected layers of PEs.
Generally, the operation of this network is mapping an n-dimensional input to an m-dimensional output; in other words, modelling a function F : ℝⁿ → ℝᵐ. This is
achieved by means of training on examples (x1,y1), (x2,y2), …,(xk,yk) of the mapping,
where yk = f(xk).
Figure 2.8: Structure of the feedforward artificial neural network. There can be more than one middle
or hidden layer [53].
The feedforward network is commonly used together with an error correction
algorithm such as backpropagation, gradient descent, or conjugate gradient descent.
The structure of the feedforward network, as shown in Fig. 2.8 comprises a number
of layers of PEs. There are three types of layers depending on their location and
function: input, hidden, and output. The connections between the layers are generally
feedforward during presentation of an input signal. However, during training the
network allows the backpropagation of error signals to the hidden units in order to
adjust the connection weights. Feedforward networks may have more than one hidden layer. Extra hidden layers allow more complex mappings but also require more information for training of the network. The choice is usually between an excessive number of PEs in one hidden layer and a smaller number of PEs spread over more than one hidden layer.
2.4.2 Recurrent Networks
The main difference between the ANN structure described above and that of the
recurrent networks is in the presence of feedback loops. A recurrent network may or
may not have input and output units since the outputs of a single layer of units can be
directed back to the inputs of the same units, i.e. every PE branches its output to the
inputs of all other units in the layer. Figures 2.9a and 2.9b show examples of
recurrent networks with and without input and output units. The other difference
between these two examples is in the presence of self-feedback loops . In Fig. 2.9a
each PE sends its output to the input of every other PE while in Fig. 2.9b the PEs also
receive their own outputs as inputs. Feedback loops usually pass through unit-delay elements, leading to nonlinear dynamical behaviour [32].
A particular example of a recurrent network is the Amari-Hopfield model or
Hopfield network [35]. The Hopfield network consists of a single layer of PEs which
receive an initial input vector. This input vector consists of component values which
may be either 1 or –1 and are fed one per PE. The initial output from each PE is fed
back to a branching node which fans out to every PE except the one from which the
output signal originates. The branching connections to every PE are weighted by N-1
weights, N being the total number of PEs in the network. The weighted signals are
summed and passed through a threshold activation function resulting in an updated
output. Hopfield networks normally operate asynchronously, i.e. the PEs are activated
one at a time and therefore a single updated output is produced at any given time,
there is a random input added to the weighted signal sum, and the new updated output
is held and used to update each future asynchronous activation of any PE [53].
Figure 2.9: a) Recurrent network without self-feedback connections, b) recurrent
network with self-feedback connections [32].
2.4.3 Self-Organizing Networks
Self-organizing networks or self-organizing maps (SOMs) are a special class of the
unsupervised ANNs group. SOMs were developed by Teuvo Kohonen [45]. The
learning process applied to these networks, as was described in a previous paragraph,
follows the competitive learning paradigm. SOMs construct topology-preserving
mappings of the input data in a way that the location of a PE carries semantic
information. The SOM can be considered as a specific type of clustering algorithm. A
large number of clusters are chosen and arranged on a square or hexagonal grid in one or two dimensions. This grid is in essence a lattice of PEs forming the SOM's single computational layer. Input patterns representing similar examples are mapped to
nearby nodes of the grid. Figure 2.10 illustrates the basic SOM structure.
Figure 2.10: Structure of a two-dimensional Self-Organising Map [32].
2.4.4 Radial Basis Function Networks and Time Delay Neural Networks
Radial Basis Function Networks (RBFNs) and Time Delay Neural Networks
(TDNNs) are two different ANN topologies with characteristics which separate them
from other classes of ANNs. The RBFNs are powerful network structures which
construct global approximations to functions using combinations of Radial Basis
Functions (RBFs) centred around weight vectors [54]. The basic RBFN structure is
shown in Fig. 2.11. A non-linear basis function is centred around each hidden node
weight vector. Hidden nodes have an adaptable range of influence or receptive field.
The output of the hidden nodes is a radial function of the distance between each
pattern vector and each hidden node weight vector.
The RBFN structure’s original motivation was in terms of functional
approximation techniques, regression and regularisation, and biological pattern
formation. The RBFN structure was chosen after a series of tests as the basic ANN
structure for the GEMNet II system for ore grade estimation. Chapter 3 gives a more in-depth discussion of RBFNs and the reasons behind their choice as the building block of the GEMNet II system.
Figure 2.11: Basic structure of the Radial Basis Function Network [53].
TDNNs are based on ordinary time delays to perform temporal processing
[50, 105]. The TDNN (Fig. 2.12a) is a multi-layered feedforward ANN whose PEs
are replicated across time. The building block of a TDNN is a PE whose inputs are
delayed in time. The activation of a PE is computed by passing the weighted sum of
its inputs through an activation function like a threshold or sigmoid function. The
overall behaviour of the network is modified through the introduction of delays. The
M inputs of a PE are each delayed by N time steps. Hidden PEs receive M * N delayed inputs plus M “undelayed” inputs, a total of M * (N+1) inputs. However, only the
hidden PEs activated at any given time step have connections to the inputs with all
the other units having the same connection pattern but shifted to a later point in time
according to their delay position in time.
TDNNs are used for position-independent recognition of features within larger patterns. TDNNs are trained on time-position independent detection of sub-patterns, a feature that makes them independent of error-prone pre-processing algorithms for time alignment. They are used to capture the concept of time symmetry as encountered in the recognition of phonemes using frequency-time images known as spectrograms (Fig. 2.12b).
Figure 2.12: The concept of Time Delay Neural Networks for speech recognition [50].
2.4.5 Fuzzy Neural Networks
Fuzzy logic and systems can be used in conjunction with ANNs in more than one
way to provide solutions for control problems, decision making, and pattern
recognition. The most common way of integrating the two technologies is the fuzzy
logic implementation by ANNs leading to neuro-fuzzy systems.
Fuzzy systems provide means of capturing uncertainty. Uncertainty is
inherent in almost every real-world problem. The essential characteristics of fuzzy
logic are as follows [117]:
• Exact reasoning is viewed as a limiting case.
• Everything is a matter of degree.
• Inference is viewed as the process of propagation of elastic constraints.
• Any logical system can be fuzzified.
The integration of ANNs with fuzzy systems results in a Fuzzy Neural Network (FNN) of one of the following types [93]:
• FNN with crisp number of inputs and fuzzy weights.
• FNN with fuzzy set input signals and crisp weights.
• FNN with both fuzzy input signals and fuzzy weights.
The building block of an FNN is a fuzzy version of the PE described in Paragraph
2.2.1. A possible FNN structure consists of a layered net with an input layer
implementing membership functions, a first hidden layer implementing fuzzy rules
and combining membership functions, a second hidden layer combining fuzzy values,
and an output layer providing defuzzification. Figure 2.13 illustrates an approach to FNN implementation.
Figure 2.13: An approach to FNN implementation.
2.5 Conclusions
The discussion given in this chapter covered the basic concepts of artificial neural
networks as well as major types of ANN learning and architecture. The potential of
this technology became clear through examples of ANNs presenting special
characteristics and areas of application. The ever-increasing research activity in this field has also been discussed, showing that ANNs are becoming more and more popular as tools for solving an increasing number of real-world problems. ANN technology is finding its way into a number of diverse engineering and decision-making problems in the mining industry, as will be demonstrated in Chapter 4 through a set of successful examples.
3. Radial Basis Function Networks
3.1 Introduction
In this chapter the discussion continues with an analysis of a unique type of ANN, the Radial Basis Function Network (RBFN). Radial Basis Functions (RBFs) were initially used for
solving problems of real multivariate interpolation. Work on this subject has been
extensively surveyed by Powell [79]. The theory of RBFs is one of the main fields of
study in numerical analysis [96, 80].
RBFNs are very simple structures. Their design is in essence a problem of
curve fitting in a high-dimensional space. Learning in RBFNs means finding the
hyper-surface in multi-dimensional space that fits the training data in the best
possible way. This is clearly different from most of the ANN design principles
discussed in the previous chapter. Function approximation and pattern classification
are the main areas of RBFNs application. One of the main advantages of RBFNs lies
in their strong scientific foundation. RBFs have been motivated by statistical pattern
processing theory, regression and regularisation, biological pattern formation, and
mapping in the presence of noisy data [96]. Therefore, RBFNs have inherited a wide
range of useful theoretical properties, which have been used to provide solutions to a
much wider range of problems than the RBFs themselves.
The choice of RBFNs in the development of GEMNet II was based on these
theoretical properties, which will be further discussed over the next paragraphs, but
also on results from experiments carried out using data from real mineral deposits.
The use of RBFNs also helped achieve one of the main aims of GEMNet II, which is to provide a fast alternative to existing grade estimation techniques. In the tests carried out at the beginning of the project, the speed of development of RBFNs was unmatched by any other architecture tested.
3.2 Radial Basis Function Networks – Theoretical Foundations
3.2.1 Overview
The basic principles of RBFs and of the networks derived from them will be discussed in this section. For the purposes of this thesis, the discussion will concentrate on the theory behind the use of RBFs for interpolation problems and not for pattern classification.
The transition from the original RBF methods for interpolation to RBFNs will also be
analysed.
3.2.2 Multivariable Interpolation
RBFs were first introduced to the problem of multivariable interpolation as an
approach to dealing with irregularly positioned data points. The problem of
multivariable interpolation is as follows [79]:
Given m different points $\{x_i;\ i = 1, 2, \ldots, m\}$ in $\mathbb{R}^n$, and m real numbers $\{f_i;\ i = 1, 2, \ldots, m\}$, one has to calculate a function s from $\mathbb{R}^n$ to $\mathbb{R}$ that satisfies the interpolation conditions

$$s(x_i) = f_i, \qquad i = 1, 2, \ldots, m. \qquad (3.1)$$

The choice of s from a linear space that depends on the positions of the data points forms the approach of RBFs. RBFs have the general form:

$$\phi(\lVert x - x_i \rVert), \qquad x \in \mathbb{R}^n, \quad i = 1, 2, \ldots, m \qquad (3.2)$$

where φ is the basis function from $\mathbb{R}^+$ to $\mathbb{R}$ and the norm of $\mathbb{R}^n$ is Euclidean. Several interpolation methods have been considered in which s has the form:

$$s(x) = \sum_{i=1}^{m} \lambda_i\, \phi(\lVert x - x_i \rVert), \qquad x \in \mathbb{R}^n. \qquad (3.3)$$
With the condition of the matrix

$$A_{ij} = \phi(\lVert x_i - x_j \rVert), \qquad i, j = 1, 2, \ldots, m, \qquad (3.4)$$

being non-singular, condition (3.1) defines the coefficients $\{\lambda_i;\ i = 1, 2, \ldots, m\}$ uniquely. The matrix A is normally called the interpolation matrix. These methods have a very useful property, proved by Micchelli [62]: if the data points are all different then, for all positive integers m and n, A is always non-singular. This theory applies to many choices of φ. However, in the case of basis functions of the form

$$\phi(r) = r^l, \qquad r \geq 0, \qquad (3.5)$$

the theory applies only under conditions concerning the degree l and the dimension of the input space m0. The class of RBFs covered by Micchelli's theorem includes the following functions:

1. Multiquadratics:

$$\phi(r) = (r^2 + c^2)^{1/2} \quad \text{for some } c > 0 \text{ and } r \in \mathbb{R} \qquad (3.6)$$
2. Inverse Multiquadratics:

$$\phi(r) = \frac{1}{(r^2 + c^2)^{1/2}} \quad \text{for some } c > 0 \text{ and } r \in \mathbb{R} \qquad (3.7)$$

3. Gaussian Functions:

$$\phi(r) = \exp\!\left(-\frac{r^2}{2\sigma^2}\right) \quad \text{for some } \sigma > 0 \text{ and } r \in \mathbb{R} \qquad (3.8)$$

4. Thin Plate Splines:

$$\phi(r) = r^2 \ln(r), \qquad r \in \mathbb{R} \qquad (3.9)$$
It should be noted that multiquadratics and thin plate splines increase when moving away from the centre of the basis function, while Gaussian functions and inverse multiquadratics decrease. The thin plate splines are interpolating functions derived by
variational methods [22, 61].
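The four basis functions of equations (3.6) to (3.9) are straightforward to express in code, as the sketch below shows; the parameter values are arbitrary.

```python
import numpy as np

def multiquadric(r, c=1.0):            # equation (3.6): grows with r
    return np.sqrt(r**2 + c**2)

def inverse_multiquadric(r, c=1.0):    # equation (3.7): decays with r
    return 1.0 / np.sqrt(r**2 + c**2)

def gaussian(r, sigma=1.0):            # equation (3.8): decays with r
    return np.exp(-r**2 / (2.0 * sigma**2))

def thin_plate_spline(r):              # equation (3.9): r^2 ln(r), taken as 0 at r = 0
    return np.where(r > 0, r**2 * np.log(np.maximum(r, 1e-300)), 0.0)

r = np.linspace(0.0, 3.0, 7)
for phi in (multiquadric, inverse_multiquadric, gaussian, thin_plate_spline):
    print(phi.__name__, np.round(phi(r), 3))
```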
3.2.3 The Hyper-Surface Reconstruction Problem
The interpolation technique described above suffers from a very serious problem: If
the number of data points in the training sample is greater than the number of degrees
of freedom of the underlying physical process, then fitting as many RBFs as the
number of data points leads to over-determination of the hyper-surface reconstruction
problem [11]. This is known in neural network terms as overfitting or overtraining.
Allowing an RBFN to reach this stage means degradation of its generalisation performance.
The problem of learning the hyper-surface defining the output in terms of the
input can be either well-posed or ill-posed. These terms have been in use in applied
mathematics for over a century. An unknown mapping f between a domain X and an
output range Y (both taken as metric spaces) is considered. Reconstructing this
mapping f is said to be well-posed when the following three conditions are satisfied
[101, 66, and 44]:
1. Existence: for every input vector $x \in X$ there is an output y = f(x), where $y \in Y$.
2. Uniqueness: for any pair of input vectors x, t ∈ X, f(x) = f(t) only if x = t.
3. Continuity: also referred to as stability, continuity requires that for any ε > 0 there exists δ = δ(ε) such that if ρX(x, t) < δ then ρY(f(x), f(t)) < ε, where ρ(⋅,⋅) is the distance between the two arguments in their respective spaces [32].
A problem is ill-posed when any of these conditions is not satisfied. Normally, a
physical phenomenon, such as orebody deposition, is a well-posed problem.
Learning from drillhole data is, however, an ill-posed problem because:
• For any pair of input vectors x, t there can be f(x) = f(t) even when x ≠ t.
• It is well known that drillhole and other physical samples from mineral deposits
contain physical sampling errors leading to the possibility for the neural network
to produce an output outside the range Y for a specified input. That means
violation of the continuity criterion.
The second of these reasons has a more serious impact on solving the problem, as lack
of continuity means that the computed input-output mapping does not represent the
true solution.
The issues of hyper-surface reconstruction with RBFs being an ill-posed
problem and leading to overfitting need to be addressed. A number of methods have
been developed for making an ill-posed problem into a well-posed one, as well as
preventing overfitting. The most important one, regularisation, will be discussed in
the following paragraph.
3.2.4 Regularisation
Regularisation is a method developed by Tikhonov in 1963 [102] for solving ill-
posed problems. Its use has been mostly explored in approximation theory.
Regularisation aims at overcoming the lack of continuity of an ill-posed problem by
means of an auxiliary nonnegative functional embedding prior information about the
solution. Such information is commonly the assumption that similar inputs
correspond to similar outputs. Tikhonov’s theory involves two terms:
1. Standard Error Term: denoted by Es(F), represents the standard error or distance between the desired response (target output) di and the actual response yi for the training examples i = 1, 2, …, N. The standard error term is defined as:

$$E_s(F) = \frac{1}{2}\sum_{i=1}^{N}(d_i - y_i)^2 = \frac{1}{2}\sum_{i=1}^{N}\left[d_i - F(x_i)\right]^2 \qquad (3.10)$$
2. Regularising Term: denoted by Ec(F), provides the means for embedding geometrical information about the approximating function F(x) into the solution. This term is defined as:

$$E_c(F) = \frac{1}{2}\,\lVert \mathbf{D}F \rVert^2 \qquad (3.11)$$
where D is a linear differential operator. It is in this operator that prior information
about the form of the solution is embedded and therefore its selection depends on the
problem at hand.
Regularisation provides a way of reducing the number of basis functions
when fitting RBFs by adding a penalty term described above as the regularising term
[83]. The principle of regularisation is the following:
Find the function Fλ(x) that minimises the Tikhonov functional E(F), defined by

E(F) = Es(F) + λEc(F)

where λ is a positive real number called the regularisation parameter. The choice of
λ is very crucial as it controls the balance of contribution from the sample data and
the prior information. It can also be seen as an indicator of the sufficiency of the
given data samples to specify the solution to the above minimisation problem.
The implementation of the regularisation theory leads to the regularisation
network [77]. As shown in Fig. 3.1, it consists of three layers. The first layer consists
of a number of input nodes equal to the dimension mo of the input vector x. The
second or hidden layer consists of non-linear nodes connected directly to all the input
nodes. The number of hidden nodes equals the number of samples N.
Figure 3.1: Regularisation network [32].
The activation function used in the hidden nodes is a Green's function G(x, xi). One of the most common Green's functions is the multivariate Gaussian function:

$$G(x, x_i) = \exp\!\left(-\frac{1}{2\sigma_i^2}\,\lVert x - x_i \rVert^2\right) \qquad (3.12)$$
where xi denotes the centre of the function and σi its width or receptive field. The unknown coefficients wi are defined as follows:

$$w_i = \frac{1}{\lambda}\left[d_i - F(x_i)\right], \qquad i = 1, 2, \ldots, N \qquad (3.13)$$

The minimising solution, denoted as Fλ(x), is given by:

$$F_\lambda(x) = \sum_{i=1}^{N} w_i\, G(x, x_i) \qquad (3.14)$$
The solution reached by the regularisation network exists in an N-dimensional
subspace of the space of smooth functions, the set of Green’s functions constituting
the basis for this subspace [77]. As Poggio and Girosi point out, the regularisation
network has three useful properties:
1. It is a universal approximator, as it can approximate arbitrarily well any multivariate continuous function, given a sufficient number of hidden nodes.
2. It has the best-approximation property, i.e. given an unknown non-linear function
f, there always exists a choice of coefficients that approximate f better than all
other choices.
3. It provides the optimal solution. In other words, the regularisation network
minimises a functional that measures the solution’s deviation from its true value
as represented by the training data.
3.3 Radial Basis Function Networks
3.3.1 General
The structure described above as the regularisation network has a very important weakness: as the number of basis functions initially depends on the number of training samples, the network produced can be very expensive in computational terms.
can be easily understood by considering the computation of the network’s linear
weights, which requires inversion of a very large matrix. Therefore there is a need for
reducing the complexity of the network leading to an approximation of the
regularised solution.
This is achieved by the introduction of a simplified version of the
regularisation network, the generalised radial basis function network. From this point
on, it will be assumed that RBFNs are generalised RBFNs. RBFNs involve searching
for a sub-optimal solution in a lower-dimensional space. This solution approximates
the regularised solution discussed before.
3.3.2 RBF Structure
Figure 3.2 illustrates the basic structure of the (generalised) RBFN. The first obvious
difference between this network and that of Fig. 3.1 is in the number of hidden layer
basis functions. In the RBFN there are m1 RBFs, typically fewer than the number of training samples, while in the regularisation network there were N RBFs, with N
equal to the number of training samples. Other structural differences include the
number of weights being also reduced to m1, and the introduction of a bias applied to
the output unit.
Figure 3.2: Structure of generalised RBFN [32].
Significant differences, not so obvious from the figures, concern the centre
positions and receptive fields of the RBFs as well as the linear weights associated
with the output layer. These are all unknown parameters and have to be learned by
the RBFN during training. In the regularisation network, only the linear weights are
unknown and require training. In the next paragraph, the function of the RBFN will
be further analysed. Special attention is given to the way of initially positioning the
RBF centres during initialisation and the RBF learning algorithms.
3.3.3 RBF Initialisation and Learning
For an RBFN to be able to receive training samples and function as a hyper-surface
reconstruction network, a number of its parameters need to be calculated. These
parameters include:
• The linear weights between hidden and output layer.
• The bias to the output units.
• The centres of the hidden layer RBFs.
There are a number of methods for RBFN initialisation and learning. The most
common methods are:
1. Random Centre Selection: it is the simplest of the methods. The centres are
randomly chosen from the training data set. It is a common method used when the
training data represent well the problem at hand. Learning using this approach is
concentrated in adjusting the linear weights between the hidden and output layer.
This is achieved using the pseudoinverse method [11]. The weights are calculated
using the formula below:
$$w = G^{+} d \qquad (3.15)$$

where d represents the target output vector of the training data set and G+ is the pseudoinverse of the matrix G, defined as

$$G = \{g_{i,j}\} \qquad (3.16)$$
where gi,j is the output of RBF i when presented with input vector j. Golub and Van Loan [28] provide an in-depth discussion of the computation of a pseudoinverse matrix.
2. Self-Organised Centre Selection: the learning method described above requires a data set representative of the problem at hand. There is no guarantee that the randomly selected centres reflect accurately the distribution of the data points. To overcome this problem, a clustering algorithm is used that creates homogeneous groups of data from the given data set. There are a number of clustering algorithms; however, in the case of RBFNs, the k-means clustering algorithm is the most commonly used [23]. Moody and Darken [65] describe the use of the k-means clustering algorithm. The number of centres k is set in advance. With the number of centres set, the algorithm proceeds with the following steps [9]:
I. The values of the initial RBF centres tk(0) are set randomly. These values need to be different from each other.
II. A vector x is selected from the data set and passed to the algorithm. The
index k(x) of the best-matching centre for the vector is calculated using the
minimum-distance Euclidean criterion:
$$k(x) = \arg\min_{k} \lVert x(n) - t_k(n) \rVert, \qquad k = 1, 2, \ldots, m_1 \qquad (3.17)$$
where tk(n) is the kth centre at iteration n.
III. The RBF centres are adjusted using the following rule:
$$t_k(n+1) = \begin{cases} t_k(n) + \eta\,[x(n) - t_k(n)], & k = k(x) \\ t_k(n), & \text{otherwise} \end{cases} \qquad (3.18)$$
where η is the learning-rate parameter receiving values between 0 and 1.
This parameter controls the speed of learning, i.e. the degree of
adjustment on the particular network parameter, in this case, the RBF
centres.
IV. The iteration pointer n is increased by 1 and the algorithm loops back to
step II. This process continues until the centres become stable.
The self-organised stage described above is followed by a supervised learning
stage, which allows the calculation of the linear weights between the hidden and
output layer. The overall approach depends largely on the initial selection of
centres. Several enhancements to the initial centre selection have been introduced
in order to avoid the situation where some initial centres get trapped in regions of
the input space with a low density of data points [14, 15]. An advanced version
of this learning method is used in the development stages of the GEMNET II
system.
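Steps I to IV amount to sequential k-means, which can be sketched as follows (Python/NumPy; the fixed learning rate and the names are assumptions of the sketch, whereas in practice η would be decayed with n so that the centres actually stabilise):

    import numpy as np

    def kmeans_centres(X, k, eta=0.1, n_iter=5000, seed=0):
        rng = np.random.default_rng(seed)
        t = X[rng.choice(len(X), size=k, replace=False)].astype(float)  # step I
        for n in range(n_iter):
            x = X[rng.integers(len(X))]                    # step II: present a sample
            kx = np.argmin(np.linalg.norm(x - t, axis=1))  # Eq. 3.17: best match
            t[kx] += eta * (x - t[kx])                     # Eq. 3.18: move the winner
        return t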
3. Orthogonal Least Squares: the OLS algorithm involves sequential addition of
new RBFs to a network, which starts with a single basis function. Each new RBF
is positioned in turn at each candidate data point and the linear weights are calculated
for each position. The centre that gives the smallest residual error is retained. This way the
number of RBFs increases step by step. The selection of a candidate data point for
centre positioning is done by constructing a set of orthogonal vectors in the space
S spanned by the hidden unit activation vectors for each training pattern. The data
point that produces the greatest reduction in the residual error is chosen as the
location of the new RBF centre. It is important to stop the algorithm well before
every data point is selected to ensure good generalisation.
4. Supervised Centre Selection: the basis of this method is the least-mean-square
algorithm (LMS). A supervised learning process based on the LMS algorithm sets
all the free parameters of the RBFN. The LMS algorithm takes the form of a
gradient descent procedure. Initially, a cost function is defined as follows:
$E = \frac{1}{2} \sum_{j=1}^{N} e_j^{2}$ (3.19)
where N is the number of training samples, and $e_j$ is the error defined as:
$e_j = d_j - F^{*}(x_j) = d_j - \sum_{i=1}^{M} w_i \, G\left( \| x_j - t_i \|_{C_i} \right)$ (3.20)
where $C_i$ is the norm-weighting matrix. The method aims at minimising E by
adjusting the free parameters of the network: the weights $w_i$, the centres $t_i$, and
the receptive fields $\Sigma_i^{-1}$. The adjustments to these three parameters are calculated
below [32]:
Linear Weights Adjustment:
$\frac{\partial E(n)}{\partial w_i(n)} = \sum_{j=1}^{N} e_j(n) \, G\left( \| x_j - t_i(n) \|_{C_i} \right)$ (3.21)
Centres Position Adjustment:
$\frac{\partial E(n)}{\partial t_i(n)} = 2 w_i(n) \sum_{j=1}^{N} e_j(n) \, G'\left( \| x_j - t_i(n) \|_{C_i} \right) \Sigma_i^{-1} \left[ x_j - t_i(n) \right]$ (3.22)
Receptive Fields Adjustment:
$\frac{\partial E(n)}{\partial \Sigma_i^{-1}(n)} = -w_i(n) \sum_{j=1}^{N} e_j(n) \, G'\left( \| x_j - t_i(n) \|_{C_i} \right) Q_{ji}(n)$ (3.23)
where $Q_{ji}(n) = [x_j - t_i(n)][x_j - t_i(n)]^{T}$.
The update rules for the three parameters, based on the three learning-rate
parameters $\eta_1$, $\eta_2$, and $\eta_3$, are given below:
Linear Weights Update Rule:
$w_i(n+1) = w_i(n) - \eta_1 \frac{\partial E(n)}{\partial w_i(n)}$ (3.24)
Centres Positions Update Rule:
$t_i(n+1) = t_i(n) - \eta_2 \frac{\partial E(n)}{\partial t_i(n)}$ (3.25)
Receptive Fields Update Rule:
$\Sigma_i^{-1}(n+1) = \Sigma_i^{-1}(n) - \eta_3 \frac{\partial E(n)}{\partial \Sigma_i^{-1}(n)}$ (3.26)
It should be noted that this gradient-descent procedure for RBFNs does not
involve error back-propagation.
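For illustration, the above update equations can be condensed into the following Python/NumPy sketch; it is deliberately simplified to isotropic Gaussian units with scalar widths instead of the full norm-weighting matrices, and it applies the true gradients of E, so it is an assumption-laden sketch rather than the exact procedure of [32]:

    import numpy as np

    def gradient_rbfn(X, d, t, sigma, w, etas=(0.01, 0.001, 0.001), n_epochs=500):
        # Jointly adapt weights, centres and widths by gradient descent on
        # E = 1/2 * sum_j e_j^2 (Eqs. 3.19-3.26, specialised to scalar widths).
        eta1, eta2, eta3 = etas
        for _ in range(n_epochs):
            diff = X[:, None, :] - t[None, :, :]       # x_j - t_i, shape (N, M, dim)
            sq = np.sum(diff**2, axis=2)               # squared distances (N, M)
            phi = np.exp(-sq / (2.0 * sigma**2))       # hidden unit outputs (N, M)
            e = d - phi @ w                            # errors e_j (Eq. 3.20)
            grad_w = -phi.T @ e                                      # dE/dw_i (cf. Eq. 3.21)
            grad_t = -((w * phi * e[:, None])[..., None]
                       * diff / sigma[:, None]**2).sum(axis=0)       # cf. Eq. 3.22
            grad_s = -w * (phi * e[:, None] * sq).sum(axis=0) / sigma**3  # cf. Eq. 3.23
            w -= eta1 * grad_w                         # Eq. 3.24
            t -= eta2 * grad_t                         # Eq. 3.25
            sigma -= eta3 * grad_s                     # Eq. 3.26
        return w, t, sigma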
5. Regularisation Based Learning: the final RBF learning method described is
based on regularisation theory. Yee [116] provides the justification for this RBF
design procedure, which is based on four main elements:
I. A radial-basis function, G, admissible as the kernel of a mean-square
consistent Nadaraya-Watson regression estimate (NWRE) [68, 108].
II. An input norm-weighting matrix $\Sigma^{-1}$, common to all centres, with entries
$\Sigma = \mathrm{diag}(h_1, h_2, \ldots, h_{m_0})$ (3.27)
where $h_1, h_2, \ldots, h_{m_0}$ are the bandwidths of a consistent NWRE kernel G
for each dimension of the input space. These bandwidths are given as the
product of the sample variance of the ith input variable estimated from the
available training data and a scale factor determined using a cross-
validation procedure.
III. Regularised strict interpolation for the training of the linear weights using
the following equation:
$\mathbf{w} = (\mathbf{G} + \lambda \mathbf{I})^{-1} \mathbf{d}$ (3.28)
where G is Green’s matrix and I is the N-by-N identity matrix (a short numerical sketch of this step is given after this list).
IV. The choice of the regularisation parameter λ and the scale factors is
achieved using a method such as ordinary cross-validation (CV).
Generally, larger values of λ correspond to a higher level of noise assumed
in the measured data. In a similar manner, the larger the value of a specific scale
factor, the less important is the associated input dimension for the
variation of the network output in relation to variations in the input. In
other words, the scale factors can be used for ranking the significance of
the input variables and can aid the reduction of the input space
dimensionality.
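A sketch of the regularised strict-interpolation solve of element III (Python/NumPy; the Gaussian Green's function with a single bandwidth and the function name are assumptions of the example) is given below; λ and the bandwidth would in practice be chosen by cross-validation, as stated in element IV:

    import numpy as np

    def regularised_interpolation(X, d, sigma, lam):
        # Every training sample acts as a centre; solve (G + lambda*I) w = d (Eq. 3.28).
        sq = np.sum((X[:, None, :] - X[None, :, :])**2, axis=2)
        G = np.exp(-sq / (2.0 * sigma**2))             # Green's matrix, N x N
        return np.linalg.solve(G + lam * np.eye(len(X)), d)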
3.4 Function Approximation with RBFNs
3.4.1 General
In this section, the discussion continues with an evaluation of the function
approximation capabilities of RBFNs. It will be shown that the family of functions
realisable by RBFNs is broad enough to uniformly approximate any continuous function.
input space dimension and the amount of input data on the RBFN approximation
properties will also be analysed.
3.4.2 Universal Approximation
The universal approximation theorem for RBFNs, as stated by Park and Sandberg
[74], opened the way for their use in function approximation problems, which were
commonly approached using Multi-Layered Perceptrons. The work of Park and
Sandberg [74, 73], Cybenko [19], and Poggio and Girosi [77] led to a new model for
function approximation based on generalised RBFNs. Specifically, the theorem can
be stated as follows:
Let $G: \mathbb{R}^{m_0} \rightarrow \mathbb{R}$ be an integrable bounded function such that G is continuous and
$\int_{\mathbb{R}^{m_0}} G(x) \, dx \neq 0$.
Let $\Im_G$ denote the family of RBFNs consisting of functions $F: \mathbb{R}^{m_0} \rightarrow \mathbb{R}$ represented by
$F(x) = \sum_{i=1}^{m_1} w_i \, G\left( \frac{x - t_i}{\sigma} \right)$
where $\sigma > 0$, $w_i \in \mathbb{R}$ and $t_i \in \mathbb{R}^{m_0}$ for $i = 1, 2, \ldots, m_1$. For any continuous input-output
mapping function $f(x)$ there is an RBFN with a set of centres $\{t_i\}_{i=1}^{m_1}$ and a common receptive
field $\sigma > 0$ such that the input-output mapping function $F(x)$ realised by the RBFN is close to
$f(x)$ in the $L_p$ norm, $p \in [1, \infty]$.
The universal approximation theorem provides the theoretical basis for the design of
RBFNs for practical applications.
3.4.3 Input Dimensionality
A very critical issue in the use of RBFNs as function approximators is the dimension
of the input space and its effect on the intrinsic complexity of the approximating
function(s). It is generally accepted that this complexity increases exponentially with the
ratio $m_0/s$, where $m_0$ is the input dimensionality and s is a smoothness index measuring
the number of constraints imposed on the approximating function. Therefore, for the
RBFN to be able to achieve a sensible rate of convergence, the smoothness index s
needs to be increased with the number of parameters in the approximating function.
However, the space of approximating functions attainable with RBFNs becomes
increasingly constrained as the input dimensionality is increased [32].
Increased dimensionality also has a great effect on the computational overhead
incurred during training of the RBFN. The dimension of the input space has a direct
control over the RBFN architecture – the number of input nodes, the number of
RBFs, and consequently, the number of linear weights between hidden and output
layer. Therefore, any increase in the input dimensionality causes an increase in
computer memory and power requirements, and an almost certain increase in
development time. The most common ways of addressing the high input
dimensionality for a given problem are to identify and ignore the inputs that do not
contribute considerably to the output or to try to combine inputs that present a high
correlation. Another way of reducing the input dimensionality, which is not always
applicable though, is to try and break a complex problem into a number of low
dimensionality problems that can be more effectively addressed using RBFNs.
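As a toy illustration of the first two strategies, candidate inputs could be screened as in the following sketch (Python/NumPy; the thresholds and function names are arbitrary assumptions, and linear correlation is only a crude proxy for input relevance):

    import numpy as np

    def prune_inputs(X, y, dup_thresh=0.95, rel_thresh=0.05):
        # Drop one input of every near-duplicate pair, then drop inputs whose
        # linear correlation with the target is negligible (crude screening).
        keep = list(range(X.shape[1]))
        corr = np.corrcoef(X, rowvar=False)
        for i in range(X.shape[1]):
            for j in range(i + 1, X.shape[1]):
                if i in keep and j in keep and abs(corr[i, j]) > dup_thresh:
                    keep.remove(j)
        return [i for i in keep
                if abs(np.corrcoef(X[:, i], y)[0, 1]) > rel_thresh]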
3.4.4 Comparison of RBFNs and Multi-Layer Perceptrons
Comparison of RBFNs with MLPs is inevitable since they are both used for similar
applications and both are universal approximators. This comparison also leads to
better understanding of these two ANN architectures. The differences between the
two architectures are both structural (concerning the topology of the network) and
functional (concerning the operation and use of the network):
Structural Differences:
• RBFNs have a single hidden layer. MLPs can have more than one hidden layer.
• The hidden units of an RBFN differ in nature from its output units, whereas MLP
hidden units follow the same neuron model as the output units.
Figure 3.3: Illustration of input space dissection performed by the RBF and MLP
networks [54].
Functional Differences:
• RBFNs construct local approximations to non-linear input-output mappings,
while MLPs construct global approximations.
• The output layer of an RBFN is always linear, while the MLP output layer can be
non-linear depending on the application.
• RBF hidden units calculate the Euclidean norm between the input vector and their
centre, while MLP hidden units compute the inner product of the input vector and
their synaptic weight vector.
• MLPs exploit the logistic non-linearity to create combinations of hyperplanes to
dissect pattern space into separable regions. RBFNs dissect pattern space by
modelling clusters of data directly and, therefore, are more concerned with data
distributions (Fig. 3.3) [54].
3.5 Suitability of RBFNs for Grade Estimation
RBFNs, like most ANN structures, have certain properties that establish them as
a natural choice for grade estimation. However, RBFNs also have a number of
additional useful properties that give them an advantage over other ANN
architectures for this specific problem.
The first of these properties, and possibly the most important one, is that RBFNs
construct local approximations to input-output mappings. It is well known that a
mineral ore deposit is a localised phenomenon. Modelling of a deposit’s grade in 3D
space using drillhole data can be considered to be a problem of hypersurface
reconstruction in 3D space, with this hypersurface consisting of a number of zones
that need to be locally approximated. Deposits commonly present localised
behaviour, i.e. points close to each other within one area of a deposit tend to have
similar grades. Clearly, this area very rarely extends to the entire deposit and,
therefore, the approach of fitting RBFs in specific locations can be advantageous.
These locations are found by clustering of the drillhole data in order to identify these
areas of similar ore grade behaviour.
RBFNs provide an approach to dealing with ill-posed problems due to the
properties that they inherit from regularisation theory. Grade estimation is an ill-
posed problem, even though the underlying phenomenon – the orebody deposition –
is well-posed. As was shown in Par. 3.2.3, reconstructing a deposit’s grade as a
hypersurface in the space derived from the drillhole data information, is an ill-posed
problem, hence RBFNs should be the choice of ANN for this task.
RBFNs also allow the calculation of reliability measures, such as the
extrapolation measure and confidence limit. Due to the localised nature of
approximation performed by RBFNs, it is possible to measure the local data density
for a given point x in the input space as an index of extrapolation [52]. Confidence
limits for the model prediction can also be calculated from the local confidence
intervals developed for each RBF unit using a weighted average of the latter. These
reliability measures were first introduced by Leonard et al. [71, 70] and incorporated in a
new ANN architecture that computes its own reliability, called the Validity Index
network (VI). Leonard et al. used a two-stage approach based on data densities
derived using Parzen windows [75], and an interpolation formula used for
determining the densities at arbitrary test points. These measures are now standard to
most of the commercial neural network simulators that provide RBFN development
options. Further examination of the use of reliability measures will be presented in
Chapter 7 with the discussion over the development of the GEMNET II system.
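As an illustration, an extrapolation index of the kind described can be approximated with a Parzen-window density estimate; in the sketch below (Python/NumPy; the Gaussian window, the bandwidth h and the names are assumptions) low values flag points far from the training data:

    import numpy as np

    def extrapolation_index(x, X_train, h=1.0):
        # Parzen-window estimate of the local training-data density at x;
        # low values flag extrapolation beyond the data (after Leonard et al.).
        sq = np.sum((X_train - x)**2, axis=1)
        dim = X_train.shape[1]
        return np.mean(np.exp(-sq / (2.0 * h**2))) / ((2 * np.pi)**(dim / 2) * h**dim)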
Finally, another advantage of RBFNs over other ANN architectures that is
derived from their theoretical properties, is their speed of development. In the case of
low input dimensionality, RBFN learning is expected to be much faster than that of
any other ANN architecture used for the same problem. The author approached ore grade
estimation using an input space of maximum four dimensions (Easting, Northing,
Elevation, and sample Length), a number low enough for the networks to be very fast
to develop.
In later chapters, the suitability of RBFNs for the problem of grade estimation
will be further demonstrated using experimental results on a large number of case
studies.
4. Mining Applications of Artificial Neural Networks
4.1 Overview
Artificial Intelligence (AI) tools have been in use for years in a number of mining
related applications. Expert and knowledge based systems, probably the most popular
AI tools, have found their way into a number of computer-based systems supporting
everyday mining operations as well as production of mining equipment. In recent
years, AI has provided tools for optimizing operations and equipment selection,
problems involving large amounts of information that humans cannot easily cope
with in the process of decision-making. These AI systems together with an ever-
increasing number of sophisticated purpose-built computer software packages have
created a very favorable environment for the introduction of yet another powerful AI
tool, the Artificial Neural Networks.
In the ‘90s the mining industry was introduced to a number of ANN
based systems, some of which found their way to fully commercialized products, as
will be illustrated by some examples in this chapter. It should be noted however that
these examples are very few considering the total number of applications at the
research level, and the overall research effort carried out at universities and research
institutes around the world.
The applications described in this chapter are divided into two groups. The first
group will include examples of ANN systems for Exploration and Resource
Estimation. These systems have many common points with the GEMNet II system
developed by the author, and more importantly share the same aims. The second
group of applications includes examples covering other mining problems.
This grouping does not mean in any way that Exploration and Resource Estimation is
the most important of the mining tasks or that there are more ANN systems targeted
to this field of mining. The grouping as well as the selection of the examples was
purely based on the relevance of the applications to the subject of this thesis.
4.2 ANN Systems for Exploration and Resource Estimation
4.2.1 General
Exploration and resource estimation commonly involves the prediction of various
parameters characterizing a mineral deposit or a reservoir. The input data usually
comes in the form of samples with known positions in 3D space. The majority of the
ANN systems developed for these predictive tasks are based on the relationship
between modelled parameters and sample locations. The most common practice when
developing the training patterns set for an ANN, is to generate input-output pairs with
the input being the sample location and the desired output being the value of the
modeled parameter at that location. In other words, most of the ANN systems treat
the modeling of the unknown parameters as a problem of function approximation in
the sample co-ordinates space.
Some other systems, like GEMNet II, go a step further to exploit information
hidden in the relationship between neighboring samples. The estimation of a
parameter at a specific location in 3D space is, in this case, depending on information
from samples around that location. In fact, GEMNet II is trying to use both this and
the above approach wherever possible.
Most of the systems described in the following paragraphs work in a 2D input space
(Easting, Northing). They also share similar ANN architectures, usually based on the
MLP or RBF network.
4.2.2 Sample Location Based Systems
The first example is an MLP based ANN for ore grade/resource estimation developed
by Wu and Zhou [112]. The network architecture, as shown in Fig. 4.1, is an MLP
with four layers: an input layer, two hidden layers, and one output layer. The network
receives two inputs, the Easting and Northing of samples. The two hidden layers are
identical and have 28 units each. It is a relatively large network considering the
dimension of the input space (2D). However, the developers have used a fast learning
algorithm called the Dynamic Quick-Propagation (DQP) [113] that is based on the
quick-propagation algorithm [24] and a system for the determination of the hidden
layer size called Dynamic Node Creation [4]. The size of the network was, therefore,
determined through a learning process and should not be a cause for concern.
Figure 4.1: ANN for ore grade/resource estimation by Wu and Zhou [112].
This ANN has been tested on assay composites from a copper deposit. A set of 51
drillhole composites has been used to train the network over an area of 3600 square
meters. The results of the trained network have been compared with results from the
polygonal method (manual and computer based), inverse distance, and kriging. These
results were based on Hughes, Davis, and Davey [36]. Unfortunately, there was no
comparison of the grade/resources estimates with actual values. This limitation tends
to be a very common problem in most of these studies.
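For orientation, a present-day equivalent of such a sample-location network can be put together in a few lines; the sketch below uses scikit-learn's MLPRegressor with two 28-unit hidden layers, and the library, the training algorithm and the dummy data are assumptions of the example, not the DQP algorithm used by Wu and Zhou:

    import numpy as np
    from sklearn.neural_network import MLPRegressor

    # Hypothetical composites: (Easting, Northing) -> grade
    coords = np.random.rand(51, 2) * 60.0      # 51 samples over a 60 m x 60 m area
    grades = np.random.rand(51)                # placeholder grade values

    net = MLPRegressor(hidden_layer_sizes=(28, 28), activation='logistic',
                       max_iter=5000, random_state=0)
    net.fit(coords, grades)
    block_grades = net.predict(coords)         # in practice: block-centroid coordinates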
Similar to the above network is the ANN developed more recently by Yama and
Lineberry [115], which is again based on the MLP architecture but uses the original
back-propagation learning algorithm. This network has one hidden layer with 50
hidden units instead of two. This difference brings back the question of network
complexity, i.e. whether to use a single but large hidden layer or multiple but small
layers. It seems that most of the researchers in the field choose a single hidden layer
mainly because of the reduced computational overhead as well as a reduction in the
required quantity of training samples.
Yama and Lineberry used sulphur data from 1152 samples from a 7315 x
4572-m coal property in northern West Virginia. It should be noted that the use of
real data in similar studies is very rare. The property was divided into 25 regions (914
x 914-m) due to computer memory limitations. For every region, a network was
trained using the Easting and Northing as inputs and the sulphur values as output. All
the data values were normalized before they were used for training and testing of the
networks. The data were normally distributed, a property that usually causes the
networks to give outputs close to the mean value. Presenting the tails of the
distribution more often to the network and with a higher learning rate has reduced this
effect. The results obtained from the ANNs were compared with results from kriging.
Clarici et al. [16] had earlier described a similar approach using a single hidden
layer network. In that study though, only one neural network was used for the entire
sampling area.
Moving from 2D to 3D input space, Caiti and Parisini [13] have used RBF
networks to interpolate geophysical properties of ocean sediments, e.g. porosity,
density, and grain size. The choice of RBF networks was based on their strong
theoretical foundation, especially in function approximation. GEMNet II is based on
RBF networks for very similar reasons to those discussed in the previous
chapter, as will be further analysed in the following chapters. Caiti and Parisini used the
Gaussian as the basis function of the interpolating network. They suggested, as many
others, that any discontinuities of the interpolated property can be handled by a
smooth, continuous approximation network provided with enough information close
to the discontinuity. The choice of RBF centers has been based on the number of
values on the z-axis. As they very logically identified, there are normally many
samples along the z-axis and fewer on the x-y plane due to the sampling techniques used.
In the case of a large number of samples on the z-axis, the RBF centers are mobile, in
other words their positions can change with learning. However, in the case of a small
number of samples on the z-axis, the RBF centers are fixed, i.e. their positions remain
unchanged during training and the network is updated by adding extra RBFs
whenever a new sample is presented.
Density data from cores in an area of the Tyrrhenian Abyssal Plain, in the
Mediterranean Sea have been used as input data for the training and testing of the
network. Part of the data has been kept out of the training procedure and then used to
test the trained network’s prediction accuracy.
One of the very few examples of an ANN system developed into a fully
commercial product is Neural Technologies’ Prospect Explorer. It is a complete
system offering data analysis, visualization, and detection of anomalies as well as
analysis of the relationships between them. The system is based on a neural structure
called AMAN (Advanced Modular Adaptive Network), shown in Fig. 4.2 [70].
AMAN is not a type of neural network. It is a complex system consisting of different
types of networks, which are trained in both supervised and unsupervised modes. The
user has a choice of networks and learning strategies depending on the problem at
hand. As shown in Fig. 4.2, AMAN can be described by the following:
• A set of hierarchically arranged networks: a problem is divided into sub-problems
and a network is assigned to each one of them.
• The type of the individual networks can be chosen to suit the nature of the
specific sub-problem.
• The controller, called ‘supervisor’, can then handle the outputs of the individual
networks to form a final result for the problem.
Figure 4.2: General structure of the AMAN neural system.
AMAN as part of the Prospect Explorer can help to automate the detection of
anomalies from large quantities of survey data. Prospect Explorer provides the
following functions:
• Anomaly Detection: an interpolated grid forms the basis of a color map showing
areas of potential anomalies. This map can be used as a guide for further analysis.
• Cluster Identification: regions of survey data sharing common types of survey
results are identified.
• Correlation Analysis: layers of interpolated data can be correlated to illustrate
the relationship between the values of different types of data.
• Fuzzy Search: pattern-searching tool to analyze how closely regions match a
search specification supplied by the user.
• Relationship Explorer: similar to correlation analysis, but performed at specific
geographic locations.
Prospect Explorer has been used with success in a reasonably complex
exploration task that took place in the Girliambone region in New South Wales,
Australia. This case study involved several layers of data from a copper mine area of
110 square kilometers. The system has successfully identified the already known
deposits in the area as well as some previously unknown ones.
Cortez et al. [18] presented a hybrid system combining ANN technology with
geostatistics for grade/resources estimation. Their system, called NNRK (‘Neural
Network estimation of the drift and Residuals’ Kriging’), is based on a network with
3 inputs (the sample’s X, Y, Z co-ordinates), 6 hidden units and one output, the
respective Zn assay [18]. As shown in Fig. 4.3, the chosen ANN is very simple
compared with the larger networks described in the previous examples. This ANN is
used for the identification of the underlying large-scale structure (trend modelling),
while residual analysis is performed at sampled locations by stationary geostatistical
methods that model local spatial correlations. Final estimates are given as a sum of
both estimations. The developers have chosen the use of geostatistics to support the
ANN estimations because of the results they obtained from a preliminary study
showing that the back-propagation network could not follow local variations of grade.
In the NNRK system, these are handled by ‘residual kriging’.
Figure 4.3: Back-propagation network used in the NNRK hybrid system.
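To make the combination concrete, the following hedged sketch (Python/NumPy; the trend_net object with a predict method and all other names are assumptions, and inverse distance weighting merely stands in for the residual kriging step) computes an NNRK-style estimate:

    import numpy as np

    def nnrk_estimate(x, trend_net, sample_coords, sample_grades, power=2.0):
        # ANN models the large-scale trend; residuals at the sample locations
        # are interpolated back (IDW here as a stand-in for residual kriging).
        residuals = sample_grades - trend_net.predict(sample_coords)
        dist = np.linalg.norm(sample_coords - x, axis=1)
        wts = 1.0 / np.maximum(dist, 1e-9)**power
        return trend_net.predict(x[None, :])[0] + np.sum(wts * residuals) / wts.sum()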
The hybrid system has been applied to a case study from a large Portuguese zinc
deposit. As shown in Fig. 4.4, the data are quite spread in 3D space, a situation very
common in case studies with real data. The data came from a drilling programme of
the Feitais deposit, a massive orebody belonging to the Aljustrel group of mines in
South Portugal. The dataset, consisting of 768 samples, was split into two parts. The
validation set included 160 samples, about 20% of the total. The rest was used for
training the ANN, a process that involved 3000 iterations.
Figure 4.4: Drillhole data used for testing the performance of the NNRK system [18].
The results obtained by the NNRK methodology are compared with those
produced by the ANN and kriging in the following table:
Table 4.1: Comparison of NNRK, ANN, and kriging estimates.
Populations     n     m (mean)   σ²      σ/m
Sampled data    238   3.314      3.988   0.603
NNRK estim.     238   3.516      2.347   0.436
ANN estim.      238   3.493      0.141   0.108
Krig. estim.    238   3.461      1.281   0.370
The results presented show that the combination of ANN and kriging can improve
considerably over the results that can possibly be obtained from each one of the
methodologies individually. It should be noted, though, that the back-propagation
network used in this study is only capable of performing global approximations
leading to smooth estimates. The number of hidden units in this network is also
surprisingly low considering the dimensionality of the input space.
4.2.3 Sample Neighborhood Based Systems
All of the systems above try to reconstruct the ore grade surface from the sample co-
ordinates. This strategy works very well when this surface is fairly continuous and
there are enough samples covering the considered area. It also works better when
done in 2D rather than 3D – a single network seems to be producing outputs close to
the average value when faced with a very complex deposit in 3D and sometimes even
in 2D. Wu and Zhou [112] created a large network (56 hidden units) to perform grade
estimation on a 2D dataset of a fairly continuous copper deposit.
Quite reasonably, some researchers tried to take advantage of the information
hidden in the relationship between neighboring samples. This approach is followed in
general terms by the most advanced existing methods for grade estimation like
inverse distance weighting and kriging. Most of the examples following this approach
choose as neighbours the samples closest to the estimation point and treat the
problem of ore grade estimation as a mapping between the surrounding grades and
the grade at the estimation point (Fig. 4.5). The samples are normally arranged on a
grid, and the inputs are formed from the eight nodes surrounding the estimation point.
A very good example of this technique is given by Williams [110]. The main
assumption made in this example is that there is a strong correlation between gold
grades and magnetic data. The technique was applied in 2D space. Naturally, building
the grid of magnetic data from scattered samples by interpolation can introduce errors
due to smoothing. This is a serious disadvantage of all methods that require data
arranged in grids.
Figure 4.5: 2D approach of learning from neighbour samples arranged on a regular grid.
This single network approach has an additional limitation: the use of a single network
over the entire area of interest leads to the assumption that the learnt mapping
between the neighbour samples and the grade can be applied globally. In other words,
the method leads to a global approximation of ore grades.
Going a step further, some researchers have introduced multiple networks to
overcome these limitations. These modular systems consist of more than one network
each responsible for learning a different area of the deposit. The GEMNet system
developed by Burnett [12] is a very good example of a modular neural network
system for grade/resource estimation. Figure 4.6 illustrates the principle of
GEMNet’s operation. The deposit is divided into overlapping zones. The selection of
zones was arbitrary, which is a point where improvement could be made. In each
zone, a different network was trained and the final estimate for every point was taken
as the average of the networks trained in the specific area. As zones were
overlapping, there was almost always more than one network giving estimates.
Having more than one estimate led to the introduction of a reliability measure based
on the variance of the individual estimates – an indicator that can be used as a guide
for the reliability of the final estimate. This indicator was also used in the GEMNet II
system with minor changes (Chapter 7).
Figure 4.6: Modular network approach implemented in the GEMNet system [12].
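The combination rule itself is simple enough to state in a couple of lines; the sketch below (names are illustrative) averages the overlapping zone networks and uses their spread as the reliability indicator:

    import numpy as np

    def combine_zone_estimates(zone_estimates):
        # One estimate per zone network covering the point; the mean is the
        # final grade and the variance is the reliability indicator
        # (low variance = high reliability).
        est = np.asarray(zone_estimates)
        return est.mean(), est.var()

For a point covered by three zone networks, combine_zone_estimates([3.1, 3.3, 3.0]) would return the averaged grade together with the variance serving as the reliability indicator.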
A similar modular approach has been introduced by Geva et al. [27] for
function approximation. In both cases, the developers used multiple MLP networks
acting in a very similar manner to the RBFs of a single RBF network. The results
obtained by both Burnett and Geva have supported the choice of RBF networks as the
building unit for the GEMNet II system. However, GEMNet II is quite different in
the way these networks are used to provide a combined grade estimate, as it will be
shown in Chapter 7.
Figure 4.7: Scatter diagram of GEMNet estimates on a copper deposit [12].
GEMNet has been tested on simple function approximation problems, as well
as simulated ore deposits. Even though most of the case studies were using 2D data,
the results obtained were very encouraging and suggested that further research work
should be carried out to assess the effectiveness of the modular approach. Figures 4.7
and 4.8 show the results from a 2D study with GEMNet.
Figure 4.8: Contour maps of GEMNet reliability indicator and grade estimates of a copper
deposit [12].
4.2.4 Conclusions
The discussion in this section has examined some of the most important examples of
neural network based systems for ore grade/resource estimation. A number of
techniques have been developed that differ mostly in the number of networks used
and the way data is presented to them.
As with the conventional methods for ore grade estimation, it is fairly safe to
say that there is no universally applicable solution to the problem. This is particularly
true when the neural network system is based on a single network. These systems
varied considerably in their architecture from one study to the other. The number of
hidden units changed even though the dimensionality of the problem remained
constant. Systems with modular structure, i.e. multiple networks, are more flexible in
the way they adjust to a specific deposit.
Both the sample co-ordinates and the sample neighbourhood based systems
can have their advantages and disadvantages depending on the deposit at hand. One
would expect the first to be better suited to continuous deposits where the grade can be
considered to be a hypersurface in the sample co-ordinates input space (a simple
surface in the case of 2D samples). The results obtained from the described studies
support this to a certain degree.
On the other hand, complex deposits presenting a localised behaviour cannot
be modelled well by systems producing global approximations unless there are large
amounts of data to describe the local variations, a case that is very rare. These
deposits call for more flexible structures that can construct local approximations of
grade. Therefore, modular systems can be the choice for modelling complex deposits
using 2D or 3D data.
4.3 ANN Systems for Other Mining Applications
4.3.1 Overview
A number of other mining related problems have been approached using ANN
technology. These problems commonly relate to pattern classification, prediction and
optimisation. ANNs have been successfully applied to these areas and are therefore
suitable for similar mining problems.
In the following paragraphs, such problems and their ANN solutions are
briefly described. The applications shown range from geophysics to plant
optimisation and illustrate the fact that ANN systems can be useful to a very large
number of problems.
4.3.2 Geophysics
Geophysics is a relatively new area for ANN systems. However, in the last few years
ANNs have become a very popular tool in the interpretation of seismic and
geophysical data from various sources.
Garcia et al. [26] have used an MLP (Fig. 4.9) trained using back-propagation
for the inversion of lateral electrode well logs. Inversion represents the process of
constructing an earth model from the log data. The data used for training the network
were derived from a finite difference method that simulated the lateral log. The
trained network was tested using real data and the results were compared with those
from an automated inversion model. The study has shown promising results and has
presented the advantages of the use of ANN for the specific problem.
Figure 4.9: Back-propagation network used for lateral log inversion [26]: 51-node input and
output layers (one node for every 2 ft interval of a 100 ft log) and a 40-node hidden layer.
Connections between layers are not shown.
In a similar fashion, Rogers et al. [85] used an MLP network for the prediction
of lithology from well logs. Malki and Baldwin [56] compared the results produced
by neural networks trained using well logs from different service companies. More
specifically, networks were trained using data from one service company and tested
on data from another, and the study was repeated using training data from both
companies and tested on data from each one individually. The results have shown that
better performance is obtained when using data from both service companies.
Wanstedt et al. [107] applied neural networks to the interpretation of
geophysical logs for orebody delineation. The data used for the development and
testing of their approach were taken from the Zinkgruvan mine in Sweden. The
network used was quite small – three layers with 3 inputs, 7 hidden units, and 1
output. The inputs were the gamma-ray, density, and susceptibility, and the output
was the ore grade (Zn, Pb, or Ag). The study reports good results in estimating the
grades and consequently interpreting the lithology (Fig. 4.10). Unfortunately no
numerical measurement of the network’s performance is provided.
Figure 4.10: Estimated grades and assays (red and blue) vs. actual (black) [107].
Murat et al. [67] used an MLP for the identification of the first arrival on a
seismogram. Roessler [84] used NETS, a neural network simulator written at
NASA/Johnson Space Center to develop a neural network for analysing wave arrivals
from seismic waves transmitted from one borehole and received from another. The
network was trained on a binary pixel image of the seismic trace data. The input layer
consisted of a large array (97 x 41 = 3977) of input nodes, the hidden layer had 50
units, and the output layer had two units. The network was trained to produce a
binary pattern in its outputs, i.e. the outputs were either 1 or 0. The different
combinations of outputs were indicative of the relative position of the first arrival to
the current positive lobe. Once again, no numerical measurement of the network’s
performance during training and testing was provided in the study.
Barhen and Reister [6] developed DeepNet, a system based on the MLP that
predicts well pseudo logs from seismic data across an oil field. DeepNet combines a
very fast learning algorithm, systematic incorporation of uncertainties in the learning
process, and a global optimisation algorithm that addresses the optimality of the
learning process. The system has been successfully applied in the Pompano field in
the Gulf of Mexico.
4.3.3 Rock Engineering
King et al. [43] have developed an unsupervised neural network for the discovery
of patterns in roof bolter drill data. The network successfully classified 617 drill
patterns into just 9 or 16 unique features representing major geologic features of a mine
roof. The patterns consisted of the penetration rate, thrust, drill speed, and torque. A
system consisting of this network and an expert system was developed for the
evaluation of coal mine roof supports [95].
Millar et al. [63] used self-organising networks to model the complex
behaviour of rock masses by classifying input variables related to the rock stability
into two groups: failure or stability.
Walter [106] used Kohonen networks for the classification of mine roof strata
into one of 32 strength classes. The developed system can provide an estimate of
strength within two seconds giving the drill operator a warning almost in real time
when a potentially dangerous layer is reached.
4.3.4 Mineral Processing
Neural networks have been successfully applied to a number of pattern classification
problems. Particle shape and size analysis seems to be a natural field of application
for ANNs and especially for unsupervised techniques.
Maxwell et al. [59] developed an ANN based system for particle size analysis
based on video images. The system analyses images from material on a conveyor and
predicts the particle size distribution.
Oja and Nyström [72] applied self-organising maps for particle shape
quantification. Image analysis is performed on mineral slurry particles by use of a
SOM which extracts the features affecting the behaviour of powders and slurries. The
training data set consisted of 3000 binary images of 500 particles. The produced map
size was 12 x 10. The developed SOM was tested on 360 particle images with
success. The test showed that the SOM was capable of separating into different
clusters even minerals that did not have strong shape features.
Deventer et al. [104] again used the SOM for the on-line visualisation of flotation
performance. The structure of the froth is quantified by the neighbouring grey level
dependence matrix. The SOM had a map size of 20 x 20 and there were three
classifications of Zn grade peaks as being positive (Class_+1), zero (Class_0), or
negative (Class_-1) for each of the image features. The classification was based on a
number of image features. The developed SOM was to be used as part of an
automated computer vision system for the control of flotation circuits.
Petersen and Lorenzen [76] applied the SOM to the modelling of gold
liberation from diagnostic leaching data. The data came from seven different gold
mines in South Africa. The ores from the mines were fed to mills and the ore samples
were screened into three size intervals. One of the fractions was further screened into
six size fractions giving a total of eight fractions. Representative samples were then
fed to a ball mill, and the product was screened into the same six size fractions. On
each of the fractions, diagnostic leaching was performed for each of the ore types.
The percentage of gold deportment and percentage of gangue, the percentage of free
gold in each fraction, the head grade, and the mass distribution were projected to a 10
x 10 map. The clustering produced was well defined for the different sample sources
(gold mines).
4.3.5 Remote Sensing
Probably one of the most popular areas of neural network application, remote sensing
presents problems which are ideal for architectures such as the SOM, the LVQ, or
even the standard MLP. The examples given here, even though not directly linked to
mining activities, demonstrate the potential of ANNs in this field.
Bischof et al. [8] used an MLP for the multispectral classification of Landsat
images. These images came from a Landsat Thematic Mapper (TM) and were 512 x
512 pixels in size. They were also analysed into 7 spectral channels (bands) which
were used as the inputs to the network (13 units for each band representing different
intervals from 0 to 255). The network then had to learn to classify the 7 band values
to one of four types of land (built-up land, forest, water, and agricultural land), each
represented by an output of the network. Even though this architecture gave good
results, the developers extended the network to include a 7 x 7 pixel map of texture
from band 5. Naturally the number of hidden units was increased from 5 to 8 units.
The results from this extended architecture were better than the non-extended one in
all types of land.
Gopal and Woodcock [29] used an MLP for the detection of forest change from
Landsat TM images between 1988 and 1991. A 10-input vector of 10 TM bands (5
from 1988 and 5 from 1991) is used with the single output being the absolute or the
relative change. The results obtained with the developed MLP were better than those
obtained with the conventional method for this task.
Poulton and Zaverton [78] give a comparative study between different neural
network architectures used for classification of TM images. The architectures
compared were the back-propagation network, LVQ, counter-propagation network,
functional link, probabilistic network, and the SOM. From the tests performed, they
concluded that the LVQ architecture was the most flexible and robust one. They also
suggested the use of ANNs for the analysis of geochemical and geophysical data,
location of favorable prospects using GIS data, lithologic mapping from remote
sensing data, and estimation of parameters in a similar way with kriging.
Krasnopolsky [48] used an MLP for the retrieval of multiple geophysical
parameters from satellite data. These parameters were the surface wind speed,
columnar water vapor, columnar liquid water, and sea surface temperature (the four
outputs of the MLP). The MLP had five inputs taken from five Special Sensor
Microwave Imager brightness temperatures. The hidden layer had 12 units. The
simultaneous retrieval of multiple parameters improved the retrieval of each one
individually allowing physically coherent and consistent geophysical fields to be
produced.
Xiao and Chandrasekar [114] used an MLP for rainfall estimation from radar
observations. More specifically, two networks have been developed, one using
reflectivity as the only input, and the other using both reflectivity and differential
reflectivity as the inputs. The networks were trained on data obtained from a multi-
parameter radar and raingages from the Kennedy Space Center. The trained networks
were then used to estimate rainfall for four days during the summer of 1991. The
training patterns consisted of a square grid (3 x 3 km) of reflectivity values as well as
distances from the grid nodes to the point of estimation. The raingage values were
used as the target outputs. The trained network estimates and raingage values have
shown good agreement at all sites.
4.3.6 Process Control-Optimisation and Equipment Selection
Process control and optimisation tends to be a tedious task involving large amounts of
data from very different sources. ANNs are ideal for handling such tasks and this is
why many researchers in the field of process control turned to them for developing
solutions. Process control and optimisation of mineral processing plants as well as the
mining process itself are a special case of these tasks and can therefore be approached
by neural networks.
Van der Walt et al. [103] used the MLP for the simulation of the resin-in-pulp
process for gold recovery. Flament et al. [25] used the MLP for the identification of
the dynamics of a mineral grinding circuit and the development of a control strategy.
Bradford [10] used neural networks in a number of studies modelling the behaviour
of different parts of a mineral processing plant.
Ryman-Tubb and Bolt of Neural Mining Solutions Pty Ltd [91] describe the
use of the AMAN architecture (described before) for integrated process system
modelling and optimisation. The suggested areas of application include froth
flotation, carbon-in-pulp (CIP), milling, and others. Their case study presented a real-
life example based on a multi-stage copper extraction process. The trained networks
(MLPs) were used for the following:
• Prediction of stripped copper cathode from electrowinning
• Prediction of raw material usage
• Identification of key plant parameters
• Analysis of the effect of plant input parameters
• Economic optimisation to determine cost-effective control settings
The developers claimed the following benefits from the ANN approach:
• Decreased raw material costs
• Increased copper production
• Optimised planning of new and existing heap operations
• Ability to implement “Just-in-time” purchasing policy
• Planning of new heaps
• Reduced reliance on individual human operators
Finally, Schofield [94] investigated the use of neural networks as well as other AI
tools for the selection of surface mining equipment.
4.4 Conclusions
Quite clearly, the spectrum of neural network applications in mining is very wide.
This is demonstrated by a number of exciting and very promising studies by a number
of people from different scientific fields. The examples presented in this chapter
support the choice of ANNs as the basis for developing solutions to mining problems
where conventional techniques fail in one way or another. Mining is always about time
and money, and so far neural networks have shown that they can perform well on both
counts. The systems described in the above examples were fast, reliable and, most of
the time, rested on a very stable theoretical background on which the validity of the
proposed solution is based.
The general trend in the mining industry for automation to the greatest degree
calls for technologies such as the ANNs that can utilise large amounts of data for the
development of models which otherwise are very difficult or sometimes even
impossible to identify. The speed of ANNs – at least in application mode – also
allows the development of real- or near real-time systems, which can quickly
recognize potential problems or even dangers during a certain process.
Another advantage of ANNs is in the minimisation of the necessary
assumptions for a given problem. Especially in the case of grade estimation, this
attribute proves very valuable. The examples of ANN application to grade estimation
given earlier in this chapter supported this and other advantages of neural networks.
The ambition of the author is to incorporate these advantages into an integrated neural
network system for grade estimation.
5. Development of a Modular Neural Network System for Grade Estimation
5.1 Introduction
Before moving into the in-depth analysis of the integrated GEMNet II system for
grade estimation, it is necessary to go through the development steps that led to the
final architecture. Many things have changed in the developed architecture since the
beginning of this project. The number of networks, their topological characteristics,
the learning algorithm, the error measures, and even the inputs and dimensionality of
the input space were changing or, one could say, evolving as more tests were run and
the author gained more insight to the numerous algorithms and developments in the
field of artificial neural networks. Going through these steps helps to understand the
reasoning behind the developed system and how the original aims were met.
GEMNet II was named as the successor to the original GEMNet system [12]
also developed at the AIMS Research Unit. The author was very fortunate to have a
starting point well ahead of any research carried out elsewhere, something that
inevitably set the aims for the development of GEMNet II at quite a high level.
GEMNet II was developed with real-life situations in mind from day one. The
main aim was to find a reliable and robust architecture that required no significant
interaction with the user in order to provide accurate grade estimation results. After
the identification of this architecture and the proof of its validity through a number of
case studies, the next aim would be to integrate the architecture in a user-friendly
system that would allow straightforward application with no important parameters to
be set by the user. The system should also be capable of removing the ‘black box’
attribute neural networks are famous for, an attribute completely unacceptable in the
mining industry especially when it comes to grade estimates on which decisions
involving large amounts of financial resources will be based.
In this chapter, the development of the modular neural network architecture
for grade estimation will be described. Mathematica from Wolfram Research [111]
was used for the development of all prototype systems as it was found to be a very
resourceful environment providing all the necessary tools for understanding and
validating different neural network architectures.
Two main hypotheses have been accepted during the development
of the system: grade estimation can be approached as a hypersurface reconstruction
problem in the spatial co-ordinates input vector space, and grades are the numerical
representation of a localised phenomenon (the deposit) and themselves present
localised behaviour. As will be seen later, there are a number of implications brought
by these hypotheses that have a great effect on the design of GEMNet II.
The author has carried out a large number of preliminary tests on various
neural network architectures and learning algorithms as part of his MSc project [42].
These tests were based entirely on simulated 2D data arranged on a square grid. The
networks were trained on the grid nodes using the grade at a given node as the
required output and the grade at the eight (or the four closest) surrounding nodes as
inputs. There was no information provided about the spatial location of the input
samples or even the location of the required output. All together, the approach was
very similar to image analysis techniques using computer vision with the image being
in this case the grade surface. The results of these case studies and their comparison
with results from kriging showed great promise – in fact, the developed neural
networks performed much better than kriging in most of the cases. However, there
was no guarantee that this would happen with real data and of course the whole
approach was not at all applicable to real data due to the inflexible arrangement of the
inputs (fixed on a regular grid).
The most important issue raised by the above project concerned the formation
of the input space, i.e. which input parameters should be used and how the task
of grade estimation should be decomposed into smaller tasks that would be easier to approach
using neural networks. In the next paragraph, the shift from fixed-on-a-grid inputs to
completely floating-in-space sampling inputs will be described in two-dimensional
sampling space.
5.2 Forming the Input Space from 2D Samples
It is generally accepted that the input space characteristics as well as its components
play a very important role in the performance of neural networks. The input
dimensionality, as was discussed in earlier chapters, controls to a great extent the
overall complexity of the neural network topology as well as the amount of training
data required to bring the network performance to acceptable levels. Therefore it is
very important to select the inputs from the available data in a way that will help
reduce the complexity of the network and at the same time provide the right
information for the network to be trained on.
The input space also defines the way of approaching the required task, in this
case grade estimation. Using the sample co-ordinates, for example, in two dimensions
(easting and northing) as inputs to a network with the output being the grade of the
sample means that grade is treated as a surface in the co-ordinate space. This approach
seems to be the most popular among researchers dealing with this problem.
As explained in the previous chapter, another approach is to use samples close
to the estimation point as the source of grade input data. Usually, the samples are
arranged on a regular grid, which makes things a lot easier. If they are not arranged on
a grid, then the grid is constructed by applying a polygonal or inverse distance
calculation on the original data, which naturally introduces smoothing errors. The
inputs are in this case the grades of the neighbour nodes and the output is the grade at
the point of estimation (also on the grid). Neighbour nodes can be considered to be the
eight nodes surrounding the estimation point or the four that belong to the same grid
lines passing from the estimation point.
The above approach gives very good results on simulated data and regular
sampling schemes where the smoothing errors introduced from gridding original data
are relatively low. Applying this approach though directly to real data normally
obtained with an irregular sampling scheme leads to the network learning a very
smooth distribution (the distribution of the polygonal or inverse distance grid nodes)
of grades that does not represent reality. It should be noted that the polygonal
and the inverse distance methods assume that the modelled surface is continuous.
Clearly, there is a need to develop a way of presenting to the networks
information from neighbour samples that honours their relative location to the point of
estimation. In other words, the aim is to form the input space in a way that includes
both the surrounding grade values and their relative position in space.
A very common way of choosing samples surrounding the point of estimation
used by most of the conventional methods is to use octant or quadrant search (Fig.
5.1). The area surrounding the point of estimation is divided into eight (or four)
sectors and a number of samples is chosen from each one of them. This technique
ensures that samples are selected from all directions in 2D space and not only from
the direction where samples are denser and closer to the estimation point.
Figure 5.1: Illustration of quadrant and octant search method (special case where only one
sample is allowed per sector). Respective grid nodes are also shown.
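A minimal 2D implementation of such a search (Python/NumPy; the names and the one-sample-per-sector rule are assumptions of the sketch) could be:

    import numpy as np

    def octant_search(point, samples, grades, n_sectors=8):
        # Assign each sample to the angular sector it falls in around the
        # estimation point and keep the nearest sample of every sector.
        d = samples - point
        angle = np.mod(np.arctan2(d[:, 1], d[:, 0]), 2 * np.pi)
        sector = (angle // (2 * np.pi / n_sectors)).astype(int)
        dist = np.hypot(d[:, 0], d[:, 1])
        chosen = []
        for s in range(n_sectors):
            idx = np.where(sector == s)[0]
            if idx.size:
                chosen.append(idx[np.argmin(dist[idx])])   # nearest in sector s
        return [(grades[i], dist[i]) for i in chosen]      # (grade, distance) pairs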
Dividing the area around the estimation point into octants (or quadrants)
provides a way to expand the inflexible input scheme based on grid nodes to a scheme
that can accept samples floating in 2D space. The inputs are now the grades of
neighbour samples at any distance from the estimation point, and not those at the
surrounding grid nodes (Fig. 5.1). There is no need to grid the original data using an
interpolation method that normally introduces errors; the network therefore models
the original distribution of grades. The use of octant (or quadrant) search also allows
the use of the same neural network architecture as in the case of gridded samples.
There is however one fundamental difference between the two approaches. In
the case of samples arranged on a regular grid, the distance of the inputs from the
point of estimation remains constant throughout the sampling area. Using an octant
search means that the samples are now at a varying distance from the estimation point
and therefore there is a need to include distance information as part of the input space.
This requirement is also derived from the hypothesis that grades present localised
behaviour. Therefore it is necessary for the neural network to ‘know’ the distance of
any input sample relative to the point of estimation.
[Contour map panels: a. Actual Grades, b. MNN Grades; 400 m x 400 m area, grade scale 26.00 - 39.00]
Figure 5.2: Estimation results from the neural network architecture developed for use with
gridded data. The use of irregular data has an obvious effect on the performance of the system.
The author initially tested the neural network architecture used for gridded
data directly on original data arranged irregularly in 2D space. The results were, as
expected, not as good as when using gridded data. Clearly, learning a distribution
based on inverse distance estimates arranged on a grid is far easier than trying to learn
the original data distribution. Figure 5.2 shows contour maps from this test using data
from an iron ore deposit. The results from this test have shown clearly that it is
necessary to provide distance information to the network in order to improve its
modelling capacity in the case of irregular data. It should be noted that at this stage
the problem of grade estimation is still approached by the use of a single network with
multiple inputs (eight or four) depending on the search method used – octant or
quadrant.
In order to provide distance information to the network, one input is added per
sample, i.e. for each of the eight octants (or four quadrants) there are two inputs: the
neighbour sample grade and its distance from the estimation point. This leads to a
total of 16 inputs (or eight for quadrant search). The increase in the number of inputs
inevitably leads to an increase of the number of hidden units required to handle the
complexity of the input space. Figure 5.3 shows two neural networks with 16 and 8
inputs used to accept data from an octant and quadrant search respectively.
Figure 5.3: Neural network architectures receiving inputs from a quadrant search (left) and
from an octant search (right). The number of hidden units in the right network is lower than in
the left because each hidden unit carries a higher number of weights.
The idea behind the use of two networks with different input dimensionality
was based on the fact that not all estimation points have an adequate number of
neighbour samples to complete the training patterns when using octant search. In
other words, when there are fewer than eight neighbour samples around the estimation
point, quadrant search and the smaller network are used for the estimation. Naturally,
the quadrant-search-based network can be trained at all locations where the octant-
search-based network is trained.
In order to get even closer to a real situation, the developed architecture should
be able to handle estimation points at the edges or even outside the sampling area. In
these areas there is not enough information to generate complete patterns for either of
the two networks, i.e. both octant and quadrant search fail to find any neighbour
samples. For this reason, a third neural network is introduced to provide estimates at
these points. This network can only depend on data at the point of estimation and
therefore the commonly used input scheme of sample easting and northing is used.
At this stage, the developed neural network architecture for grade estimation
has become modular, in the sense that there are multiple networks providing estimates,
each at different estimation points from the others. These three networks are in
essence trying to reconstruct the grade hypersurface in their own input vector space.
In other words, even though each is used only at specific estimation points, they are
still trained on the entire sampling area, or at least the part of it that provides enough
information for their training patterns.
As shown in Figure 5.4, and compared with the results shown in the previous
figure, this architecture provides considerably better estimation performance. The next
question is naturally whether this performance can be further improved. As
mentioned earlier in this chapter, grades tend to present localised behaviour, i.e.
samples close to each other tend to have similar grade values. This similarity normally
decreases with the distance between the samples. The effect of this is that it is
very difficult to approach grade estimation as a global approximation problem. For
this reason a number of researchers have been led to the use of modular neural
networks that construct local approximations of grade. The architecture described so
far in this chapter, even though modular, still tries to approximate the entire
distribution of grades through each network. It should be noted at this point that in
the case of radial basis function networks, the modelled surface is reconstructed
by a series of locally trained basis functions, which gives an answer to this problem.
[Contour map panels: a. Actual Grades, b. MNN Grades; 400 m x 400 m area, grade scale 26.00 - 39.00]
Figure 5.4: Improvement in estimation by the introduction of the neighbour sample distance
in the input vector.
The author carried the solution even further by breaking the problem of grade
surface reconstruction from neighbour points into smaller tasks that are easier to
approach with a single neural network. More specifically, structural analysis in
geostatistics has been the paradigm for this problem decomposition. In structural
analysis, one tries to find a model of grade variability in certain directions in space.
The derived models are then used to modify the interpolation method and the sample
selection routine. Unfortunately, this is where one of the main disadvantages of
geostatistics appears, as structural analysis, and more specifically variography,
requires skills and time and also depends on knowledge of the modelled
parameter. The author aimed at overcoming these problems, while still taking
advantage of the benefits of structural analysis, by employing neural networks to learn
the spatial variability from exploration data.
In order to learn the spatial variability of grade, the two networks with the
inputs receiving information from neighbour samples were replaced by a number of
networks trained on neighbour samples coming from a single direction in space. In
other words, there are eight networks with two inputs (neighbour grade and distance
from estimation point) where there was one with 16 inputs, or four networks with two
inputs where there was one with eight. There is now one network per sector (octant or
quadrant) learning the variability of grade in that direction. As expected, it is far
easier for a single network to learn the variability in one direction than in all
directions. It is also easier to control the learning process and to monitor the results of
training.
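As a minimal sketch of how the per-sector estimates could be combined, assuming each trained sector network is available as a callable, the simple averaging described later in this chapter might look as follows:

def module_estimate(sector_nets, sector_inputs):
    """Average the grade estimates of the per-sector networks to give
    the module's single output (the MNNS only applies a module where
    every one of its sectors found a neighbour sample).

    sector_nets:   {sector: trained network, callable on (grade, distance)}
    sector_inputs: {sector: (neighbour grade, distance)} from the search
    """
    estimates = [net(sector_inputs[s]) for s, net in sector_nets.items()]
    return sum(estimates) / len(estimates)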
The results obtained with this architecture are very promising [41]. This is the
final architecture developed by the author to handle exploration data from a two-
dimensional sampling scheme (Fig. 5.5). It became part of the Modular Neural
Network System (MNNS) described in later paragraphs of this chapter. The MNNS
could be considered the prototype version of GEMNet II. Several case studies were
run using the MNNS with data from simulated and real deposits. These are
discussed in detail in Chapter 6.
[Diagram: Modular Neural Network System for Ore Grade Estimation - the I/O data set used for training, validation and testing feeds an octant search, a quadrant search, and the sample co-ordinates directly; the Octant RBFN Module (16 inputs, 8 RBFNs, 1 output), the Quadrant RBFN Module (8 inputs, 4 RBFNs, 1 output), and the X-Y-Grade MLP Module (2 inputs, 1 MLP, 1 output) each contribute to the output]
Figure 5.5: Modular neural network architecture developed for grade estimation from 2D
samples [41].
The design of the input space is followed by the development of the neural
network topology and the learning algorithm. These are explained in the
following paragraphs.
5.3 Development of the Neural Network Topologies
5.3.1 Overview
From the discussion in the previous paragraph it becomes clear that the topology of
the neural networks used in the developing stages of MNNS has gone through many
changes. Apart from the input layers already discussed, the hidden layer has also
changed: the number of hidden units, the type of hidden units, and their activation
and output functions. Different error measures were also tested. Overall, only one
aspect of the neural networks did not change and that is the number of output units.
The output layer of all neural networks developed had one unit providing the grade
estimate.
There are two types of neural networks that were predominantly tested during
the development of the MNNS: the Multi-Layered Perceptron (MLP) and
the Radial Basis Function (RBF) network. The choice between them was not easy, as
the MLP is very popular in function approximation problems and there is a very good
background of theory and practical examples. In theory, both architectures can
produce very good results given time and training information. However, the RBFN
has a great advantage over the MLP in terms of speed of development, which was
more than verified during testing. Also, the MLP produces global approximations, and
in order to obtain the same effect as the local approximations of the RBFN it is
necessary to complicate the overall architecture by introducing a number of MLPs
trained on localised data. This approach was implemented in the original GEMNet
and seemed to produce good results in small to medium 2D deposits.
The RBFN was chosen as the building unit of the MNNS after a number of
tests on both architectures. However the MLP was still used occasionally as an
averaging network for the estimates produced by the various RBFNs, as will be
discussed later.
5.3.2 The Hidden Layer
Designing the hidden layer of an RBFN is a very complex task and also a very
important one for the overall performance of the network. The number of hidden units
depends on the training data and the modelled parameter, and can therefore vary from
one dataset to the next. In the case of the MNNS architecture described here, the
problem becomes even more complex as the original drillhole samples dataset is
processed and presented in three different ways. There is training data for the
octant search networks, training data for the quadrant search networks and, finally,
training data for the network trained on the samples' spatial co-ordinates. In addition
to the original dataset, there are also the patterns consisting of the outputs of all these
networks, which become the inputs to the final averaging network.
In the case of drillhole data, the optimum number of hidden units can only be
found during training, by applying one of the automated node generation or
destruction algorithms. There are a number of algorithms for adjusting the number of
hidden units through training. In the case of the MNNS, a simple training algorithm
was employed for adding hidden units (or RBF centres). The basic steps of the
algorithm are as follows:
1. Start with a minimum number of RBF centres;
2. Train the network and calculate the validation error;
3. Add one centre;
4. Repeat step two and compare with previous validation error;
5. If the change in error is too small then stop training;
6. If the change in error is significant and the maximum number of centres has not been
reached, go to step 3;
7. When the number of centres reaches the maximum, exit the algorithm and save the
architecture with the smallest validation error.
Altogether this algorithm finds the number of hidden units that would produce the
minimum validation error and uses the respective topology during estimation. This
algorithm should not be confused with the learning algorithm used for training the
various topologies.
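A sketch of the centre-addition loop in Python is given below, assuming a helper train_and_validate(k) that trains a k-centre RBFN and returns its validation error and the trained network; the helper and the relative tolerance (set here to the 0.001% figure quoted in paragraph 5.4.2) are illustrative, not the exact implementation.

def grow_rbf_centres(train_and_validate, k_min, k_max, tol=1e-5):
    """Add RBF centres one at a time, keeping the topology with the
    lowest validation error (a sketch of steps 1-7 above).

    train_and_validate(k): assumed helper that trains a k-centre RBFN
    and returns (validation_error, trained_network).
    """
    best_err, best_net = train_and_validate(k_min)       # steps 1-2
    prev_err = best_err
    for k in range(k_min + 1, k_max + 1):                # steps 3-4
        err, net = train_and_validate(k)
        if err < best_err:
            best_err, best_net = err, net
        if abs(prev_err - err) < tol * prev_err:         # step 5: change too small
            break
        prev_err = err
    return best_net, best_err                            # step 7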
Another very important issue concerning the hidden layer in any RBFN is the
positioning of the basis function centres (weights between input and hidden layer).
This normally takes place at the initialisation stage of the learning process, where
some of the network’s free parameters are set to give learning the best start possible.
The initial positioning of the centres is crucial. Thinking of the error as a
hypersurface in the weights vector space – in this case the centres vector space – it is
fairly easy to understand the importance of starting from a good point on this
hypersurface, as it helps find the weights that produce the minimum error. A number of
centre positioning algorithms are available, including Kohonen learning, random
positioning, k-means clustering, and positioning on samples. After rigorous testing, it
was found that, for the available testing data, random positioning of the centres
in the input vector space led to better performance than any other positioning
algorithm. Again, it should be noted that this depended heavily on the data
used for the studies; indeed, as will be seen later when using data from a three-
dimensional sampling scheme, random positioning was found to be inadequate for
more complex data. The author believes that random positioning is ideal for two-
dimensional datasets with a relatively low number of training patterns. Random
positioning is not expected to perform well when the number of centres to be fitted is
considerably lower than the total number of training patterns available.
A more difficult choice was that of the basis function. There was almost no
agreement between the studies with two-dimensional data as to which basis function
helps produce better results. However, the multiquadric and the thin-plate spline
seemed to produce better results consistently, which convinced the author to use them
in further studies. It should be noted that the choice of basis function is
not as crucial for the problem at hand as is the smoothing parameter of the function,
which carries information about the problem.
The smoothing parameter can only be set through experimentation with the
training data and is unique to every study. This is one of the points where user
intervention is required for optimum results. There are no rules of thumb for this
problem, and it is therefore necessary to perform a number of test runs in order to set
the smoothing parameter to its ideal value. Generally this is a very quick process and
only needs to take place once per study – if more training patterns become available,
there is normally no need to change the smoothing parameter. It is also possible to use
only a representative part of the dataset for this process if the number of training
samples is too large and training is time-consuming.
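For illustration, the two basis functions mentioned above could take the following form; the exact parameterisation used in the MNNS software is not given here, so the placement of the smoothing parameter sigma follows one common convention and should be treated as an assumption.

import numpy as np

def multiquadric(r, sigma=1.0):
    """Multiquadric basis; sigma is the smoothing parameter."""
    return np.sqrt(r ** 2 + sigma ** 2)

def thin_plate_spline(r):
    """Thin-plate spline basis r^2 log(r), taken as 0 at r = 0."""
    r = np.asarray(r, dtype=float)
    safe = np.maximum(r, 1e-12)              # avoid log(0)
    return np.where(r > 0, r ** 2 * np.log(safe), 0.0)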
The final aspect of the hidden units, and by far the most important one, is the
bias. The bias of every unit is normally set to 1 and then adjusted through training. It
is very important as it completely changes the behaviour of the unit when presented
with data inside its receptive field. Generally, as the bias moves away from the value
of 1 (gets smaller), more hidden units become activated than just the unit whose centre
location corresponds to the current training pattern.
5.3.3 Final Weights and Output
In order to complete the RBFN architecture, the weights between the hidden layer and
the network’s single output need to be set. This is achieved by a gradient descent
method similar to that used in the MLPs. The RBFNs used in the MNNS are fully
interconnected, i.e. units from one layer branch out to every unit of the next layer. As
there is only one output unit, the number of weights between hidden and output layer
equals the number of RBF centres in the network. The single output unit simply
performs the summation of the hidden units’ weighted outputs and passes the result
through an activation function (such as the logistic sigmoid) that also takes the bias of
the hidden units into consideration.
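A sketch of this forward pass is given below; the multiplicative role given to the bias and the default multiquadric basis are one reading of the description above, not the exact implementation.

import numpy as np

def rbfn_forward(x, centres, biases, weights,
                 basis=lambda r: np.sqrt(r ** 2 + 1.0)):
    """Forward pass of a single-output RBFN as described above.

    x:       normalised input vector (e.g. [grade, distance])
    centres: (k, d) array of basis function centres
    biases:  (k,) per-unit biases; values below 1 are read here as
             widening the receptive field so more units respond
    weights: (k,) hidden-to-output weights
    basis:   radial basis function (multiquadric by default)
    """
    r = np.linalg.norm(centres - x, axis=1)   # Euclidean distances
    h = basis(biases * r)                     # hidden unit responses
    s = np.dot(weights, h)                    # weighted summation
    return 1.0 / (1.0 + np.exp(-s))           # logistic output activation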
5.4 Learning from 2D Samples
5.4.1 Overview
Learning in RBFNs has been discussed in detail in Chapter 3. In this paragraph, the
details of the learning algorithm used in MNNS will be discussed. Attention will be
given to the effects of the problem characteristics on the learning parameters, i.e. how
the learning algorithm is adjusted to perform better with exploration data.
In the MNNS architecture there are three neural network modules each trained
on different patterns derived from the same data (Fig. 5.6). It is therefore necessary to
describe the learning process for each one of the modules individually, as there are
significant differences. The discussion begins with the RBFNs trained using the
patterns formed by an octant search.
Figure 5.6: Partitioning of the original dataset into three parts, each one targeted at a different
module of the MNNS.
5.4.2 Module 1 – Learning from Octants
Module 1 has eight RBFNs each with two inputs (neighbour sample grade and
distance from estimation point), one output (grade at estimation point) and a varying
number of hidden units (RBF centres). Figure 5.7 shows one of these networks.
Module 1 can be seen as a modular network with 16 inputs and eight outputs. These
outputs are averaged to provide a single grade estimate for the module.
Input vectors are normalised, i.e. reduced to vectors of equal length. This is
necessary to ensure that changes of equal scale in different inputs have the same effect
on the network’s performance. The outputs are denormalised to give an estimate in
the original range of values.
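The exact scaling formula is not given here; the sketch below assumes simple z-score scaling, which is broadly consistent with the roughly ±2.5 range of the normalised input space seen in Fig. 5.8, and should be treated as an assumption.

import numpy as np

def normalise(patterns):
    """Z-score scaling of the pattern matrix (rows = patterns,
    columns = inputs/outputs); returns the scaled data together with
    the statistics needed to denormalise network outputs later."""
    mean = patterns.mean(axis=0)
    std = patterns.std(axis=0)
    std[std == 0] = 1.0            # guard against constant columns
    return (patterns - mean) / std, mean, std

def denormalise(values, mean, std):
    """Map normalised network outputs back to the original grade range."""
    return values * std + mean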
Figure 5.7: RBFN used as part of module 1 in MNNS. Training patterns from an octant
search were used to train the network.
The learning process begins with the initialisation of the RBF centres. This
process involves positioning the centres and setting the bias of the basis functions. As
already explained, the centres were chosen randomly in the input space and the bias
was usually set to an initial value of 1. The initial centre positions and bias values can
be further optimised during the learning process. However, as found during
testing, it is very difficult to train the networks by adjusting all the free parameters
simultaneously. Therefore, training in the MNNS concentrated on one parameter at a
time.
The number of centres, as already discussed, was set by another process,
nesting the RBF learning algorithm. The first parameters to be set by the learning
algorithm are the weights between the hidden and output layer. These are found by
solving a least-squares problem using the known outputs and the outputs of the
network. The rest of the network's free parameters (centre location and bias) were set
one at a time by a gradient descent method. Figure 5.8 shows the location of the basis
function centres in the input space. The distance between the current input vector and
the vector of the basis function centres was measured using the Euclidean distance
measure.
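The least-squares step could be sketched as follows, assuming the hidden unit responses to the training patterns have been collected into a matrix and, for simplicity, ignoring the output activation described in paragraph 5.3.3.

import numpy as np

def solve_output_weights(H, targets):
    """Least-squares fit of the hidden-to-output weights.

    H:       (n_patterns, k) responses of the k hidden units to the
             training patterns (centres and biases held fixed)
    targets: (n_patterns,) known grades at the training locations
    """
    weights, *_ = np.linalg.lstsq(H, targets, rcond=None)
    return weights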
Figure 5.8: Posting of the basis function centres from the RBFN of Fig. 5.7 in the normalised
input space (X-Grade, Y-Distance).
The training pattern set was split into three parts: one third was used for
training, the second for validation, and the third for testing. The patterns were
assigned to each part at random. The learning process stopped when the
maximum number of centres was reached or when the change in the validation error
was less than 0.001%. The architecture with the lowest validation error was saved to
be used further during testing and application. In contrast with the networks shown in
Fig. 5.3, the networks trained using the octant search had more hidden units than their
counterparts in the quadrant search. One explanation for this is that the input
dimensionality is the same in both cases, but the problem in the case of octant search is
more difficult because octant search yields fewer training patterns than quadrant
search. The RBFNs in octant search are required to minimise the validation error for
the same mapping but with less data, and therefore more basis functions are needed to
achieve the mapping. The number of centres varied between 5 and 21 throughout the
case studies.
Figure 5.9: Graph showing the learned relationship between the network’s inputs (grade and
distance of neighbour sample) and the network’s output (target grade) for the RBFN of Fig.
5.7.
After training and validation of the networks, testing took place to measure the
generalisation performance and to provide the basis for comparison with other grade
estimation techniques. Figure 5.9 shows an example of a network’s learned mapping
between neighbour sample grade and distance, and grade at point of estimation.
5.4.3 Module 2 – Learning from Quadrants
Module 2 has four RBFNs each with two inputs (neighbour grade and distance) and
one output (grade at estimation point). As in the case of Module 1, this module can be
considered as a modular neural network with 8 inputs and 4 outputs. The outputs from
the four networks are averaged to provide a single output. Figure 5.10 shows one of
these networks. The number of basis functions was less than in Module 1 networks
because quadrant search produces more training patterns than octant search from the
same dataset, and therefore it is easier for the RBFNs of Module 2 to produce the same
mapping with fewer hidden units.
Figure 5.10: Example of an RBFN from Module 2.
The number of basis functions varied between 2 and 17 throughout the case studies.
The learning process was identical to the one used in Module 1. Figure 5.11 shows
how the centres of the RBFs were located in the normalised input space for the
network in Fig. 5.10 and for a specific case study. Figure 5.12 shows the learned
mapping for the same network, i.e. the learned relationship between the inputs (grade
and distance of the neighbour samples) and the output (grade at estimation point). It
can be seen that generally the network’s output increases with increasing neighbour
grade and decreases with increasing distance of neighbour sample.
Figure 5.11: Posting of the basis function centres from the RBFN of Fig. 5.10 in the
normalised input space (X-Grade, Y-Distance).
Figure 5.12: Graph showing the learned relationship between the network’s inputs (grade and
distance of neighbour sample) and the network’s output (target grade) for the RBFN of Fig.
5.10.
5.4.4 Module 3 – Learning from Sample 2D Co-ordinates
The single network of this module is a Multi-Layer Perceptron with two inputs
(easting and northing of samples) and one output (sample grade). The number of
hidden units, as shown in Fig. 5.13, was 14, but this changed from one study to
another to achieve better results. The activation function of the hidden units was the
bipolar sigmoid (tanh).
Figure 5.13: Module 3 MLP network trained on sample co-ordinates.
Learning was based on the steepest descent algorithm. The steepest descent
method measures the gradient of the error surface after each complete cycle and
changes the weights in the direction of the steepest gradient. When a minimum is
reached, a new gradient is measured and the weights are changed in the new direction.
The method is improved by the use of the momentum coefficient and the learning
coefficient. The learning coefficient weights the change in the connections. The
momentum coefficient is a term which tends to alter the change in the connections in
the direction of the average gradient. This can prevent the learning algorithm from
stopping in a local minimum rather than the global minimum. In the MNNS the
learning process is split into four periods, each with a different number of training
cycles and different learning and momentum coefficients. Table 5.1 shows how these
coefficients are chosen during training.
Table 5.1: Learning strategy for Module 3 MLP network.

Period          1      2      3      4
Learning Cf.    0.9    0.7    0.5    0.4
Momentum Cf.    0.1    0.4    0.5    0.6
Cycles          1000   100    100    10000
From the table it is clear that the change in the weights is more rapid at the beginning
of training and it is reduced from one period to the next. In most cases the learning
process is stopped well before the end of the last period. For example, in the case of
the iron ore data discussed before, learning was stopped at period 4, cycle 156.
Generally there are no rules for choosing these coefficients and one has to experiment
in order to find the best strategy for training.
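The update rule and the four-period schedule of Table 5.1 could be sketched as follows; the error_gradient function is assumed to be provided by the MLP implementation, and the validation-based early stopping described below is omitted for brevity.

# Training periods from Table 5.1: (learning cf., momentum cf., cycles)
SCHEDULE = [(0.9, 0.1, 1000), (0.7, 0.4, 100),
            (0.5, 0.5, 100), (0.4, 0.6, 10000)]

def train_mlp(weights, error_gradient, schedule=SCHEDULE):
    """Steepest descent with momentum over the scheduled periods.
    error_gradient(w) is assumed to return the gradient of the error
    surface at weights w (a numpy array)."""
    delta = 0.0
    for learning_cf, momentum_cf, cycles in schedule:
        for _ in range(cycles):
            grad = error_gradient(weights)
            # the momentum term biases the step towards the average gradient
            delta = -learning_cf * grad + momentum_cf * delta
            weights = weights + delta
    return weights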
The training patterns were split into three parts for training (55%), validation
(30%), and testing (15%). The validation set was used for guiding the learning
process, i.e. the process stopped when there was no significant change in the
validation error. At that point the topology was saved to be used for testing and
application. Figure 5.14 shows what the network has learned in the case of the iron
ore data. It should be noted again that this network is used only for estimating
grades at locations where the previous two modules cannot, due to lack of data.
Figure 5.14: Learned mapping between sample co-ordinates (easting and northing) and
sample grade for MLP network of Module 3.
5.5 Transition from 2D to 3D Data
5.5.1 General
Having described the modular neural network architecture for use with two-
dimensional data, it is now necessary to examine how this architecture can be
modified or expanded to accept data from real 3D sampling schemes such as drillhole
data. There are certain issues that need to be considered during this expansion. The
most obvious is the added dimensionality of the samples. There are now three co-
ordinates defining the location of samples in space: easting, northing, and elevation.
Perhaps more interesting is the fact that samples now have a volume associated with
them. As the assaying procedure is carried out on different drilling core lengths, the
samples come in all sorts of lengths and therefore different volumes. This extra
information needs to be considered in the input space of the estimating architecture.
The fact that each drillhole can give more than one sample complicates things
even further. The neighbour sample search methods have to take this into
consideration to avoid choosing too many samples from the same drillhole. The
search methods described before are also purely 2D and cannot be considered an
option with 3D data, especially where the orebody does not follow a
specific 2D plane in space. A fully 3D search method therefore needs to be
developed.
These issues as well as other minor ones will be discussed over the next
paragraphs of this section.
5.5.2 Input Space: Adding the Third Co-ordinate
In three-dimensional sampling schemes commonly used in exploration programmes,
samples are located in space by three co-ordinates: easting, northing, and elevation.
As was explained in previous paragraphs, one of the modules of MNNS is an MLP
network trained on the 2D co-ordinates of samples. The same network now needs to
increase its input dimensionality to accommodate the elevation co-ordinate of each
sample. The inputs of the network change from two to three. This obviously affects
the number of weights necessary, i.e. the number of hidden units has to increase.
The networks in the other two modules have the distance of the neighbour
samples as an input. This distance was calculated in 2D space. Now the distance is
calculated in 3D space. The centres of the basis functions were initially positioned
randomly in the input space. This is inadequate in the case of three-dimensional
samples, as was found during testing. The more complex distribution of neighbour
sample distances is responsible for this fact. Therefore a different way of centre
positioning needs to be employed.
5.5.3 Input Space: Adding the Sample Volume
The sample volume defines what people in geostatistics would call the support of a
particular sample. In drillhole data, samples have a certain length along the drillhole
itself. In order to cope with the variations in the support of samples, it is
necessary to pass the samples through compositing and use composites of equal
length in the estimation procedure. This is the case for most of the conventional
methods of estimation, including geostatistics.
In the case of the MNNS approach, there is no need to composite the samples
into equal length composites. The architecture is modified to accept the length of the
samples as an extra input to all neural networks involved. Specifically the network
trained on the sample co-ordinates now also accepts the length of the samples - the
inputs increase to four (easting, northing, elevation, and length). The networks trained
on neighbour samples now receive the neighbour sample length as well as its grade
and distance from the estimation point.
A complication of the transition to 3D data relative to the sample volume is
the fact that the estimation is now taking place in 3D as well. Block modelling is the
norm for 3D grade estimation. As was described before, block modelling is based on
blocks with an associated volume. This volume needs to be considered during
estimation for the same reasons that sample length is considered during training and
estimation. The extra input added to the neural networks enables the introduction of
the block volumes during estimation.
5.5.4 Search Method: Expanding to Three Dimensions
The search methods used in the case of 2D data cannot be used with 3D data because
they take no account of the third dimension (elevation), which is necessary to
fully define the location of samples in space. The quadrant and octant methods, as
shown in Fig. 5.1, select samples from a plane rather than a 3D sample space. Even if
this plane is rotated about any of the three axes (easting, northing, elevation), these
methods would only be adequate for flat orebodies with little grade variation in
one of the three dimensions. It is therefore necessary to expand these search methods
to three dimensions.
The author first tried to achieve this by applying the quadrant and octant
search in all three planes defined by the three axes: the XY, XZ, and YZ plane.
Figures 5.15 and 5.16 illustrate how the quadrant and octant search would divide 3D
space into sectors.
Figure 5.15: 3D version of quadrant search.
Figure 5.16: 3D version of octant search.
From the figures it becomes clear that the resultant search methods are very
complex and very difficult to comprehend in three dimensions. The total number of
sectors produced is 64 for quadrant and 512 for octant search. This means that the
MNNS would need 64 networks trained on quadrant search data and 512 networks
trained on octant data. Even if this were possible in computational terms, there would
not be enough samples to fill each sector and provide training patterns for every
network.
It is therefore necessary to simplify these search methods in order to cope
with the geometrical characteristics of exploration sampling schemes. After
considering a number of schemes, the author decided to use the simple search method
shown in Fig. 5.17. There are only six sectors in this scheme: upper, lower, north,
south, east, and west. These sectors are defined by the intersection of four planes: two
planes perpendicular to the XZ plane at ±45° dip, and two planes perpendicular to the YZ plane at
±45° dip. In other words, these sectors look like pyramids with a square base and their
apex at the estimation point.
Figure 5.17: Simplified 3D search method used in the MNNS for sample selection.
The advantage of this search scheme is not just the fact that it is very simple and
computationally cheap. With this scheme, the drillhole to which the current
training point belongs always lies within two opposite sectors. This allows easier
control of the number of samples selected from that drillhole, which can help improve
the results of estimation. Another advantage of this scheme is that it can handle any
inclination of the orebody or the drilling scheme.
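A sketch of the sector classification implied by the ±45° planes is given below; the handling of vectors lying exactly on a bounding plane is an illustrative assumption.

def sector_3d(dx, dy, dz):
    """Classify the vector from the estimation point to a sample into
    one of the six pyramidal sectors bounded by the +/-45 degree
    planes (x = easting, y = northing, z = elevation)."""
    ax, ay, az = abs(dx), abs(dy), abs(dz)
    if az >= ax and az >= ay:                  # steeper than 45 degrees
        return 'upper' if dz >= 0 else 'lower'
    if ax >= ay:                               # flatter than 45 degrees
        return 'east' if dx >= 0 else 'west'
    return 'north' if dy >= 0 else 'south'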
The author decided to replace both 2D-search methods (quadrant and octant)
by this simplified 3D method, which means that the MNNS has now just two
modules: one trained on the sample co-ordinates and length and one trained using data
from the single search method. This also means that the second module now has only
six networks, one for every sector of this search scheme.
5.6 Complete Prototype of the MNNS
The complete modular neural network system for grade estimation using 3D data is
shown in Fig. 5.18. The system comprises three neural network modules responsible
for the estimation and a data processing and control module that generates the training
patterns for the networks by applying the search method described.
Figure 5.18: Diagram showing the structure of the MNNS for 3D data (units are the neural
network modules).
The second module or unit as shown in the figure is a single RBFN trained on the
outputs of the six RBFNs of the first module. This network replaced the simple
averaging of the RBFNs' outputs that was done previously. It was found necessary as
it became clear during testing that some of the RBFNs of the first module were
consistently producing estimates closer to the actual values while others were
consistently far from them. The learning process for this RBFN is identical to that
of the RBFNs in module one. The number of hidden units varied between six and
nine. Figure 5.19 shows an example of how this network's output varied depending on
the outputs of the RBFNs in module one.
Figure 5.19: Learned weighting of outputs from module one RBFNs by the RBFN of module
two.
The third module is the neural network modified for 3D data, with four inputs
(easting, northing, elevation, and length) and one output (target grade). Unlike the
case of 2D data, where the MLP architecture seemed to perform better, early tests run
using 3D data made it clear that the RBFN reduces the validation error even further
than the MLP; the third module is therefore based on a single RBFN and not on the
MLP as described before.
The data processing and control module accepts data in ASCII form and
creates training pattern files for the neural networks of the MNNS. The formation of
training patterns is based on the search method described. Basically, for every training
sample in the dataset, one neighbour sample is chosen from every sector – the one
closest to the training sample. The grade of the neighbour sample, its distance from
the training sample and its length are written as inputs in the training pattern file of
the network responsible for the specific sector, while the training sample grade is
written as the required output. Clearly, on some occasions there are no neighbour
samples in some of the sectors. In those cases, the training sample is marked for
estimation with module three, which is trained on the training sample co-ordinates.
The network of module 3 is however trained on all samples regardless of the results of
the search process. Figure 5.20 shows an example of this network’s output depending
on its inputs.
Figure 5.20: Learned relationships between sample co-ordinates, length (inputs) and sample
grade (output) from the RBFN of module three.
After training is stopped, the saved topologies are used for estimation. Initially this
was done on the basis of drillhole samples hidden from the training process for testing
reasons. Later, most of the drillhole samples were allocated to the training and
validation process. Cross-validation was used for testing the validity of the learned
mappings and for comparing with other grade estimation techniques. Studies carried
out with this architecture [40] supported most of the choices made during the
development process described in this chapter. Even at this prototype stage, the
system could perform reasonably well on a wide variety of data.
5.7 Conclusions
In this chapter the development of the modular neural network system (MNNS) for
grade estimation was described. This system, with some modifications, will become
the core of GEMNet II. As explained, the MNNS approaches grade estimation in two
different ways:
1. Using a sample’s co-ordinates and length to construct the picture of grade in 3D
space;
2. Using neighbour samples’ grade, distance and length to construct the picture of
grade in specific directions in space.
This approach ensures that there is a grade estimate even in places where
sampling density is very low. It also takes advantage of the information
hidden in the relationship between neighbour samples and takes the support of the
samples into consideration. Because of that, it can provide estimates that have a
volume associated with them, as opposed to point estimates.
The MNNS requires a minimum of human interaction – this interaction is
limited to a single parameter of the RBFNs and it does not require any particular
knowledge or skills from the user. The results depend solely on the data at hand – the
estimation process adjusts to the available data.
However, the described system, being in prototype form, is not very user-
friendly, and integration of its results in the process of reserve estimation is difficult.
Therefore it is necessary to integrate the MNNS into a complete resource-modelling
environment in order to get the most out of the system and realise its full potential.
This integration will also allow better comparison with existing methodologies.
In the next chapter this integration is described as well as a number of minor
modifications to the MNNS architecture. The targeted resource-modelling
environment was one of the leading mining software packages called VULCAN from
Maptek/KRJA Systems Ltd. The integration of the MNNS inside VULCAN led to the
development of GEMNet II.
6. Case Studies of the Prototype Modular Neural Network System
6.1 Overview
The case studies presented in this chapter were based on the prototype MNNS
architecture. In fact there were two versions of the prototype system, as described in
Chapter 5, one for 2D data and one for 3D. There are two case studies for each one of
them. More specifically, these case studies are:
• 2D iron ore deposit
• 2D copper deposit
• 3D gold deposit
• 3D chromite deposit
The 2D deposits have been extensively used in geostatistical as well as neural
network case studies and are ideal for comparison of different approaches. The 3D
deposits have never been used in a published study.
These studies are part of a larger set of tests run using the prototype MNNS
architecture. The purpose of those tests was to validate the approach and fine-tune the
architecture. As the 2D datasets were created specifically to demonstrate the validity
of the geostatistical approach, they were ideal for testing the MNNS and comparing its
results with those obtained using inverse distance and kriging. The datasets from the
four case studies presented here are given in Appendix B.
It should be noted that finding datasets from real deposits is fairly difficult, as
mining companies are quite reluctant to give information away. Both in the MNNS
studies of this chapter and the GEMNET II studies of the next, the most common type
of deposit is metallic.
The performance of the MNNS will be compared with inverse distance and
kriging as these are the most commonly used methods for ore grade estimation in
metal deposits. As the only known ore grade values are those provided in the samples,
a part of the dataset is kept out of the information provided to the various methods for
estimation. In other words, some of the samples become the testing points where the
performance of each method is tested. This clearly compromises the overall
performance of each method but unfortunately there is no other objective way of
testing.
The estimation performance will be expressed in terms of the mean absolute
error on the test set and also with graphs of actual vs. estimated (scatter), histograms
of grade distribution, and contour maps of ore grade.
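For reference, the two error measures could be computed as sketched below; the percentage form is assumed to be the mean of the absolute errors taken relative to the actual grades.

def mean_absolute_error(actual, estimated):
    """Mean absolute error over the test samples."""
    return sum(abs(a - e) for a, e in zip(actual, estimated)) / len(actual)

def mean_absolute_percent(actual, estimated):
    """Mean absolute error expressed as a percentage of the actual grades."""
    return 100.0 * sum(abs(a - e) / a
                       for a, e in zip(actual, estimated)) / len(actual)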
The datasets were of varying complexity and size and therefore presented a
varying difficulty to the estimation techniques used. Table 6.1 summarises the
characteristics of these datasets.
Table 6.1: Characteristics of datasets from the MNNS case studies.

                     2D Iron Ore   2D Copper     3D Gold          3D Chromite
Total Samples        91            51            112              94
Area/Volume          160,000 m2    360,000 m2    42,686,028 m3    70,010,800 m3
Standard Deviation   4.4798        0.3731        0.5521           7.0019
Average Grade        34.59% Fe     0.4658% Cu    0.9316 gr/t Au   15.7223%
Results from inverse distance and kriging were obtained using Surfer from Golden
Software in the 2D case studies, and VULCAN in the 3D case studies.
6.2 Case Study 1 – 2D Iron Ore Deposit
The dataset used in the first case study of the MNNS architecture is a simulated iron
ore deposit [41]. It is a low-grade sedimentary deposit with an average grade of
34.59% Fe. The 91 samples contained are in essence two groups of data: 50 of them
are samples taken at random over the 160,000 m2 (400 m x 400 m) sampling area and
the other 41 are taken on a regular 100 m grid (Fig. 6.1).
Figure 6.1: Posting of input/training samples (blue) and test samples (red) from the iron ore
deposit.
The 50 random samples were used for training and validation of the MNNS
networks. They were also used as input data for inverse distance and kriging. The 41
grid samples were used for testing all three approaches. The absolute errors produced
by the three methods were as follows:
Table 6.2: Mean absolute errors from case study 1.

Method                     Mean Absolute Error   Mean Absolute Error (%)
Inverse Distance Squared   2.77                  8.26
Kriging                    2.64                  7.90
MNNS                       2.60                  7.77
Figure 6.2 shows a scatter diagram of the actual vs. the estimated grades from the
various methods.
Figure 6.2: Scatter diagram of actual vs. estimated iron ore grades.
The MNNS slightly outperforms kriging and inverse distance on this dataset. It
should be noted once more, though, that this dataset was generated to suit a
geostatistical study, and kriging is therefore expected to give good results. It is quite
obvious that all methods tend to underestimate in high-grade areas. The reason for
this, at least in the case of the MNNS, is that these areas are close to the borders
of the deposit, where the MLP provides the estimates. The MLP module seems to
give estimates close to the average grade. The performance of the three methods
becomes even clearer by examining the grade distributions below (Fig. 6.3).
Figure 6.3: Iron ore grade distributions – actual and estimated.
From the above figure it seems that the MNNS generates a smooth distribution similar
to that of the inverse distance method. Kriging follows the shape of the actual
distribution better. Generally, all three methods perform well. The following contour
maps show exactly how close the methods were to the actual values and to each other.
Figure 6.4: Contour maps of iron ore actual and estimated grades (a. actual grades, b. MNNS grades, c. kriging grades, d. inverse distance grades; grades in %Fe).
Kriging and the MNNS seem to perform better in different regions, except for a part in
the southwest of the deposit where they both perform badly. A lack of training
samples is the main reason for the areas of high error. The MNNS seems to map the
low-grade area in the northwest region better, while kriging did better in the
southeast.
6.3 Case Study 2 – 2D Copper Deposit
The 2D copper deposit in this study is in essence a level from a theoretical open pit
copper mine [36]. It consists of 51 drillhole composites as shown in Fig. 6.5. These
composites cover an area of 360,000 m2 and are concentrated mainly in the central part
of that area. This data has been used by Hughes et al. [36], Wu and Zhou [112] and
Burnett [12] for testing different estimation methods.
Figure 6.5: Posting of input/training samples (blue) and test samples (red) from the copper
deposit.
The dataset was split in two parts: 30 composites were used for training the networks
and 21 for testing the performance of the MNNS as well as that of the other methods.
The inverse distance and kriging estimates were obtained using the same parameters
that Hughes et al. used in their study [36]. The performance of the three estimators in
terms of the mean absolute error on the test data is given below:
Table 6.3: Mean absolute errors from case study 2.

Method                     Mean Absolute Error   Mean Absolute Error (%)
Inverse Distance Squared   0.0226                8.21
Kriging                    0.0291                7.18
MNNS                       0.0258                4.81
Figure 6.6 shows a scatter diagram of the actual vs. the estimated copper grades from
the various methods.
Figure 6.6: Scatter diagram of actual vs. estimated copper grades.
Once again, the MNNS is performing well compared to the other two methods.
Inverse distance and kriging appear to have very similar performance with their
estimates being very close. Unfortunately, the locations used to test the performance
of the three methods are simply samples that would otherwise have been used as input
information. Unlike case study 1 where there was a good spread of the test samples, in
case study 2 and in most of the studies to follow, input data are used for testing, which
means that the spread of the test points is not always ideal. In these cases, testing
takes the form of cross-validation, where the estimator is trying to recreate sample
points from the remaining data set. The actual as well as the estimated copper grade
distributions are shown in Fig. 6.7.
Figure 6.7: Copper grade distributions – actual and estimated.
The MNNS in this study tends to slightly overestimate grades close to the average but
generally the estimates are well balanced. The other two methods are also performing
well. The contour maps in Fig. 6.8 illustrate the results of grade estimation. The actual
grade map is limited to the sampling area as there is no information outside it. MNNS
is limited to the testing area. Inverse distance and kriging extend to the borders of the
map but comparison should be limited to the testing area.
Figure 6.8: Contour maps of copper actual and estimated grades (actual, MNNS, kriging, and inverse distance panels; grades in % Cu).
Inverse distance is by far the worst method in this case. Kriging does better but
fails to split the high-grade area. The MNNS tends to underestimate the high-grade
area that kriging models very well on the right side of the map. However, the MNNS
is better at finding the shape of the high-grade area, as well as splitting it into its parts
as they appear in the actual grade map.
6.4 Case Study 3 – 3D Gold Deposit
With the third case study, the transition is made from 2D to 3D data. This transition
means that the 3D version of the MNNS architecture is now used. The sample search
methods are not the 2D octant and quadrant methods, but the 3D search scheme
developed specifically for the MNNS.
The data used in this case study is part of a larger dataset from a copper/gold
deposit. The original dataset consists of four orebodies developed along fractures in
metasomatised host rocks, which include gneissic granites, mica schists and
metasomatites. In this study, only one of the orebodies was used. The input and test
data were limited to the drillhole samples located inside this orebody (code named
TQ2). The total number of samples was 112.
As the dataset is now 3D, the visualisation of the results of estimation
becomes more difficult. Contour maps can only be used to show sections through the
estimated area. Normally, estimation in 3D deposits is made on a block model basis,
but as the actual grade values of the blocks are unknown, the estimation performance
can only be measured over a part of the input dataset.
The orebody model was created in VULCAN/Envisage during a
geological modelling study based on lithology. Fig. 6.9 shows a 3D view of the
orebody and drillholes (screenshot from Envisage). It should be noted that in this
study VULCAN is used to provide the inverse distance and kriging estimates and
not as an implementation environment for the MNNS. The same study, including the
complete dataset with four orebodies, is repeated in the next chapter using GEMNET
II, the system fully integrated in VULCAN.
Figure 6.9: 3D view of the orebody and drillhole samples used in the 3D gold deposit study.
From the 112 available samples, 42 (37.5%) were used for testing the
performance of the three estimation methods. This means that the MNNS had only 70
samples (62.5%) available to train the various networks. After testing with all three
methods, the actual and estimated average gold grades were:
Table 6.4: Actual and estimated average gold grades.

                 Actual   ID2      Kriging   MNNS
Average (gr/t)   0.9316   0.6524   0.6581    0.7420
The mean absolute error was quite high in comparison with the previous two studies.
Clearly, a three-dimensional orebody is far more challenging and demanding than a
two-dimensional one. The mean absolute errors for the three methods are given
below:
Table 6.5: Mean absolute errors from case study 3.

                 ID2      Kriging   MNNS
Mean ABS Error   0.4242   0.3939    0.3162
Mean ABS %       44.10%   40.17%    31.60%
The results for inverse distance and kriging were obtained using cross-validation in
VULCAN. Cross-validation was limited to the 42 test samples used for testing the
MNNS. The following figure (Fig. 6.10) shows the data fit produced by the three
methods.
Figure 6.10: Scatter diagram of actual vs. estimated gold grades.
It is obvious that none of the methods performs very well. The MNNS, even though it
performs better than the other methods, tends to overestimate grades close to the
average value and underestimate the high-grade samples. This becomes clearer in the
next figure (Fig. 6.11) showing the actual and estimated distributions.
Figure 6.11: Gold grade distributions – actual and estimated.
The distribution shown in the above figure as the actual gold grade distribution refers
only to the test samples and not the entire dataset. However, as can be seen from the
following graph, this distribution is not very far from that of the entire dataset. The
main differences are in the low and high grade areas, where the test set had fewer and
more samples respectively. This could explain the relatively average performance of
all three methods.
Figure 6.12: Gold grades distribution of the complete dataset.
This study, the first using 3D data from a real deposit, shows how much more
difficult it is for the estimation methods to perform well than in the case of 2D data
from simulated deposits. The performance degradation, at least for the MNNS, can be
attributed to the higher input dimensionality and the higher complexity of
the required mapping.
This study also shows that the MNNS in its first application to 3D data has
outperformed both inverse distance and kriging. What is not clear from this study is
the time difference in applying these methods. MNNS required about an hour to
generate the training pattern files, train the networks, and provide estimates. Kriging
required a complete geostatistical study that, depending on how thorough one wants
to be, can take many hours.
6.5 Case Study 4 – 3D Chromite Deposit

The dataset used in the final study of the MNNS is taken from a larger sample
database of an undeveloped chromite deposit. There are 94 samples from 26 drillholes
in this dataset. There is no geological study and therefore the estimation is not
constrained by geology. Normally, there should be an orebody to limit the samples
used for the estimation as well as the locations where the estimation takes place, but in
this case the dataset is very small and the lack of geological modelling is not expected
to generate problems. The drillholes from the dataset are shown in Fig. 6.13.
Figure 6.13: Drillholes from a 3D chromite deposit.

From the 94 samples, 38 were used for testing the three methods while the remaining
56 were used for training the neural networks and as input information for inverse
distance and kriging. The actual and estimated average chromite grades were as
follows:
Table 6.6: Actual and estimated average chromite grades.

                              Actual    ID2       Kriging   MNNS
Average grade (% Chromite)    15.7639   14.7639   15.1511   16.3449
The estimation performance of all three methods was good considering the fact that
there was no limitation as to the samples used due to the lack of a geological model.
This means that the methods were able to estimate grades from samples that do not
necessarily belong to the same geological domain. The mean absolute errors are given
below:
Table 6.7: Mean absolute errors from case study 4.

                 ID2      Kriging   MNNS
Mean ABS Error   3.7687   3.3996    2.4536
Mean ABS %       21.83%   19.82%    16.19%
Once again, the MNNS is outperforming the other two methods but this time the
difference is clearer as they all perform well. The MNNS is closer to the actual
average chromite grade and produces the smallest absolute errors of the three
methods. This is verified by the data fit graph and grade distribution chart shown in
the following figures.
Figure 6.14: Scatter diagram of actual vs. estimated chromite grades.
Figure 6.15: Chromite grade distributions – actual and estimated.
From the above graphs it appears that kriging is doing better at the low to middle
grade samples while the MNNS is doing better at high-grade samples. Inverse
distance tends to overestimate low-grade samples and underestimate high-grade ones.
Generally, all three methods perform well.
6.6 Conclusions

The prototype 2D and 3D MNNS architectures were tested in this chapter in four very
different case studies. The datasets used came from both simulated and real
deposits. Each dataset had a different type of ore as its target quantity, the
common point being that all were metals. The number of samples in each
case study was relatively low. These were, however, case studies that aimed at the
development of the modular architecture and not at the establishment of the approach
as a valid ore grade estimation technique. Therefore the low number of samples
allowed easy monitoring of the system’s performance and fast development times.
The performance of MNNS, as measured by the produced absolute errors and
estimated grade distributions, compared very well to the performance of inverse
distance and kriging. MNNS seemed to perform well even on datasets that were
designed to demonstrate the validity of the geostatistical approach. Clearly though,
there could be plenty of room for improvement in the geostatistical studies. That,
of course, always comes at the expense of time and effort.
The speed of development and the independence of the approach from the
knowledge and skills of the user have been demonstrated by these case studies. The
quality of the estimates also has shown that the MNNS architecture is a step in the
right direction for ore grade estimation using artificial neural networks.
7. GEMNET II – An Integrated System for Grade Estimation
7.1 Overview

In this chapter the discussion continues with the analysis of GEMNET II, the
integrated system for grade estimation developed by the author and based on the
Modular Neural Network System described in the previous chapter. GEMNET II is
mainly written in C and uses parts of the SNNS, the Stuttgart Neural Network
Simulator from the University of Stuttgart, Germany [97]. The GEMNET II core
program is a data processing and control module written in C that processes the
samples file as well as the block model file. The core program also makes external
calls to parts of the SNNS simulator. These parts are the main simulator kernel, the
batch execution language (BATCHMAN), and the C code extraction program
(SNNS2C) that converts the trained neural network topologies to C functions. The
development of neural networks is controlled by a number of scripts written in the
SNNS batch language, which is very similar to AWK and C.
GEMNET II is integrated within VULCAN, a leading software package for
resource modelling. The control of the system is done through ENVISAGE,
VULCAN's graphical editor that provides the graphical user interface for GEMNET II.
The interface between GEMNET II and VULCAN is based on a number of scripts
written in a very popular scripting language called Perl. Specifically for VULCAN
there are a number of extensions to Perl, which are called Lava extensions. These give
access to graphical objects and routines in ENVISAGE, which are very useful for
integrating external programs like GEMNET II.
The integration of GEMNET II with SNNS and VULCAN provides the
following additional functionality that was missing from the MNNS as a standalone
system:
• Ability to try practically every neural network architecture without having to
modify the core of the system;
• Faster training and application of neural networks during the estimation process;
• A graphical user interface that is easy to learn and use;
• Direct access to an integrated modelling environment allowing the incorporation
of the estimation results in a larger scale modelling operation;
• Estimation based on the advanced block modelling that VULCAN provides;
• 3D visualisation of the drillhole samples and the targeted block model;
• 3D visualisation of the estimation results and validation of the training process;
• Straightforward comparison of GEMNET II with other estimation packages and
techniques incorporated in VULCAN, like the geostatistical packages GSLIB,
Geostokos, and ISATIS;
• Data management based on VULCAN's project file structure;
• Estimation reliability measures.
The MNNS core has also been modified to improve the estimation process and
provide a number of reliability measures. The next section gives the details of the core
architecture and shows how it was implemented using the SNNS simulator. It should
be noted that there were some changes in the names of modules in the MNNS.
7.2 Core Architecture and Operation
7.2.1 Exploration Data Processing and Control Module

This is the main part of GEMNET II. It is a program written entirely in C with the
code being compatible with both Microsoft Windows based PCs and UNIX based
workstations. It is responsible for processing the drillhole samples file and the block
model centroids file, normalisation of the data, generation of training patterns for the
various networks, and for making all the necessary external calls to the neural
network simulator (SNNS). Once the development of neural networks is completed
and the C code extracts have been compiled, this module carries on with the
estimation process. Figure 7.1 shows schematically the operation of this module.
The first operation of the data processing and control module is to read the
samples file and place the sample co-ordinates and assay values (grades) in a number
of arrays. This is done to increase the speed of the search process later on. The
samples file is normally a map file generated by VULCAN’s compositing function.
The map file contains a header describing the file structure and records consisting of
sample ids, sample co-ordinates, and assay values.
The values of the arrays are normalised so that all co-ordinates and assay
values vary between zero and one. As was explained before, this ensures that the
effects of the range of values are eliminated from the neural network training process.
The normalisation information (minimum, maximum, and range values) is stored in a
file that will be used later to restore the initial values and to ensure that the estimates
will also be in the correct range of values. Figure 7.2 shows the normalisation
information as reported by GEMNET II in VULCAN. The contents of the normalised
arrays (sample co-ordinates and grade) are written in a file used for training the
second module’s network. As mentioned before, this network is trained on the
entire dataset but is used to provide estimates only where there are not enough
neighbour samples for the networks of the first module.
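The normalisation step is simple enough to sketch in C. The following is a minimal illustration of the min-max scheme described above, assuming plain arrays; the function names and the struct holding the normalisation information are illustrative, not taken from the GEMNET II source.

    /* Minimal sketch of the min-max normalisation described above.
       Names are illustrative; GEMNET II stores the same information
       (minimum, maximum, range) in a file for later de-normalisation. */
    typedef struct { double min, max, range; } NormInfo;

    NormInfo normalise(double *values, int n)
    {
        NormInfo info;
        int i;
        info.min = info.max = values[0];
        for (i = 1; i < n; i++) {
            if (values[i] < info.min) info.min = values[i];
            if (values[i] > info.max) info.max = values[i];
        }
        info.range = info.max - info.min;
        if (info.range == 0.0) info.range = 1.0;  /* guard constant columns */
        for (i = 0; i < n; i++)                   /* map all values into [0, 1] */
            values[i] = (values[i] - info.min) / info.range;
        return info;
    }

    /* Restores an estimate to the original range of values. */
    double denormalise(double v, NormInfo info)
    {
        return v * info.range + info.min;
    }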
Figure 7.1: Simplified block diagram showing the operational steps of the data processing
and control module in GEMNET II.
Figure 7.2: Normalisation information panel.
The next step is the application of the search method. Each sample is taken as
the centre of the search scheme. The space around the centre sample is divided into
the six sectors described in the previous chapter. The centre sample is in essence the
training point for the RBF networks of the neural network modules. All the remaining
samples are assigned to one of the sectors depending on their relative location to the
centre of the search. It should be clear that as the discussion is about samples with an
associated volume, their location is identified as the centroid of the volume. The
normalised distance of each neighbour sample to the centre is calculated and stored
together with the normalised neighbour sample and centre sample grades in one of six
files, one for every sector.
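The exact sector geometry is defined in the previous chapter; purely as an illustration of the bookkeeping involved, the fragment below assigns a neighbour to one of six axis-aligned sectors by its dominant coordinate difference. This is a hedged sketch, not the actual GEMNET II search code.

    /* Hedged sketch: assign a neighbour to one of six sectors around the
       centre sample, taking the axis with the largest absolute coordinate
       difference. The real sector definition may differ. */
    #include <math.h>

    enum Sector { EAST, WEST, NORTH, SOUTH, UPPER, LOWER };

    enum Sector classify(const double centre[3], const double sample[3])
    {
        double dx = sample[0] - centre[0];   /* easting   */
        double dy = sample[1] - centre[1];   /* northing  */
        double dz = sample[2] - centre[2];   /* elevation */

        if (fabs(dx) >= fabs(dy) && fabs(dx) >= fabs(dz))
            return dx >= 0.0 ? EAST : WEST;
        if (fabs(dy) >= fabs(dz))
            return dy >= 0.0 ? NORTH : SOUTH;
        return dz >= 0.0 ? UPPER : LOWER;
    }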
At the end of the search process, there are six files each containing a different
number of samples depending on the geometrical characteristics of the drilling
(sampling) scheme. In fact the number of training patterns is equal between opposite
sectors, e.g. the north sector has equal patterns with the south sector, etc. Therefore
the networks of the second module in GEMNET II are trained on different number of
samples, while in the MNNS the number of samples was constant. This is because in
the MNNS only one neighbour was selected from every sector for every sample while
in GEMNET II the number of samples depends only on the available samples in a
sector. This is a fundamental difference between the two implementations of the
modular architecture. The networks in the MNNS were not provided with all the
available information on the effects of the sample distance as they were trained on
only one neighbour sample per sector. In GEMNET II the six RBF networks are
trained on all the information available in order to build a more complete model of the
distance-grade relationship.
However, the above change introduces one complication. The final
network trained on the outputs of the first module’s networks needs to be trained on
complete patterns. As each network is trained on different samples there is no
synchronisation in their training process, i.e. these networks are trained sequentially
and on different centre samples. This problem is rectified by the use of test files.
Together with the six training files produced by the search process, there are six test
files that contain patterns formed using the closest neighbour in each sector and only
for the centres where all sectors have at least one neighbour sample. This way the
trained networks can be synchronised and provide individual estimates for the same
centre samples, which can then be used for training the final module network.
All the pattern files created need to be converted into a format compatible with
the neural network simulator used (SNNS). This is fairly easy to do as SNNS reads
ASCII pattern files with a very straightforward structure. An example of a training
pattern file generated during a GEMNET II case study is given in Appendix A.
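For readers unfamiliar with that structure, a training pattern file follows the standard SNNS pattern definition header; the fragment below is an invented illustration (the values are not from any case study):

    SNNS pattern definition file V3.2
    generated at Mon Jan 01 00:00:00 1999

    No. of patterns : 2
    No. of input units : 3
    No. of output units : 1

    # Input pattern 1 (normalised grade, distance, length):
    0.42 0.17 0.80
    # Output pattern 1 (normalised centre sample grade):
    0.38
    # Input pattern 2:
    0.55 0.63 0.21
    # Output pattern 2:
    0.47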
The data processing module operation proceeds with the processing of the
block model centroids file. Again this file is generated in VULCAN using a block
model export option that calculates the centroid co-ordinates and volume of each
block. The centroid co-ordinates are real world co-ordinates and not relative to the
origin of the block model.
The block model centroids are normalised and passed one at a time to the
centre of the same search scheme used for the drillhole samples. This normalisation
uses the same parameters used for the normalisation of the drillhole samples to ensure
that their relative locations are preserved. The search process is exactly the same only
this time the place of the search centre has been taken by block centroids and only one
neighbour sample is selected – the nearest – from each sector. The neighbours are
again drillhole samples. Each block is flagged depending on the existence of a
neighbour sample in each sector. There is one flag for each sector, which is set to
one if there is a neighbour and zero if there is not. These flags are written to a file
sequentially and are used during the estimation process to control the usage of the
individual networks. This will be discussed later when the estimation process is
described.
The grade, distance, and length of the neighbour samples from the six sectors
are written in an input pattern file for the first module’s networks. The centroids of the
blocks are written in another input pattern file for the second module’s network. As
mentioned, the choice between modules one and two during estimation is
controlled by the file containing the flags from the search process.
After the processing of the block model is completed, the data processing and
control module continues with the most important aspect of the operation of GEMNET
II: the neural network development. The module makes a number of external calls to
the SNNS executables. These calls are arranged in command line batch files. The first
set of calls is targeted at BATCHMAN, the SNNS batch language for neural network
development. The calls include as arguments the name of the batch program to be
executed as well as a log file name where all the messages from the development
process are to be stored.
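As an illustration, such an external call could be issued from C as below; the -f and -l arguments name the batch program and log file in the standard BATCHMAN command line, while the wrapper function and file names are hypothetical.

    /* Hedged sketch of launching a BATCHMAN run from the control module.
       -f names the batch program, -l the log file; the call blocks until
       the training script finishes. */
    #include <stdio.h>
    #include <stdlib.h>

    int run_batch(const char *script, const char *logfile)
    {
        char cmd[512];
        int status;
        sprintf(cmd, "batchman -f %s -l %s", script, logfile);
        status = system(cmd);
        if (status != 0)
            fprintf(stderr, "BATCHMAN failed for %s (status %d)\n",
                    script, status);
        return status;
    }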
The batch programs are written in the SNNS batch language that is very
similar to AWK and C. The batch language provides access to every function of the
SNNS kernel: all the neural network architectures and learning algorithms. The batch
programs that come with GEMNET II control the development of all the employed
neural networks. The beauty of this approach is that by simply changing the batch
program, one has complete control over the learning process. As the batch program is
just a text file, an external process, such as VULCAN’s graphical user interface, can
easily alter it. This way the complete control of GEMNET II neural network
development is passed to the interface with VULCAN. An example of a batch
program from GEMNET II is given in Appendix A.
The first batch programs train the networks of module one and two using the
training patterns. The log files are written for these networks during this process.
After training terminates, the test patterns are presented to these networks to provide
synchronised outputs, i.e. individual estimates for the same samples. These outputs
are written into ‘results’ files, which are subsequently used for generating the training
patterns for the final module network. The last of the batch programs trains the final
module network. Once this process terminates, the neural network development is
complete.
The trained networks at this stage are in the form of SNNS network files –
ASCII files containing the network topology and the weights and biases after training.
These networks now need to be converted into C functions to be used during the
estimation process. The data processing and control module makes the necessary calls
to the C code extraction utility provided with SNNS, the SNNS2C. This utility creates
both the header (.h) file and the code (.c) file from the network file. Examples of a
trained file as well as the respective header and C code file are given in Appendix A.
The module calls the SNNS2C to convert all the trained networks to C code. All that
is left then in order to use the networks is to compile them and link the headers with
the application. Upon completion of this process, GEMNET II is ready to provide
grade estimates in unknown locations.
The final operation of the data processing and control module is grade
estimation on a block model basis. The program reads the flags file and uses the first
module networks whenever a sector flag is one and module two whenever it is zero. The input
pattern files generated during the block model processing described above provide the
input values for the network function calls. The final module network is then called
using the outputs of module one and two network functions. The data processing and
control module de-normalises the final estimate and the block model centroids and
writes them in an estimates file. Together with the centroid co-ordinates and grade
estimate, the module also writes the variance of the individual estimates from module
one and two networks as well as the flags showing which networks are responsible for
the estimate. These extra parameters are used to validate the estimation process and
identify any problematic areas or networks.
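A hedged sketch of this flag-driven logic is given below. It assumes the int name(float *in, float *out, int init) prototype that SNNS2C produces, illustrative network names, and one plausible arrangement in which the module two estimate stands in for sectors without neighbours; the real GEMNET II code may differ in detail.

    /* Hedged sketch of the flag-driven choice between modules during block
       estimation. Network names, array sizes, and the fallback arrangement
       are illustrative assumptions. */
    typedef int (*NetFn)(float *in, float *out, int init);

    extern int east(float *, float *, int), west(float *, float *, int),
               north(float *, float *, int), south(float *, float *, int),
               upper(float *, float *, int), lower(float *, float *, int);
    extern int spatial(float *, float *, int);    /* module two network   */
    extern int weighting(float *, float *, int);  /* final module network */

    float estimate_block(float sector_in[6][3], const int flags[6],
                         float centroid[3])
    {
        NetFn sector_net[6] = { east, west, north, south, upper, lower };
        float final_in[6], out[1];
        int s;

        for (s = 0; s < 6; s++) {
            if (flags[s])
                sector_net[s](sector_in[s], out, 0);  /* module one network */
            else
                spatial(centroid, out, 0);            /* module two network */
            final_in[s] = out[0];
        }
        weighting(final_in, out, 0);  /* weight the individual estimates */
        return out[0];                /* still normalised at this point  */
    }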
After the estimation process is complete the data processing and control
module terminates. The main and most important part of GEMNET II operation is
complete. The main contributing parts of this operation are shown in Fig. 7.3.
Notably, this operation requires only a minimum of human interaction.
Figure 7.3: Interaction between GEMNET II and other parts of the integrated system
during operation of the data processing and control module.
7.2.2 Module Two – Modeling Grade’s Spatial Distribution

The second neural network module in GEMNET II consists of the RBF network, as
described in the MNNS architecture, as well as the batch program that controls its
learning process. It is presented before the first module for consistency reasons. In
contrast with MNNS, the learning process is not part of the main program but it is
implemented in the SNNS batch language.
There is no difference in the RBF network topology for this module between
the MNNS and GEMNET II. However, there are major differences in the learning
process for this network. Most of the case studies run using GEMNET II involved
considerably larger datasets – more than a thousand samples. The learning process had
to be improved to cope with the abundance of training data.
One of the most important changes was in the initialisation of the network. In
MNNS, this was simply done by randomly placing the RBF centres in the input space.
In GEMNET II this was found to be inadequate due to the large number of samples
defining the input space. A more ‘intelligent’ way of locating the centres has been
employed: Kohonen learning. Before the network is trained and its weights adjusted,
the input patterns are clustered using a process of self-organisation known as
Kohonen learning (Chapter 2). This process ensures that the input samples are
clustered according to their statistical properties and an RBF centre is allocated to
each cluster. The random positioning of the centres is still taking place right before
this process to accelerate the initialisation stage of the development. The clustering
process is accelerated, as its starting point is a random spread of centres in the input
space.
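In its simplest, winner-take-all form (no neighbourhood function), this clustering step can be sketched in C as follows; the learning rate, its decay, and the cycle count are illustrative values, not the experimentally tuned parameters used in GEMNET II.

    /* Simplified winner-take-all Kohonen learning for placing RBF centres:
       each pattern pulls its nearest centre towards itself, so the centres
       drift into the clusters formed by the data. */
    #include <float.h>

    void kohonen_init(double centres[][3], int n_centres,
                      double patterns[][3], int n_patterns,
                      int cycles, double rate)
    {
        int cyc, p, c, d, best;
        for (cyc = 0; cyc < cycles; cyc++) {
            for (p = 0; p < n_patterns; p++) {
                double best_d = DBL_MAX;
                best = 0;
                for (c = 0; c < n_centres; c++) {   /* find the winning centre */
                    double dist = 0.0;
                    for (d = 0; d < 3; d++) {
                        double diff = patterns[p][d] - centres[c][d];
                        dist += diff * diff;
                    }
                    if (dist < best_d) { best_d = dist; best = c; }
                }
                for (d = 0; d < 3; d++)     /* move winner towards the pattern */
                    centres[best][d] += rate * (patterns[p][d] - centres[best][d]);
            }
            rate *= 0.95;                   /* decay the learning rate */
        }
    }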
Initialisation continues with the weights between hidden and output layer as
well as the bias of the hidden units. The initialised network topology is saved in a
network file for further examination in the validation stage.
Following the initialisation of the network’s input-hidden layer weights (centre
positioning), two learning stages take place. As was mentioned before, RBF learning
has to concentrate on one free parameter at a time. The learning process becomes
unstable if more than one parameter is allowed to change. Therefore, a separate
learning process is allocated for the hidden-output layer weights and the bias of the
hidden units. The learning parameters are set to experimental values that were found
after a large number of tests. The learning process for these two parameters is
identical to the one used in MNNS. Training is stopped again when the change in the
network’s output error becomes very small. The trained network topology is saved in
a network file.
The final operation of the batch program is to pass the test pattern file through
the network and write the results in a text file. This file can be used for generating a
scatter plot of actual vs. estimated grade for the specific network. This will be shown
later when the validation tools provided by GEMNET II are described.
During this development process all the messages coming from the simulator
are stored in a log file that can be opened with a text editor for examination.
Examining the log file as well as the initialised and trained network files can yield
useful conclusions about the effectiveness of the training process. The author used
these files as a guide for setting the learning parameters and the required number of
cycles. The network files provide a very useful piece of information: the location of
the RBF centres in the normalised input space (Fig. 7.4). This will prove to be very
important for validating the network’s learning and estimation performance.
Figure 7.4: RBF centres from second module located in 3D space. Drillholes and
modelled orebody are also shown.
7.2.3 Module One – Modelling Grade’s Spatial Variability

The changes in the learning process for the RBF networks of the first module are
exactly the same with the second module. The initialisation procedure makes use of
Kohonen learning for locating the RBF centres in the input space. From the discussion
on the data processing and control module it is clear that the six RBF networks of
module one are trained separately and in sequence. The learning procedure is
identical. However, there is one problem that became clear during testing. Because of
the geometry commonly found in most sampling schemes, the drillholes are arranged
in sections typically perpendicular to the orebody. This can lead to some sectors of the
search scheme being overcrowded while others have only a low number of samples. As
there is no way of knowing in advance which sectors will be overcrowded and which
will not, the training of the networks can be unbalanced, i.e. some networks have many
samples to learn but the same number of training cycles as others that have
only a few training samples.
The solution to this problem is a number of filters introduced between the first
module networks and the data processing and control module. These filters allow samples
inside a distance range to pass as training patterns to the networks, while they hold
back samples that are further away than a certain range. It should be noted that the criterion is the
distance range, i.e. a percentage of the maximum distance between samples, and not
an absolute distance. By adjusting the search range, the number of samples can be limited
and the networks can be trained on similar numbers of training samples.
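The filter itself reduces to a single comparison; a hedged sketch, with the range parameter expressed as a fraction of the maximum inter-sample distance as described above:

    /* Distance-range filter: a neighbour passes into the training set only
       if its distance is within the given fraction of the maximum distance
       between samples. The name and the example value are illustrative. */
    int passes_filter(double distance, double max_distance, double range)
    {
        return distance <= range * max_distance;   /* e.g. range = 0.25 */
    }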
A very interesting issue with the first module networks is the visualisation of
the RBF centres. In the second module, the input space is the ‘real’ 3D space defined
by the drillhole samples’ co-ordinates and therefore visualisation of the RBF centres
is straightforward. In the first module networks though, the input space is not the 3D
space of real world co-ordinates, but the hyperspace defined by the distance, grade,
and length of neighbour samples. In order to visualise the RBF centres, this space is
constructed in Envisage using the training input patterns. A new mapping window is
constructed by substituting the three co-ordinates (easting, northing, and elevation)
with the grade, distance, and length of samples. The training samples and RBF centres
can then be visualised in this hyperspace (Fig.7.5).
Figure 7.5: RBF centres of west sector RBF network and respective training samples in the input pattern hyperspace (X-Grade, Y-Distance, Z-Length).
It is somewhat difficult to understand the way samples are placed in this
hyperspace as well as how the RBF centres are located. However, after careful
examination of images like the one in Fig. 7.5, the distribution of samples becomes
clearer. A very interesting finding is that samples being chosen as neighbours in a
specific direction appear to form lines of constant X-Grade and varying Y-Distance.
This of course should have been expected, but pictures like this help to understand
even further the characteristics of the input space.
7.2.4 Final Module – Providing a Single Grade Estimate

The final module consists of a single RBF network responsible for weighting
the individual estimates of the first and second module networks. This network does
not model the grade in an input vector space. It simply tries to model the relationship
between the responses of the first and second module networks and the actual grade
values. This network is completely ‘unaware’ of sample co-ordinates or neighbour
sample grades, distances, and lengths. The only information provided to this network
is the required output (actual grade at estimation point) and the estimates of the
individual networks.
The purpose of this network is to replace the simple averaging that was the
way of providing a single estimate from the various networks in the earlier
architectures. During testing it was found that the final estimate can be brought even
closer to the actual value by weighting the individual network estimates. One could
argue about the use of an artificial neural network for this task, and in fact the author
had many recommendations by other researchers in the field of AI that did not suggest
the use of an ANN or specifically an RBF network. However, the RBF network of the
final module proved to be at least good enough for this weighting task and with this
project being dominated by the use of ANNs, the author did not look any further. It
should be noted though that different ANN architectures were tested.
The RBF network of the final module is shown in Fig. 7.6. It is a simple 3D
representation of this network and the location of the RBF hidden units has nothing to
do with the positioning of the RBF centres before or after training.
Figure 7.6: Final module’s RBF network.
A training process very similar to that of the other neural modules determined
the number of RBF centres and their location in the input space. Unfortunately, due to
the high dimensionality of this network’s input space (6D) it is not possible to use
Envisage or any other graphical environment for the direct visualisation of the RBF
centres and training samples in the correct input space. It is only possible to examine
the learned model using any three of the six inputs at a time.
The training process for this network involves the results of the previous
networks on the test samples and not on their training samples. This was necessary to
allow complete freedom in the number of samples used for training the first module’s
networks. However, the author believes that this could be a source of inefficiency for
the complete architecture as this is the final RBF network that controls the final
estimate produced. If the test samples are not representative of the dataset then the
RBF network of the final module could have difficulties in providing reliable results.
This is an aspect of GEMNET II’s operation that needs monitoring. The author suggests
that the distributions of grade estimates from the various first and second module
networks be compared with the final module network estimates.
The validation of the system’s operation during neural network development
as well as during grade estimation has been a consideration of the author since the
beginning of GEMNET II development. This fact led to the development of validation
tools specific to GEMNET II and implemented using VULCAN’s graphical
capabilities. These are the subject of the next section of this chapter.
7.3 Validation
7.3.1 Training and Validation Errors

The first and most common way of measuring a neural network’s performance
is by calculating its estimation error on the training or validation pattern set. The
training error is less important as it reflects the performance of the network on
samples on which it was trained to perform well. In other words, the training error is not a
good measure of a network’s performance. However, the training error can indicate
problems in the learning process that can be due to an inadequate number of samples or
training cycles, or both. If a network cannot reach an acceptable error level regardless
of the number of training cycles, then the learning algorithm needs to be modified or
the number of samples increased. One has to monitor the progress of the training
error curve cycle after cycle in order to conclude as to the origin of high training
errors.
A more representative and reliable measure of a network’s performance is the
validation error. A good learning algorithm should normally be based on the
validation error to guide the weight changes but even if this is not the case, a
validation pattern set can help build confidence in the learned mappings. In the case
of GEMNET II and samples from drillholes, generating a validation set and using it
for measuring its performance is not an easy task. In geostatistics and other more
conventional methods the developed estimation technique is validated using the
process of cross-validation. Cross-validation is in essence the regeneration of the
samples by hiding one at a time and trying to estimate it using the remaining samples.
In the case of the neural networks in GEMNET II, this is what the training process
does. In other words, cross-validation is not applicable in the case of GEMNET II
because it can give very misleading results.
On the other hand, by hiding samples from the training process to use them as
a validation set automatically means that GEMNET II has fewer samples to train the
networks and therefore less chance of producing good results on the validation set.
This is especially applicable when the system is dealing with a very complex orebody
that requires as many samples as possible to describe its grade behaviour in space.
With this consideration in mind, the author suggests that a validation set be
generated at first to measure the networks’ generalisation performance. If the
validation errors are acceptable then the networks should be retrained using the same
training process but including the samples of the validation set to ensure that the best
possible mappings are generated.
7.3.2 Reliability Indicator

The learning process in GEMNET II is relatively more complex than in other
systems as it involves a very modular neural network structure. It is important to see
the final estimate produced as the result of the weighting of individual estimates.
Therefore by measuring the variance of these estimates one can conclude as to the
reliability of the final estimate. In other words, the higher the agreement between the
individual estimates the higher the reliability of the final estimate and vice versa. The
variance of the individual estimates will be mirrored by the weight values of the final
network. A combination of very high and very low weight values in the final network
expresses the difficulty of the final network in getting close to the actual grades.
The variance of the first and second module networks’ estimates has been used
as the basis of a reliability measure or reliability indicator. This is calculated during
the estimation process. In VULCAN, the user has to add an extra variable to the block
model to be used by GEMNET II for storing the reliability indicator for each block
estimated. After the estimation process, the block model can be visualised in 3D or in
sections with a colour scheme based on the reliability indicator (Fig. 7.7). This way it
is possible to identify areas where GEMNET II has difficulty providing an estimate.
The reliability indicator though cannot lead by itself to the origin of the problem or
even quantify it. It is strictly an indicator, i.e. a guide that can help identify problems.
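Since the thesis does not reproduce the exact formula, the sketch below assumes the indicator is simply the variance of the individual network estimates for a block, which matches the description above:

    /* Hedged sketch of a variance-based reliability indicator: the higher
       the disagreement between the individual estimates for a block, the
       lower the reliability of the final estimate. */
    double reliability_indicator(const double est[], int n)
    {
        double mean = 0.0, var = 0.0;
        int i;
        for (i = 0; i < n; i++) mean += est[i];
        mean /= n;
        for (i = 0; i < n; i++) var += (est[i] - mean) * (est[i] - mean);
        return var / n;   /* population variance of the individual estimates */
    }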
Figure 7.7: Block model coloured by the reliability indicator in GEMNET II.
7.3.3 Module Index

Another useful source of information is the flags stored in the flags file during the
processing of the block model centroids file by the data processing and control
module. This file consists of records with six flags each, one for every sector. The flag
values are one for sectors with neighbour samples and zero for empty sectors. These
values are used during estimation for choosing between the first (sector flag = 1) or
the second module networks (sector flag = 0).
These flags are stored in the block model. Specific variables have to be set in
the model to contain the flag values. The block model can then be visualised in
Envisage using a colour scheme that depends on the flag values or module index (Fig.
7.8). By combining the module index and the reliability indicator, it is easy to identify
the networks that can present problems during estimation.
Figure 7.8: Block model coloured by module index in GEMNET II. Cyan blocks represent
first module estimates while red blocks represent second module estimates.
7.3.4 RBF Centres Visualisation

The location of the RBF centres in the input vector space is absolutely crucial to the
performance of an RBF network. The RBF centres visualisation tool has been
developed specifically for GEMNET II in Envisage and allows the displaying of both
the centres and the training samples of any RBF network from the modular
architecture (Fig. 7.9). This option displays the RBF centres using a special symbol on
the screen and the training samples as crosses. The correct input space is used,
i.e. the 3D real world co-ordinates space for the second module and the neighbour
sample grade, distance, and length input space for the first module.
Figure 7.9: First module RBF centres visualisation in GEMNET II. Drillholes and orebody
model are also shown.
Clearly this is an option for the users who will know the basics of the system’s
operation; otherwise it will not be very useful. By looking at the positions of the RBF
centres, one can decide whether the network initialisation procedure is efficient and
whether the learned mapping is reliable. A well spread distribution of centres in the
input space with a high density of centres in areas where grade seems to present a
complex behaviour suggests that the network has been properly developed. High
density of centres in areas with very few or even no samples means that the
initialisation and training process needs to be modified. Usually an increase in the
number of initialisation or training cycles is required, or an increase in the learning
parameters.
7.4 Integration
7.4.1 Neural Network Simulator

Development of neural networks in GEMNET II is based on the Stuttgart Neural
Network Simulator (SNNS) developed at the Institute for Parallel and Distributed
High Performance Systems (IPVR) at the University of Stuttgart, Germany. SNNS
was originally developed for the UNIX operating system but was recently ported to
the Microsoft Windows 95/NT environment. It is still based on X Windows and
requires an X Server in Windows 95/NT for the graphical user interface. Figure 7.10
shows a schematic diagram of its main components.
Figure 7.10: Diagram of the main components of SNNS.

The four main components of SNNS are the simulator kernel, graphical user
interface, batch execution language (BATCHMAN), and network C code extraction
tool (SNNS2C). The graphical user interface is not used in GEMNET II as this is
provided by Envisage in VULCAN. The other three parts - mainly BATCHMAN and
SNNS2C - are extensively used. The simulator kernel includes a number of functions
for:
• Network manipulation
• Network structure definition
• Cell (processing element) definition and manipulation
• Learning
• Pattern manipulation
• Pattern propagation
• Network and pattern file handling
• Error calculations
• Memory management
The batch execution language in SNNS, BATCHMAN, has been modelled after
languages such as AWK, Pascal, Modula2 and C. BATCHMAN provides a
command line or scripting interface to the simulator kernel. It is possible to send
commands directly in interactive mode using the interpreter or execute complete
batch scripts by calling BATCHMAN with the batch script file name as an
argument. The structure of the batch scripts or programs is not predetermined.
There are a number of system variables available for monitoring the development
of the networks. These can be used during training to create more advanced
training algorithms. The available system variables are:
Table 7.1: System variables available in BATCHMAN.

SSE         Sum of squared differences of each output neuron
MSE         SSE divided by the number of training patterns
SSEPU       SSE divided by the number of output neurons
CYCLES      Number of cycles passed
PAT         Number of patterns in the current pattern set
EXIT_CODE   Exit status of an external call
SIGNAL      Integer value of a caught signal during execution
There is a total of eight batch programs in GEMNET II for the development of the
eight RBF networks. These programs are very similar to each other and generally
follow the same steps:
1. Load untrained network file and training and testing pattern files
2. Initialise the network using Kohonen learning
3. Write the initialised network to a file
4. Train the network’s hidden-output layer weights
5. Train the network’s hidden units’ bias
6. Write the trained network to a file
7. Test the network using the test pattern file and write the results to a file
BATCHMAN is called from the data processing and control module using the scripts and a name for the training log file. BATCHMAN runs the scripts and writes all the messages during the steps described above to the log file. After all eight scripts have been executed, control is passed back to the data processing and control module. The user can open the log files with a text editor to get more information about any possible problems as well as the training and validation errors. From the execution of each script the following files are created:
<network name>ini.net: initialised topology (e.g. eastini.net)
<network name>tr.net: trained topology (e.g. northtr.net)
<network name>.log: training log file (e.g. east.log)
<network name>.res: results of testing (e.g. east.res)
The other SNNS tool used in GEMNET II is the network compiler SNNS2C.
This tool compiles a network file into compilable C source code. There are
limitations as to the network types and other SNNS features supported by SNNS2C,
but fortunately none of them causes any problems to the GEMNET II modular
network architecture. SNNS2C supports all the necessary features for GEMNET II.
The input to SNNS2C is the trained network file as created from
BATCHMAN after executing the batch scripts of GEMNET II. SNNS2C generates
ANSI-C source code and header files. The generated code is compiled separately. The
header files are linked to the data processing and control module. This way the
produced network C functions are linked to GEMNET II and can be called during
grade estimation. During network compilation SNNS2C goes through the following
steps:
1. Network loading: the network file is loaded with the function from the
simulator kernel.
2. Dividing network into layers: individual units are grouped into layers with
the same type and activation function.
3. Layers sorting: the layers are sorted in topological order.
4. Network writing: the generated network structure, activation functions and
pattern propagation code are written to the C source file.
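For illustration, using one of the generated networks then amounts to including its header and calling the function; the fragment below assumes SNNS2C's int name(float *in, float *out, int init) prototype and a hypothetical network called east.

    /* Hedged example of calling an SNNS2C-generated network function.
       The header name, network name, and array sizes are hypothetical. */
    #include <stdio.h>
    #include "east.h"   /* header generated by SNNS2C from east.c */

    int main(void)
    {
        float in[3] = { 0.42f, 0.17f, 0.80f };  /* normalised inputs */
        float out[1];                           /* normalised output */
        east(in, out, 0);                       /* init flag per SNNS2C */
        printf("estimate: %f\n", out[0]);
        return 0;
    }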
Altogether, SNNS proved to be very useful for the development of neural
networks in GEMNET II. The flexibility provided by the batch execution language
and the very large library of network types, activation functions, and learning
algorithms provided by the simulator kernel allowed quick and easy testing of
different learning strategies and network architectures. It would be very time
consuming, if not impossible, to do the same development and testing without the
simulator, using hard-coded neural networks and learning algorithms.
7.4.2 Interface with VULCAN – 3D Visualization

Grade estimation is part of a much larger process that involves other tasks such as
geological modelling and reserves estimation. In order to exploit the full potential of
GEMNET II, it has to be integrated in this larger process of mineral deposit evaluation
[39]. This was achieved using VULCAN, one of the leading earth resources
modelling packages available for the mining industry.
VULCAN is a modular package, i.e. it consists of a core module (VULCAN
Modeller) and a number of specialised modules like the MineModellers,
GeoModellers, SurveyModeller, and Chronos (scheduler) (Fig. 7.11). VULCAN can
be customised to include the functionality required by specific projects and for that
reason this system has all the necessary features that allow third party software to be
interfaced to it.
VULCAN’s user interface, Envisage, is an advanced 3D modelling
environment that provides advanced 3D CAD and visualisation as well as
triangulation modelling, grid mesh modelling, and contouring [57].
VULCAN’s GeoModellers provide functions for drilling, borehole
visualisation, channel sampling, geological modelling, geostatistics, block and grid
modelling, stratigraphic modelling, and other tasks. For geostatistics, the
GeostatModeller can be interfaced to the GSLIB, Geostokos, and ISATIS
geostatistical packages. Block models can be visualised in 3D and manipulated in
many different ways. GEMNET II relies on the importing and exporting functions
available for block models in VULCAN as well as the drillhole compositing
functions.
Envisage provides customised user menus, i.e. users can create their own
menus that look and act exactly like the rest of the GUI and can provide the functions
that the user wants. These functions can be directly linked to a Perl script
(VULCAN’s supported scripting language), which means that users can add
functionality to the system. GEMNET II is interfaced to VULCAN by a number of
scripts written in Perl and utilising the extensions for VULCAN, called Lava.
Figure 7.11: Modules and extensions of VULCAN.
The structure of the user interface is shown in Fig. 7.12. The menu for
GEMNET II includes options for setting the estimation parameters, network
topologies and learning, and validation.
Figure 7.12: Menu structure of GEMNET II in Envisage.
There is a main menu and two sub-menus for the setup and validation. All options
lead to panels that accept user input from the keyboard. These panels (Fig. 7.13)
access the options available with GEMNET II and allow the user to do the following
things:
1. Select samples file and block model
2. Modify the learning method and network topologies
3. Run GEMNET II with the saved specifications
4. Display the block model using the reliability indicator or the module index
5. Display the input samples and RBF centres in the correct input space
GEMNET II also requires functions already built into Envisage. These include:
1. Drillhole compositing
2. Block model ASCII import/export functions
3. Block model display functions
Figure 7.13: GEMNET II panels in Envisage.
After the user selects the input and output files for the estimation process,
GEMNET II can start the network development. The data processing and control
module is called using the Run option from the main menu. A console window is
opened and GEMNET II begins with the processing of the samples and the generation
of the training pattern files (Fig. 7.14).
Figure 7.14: Console window with messages from GEMNET II operation.
The data processing and control module continues its operation in the
background while the user can carry on using Envisage. Once the network
development is complete and the networks are compiled, grade estimation takes place.
The results are written to a file selected by the user. This file can then be imported to
the block model. The user can then validate the estimation process using the tools
described and compare the results with other studies using geostatistics within the
Envisage environment.
VULCAN’s online help is based on a web browser and HTML files for each
and every option. A number of pages were added to provide help for GEMNET II. The
help is context based, i.e. it depends on the function that the user is trying to access
(Fig. 7.15).
Figure 7.15: GEMNET II online help.
The system operates in a very similar manner to other functions in Envisage,
which means that users can become familiar with GEMNET II in a very short period of
time.
7.5 Conclusions

In this chapter an in-depth discussion was given on GEMNET II, the integrated system
for grade estimation based on artificial neural networks. The benefits of the approach
were explained and in particular the advantages of the integration with the neural
network simulator, SNNS, and the resources modelling package, VULCAN.
Even though GEMNET II is based on the basic MNNS architecture described
in the previous chapter, there are many improvements that make GEMNET II a
much more usable system.
The system has many advanced features that can establish it as a commercial
product. It provides validation tools that can help build confidence in the estimates
while it removes most of the problems found in other grade estimation techniques.
GEMNET II makes very few assumptions about the grade distribution. Its operation
does not depend on the user’s knowledge of geology, geostatistics, or even neural
networks. It should be noted though that knowledge of neural networks can sometimes
improve the results, but not significantly. Generally, the system adjusts to the data
presented to it to achieve the best possible estimation.
Even though it is based on artificial neural networks, GEMNET II is not a
‘black box’ approach. The technique is fairly understandable as it is based on
established principles of grade spatial behaviour. The validation tools provided with
GEMNET II and the exhaustive monitoring of the network development also help the
user to understand how it works and why. In the next chapter the validity of the
approach will be proved through a number of case studies using real 3D data from
different deposits around the world.
8. GEMNET II Application – Case Studies
8.1 Overview

The case studies presented in this chapter are the final tests of the GEMNET II
architecture. Their purpose was to demonstrate the full potential of the approach and
provide a complete comparison with other estimation techniques. They are presented
in order of increasing complexity and difficulty. The number of available samples
increases as well as the structural complexity of the deposits.
The data used in these case studies come from real deposits. In some of them
the 3D co-ordinates of the samples have been changed without affecting their relative
locations for confidentiality purposes. The number of case studies was limited to four
as in the previous chapter. The selected case studies are the most representative of
GEMNET II performance while being quite different from each other. These studies
are also ideal for geostatistics and in fact have been used for demonstrating
grade/reserves estimation using computer software. However, no results have ever
been published using this data other than the papers written by the author during this
project.
The deposits in the four case studies that follow present a complex 3D
structure. They all come with a complex geological model, which is used for
constraining the estimation process. The geological model in some cases becomes
more complicated by the presence of faults and other discontinuities. This factor
makes grade estimation an even more challenging task.
In all of the case studies, a complete geostatistical study has been performed,
the results of which are presented in this chapter together with the study of GEMNET
II application. Unfortunately, the author was not able to get authorization for
publishing results from case studies other than copper/gold deposits. There seems to
be an abundance of real copper/gold data available from fully exploited or
undeveloped deposits. The same does not apply for other metals and minerals.
The four copper/gold deposits used for testing the estimation performance of
GEMNET II have very little in common. Except for the type and possibly the way
they were formed, these deposits present a very different 3D picture and a very
different estimation task. Their size and geometry varies significantly as does the
grade distribution suggested by the available samples.
The available samples for each of the four deposits vary in number
considerably. The drilling geometry is also different as is the assaying procedure.
These differences ensured that GEMNET II would be tested on very different
conditions and data and that the results would reflect its performance over a wide
range of problems. Table 8.1 gives the main characteristics of the four deposits used
in this chapter. The data from them are given in the accompanying CD-ROM.
Table 8.1: Main characteristics of the four deposits used for testing the final GEMNET II
architecture.
Name                  MAC_DEMO   THOR   SME      GEOST_GOLD
Number of Samples     1361       3612   10,656   30,211
Estimated Grades      Au, Cu     Au     Cu       Au
Number of Orebodies   1          4      5        1
As can be seen from the table, the deposits are given code names. These names are
used as a replacement of their original name and location for confidentiality purposes.
The same computer system has been used for all four case studies. It was a
Pentium II 300MHz with 128Mb RAM and 1Gb of virtual memory space running
under Microsoft Windows NT 4.0. The time required to complete each case study has
been affected by the specifications of the system and therefore comparison with other
similar studies should not be made unless these specifications are the same. The
geostatistical studies were also performed using the same computer.
GEMNET II was run from VULCAN/Envisage version 3.3. Geostatistics
were also run from the same environment using GSLIB. Therefore the same
computational overhead from VULCAN was present while the various
approaches were tested.
The measures of performance for the three approaches compared were the
mean absolute error, the data fit diagram (scatter plot), and the estimated vs. actual
grade distribution diagram. These performance measures were based on samples
taken out of each dataset that were not provided as input information for any of the
three techniques. In other words, these were unknown samples for the estimators but
not for the performance measures. This was considered by the author as a more
objective way of comparing the various techniques, as the actual values of those
samples were known as opposed to block estimates of unknown actual grade. For
GEMNET II, the reliability indicator values are also shown in slices through the
estimated block model. It should be noted again that the reliability indicator is only a
guide to the quality of the produced estimates from GEMNET II and not a precise
performance measure.
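Both error figures are straightforward to compute; the sketch below assumes that 'Mean ABS %' is the absolute error relative to the actual grade, averaged over the test samples, which is an interpretation rather than a definition taken from the thesis.

    /* Hedged sketch of the two error measures reported in the case studies. */
    #include <math.h>

    void error_measures(const double actual[], const double est[], int n,
                        double *mae, double *mape)
    {
        double sum_abs = 0.0, sum_pct = 0.0;
        int i;
        for (i = 0; i < n; i++) {
            double e = fabs(actual[i] - est[i]);
            sum_abs += e;
            sum_pct += e / actual[i];   /* assumes non-zero actual grades */
        }
        *mae  = sum_abs / n;            /* mean absolute error            */
        *mape = 100.0 * sum_pct / n;    /* mean absolute percentage error */
    }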
For the case studies where the deposit consists of more than one orebody, the
samples were split into groups, one for each orebody. The same was applied to the
block model. In each run only data from inside an orebody are used and only blocks
inside the same orebody are estimated. As a result, two of the case studies
(THOR and SME) are much more complicated and took a lot more time to complete.
Finally, the basis of comparison for the various approaches was the results on
the test set for GEMNET II described in Chapter 7, and cross-validation results for
inverse distance and kriging on the same test set. Cross-validation was performed
again using GSLIB inside VULCAN.
8.2 Case Study 1 – Copper/Gold Deposit 1

The dataset from the first copper/gold deposit consists of 44 drillholes containing
1361 samples in total. From these only 227 samples are within the geological model
of the orebody as can be seen in Fig. 8.1 below. These samples are used for the
estimation process.
Figure 8.1: Orebody and drillholes from copper/gold deposit 1.
The number of samples is quite small and the 3D model of the orebody fairly simple
making this case study a relatively easy task. The following table gives the statistics
for the data used in this case study:
Table 8.2: Statistics of data from copper/gold deposit 1.

     Number of   Average    Standard    Coefficient    Number of
     Samples     Grade      Deviation   of Variation   Estimated Blocks
Au   227         2.34 g/t   3.1758      1.0339         5,698
Cu   227         4.01 %     4.7184      0.9424         5,698
The co-ordinates of the samples have been transformed from their original values for
confidentiality purposes. The relative positions of the samples have not been changed.
The data processing and control module of GEMNET II generated 1109, 2156, and
1905 patterns in the west-east, north-south, and upper-lower sectors respectively.
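The sketch below illustrates, in simplified form, how such sector patterns can be generated: for each target sample, neighbouring samples lying mainly along a sector axis contribute patterns of the form (grade, distance, length) -> target grade, matching the input and output units of the networks listed in Appendix A. The exact sector geometry and normalisation used by GEMNET II are described in Chapter 7 and are simplified here.

#include <stdio.h>
#include <math.h>

/* Simplified sketch of sector-based training pattern generation.
   For each target sample, every other sample whose direction lies
   mainly along the sector axis (here: west-east) contributes one
   pattern (grade, distance, length) -> target grade, matching the
   input/output units of the networks in Appendix A. The real
   module normalises the values and applies the sector geometry
   of Chapter 7; both are simplified here. */
typedef struct { double x, y, z, grade, length; } Sample;

int main(void)
{
    Sample s[] = {   /* hypothetical composite samples */
        { 100, 200, 50, 1.2, 1.0 },
        { 140, 205, 52, 0.9, 1.0 },
        { 102, 260, 48, 1.6, 1.5 },
    };
    int n = sizeof s / sizeof s[0];
    for (int t = 0; t < n; t++) {
        for (int i = 0; i < n; i++) {
            if (i == t) continue;
            double dx = s[i].x - s[t].x, dy = s[i].y - s[t].y, dz = s[i].z - s[t].z;
            double dist = sqrt(dx*dx + dy*dy + dz*dz);
            /* keep neighbours lying mainly west-east of the target */
            if (fabs(dx) > fabs(dy) && fabs(dx) > fabs(dz))
                printf("pattern: %.3f %.3f %.3f -> %.3f\n",
                       s[i].grade, dist, s[i].length, s[t].grade);
        }
    }
    return 0;
}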
[Chart: Copper Grade Values Data Fit – estimated vs. actual grade scatter; series: ID2, Kriging, GEMNet II]
Figure 8.2: Scatter diagram of actual vs. estimated copper grades from copper/gold deposit 1.
First the three methods were tested using the copper grade data. The mean
absolute errors produced were 18.9% for GEMNET II, 20.06% for inverse distance
squared, and 19.68% for spherical kriging. Clearly, there was not much difference in
this case between the different approaches. The data fit diagram of Fig. 8.2 shows
exactly how close they were.
[Chart: Copper Grade Distribution – frequency per grade bin; series: Actual, ID2, Kriging, GEMNet II]
Figure 8.3: Copper grade distributions from copper/gold deposit 1.
Unlike the absolute errors, which suggest that GEMNET II is doing slightly better than
kriging and inverse distance, the estimated distributions shown in Fig. 8.3 show that
inverse distance follows the actual distribution of copper grades more closely, with
GEMNET II and kriging presenting very similar distributions. GEMNET II tends to
underestimate high-grade samples but the overall estimation is not biased. On the
other hand, inverse distance seems to overestimate low-grade samples.
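The distribution diagrams are ordinary binned frequency counts; a minimal sketch of the binning is shown below, with arbitrary example bin edges.

#include <stdio.h>

/* Sketch of the binned frequency counts behind the distribution
   diagrams: actual and estimated grades are counted into the same
   bins, with a final "More" bin for values above the last edge.
   The bin edges here are arbitrary examples. */
void histogram(const double *v, int n, const double *edges, int nb, int *count)
{
    for (int i = 0; i <= nb; i++) count[i] = 0;
    for (int i = 0; i < n; i++) {
        int b = 0;
        while (b < nb && v[i] > edges[b]) b++;
        count[b]++;               /* count[nb] is the "More" bin */
    }
}

int main(void)
{
    double edges[]  = { 2, 4, 6, 8, 10, 12, 14 };
    double actual[] = { 1.5, 3.2, 5.1, 9.8, 15.0 };
    int count[8];
    histogram(actual, 5, edges, 7, count);
    for (int b = 0; b < 8; b++) printf("bin %d: %d\n", b, count[b]);
    return 0;
}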
The time requirements for the application of the three methods were quite
different, even though the geostatistics were fairly straightforward in this case.
GEMNET II required 50 minutes to process the samples and block model centroids,
develop the networks, and perform grade estimation. The geostatistical study
required about 3 hours to complete. The time spent on grade estimation using
inverse distance and kriging, once the geostatistical study was complete, was about
15 minutes. Even though there is a difference, this study is not ideal for
demonstrating the speed benefits of GEMNET II. The difference in time requirements
between geostatistics and GEMNET II will be demonstrated in the following case
studies, which present a much more complicated structural picture.
In the second part of the study, the techniques were tested using gold grades
from the same samples. The time requirements were identical to the first part. The
errors produced were quite similar as well: 18.78% for GEMNET II, 22.47% for
inverse distance squared, and 20.47% for spherical kriging. Figure 8.4 shows the data
fit diagram of the estimates and Figure 8.5 the estimated and actual gold grade
distributions.
[Chart: Gold Grade Values Data Fit – estimated vs. actual grade scatter; series: ID2, Kriging, GEMNet II]
Figure 8.4: Scatter diagram of actual vs. estimated gold grades from copper/gold deposit 1.
[Chart: Gold Grade Distribution – frequency per grade bin; series: Actual, ID2, Kriging, GEMNet II]
Figure 8.5: Gold grade distributions from copper/gold deposit 1.
Quite clearly, GEMNET II tends to underestimate high-grade samples once again,
even though this time it seems to be doing a bit better than in the case of copper
grades. Generally, the behaviour of the three estimators is very similar for both
estimated grades, copper and gold. The following table shows the estimated average
grades for gold and copper.
Table 8.3: Actual and estimated average copper and gold grades from copper/gold deposit 1.

           Actual  ID2   Kriging  GEMNet II
Au (g/t)   2.34    2.54  2.26     1.96
Cu (%)     4.01    3.69  3.72     3.41
The estimation performance of GEMNET II can be monitored through the
validation tools developed within VULCAN, mainly the reliability indicator
and the module index. The RBF centers visualization tool also provides some insight
into the process of neural network development for grade estimation in GEMNET II.
The following figures illustrate sections through the block model of the copper/gold
deposit in this case study, coloured according to the reliability indicator (Fig. 8.6)
and the module index (Fig. 8.7). Figure 8.8 also illustrates the positions of RBF
centers from various networks in their respective input pattern space.
Figure 8.6: Plan section (top) and cross section (bottom) of block model coloured by reliability
indicator values for the gold grade estimation of copper/gold deposit 1.
Figure 8.7: Plan section (top) and cross section (bottom) of block model coloured by module
index for gold and copper grade estimation of copper/gold deposit 1.
From sections like those in Fig. 8.6 one can identify areas where the
estimation process with GEMNET II is problematic. These areas are usually close to
the edges of the modelled orebody or around faults and other discontinuities. In this
case, the low reliability area is indicated at the middle part of the orebody. This was
expected before the estimation process due to a dyke that intersects the orebody
at exactly that location. The sections of Fig. 8.7 show which module is
responsible for providing each estimate and, in conjunction with the reliability
indicator sections, can help optimise the estimation process.
Figure 8.8: RBF centers locations and training patterns from module 1 networks, north (top)
and east (bottom).
The visualization of RBF centers from various networks helps in understanding how
the system performs grade estimation and, in particular, how it clusters the training
patterns. A good spread of the centers in the input space, as in this case study, means
that the neural network development is responding properly to the data at hand.
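One simple way to quantify this spread is sketched below: the mean distance from each training pattern to its nearest centre. This metric is an illustration only and is not part of GEMNET II.

#include <stdio.h>
#include <math.h>

/* One way to quantify the "spread" judged visually in the RBF
   centre plots: the mean distance from each training pattern to
   its nearest centre. A small value means the centres cover the
   pattern cloud well. This metric is an illustration only, not
   part of GEMNET II. */
double mean_nearest_centre(const double p[][3], int np,
                           const double c[][3], int nc)
{
    double total = 0.0;
    for (int i = 0; i < np; i++) {
        double best = 1e30;
        for (int j = 0; j < nc; j++) {
            double dx = p[i][0]-c[j][0], dy = p[i][1]-c[j][1], dz = p[i][2]-c[j][2];
            double d = sqrt(dx*dx + dy*dy + dz*dz);
            if (d < best) best = d;
        }
        total += best;
    }
    return total / np;
}

int main(void)
{
    double patterns[][3] = {{0,0,0},{1,0,0},{0,1,0},{1,1,1}};
    double centres[][3]  = {{0.2,0.2,0.1},{0.9,0.8,0.7}};
    printf("mean nearest-centre distance: %.3f\n",
           mean_nearest_centre(patterns, 4, centres, 2));
    return 0;
}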
Figure 8.9: Plan section (top) and cross section (bottom) of block model coloured by gold
grade estimates for the copper/gold deposit 1.
The results from grade estimation are shown in Fig. 8.9 as sections through
the estimated block model. It should be noted that the real grade values for the blocks
are unknown and it is therefore not possible to compare the estimated values with the
actual ones. It is also of little use to compare the block estimates from the three approaches.
8.3 Case Study 2 – Copper/Gold Deposit 2

The dataset of this case study is a superset of that used in the third case study
described in Chapter 6. It is a public domain set from a large undeveloped copper/gold
deposit. It consists of four orebodies, as shown in Fig. 8.10. These orebodies occur in
the form of chains of lenses (fractions of the deposit) developed along shear fractures
in metasomatised host rocks, which include gneissic granites, mica schists and
metasomatites. The set contains 77 drillholes providing a total of 3600 observations
on lithology, bleaching, structure and assays. Figure 8.10 shows the drillholes
together with the lenses in the area. The networks in GEMNET II are trained and
tested on each lens individually, i.e. only samples inside the volume of a single lens
are used to train and test the networks each time.
Figure 8.10: Orebodies and drillholes from copper/gold deposit 2.
The data processing and control module searched the dataset for each orebody and
formed training patterns, the number of which varies from one orebody to the next.
The results of the training pattern generation as well as other information about the
data used are shown in Table 8.4 below:
Table 8.4: Samples and block model file information and training pattern generation results for copper/gold deposit 2.

Orebody           TQ1     TQ1A   TQ2    TQ3
Samples Included  689     382    133    484
Blocks Included   8,003   8,188  2,040  16,596
Patterns – WE     38,023  3,086  649    7,842
Patterns – NS     16,117  2,829  283    99
Patterns – UL     9,514   7,182  897    12,342
It becomes clear from the table above that the higher the number of available
drillhole samples, the higher the number of training patterns produced. This, however,
depends on the sampling geometry and can vary between sectors. In the case of
orebody TQ1, for example, and specifically for the West-East sector, the number of
generated training patterns is fairly high (38,023). This inevitably leads to longer
training times for the specific networks. In fact, in some cases the time
requirements are so high that it is practically impossible to train the networks with all
of the available patterns. This is the reason why the data are filtered by a distance
criterion, i.e. a percentage of the maximum distance between samples. This criterion, as
introduced in the previous chapter, has nothing in common with the structural analysis
and the range values in variography. The maximum distance ranges in GEMNET II
are set to limit the number of training patterns per network, depending only on the
hardware limitations.
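A minimal sketch of such a distance filter is given below: pairs of samples further apart than a chosen percentage of the maximum inter-sample distance are simply not turned into patterns (10% is the value used in the next case study). The sample coordinates are hypothetical.

#include <stdio.h>
#include <math.h>

/* Sketch of the distance filter that caps the number of training
   patterns: pairs of samples further apart than a percentage of
   the maximum inter-sample distance are not turned into patterns.
   The percentage reflects hardware limits only. */
#define N 4

static double dist(const double a[3], const double b[3])
{
    double dx = a[0]-b[0], dy = a[1]-b[1], dz = a[2]-b[2];
    return sqrt(dx*dx + dy*dy + dz*dz);
}

int main(void)
{
    double s[N][3] = {{0,0,0},{50,0,0},{0,400,0},{30,20,10}};
    double maxd = 0.0, pct = 0.10;
    for (int i = 0; i < N; i++)
        for (int j = i + 1; j < N; j++)
            if (dist(s[i], s[j]) > maxd) maxd = dist(s[i], s[j]);
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            if (i != j && dist(s[i], s[j]) <= pct * maxd)
                printf("keep pair (%d,%d), d = %.1f\n", i, j, dist(s[i], s[j]));
    return 0;
}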
GEMNET II was applied to each orebody individually. The required
development and application time varied between orebodies as did the produced mean
absolute errors on the test set. The following table shows statistical information on the
four orebodies as well as the estimation performance results from the three estimators.
Table 8.5: Statistics from copper/gold deposit 2 and estimation performance results.

Orebody                  TQ1         TQ1A        TQ2         TQ3
Coefficient of Variance  1.0612      1.0612      1.0615      1.0611
Actual Avg. Grade        0.9109 g/t  0.9272 g/t  0.7339 g/t  1.1354 g/t
ID2 Avg. Grade           0.8571 g/t  0.8610 g/t  0.6843 g/t  1.0719 g/t
Kriging Avg. Grade       0.8577 g/t  0.8683 g/t  0.6794 g/t  1.0587 g/t
GEMNET II Avg. Grade     0.8374 g/t  0.8273 g/t  0.6271 g/t  1.0245 g/t
ID2 ABS %                22.40 %     20.68 %     31.69 %     19.85 %
Kriging ABS %            18.61 %     16.92 %     25.30 %     17.83 %
GEMNET II ABS %          15.64 %     16.73 %     25.51 %     14.92 %
Grade estimation by GEMNET II lasted over an hour for the first orebody, and similar
times for the other three. The time spent on the geostatistical study is harder to
quantify, as the author spent days completing the variography and performing
kriging and inverse distance. The geostatistical study was carried out once for the
entire deposit. The results of estimation from the three methods are shown in the
following figures (Fig. 8.11 to 8.18).
[Chart: Gold Grades Data Fit (TQ1) – estimated vs. actual grade scatter; series: ID2, Kriging, GEMNet II]
Figure 8.11: Scatter diagram of actual vs. estimated gold grades from zone TQ1 of
copper/gold deposit 2.
[Chart: Gold Grade Distribution (TQ1) – frequency per grade bin; series: Actual, ID2, Kriging, GEMNet II]
Figure 8.12: Gold grade distributions from zone TQ1 of copper/gold deposit 2.
All three methods perform well. Inverse distance produces a very smooth
distribution of grades, while kriging and GEMNET II follow the peaks a bit
better. GEMNET II also tends to underestimate high-grade samples, something that
has been quite consistent throughout the various studies.
[Chart: Gold Grades Data Fit (TQ1A) – estimated vs. actual grade scatter; series: ID2, Kriging, GEMNet II]
Figure 8.13: Scatter diagram of actual vs. estimated gold grades from zone TQ1A of
copper/gold deposit 2.
[Chart: Gold Grade Distribution (TQ1A) – frequency per grade bin; series: Actual, ID2, Kriging, GEMNet II]
Figure 8.14: Gold grade distributions from zone TQ1A of copper/gold deposit 2.
[Chart: Gold Grades Data Fit (TQ2) – estimated vs. actual grade scatter; series: ID2, Kriging, GEMNet II]
Figure 8.15: Scatter diagram of actual vs. estimated gold grades from zone TQ2 of
copper/gold deposit 2.
[Chart: Gold Grade Distributions (TQ2) – frequency per grade bin; series: Actual, ID2, Kriging, GEMNet II]
Figure 8.16: Gold grade distributions from zone TQ2 of copper/gold deposit 2.
In the TQ2 zone, GEMNET II presents severe underestimation of high-grade samples
and overestimation of average-grade samples. Inverse distance fails to follow the
actual distribution, while kriging seems to perform better overall. This zone is
quite different from the other three in that it has very few samples and a low average
grade.
[Chart: Gold Grade Estimates Fit (TQ3) – estimated vs. actual grade scatter; series: ID2, Kriging, GEMNet II]
Figure 8.17: Scatter diagram of actual vs. estimated gold grades from zone TQ3 of
copper/gold deposit 2.
[Chart: Gold Grade Distribution (TQ3) – frequency per grade bin; series: Actual, ID2, Kriging, GEMNet II]
Figure 8.18: Gold grade distributions from zone TQ3 of copper/gold deposit 2.
It is quite notable from the diagrams presented that all four zones have a very
difficult distribution of gold grades. The distribution graphs are all split in two areas,
with a very low population around 1.2 g/t. This could be due to the geological
modelling, as the samples are selected within the modelled orebodies. If these extend
across two or more actual geological zones, then each of the four datasets could
include samples from two or more different populations. As an effect of this, the
expected good performance from all three approaches is not realised and the produced
absolute errors are quite high.
The estimation process with GEMNET II was validated using the same tools as
in the previous case study. The following figures show slices through the block model
coloured according to the reliability indicator, module index, and estimated grades.
Screenshots from the RBF centre location tool are also shown. It should be noted that
the block model shown includes all four zones, which can be identified by the sub-blocking.
Figure 8.19: Plan section (top) and cross section (bottom) of block model coloured by
reliability indicator values for the gold grade estimation of copper/gold deposit 2.
Figure 8.20: Plan section (top) and cross section (bottom) of block model coloured by
module index for gold and copper grade estimation of copper/gold deposit 2.
Figure 8.21: RBF centers locations and training patterns from module 1 network north (top)
and module 2 network (bottom) in copper/gold deposit 2.
Figure 8.22: Plan section (top) and cross section (bottom) of block model coloured by gold
grade estimates for the copper/gold deposit 2.
The results from grade estimation are shown in Fig. 8.22 as sections through
the estimated block model. Once again, the real grade values for the blocks are
unknown and it is therefore not possible to compare the estimated values with the
actual ones. The time requirements for the three approaches were significantly
different in this case study. The complete geostatistical study, including all four
zones, lasted over a week. During this time, the author was driving the software and
examining the results. On the other hand, GEMNET II required a total of about eight
hours to develop the networks and complete the grade estimation. Quite clearly, the
advantage of GEMNET II in time requirements is significant and, more importantly,
the results from GEMNET II did not depend on the author's knowledge of the given
dataset.
This case study has also shown the importance of geological modelling in the
process of grade estimation. If the samples selected as input information for the
estimation process are not part of the same geological domain, none of the techniques
will be able to perform well. GEMNET II is not meant to replace the very important
stage of geological modelling.
8.4 Case Study 3 – Copper/Gold Deposit 3

This case study is very similar to the previous one in that the deposit consists of
several (five) orebodies. These orebodies come in the form of almost parallel veins.
The models of the orebodies have been constructed in VULCAN as part of a
geological study. The five orebodies and the associated drillholes are shown in Fig.
8.23.
Figure 8.23: Plan and side views of copper/gold deposit 3 orebodies. Drillholes and extents
of block model are also shown.
The pattern generation process for the development of neural networks had to be
adjusted for the elongated shape of the orebodies and the drilling scheme.
Specifically, the patterns for module two networks in the east-west direction had to be
limited to those within 10% of the maximum distance between samples. This was
necessary, as the total number of possible patterns was too high for the
hardware-software combination (more than 100,000 patterns). The following table gives
information about the pattern generation process for the five zones.
Table 8.6: Samples and block model file information and training pattern generation results
for copper/gold deposit 3.

Zone         TQ1    TQ1A   TQ3    TQ4    TQ7
Samples      1,912  829    1,144  534    330
Blocks       4,280  2,425  4,291  344    2,254
Patterns WE  6,018  1,013  100    94     744
Patterns NS  9,244  1,573  239    335    348
Patterns UL  4,945  1,865  4,908  1,288  585
All three methods were tested on the copper grade data from the five zones. It was not
possible to test their performance on gold grades due to problems with the specific
drillhole database. In the graphs below, the data fit and estimated distributions are
shown as before. The output of the validation tools for GEMNET II is given at the end
of the case study, for the entire block model. The results from the five zones are given
in the following table. Again, the geostatistical study for the entire deposit took at
least a week to complete, while GEMNET II required about 12 hours to complete the
estimation of copper grades.
Table 8.7: Statistics from copper/gold deposit 3 and estimation performance results.

Orebody                  TQ1      TQ1A     TQ3      TQ4      TQ7
Actual Avg. Cu Grade     1.0187   1.0309   1.1006   0.7798   0.6468
ID2 Avg. Cu Grade        0.9963   1.0327   1.0451   0.7580   0.6083
Kriging Avg. Cu Grade    1.0096   0.9623   1.0540   0.7487   0.5744
GEMNET II Avg. Cu Grade  1.0196   0.9498   1.0072   0.7654   0.6594
ID2 ABS % (Cu)           17.96 %  16.12 %  17.70 %  23.64 %  16.23 %
Kriging ABS % (Cu)       14.73 %  15.00 %  14.77 %  14.74 %  14.61 %
GEMNET II ABS % (Cu)     16.32 %  17.16 %  14.67 %  12.80 %  12.39 %
From the above table it is clear that the three techniques perform better than in
the previous case study, even though there are certain similarities between the two,
especially in the geological modelling and drilling scheme. The improvement in
performance can be associated with a much better geological model, which means
a better separation of the sample groups between the five zones.

The following graphs and slices through the deposit's block model are
grouped by zone, starting with zone TQ1. As before, the data fit and distribution
graphs are given first, followed by the block model slices showing the reliability
indicator, module index, and estimated grade values.
[Chart: Copper Grades Data Fit – TQ1 – estimated vs. actual grade scatter; series: ID2, Kriging, GEMNet II]
Figure 8.24: Scatter diagram of actual vs. estimated copper grades from zone TQ1 of
copper/gold deposit 3.
[Chart: Copper Grade Distribution – TQ1 – frequency per grade bin; series: Actual, ID2, Kriging, GEMNet II]
Figure 8.25: Copper grade distributions from zone TQ1 of copper/gold deposit 3.
[Chart: Copper Grades Data Fit – TQ1A – estimated vs. actual grade scatter; series: ID2, Kriging, GEMNet II]
Figure 8.26: Scatter diagram of actual vs. estimated copper grades from zone TQ1A of copper/gold deposit 3.
[Chart: Copper Grade Distribution (TQ1A) – frequency per grade bin; series: Actual, ID2, Kriging, GEMNet II]
Figure 8.27: Copper grade distributions from zone TQ1A of copper/gold deposit 3.
[Chart: Copper Grades Data Fit (TQ3) – estimated vs. actual grade scatter; series: ID2, Kriging, GEMNet II]
Figure 8.28: Scatter diagram of actual vs. estimated copper grades from zone TQ3 of
copper/gold deposit 3.
[Chart: Copper Grade Distribution (TQ3) – frequency per grade bin; series: Actual, ID2, Kriging, GEMNet II]
Figure 8.29: Copper grade distributions from zone TQ3 of copper/gold deposit 3.
[Chart: Copper Grades Data Fit (TQ4) – estimated vs. actual grade scatter; series: ID2, Kriging, GEMNet II]
Figure 8.30: Scatter diagram of actual vs. estimated copper grades from zone TQ4 of
copper/gold deposit 3.
[Chart: Copper Grade Distribution (TQ4) – frequency per grade bin; series: Actual, ID2, Kriging, GEMNet II]
Figure 8.31: Copper grade distributions from zone TQ4 of copper/gold deposit 3.
[Chart: Copper Grades Data Fit (TQ7) – estimated vs. actual grade scatter; series: ID2, Kriging, GEMNet II]
Figure 8.32: Scatter diagram of actual vs. estimated copper grades from zone TQ7 of
copper/gold deposit 3.
[Chart: Copper Grade Distribution (TQ7) – frequency per grade bin; series: Actual, ID2, Kriging, GEMNet II]
Figure 8.33: Copper grade distributions from zone TQ7 of copper/gold deposit 3.
From the above graphs it is clear that GEMNET II tends to underestimate high-
grade samples in the three high-grade zones (TQ1, TQ1A, TQ3), while it shows some
overestimation of the low-grade samples in the low-grade zone TQ7. Its performance
is consistent through the rest of the distribution. Generally, it appears to be less
affected by high-grade samples than the other two techniques, which can prove very
useful. GEMNET II is meant to be a robust technique that can accept data of unknown
quality and still provide sensible results. Its performance is verified by the absolute
errors, which were always close to, if not better than, those of kriging.
The underestimation of high-grade samples can also be explained by the
geometry of the zones. This geometry controls the number of samples selected
as neighbours for the networks of module two in GEMNET II. As the zones are fairly
narrow and long, some of the sectors are consistently empty and the network of
module one provides the estimate for those. This network, as explained before,
tends to give estimates close to the overall average grade, hence the underestimation
of high-grade samples.
Inverse distance weighting performed exceptionally well in this case study
compared to both kriging and GEMNET II, considering how simple the method really
is. However, it benefited from a complete geostatistical study that improved the
search method for the sample selection. The performance of kriging was once again
very good, as in the previous studies.
The following figures show slices through the block model of the copper/gold
deposit 3 coloured by the reliability indicator, module index, and estimated copper
grade values.
Figure 8.34: Plan section (top) and cross section (bottom) of block model coloured by
reliability indicator values for the copper grade estimates of copper/gold deposit 3.
It should be noted that the block model was modified to reflect the geological
environment of the deposit as modelled by the geologist (not the author!). For this
reason, there are blocks that have been deleted from the model, as shown in the figures.
The block model consists of major blocks and sub-blocks inside them that follow the
zones and other geological entities more closely. As the estimation only takes place
inside the zones, the blocks outside them retain the default values, which is why the
majority of the blocks in the slices have the same colour.
Figure 8.35: Plan section (top) and cross section (bottom) of block model coloured by
module index values for the copper grade estimates of copper/gold deposit 3.
As explained before, the zones are very narrow and the system chooses the module
one network (red blocks) in most cases. In the cases where module two networks are
also used, the reliability indicator shows disagreement between the individual
estimates. This is caused mainly by the module one network, which still contributes to
the final estimate by filling empty sectors.
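The sketch below illustrates one such disagreement measure: the relative spread of the individual module estimates for a block. The actual reliability indicator is defined in Chapter 7; this form is only an illustration of the idea.

#include <stdio.h>
#include <math.h>

/* Sketch of a disagreement measure between the individual module
   estimates for one block: the standard deviation of the estimates
   relative to their mean. The actual reliability indicator is
   defined in Chapter 7; this relative-spread form is only an
   illustration of the idea. */
double disagreement(const double *est, int n)
{
    double mean = 0.0, var = 0.0;
    for (int i = 0; i < n; i++) mean += est[i];
    mean /= n;
    for (int i = 0; i < n; i++) var += (est[i]-mean)*(est[i]-mean);
    return sqrt(var / n) / mean;   /* large value = low reliability */
}

int main(void)
{
    double est[] = { 0.95, 1.02, 0.61 };   /* hypothetical module outputs */
    printf("disagreement = %.3f\n", disagreement(est, 3));
    return 0;
}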
Figure 8.36: Plan section (top) and cross section (bottom) of block model coloured by copper
grade estimates for the copper/gold deposit 3.
8.5 Case Study 4 – Copper/Gold Deposit 4

The final case study for GEMNET II tested its limits in terms of speed and
computational overhead. The dataset used is relatively large, at least for a case study
(over 30,000 samples!). It is very different from the previous three, not only in the
number of samples but also in the sampling density and the complexity of the
orebody. As shown in Fig. 8.37, it is a massive copper/gold deposit that has
undergone an extensive exploration programme.
Figure 8.37: Orebody, drillholes, and block model extents from copper/gold deposit 4.
The dataset used in this case study consists of only the underground drillholes, as
these intersect the orebody. There are over 300 underground drillholes from
existing underground workings that were used to delineate the orebody and prove the
reserves.
As expected, this was the longest case study in terms of the time required by
GEMNET II to complete the estimation process. More specifically, GEMNET II
required a total of 14 hours for the entire process. It is quite interesting, though, to
give the breakdown of this time over the various processes involved. The generation
of training patterns for the neural networks took most of this time (10 hours!).
Processing of the block model required around one and a half hours, and the actual
estimation process only 30 minutes. The time requirements in this case study are very
similar to those from other neural network applications that involve large amounts of
data.
The author did not perform the geostatistical study; it was performed by a
geologist at Maptek, who has certainly done a better job than the author could have
done himself. Unfortunately, there is no information on the time spent on the
geostatistical study, but the author believes it would be at least a matter of days.

The following table summarises the results from the application of all three
techniques to the data of this case study. Once again, it should be noted that inverse
distance weighting benefited from the geostatistical study, which improved
significantly the results obtained with this technique.
Table 8.8: Summary of estimation results from copper/gold deposit 4.

                            Average Grade  ABS Error %
Actual                      4.1316         –
Inverse Distance Weighting  3.9264         19.78 %
Kriging                     3.9014         14.46 %
GEMNet II                   3.8907         15.04 %
The performance of the three estimators becomes clearer by examining the
data fit and distribution graphs given in the following figures. All three techniques
performed well.
[Chart: Gold Grades Data Fit – estimated vs. actual grade scatter; series: ID2, Kriging, GEMNet II]
Figure 8.38: Scatter diagram of actual vs. estimated gold grades from copper/gold deposit 4.
[Chart: Gold Grade Distribution – frequency per grade bin; series: Actual, ID2, Kriging, GEMNet II]
Figure 8.39: Gold grade distributions from copper/gold deposit 4.
Quite clearly, GEMNET II performs well at low and average grades, with some
underestimation of high-grade samples, which however is less pronounced than in the
previous cases.
Unlike the previous case studies, slices through the block model are given in
3D view, illustrating the capabilities of the graphical environment (Envisage) and the
benefits of the integration of GEMNET II. As usual, the block model slices are
coloured according to the reliability indicator, module index, and estimated gold
grade values.
Figure 8.40: 3D view of sections through the block model coloured by the reliability
indicator values from copper/gold deposit 4. Orebody model is also shown.
Figure 8.41: 3D view of block model sections coloured by module index values from
copper/gold deposit 4.
Figure 8.42: 3D view of block model sections coloured by estimated gold grades from
copper/gold deposit 4.
8.6 Conclusions

The case studies in this chapter have demonstrated the use of GEMNET II as an
integrated grade estimation system. The case studies were presented in order of
increasing complexity and were chosen to illustrate the usability as well as the
performance of GEMNET II. They were also chosen to provide the basis for
comparison with the two most popular advanced estimation techniques, inverse
distance weighting and kriging.
The four case studies were completed without major problems, especially in
the application of GEMNET II. The author did not use any additional information or
knowledge for its application other than the grades provided by the drillhole samples.
The same cannot be said for the other two methods, which required a geostatistical
study that normally took days to complete. On the other hand, the author did make use
of the validation tools developed for GEMNET II to examine the system's operation
and draw conclusions as to potential problems. These validation tools were used to
fine-tune the modular neural network architecture that comprises the core of GEMNET II.
The benefits of integration with VULCAN, the resource-modelling package,
were also demonstrated. The advanced graphical functions provided by its graphical
environment, Envisage, allowed the visualisation of the results from GEMNET II and
the development and use of specialised validation tools.
In all four case studies, GEMNET II performed very well, even in comparison
with the already established techniques. The results obtained have shown that it
is a reliable and fast grade estimation system. GEMNET II has shown its potential as a
valid alternative that can handle large amounts of data quickly and without being
unduly influenced by extreme values. However, these case studies have also clearly
demonstrated that GEMNET II, like any other advanced grade estimation system,
still depends on the results of geological modelling.
Finally, it should be noted once again that the case studies presented have been
limited by the fact that most of the data available to the author were confidential.
GEMNET II has been applied with success to a number of other deposits, including
potash, zinc, and iron ore deposits. Unfortunately, the author did not have the right to
publish the results from these studies.
9. Conclusions and Further Research
9.1 Conclusions

Grade estimation is the most computationally intensive stage of a mineral deposit
evaluation. It is also one of the most critical, as the results obtained at this stage
will determine to a great extent the profitability of a mining project. In other words,
decisions that involve large amounts of financial resources depend on the
results of the grade estimation process.
Grade estimation is mainly a process of interpolation from exploration data.
There is a high cost associated with the exploration of mineral deposits and, for this
reason, the amount of available data is usually small in comparison to the area that
has to be estimated.
Depending on the complexity of the given deposit and the required accuracy
of the estimates, different techniques are currently used with the most advanced being
the techniques provided by the geostatistical methodology. These techniques have
been developed to reflect the geological picture of the deposit in space. When
effectively applied, these techniques can provide very accurate results.
However, the geostatistical methodology, being very complex, requires
knowledge and expertise to be effectively applied. This knowledge and expertise is
often not present and in some cases the people who apply geostatistics have
insufficient experience in the field. As a result, the grade estimates produced are not
accurate and the mining industry very often doubts the reliability of the method. It is
generally accepted by geostatisticians that given the same data, different people will
almost certainly produce different estimates using geostatistics.
Geostatistics is also based on assumptions about the distribution of grades, which
in many cases are acceptable. There are, however, deposits where the required
assumptions cannot be made. Unfortunately, there are cases where people who apply
geostatistics do not consider this fact. In some cases it is also very difficult to
determine whether the required assumptions are valid for the given deposit. The above
problems have led scientists to search for alternative methods.
In recent years the application of Artificial Intelligence (AI) tools in the
mining industry has become more common, especially in system control applications
and decision-making. One of the most important AI tools, Artificial Neural Networks
(ANNs), has been applied with success to problems that involve large amounts of data
of unknown quality.
ANNs are computing structures based on simplified models of biological
neural systems such as the human brain. They develop solutions to problems by
'learning' a required response from examples. One of the problems for which ANNs
are very successful in providing solutions is function approximation. Grade estimation
can be considered a problem of approximating an unknown function from examples
provided by exploration data.
There are different ways of forming examples for training ANNs from
exploration data. Examples usually come in the form of input-output patterns, with the
output being the modelled variable. In the case of grade estimation, the inputs can be
the sample co-ordinates in space, or other measurements, and the output is normally
the grade. The choice of input parameters dictates the vector space in which the grade
will be approximated. This choice is essential for the estimation process using ANNs.
The choice of a type of ANN for grade estimation is limited to those following
the supervised paradigm explained above. There are two main candidate types of
ANNs for function approximation problems: the Multi-Layered Perceptron
(MLP) and the Radial Basis Function network (RBF). The RBF network seems to be
the better choice for grade estimation, as it constructs local approximations as opposed
to the global approximations of the MLP. It is generally accepted that grade is a localised
variable and therefore RBF networks are ideal for its estimation.
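A minimal forward pass of a Gaussian RBF network of this kind is sketched below in C; it mirrors the structure of the SNNS-generated code listed in Appendix A4, with example sizes and values.

#include <stdio.h>
#include <math.h>

/* Minimal forward pass of a Gaussian RBF network of the kind used
   in GEMNET II (compare the SNNS-generated code in Appendix A4):
   each hidden unit responds to the squared distance between the
   input and its centre, and the output unit applies a logistic
   function to the weighted sum. Sizes and values are examples. */
#define NIN 3
#define NHID 2

int main(void)
{
    double centre[NHID][NIN] = {{0.1,0.6,0.0},{0.8,0.2,0.5}};
    double bias[NHID] = { 0.9, 0.7 };          /* widths (as in SNNS) */
    double w[NHID] = { 1.5, -0.8 };            /* hidden-to-output weights */
    double out_bias = 0.4;
    double in[NIN] = { 0.2, 0.5, 0.1 };        /* e.g. grade, distance, length */

    double sum = 0.0;
    for (int h = 0; h < NHID; h++) {
        double d2 = 0.0;
        for (int i = 0; i < NIN; i++) {
            double diff = in[i] - centre[h][i];
            d2 += diff * diff;
        }
        sum += w[h] * exp(-d2 * bias[h]);      /* Gaussian activation */
    }
    printf("estimate = %f\n", 1.0 / (1.0 + exp(-sum - out_bias)));
    return 0;
}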
The objectives of the research presented in this thesis were to take the
development of neural network based estimation techniques a step further than other
researchers have done in the past. More specifically, the developed ANN based
system for grade estimation should be able to handle 3D exploration data from real
deposits and perform estimation on a 3D block model basis. The estimation process
itself should honour the distribution of grades in 3D space and take into account the
spatial variability of grades in different directions in space.
The developed system, GEMNET II, is integrated with one of the leading
packages for earth resources modelling, VULCAN. The potential benefits of
integration were exploited to the maximum extent. GEMNET II takes advantage of
VULCAN’s graphical environment and capabilities to present its estimation results in
3D. A number of validation tools measuring the reliability of the produced estimates
as well as showing useful information on the estimation process itself have been
developed using these graphical capabilities. As a result, GEMNET II is not just an
ANN based interpolator but a complete system for grade estimation that can be
integrated in the larger mine planning and design process.
The reliability and estimation performance of GEMNET II has been verified by
a number of case studies, some of them presented in this thesis. From the results
obtained in most of these studies, and in comparison to results obtained using
geostatistics, it becomes clear that GEMNET II is a valid alternative that can turn the
great potential of ANNs in the field of grade estimation into a complete system.
GEMNET II is user-independent, i.e. its results do not depend on user input or
modifications of the estimation technique. This does not mean, however, that the user
has no control over the estimation process, or that GEMNET II can override the
geological modelling that should precede any grade estimation. GEMNET II still
depends on the given data – in fact, its performance depends solely on them. The user
can therefore improve its performance by controlling the data used to build the
examples for training the networks in GEMNET II. The validation tools provided by
the system can aid the user in this task by indicating areas where GEMNET II is
facing difficulties in giving accurate results.
The work presented in this thesis shows that ANNs can be used to develop
solutions for grade estimation problems and that ANNs as approximators do not lack
a mathematical background, a misconception held by many people in the mining
industry. ANNs have a very rich theoretical background that spans many
different scientific fields. Their application to a problem like grade estimation is not a
'black box' approach, i.e. their results and overall operation can be validated and
justified. GEMNET II is a good example of how ANNs can be successfully used to
develop a grade estimation solution.
9.2 Further Research

Artificial Neural Networks are a rapidly evolving field, which means that there is an
almost constant development of new architectures and learning algorithms. Therefore,
there will always be new ANNs to try on the problem of grade estimation. Regarding
GEMNET II, there have been many improvements to the standard Radial Basis
Function network since the beginning of the work presented in this thesis. The most
important ones concern the intelligent control of the number of RBFs required for a
given problem, as well as the design and adaptation of their shape. There is no reason
why the hyper-spherical shape of the RBFs should be ideal for all problems. Many
researchers have tried other basis functions, such as rectangular basis functions.
As the architecture of GEMNET II is modular, i.e. it consists of several
neural networks, there will always be room for improvement. The number of
networks, the 3D search method, and, most importantly, the way the individual
estimates are combined into a single estimate can all be areas of further work. The author
suggests that a more flexible search method that adapts to the given sampling
geometry could improve the estimation performance as well as the speed of training
pattern generation. A new search method would, of course, result in a change in the
number of networks: a varying number of sectors leads to a varying number of
networks trained on sector data.
The way the individual estimates are combined into a single estimate for
each point can also be further optimised. In GEMNET II, an RBF network is
responsible for averaging the individual estimates, but this is not necessarily the only
way of achieving this. New networks can be tested, and perhaps another solution can
be found that is not based on ANNs.
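As one example of a non-ANN alternative, the following sketch combines the module estimates with a weighted average, here weighted by the number of samples each module saw. This is only a suggestion in the spirit of the paragraph above, not part of the current GEMNET II.

#include <stdio.h>

/* Sketch of one non-ANN alternative for combining the individual
   module estimates: a weighted average, here weighted by the number
   of samples each module saw. A suggestion only, not part of the
   current GEMNET II. */
double combine(const double *est, const int *nsamples, int nmod)
{
    double wsum = 0.0, esum = 0.0;
    for (int m = 0; m < nmod; m++) {
        wsum += nsamples[m];
        esum += nsamples[m] * est[m];
    }
    return esum / wsum;
}

int main(void)
{
    double est[] = { 0.92, 1.10, 0.85 };
    int    ns[]  = { 12, 4, 20 };
    printf("combined estimate = %.3f\n", combine(est, ns, 3));
    return 0;
}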
The effect of using various ANN modules on the block model estimates needs
to be examined. The use of different ANNs for different blocks can be a source of
inconsistencies in the estimates produced and can possibly introduce a bias.
The author believes that using GEMNET II, and especially the validation tools
provided, can help in investigating ways of improving the system. The integration with
VULCAN can be taken even further. Direct access to the block model, and possibly
allowing the use of grid models as the basis of grade estimation, would significantly
increase the speed of the system.
The use of a neural network simulator like SNNS helped in the development
stages of GEMNET II. Once the architecture is finalised, there is no reason why the
system should still depend on a simulator for the development of the neural networks.
Including the network code in the core of the system would significantly increase
the speed of training and application of the networks. However, this should not be
done at the expense of the flexibility to change critical parameters of the learning
algorithm or the RBFs.
The validation tools can be further developed to include options for more
accurate measurement of the reliability of the estimates, such as confidence intervals. An
indication of when the networks are extrapolating would also be useful, to flag
areas where the sampling is insufficient.
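A very simple form of such an indication is sketched below: a block centroid with no sample within a chosen radius is flagged as extrapolated. The radius is a user choice and not a GEMNET II parameter.

#include <stdio.h>
#include <math.h>

/* Sketch of a simple extrapolation flag of the kind suggested
   above: a block centroid further from its nearest sample than a
   chosen radius is marked as an extrapolated (poorly sampled)
   estimate. The radius is a user choice, not a GEMNET II value. */
int is_extrapolating(const double p[3], const double s[][3], int n, double radius)
{
    for (int i = 0; i < n; i++) {
        double dx = p[0]-s[i][0], dy = p[1]-s[i][1], dz = p[2]-s[i][2];
        if (sqrt(dx*dx + dy*dy + dz*dz) <= radius)
            return 0;   /* at least one sample nearby: interpolating */
    }
    return 1;           /* no sample within the radius: extrapolating */
}

int main(void)
{
    double samples[][3] = {{0,0,0},{100,0,0}};
    double block[3] = { 300, 300, 50 };
    printf("extrapolating: %d\n", is_extrapolating(block, samples, 2, 150.0));
    return 0;
}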
The performance of GEMNET II in terms of the block model estimates needs
to be investigated. As it is very difficult to obtain the actual block grades from
real deposits, other cases should be examined, such as simulated deposits, in order to
study the behaviour of the system while estimating volumes larger than those of
drillhole samples. The effect of the sample support input to the system needs further
investigation.
Finally, it should be noted that a system such as GEMNET II, based on artificial
neural networks, will require time to gain the acceptance of the mining industry. One
should not forget how difficult it was, and how much time it took, for geostatistics
to become established and widely used three decades ago. Allowing as many people as
possible to experience the use of GEMNET II and draw their own conclusions is the
only way to establish it as a valid alternative method for grade estimation and
probably the best way towards further improvements.
Appendix A – File Structures
A1. SNNS Network Description File

SNNS network definition file V1.4-3D
generated at Tue Sep 28 11:56:29 1999

network name : east
source files :
no. of units : 44
no. of connections : 160
no. of unit types : 0
no. of site types : 0

learning function : RadialBasisLearning
update function   : Topological_Order

unit default section :

act      | bias     | st | subnet | layer | act func         | out func
---------|----------|----|--------|-------|------------------|-------------
 0.00000 |  0.00000 | h  |      0 |     1 | Act_RBF_Gaussian | Out_Identity
---------|----------|----|--------|-------|------------------|-------------

unit definition section :

no. | typeName | unitName | act      | bias     | st | position | act func     | out func | sites
----|----------|----------|----------|----------|----|----------|--------------|----------|-------
  1 |          | Grade    |  0.02936 |  0.00000 | i  |  2, 2,72 | Act_Identity |          |
  2 |          | Distance |  0.44031 |  0.00000 | i  |  3, 2,72 | Act_Identity |          |
  3 |          | Length   |  0.00646 |  0.00000 | i  |  4, 2,72 | Act_Identity |          |
  4 |          | c1       |  0.97190 |  0.93847 | h  |  1, 7,68 |||
  5 |          | c2       |  0.87191 |  0.85472 | h  |  2, 7,68 |||
  6 |          | c3       |  0.96511 |  0.70236 | h  |  3, 7,68 |||
  7 |          | c4       |  0.85429 |  0.89278 | h  |  4, 7,68 |||
  8 |          | c5       |  0.87341 |  0.91482 | h  |  5, 7,68 |||
  9 |          | c6       |  0.83433 |  0.94826 | h  |  1, 7,69 |||
 10 |          | c7       |  0.85495 |  0.89164 | h  |  2, 7,69 |||
 11 |          | c8       |  0.85175 |  0.90161 | h  |  3, 7,69 |||
 12 |          | c9       |  0.85231 |  0.88927 | h  |  4, 7,69 |||
 13 |          | c10      |  0.82321 |  0.95233 | h  |  5, 7,69 |||
 14 |          | c11      |  0.84729 |  0.88713 | h  |  1, 7,70 |||
 15 |          | c12      |  0.85888 |  0.90323 | h  |  2, 7,70 |||
 16 |          | c13      |  0.83049 |  0.85461 | h  |  3, 7,70 |||
 17 |          | c14      |  0.86441 |  0.89943 | h  |  4, 7,70 |||
 18 |          | c15      |  0.85100 |  0.88699 | h  |  5, 7,70 |||
 19 |          | c16      |  0.84793 |  0.92424 | h  |  1, 7,71 |||
 20 |          | c17      |  0.84574 |  0.86460 | h  |  2, 7,71 |||
 21 |          | c18      |  0.85321 |  0.89859 | h  |  3, 7,71 |||
 22 |          | c19      |  0.81511 |  0.96778 | h  |  4, 7,71 |||
 23 |          | c20      |  0.85326 |  0.90119 | h  |  5, 7,71 |||
 24 |          | c21      |  0.86209 |  0.88879 | h  |  1, 7,72 |||
 25 |          | c22      |  0.86017 |  0.88878 | h  |  2, 7,72 |||
 26 |          | c23      |  0.85986 |  0.89859 | h  |  3, 7,72 |||
 27 |          | c24      |  0.86807 |  0.93349 | h  |  4, 7,72 |||
 28 |          | c25      |  0.86555 |  0.89128 | h  |  5, 7,72 |||
 29 |          | c26      |  0.87574 |  0.92786 | h  |  1, 7,73 |||
 30 |          | c27      |  0.86566 |  0.90208 | h  |  2, 7,73 |||
 31 |          | c28      |  0.87349 |  0.87303 | h  |  3, 7,73 |||
 32 |          | c29      |  0.95634 |  0.91290 | h  |  4, 7,73 |||
 33 |          | c30      |  0.99132 |  0.96239 | h  |  5, 7,73 |||
 34 |          | c31      |  0.54082 |  0.81074 | h  |  1, 7,74 |||
 35 |          | c32      |  0.85517 |  0.90117 | h  |  2, 7,74 |||
 36 |          | c33      |  0.82977 |  0.97302 | h  |  3, 7,74 |||
 37 |          | c34      |  0.86362 |  0.92169 | h  |  4, 7,74 |||
 38 |          | c35      |  0.87332 |  0.93760 | h  |  5, 7,74 |||
 39 |          | c36      |  0.84698 |  0.92475 | h  |  1, 7,75 |||
 40 |          | c37      |  0.89793 |  0.92879 | h  |  2, 7,75 |||
 41 |          | c38      |  0.92686 |  0.90759 | h  |  3, 7,75 |||
 42 |          | c39      |  0.84971 |  0.90679 | h  |  4, 7,75 |||
 43 |          | c40      |  1.00000 |  0.67025 | h  |  5, 7,75 |||
 44 |          | Target   |  0.10965 |  0.42555 | o  |  3,12,72 | Act_Logistic |          |
----|----------|----------|----------|----------|----|----------|--------------|----------|-------

connection definition section :

target | site | source:weight
-------|------|----------------------------------------------------------------
  4    |      | 1: 0.10449, 2: 0.59754, 3: 0.00881
  5    |      | 1: 0.07599, 2: 0.04265, 3: 0.01435
  6    |      | 1: 0.06131, 2: 0.21773, 3: 0.00430
  7    |      | 1: 0.03022, 2: 0.02039, 3: 0.01435
  8    |      | 1: 0.00691, 2: 0.05634, 3: 0.00287
  9    |      | 1: 0.15199, 2: 0.02084, 3: 0.01076
 10    |      | 1: 0.04750, 2: 0.02152, 3: 0.00001
 11    |      | 1: 0.06909, 2: 0.02042, 3: 0.01578
 12    |      | 1: 0.03022, 2: 0.01647, 3: 0.01435
 13    |      | 1: 0.18998, 2: 0.01791, 3: 0.01435
 14    |      | 1: 0.17349, 2: 0.03287, 3: 0.01004
 15    |      | 1: 0.03800, 2: 0.03009, 3: 0.01435
 16    |      | 1: 0.23143, 2: 0.02020, 3: 0.00861
 17    |      | 1: 0.17349, 2: 0.06453, 3: 0.01004
 18    |      | 1: 0.11744, 2: 0.02308, 3: 0.01435
 19    |      | 1: 0.01036, 2: 0.01827, 3: 0.00646
 20    |      | 1: 0.15026, 2: 0.01704, 3: 0.00574
 21    |      | 1: 0.05354, 2: 0.02071, 3: 0.01004
 22    |      | 1: 0.20812, 2: 0.01691, 3: 0.00359
 23    |      | 1: 0.04836, 2: 0.02118, 3: 0.01435
 24    |      | 1: 0.03627, 2: 0.03181, 3: 0.00003
 25    |      | 1: 0.08981, 2: 0.03318, 3: 0.01435
 26    |      | 1: 0.18048, 2: 0.05929, 3: 0.01004
 27    |      | 1: 0.13126, 2: 0.06458, 3: 0.00859
 28    |      | 1: 0.02159, 2: 0.04027, 3: 0.05021
 29    |      | 1: 0.02159, 2: 0.06224, 3: 0.00574
 30    |      | 1: 0.05354, 2: 0.04115, 3: 0.00716
 31    |      | 1: 0.15026, 2: 0.06573, 3: 0.00574
 32    |      | 1: 0.11744, 2: 0.23762, 3: 0.01435
 33    |      | 1: 0.09499, 2: 0.37245, 3: 0.01865
 34    |      | 1: 0.90000, 2: 0.44864, 3: 0.01548
 35    |      | 1: 0.12867, 2: 0.03576, 3: 0.01578
 36    |      | 1: 0.15285, 2: 0.02016, 3: 0.00359
 37    |      | 1: 0.15976, 2: 0.06343, 3: 0.01291
 38    |      | 1: 0.11054, 2: 0.06900, 3: 0.00717
 39    |      | 1: 0.22884, 2: 0.06650, 3: 0.01433
 40    |      | 1: 0.18998, 2: 0.14022, 3: 0.01435
 41    |      | 1: 0.06304, 2: 0.15300, 3: 0.00430
 42    |      | 1: 0.10190, 2: 0.02282, 3: 0.00001
 43    |      | 1: 0.02936, 2: 0.44031, 3: 0.00646
 44    |      | 4: 1.97859, 5:49.71570, 6:59.90763, 7:-37.35740, 8:-16.30947,
                9:37.28542, 10:-39.35506, 11:12.70683, 12:-30.57939, 13:22.76179,
                14:-13.37884, 15:-13.06231, 16:-15.15001, 17: 5.64048, 18:-39.55381,
                19:50.59846, 20:-33.14001, 21:-9.88406, 22:22.56762, 23:14.93855,
                24:39.54713, 25:32.85326, 26:-6.63768, 27:-38.57726, 28:12.00233,
                29:-22.77133, 30:-3.05694, 31:42.40548, 32:-5.19528, 33:27.19900,
                34:-2.25054, 35:-47.43574, 36:48.85722, 37:-50.62162, 38:-30.12116,
                39:16.00671, 40:-21.60845, 41:-2.33463, 42:17.54285, 43:-39.69540
-------|------|----------------------------------------------------------------
A2. SNNS Network Pattern File

SNNS pattern definition file V3.2
generated at Tue Jun 16 11:15:00 1998

No. of patterns : 8618
No. of input units : 3
No. of output units : 1

0.001727 0.021182 0.002740 0.041451
0.063040 0.021057 0.010760 0.041451
0.158895 0.020829 0.014347 0.041451
0.140760 0.020625 0.009326 0.041451
0.293610 0.019989 0.012195 0.041451
0.014680 0.019852 0.008608 0.041451
0.162349 0.019708 0.014347 0.041451
0.071675 0.019544 0.014347 0.041451
0.004318 0.019400 0.014347 0.041451
0.008636 0.019272 0.014347 0.041451
0.075993 0.019161 0.014347 0.041451
0.001727 0.021308 0.002740 0.054404
0.063040 0.021180 0.010760 0.054404
0.158895 0.020946 0.014347 0.054404
0.140760 0.020736 0.009326 0.054404
0.293610 0.020079 0.012195 0.054404
0.014680 0.019935 0.008608 0.054404
0.162349 0.019785 0.014347 0.054404
0.071675 0.019613 0.014347 0.054404
A3. BATCHMAN Network Development Script

# GEMNet II
# East Module Training Procedure
# Optimized 29/7/1999
# Ioannis Kapageridis 1999

print("GEMNet II - Neural Network Development")
print("Module 1 - East Network")

loadNet ("east\eastut.net")
loadPattern("east\east.pat")
loadPattern("east\eastx.pat")
setPattern("east\eastx.pat")
print ("Number of patterns :",PAT)

trainNet()

setInitFunc("Randomize_Weights")
initNet()
setInitFunc("RBF_Weights_Kohonen",1000.0,0.4,0.0)
initNet()
setInitFunc("RBF_Weights",-0.8,0.8,0.2,0.9,0.0)
initNet()
saveNet ("east\eastini.net")
print("SSE = ",SSE)

# hidden unit bias training
setLearnFunc("RadialBasisLearning",0.0,0.001,0.0,0.01,0.6)
while CYCLES < 500 do
  trainNet()
endwhile
print("SSE = ",SSE)

# RBF centres training
setLearnFunc("RadialBasisLearning",0.001,0.0,0.0,0.01,0.6)
while CYCLES < 1000 do
  trainNet()
endwhile
print("SSE = ",SSE)

# hidden-output layer weights training
setLearnFunc("RadialBasisLearning",0.0,0.0,0.001,0.01,0.6)
while CYCLES < 1250 do
  trainNet()
endwhile
print ("SSE = ",SSE, " MSE = ", MSE)

loadPattern("east\eastx.pat")
setPattern("east\eastx.pat")
saveResult("east\east.res",1,PAT,FALSE,TRUE,"create")
saveNet("east\easttr.net")
A4. SNNS2C Network C Code Extract

/*********************************************************
  d:\gemnns\east\east.c
  --------------------------------------------------------
  generated at Tue Sep 28 12:33:23 1999
  by snns2c ( Bernward Kett 1995 )
*********************************************************/

#include <math.h>

#define Act_Logistic(sum, bias)      ( (sum+bias<10000.0) ? ( 1.0/(1.0 + exp(-sum-bias) ) ) : 0.0 )
#define Act_Identity(sum, bias)      ( sum )
#define Act_RBF_Gaussian(sum2, bias) (exp(-sum2 * bias) )
#define NULL (void *)0

typedef struct UT {
         float  act;          /* Activation */
         float  Bias;         /* Bias of the Unit */
         int    NoOfSources;  /* Number of predecessor units */
  struct UT   **sources;      /* predecessor units */
         float *weights;      /* weights from predecessor units */
} UnitType, *pUnit;

/* Forward Declaration for all unit types */
static UnitType Units[45];

/* Sources definition section */
static pUnit Sources[] = {
  Units + 1, Units + 2, Units + 3, Units + 1, Units + 2, Units + 3,
  Units + 1, Units + 2, Units + 3, Units + 1, Units + 2, Units + 3,
  Units + 1, Units + 2, Units + 3, Units + 1, Units + 2, Units + 3,
  Units + 1, Units + 2, Units + 3, Units + 1, Units + 2, Units + 3,
  Units + 1, Units + 2, Units + 3, Units + 1, Units + 2, Units + 3,
  Units + 1, Units + 2, Units + 3, Units + 1, Units + 2, Units + 3,
  Units + 1, Units + 2, Units + 3, Units + 1, Units + 2, Units + 3,
  Units + 1, Units + 2, Units + 3,
  Units + 4, Units + 5, Units + 6, Units + 7, Units + 8, Units + 9,
  Units + 10, Units + 11, Units + 12, Units + 13, Units + 14, Units + 15,
  Units + 16, Units + 17, Units + 18, Units + 19, Units + 20, Units + 21,
  Units + 22, Units + 23, Units + 24, Units + 25, Units + 26, Units + 27,
  Units + 28, Units + 29, Units + 30, Units + 31, Units + 32, Units + 33,
  Units + 34, Units + 35, Units + 36, Units + 37, Units + 38, Units + 39,
  Units + 40, Units + 41, Units + 42, Units + 43,
};

/* Weigths definition section */
static float Weights[] = {
  0.104490, 0.597540, 0.008810,
  0.075990, 0.042650, 0.014350,
  0.061310, 0.217730, 0.004300,
  0.030220, 0.020390, 0.014350,
  0.006910, 0.056340, 0.002870,
  0.151990, 0.020840, 0.010760,
  0.047500, 0.021520, 0.000010,
  0.069090, 0.020420, 0.015780,
  0.030220, 0.016470, 0.014350,
  0.189980, 0.017910, 0.014350,
  0.173490, 0.032870, 0.010040,
  0.038000, 0.030090, 0.014350,
  0.231430, 0.020200, 0.008610,
  0.173490, 0.064530, 0.010040,
  1.978590, 49.715698, 59.907631, -37.357399, -16.309469, 37.285419,
  -39.355061, 12.706830, -30.579390, 22.761789, -13.378840, -13.062310,
  -15.150010, 5.640480, -39.553810, 50.598461, -33.140011, -9.884060,
  22.567619, 14.938550, 39.547131, 32.853260, -6.637680, -38.577259,
  12.002330, -22.771330, -3.056940, 42.405479, -5.195280, 27.198999,
  -2.250540, -47.435741, 48.857220, -50.621620, -30.121161, 16.006710,
  -21.608450, -2.334630, 17.542850, -39.695400,
};

/* unit definition section (see also UnitType) */
static UnitType Units[45] = {
  { 0.0, 0.0, 0, NULL , NULL },
  { /* unit 1 (Grade) */
    0.0, 0.000000, 0, &Sources[0] , &Weights[0] , },
  { /* unit 2 (Distance) */
    0.0, 0.000000, 0, &Sources[0] , &Weights[0] , },
  { /* unit 3 (Length) */
    0.0, 0.000000, 0, &Sources[0] , &Weights[0] , },
  { /* unit 4 (c1) */
    0.0, 0.938470, 3, &Sources[0] , &Weights[0] , },
  { /* unit 5 (c2) */
    0.0, 0.854720, 3, &Sources[3] , &Weights[3] , },
  { /* unit 6 (c3) */
    0.0, 0.702360, 3, &Sources[6] , &Weights[6] , },
  { /* unit 7 (c4) */
    0.0, 0.892780, 3, &Sources[9] , &Weights[9] , },
  { /* unit 8 (c5) */
    0.0, 0.914820, 3, &Sources[12] , &Weights[12] , },
  { /* unit 9 (c6) */
    0.0, 0.948260, 3, &Sources[15] , &Weights[15] , },
  { /* unit 10 (c7) */
    0.0, 0.891640, 3, &Sources[18] , &Weights[18] , },
  { /* unit 11 (c8) */
    0.0, 0.901610, 3, &Sources[21] , &Weights[21] , },
  { /* unit 12 (c9) */
    0.0, 0.889270, 3, &Sources[24] , &Weights[24] , },
  { /* unit 13 (c10) */
    0.0, 0.952330, 3, &Sources[27] , &Weights[27] , },
  { /* unit 14 (c11) */
    0.0, 0.887130, 3, &Sources[30] , &Weights[30] , },
  { /* unit 15 (c12) */
    0.0, 0.903230, 3, &Sources[33] , &Weights[33] , },
  { /* unit 16 (c13) */
    0.0, 0.854610, 3, &Sources[36] , &Weights[36] , },
  { /* unit 17 (c14) */
    0.0, 0.899430, 3, &Sources[39] , &Weights[39] , },
  { /* unit 18 (c15) */
    0.0, 0.886990, 3, &Sources[42] , &Weights[42] , },
  { /* unit 19 (c16) */
    0.0, 0.924240, 3, &Sources[45] , &Weights[45] , },
  { /* unit 20 (c17) */
    0.0, 0.864600, 3, &Sources[48] , &Weights[48] , },

int east(float *in, float *out, int init)
{
  int member, source;
  float sum;
  enum{OK, Error, Not_Valid};
  pUnit unit;

  /* layer definition section (names & member units) */

  static pUnit Input[3] = {Units + 1, Units + 2, Units + 3}; /* members */

  static pUnit Hidden1[40] = {Units + 4, Units + 5, Units + 6, Units + 7,
    Units + 8, Units + 9, Units + 10, Units + 11, Units + 12, Units + 13,
    Units + 14, Units + 15, Units + 16, Units + 17, Units + 18, Units + 19,
    Units + 20, Units + 21, Units + 22, Units + 23, Units + 24, Units + 25,
    Units + 26, Units + 27, Units + 28, Units + 29, Units + 30, Units + 31,
    Units + 32, Units + 33, Units + 34, Units + 35, Units + 36, Units + 37,
    Units + 38, Units + 39, Units + 40, Units + 41, Units + 42, Units + 43}; /* members */

  static pUnit Output1[1] = {Units + 44}; /* members */

  static int Output[1] = {44};

  for(member = 0; member < 3; member++) {
    Input[member]->act = in[member];
  }

  for (member = 0; member < 40; member++) {
    unit = Hidden1[member];
    sum = 0.0;
    for (source = 0; source < unit->NoOfSources; source++) {
      static float diff;
      diff = unit->sources[source]->act - unit->weights[source];
      sum += diff * diff;
    }
    unit->act = Act_RBF_Gaussian(sum, unit->Bias);
  };

  for (member = 0; member < 1; member++) {
    unit = Output1[member];
    sum = 0.0;
    for (source = 0; source < unit->NoOfSources; source++) {
      sum += unit->sources[source]->act * unit->weights[source];
    }
    unit->act = Act_Logistic(sum, unit->Bias);
  };

  for(member = 0; member < 1; member++) {
    out[member] = Units[Output[member]].act;
  }

  return(OK);
}
A5. VULCAN Composites File

*
* DEFINITION
* HEADER_VARIABLES 5
* COMPID C 16 0 key
* CTYPE  C 12 0
* DATE   C 12 0
* TIME   C 12 0
* DESCRP C 80 0
*
* VARIABLES 17
* DHID   C 12 0
* MIDX   F 12 3
* MIDY   F 12 3
* MIDZ   F 12 3
* TOPX   F 12 3
* TOPY   F 12 3
* TOPZ   F 12 3
* BOTX   F 12 3
* BOTY   F 12 3
* BOTZ   F 12 3
* LENGTH F 12 3
* FROM   F 12 3
* TO     F 12 3
* GEOCOD C 12 0
* BOUND  C 12 0
* AU     F 12 3
* ORE    F  2 0
*
* HEADER:GOLD STRAIGHT 23-Oct-98 16:40:47 Compositing Run

DDFD/A7 1910.318 2088.532 1013.920 1910.649 2088.666 1014.277 1909.987 2088.398 1013.563 1.010 144.720 145.730 NONE 0 1.550 0
DDFD/A7 1909.656 2088.264 1013.206 1909.987 2088.398 1013.563 1909.325 2088.131 1012.848 1.010 145.730 146.740 NONE 0 3.930 0
DDFD/A7 1908.994 2087.997 1012.491 1909.325 2088.131 1012.848 1908.662 2087.863 1012.134 1.010 146.740 147.750 NONE 0 1.370 0
DDFD/A7 1908.331 2087.729 1011.777 1908.662 2087.863 1012.134 1908.000 2087.595 1011.420 1.010 147.750 148.760 NONE 0 2.990 0
DDFD/A7 1907.669 2087.462 1011.063 1908.000 2087.595 1011.420 1907.338 2087.328 1010.706 1.010 148.760 149.770 NONE 0 1.650 0
DDFD/A7 1907.007 2087.194 1010.349 1907.338 2087.328 1010.706 1906.676 2087.060 1009.992 1.010 149.770 150.780 NONE 0 1.070 0
DDFD/A7 1906.345 2086.927 1009.635 1906.676 2087.060 1009.992 1906.014 2086.793 1009.278 1.010 150.780 151.790 NONE 0 1.620 0
DDFD/A7 1905.683 2086.659 1008.920 1906.014 2086.793 1009.278 1905.352 2086.525 1008.563 1.010 151.790 152.800 NONE 0 2.690 0
DDFD/A7 1905.020 2086.392 1008.206 1905.352 2086.525 1008.563 1904.689 2086.258 1007.849 1.010 152.800 153.810 NONE 0 4.230 0
DDFD/A7 1904.358 2086.124 1007.492 1904.689 2086.258 1007.849 1904.027 2085.990 1007.135 1.010 153.810 154.820 NONE 0 1.550 0
DDFD/A7 1903.696 2085.856 1006.778 1904.027 2085.990 1007.135 1903.365 2085.723 1006.421 1.010 154.820 155.830 NONE 0 9.120 0
Appendix B – Case Study Data
B1. Case Study 1 – 2D Iron Ore Deposit

Easting Northing % Fe
0 170 34.3
10 40 35.5
15 135 28.6
55 145 29.4
125 20 41.5
175 50 36.8
120 180 33.4
160 175 36
240 185 30.2
260 115 33.2
235 15 33.7
365 60 34.3
285 110 35.3
345 115 31
335 170 27.4
325 195 33.9
350 235 37.6
290 230 39.9
10 390 27.2
85 380 34.2
50 270 30.2
200 280 30.4
400 355 39.9
360 335 40
335 310 40.6
5 195 33.9
20 105 32.5
25 155 29.6
50 40 30.6
155 15 40.4
145 125 30.1
130 185 35.3
175 185 41.4
220 90 28.5
205 0 40.1
265 65 24.4
390 65 31.6
325 105 39.5
310 150 34.8
385 165 29.9
325 220 37.8
375 215 29.8
200 230 37.4
55 375 27.4
395 245 36.5
165 355 40.8
270 285 32.9
365 340 40
330 320 44.1
330 290 41.4
0 0 45.3
100 0 30.7
200 0 40
300 0 33.3
400 0 33.5
50 50 30.4
150 50 36.7
250 50 27.6
350 50 34.7
0 100 37.9
100 100 40.5
200 100 31.8
300 100 39.8
400 100 35.4
50 150 32.4
150 150 34.7
250 150 34.4
350 150 28.9
0 200 34.1
100 200 31.5
200 200 39.1
300 200 35.5
400 200 34.9
50 250 33.7
150 250 35.4
250 250 36.3
350 250 34.5
0 300 34.9
100 300 27.4
200 300 27.5
300 300 39
400 300 32.4
50 350 26.2
150 350 40
250 350 29.1
350 350 39.3
0 400 36.6
100 400 34.6
200 400 38.9
300 400 37.9
400 400 35.4
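Before samples such as these are presented to a neural network, each variable is normally scaled to a common range. The sketch below shows a plain min-max normalisation of the easting column; it is a generic illustration of this preprocessing step, not code from GEMNET II.

/* Generic min-max scaling of a data column to [0, 1]; illustrative
 * preprocessing sketch only, not code from GEMNET II. */
#include <stdio.h>

static void minmax_scale(const float *x, float *y, int n)
{
  float lo = x[0], hi = x[0];
  int i;

  for (i = 1; i < n; i++) {
    if (x[i] < lo) lo = x[i];
    if (x[i] > hi) hi = x[i];
  }
  for (i = 0; i < n; i++)
    y[i] = (hi > lo) ? (x[i] - lo) / (hi - lo) : 0.0f;
}

int main(void)
{
  /* first few easting values from the table above */
  float east[5] = {0.0f, 10.0f, 15.0f, 55.0f, 125.0f};
  float scaled[5];
  int i;

  minmax_scale(east, scaled, 5);
  for (i = 0; i < 5; i++)
    printf("%6.1f -> %.3f\n", east[i], scaled[i]);
  return 0;
}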
B2. Case Study 2 – 2D Copper Deposit

Easting Northing Cu
182.88 579.12 0.175
243.84 548.64 0.417
335.28 548.64 0.489
67.06 487.68 0.215
152.4 487.68 0.396
213.36 487.68 0.685
274.32 487.68 0.377
335.28 487.68 0.427
457.2 487.68 0.14
91.44 426.72 0.392
152.4 426.72 0.32
210.31 426.72 0.717
274.32 426.72 0.806
335.28 426.72 0.889
396.24 426.72 0.475
152.4 365.76 0.23
243.84 365.76 0.833
274.32 365.76 0.453
335.28 365.76 0.719
396.24 365.76 1.009
457.2 365.76 0.893
518.16 365.76 0.089
579.12 365.76 0.092
121.92 335.28 0.102
335.28 304.8 0.915
396.24 304.8 1.335
457.2 304.8 0.519
518.16 304.8 0.072
579.12 304.8 0.042
84.68 304.8 1.365
220.68 304.8 0.023
152.4 274.32 0.644
274.32 243.84 0.258
335.28 243.84 0.638
396.24 243.84 1.615
457.2 243.84 0.765
518.16 243.84 0.465
579.12 243.84 0.034
115.82 219.46 0.476
182.88 213.36 0.409
274.32 182.88 0.165
335.28 182.88 0.063
396.24 182.88 0.406
457.2 182.88 0.909
518.16 182.88 0.012
152.4 152.4 0.228
274.32 121.92 0.224
335.28 121.92 0.188
396.24 121.92 0.027
457.2 121.92 0.395
335.28 64.01 0.225
B3. Case Study 3 – 3D Gold Deposit

Easting Northing Elevation Length Au
78303.29 4776.742 120.257 0.435 0.028
78017.81 4631.307 93.487 0.682 0.045
78303.38 4776.22 118.688 0.02 0.089
78303.09 4777.861 123.62 0.02 0.104
77902.53 4564.935 74.358 0.199 0.123
78263.42 4744.765 95.17 0.343 0.141
77902.5 4558.01 79.926 0.589 0.155
78303.34 4776.443 119.357 0.03 0.159
77902.73 4564.614 72.496 0.189 0.163
78018.33 4630.683 90.6 0.199 0.172
78018.14 4630.912 91.658 0.02 0.18
78018.53 4630.444 89.493 0.305 0.181
78299.67 4784.724 92.872 0.257 0.199
78303.2 4777.247 121.773 0.03 0.201
78299.8 4784.294 91.44 0.218 0.203
77902.63 4564.766 73.378 0.06 0.207
78265.54 4740.429 95.17 0.857 0.211
77903.05 4564.098 69.507 0.38 0.225
77903.25 4563.777 67.645 0.444 0.227
78264.56 4742.429 95.17 0.644 0.234
77902.54 4557.89 78.833 0.208 0.237
78299.74 4784.509 92.156 0.179 0.238
77902.84 4564.445 71.516 0.199 0.251
77902.94 4564.276 70.536 0.343 0.254
77902.58 4557.775 77.79 0.462 0.256
78299.52 4785.227 94.594 0.305 0.266
78014.15 4644.552 62.082 0.1 0.274
78014.21 4644.394 61.507 0.267 0.277
78264.09 4743.395 95.17 0.218 0.278
78263.03 4745.574 95.17 0.238 0.281
78299.59 4785.007 93.828 0.159 0.285
78220.04 4727.461 95.17 0.961 0.287
78299.46 4785.448 95.36 0.371 0.29
78014.27 4644.235 60.931 2.179 0.305
78102.56 4676.925 153.74 0.946 0.305
78263.72 4744.159 95.17 0.02 0.313
77903.98 4562.514 60.492 0.904 0.321
77906.55 4550.72 95.17 0.497 0.325
78299.38 4785.71 96.27 0.333 0.326
78014.19 4644.46 61.747 0.02 0.327
77903.37 4563.583 66.518 0.333 0.336
78017.97 4631.109 92.573 0.01 0.337
77903.7 4563.003 63.236 0.54 0.349
77902.64 4557.615 76.25 0.343 0.361
77905.67 4552.12 95.17 0.54 0.364
78024.59 4634.531 95.17 0.514 0.364
78174.45 4716.776 88.219 0.286 0.367
77902.52 4557.944 79.33 0.11 0.368
78219.09 4730.069 95.17 0.389 0.368
77903.5 4563.378 65.343 0.286 0.371
77903.15 4563.938 68.576 0.08 0.381
78174.1 4717.228 89.604 0.333 0.381
77903.8 4562.828 62.256 0.343 0.386
78175 4716.042 86.053 0.228 0.398
78264.93 4741.665 95.17 0.199 0.411
78100.83 4680.052 153.74 1.55 0.421
78219.46 4729.059 95.17 0.286 0.438
78220.51 4726.169 95.17 0.659 0.441
78014.1 4644.671 62.514 0.159 0.457
77907.08 4549.872 95.17 0.218 0.475
78014.03 4644.869 63.233 1.513 0.497
78024.98 4633.71 95.17 0.371 0.499
77906.02 4551.568 95.17 0.352 0.506
78265.25 4741.013 95.17 0.179 0.547
78220.76 4725.465 95.17 0.589 0.571
77903.89 4562.671 61.374 0.13 0.596
78013.93 4645.132 64.193 1.549 0.612
78365.7 4791.595 153.84 0.847 0.618
78014.6 4643.391 57.862 0.352 0.623
78014.52 4643.589 58.581 2.154 0.646
78103.53 4675.176 153.74 1.56 0.646
77904.08 4562.323 59.414 0.565 0.656
78014.65 4643.259 57.382 1.527 0.657
78219.82 4728.049 95.17 0.296 0.673
78102 4677.931 153.74 1.513 0.695
78366.26 4789.672 153.84 0.942 0.706
78102.83 4676.444 153.74 0.882 0.714
78220.25 4726.874 95.17 0.169 0.733
78102.24 4677.494 153.74 0.324 0.74
78014.24 4644.328 61.267 0.169 0.741
78172.44 4712.58 95.17 0.745 0.746
78013.66 4645.782 66.644 1.559 0.78
78014.74 4643.008 56.471 1.493 0.782
78101.7 4678.478 153.74 1.558 0.787
78172.1 4713.573 95.17 1.559 0.805
78014.57 4643.457 58.102 0.169 0.81
78365.98 4790.633 153.84 0.802 0.812
78014.38 4643.945 59.876 2.239 0.815
78102.44 4677.144 153.74 0.07 0.822
78365.44 4792.508 153.84 0.38 0.837
78315.56 4779.559 153.84 1.872 0.852
78061.19 4660.662 129.789 0.218 0.853
78172.68 4711.895 95.17 0.333 0.854
78366.52 4788.759 153.84 0.621 0.887
78101.42 4678.981 153.74 0.471 0.912
78013.82 4645.391 65.153 0.13 0.919
78013.32 4646.565 69.625 0.862 0.926
78061.5 4659.817 129.789 0.589 0.937
78315.08 4780.977 153.84 0.637 0.947
78101.14 4679.484 153.74 0.159 0.964
78013.21 4646.83 70.634 0.597 0.966
78061.83 4658.924 129.789 1.547 0.967
78171.71 4714.707 95.17 0.841 0.969
78012.63 4627.031 129.789 1.557 0.981
78171.87 4714.235 95.17 0.904 1.042
78103.19 4675.788 153.74 0.573 1.072
78013.5 4646.161 68.086 1.518 1.076
78100.45 4680.73 153.74 0.435 1.136
78316.04 4778.188 153.84 0.751 1.136
78013.73 4645.618 66.019 0.149 1.14
78013.42 4646.35 68.807 0.913 1.178
78315.32 4780.268 153.84 0.724 1.196
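Intervals of unequal length, such as those listed above, are normally combined by weighting each grade by its sample length. The sketch below shows that calculation; it is a generic illustration of length-weighted compositing, not code from GEMNET II or VULCAN.

/* Length-weighted mean grade of a set of sample intervals;
 * illustrative sketch of standard compositing arithmetic. */
#include <stdio.h>

static double weighted_mean_grade(const double *len, const double *au, int n)
{
  double sum_la = 0.0, sum_l = 0.0;
  int i;

  for (i = 0; i < n; i++) {
    sum_la += len[i] * au[i];
    sum_l  += len[i];
  }
  return (sum_l > 0.0) ? sum_la / sum_l : 0.0;
}

int main(void)
{
  /* first three intervals from the table above (length, Au) */
  double len[3] = {0.435, 0.682, 0.02};
  double au[3]  = {0.028, 0.045, 0.089};

  printf("length-weighted Au: %.3f\n", weighted_mean_grade(len, au, 3));
  return 0;
}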
B4. Case Study 4 – 3D Chrome Deposit

Easting Northing Elevation Chromite
13384.18 22298.82 663.02 7.7
13197.41 22053.74 702.095 23.8
13311.75 22093.95 715.03 20.31
13311.75 22093.95 705.68 18.17
13311.75 22093.95 706.88 28.86
13311.75 22093.95 712.23 9.27
13311.75 22093.95 708.98 26.03
13382.35 22123.39 691.42 15.05
13297.75 22223.5 695.37 15.63
13352 22301.75 683.46 1.01
13352 22301.75 681.86 13.74
13352 22301.75 713.61 7.2
13352 22301.75 686.16 10.96
13352 22301.75 690.66 8.04
13352 22301.75 692.06 16.8
13323.25 22291.75 680.34 7.22
13328.31 22289.96 706.817 11.2
13333.9 22305.12 702.84 26.5
13333.9 22302.75 700.472 26.9
13333.9 22306.71 704.431 12.2
13323.25 22291.75 690.29 15.01
13383.08 22294.74 670.338 22.5
13383.71 22297.08 666.137 15.7
13382.4 22292.2 674.884 16.6
13381.8 22289.96 678.911 22.9
13386.71 22315.73 701.553 14.61
13361.12 22392.97 710.023 9.55
13311.75 22093.95 743.53 16.21
13311.75 22093.95 730.68 24.26
13311.75 22093.95 751.58 14.95
13311.75 22093.95 745.08 21.62
13330.52 22103.94 741.882 28.8
13314.85 22125.05 752.94 9
13313.25 22119.07 759.127 33.1
13318.21 22137.58 739.964 37
13315.14 22126.14 751.808 15.8
13337.66 22146.56 735.332 20.4
13318.38 22138.23 739.292 21.6
13316.42 22139.87 755.798 17.54
13316.27 22139.32 756.364 10.81
13337.04 22128.29 716.674 14.19
13316.98 22141.98 753.606 24.26
13338.19 22148.54 733.281 15.7
13337.92 22147.55 734.306 19.2
13314.53 22172.26 750.899 7.56
13314.2 22170.72 752.776 7.81
13316.2 22180.15 741.285 15.58
13316.9 22183.42 737.302 23.69
13341.15 22162.6 733.222 15
13340.76 22161.13 734.743 17
13341.41 22163.56 732.233 15.4
13297.75 22223.5 758.52 9.21
13297.75 22223.5 762.92 14.65
13297.75 22223.5 724.32 7.57
13297.75 22223.5 727.52 6.14
13346.36 22221.91 722.733 11.4
13347.4 22225.79 717.945 6.3
13320 22262.75 727.66 6.3
13330.32 22301.27 732.779 8.19
13320 22262.75 730.26 10.71
13326.77 22288.02 746.497 17.55
13327.95 22292.43 741.936 19.2
13384.06 22296.62 755.765 12.37
13385.94 22303.66 745.361 15.71
13333.9 22353.24 750.959 8.3
13330.49 22318.76 742.864 15.39
13332.58 22326.58 734.768 10.14
13331.03 22320.78 740.778 7.05
13333.74 22314.04 719.556 26.35
13334.17 22315.65 717.895 12.08
13388.94 22314.85 728.815 18.88
13249.5 22380.69 739.71 29.52
13250.75 22410.38 739.822 14.2
13249.5 22374.89 733.912 26.6
13250.75 22411.83 741.272 15.78
13249.5 22377.69 736.705 23.8
13327.27 22366 750.57 10.2
13358.8 22384.3 719.003 7.89
13357.78 22380.51 722.928 8.62
13357.41 22379.11 724.377 7.01
13335.09 22395.16 720.376 15.33
13420.48 22395.98 757.862 30.4
13396.5 22374 740.55 12.52
13396.5 22374 754.35 17.48
13396.5 22374 742.1 17.19
13396.5 22374 748.05 17.02
13396.5 22374 749.35 17.95
13396.5 22374 751.1 18.31
13429.45 22400.55 731.145 8.9
13428.73 22401.28 730.12 7.5
13432.34 23380.89 731.634 16.3
13431.73 23378.61 734.003 14.6
13432.71 23382.26 730.22 11.6
13297.75 22223.5 767.12 15.75
References
1. Amari, S., Learning Patterns and Pattern Sequences by Self-Organising Nets of Threshold
Elements, IEEE Trans. Computers, C-21 (11), 1197-1206, November 1972.
2. Anderson, J.A., Cognitive Capabilities of a Parallel System. In: Bienenstock, E., et al [eds],
Disordered Systems and Biological Organisation, NATO ASI Series, F20, Springer-Verlag,
New York, 1986.
3. Arbib, M.A. (ed), The Handbook of Brain Theory and Neural Networks. MIT Press,
Cambridge, 1995.
4. Ash, T., Dynamic Node Creation in Backpropagation Networks. ICS Report 8901, Institute of
Cognitive Science, University of California, San Diego, California, 1989.
5. Badiozamani, K., Computer Methods. In: Mining Engineering Handbook, SME.
6. Barhen, J, and Reister, D., DeepNet: an Ultrafast Neural Learning Code for Seismic
Imaging. In: International Joint Conference on Neural Networks (IJCNN ’99), International
Neural Network Society and The Neural Networks Council of IEEE, Washington DC, USA,
1999.
7. Barto, A.G., Reinforcement Learning and Adaptive Critic Methods. In: White, D.A., and
Sofge, D.A., (eds), Handbook of Intelligent Control, pp. 469-491, Van Nostrand Reinhold,
New York, 1992.
8. Bischof, H., Schneider, W., and Pinz, A.J., Multispectral Classification of Landsat Images
Using Neural Networks. IEEE Transactions on Geoscience and Remote Sensing, Vol. 30, No.
3, 1992.
9. Bishop, C.M., Neural Networks for Pattern Recognition. Clarendon Press, Oxford, 1995.
10. Bradford, S.H., The Application of Artificial Intelligence to Mineral Processing Control.
Ph.D. Thesis, Department of Mineral Resources Engineering, University of Nottingham, 1994.
11. Broomhead, D.S., and Lowe, D., Multivariable Functional Interpolation and Adaptive
Networks. Complex Systems, Vol. 2, pp 321-355, 1988.
12. Burnett, C.C.H., Application of Neural Networks to Mineral Reserve Estimation. Ph.D.
Thesis, Department of Mineral Resources Engineering, University of Nottingham, 1995.
13. Caiti, A., and Parisini, T., Mapping of Ocean Sediments by Networks of Parallel
Interpolating Units. IEEE Conference on Neural Networks for Ocean Engineering, pp 231-
238, Washington DC, USA, 1991.
14. Chen, S., Nonlinear Time Series Modelling and Prediction Using Gaussian RBF networks
with Enhanced Clustering and RLS Learning. Electronics Letters, Vol. 31, No. 2, pp 117-118,
1995.
15. Chinrungrueng, C., and Séquin, C.H., Optimal Adaptive k-means Algorithm with Dynamic
Adjustment of Learning Rate. IEEE Trans. on Neural Networks, Vol. 6, pp 157-169, 1994.
16. Clarici, E., Owen, D., Durucan, S., and Ravenscroft, P., Recoverable Reserve Estimation
Using a Neural Network. 24th International Symposium on the Application of Computers and
Operations Research in the Minerals Industries (APCOM), Montreal, Quebec, Canada, 1993.
17. Clark, I., Practical Geostatistics. Elsevier, Amsterdam, 1979.
18. Cortez, L.P., Sousa, A.J., and Durao, F.O., Mineral Resources Estimation Using Neural
Networks and Geostatistical Techniques. 27th International Symposium on the Application of
Computers and Operations Research in the Minerals Industries (APCOM), The Institution of
Mining and Metallurgy (IMM), London, 1998.
19. Cybenko, G., Approximation by Superpositions of a Sigmoidal Function. Mathematics of
Control, Signals, and Systems, Vol. 2, pp 303-314, 1989.
20. David, M., Geostatistical Ore Reserve Estimation. Elsevier, Amsterdam, 1977.
21. David, M., Handbook of Applied Advanced Geostatistical Ore Reserve Estimation. Elsevier,
Amsterdam, 1988.
22. Duchon, J., Spline Minimising Rotation-Invariant Semi-norms in Sobolev Spaces. In:
Schempp W., and Zeller, K., (eds), Constructive Theory of Functions of Several Variables,
Lecture Notes in Mathematics, pp 85-100, 1977.
23. Duda, R.O., and Hart, P.E., Pattern Classification and Scene Analysis. Wiley, New York,
1973.
24. Fahlman, S.E., Fast Learning Variations on Backpropagation: An Empirical Study. In:
Touretzky, D.S., Hinton, G., and Sejnowski, T., (eds), Proceedings of 1988 Connectionist
Models Summer School, Morgan Kaufmann Publishers, San Mateo, California, 1988.
25. Flament, F., Thibault, J., and Hodouin, D., Neural Network Based Control of Mineral
Grinding Plants. Minerals Engineering, Vol. 6, No. 3, pp 235-249, 1993.
26. Garcia, G., and Whitman, W.W., Inversion of a Lateral Log Using Neural Networks. SPE
24454, Society of Petroleum Engineers, 1992.
27. Geva, S., and Sitte, J., A Constructive Method for Multivariate Function Approximation by
Multilayer Perceptrons. IEEE Transactions on Neural Networks, Vol. 3, No. 4, 1992.
28. Golub, G.H., and Van Loan, C.F., Matrix Computations, 3rd Edition. Johns Hopkins
University Press, Baltimore, 1996.
29. Gopal, S., and Woodcock, C., Remote Sensing of Forest Change Using Artificial Neural
Networks. IEEE Transactions on Geoscience and Remote Sensing, Vol. 34, No. 2, 1996.
30. Grossberg, S., Studies of Mind and Brain: Neural Principles of Learning, Perception,
Development, Cognition, and Motor Control. Reidel Press, Boston, 1982.
31. Hassoun, M.H., Fundamentals of Artificial Neural Networks. MIT Press, Cambridge, 1995.
32. Haykin, S., Neural Networks – A Comprehensive Foundation. Prentice Hall, New Jersey,
1999.
33. Hebb, D., The Organisation of Behaviour. John Wiley, New York, 1949.
34. Hopfield, J.J., Neural Networks and Physical Systems with Emergent Collective
Computational Abilities. Proc. National Acad. Sci.,79, pp 2554-2558, 1982.
35. Hopfield, J.J., Neurons with Graded Response Have Collective Computational Properties
Like those of Two-State Neurons. Proc. National Acad. Sci., USA, Vol. 81, pp 3088-3092, 1984.
36. Hughes, W.E., Davis, F.B., and Darey, R.K., Drillhole Interpolation: Mineralised
Interpolation Techniques. In: Crawford, J.T., and Hustrulid, W. A., (eds), Open-Pit Mine
Planning and Design, AIME, New York, pp 51-64, 1979.
37. Isaaks, E.H., and Srivastava, R.M., Applied Geostatistics. Oxford University Press, New
York, 1989.
38. Journel, A.G., and Huijbregts, Ch.J., Mining Geostatistics. Academic Press, London, 1978.
39. Kapageridis I., Denby B., and Hunter G., Integration of a Neural Ore Grade Estimation
Tool In a 3D Resource Modeling Package. In: Proceedings of the International Joint
Conference on Neural Networks (IJCNN ’99), International Neural Network Society and The
Neural Networks Council of IEEE, Washington D.C., 1999.
40. Kapageridis I., Denby B., Neural Network Modelling of Ore Grade Spatial Variability. In:
Proceedings of the International Conference for Artificial Neural Networks (ICANN 98), Vol.
1, pp 209 – 214, Springer-Verlag, Skovde, 1998.
41. Kapageridis I., Denby B., Ore Grade Estimation with Modular Neural Network Systems – a
Case Study. In: Panagiotou G (ed) Information technology in the minerals industry (MineIT
’97). AA Balkema, Rotterdam, 1998.
42. Kapageridis, I.K., Assessment of Neural Network Prediction Techniques for Grade
Estimation. MSc Thesis, AIMS Research Unit, Department of Mineral Resources
Engineering, University of Nottingham, 1996.
43. King, R.L., Hicks, M.A., and Signer, S.P., Using Unsupervised Learning for Feature
Detection in a Coal Mine Roof. Engineering Applications of Artificial Intelligence, Vol. 6,
No. 6, pp 565-573, 1993.
44. Kirsch, A., An Introduction to the Mathematical Theory of Inverse Problems. Springer-
Verlag, New York, 1996.
45. Kohonen, T., Correlation Matrix Memories. IEEE Trans. Computers, Vol. C-21, pp 353-359,
1972.
46. Kohonen, T., Self-Organisation and Associative Memory. Springer-Verlag, Berlin, 1984.
47. Kohonen, T., Self-Organising Maps, 2nd Edition. Springer-Verlag, Berlin, 1995.
48. Krasnopolsky, V., Using NNs to Retrieve Multiple Geophysical Parameters from Satellite
Data. In: International Joint Conference on Neural Networks (IJCNN ’99), International
Neural Network Society and The Neural Networks Council of IEEE, Washington DC, USA,
1999.
49. Krige, D.G., Log-normal – de Wijsian Geostatistics for Ore Evaluation. South African
Institute of Mining and Metallurgy, Johannesburg, 1981.
50. Lang, K.J., and Hinton, G.E., The Development of the Time-Delay Neural Network
Architecture for Speech Recognition. Technical Report CMU-CS-88-152, Carnegie-Mellon
University, Pittsburgh PA, 1988.
51. Leonard, J.A., Kramer, M.A., and Ungar, L.H., A Neural Network Architecture that
Computes Its Own Reliability. Computers Chem. Engineering, Vol. 16, No. 9, pp 819-835,
1992.
52. Leonard, J.A., Kramer, M.A., and Ungar, L.H., Using Radial Basis Functions to
Approximate a Function and Its Error Bounds. IEEE Transactions on Neural Networks, Vol.
3, No. 4, pp 624-627, 1992.
53. Looney, C.G., Pattern Recognition Using Neural Networks: Theory and Algorithms for
Engineers and Scientists. Oxford University Press, New York, 1997.
54. Lowe, D., Novel ‘Topographic’ Nonlinear Feature Extraction Using Radial Basis Functions
for Concentration Coding in the ‘Artificial Nose’. In: Third IEE International Conference on
Artificial Neural Networks, Conference Publication 349, pp 95-99, Institute of Electrical
Engineers, 1993.
55. Lowe, D., Radial Basis Function Networks. In: Arbib, M.A.(ed), The handbook of Brain
Theory and Neural Networks, pp 930-934, MIT Press, Cambridge, 1995.
56. Malki, H.A., and Baldwin, J.L., On the Comparison Results of the Neural Networks Trained
Using Well-Logs from one Service Company and Tested on Another Service Company’s Data.
In: Simpson, P.K., (ed), Neural Networks Applications, IEEE Technology Update Series, pp
665-668, IEEE, New York, 1996.
57. Maptek, Envisage Core Reference Manual. Maptek/KRJA Systems Ltd, 1998.
58. Matheron, G., The Theory of Regionalised Variables and Its Applications. Les Cahiers du
Centre de Morphologie Mathematique de Fontainebleau, Ecole des Mines de Paris, 211p,
1971.
59. Maxwell, A.P., Denby, B., and Pitts, W., The Application of Neural Networks to Size
Analysis of Minerals on Conveyors. 25th International Symposium on the Application of
Computers and Operations Research in the Minerals Industries (APCOM), Brisbane,
Australia, 1995.
60. McCulloch, W., and Pitts, W., A Logical Calculus of the Ideas Immanent in Nervous
Activity. Bulletin of Mathematical Biophysics, 1943, Vol. 5, pp. 115-133.
61. Meinguet, J., Multivariate Interpolation at Arbitrary Points Made Simple. Journal of Applied
Mathematics and Physics (ZAMP), 30, pp 292-304, 1979.
62. Micchelli, C.A., Interpolation of Scattered Data: Distance Matrices and Conditionally
Positive Definite Functions. Constructive Approximation, Vol. 2, pp 11-22, 1986.
63. Millar, D.L., and Hudson, J.A., Rock Engineering System Performance Monitoring Using
Neural Networks. Preprints of the ‘Artificial Intelligence in the Minerals Sector’ (one day
symposium held at the University of Nottingham), 1993.
64. Minsky, M., Neural Nets and the Brain-Model Problem. Doctoral Dissertation, Princeton
University, Princeton NJ, 1954.
65. Moody, J., and Darken, C.J., Fast Learning in Networks of Locally-Tuned Processing Units.
Neural Computation, Vol. 1, pp 281-294, 1989.
66. Morozov, V.A., Regularisation Methods for Ill-Posed Problems. CRC Press, Boca Raton, FL,
1993.
67. Murat, M.E., and Rudman, A.J., Automated First Arrival Picking: A Neural Network
Approach. Geophysical Prospecting, Vol. 40, pp 587-604, 1992.
68. Nadaraya, E.A., On Estimating Regression. Theory of Probability and its Applications, Vol.
9, pp 141-142, 1964.
69. Neumann, J. von, Probabilistic Logic and the Synthesis of Reliable Organisms From
Unreliable Components. In: Shannon, C., and McCarthy, J. (eds), Automata Studies, Princeton
University Press, Princeton, 1956, pp. 43-98.
70. Neural Mining Solutions, Neural Computing in Mineral Exploration. White Paper, Neural
Technologies, 1996.
71. Noble, A.C., Ore Reserve/Resource Estimation. In: Mining Engineering Handbook, SME.
72. Oja, M., and Nystom, L., The Use of Self-Organising Maps in Particle Shape Quantification.
In: Hoberg, H., and von Blottnitz, H., (eds), Proceedings of the XX International Mineral
Processing Congress, Vol. 1, pp 141-150, Aachen, Germany, 1997.
73. Park, J., and Sandberg, I.W., Approximation and Radial Basis Function Networks. Neural
Computation, Vol. 5, pp 305-316, 1993.
74. Park, J., and Sandberg, I.W., Universal Approximation Using Radial Basis Function
Networks. Neural Computation, Vol. 3, pp 246-257, 1991.
75. Parzen, E., On Estimation of A Probability Density Function and Mode. Ann. Math. Statist.,
Vol. 33, pp 1065-1076, 1962.
76. Petersen, K.R.P., and Lorenzen, L., Gold Liberation Modelling Using Neural Network
Analysis of Diagnostic Leaching Data. In: Hoberg, H., and von Blottnitz, H., (eds),
Proceedings of the XX International Mineral Processing Congress, Vol. 1, pp 391-400,
Aachen, Germany, 1997.
77. Poggio, T., and Girosi, F., Regularisation Algorithms for Learning that Are Equivalent to
Multilayer Networks. Science, Vol. 247, pp 978-982, 1990.
78. Poulton, M., and Zaverton, K., Comparison of Neural Network Paradigms for Classification
of TM Images. 23rd International Symposium on the Application of Computers and Operations
Research in the Minerals Industries (APCOM), Arizona, USA, 1992.
79. Powell, M.J.D., Approximation Theory and Methods. Cambridge University Press, Cambridge,
1981.
80. Powell, M.J.D., The Theory of Radial Basis Function Approximation in 1990. In: Light, W.,
(ed.), Advances in Numerical Analysis Vol. II: Wavelets, Subdivision Algorithms, and Radial
Basis Functions, pp 105-210, Oxford Science Publications, Oxford, 1992.
81. Readdy, L.A., Bolin, D.S., and Mathieson, G.A., Ore Reserve Calculation – Underground
Mining Methods Handbook.
82. Ripley, B.D., Pattern Recognition and Neural Networks. Cambridge University Press,
Cambridge, 1996.
83. Ripley, B.D., Statistical Ideas for Selecting Network Architectures. In: Kappen, B., and
Gielen, S., (eds), Neural Networks: Artificial Intelligence and Industrial Applications,
Springer, London, 1995.
84. Roesler, K.S., Improved Geo-Sensing Using Artificial Intelligence Techniques for
Tomographic Interpretation. 23rd International Symposium on the Application of Computers
and Operations Research in the Minerals Industries (APCOM), Arizona, USA, 1992.
85. Rogers, S.J., Fang, J.H., Karr, C.L., and Stanley, D.A., Determination of Lithology from
Well Logs Using Neural Networks. Bulletin of the American Association of Petroleum
Geologists, pp 731-739, May 1992.
86. Rojas, R., A Graphical Proof of the Backpropagation Learning Algorithm. In: V. Malyshkin
(ed.), Parallel Computing Technologies (PaCT-93), Obninsk, Russia, 1993.
87. Rojas, R., Neural Networks – A Systematic Introduction. Springer-Verlag, Berlin, 1996.
88. Rosenblatt, F., The Perceptron: A Probabilistic Model for Information Storage and
Organisation in the Brain. Psychol. Rev., 65, 386-408, 1958.
89. Rumelhart, D., and McClelland, J., Parallel Distributed Processing. MIT Press, Cambridge
MA, 1986.
90. Rumelhart, D.E., and Zipser, D., Feature Discovery by Competitive Learning. Cognitive
Science, Vol. 9, pp.75-112, 1985.
91. Ryman-Tubb, N., and Bolt, G., The Use of Neural Techniques for Integrated Process System
Modelling and Optimisation. White Paper, Neural Technologies Limited, 1996.
92. Schalkoff, R.J., Artificial Neural Networks. McGraw-Hill, Computer Science Series, New
York, 1997.
93. Schalkoff, R.J., Digital Image Processing and Computer Vision. John Wiley & Sons, New
York, 1989.
94. Schofield, D., Surface Mine Design Using Intelligent Computing Techniques. Ph.D. Thesis,
Department of Mineral Resources Engineering, University of Nottingham, 1992.
95. Signer, S.P., and King, R.L., Evaluation of Coal Mine Roof Supports Using Artificial
Intelligence. In: 23rd International Symposium on the Application of Computers and
Operations Research in the Minerals Industries (APCOM), Arizona, USA, 1992.
96. Singh, S.P., (ed.), Approximation Theory, Spline Functions and Applications. Kluwer,
Dordrecht, The Netherlands, 1992.
97. SNNS, Stuttgart Neural Network Simulator Version 4.1 User’s Manual. Report No. 6/95,
Institute for Parallel and Distributed High Performance Systems (IPVR), University of
Stuttgart, 1996.
98. Steinbuch, K., Die Lernmatrix. Kybernetik (Biol. Cyber.), 1(1), 36-45, 1961.
99. Stent, G.S., A Physiological Mechanism for Hebb’s Postulate of Learning. Proceedings of the
National Academy of Sciences, USA, Vol. 70, pp. 997-1001, 1973.
100. Stevens, C., Die Nervenzelle. In: Gehirn und Nervensystem, 1988, pp. 2-13.
101. Tikhonov, A.N., and Arsenin, V.Y., Solutions to Ill-Posed Problems. W.H. Winston,
Washington, DC, 1977.
102. Tikhonov, A.N., On Solving Incorrectly Posed Problems and Method of Regularisation.
Doklady Akademii Nauk USSR, Vol. 151, pp 501-504, 1963.
103. Van der Walt, T.J., van Deventer, J.S.J., Barnard, E., and Oosthuizen, G.D., The
Simulation of Ill-Defined Processing Operations Using Connectionist Networks. 23rd
International Symposium on the Application of Computers and Operations Research in the
Minerals Industries (APCOM), Arizona, USA, pp 881-888, 1992.
104. Van Deventer, J.S.J., Bezuidenhout, M., and Moolman, D.W., On-Line Visualisation of
Flotation Performance Using Neural Computer Vision of the Froth Texture. In: Hoberg, H.,
and von Blottnitz, H., (eds), Proceedings of the XX International Mineral Processing
Congress, Vol. 1, pp 315-326, Aachen, Germany, 1997.
105. Waibel, A., Hanazawa, T., Hinton, G., Shikano, K., and Lang, K.J., Phoneme Recognition
Using Time-Delay Neural Networks. IEEE Transactions on Acoustics, Speech and Signal
Processing, Vol. ASSP-37, pp. 328-339, 1989.
106. Walter, K.U., Neural Network Technology for Strata Strength Characterisation. In:
International Joint Conference on Neural Networks (IJCNN ’99), International Neural
Network Society and The Neural Networks Council of IEEE, Washington DC, USA, 1999.
107. Wanstedt, S., Huang, Y., and Malmstrom, L., Using Neural Networks to Interpret
Geophysical Logs in the Zinkgruvan Mine. In: Panagiotou, G., (ed) Information technology in
the minerals industry (MineIT ’97). AA Balkema, Rotterdam, 1998.
108. Watson, G.S., Smooth Regression Analysis. Sankhya: The Indian Journal of Statistics, Series
A, Vol. 26, pp 359-372, 1964.
109. Widrow, B., Generalisation and Information Storage in Networks of ADALINE Neurons. In:
Yovits, G.T., (ed.), Self-Organising Systems, Spartan Books, Washington DC, 1962.
110. Williams, P.M., Image Compression for Neural Networks Using Chebyshev Polynomials. In:
Alexander, I., and Taylor, J., (eds), Artificial Neural Networks, pp 1139-1142, 1992.
111. Wolfram, S., Mathematica 2.1 User’s Manual. Wolfram Research, Cambridge University
Press, 1991.
112. Wu, X., and Zhou, Y., Reserve Estimation Using Neural Network Techniques. Computers &
Geosciences, Vol. 19, No. 4, pp 567-575, 1993.
113. Wu, X., Neural Network-Based Material Modelling. Ph.D. Thesis, Dept. Civil Engineering,
University of Illinois, Urbana, Illinois, 1991.
114. Xiao, R., and Chandrasekar, V., Development of a Neural Network Based Algorithm for
Rainfall Estimation from Radar Observations. IEEE Transactions on Geoscience and Remote
Sensing, Vol. 35, No. 1, 1997.
115. Yama, B.R., and Lineberry, G.T., Artificial Neural Network Application for a Predictive
Task in Mining. SME, Mining Engineering, February 1999, pp 59-64.
116. Yee, P.V., Regularised Radial Basis Function Networks: Theory and Applications to
Probability Estimation, Classification, and Time Series Prediction. Ph.D. Thesis, McMaster
University, Hamilton, Ontario, 1998.
117. Zadeh, L.A., Knowledge Representation in Fuzzy Logic. In: Yager, R.R., and Zadeh, L.A.,
eds, An Introduction to Fuzzy Logic Applications in Intelligent Systems, Kluwer Academic,
Boston, 1992.
118. Zippelius, A., and Engel, A., Statistical Mechanics of Neural Networks. In: Arbib, M.A.(ed),
The handbook of Brain Theory and Neural Networks, pp 930-934, MIT Press, Cambridge,
1995.