ICT619 Intelligent Systems
Topic 4: Artificial Neural Networks
ICT619 2
Artificial Neural Networks
PART A
Introduction
An overview of the biological neuron
The synthetic neuron
Structure and operation of an ANN
Problem solving by an ANN
Learning in ANNs
ANN models
Applications
PART B
Developing neural network applications
Design of the network
Training issues
A comparison of ANN and ES
Hybrid ANN systems
Case Studies
Introduction
Artificial Neural Networks (ANN)
Also known as:
Neural networks
Neural computing (or neuro-computing) systems
Connectionist models
ANNs simulate the biological brain for problem solving
This represents a totally different approach to machine intelligence from the symbolic logic approach
The biological brain is a massively parallel system of interconnected processing elements
ANNs simulate a similar network of simple processing elements at a greatly reduced scale
Introduction
ANNs adapt themselves using data to learn problem solutions
ANNs can be particularly effective for problems that are hard to solve using conventional computing methods
First developed in the 1950s; interest slumped in the 1970s
Great upsurge in interest in the mid-1980s
Both ANNs and expert systems are non-algorithmic tools for problem solving
ES rely on the solution being expressed as a set of heuristics by an expert
ANNs learn solely from data.
An overview of the biological neuron
An estimated 1000 billion neurons in the human brain, with each connected to up to 10,000 others
Electrical impulses produced by a neuron travel along the axon
The axon connects to dendrites through synaptic junctions
An overview of the biological neuron
Photo: Osaka University
An overview of the biological neuron
A neuron collects the excitation of its inputs and "fires" (produces a burst of activity) when the sum of its inputs exceeds a certain threshold
The strengths of a neuron's inputs are modified (enhanced or inhibited) by the synaptic junctions
Learning in our brains occurs through a continuous process of new interconnections forming between neurons, and adjustments at the synaptic junctions
The synthetic neuron
A simple model of the biological neuron, first proposed in 1943 by McCulloch and Pitts, consists of a summing function with an internal threshold and "weighted" inputs, as shown below.
The synthetic neuron (cont'd)
For a neuron receiving n inputs, each input x_i (i ranging from 1 to n) is weighted by multiplying it with a weight w_i
The sum of the products w_i x_i gives the net activation value of the neuron
The activation value is subjected to a transfer function to produce the neuron's output
The weight value of the connection carrying signals from a neuron i to a neuron j is termed w_ij.
Transfer functions
These compute the output of a node from its net activation.
Among the popular transfer functions are:
Step function
Signum (or sign) function
Sigmoid function
Hyperbolic tangent function
In the step function, the neuron produces an output only when its net activation reaches a minimum value, known as the threshold T
For a binary neuron i, whose output is a 0 or 1 value, the step function can be summarised as:

output_i = 1 if activation_i >= T
output_i = 0 if activation_i < T
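The weighted sum and step threshold can be sketched in code. A minimal illustration (the function name and the AND example below are made up for demonstration, not from the slides):

```python
# A McCulloch-Pitts style neuron: weighted sum of inputs,
# then a step transfer function with threshold T.
def step_neuron(inputs, weights, threshold):
    # Net activation: sum of w_i * x_i over all inputs
    activation = sum(w * x for w, x in zip(weights, inputs))
    # Step transfer function: fire (output 1) only when the
    # activation reaches the threshold T
    return 1 if activation >= threshold else 0

# Example: a 2-input neuron that behaves like logical AND
and_weights = [1.0, 1.0]
print(step_neuron([1, 1], and_weights, threshold=1.5))  # 1
print(step_neuron([1, 0], and_weights, threshold=1.5))  # 0
```

With a threshold of 1.5 and unit weights, only the input pattern (1, 1) drives the activation past the threshold, so the neuron fires for AND.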
Transfer functions (cont'd)
The sign function returns a value of -1 or +1. To avoid confusion with 'sine' it is often called signum.

output_i = +1 if activation_i >= 0
output_i = -1 if activation_i < 0

[Figure: the signum function - output_i plotted against activation_i, jumping from -1 to +1 at activation 0]
Transfer functions (cont'd)
The sigmoid
The sigmoid transfer function produces a continuous value in the range 0 to 1
The parameter gain affects the slope of the function around zero
Transfer functions (cont'd)
The hyperbolic tangent
A variant of the sigmoid transfer function
Has a shape similar to the sigmoid (like an S), with the difference being that the value of output_i ranges between -1 and 1:

output_i = (e^activation_i - e^-activation_i) / (e^activation_i + e^-activation_i)
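As a sketch, the sigmoid and hyperbolic tangent transfer functions described above can be written directly from their formulas (the function names and the gain parameter default are illustrative):

```python
import math

def sigmoid(activation, gain=1.0):
    # Continuous output in (0, 1); gain scales the slope around zero
    return 1.0 / (1.0 + math.exp(-gain * activation))

def tanh_transfer(activation):
    # (e^a - e^-a) / (e^a + e^-a); continuous output in (-1, 1)
    return (math.exp(activation) - math.exp(-activation)) / \
           (math.exp(activation) + math.exp(-activation))

print(sigmoid(0.0))        # 0.5
print(tanh_transfer(0.0))  # 0.0
```

A larger gain makes the sigmoid step-like around zero; as gain grows the sigmoid approaches the step function.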
Structure and operation of an ANN
The building block of an ANN is the artificial neuron. It is characterised by:
weighted inputs
a summing and transfer function
The most common architecture of an ANN consists of two or more layers of artificial neurons or nodes, with each node in a layer connected to every node in the following layer
Signals usually flow from the input layer, which is directly subjected to an input pattern, across one or more hidden layers towards the output layer.
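This layer-by-layer flow of signals can be sketched as a forward pass. The weight values and layer sizes below are invented purely for illustration, and a sigmoid transfer function is assumed:

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def forward(pattern, layers):
    """Propagate an input pattern through fully connected layers.
    layers: list of weight matrices; layers[k][j] holds the weights
    from every node in layer k to node j of the next layer."""
    signal = pattern
    for weight_matrix in layers:
        # Each node: weighted sum of the previous layer's outputs,
        # passed through the transfer function
        signal = [sigmoid(sum(w * s for w, s in zip(node_weights, signal)))
                  for node_weights in weight_matrix]
    return signal

# 2 inputs -> 2 hidden nodes -> 1 output node (made-up weights)
hidden = [[0.5, -0.4], [0.3, 0.8]]
output = [[1.0, -1.0]]
print(forward([1.0, 0.0], [hidden, output]))
```

Note that the input layer itself does no computation here; it simply distributes the pattern components to the first weight matrix.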
Structure and operation of an ANN
The most popular ANN architecture, known as the multilayer perceptron (shown in the diagram above), follows this model.
In some models of the ANN, such as the self-organising map (SOM) or Kohonen net, nodes in the same layer may have interconnections among them
In recurrent networks, connections can even go backwards to nodes closer to the input
Problem solving by an ANN
The inputs of an ANN are data values grouped together to form a pattern
Each data value (component of the pattern vector) is applied to one neuron in the input layer
The output value(s) of node(s) in the output layer represent some function of the input pattern
Problem solving by an ANN (cont'd)
In the example above, the ANN maps the input pattern to one of two classes
The ANN produces an accurate prediction only if the functional relationships between the relevant variables, namely the components of the input pattern, and the corresponding output have been "learned" by the ANN
Any three-layer ANN can (at least in theory) represent the functional relationship between an input pattern and its class
It may be difficult in practice for the ANN to learn a given relationship
Learning in ANNs
Common human learning behaviour: repeatedly going through the same material, making mistakes and learning until able to carry out a given task successfully
Learning by most ANNs is modelled after this type of human learning
Learned knowledge to solve a given problem is stored in the interconnection weights of an ANN
The process by which an ANN arrives at the right values of these weights is known as learning or training
Learning in ANNs (cont'd)
Learning in ANNs takes place through an iterative training process during which node interconnection weight values are adjusted
Initial weights, usually small random values, are assigned to the interconnections between the ANN nodes.
Like knowledge acquisition in ES, learning in ANNs can be the most time consuming phase in their development
Learning in ANNs (cont'd)
ANN learning (or training) can be supervised or unsupervised
In supervised training, data sets consisting of pairs, each an input pattern and its expected correct output value, are used
The weight adjustments during each iteration aim to reduce the "error" (the difference between the ANN's actual output and the expected correct output)
E.g., a node producing a small negative output when it is expected to produce a large positive one has its positive weight values increased and its negative weight values decreased
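This kind of error-driven weight adjustment can be sketched as a simple delta-rule style update. The learning rate and example numbers below are illustrative assumptions, not from the slides:

```python
# One supervised weight update: each weight change is proportional
# to the error (expected - actual) and to the input on that
# connection, so weights move to reduce the error next time.
def update_weights(weights, inputs, actual, expected, rate=0.1):
    error = expected - actual
    return [w + rate * error * x for w, x in zip(weights, inputs)]

weights = [0.2, -0.5]
inputs = [1.0, 1.0]
# Actual output too low (expected 1.0, got 0.1): both weights move up
new_w = update_weights(weights, inputs, actual=0.1, expected=1.0)
print(new_w)  # [0.29, -0.41]
```

With a positive error and positive inputs, every weight increases, matching the slide's description of a node that under-shoots a large positive target.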
Learning in ANNs
In supervised training, pairs of sample input values and corresponding output values are used to train the net repeatedly until the output becomes satisfactorily accurate
In unsupervised training, there is no known expected output used for guiding the weight adjustments
The function to be optimised can be any function of the inputs and outputs, usually set by the application
The net adapts itself to align its weight values with the training patterns
This results in groups of nodes responding strongly to specific groups of similar input patterns
The two states of an ANN
A neural network can be in one of two states: training mode or operation mode
Most ANNs learn off-line and do not change their weights once training is finished and they are in operation
In an ANN capable of on-line learning, training and operation continue together
ANN training can be time consuming, but once trained, the resulting network can be made to run very efficiently, providing fast responses
ANN models
ANNs are supposed to model the structure and operation of the biological brain
But there are different types of neural networks depending on the architecture, learning strategy and operation
Three of the most well known models are:
1. The multilayer perceptron
2. The Kohonen network (the Self-Organising Map)
3. The Hopfield net
The Multilayer Perceptron (MLP) is the most popular ANN architecture
The Multilayer Perceptron
Nodes are arranged into an input layer, an output layer and one or more hidden layers
Also known as the backpropagation network because of the use of error values from the output layer in the layers before it to calculate weight adjustments during training.
Another name for the MLP is the feedforward network.
MLP learning algorithm
The learning rule for the multilayer perceptron is known as "the generalised delta rule" or the "backpropagation rule"
The generalised delta rule repeatedly calculates an error value for each input, which is a function of the squared difference between the expected correct output and the actual output
The calculated error is backpropagated from one layer to the previous one, and is used to adjust the weights between connecting layers
MLP learning algorithm (cont'd)
New weight = old weight + change calculated from square of error
Error = difference between desired output and actual output
Training stops when the error becomes acceptable, or after a predetermined number of iterations
After training, the modified interconnection weights form a sort of internal representation that enables the ANN to generate desired outputs when given the training inputs, or even new inputs that are similar to the training inputs
This generalisation is a very important property
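The "new weight = old weight + change from error" loop can be sketched for a single sigmoid node. The OR data set, learning rate and epoch count below are illustrative assumptions; the out * (1 - out) factor is the sigmoid derivative that the generalised delta rule uses when the error is the squared difference:

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def train(samples, weights, rate=0.5, epochs=1000):
    """Repeated updates: new weight = old weight + change derived
    from the (squared) error between target and actual output."""
    for _ in range(epochs):
        for inputs, target in samples:
            out = sigmoid(sum(w * x for w, x in zip(weights, inputs)))
            # Gradient of the squared error for a sigmoid node
            delta = (target - out) * out * (1.0 - out)
            weights = [w + rate * delta * x
                       for w, x in zip(weights, inputs)]
    return weights

# Learn logical OR; the last input is a constant bias input of 1.0
samples = [([0, 0, 1.0], 0), ([0, 1, 1.0], 1),
           ([1, 0, 1.0], 1), ([1, 1, 1.0], 1)]
w = train(samples, [0.0, 0.0, 0.0])
outputs = [round(sigmoid(sum(wi * xi for wi, xi in zip(w, x))))
           for x, _ in samples]
print(outputs)  # → [0, 1, 1, 1]
```

A single node suffices here because OR is linearly separable; a full MLP adds hidden layers and backpropagates the error term layer by layer.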
The error landscape in a multilayer perceptron
For a given pattern p, the error Ep can be plotted against the weights to give the so-called error surface
The error surface is a landscape of hills and valleys, with points of minimum error corresponding to wells and maximum error found on peaks.
The generalised delta rule aims to minimise Ep by adjusting weights so that they correspond to points of lowest error
It follows the method of gradient descent, where the changes are made in the steepest downward direction
All possible solutions are depressions in the error surface, known as basins of attraction
The error landscape in a multilayer perceptron
[Figure: the error Ep plotted as a surface over two weights, w_i and w_j]
Learning difficulties in multilayer perceptrons - local minima
The MLP may fail to settle into the global minimum of the error surface and instead find itself in one of the local minima
This is due to the gradient descent strategy followed
A number of alternative approaches can be taken to reduce this possibility:
Lowering the gain term progressively
Used to influence the rate at which weight changes are made during training
Its value by default is 1, but it may be gradually reduced to slow the rate of change as training progresses
Learning difficulties in multilayer perceptrons (cont'd)
Addition of more nodes for better representation of patterns
Too few nodes (and consequently not enough weights) can cause failure of the ANN to learn a pattern
Introduction of a momentum term
Determines the effect of past weight changes on the current direction of movement in weight space
The momentum term is also a small numerical value in the range 0 to 1
Addition of random noise to perturb the ANN out of local minima
Usually done by adding small random values to the weights.
Takes the net to a different point in the error space, hopefully out of a local minimum
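The momentum term can be sketched as follows: each weight change blends the new gradient step with a fraction of the previous change, which builds up speed in a consistent direction and can carry the search through shallow local minima. The 0.9 momentum value and the step sizes are illustrative:

```python
# Weight update with momentum: the current change is the new
# gradient step plus `momentum` times the previous change.
def momentum_update(weight, gradient_step, previous_change, momentum=0.9):
    change = gradient_step + momentum * previous_change
    return weight + change, change

w, prev = 1.0, 0.0
for step in [-0.1, -0.1, -0.1]:   # three identical gradient steps
    w, prev = momentum_update(w, step, prev)
print(round(w, 3))  # 0.439 - each successive change is larger
```

Without momentum the three steps would move the weight by 0.3 in total; with momentum 0.9 the accumulated changes (0.1, 0.19, 0.271) move it by 0.561.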
The Kohonen network (the self-organising map)
Biological systems display both supervised and unsupervised learning behaviour
A neural network with unsupervised learning capability is said to be self-organising
During training, the Kohonen net changes its weights to learn appropriate associations, without any right answers being provided
The Kohonen network (cont'd)
The Kohonen net consists of an input layer that distributes the inputs to every node in a second layer, known as the competitive layer.
The competitive (output) layer is usually organised into some 2-D or 3-D surface (feature map)
Operation of the Kohonen Net
Each neuron in the competitive layer is connected to other neurons in its neighbourhood
Neurons in the competitive layer have excitatory (positively weighted) connections to immediate neighbours and inhibitory (negatively weighted) connections to more distant neurons.
As an input pattern is presented, some of the neurons in the competitive layer are sufficiently activated to produce outputs, which are fed to other neurons in their neighbourhoods
The node with the set of input weights closest to the input pattern component values produces the largest output. This node is termed the best matching (or winning) node
Operation of the Kohonen Net (cont'd)
During training, input weights of the best matching node and its neighbours are adjusted to make them resemble the input pattern even more closely
At the completion of training, the best matching node ends up with its input weight values aligned with the input pattern and produces the strongest output whenever that particular pattern is presented
The nodes in the winning node's neighbourhood also have their weights modified to settle down to an average representation of that pattern class
As a result, the net is able to represent clusters of similar input patterns - a feature found useful for data mining applications, for example.
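One Kohonen training step can be sketched as follows. The 1-D neighbourhood, node count, learning rate and radius are simplifying assumptions for illustration (real SOMs usually use a 2-D grid with a shrinking neighbourhood):

```python
import math

# Best matching node: the one whose input weights are closest
# (by Euclidean distance) to the presented pattern.
def best_matching(nodes, pattern):
    return min(range(len(nodes)),
               key=lambda i: math.dist(nodes[i], pattern))

def kohonen_step(nodes, pattern, rate=0.5, radius=1):
    winner = best_matching(nodes, pattern)
    for i, weights in enumerate(nodes):
        if abs(i - winner) <= radius:   # winner and its neighbours
            # Move weights a fraction of the way toward the pattern
            nodes[i] = [w + rate * (p - w)
                        for w, p in zip(weights, pattern)]
    return winner

nodes = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
winner = kohonen_step(nodes, [0.9, 0.8])
print(winner, nodes[winner])  # the winner's weights move toward the pattern
```

After repeated presentations, each region of the map comes to respond most strongly to one cluster of similar patterns.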
The Hopfield Model
The Hopfield net is the most widely known of all the autoassociative - pattern completing - ANNs
In autoassociation, a noisy or partially incomplete input pattern causes the network to stabilise to a state corresponding to the original pattern
It is also useful for optimisation tasks.
The Hopfield net is a recurrent ANN in which the output produced by each neuron is fed back as input to all other neurons
Neurons compute a weighted sum with a step transfer function.
The Hopfield Model (cont'd)
The Hopfield net has no iterative learning algorithm as such. Patterns (or facts) are simply stored by adjusting the weights to lower a term called network energy
During operation, an input pattern is applied to all neurons simultaneously and the network is left to stabilise
Outputs from the neurons in the stable state form the output of the network.
When presented with an input pattern, the net outputs the stored pattern nearest to the presented pattern.
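This storage-and-recall behaviour can be sketched with a Hebbian-style weight rule, a common way of setting Hopfield weights; the bipolar +1/-1 states and the tiny patterns are illustrative:

```python
# Store patterns by setting w_ij from products of pattern components
# (Hebbian rule); no iterative training, and no self-connections.
def store(patterns):
    n = len(patterns[0])
    return [[0 if i == j else sum(p[i] * p[j] for p in patterns)
             for j in range(n)] for i in range(n)]

# Recall: apply the input state, then repeatedly update each neuron
# (weighted sum + step transfer) until the state stabilises.
def recall(weights, state, steps=5):
    state = list(state)
    for _ in range(steps):
        for i in range(len(state)):
            activation = sum(w * s for w, s in zip(weights[i], state))
            state[i] = 1 if activation >= 0 else -1
    return state

stored = [[1, 1, -1, -1], [-1, -1, 1, 1]]
w = store(stored)
# A corrupted version of the first pattern settles back to it
print(recall(w, [1, -1, -1, -1]))  # → [1, 1, -1, -1]
```

Each update can only lower (or keep) the network energy, so the state falls into the basin of the nearest stored pattern.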
When ANNs should be applied
Difficulties with some real-life problems:
Solutions are difficult, if not impossible, to define algorithmically, due mainly to their unstructured nature
Too many variables, and/or the interactions of relevant variables are not well understood
Input data may be partially corrupt or missing, making it difficult for a logical sequence of solution steps to function effectively
When ANNs should be applied (cont'd)
The typical ANN attempts to arrive at an answer by learning to identify the right answer through an iterative process of self-adaptation or training
If there are many factors, with complex interactions among them, the usual "linear" statistical techniques may be inappropriate
If sufficient data is available, an ANN can find the relevant functional relationship by means of an adaptive learning procedure from the data
Current applications of ANNs
ANNs are good at recognition and classification tasks
Due to their ability to recognise complex patterns, ANNs have been widely applied in character, handwritten text and signature recognition, as well as recognition of more complex images such as faces
They have also been used successfully for speech recognition and synthesis
ANNs are being used in an increasing number of applications where high-speed computation of functions is important, e.g. in industrial robotics
Current applications of ANNs (cont'd)
One of the more successful applications of ANNs has been as a decision support tool in the area of finance and banking
Some examples of commercial applications of ANNs are:
Financial market analysis for investment decision making
Sales support - targeting customers for telemarketing
Bankruptcy prediction
Intelligent flexible manufacturing systems
Stock market prediction
Resource allocation - scheduling and management of personnel and equipment
ANN applications - broad categories

According to a survey (Quaddus & Khan, 2002) covering the period 1988 up to mid 1998, the main business application areas of ANNs are:
Production (36%)
Information systems (20%)
Finance (18%)
Marketing & distribution (14.5%)
Accounting/Auditing (5%)
Others (6.5%)
ANN applications - broad categories (cont'd)

The levelling off of publications on ANN applications may be attributed to ANNs moving from the research to the commercial application domain

The emergence of other intelligent system tools may be another factor
Table 1: Distribution of the Articles by Areas and Year

AREA                     1988    89    90    91     92     93     94     95     96     97    98   Total   % of Total
Accounting/Auditing         1     0     1     1      6      3      3      7      7      5     0      34         4.97
Finance                     0     0     4    11     19     28     27     18      5      9     2     123        17.98
Human resources             0     0     0     1      0      1      1      0      0      0     0       3         0.44
Information systems         4     6     9     7     15     24     21     18     13     18     3     138        20.18
Marketing/Distribution      2     2     2     3      8     10     12     17     29     14     0      99        14.47
Production                  2     6     8    21     31     38     24     50     29     31     1     241        35.23
Others                      0     0     1     7      3      8      7      8      7      5     0      46         6.73
Yearly Total                9    14    25    51     82    112     95    118     90     82     6     684       100.00
% of Total               1.32  2.05  3.65  7.46  11.99  16.37  13.89  17.25  13.16  11.99  0.88               100.00
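The Total and "% of Total" columns of Table 1 can be cross-checked with a short script; the yearly counts below are transcribed from the table itself.

```python
# Yearly article counts per area, transcribed from Table 1
# (Quaddus & Khan, 2002), covering 1988 through 1998.
counts = {
    "Accounting/Auditing":    [1, 0, 1, 1, 6, 3, 3, 7, 7, 5, 0],
    "Finance":                [0, 0, 4, 11, 19, 28, 27, 18, 5, 9, 2],
    "Human resources":        [0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0],
    "Information systems":    [4, 6, 9, 7, 15, 24, 21, 18, 13, 18, 3],
    "Marketing/Distribution": [2, 2, 2, 3, 8, 10, 12, 17, 29, 14, 0],
    "Production":             [2, 6, 8, 21, 31, 38, 24, 50, 29, 31, 1],
    "Others":                 [0, 0, 1, 7, 3, 8, 7, 8, 7, 5, 0],
}

# Recompute the grand total and each area's percentage share.
grand_total = sum(sum(v) for v in counts.values())
share = {area: round(100 * sum(v) / grand_total, 2)
         for area, v in counts.items()}

print(grand_total)            # 684 articles in all
print(share["Production"])    # 35.23, the largest share
```

The recomputed figures agree with the table's Total column (684 articles) and its "% of Total" row, with Production clearly the dominant application area.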
Some advantages of ANNs

Able to take incomplete or corrupt data and provide approximate results

Good at generalisation, that is, recognising patterns similar to those learned during training

Inherent parallelism makes them fault-tolerant – loss of a few interconnections or nodes leaves the system relatively unaffected

Parallelism also makes ANNs fast and efficient for handling large amounts of data
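The fault-tolerance point can be illustrated with a deliberately simplified toy model (not a trained network): when an output is the sum of many small weighted contributions, severing a few connections degrades it only in proportion to the damage.

```python
import numpy as np

# Deliberately simplified toy model of fault tolerance (not a trained
# network): the output is the sum of 100 small weighted contributions,
# so losing a few connections changes it only in proportion to the damage.
rng = np.random.default_rng(2)
weights = np.full(100, 0.01)                 # 100 parallel connections
x = 5.0
healthy = (weights * x).sum()                # intact output

damaged_weights = weights.copy()
dead = rng.choice(100, size=5, replace=False)
damaged_weights[dead] = 0.0                  # sever 5% of the connections
damaged = (damaged_weights * x).sum()

print(round(healthy, 2), round(damaged, 2))  # 5.0 vs 4.75
```

A conventional program losing 5% of its instructions would typically fail outright; here the output merely drifts by 5%, which is the graceful degradation the slide refers to.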
ANN State-of-the-art overview

Currently neural network systems are available as:
Software simulation on conventional computers - the prevalent form
Special-purpose hardware that models the parallelism of neurons

ANN-based systems are not likely to replace conventional computing systems, but they are an established alternative to the symbolic logic approach to information processing

A new computing paradigm in the form of hybrid intelligent systems has emerged, often involving ANNs together with other intelligent system tools
REFERENCES

AI Expert (special issue on ANN), June 1990.
BYTE (special issue on ANN), Aug. 1989.
Caudill, M., "The View from Now", AI Expert, June 1992, pp. 27-31.
Dhar, V., & Stein, R., Seven Methods for Transforming Corporate Data into Business Intelligence, Prentice Hall, 1997.
Kirrmann, H., "Neural Computing: The new gold rush in informatics", IEEE Micro, June 1989, pp. 7-9.
Lippmann, R.P., "An Introduction to Computing with Neural Nets", IEEE ASSP Magazine, April 1987, pp. 4-21.
Lisboa, P. (Ed.), Neural Networks: Current Applications, Chapman & Hall, 1992.
Negnevitsky, M., Artificial Intelligence: A Guide to Intelligent Systems, Addison-Wesley, 2005.
REFERENCES (cont'd)

Quaddus, M.A., and Khan, M.S., "Evolution of Artificial Neural Networks in Business Applications: An Empirical Investigation Using a Growth Model", International Journal of Management and Decision Making, Vol. 3, No. 1, March 2002, pp. 19-34. (See also the ANN application publications EndNote library files on the ICT619 ftp site.)
Wasserman, P.D., Neural Computing: Theory and Practice, Van Nostrand Reinhold, New York, 1989.
Wong, B.K., Bodnovich, T.A., and Selvi, Y., "Neural networks applications in business: A review and analysis of the literature (1988-95)", Decision Support Systems, 19, 1997, pp. 301-320.
Zahedi, F., Intelligent Systems for Business, Wadsworth Publishing, Belmont, California, 1993.
http://www.doc.ic.ac.uk/~nd/surprise_96/journal/vol4/cs11/report.html