Neural Networks
Primer
Overview
Background
Biological neural network
Artificial neural network
Setting up (designing) the model
Executing the model
Interpreting the results
Biological Neuron
Biological Neural Network
How does it work?
Information flows in via dendrites
Sensor information (sight, sound, smell, etc.)
Nucleus combines information
How this combination works is not well understood
Information (decision) flows out via axon to other neurons
Could be final answer or intermediate partial answer
What can they do?
Many of the functions our brain accomplishes
100 billion neurons, each of which may be connected to 1,000 other neurons; over a trillion connections
Pattern recognition is key function
Can deal with incomplete data
Fast processing via parallel processing
Learn, adapt
Degrade gracefully
Artificial Neural Network
An artificial representation of a biological neuron: designed to perform similarly to biological neurons (as best we understand them at the moment)
What can they do?
A small piece of what the brain does
Learn from prior data
Given inputs (variables with values) and associated outputs (answers or solutions), find the relationship between the two
Pattern recognition is key function
Can deal with incomplete data
Fast processing via parallel processing
Learn, adapt
Degrade gracefully
Some success stories
Evaluation of cancer biopsies
Identify disease in citrus
Find fossil sites
Identify tea leaves
Forecast patient load
Diagnosing turbine engine problems
Stock market modeling
Predicting risk of bankruptcy
Forecasting wool prices
Further sources of applications:
http://www.steveknode.com/professional-experience/neural-networks
http://www.wardsystems.com/apptalk.asp
http://www.palisade.com/cases/katherinenhospital.asp?caseNav=byProduct
Artificial Neural Network
Simplified architecture employed
Often the “backpropagation” architecture is chosen
Usually three layer network
Input neurons – represent key variables
‘hidden layer’ neurons – represent ‘feature collectors’
Output layer neurons – represent results
Artificial Neural Network
Hidden neurons
Designing the model
Input neurons:
Chosen based on having an “expected causative connection” to the outputs
Somewhat subjective: an art, not a science
Relatively few in number (usually)
Degrees of freedom consideration
Designing the model
Hidden neurons:
Mysterious part of the approach
Serve as “feature detectors”
Aggregate the inputs from several input neurons
Function not easily explainable
Vary the number to get better results
Most software programs do this automatically
Designing the model
Output neurons:
Answers to the problem
Neural networks are good at predicting and classifying
Can be continuous or categorical
Probability that a customer will defect
Whether a student will succeed or not
Machine breakdown likelihood
Is this a fraudulent transaction?
What will customer purchase next?
Model execution: Learning
Step 1: Take the past data set and separate the cases into ‘training’ examples and ‘testing’ examples
NN model will ‘learn’ from the training examples how to relate the inputs from each case to the output from each case
Training examples must be representative and proportional for learning to be effective
NN model will use the test examples to see how it performs on new cases
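Step 1 can be sketched in a few lines. This is a minimal illustration with a made-up data set of (inputs, known output) cases; the two-thirds/one-third split ratio is an assumption, not a rule from the slides.

```python
import random

# Hypothetical past data set: each case is (inputs, known_output).
cases = [([0.2, 0.7], 1), ([0.9, 0.1], 0), ([0.4, 0.5], 1),
         ([0.8, 0.3], 0), ([0.1, 0.9], 1), ([0.6, 0.2], 0)]

random.seed(0)         # reproducible shuffle
random.shuffle(cases)  # avoid ordering bias before splitting

split = int(len(cases) * 2 / 3)   # e.g., two-thirds for training
training_set = cases[:split]      # the network learns from these
testing_set = cases[split:]       # held back to check performance on new cases
```

Shuffling before splitting helps keep both sets representative and proportional, as the slide requires.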
Model execution: Learning
Step 2: Past input training examples (one at a time) are fed to the neural network, along with the associated outputs
Input variables must contain numerical values (normalized to 0-1)
Transformations are often necessary, e.g., for gender or day of the week
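The transformations mentioned above can be sketched as small helper functions. The function names and the particular encodings (0/1 for gender, one-of-seven for day of week, min-max scaling for continuous values) are illustrative assumptions:

```python
def encode_gender(g):
    """Map a categorical gender code to a 0-1 numerical input (assumed coding)."""
    return {"M": 0.0, "F": 1.0}[g]

def encode_day(day):
    """One-of-N encoding for day of week: seven 0/1 inputs, one per day."""
    days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
    return [1.0 if d == day else 0.0 for d in days]

def normalize(value, lo, hi):
    """Rescale a continuous variable into the 0-1 range (min-max scaling)."""
    return (value - lo) / (hi - lo)
```

For example, `normalize(75, 50, 100)` maps a value of 75 on a 50-100 scale to 0.5.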
Model execution: Learning
Step 3: Normalized (0-1) inputs, multiplied by weights, are ‘pushed’ to the hidden layer
Each input neuron fully connected to the next layer (hidden layer)
Weights are initially random
Different for each neuron connection
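A sketch of step 3, assuming a tiny 3-input, 2-hidden-neuron network with random initial weights in [-1, 1] (the size and weight range are assumptions for illustration):

```python
import random

random.seed(1)
n_inputs, n_hidden = 3, 2

# Fully connected: one weight per (input neuron, hidden neuron) pair,
# initially random and different for each connection.
weights_ih = [[random.uniform(-1, 1) for _ in range(n_hidden)]
              for _ in range(n_inputs)]

inputs = [0.5, 0.1, 0.9]   # normalized (0-1) input values

# Each hidden neuron receives every input multiplied by its own weight.
hidden_sums = [sum(inputs[i] * weights_ih[i][h] for i in range(n_inputs))
               for h in range(n_hidden)]
```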
Hidden Layer processing
Step 4a: hidden layer activation
Each neuron applies a combination function (sums the weighted products from each input neuron)
Each neuron then applies a transfer function to that sum
Usually a sigmoid function
Results in an output number between -1 and 1 or 0 and 1
Outputs the result to the output layer
Again, fully connected with weights on each connection to the output neurons
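Step 4a can be sketched directly: a weighted sum (combination function) fed into the sigmoid (transfer function), which squashes any sum into the 0-1 range. The function names are illustrative:

```python
import math

def sigmoid(x):
    """Transfer function: squashes any real-valued sum into the 0-1 range."""
    return 1.0 / (1.0 + math.exp(-x))

def hidden_activation(inputs, weights):
    """Combination function (weighted sum) followed by the transfer function."""
    s = sum(i * w for i, w in zip(inputs, weights))
    return sigmoid(s)
```

A sum of 0 maps to exactly 0.5; large positive sums approach 1 and large negative sums approach 0.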
Hidden layer magic
Hidden layer magic - Sigmoid
[Plot: the sigmoid transfer function; x-axis: input from each hidden node, y-axis: output from each hidden node]
Hidden Layer processing
Step 4b: hidden layer results
Sent to the output layer
Output from hidden layer neurons is a number between 0 and 1 (or between -1 and 1) from each
These numbers are again multiplied by weights on the connections to the results layer
Output Layer processing
Step 5: Combine inputs from the hidden layer by addition (combination function)
Input into another sigmoid function in the output neuron layer to get a number between 0 and 1
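Step 5 is the same pattern as the hidden layer, applied once more. A minimal sketch with made-up hidden-layer outputs and weights:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def output_neuron(hidden_outputs, weights):
    """Step 5: sum the weighted hidden-layer outputs (combination function),
    then squash through the sigmoid to a number between 0 and 1."""
    combined = sum(h * w for h, w in zip(hidden_outputs, weights))
    return sigmoid(combined)

# Illustrative values: three hidden-node outputs and their connection weights.
result = output_neuron([0.73, 0.41, 0.88], [0.5, -0.3, 0.2])
```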
Then what?
Step 6: compare the computed result with the known result from this case
If the result is correct, then proceed to the next case
If the result is wrong, then change the weights and try again (train the network)
Some neural networks change the weights at the end of the epoch (one pass through all the cases) if the stopping criteria are not met.
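A much-simplified sketch of the weight correction in step 6: nudge each weight in proportion to its input and the error. Real backpropagation additionally applies the sigmoid's derivative and propagates the error back to the hidden-layer weights; the function name and learning rate here are assumptions.

```python
def update_weights(weights, inputs, computed, known, rate=0.1):
    """Move each weight in the direction that reduces the error
    (known - computed). Simplified delta-rule step, not full backprop."""
    error = known - computed
    return [w + rate * error * x for w, x in zip(weights, inputs)]
```

With `computed=0.3` and `known=1.0`, the error is positive, so weights on positive inputs are increased.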
When do we stop the training?
Step 7: continue with training until a set of criteria are met (user designated)
Stop if the answers for the entire training set are acceptable (within tolerance)
Stop if the neural network is not learning or doing any better after a specified number of epochs
Stop after a certain amount of time
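The three stopping criteria from step 7 can be sketched as one training loop. `network_step` is a hypothetical callback that runs one epoch and returns the total error over the training set; all parameter defaults are illustrative assumptions.

```python
import time

def train(network_step, max_epochs=1000, tolerance=0.01, patience=50,
          time_limit=60.0):
    """Run epochs until one of the user-designated stopping criteria fires."""
    best, stale = float("inf"), 0
    start = time.time()
    for epoch in range(max_epochs):
        error = network_step()
        if error <= tolerance:                 # answers within tolerance
            return "converged", epoch
        if error < best:
            best, stale = error, 0
        else:
            stale += 1
            if stale >= patience:              # not learning any better
                return "stalled", epoch
        if time.time() - start > time_limit:   # time budget exhausted
            return "timed out", epoch
    return "max epochs", max_epochs
```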
Testing the network
Step 8: freeze the weights after training is completed and test the neural network against the TEST set of cases
Likely not to perform as well since the test set has not been seen before
If performance against the test set is acceptable, begin to use the trained net on new cases
If performance against the test set is unacceptable, revisit the neural net for improvements
Testing the network
Step 8a: Adjusting the neural network if it does not perform well enough
Choose different set of inputs (more related to the output)
Add more data, properly formatted
Train the network longer
Adjust the number of hidden neurons
Use a different transfer function
Use a different architecture
NOTE: most neural network software performs many of the necessary adjustments (e.g., number of hidden nodes, layers, etc.) automatically
Interpreting the answer
How well does it perform?
Compare to the best alternate method you have, not to perfection
Do you need to explain the answer?
Difficult to explain how the final result came about
Can you get the necessary inputs in a timely manner?
Need inputs in time to run model
Can you operationalize the answer?
Neural Network - Considerations
Real-time vs. offline
do you need real-time results?
use a spreadsheet or develop another interface
Interpretation of Results
problem dependent
continuous variables require interpretation
Comparison (benchmark)
what else is used?
compare statistically if possible
Neural Network Applications

Application    Function/Purpose                            Payoff
------------   -----------------------------------------   --------------------------------------------
CemQUEST       Predicts the quality of oilfield cement;    Saves $3-5 million per year per client
               avoids operational failures
PAPNET         Detects cancerous cells on Pap smears;      97% accurate vs. human accuracy of
               detects 128 cells out of over 300,000       approximately 50% (being extended to lung
                                                           cancer detection)
Neuroroute     Printed circuit board design optimizer      Considerable time savings; automates the
               based on 600 actual designs                 process
NeuSight       Optimizes the combustion process in         $250,000 vs. $5M; reduces unburned carbon
               coal-fired boilers                          by 30%
Wrangler CRP   Forecasts production planning and           Increased sales, lower inventory costs,
               inventory                                   improved operations
OptimizOR      Improves OR utilization                     10-20% improvement; $1M savings per hospital
Source: ISR, Vol. 13, Nos. 9, 6, 4, 1 and ISS, Vol. XII, No. 11.
NN Strengths and Weaknesses
Strengths
wide range of problems (prediction and categorization)
produce “good” results (compare to alternate methods)
Weaknesses
data requirements
data preprocessing
no explanatory capability