Copyright R. Weber
Machine Learning, Data Mining, Genetic Algorithms, Neural Networks
ISYS370
Dr. R. Weber
Concept Learning is a Form of Inductive Learning
• Learner uses:
  – positive examples (instances that ARE examples of a concept), and
  – negative examples (instances that ARE NOT examples of a concept)
Concept Learning
• Needs empirical validation
• Dense or sparse data determines the quality of different methods
Validation of Concept Learning (i)
• The learned concept should be able to correctly classify new instances of the concept
  – When it succeeds on a real instance of the concept, it finds true positives
  – When it fails on a real instance of the concept, it finds false negatives
Validation of Concept Learning (ii)
• The learned concept should be able to correctly classify new instances of the concept
  – When it succeeds on a counterexample, it finds true negatives
  – When it fails on a counterexample, it finds false positives
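The four validation outcomes above can be tallied in a few lines; the `actual` and `predicted` lists here are made-up labels for illustration, not data from the slides:

```python
# 1 = real instance of the concept, 0 = counterexample (hypothetical data)
actual    = [1, 1, 1, 0, 0, 0]
predicted = [1, 1, 0, 0, 0, 1]

tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)  # true positives
fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)  # false negatives
tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)  # true negatives
fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)  # false positives

print(tp, fn, tn, fp)  # 2 1 2 1
```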
Rule Learning
• Learning is widely used in data mining
• Version Space Learning is a search method to learn rules
• Decision Trees
Decision trees
• Knowledge representation formalism
• Represent mutually exclusive rules (disjunction)
• A way of breaking up a data set into classes or categories
• Classification rules that determine, for each instance with attribute values, whether it belongs to one or another class
• Not incremental
Decision trees
• leaf nodes (classes)
• decision nodes (tests on attribute values)
• from decision nodes, branches grow for each possible outcome of the test
From Cawsey, 1997
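The structure above maps directly onto nested conditionals: decision nodes become tests, branches become outcomes, leaves return classes. The attributes, threshold, and class names below are invented for illustration:

```python
def classify(instance):
    """A hand-built decision tree: decision nodes test attribute
    values, each branch is one test outcome, leaves are classes."""
    if instance["income"] >= 3000:           # decision node
        return "good risk"                   # leaf node (class)
    else:                                    # other branch of the test
        if instance["job"] == "salaried":    # decision node
            return "fair risk"
        else:
            return "bad risk"

print(classify({"income": 2000, "job": "salaried"}))  # fair risk
```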
Decision tree induction
• Goal is to correctly classify all example data
• Several algorithms to induce decision trees: ID3 (Quinlan 1979), CLS, ACLS, ASSISTANT, IND, C4.5
• Constructs the decision tree from past data
• Attempts to find the simplest tree (not guaranteed, because it is based on heuristics)
ID3 algorithm
• From:
  – a set of target classes
  – training data containing objects of more than one class
• ID3 uses tests to refine the training data set into subsets that contain objects of only one class each
• Choosing the right test is the key
How does ID3 choose tests?
• Information gain or 'minimum entropy'
• Maximizing information gain corresponds to minimizing entropy
• Predictive features (good indicators of the outcome)
#  Monthly income  Job status  Repayment  Loan status
1  2,000           Salaried    200        Good
2  4,000           Salaried    600        Very bad
3  3,000           Waged       300        Very good
4  1,500           Salaried    400        Bad
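ID3's 'minimum entropy' test choice can be sketched on this table. As an assumption not stated on the slide, the four loan outcomes are collapsed into two classes (Good/Very good vs Bad/Very bad), and Job status is scored as one candidate test:

```python
from math import log2
from collections import Counter

def entropy(labels):
    """Shannon entropy of a class distribution."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

# Loan table rows as (job status, grouped class) -- grouping is an assumption
rows = [("salaried", "good"),   # 2,000 / 200 / Good
        ("salaried", "bad"),    # 4,000 / 600 / Very bad
        ("waged",    "good"),   # 3,000 / 300 / Very good
        ("salaried", "bad")]    # 1,500 / 400 / Bad

labels = [c for _, c in rows]
base = entropy(labels)          # 1.0: two of each class

# Information gain of the "Job status" test = base entropy minus the
# weighted entropy of the subsets the test creates
gain = base
for value in {"salaried", "waged"}:
    subset = [c for j, c in rows if j == value]
    gain -= len(subset) / len(rows) * entropy(subset)

print(round(base, 3), round(gain, 3))  # 1.0 0.311
```

ID3 would compute this gain for every candidate attribute and pick the one with the highest value (lowest remaining entropy).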
Data mining tasks ii
• Rules: association generation
• Link analysis: relationships between entities
• Deviation detection: how things change over time, trends
KDD applications
• Fraud detection
  – Telecom (calling cards, cell phones)
  – Credit cards
  – Health insurance
• Loan approval
• Investment analysis
• Marketing and sales data analysis
  – Identify potential customers
  – Effectiveness of sales campaigns
  – Store layout
Text mining
The problem starts with a query and the solution is a set of information (e.g., patterns, connections, profiles, trends) contained in several different texts that are potentially relevant to the initial query.
Text mining applications
• IBM Text Navigator
  – Clusters documents by content
  – Each document is annotated with the 2 most frequently used words in the cluster
• Concept Extraction (Los Alamos)
  – Text analysis of medical records
  – Uses a clustering approach based on trigram representation
  – Documents as vectors, cosine for comparison
Reasoning type        Problem-solving method
deductive reasoning   rule-based ES
analogical reasoning  case-based reasoning
inductive reasoning   inductive ML, NN
search                algorithms
Genetic Algorithms (GA)
Genetic algorithms (i)
• learn by experimentation
• based on human genetics; originate new solutions
• representational restrictions
• good for improving the quality of other methods, e.g., search algorithms, CBR
• evolutionary algorithms (broader term)
Genetic algorithms (ii)
• require an evaluation function to guide the process
• a population of genomes represents possible solutions
• operations are applied over these genomes
• operations can be mutation, crossover
• operations produce new offspring
• an evaluation function tests how fit an offspring is
• the fittest will survive to mate again
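The loop described above (population of genomes, evaluation function, crossover, mutation, survival of the fittest) can be sketched minimally. The fitness function, maximizing the number of 1-bits in a genome, is a toy problem chosen for illustration, and all parameters are assumptions:

```python
import random

random.seed(0)
GENOME_LEN, POP_SIZE, GENERATIONS = 12, 20, 40

def fitness(genome):            # evaluation function guiding the process
    return sum(genome)          # toy goal: maximize the number of 1s

def crossover(a, b):            # single-point crossover of two parents
    cut = random.randrange(1, GENOME_LEN)
    return a[:cut] + b[cut:]

def mutate(genome, rate=0.05):  # flip each bit with small probability
    return [bit ^ 1 if random.random() < rate else bit for bit in genome]

# initial population of random genomes (possible solutions)
population = [[random.randint(0, 1) for _ in range(GENOME_LEN)]
              for _ in range(POP_SIZE)]

for _ in range(GENERATIONS):
    population.sort(key=fitness, reverse=True)
    parents = population[:POP_SIZE // 2]          # the fittest survive to mate
    offspring = [mutate(crossover(random.choice(parents), random.choice(parents)))
                 for _ in range(POP_SIZE - len(parents))]
    population = parents + offspring

best = max(population, key=fitness)
print(fitness(best))  # best fitness found; non-decreasing, since parents are kept
```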
Genetic Algorithms ii
• http://ai.bpa.arizona.edu/~mramsey/ga.html (you can change parameters)
• http://www.rennard.org/alife/english/gavgb.html (presented by Steven Thompson)
Neural Networks (NN)
[Figure series: "the evidence" — training of vision occurs at roughly the 2nd-5th week]
NN: model of brains
[Diagram: neurons and synapses; electric transmissions from input to output]
Elements
• input nodes
• output nodes
• links
• weights
Terminology
• input and output nodes (or units) connected by links
• each link has a numeric weight
• weights store information
• networks are trained on training sets (examples) and afterwards tested on test sets to assess the networks' accuracy
• learning/training takes place as weights are updated to reflect the input/output behavior
The concept
Features (1 = Yes, 0 = No): 4 legs, flies, lays eggs
  1 0 0 => mammal
  0 1 1 => bird
  1 1 0 => mammal

With weights 0.5, 0.5, 0.5 the weighted sums are:
  0*0.5 + 1*0.5 + 1*0.5 = 1    (bird)
  1*0.5 + 0*0.5 + 0*0.5 = 0.5  (mammal)
  1*0.5 + 1*0.5 + 0*0.5 = 1    (mammal)
The goal is to have weights that recognize different representations of mammals and birds as such. Suppose we want bird to be greater than 0.5 and mammal to be equal to or less than 0.5: the third sum, a mammal, then comes out wrong.

With weights 0.25, 0.25, 0.5:
  0*0.25 + 1*0.25 + 1*0.5 = 0.75  (bird)
  1*0.25 + 0*0.25 + 0*0.5 = 0.25  (mammal)
  1*0.25 + 1*0.25 + 0*0.5 = 0.5   (mammal)
All three examples now satisfy the rule.
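The weighted sums above can be reproduced in code, using the threshold rule the slide supposes (bird if the sum is greater than 0.5, mammal otherwise):

```python
def classify(features, weights, threshold=0.5):
    """Weighted sum of binary features; bird if above threshold."""
    total = sum(f * w for f, w in zip(features, weights))
    return total, ("bird" if total > threshold else "mammal")

weights = [0.25, 0.25, 0.5]                       # 4 legs, flies, lays eggs
for features in ([0, 1, 1], [1, 0, 0], [1, 1, 0]):
    print(features, classify(features, weights))
# [0, 1, 1] -> 0.75, bird
# [1, 0, 0] -> 0.25, mammal
# [1, 1, 0] -> 0.5,  mammal
```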
The training
Output = Step(Σj wij fj)
• learning takes place as weights are updated to reflect the input/output behavior
• inputs fj: 4 legs, flies, eggs (e.g., 0 1 1); target output: mammal (1), bird (0)
• goal: minimize the error between the representation of the expected and the actual outcome
[Figure: a sequence of 3x3 weight matrices wij, starting at all zeros and filling in over successive training updates]
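The update process sketched above can be written as a simple perceptron rule: compute Output = Step(Σj wij fj), and on an error nudge each weight by the error times its input. The learning rate and epoch count are assumptions, not values from the slide:

```python
def step(x, threshold=0.5):
    """Step activation: fire (1) only above the threshold."""
    return 1 if x > threshold else 0

# features: 4 legs, flies, lays eggs; target: mammal = 1, bird = 0
examples = [([1, 0, 0], 1), ([0, 1, 1], 0), ([1, 1, 0], 1)]

weights = [0.0, 0.0, 0.0]   # start from the all-zero weights shown on the slide
rate = 0.1                  # learning rate (assumed)

for _ in range(50):         # epochs (assumed)
    for features, target in examples:
        output = step(sum(w * f for w, f in zip(weights, features)))
        error = target - output
        # update weights toward the expected input/output behavior
        weights = [w + rate * error * f for w, f in zip(weights, features)]

# after training, every example is classified correctly
for features, target in examples:
    assert step(sum(w * f for w, f in zip(weights, features))) == target
```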
NN demo…..
Characteristics
• NN implement inductive learning algorithms (through generalization); therefore, they require several training examples to learn
• NN do not provide an explanation of why the task was performed the way it was
• no explicit knowledge; uses data
• Classification (pattern recognition), clustering, diagnosis, optimization, forecasting (prediction), modeling, reconstruction, routing
Where are NN applicable?
• Where they can form a model from training data alone
• When there may be an algorithm, but it is not known or has too many variables
• There are enough examples available
• It is easier to let the network learn from examples
• Other inductive learning methods may not be as accurate
Applications (i)
• to predict the movement of stocks, currencies, etc., from previous data
• to recognize signatures by comparing those made (e.g., in a bank) with those stored
• to monitor the state of aircraft engines (by monitoring vibration levels and sound, early warning of engine problems can be given); British Rail has been testing an application to monitor diesel engines
Applications (ii)
• Pronunciation (rules with many exceptions)
• Handwritten character recognition (a network with 200,000 weights is impossible to train; the final network had 9,760 weights, used 7,300 examples to train and 2,000 to test, 99% accuracy)
• Learn brain patterns to control and activate limbs, as in the "Rats control a robot by thought alone" article
• Credit assignment
CMU Driving: ALVINN
• learns from human drivers how to steer a vehicle along a single lane on a highway
• ALVINN is implemented in two vehicles equipped with computer-controlled steering, acceleration, and braking
• cars can reach 70 mph with ALVINN
• programs that consider the whole problem environment reach only 4 mph
Why use NN for the driving task?
• there is no good theory of driving, but it is easy to collect training samples
• training data is obtained with a human* driving the vehicle
  – 5 min training, 10 min algorithm runs
• driving is continuous and noisy
• almost all features contribute useful information
* humans are not very good generators of training instances when they behave too regularly, without making mistakes
The neural network
• INPUT: a video camera generates a 30x32 grid of input nodes
• OUTPUT: a layer of 30 nodes corresponding to steering directions
• the vehicle steers in the direction of the output node with the highest activation
Resources
• http://www.cs.stir.ac.uk/~lss/NNIntro/InvSlides.html#what
• http://www.ri.cmu.edu/projects/project_160.html
• http://www.txtwriter.com/Onscience/Articles/ratrobot.html
Recommended