Introduction to Weka
CS4705 – Natural Language Processing
Thursday, September 28
What is weka?
● Java-based Machine Learning Tool
● 3 modes of operation
– GUI
– Command Line
– API (not discussed here)
● To run:
– java -Xmx1024M -jar ~cs4705/bin/weka.jar &
weka Homepage
● http://www.cs.waikato.ac.nz/ml/weka/
.arff file format
● http://www.cs.waikato.ac.nz/~ml/weka/arff.html
@relation name
@attribute attrName {numeric, string, <nominal>, date}
...
@data
a,b,c,d,e
● <nominal> := {class1,class2,...,classN}
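To make the template above concrete, here is a minimal sketch that writes a tiny ARFF file and counts its data rows. The relation name, attribute names, and data values are all made up for illustration. Note that each @attribute line declares exactly one type: a nominal attribute lists its classes in braces, and the others use a keyword such as numeric.

```shell
# Write a small, hypothetical ARFF file (names and values are illustrative only).
arff_file="${TMPDIR:-/tmp}/weather_example.arff"
cat > "$arff_file" <<'EOF'
@relation weather_example
@attribute outlook {sunny,overcast,rainy}
@attribute temperature numeric
@attribute play {yes,no}
@data
sunny,85,no
overcast,83,yes
rainy,70,yes
EOF

# Count the instances: every non-empty line after the @data marker.
instance_count=$(awk 'seen && NF { n++ } /^@data/ { seen = 1 } END { print n }' "$arff_file")
echo "$instance_count instances in $arff_file"
```

A file like this can be loaded through 'Open file...' in the Explorer or passed to a classifier with -t.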
Example Arff Files
● http://sourceforge.net/projects/weka
● iris.arff
● cmc.arff
To Classify with weka GUI
1. Run weka GUI
2. Click 'Explorer'
3. 'Open file...'
4. Select 'Classify' tab
5. 'Choose' a classifier
6. Confirm options
7. Click 'Start'
8. Wait...
9. Right-click on Result list entry
    a. 'Save result buffer'
    b. 'Save model'
Classify
● Some classifiers to start with.
– NaiveBayes
– JRip
– J48
– SMO
● Find References by selecting a classifier
● Use Cross-Validation!
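As a rough sketch of how these four classifiers could be compared outside the GUI, the function below prints (it does not run) one cross-validation command per classifier. The function name cv_commands is hypothetical. The fully qualified class names for JRip, J48, and SMO are the package paths used in recent Weka releases and are an assumption about the version installed for this course; only the NaiveBayes path appears in these slides.

```shell
# Print one N-fold cross-validation command per classifier (dry run only).
# cv_commands is a hypothetical helper: cv_commands JAR TRAINING_FILE FOLDS
cv_commands() {
  jar=$1
  train=$2
  folds=$3
  # Class paths below are assumed from recent Weka releases; check your version.
  for cls in \
    weka.classifiers.bayes.NaiveBayes \
    weka.classifiers.rules.JRip \
    weka.classifiers.trees.J48 \
    weka.classifiers.functions.SMO
  do
    printf 'java -cp %s %s -t %s -x %s\n' "$jar" "$cls" "$train" "$folds"
  done
}

# The jar path is single-quoted so it is printed exactly as written.
cv_commands '~cs4705/bin/weka.jar' trainingdata.arff 10
```

Printing the commands first makes it easy to inspect them before pasting one into the terminal.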
Analyzing Results
● Important tools for Homework 2
– Accuracy
    ● “Correctly classified instances”
– Confusion matrix
– Save model
– Visualization
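Accuracy and the confusion matrix are directly related: accuracy is the sum of the matrix diagonal divided by the sum of all cells. The sketch below works this out with awk for a made-up 3-class matrix (rows are actual classes, columns are predicted classes); the counts are illustrative, not output from any real run.

```shell
# Illustrative confusion matrix: rows = actual class, columns = predicted class.
# The counts are invented for this example.
matrix='50 0 0
0 47 3
0 4 46'

# Accuracy (%) = 100 * (sum of diagonal cells) / (sum of all cells).
accuracy=$(printf '%s\n' "$matrix" | LC_ALL=C awk '
  {
    for (i = 1; i <= NF; i++) {
      total += $i
      if (i == NR) correct += $i   # cell on the diagonal
    }
  }
  END { printf "%.2f", 100 * correct / total }')

echo "Accuracy: ${accuracy}%"
```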
Running weka from the Command Line
● Running an N-fold cross validation experiment
– java -cp ~cs4705/bin/weka.jar weka.classifiers.bayes.NaiveBayes -t trainingdata.arff -x N
● Using a predefined test set
– java -cp ~cs4705/bin/weka.jar weka.classifiers.bayes.NaiveBayes -t trainingdata.arff -T testingdata.arff
● Saving the model
– java -cp ~cs4705/bin/weka.jar weka.classifiers.bayes.NaiveBayes -t trainingdata.arff -d output.model
● Classifying a test set
– java -cp ~cs4705/bin/weka.jar weka.classifiers.bayes.NaiveBayes -l input.model -T testingdata.arff
● Analyzing results
– Get predictions from test data
    ● java -cp ~cs4705/bin/weka.jar weka.classifiers.bayes.NaiveBayes -l input.model -T testingdata.arff -p range
– Then DIY with scripts
    ● awk and sed will be your friends
● Getting predictions from crossvalidation
– “Output Predictions” doesn't cut it.
– export CLASSPATH=~cs4705/bin/:~cs4705/bin/weka.jar
– java callClassifier weka.classifiers.bayes.NaiveBayes -t trainingdata.arff
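The "DIY with scripts" step might look something like the awk sketch below, which tallies correct predictions and lists each kind of misclassification. The input layout (instance number, actual label, predicted label, separated by whitespace) is an assumption made for this example: the real column order of -p output differs between Weka versions, so the field numbers would need adjusting. The sample lines are invented, not real Weka output.

```shell
# Assumed layout per line: <instance#> <actual> <predicted>
# (hypothetical sample data; adjust $2/$3 to match your Weka version's -p output)
predictions='1 yes yes
2 no yes
3 no no
4 yes yes'

summary=$(printf '%s\n' "$predictions" | LC_ALL=C awk '
  {
    total++
    if ($2 == $3) correct++
    else errors[$2 " -> " $3]++   # count each actual -> predicted confusion
  }
  END {
    printf "correct=%d total=%d\n", correct, total
    for (pair in errors) printf "misclassified %s: %d\n", pair, errors[pair]
  }')

echo "$summary"
```

With real data, the printf line feeding awk would be replaced by the saved output of the -p command.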