Feedback – Lab 2
9 Sept 2014
Your learning experience in this course
• Active listening: video lectures
– Underlying question: “How am I going to use this concept later on?”
• Consolidating new knowledge via quizzes and surveys
– Direct questions will help you memorize the concepts
• Activating your knowledge by reflecting and reasoning (your cognitive effort!)
– How shall I solve this problem with the knowledge that I have acquired so far?
– Lab classes
Lab Sessions: Text Comprehension & Task Interpretation
• (Always: point out inaccuracies)
• Use case 1: I do not understand the text.
– Go back to the video lecture; you have probably not built the background context required for completing the task.
… Continued
• Use case 2: Oh my god, what am I supposed to do here?
– Read the text several times and identify the key points in the text structure:
- Description
- The purpose
- Tasks
  - Pre-processing: feature transformation
  - Identify the best features by applying your knowledge about empirical error
  - Interpret the results based on your knowledge about empirical error and your common-sense knowledge or historical research
My expectations on your learning experience
• Students should be able to interpret the text and the tasks (diversified interpretations are allowed and welcome)
• Students should be able to show a critical mind by working out one or more plausible interpretations and motivating their choice(s).
About instructions and time…
• I am not sure that the instructions were unclear.
• The core task is representing the data as bin0 and bin1 in order to apply the formulae.
• This was the cognitive effort of this lab.
• You could work in groups, and groups could exchange information with each other… and for several hours… And you made it!
Pre-processing: feature transformation
• Categorical features → binary features
– Each feature should take the value 0 or 1, following the instructions under the heading “Preprocessing” (search & replace; IF formulae; whatever…)
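The transformation above can also be sketched in code rather than with search & replace. This is a minimal sketch: the feature names and the category-to-bit mappings below are hypothetical, since the actual mappings are given in the lab's “Preprocessing” instructions.

```python
# Hypothetical category-to-bit mappings; the real ones come from the lab's
# "Preprocessing" instructions, which are not reproduced here.
BINARY_MAPPINGS = {
    "sex": {"male": 0, "female": 1},
    "pclass": {"first": 1, "other": 0},
}


def binarize(rows, mappings):
    """Return copies of the rows with mapped categorical values replaced by 0/1."""
    result = []
    for row in rows:
        new_row = dict(row)  # copy, so the input rows are left untouched
        for feature, mapping in mappings.items():
            new_row[feature] = mapping[row[feature]]
        result.append(new_row)
    return result


# Made-up rows for illustration only.
rows = [
    {"sex": "male", "pclass": "first"},
    {"sex": "female", "pclass": "other"},
]
binary_rows = binarize(rows, BINARY_MAPPINGS)
print(binary_rows)
```

A category that is missing from a mapping raises a `KeyError` here, which makes unexpected values visible instead of silently passing them through.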
The task was about empirical error (Lect 6, min 7:44)
• Empirical error: how well the chosen hypothesis classifies the training data.
• How do you assess a hypothesis?
– Systematically count the correct and wrong guesses made by the hypothesis with respect to the correct labels
– This means that you must compare the predictions of the hypothesis with the actual labels
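The counting described above can be sketched as a short function. This is an illustrative sketch, not the lab's own solution: the function name and the toy data are assumptions, and labels are encoded as 1 = survived, 0 = did not survive.

```python
def empirical_error(predictions, labels):
    """Fraction of training examples where the prediction differs from the label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("need at least one example")
    wrong = sum(1 for p, y in zip(predictions, labels) if p != y)
    return wrong / len(labels)


# Made-up toy data: 1 = survived, 0 = did not survive.
labels = [1, 0, 0, 1, 0]
predictions = [1, 0, 1, 1, 1]
error = empirical_error(predictions, labels)
print(error)  # 2 wrong guesses out of 5 examples
```

Accuracy is the complement of this value, i.e. `1 - error`.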
Lab Task
• Our hypotheses were the different features.
• We have to assess each feature with respect to classification (survived vs died)
1) For each feature, calculate the empirical error
• LEARN TO PREDICT THE FIRST COLUMN
– (a) For each of the features, calculate (and write down) the training error if you used only that feature to classify the data. To do this you will need to do the following for each feature:
– Split the data based on that feature. Call bin0 all examples that have 0 for that feature and bin1 all examples that have 1 for that feature.
– Calculate the majority count for the label in each bin, i.e. for bin0, majority(bin0) = max(count(bin0 = survive); count(bin0 = notsurvive))
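The split-and-count steps above can be sketched in code. The slide text only goes as far as the majority count. The final step used here, training error = 1 − (majority(bin0) + majority(bin1)) / N, is an assumption about how the counts are combined (each bin predicts its own majority label). Function names and data are illustrative.

```python
from collections import Counter


def majority_count(bin_labels):
    """Count of the most frequent label in a bin (0 for an empty bin)."""
    if not bin_labels:
        return 0
    return max(Counter(bin_labels).values())


def single_feature_error(feature_values, labels):
    """Training error when each bin predicts its own majority label."""
    bin0 = [y for x, y in zip(feature_values, labels) if x == 0]
    bin1 = [y for x, y in zip(feature_values, labels) if x == 1]
    # Assumed combination step: majority labels count as correct guesses.
    correct = majority_count(bin0) + majority_count(bin1)
    return 1 - correct / len(labels)


# Made-up toy data, not the lab dataset: 1 = survive, 0 = notsurvive.
labels = [1, 0, 0, 1, 0, 1, 0, 0]
feature = [1, 0, 0, 1, 1, 0, 0, 0]
training_error = single_feature_error(feature, labels)
print(training_error)
```

In this toy example bin0 holds five examples (four of them notsurvive) and bin1 holds three (two of them survive), so six of the eight examples are classified correctly.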
Accuracy/Error
• A possible representation….
WATCH OUT! AGE FEATURE IS TRICKY HERE!
Other representations (etc. etc.)
Which feature would be best to use?
• EMBARKED… if we trust this sample and our calculations… (error rate on this feature is the lowest)
• Basically this means that many of those who started their trip from Southampton did not survive.
• However, the difference between the features was very small!
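Choosing the feature with the lowest error, and checking how narrowly it wins, can be sketched as follows. The feature names and error values are placeholders for illustration only. They are not the results of the lab.

```python
def rank_features(errors):
    """Sort (feature, error) pairs from lowest to highest error."""
    return sorted(errors.items(), key=lambda item: item[1])


# Placeholder values for illustration only; they are not the lab's results.
errors = {"feature_a": 0.31, "feature_b": 0.30, "feature_c": 0.35}
ranking = rank_features(errors)
best, runner_up = ranking[0], ranking[1]
margin = runner_up[1] - best[1]
print(best[0], round(margin, 3))
```

A small margin between the best feature and the runner-up is a hint that the ranking may not hold on a different or larger sample.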
Many interesting interpretations! Nobody believed that Embarked was really a good feature
• “This could depend on the small dataset”
• “Embarked feature gave the lowest error […] Intuitively the first class feature should have the strongest relationship with the chance of surviving”
• “If we calculate accuracy with more features […], we get more interesting results”
• “The Embarked would be the best to use because it has the lowest error rate. In reality it is very unlikely that the city has any correlation with their chance of survival, unless they received some special training before boarding or shared a rough upbringing in the city”
• Etc.
Missing values
• Good that you noticed that there were missing values, i.e. cells without any value!
– Some of you have removed them
– Some of you have converted them to >25
• In practice, missing values require ”more investigation”
• Missing values are not considered to be ”noise” in the sense that was explained during the video lecture.
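The two approaches mentioned above (removing the examples, or converting the missing value to “>25”) can be sketched as follows. This is only an illustration under assumptions: it assumes the age feature is binarized as 1 for age > 25 and 0 otherwise, that a missing cell is represented as `None`, and that “converted to >25” means assigning the value 1. Neither approach replaces the “more investigation” that missing values require in practice.

```python
AGE_THRESHOLD = 25  # taken from the ">25" conversion mentioned above


def drop_missing(ages, labels):
    """Strategy 1: remove examples whose age is missing (None)."""
    kept = [(a, y) for a, y in zip(ages, labels) if a is not None]
    return [a for a, _ in kept], [y for _, y in kept]


def binarize_age(ages, missing_as=1):
    """Strategy 2: 1 if age > threshold, else 0; missing cells get `missing_as`.

    missing_as=1 treats a missing age as ">25" (an assumed reading).
    """
    return [missing_as if a is None else int(a > AGE_THRESHOLD) for a in ages]


# Made-up toy data; None marks an empty cell.
ages = [22, None, 38, 25, None]
labels = [0, 1, 1, 0, 0]

kept_ages, kept_labels = drop_missing(ages, labels)
age_bits = binarize_age(ages)
print(kept_ages, kept_labels)
print(age_bits)
```

The first strategy shrinks the dataset, while the second keeps every example but places all unknown ages in one bin, so the two can give different error rates for the age feature.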
Technical troubles
• If you experience problems with a computer (configuration problems, weird behaviour, etc.), just change computers and report the trouble (Per?)
Next…
• Those who have miscalculated the empirical error should recalculate it in the correct way, as presented.
• Those who want can get some additional training with an optional task that is on the website. It contains the solution. You do not need to submit anything. It is just for you!
• All those who have submitted the report have completed this lab task. Well done!