
Learning to Teach: Improving Instruction with Machine Learning Techniques



Page 1: Learning to Teach: Improving Instruction with Machine Learning Techniques

Beverly Park Woolf School of Computer Science, University of Massachusetts

[email protected]

Learning to Teach: Machine Learning to Improve Instruction

NIPS 2014 Workshop on Human Propelled Machine Learning, Dec 13, 2014

Page 2

Long, Long-Term Goal

Millions of schoolchildren will have access to what Alexander the Great enjoyed as a royal prerogative:

“the personal services of a tutor as well informed as Aristotle”

Pat Suppes, Stanford University, 1966 (died Nov 2014)

“Students will have instant access to vast stores of knowledge through their computerized tutors”

Page 3

Alexander the Great valued learning so highly that he said he was more indebted to Aristotle for giving him knowledge than to his father for giving him life.

Page 4

We are on track.

Key components:

Artificial Intelligence, Machine Learning, Learning Sciences

We can now provide the personal services of a tutor for every student, and instant access to vast stores of knowledge.

Page 5

Then: ~400 BC. Now: 2014.

Page 6

Model the Student

Model the Domain

Personalize Tutoring

Assess Learning

Intelligent Tutoring Systems

Learning@Scale

Page 7

Research Questions

How can we retrieve meaningful substance from educational data?

What do teachers and students need to know?

What do researchers in Learning Sciences want to know?

Page 8

• Explore large educational data sets and how they are analyzed: creating models and finding patterns.

• How are researchers in educational technology using a variety of techniques to turn data into improved teaching and learning?

Page 9

What Kind of ML Techniques?

– Visualization and modeling
– Decision trees
– Bayesian networks
– Logistic regression
– Temporal models
– Markov models
– Classification: Naïve Bayes, neural networks, decision trees

Page 10

Reasoning about the Learner with Machine Learning

Page 11

Techniques applied across model types:

– Open learner models
– Models for teachers, parents
– Models of the domain
– Models of student knowledge, learning
– Models of student affect/motivation, engagement/use and misuse, on-/off-task behavior
– Pedagogical moves and tutorial actions

– Pre-processing (discretizing, normalizing, and transforming variables): Arroyo, EDM 2010; Baker
– Visualizations (single variables and relationships): Bull & Mitrovic; Ritter, EDM 2011 best paper
– Models, correlations/cross-tabulations: Arroyo, log files
– Models, causal modeling: Beck & Rai
– Models, linear regression: Heffernan; Koedinger; Arroyo, Shanabrook; Baker; Arroyo, AnimalWatch
– Models, feature selection (splitting models vs. accounting for): Martin & Koedinger; Arroyo
– Classification, logistic regression: Pavlik (PFA); Gong & Beck; David Cooper; Beck
– Classification, clustering: Desmarais, non-negative matrix factorization; Yue Gong, UMAP 2012, clustering without features
– Classification, Naive Bayes: Stern, MANIC
– Classification, neural networks: Burns, handwriting; D'Mello, predicting affective states; Baker
– Classification, decision trees: a random forest approach was widely used in the KDD Cup 2011; de Vicente & Pain
– Models, association rule learning: Romero; Merceron
– Temporal models, temporal patterns and trails over observable variables, and Markov chains: Romero, educational trails; Shanabrook
– Models, Bayesian networks: Zapata-Rivera; Heffernan; much classic ITS work (HYDRIVE; William Murray); Conati; Arroyo; Rai; Murray, RTDT
– Temporal models, hidden Markov models (latent variables): Mayo & Mitrovic; Beck; Pardos; Johns & Woolf; Ivon Arroyo, Worcester Polytechnic Institute

Page 12

Data Sets Used

– Data sets come from log files of educational tutoring and assessment software.

Page 13

Large Data Sets

EventLog table of a math tutoring system: 571,776 rows in a single year.

Page 14

Introduction

Model the Student

Model the Domain

Personalize Tutoring

Assess Learning

Agenda

Intelligent Tutoring Systems

Learning@Scale

Page 15

Student Model

Page 16

Student Model

Page 17

Student Model

Page 18
Page 19

A data-driven approach toward automatic prediction of students' emotional states, without sensors and while students are still actively engaged in their learning.

Models are built from students' ongoing behavior. Cross-validation revealed small gains in accuracy for the more sophisticated state-based models, and better predictions of the remaining unpredicted cases, compared to the baseline models.

By modifying the context of the tutoring system, including students' perceived emotion around mathematics, a tutor can now optimize and improve a student's mathematics attitudes.

David H. Shanabrook, David G. Cooper, Beverly Park Woolf, and Ivon Arroyo

Page 20

Student States: Describing the student/tutor interaction

Page 21
Page 22
Page 23
Page 24
Page 25
Page 26
Page 27
Page 28

Problem state patterns

IBM's Many Eyes Word Tree algorithm, applied to 1,280 ATT (attempted and solved) events. Most frequently, ATT was followed by a SOF event (see top tree). The second level of the tree shows that after the sequence ATT ATT, the most frequent next event changes to ATT itself, i.e., the shift in behavior occurs after two ATT states (see second tree, top branch). This indicates that ATT is more often a solitary event, whereas an ATT ATT pattern tends to continue in the ATT state. From this analysis, the most frequent three-problem state patterns (e.g., NOTR-NOTR-NOTR) are determined (see third tree, second branch).
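The underlying search for frequent three-state patterns can be sketched as a simple sliding-window count. This is not IBM's Word Tree algorithm itself, just the trigram-counting idea behind the analysis; the session data below is hypothetical, with state labels borrowed from the slide.

```python
from collections import Counter

def frequent_trigrams(states, top=3):
    """Count sliding windows of three consecutive problem states."""
    trigrams = zip(states, states[1:], states[2:])
    return Counter(trigrams).most_common(top)

# Hypothetical session; ATT = attempted and solved (per the slide),
# SOF and NOTR as named there.
session = ["ATT", "SOF", "NOTR", "NOTR", "NOTR", "NOTR", "NOTR", "ATT", "SOF"]
print(frequent_trigrams(session, top=1))  # [(('NOTR', 'NOTR', 'NOTR'), 3)]
```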

Page 29

Jeff Johns, Autonomous Learning Laboratory
Beverly Woolf, Center for Knowledge Communication

AAAI, July 20, 2006

A Dynamic Mixture Model to Detect Student Motivation and Proficiency

Page 30

Problem Statement

• Background
– Develop a machine learning component for a math tutoring system used by high school students (SAT, MCAS)
– Focus on estimating the "state" of a student, which is then used for selecting an appropriate pedagogical action

• Problem
– Using a model to estimate student ability, but…
– Students appear unmotivated in ~30% of problems

• Solution
– Explicitly model motivation (as a dynamic variable) and student proficiency in a single model

Page 31

Detection of Motivation

Unmotivated students do not reap the full rewards of using a computer-based intelligent tutoring system. Detection of improper behavior is thus an important component of an online student model.

A dynamic mixture model based on Item Response Theory simultaneously estimates a student's proficiency and changing motivation level.

By accounting for student motivation, the dynamic mixture model can more accurately estimate proficiency and the probability of a correct response.

Page 32

• Created Item Response Theory (IRT) models for modeling the student's knowledge

• Data consist of responses (correct/incorrect) for 400 students across 70 problems, where a student performs ~33 problems on average

• Implemented an EM algorithm to learn the parameters of the IRT model

• Cross-validated results indicate the model can predict with 72% accuracy how the student will perform on each problem

• The algorithms can be used online to estimate a student's ability while interacting with the tutor

• Currently working on an extension of the IRT model to include information relevant to a student's motivation (time spent on problem, number of hints requested)
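The slides do not spell out the IRT model itself; a minimal one-parameter logistic (Rasch) sketch follows, with made-up proficiency and difficulty values.

```python
import math

def p_correct(theta, b):
    """1PL (Rasch) IRT: probability of a correct response given
    student proficiency theta and item difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# A student of average proficiency (theta = 0) on items of two difficulties
print(round(p_correct(0.0, 0.0), 2))  # 0.5
print(round(p_correct(0.0, 1.5), 2))  # harder item, lower probability
```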

Page 33

Low Student Motivation

• Example: Actual data from a student performing 12 problems (green = correct, red = incorrect)
– Problems are of roughly equal difficulty

• Student appears to perform well in the beginning and worse toward the end

• Conclusion: The student's proficiency is average

[Figure: outcomes for problems 1-12]

Page 34

Low Student Motivation

• Conclusion: Poor performance on the last five problems is due to low motivation (not proficiency)

[Figure: time (s) to first response for the 12 problems; on the last five the time drops sharply, marking the student as unmotivated]

Use observed data to infer motivation!

Page 35

Low Student Motivation

• Opportunity for intelligent tutoring systems to improve student learning by addressing motivation

• This issue is being dealt with on a larger scale by the educational assessment community
– Wise & DeMars, 2005. Low Examinee Effort in Low-Stakes Assessment: Potential Problems and Solutions. Educational Assessment.

Page 36

Hidden Markov Model (HMM)

• An HMM is used to capture a student's changing behavior (level of motivation)

[Diagram: hidden motivation states M1 … Mn, observations H1 … Hn]

Mi (hidden), Hi (observed):

– Unmotivated (hint): time to first response < tmin AND number of hints before correct response > hmax
– Unmotivated (guess): time to first response < tmin AND number of hints before correct response < hmin
– Motivated: if the other two cases don't apply
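The categorization above can be sketched directly; the threshold values tmin, hmin, and hmax below are illustrative, not taken from the talk.

```python
def motivation_state(first_response_time, hints_before_correct,
                     t_min=5.0, h_min=1, h_max=3):
    """Label a problem attempt with the slide's three categories.
    Thresholds are illustrative; the model treats these labels as
    observations over the hidden motivation variable."""
    if first_response_time < t_min and hints_before_correct > h_max:
        return "unmotivated-hint"   # racing through hints to the answer
    if first_response_time < t_min and hints_before_correct < h_min:
        return "unmotivated-guess"  # answering too fast, without hints
    return "motivated"

print(motivation_state(2.0, 5))    # unmotivated-hint
print(motivation_state(2.0, 0))    # unmotivated-guess
print(motivation_state(20.0, 1))   # motivated
```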

Page 37

• New edges (in red) change the conditional probability of a student's response: P(Ui | θ, Mi)

[Diagram: motivation states M1 … Mn, observations H1 … Hn, and responses U1 … Un]

Motivation (Mi) affects student response (Ui)

Page 38

Parameter Estimation

• Uses an Expectation-Maximization algorithm to estimate parameters
– The M-step is iterative, similar to the Iteratively Reweighted Least Squares (IRLS) algorithm

• The model consists of discrete and continuous variables
– The integral over the continuous variable is approximated using a quadrature technique

• The only parameters not estimated:
– P(Ui | θ, Mi = unmotivated-guess) = 0.2
– P(Ui | θ, Mi = unmotivated-hint) = 0.02
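A sketch of how those fixed response probabilities combine with an IRT curve. The motivated case assumes a plain 1PL logistic, a simplification of the paper's dynamic mixture model.

```python
import math

# Fixed response probabilities from the slide for the two unmotivated
# states; the motivated state falls back to an IRT curve (a 1PL
# logistic is assumed here for illustration).
P_GUESS = 0.2   # e.g. guessing among five choices
P_HINT = 0.02   # clicking through hints until the answer appears

def p_response(theta, b, state):
    """P(Ui | theta, Mi): probability of a correct response."""
    if state == "unmotivated-guess":
        return P_GUESS
    if state == "unmotivated-hint":
        return P_HINT
    return 1.0 / (1.0 + math.exp(-(theta - b)))  # motivated

print(round(p_response(1.0, 0.0, "motivated"), 2))   # 0.73
print(p_response(1.0, 0.0, "unmotivated-guess"))     # 0.2
```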

Page 39

Modeling Ability and Motivation

• Combined model does not decrease the ability estimate when the student is unmotivated

Combined model separates ability from motivation (IRT model lumps them together)

Page 40

Experiments

• Data: 400 high school students, 70 problems; a student finished 32 problems on average

• Train the model
– Estimate parameters

• Test the model. For each student, for each problem:
– Estimate θ and P(Mi) via maximum likelihood
– Predict P(Mi+1) given the HMM dynamics
– Predict Ui+1. Does it match the actual Ui+1?

• Compare the combined model vs. just an IRT model

Page 41

Results

• The combined model achieved 72.5% cross-validation accuracy versus 72.0% for the IRT model
– The gap is not statistically significant

• Opportunities for improving the accuracy of the combined model:
– Longer sequences (per student)
– A better model of the dynamics, P(Mi+1 | Mi)

Page 42

Conclusions

• Proposed a new, flexible model to jointly estimate student motivation and ability
– Not separating ability from motivation conflates the two concepts
– Easily adjusted for other tutoring systems

• The combined model achieved accuracy similar to the IRT model

• Online inference runs in real time
– Implemented in Java; ran in one high school in May '06

Page 43

Introduction

Model Student Emotion

Model the Domain

Personalize Tutoring

Assess Learning

Agenda

Page 44

Sensors used in the classroom

Bayesian networks and Linear regression models

Page 45

Linear Models to Predict Emotions

Variables that help predict self-reports of emotion. The results suggest that emotion depends on the context in which it occurs (the math problem just solved) and can also be predicted from physiological activity captured by the sensors (bottom row).
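A linear model of this kind might look like the following sketch. The feature types echo the slide (problem context plus sensor readings), but every coefficient here is invented for illustration.

```python
# Hypothetical fitted linear model of a self-reported emotion
# (confidence); all weights are made up, not fitted to real data.
def predict_confidence(solved_last_problem, hints_used, skin_conductance):
    return (0.5
            + 0.30 * solved_last_problem   # context: outcome of last problem
            - 0.10 * hints_used            # context: help requested
            - 0.20 * skin_conductance)     # sensor: normalized arousal

# Solved the last problem unaided, low arousal -> higher confidence
print(round(predict_confidence(1, 0, 0.2), 2))  # 0.76
```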

Page 46

Introduction

Model the Student

Model the Domain

Personalize Tutoring

Assess Learning

Agenda

Intelligent Tutoring Systems

Learning@Scale

Page 47

Domain Model

Kurt VanLehn

Page 48

Domain Model

The Andes Bayesian network before (left) and after (right) the observation "A is a body." (Kurt VanLehn)

Page 49

Domain Model

Student actions (left) and the self-explanation model (right). The physics problem asks the student to find the tension force exerted on a person hanging by a rope tied to his waist. Assume the midshipman was named Jake.

Page 50
Page 51

Stephens, 2006

Page 52

Stephens, 2006

Page 53

Stephens, 2006

Page 54

Introduction

Model the Student

Model the Domain

Personalize Tutoring

Assess Learning

Agenda

Page 55

Predicting Student Time To Complete

Two agents were built to predict student time to solve problems (Beck et al., 2000).

1) Population student model (PSM): modeled how students interacted with the tutor. Built from data on the entire population of users, it took as input characteristics of the student and information about the problem to be solved, and output the expected time (in seconds) the student would need to solve that problem.

2) Pedagogical agent (PA): constructed a teaching policy. It was a reinforcement learning agent that reasoned about a student's knowledge and provided customized examples and hints tailored to each student (Beck and Woolf, 2001; Beck et al., 1999a, 2000).

Page 56

The tutor predicted a current student's reaction to a variety of teaching actions, such as presentation of a specific problem type (Beck et al., 2000).

Overview of the ADVISOR machine learning component in AnimalWatch.

Page 57

The tutor predicted a current student's reaction to a variety of teaching actions, such as presentation of a specific problem type. The model accounted for roughly 50% of the variance between the time the system predicted a student would spend on a problem and the actual time spent solving it.

(Beck et al, 2000)
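The "variance accounted for" figure is an R² statistic; a minimal sketch on hypothetical predicted and actual solve times:

```python
def variance_explained(predicted, actual):
    """R^2: the fraction of variance in actual solve times accounted
    for by the model's predictions."""
    mean = sum(actual) / len(actual)
    ss_tot = sum((y - mean) ** 2 for y in actual)
    ss_res = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    return 1.0 - ss_res / ss_tot

# Hypothetical actual vs. predicted solve times, in seconds
actual_times = [30.0, 45.0, 60.0, 20.0]
predicted_times = [32.0, 40.0, 55.0, 25.0]
print(round(variance_explained(predicted_times, actual_times), 2))  # 0.91
```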

Page 58

ADVISOR predicted student response time using its population student model

Page 59

Cycle Network

Cycle network in DT tutor. The network is rolled out to three time periods representing current, possible, and projected student actions. (From Murray et al., 2004.)

Page 60

Models Being Evaluated (Sarah Schultz, WPI)

Which model, learned over data, best predicts future performance?

A few issues to solve

Page 61

Problem Selection Within a Topic

Arroyo et al.; EDM Journal effort

Page 62

Pedagogical Moves, Dynamically Adjusted: Empirically based estimates of effort lead to adjusted problem difficulty and other affective and meta-cognitive feedback.

Page 63

What is "normal" behavior? For EACH problem pi, i = 1, …, N (N = total problems in the system), look across the whole population of students who used the problem and estimate the expected number of incorrect attempts E(Ii), expected hints E(Hi), and expected time E(Ti), together with lower/upper spread bands (IL/IH, HL/HH, TL/TH).

[Figure: population histograms of incorrect attempts, hints, and time (each bar = 5 seconds), with expected values and bands marking behavior within expectation]

A new student encounters this problem… Is their behavior within expectation, or atypical?

Page 64

What is odd behavior? In any problem pi, i = 1, …, N, flag the pattern of few incorrect attempts, lots of hints, and little time:

– Attempts < E(Ii) - IL
– Hints > E(Hi) + HH
– Time < E(Ti) - TL
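The three-part rule can be sketched directly; all the expectation and band values below are hypothetical.

```python
def is_odd(attempts, hints, time_s, e_i, i_l, e_h, h_h, e_t, t_l):
    """Flag the 'odd' pattern: few incorrect attempts, lots of hints,
    and little time, relative to the population expectations
    E(I), E(H), E(T) for this problem and their spread bands."""
    return (attempts < e_i - i_l
            and hints > e_h + h_h
            and time_s < e_t - t_l)

# Hypothetical population statistics for one problem
expect = dict(e_i=2.0, i_l=1.0, e_h=1.5, h_h=1.0, e_t=30.0, t_l=10.0)
print(is_odd(attempts=0, hints=4, time_s=8.0, **expect))   # True
print(is_odd(attempts=2, hints=1, time_s=25.0, **expect))  # False
```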

Page 65

Increasing Problem Difficulty: at the next time step, assuming we know the difficulty of each item, move from the last problem seen some steps up a sorted list of harder math problems (from easiest to hardest of all). A step parameter (shown as H = 3) sets the challenge rate.

Page 66

Decreasing Problem Difficulty: at the next time step, assuming we know the difficulty of each item, move from the last problem seen some steps down a sorted list of easier math problems (from hardest to easiest of all), with a step parameter (shown as E = 3).

Page 67

Introduction

Model the Student

Model the Domain

Personalize Tutoring

Assess Learning

Agenda

Learning@Scale

Page 68

Stanford’s Computer Science Course

Machine learning techniques were used to autonomously create a graphical model of how students in an introductory programming course progress through the homework assignment.

Machine learning algorithms found patterns in how students solved the Checkerboard Karel problem. These patterns predicted how well students would perform on the class midterm better than the grades students received on the assignment did. The algorithm captured a meaningful general trend in how students solve programming problems.

Piech, C., Sahami, M., Koller, D., Cooper, S., & Blikstein, P. (2012, February). Modeling how students learn to program. In Proceedings of the 43rd ACM technical symposium on Computer Science Education (pp. 153-160). ACM.

Page 69

Student Modeling in Computer Programming

Bag of Words Difference: Researchers first built histograms of the different key words used in a computer program and used the Euclidean distance between two histograms as a naïve measure of the dissimilarity. This is akin to distance measures of text commonly used in information retrieval systems.
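A sketch of the bag-of-words dissimilarity, assuming a small Karel-style keyword set and two toy "programs" (both invented here):

```python
import math
from collections import Counter

def keyword_distance(prog_a, prog_b, keywords):
    """Euclidean distance between keyword-count histograms of two
    programs: the bag-of-words dissimilarity described above."""
    ca, cb = Counter(prog_a.split()), Counter(prog_b.split())
    return math.sqrt(sum((ca[k] - cb[k]) ** 2 for k in keywords))

# Karel-style keyword set and two toy programs
KEYWORDS = ["move", "turnLeft", "putBeeper", "while", "if"]
a = "while if move move putBeeper"
b = "while move putBeeper"
print(round(keyword_distance(a, b, KEYWORDS), 3))  # 1.414 (one extra if, one extra move)
```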

Application Program Interface (API) Call Dissimilarity: They ran each program with standard inputs and recorded the resulting sequence of API calls. They used Needleman-Wunsch global DNA alignment to measure the difference between the lists of API calls generated by the two programs.

Piech, C., Sahami, M., Koller, D., Cooper, S., & Blikstein, P. (2012, February). Modeling how students learn to program. In Proceedings of the 43rd ACM technical symposium on Computer Science Education (pp. 153-160). ACM.
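The Needleman-Wunsch comparison of API-call sequences can be sketched with the textbook dynamic program; the scoring weights below are illustrative, not the authors' exact scheme.

```python
def nw_score(seq_a, seq_b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment score between two sequences
    of API calls (higher = more similar)."""
    n, m = len(seq_a), len(seq_b)
    # dp[i][j] = best score aligning seq_a[:i] with seq_b[:j]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            dp[i][j] = max(dp[i - 1][j - 1] + sub,   # align both calls
                           dp[i - 1][j] + gap,       # gap in seq_b
                           dp[i][j - 1] + gap)       # gap in seq_a
    return dp[n][m]

calls_a = ["move", "move", "putBeeper", "turnLeft"]
calls_b = ["move", "putBeeper", "turnLeft"]
print(nw_score(calls_a, calls_b))  # 2: three matches, one gap
```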

Page 70

Hidden Markov Model

The first step in their student modeling process was to learn a high level representation of how each student progressed through the checkerboard Karel assignment. To learn this representation they modeled a student’s progress as a Hidden Markov Model (HMM) [17].

Learning an HMM: each state of the HMM becomes a node in the FSM, and the weight of a directed edge from one node to another gives the probability of transitioning from one state to the next. The figure shows the HMM of state transitions for a given student: the node code_t denotes the student's code snapshot at time t, the node state_t denotes the high-level milestone the student is in at time t, and N is the number of snapshots for the student.

Piech, C., Sahami, M., Koller, D., Cooper, S., & Blikstein, P. (2012, February). Modeling how students learn to program. In Proceedings of the 43rd ACM technical symposium on Computer Science Education (pp. 153-160). ACM.

Page 71

Dissimilarity Matrix

Clustering on a sample of 2000 random snapshots from the training set returned a group of well-defined snapshot clusters (see Figure 2). The value of K that maximized silhouette score (a measure of how natural the clustering was) was 26 clusters. A visual inspection of these clusters confirmed that snapshots which clustered together were functionally similar pieces of code.

Dissimilarity matrix for clustering of 2000 snapshots. Each row and column in the matrix represents a snapshot and the entry at row i, column j represents how similar snapshot i and j are (dark means more similar)

Piech, C., Sahami, M., Koller, D., Cooper, S., & Blikstein, P. (2012, February). Modeling how students learn to program. In Proceedings of the 43rd ACM technical symposium on Computer Science Education (pp. 153-160). ACM.

Page 72

The finite set of high-level states, or milestones, that a student could be in. A state is defined by a set of snapshots that all came from the same milestone. The transition probability is the probability of being in a state given the state the student was in during the previous unit of time.

The emission probability is the probability of seeing a specific snapshot given that the student is in a particular state. To calculate it, each state was interpreted as emitting snapshots with normally distributed dissimilarities. In other words, given the dissimilarity between a particular snapshot of student code and a state's "representative" snapshot, we can calculate the probability that the snapshot came from a given state using a normal distribution over that dissimilarity.

Piech, C., Sahami, M., Koller, D., Cooper, S., & Blikstein, P. (2012, February). Modeling how students learn to program. In Proceedings of the 43rd ACM technical symposium on Computer Science Education (pp. 153-160). ACM.
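A sketch of the Gaussian-emission idea: score a snapshot against each state's representative by its dissimilarity, assuming a single shared sigma (the paper fits per-state parameters).

```python
import math

def emission_prob(dissimilarity, sigma=1.0):
    """Gaussian density (up to a constant) over the dissimilarity
    between a snapshot and a state's representative snapshot."""
    return math.exp(-dissimilarity ** 2 / (2 * sigma ** 2))

def most_likely_state(dissims):
    """Pick the state whose representative is closest to the observed
    snapshot in the Gaussian-emission sense."""
    return max(dissims, key=lambda s: emission_prob(dissims[s]))

# Dissimilarity of one snapshot to three states' representatives
snapshot = {"milestone-1": 4.0, "milestone-2": 0.5, "milestone-3": 2.0}
print(most_likely_state(snapshot))  # milestone-2
```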

Page 73

The landscape of solutions for "gradient descent for linear regression," representing over 40,000 student code submissions, with edges drawn between syntactically similar submissions and colors corresponding to performance on a battery of unit tests (red submissions passed all unit tests).

Huang, J., Piech, C., Nguyen, A., & Guibas, L. (2013, June). Syntactic and functional variability of a million code submissions in a machine learning MOOC. In AIED 2013 Workshops Proceedings (p. 25).

Stanford's MOOC: Teaching Machine Learning Topics

Page 74

Hour of Code Challenge: Modeling How Young Students Learn to Program

Page 75

Code.org problem-solving graph of a learned policy for how to solve a single open-ended programming assignment, built from over 1M users. Each node is a unique partial solution; node 0 is the correct answer.

Chris Piech, Stanford Ph.D. student

Correct Answer

Arc: the next solution an expert would recommend
Node: a unique partial solution

Page 76

Improved Retention

Code.org gathered over 137 million partial solutions. Not all students made it through the entire Hour of Code, but retention was quite high relative to other contemporary open-access courses.

Page 77

63K Peer Grades for 7K Students

Blue blob: Student A
Red circles: students who were graded by Student A
Red squares: students who graded Student A

A Coursera course to teach HCI. Peer-grading network of 63K peer grades for 7K students. A single student is highlighted: red squares graded the student; red circles were graded by the student.

Chris Piech, Stanford Ph.D. student

Page 78

Lan, A. S., Studer, C., Waters, A. E., & Baraniuk, R. G. (2013). Joint topic modeling and factor analysis of textual information and graded response data. arXiv preprint arXiv:1305.1956.

Circles: concepts
Squares: questions
Edges: strong question-concept relationship

Page 79

Introduction

Model the Student

Model the Domain

Personalize Tutoring

Assess Learning

Agenda

Intelligent Tutoring Systems

Learning@Scale

Page 80

Long-Term Goal

Millions of schoolchildren will have access to what Alexander the Great enjoyed as a royal prerogative: “the personal services of a tutor as well informed as Aristotle”

Pat Suppes, Stanford University, 1966 (died Nov 2014)

“Students will have instant access to vast stores of knowledge through their computerized tutors”

Page 81


Page 82

Thank You! Any Questions?

Learning to Teach: Machine Learning Techniques to Improve Instruction

NIPS 2014 Workshop on Human Propelled Machine Learning

Dec 13, 2014