
Page 1: Educational Data Mining  and DataShop


Educational Data Mining and DataShop

John Stamper
Carnegie Mellon University

9/12/2012 PSLC Corporate Partner Meeting 2012

Page 2: Educational Data Mining  and DataShop

The Classroom of the Future

Which picture represents the “Classroom of the Future”?


Page 3: Educational Data Mining  and DataShop


The Classroom of the Future

The answer is both! It depends on how much money you have...

… but maybe not what you think…


Page 4: Educational Data Mining  and DataShop


The Classroom of the Future

Rich vs. Poor
– Poor kids will be forced to rely on “cheap” technology
– Rich kids will have access to “expensive” teachers

We are seeing this today!
– Waldorf school in Silicon Valley (no technology)
– NGLC Wave III Grants
– MOOCs (AI Course at Stanford)
– Growth of adaptive technology companies
– Online instruction
– … and more…


Page 5: Educational Data Mining  and DataShop


What does this mean?

My view is that we cannot stop this; I believe we must accept that economics will force this route.

We should focus on improving learning technology
• New ways to improve teacher-student access
• Add more adaptive features to learning software

Intelligent Tutors, at scale, using data!


Page 6: Educational Data Mining  and DataShop


Educational Data Mining

• “Educational Data Mining is an emerging discipline, concerned with developing methods for exploring the unique types of data that come from educational settings, and using those methods to better understand students, and the settings which they learn in.” – www.educationaldatamining.org

Page 7: Educational Data Mining  and DataShop


Classes of EDM Methods (Baker & Yacef, 2009)

• Prediction
• Clustering
• Relationship Mining
• Discovery with Models
• Distillation of Data for Human Judgment

Page 8: Educational Data Mining  and DataShop


Prediction

• Develop a model which can infer a single aspect of the data (predicted variable) from some combination of other aspects of the data (predictor variables)

• Does a student know a skill?
• Which students are off-task?
• Which students will fail the class?
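One widely used way to approach the first question, whether a student knows a skill, is Bayesian Knowledge Tracing (BKT). The slides do not name a specific method, so the sketch below is only an illustration: the function names and the parameter values (`p_init`, `p_learn`, `p_guess`, `p_slip`) are assumptions chosen for the example, not values fitted to any dataset.

```python
def bkt_update(p_know, correct, p_learn=0.1, p_guess=0.2, p_slip=0.1):
    """One Bayesian Knowledge Tracing step: return the updated P(skill known).

    All parameter defaults are illustrative assumptions, not fitted values.
    """
    if correct:
        # A correct answer comes from knowing (and not slipping) or from guessing.
        evidence = p_know * (1 - p_slip)
        posterior = evidence / (evidence + (1 - p_know) * p_guess)
    else:
        # An error comes from slipping or from not knowing (and not guessing).
        evidence = p_know * p_slip
        posterior = evidence / (evidence + (1 - p_know) * (1 - p_guess))
    # Allow for the chance that the student learned the skill at this step.
    return posterior + (1 - posterior) * p_learn


def trace_skill(responses, p_init=0.3, **params):
    """Return P(known) after each observed response (True means correct)."""
    p_know = p_init
    history = []
    for correct in responses:
        p_know = bkt_update(p_know, correct, **params)
        history.append(p_know)
    return history


if __name__ == "__main__":
    for step, p in enumerate(trace_skill([False, True, True, True]), start=1):
        print(f"after step {step}: P(known) = {p:.3f}")
```

The predicted variable here is the hidden "knows the skill" state; the predictor variables are the observed correct and incorrect responses.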

Page 9: Educational Data Mining  and DataShop


Clustering

• Find points that naturally group together, splitting the full data set into a set of clusters

• Usually used when nothing is known about the structure of the data
– What behaviors are prominent in the domain?
– What are the main groups of students?
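As a rough illustration of grouping students without prior labels, here is a minimal k-means sketch. K-means is one common clustering algorithm; the slides do not say which one is meant. The two features (hints per problem, seconds per step) and all data values are hypothetical.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain k-means on tuples of floats; the first k points seed the centroids."""
    if k > len(points):
        raise ValueError("k cannot exceed the number of points")
    centroids = [tuple(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return assignment, centroids


# Hypothetical per-student features: (hints per problem, seconds per step).
students = [(0.2, 15.0), (4.0, 60.0), (0.3, 18.0), (3.8, 55.0), (0.1, 12.0), (4.2, 65.0)]

if __name__ == "__main__":
    labels, centers = kmeans(students, k=2)
    print(labels)
    print(centers)
```

In practice the features would usually be rescaled first, since a feature with a large numeric range (seconds here) otherwise dominates the distance.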

Page 10: Educational Data Mining  and DataShop


Relationship Mining

• Discover relationships between variables in a data set with many variables
– Association rule mining
– Correlation mining
– Sequential pattern mining
– Causal data mining
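For association rule mining, the two basic quantities are the support and confidence of a rule "if A then B". The sketch below computes both for a single candidate rule; the session data and behavior names (`hint_request`, `error`, `off_task`) are made up for illustration.

```python
def rule_stats(transactions, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent.

    `transactions` is a non-empty list of sets of observed items.
    """
    antecedent, consequent = set(antecedent), set(consequent)
    n_antecedent = sum(1 for t in transactions if antecedent <= t)
    n_both = sum(1 for t in transactions if (antecedent | consequent) <= t)
    # Support: share of all transactions that contain both sides of the rule.
    support = n_both / len(transactions)
    # Confidence: share of antecedent transactions that also contain the consequent.
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


# Hypothetical behaviors observed in four tutoring sessions.
sessions = [
    {"hint_request", "error"},
    {"hint_request", "error", "off_task"},
    {"hint_request"},
    {"error"},
]

if __name__ == "__main__":
    support, confidence = rule_stats(sessions, {"hint_request"}, {"error"})
    print(f"support={support:.2f} confidence={confidence:.2f}")
```

A full association rule miner would enumerate candidate rules and keep those above chosen support and confidence thresholds; this sketch only scores one rule.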

Page 11: Educational Data Mining  and DataShop


Discovery with Models

• Pre-existing model (developed with EDM prediction methods… or clustering… or knowledge engineering)

• Applied to data and used as a component in another analysis

Page 12: Educational Data Mining  and DataShop


Distillation of Data for Human Judgment

• Making complex data understandable by humans to leverage their judgment

• Text replays are a simple example of this
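A text replay, roughly, presents a segment of logged student actions as readable text so that a human coder can judge it. The sketch below shows the general idea only; the field names (`time`, `student`, `step`, `input`, `outcome`) and the output layout are assumptions, not the format of any particular tool.

```python
def text_replay(actions):
    """Render logged tutor actions as readable lines for a human coder.

    Each action is a dict with hypothetical keys:
    time (seconds), student, step, input, outcome.
    """
    lines = []
    previous_time = None
    for a in actions:
        # Show elapsed time since the previous action; long pauses can be informative.
        gap = 0.0 if previous_time is None else a["time"] - previous_time
        previous_time = a["time"]
        lines.append(
            f"+{gap:5.1f}s  {a['student']}  {a['step']}: "
            f"entered '{a['input']}' -> {a['outcome']}"
        )
    return "\n".join(lines)


# Made-up log segment for one student.
segment = [
    {"time": 0.0, "student": "stu01", "step": "x+3=7", "input": "x=5", "outcome": "INCORRECT"},
    {"time": 12.5, "student": "stu01", "step": "x+3=7", "input": "hint", "outcome": "HINT"},
    {"time": 14.0, "student": "stu01", "step": "x+3=7", "input": "x=4", "outcome": "CORRECT"},
]

if __name__ == "__main__":
    print(text_replay(segment))
```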

Page 13: Educational Data Mining  and DataShop


Knowledge Engineering

• Creating a model by hand rather than automatically fitting a model

• In one comparison, knowledge engineering led to a worse fit to gold-standard labels of the construct of interest than data mining did (Roll et al., 2005), but showed similar qualitative performance

Page 14: Educational Data Mining  and DataShop


LearnLab

• The LearnLab has played a pivotal role in the creation of the EDM community

• The CMDM thrust of the center focuses on Educational Data Mining

• DataShop is also a key tool for the EDM community

Page 15: Educational Data Mining  and DataShop


DataShop

• Open repository for educational data

• Many large-scale datasets both public and private

• Tools for
– exploratory data analysis
– learning curves
– domain model testing
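To make the learning curve idea concrete: a learning curve plots the error rate against the number of practice opportunities a student has had on a skill. The sketch below computes such a curve from a simplified log; the `(student, skill, correct)` row format and the data are assumptions for illustration and do not reflect DataShop's actual export format.

```python
from collections import defaultdict


def learning_curve(attempts):
    """Error rate by practice opportunity, pooled over students and skills.

    `attempts` is a time-ordered list of (student, skill, correct) tuples.
    """
    seen = defaultdict(int)    # (student, skill) -> opportunities so far
    errors = defaultdict(int)  # opportunity number -> number of errors
    totals = defaultdict(int)  # opportunity number -> number of attempts
    for student, skill, correct in attempts:
        seen[(student, skill)] += 1
        opportunity = seen[(student, skill)]
        totals[opportunity] += 1
        if not correct:
            errors[opportunity] += 1
    return {opp: errors[opp] / totals[opp] for opp in sorted(totals)}


# Made-up log: two students practicing one skill three times each.
log = [
    ("s1", "add", False), ("s2", "add", False),
    ("s1", "add", False), ("s2", "add", True),
    ("s1", "add", True), ("s2", "add", True),
]

if __name__ == "__main__":
    for opportunity, error_rate in learning_curve(log).items():
        print(opportunity, error_rate)
```

A curve that falls steadily with practice suggests the skill is well defined in the domain model; a flat or jagged curve can hint that the skill should be split or merged.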

Page 16: Educational Data Mining  and DataShop


DataShop

• Import/Export of data

• Custom fields

• Easy Knowledge Model creation and validation

• Web services for tools integration

Page 17: Educational Data Mining  and DataShop


Demo


Page 18: Educational Data Mining  and DataShop


Engaging the KDD/ICDM Community

• Some hesitation from these groups
– Educational data not interesting
– Too applied
– Not “big” enough for eScience

• This was one motivation for the 2010 KDD Cup

Page 19: Educational Data Mining  and DataShop


KDD Cup Competition

• Knowledge Discovery and Data Mining (KDD) is the most prestigious conference in the data mining and machine learning fields
• KDD Cup is the premier data mining challenge
• 2010 KDD Cup called “Educational Data Mining Challenge”
• Ran from April 2010 through June 2010

Page 20: Educational Data Mining  and DataShop


KDD Cup Competition

Competition goal is to predict student responses given tutor data provided by Carnegie Learning

Dataset                        Students   Steps        File size
Algebra I 2008-2009            3,310      9,426,966    3 GB
Bridge to Algebra 2008-2009    6,043      20,768,884   5.43 GB
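Predictions of student responses like these are commonly scored with root mean squared error (RMSE) between the predicted probability of a correct response and the actual 0/1 outcome. The slides do not state the scoring metric, so treat RMSE here as an assumed example; the function name is likewise hypothetical.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between predicted probabilities and 0/1 outcomes."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Always predicting 0.5 gives an RMSE of 0.5 whatever the outcomes are.
    print(rmse([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 1]))
```

Lower is better: a perfect predictor scores 0, and a constant guess of 0.5 is a useful baseline to beat.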

Page 21: Educational Data Mining  and DataShop


KDD Cup Competition

• 655 registered participants
• 130 participants who submitted predictions
• 3,400 submissions

Page 22: Educational Data Mining  and DataShop


KDD Cup Competition

• Advances in prediction and cognitive modeling
• Excitement in the KDD Community
• The datasets are now in the “wild” and showing up in non-KDD conferences
• New competitions have been done and are in the works

Page 23: Educational Data Mining  and DataShop


Opportunities

• Huge potential for EDM and DataShop to improve educational systems

• DataShop is open and staff is available to help get users started

• Great option for creating capstone projects


Page 24: Educational Data Mining  and DataShop


EDM Community is Online!

www.educationaldatamining.org

EDM 2013 in Memphis, TN, in July

Questions: [email protected]