2 - 2 - W1 V1- Big Data in Education (6-55)

  • 8/13/2019 2 - 2 - W1 V1- Big Data in Education (6-55)

    Hello. I'm Ryan Baker and this is Big Data in Education. In this class, you'll learn methods used for exploring big data in education, an area called educational data mining or learning analytics.

    These two communities have the joint goal of exploring the big data now available on learners and learning. To promote new scientific discoveries and to advance the sciences of learning. To promote better assessment of learners along multiple dimensions, including the social, cognitive, emotional, and meta-cognitive dimensions, and at multiple levels, including the individual, the group, the institution, and so on. And to use these findings to promote better real-time support for learners.

    Art Graesser, the editor of the Journal of Educational Psychology, said that EDM and learning analytics is escalating the speed of research on many problems in education. Not only can you look at the unique learning trajectories of individuals, but the sophistication of the models goes up enormously. I say that EDM and learning analytics is, and I quote, great.

    Despite the area's newness, we've learned a few things about key problems. And this course is about methods that have been found to be useful for those problems by EDM and learning analytics researchers. For a more theoretical overview of the field, you might also be interested in the Society for Learning Analytics Research's MOOC.

    Now, where do these methods come from? Some of the methods will be familiar to someone who's got a background in data mining or machine learning. Some of the other methods will be familiar to someone with a background in psychometrics or traditional statistics.

    You don't have to have either of these backgrounds to get something useful out of this course. Regardless of what background you have, what I'd encourage you to do is pick and choose what you find most useful.

    A few words for data miners. You're going to find that there are some current trends in data mining that aren't represented. Some of them haven't gotten here yet. And some of them have gotten here, but they just haven't been very useful yet. Educational data's big, but it's not Google big. In this course, I'm going to be focusing on the methods of broadest usefulness, not on coolest newestness.

    Also, you may find that some classic algorithms aren't very well represented. One example's neural networks. They haven't been all that heavily used in educational data mining, and one reason is that over-fitting is a plague in the highly context-based, and not that big, data sets we use.

    Now, you may notice I've said a couple times that our data sets aren't that big, but the name of the course is Big Data in Education. I see you all running to hand back your money that you didn't pay for this course. Well, big data in education is big. It's big by comparison to most classical education research, and it's actually big compared to common data sets in many domains. But it's not Human Genome Project or Google big. It is big enough, for example, that differences in R squared of 0.0019 routinely come up as statistically significant. I will talk about statistical significance sometimes, but it's not going to be a focus of this class.
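
    A minimal sketch of why such a tiny effect can still pass a significance test once the data set is large. This is a simplification, not the lecture's own calculation: it tests a single R squared of 0.0019 against zero for one predictor (rather than a difference between two models), and it uses a normal approximation to the t distribution. The function name and the sample sizes are illustrative assumptions.

```python
import math


def approx_p_value(r_squared: float, n: int) -> float:
    """Approximate two-sided p-value for H0: no linear relationship.

    Assumes a single predictor, so t = r * sqrt((n - 2) / (1 - r^2)),
    and replaces the t distribution with a standard normal, which is a
    reasonable approximation for the large n values used below.
    """
    r = math.sqrt(r_squared)
    t = r * math.sqrt((n - 2) / (1.0 - r_squared))
    # Two-sided tail probability of a standard normal at t.
    return math.erfc(t / math.sqrt(2.0))


# Same effect size, increasingly large (hypothetical) sample sizes.
for n in (1_000, 10_000, 100_000):
    print(n, approx_p_value(0.0019, n))
```

    Under these assumptions the p-value shrinks as n grows even though the effect size never changes, which is one reason significance alone says little about importance in data sets of this size.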

    We'll briefly talk about some of the types of educational data mining and learning analytics methods. This comes from a review by me and George Siemens, which built off a separate review by me and Kalina Yacef.

    One of the key types of educational data mining, and one we'll focus on early in the class, is prediction. In prediction, we develop a model which can infer a single aspect of the data, which we call the predicted variable, from some combination of other aspects of the data, the predictor variables. So, for example, we might ask, which students are off-task, or which students will fail the class?

    Another type of method is structure discovery, where we try to find structure and patterns in the data that emerge naturally, so to speak. They come out of the data; we don't tell the data what we're looking for. There's no specific target or predictor variable. Examples of this include clustering, factor analysis, and domain structure discovery.
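
    To make the prediction setup described above concrete, here is a minimal sketch that infers one predicted variable (whether a student fails) from two predictor variables. Everything in it is an assumption for illustration: the data is invented, the predictor names are hypothetical, and logistic regression trained by gradient descent is just one simple choice of model, not a method the lecture prescribes.

```python
import math

# Invented toy data, for illustration only.
# Predictor variables: (fraction of answers correct, help requests per problem)
# Predicted variable: 1 if the student failed the class, else 0
STUDENTS = [
    ((0.90, 0.1), 0),
    ((0.80, 0.3), 0),
    ((0.85, 0.2), 0),
    ((0.70, 0.5), 0),
    ((0.40, 1.2), 1),
    ((0.30, 1.5), 1),
    ((0.50, 1.0), 1),
    ((0.35, 0.9), 1),
]


def predict_probability(weights, bias, features):
    """Estimated probability that the predicted variable equals 1."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def train(data, learning_rate=0.5, epochs=5000):
    """Fit logistic regression weights by batch gradient descent."""
    n_features = len(data[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for features, label in data:
            # Gradient of the log loss with respect to the linear score.
            error = predict_probability(weights, bias, features) - label
            for j, x in enumerate(features):
                grad_w[j] += error * x
            grad_b += error
        # Step against the mean gradient over the whole data set.
        weights = [w - learning_rate * g / len(data)
                   for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(data)
    return weights, bias


weights, bias = train(STUDENTS)

# A hypothetical new student: low accuracy, many help requests.
new_student = (0.45, 1.1)
print(predict_probability(weights, bias, new_student))
```

    The model only ever sees the predictor variables for the new student; the failure probability it prints is the inferred value of the predicted variable.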

    In relationship mining, a third key type of educational data mining method, we discover relationships between variables in a data set with many variables. And in discovery with models, we take a pre-existing model, developed with prediction methods, or clustering, or knowledge engineering, or whatever, apply it to data, and use it as a component in another analysis.

    One question that's worth asking is, why now? Why has educational data mining only been emerging over the last five, six, eight years? Why didn't it emerge in the early 1980s like bioinformatics? Well, there's a lot of reasons. One of them is that there was not enough data early on. In the 1980s, collecting educational data was highly resource-intensive and difficult to scale. And most of the data that was easily collectible was purely summative in nature. Gathering data on learning processes and learning behaviors in field settings required methods like quantitative field observations, video recordings, and think-aloud studies, and none of them scale very easily.

    Fast forward to today.

    There's still lots of standardized exams, probably more than a couple decades ago, and they're still summative in nature. But lots of students now use internet-based educational software in class, and that can be used to get at learning processes and learner behaviors at a fine-grained scale. We can log student behavior at a second-by-second level. And the data acquisition is very scalable. And even beyond that, there are these things called MOOCs, which you may have heard of.

    One great source for this educational data is the Pittsburgh Science of Learning Center DataShop. I love DataShop; it's the world's leading public repository for educational software interaction data. Last I heard, it's got over 250,000 hours of data from students using educational software, with over 30 million student actions, responses, and system annotations. Actions include things like entering an equation, typing a phrase into a learning system, or requesting help. Responses include error feedback and strategic hints. Annotations include things like: was the student's action correct, how long did the student's action take, and what was the relevant skill or concept at the time.

    Let me say a quick word about tools. There are a bunch of tools you can use in this class, and for the most part, a few assignments aside, I don't have strong requirements about what tools you choose to use. We'll talk about tools during the course. You may want to think about downloading or setting up accounts for tools like RapidMiner 5.3, which is probably going to be our biggest tool in the course. SAS OnDemand for Academics. Weka. Microsoft Excel, a great tool if you've got it. Microsoft Excel is great for prototyping stuff, figuring out what you want to do more thoroughly afterwards. Java. MATLAB.

    No hurry, but keep this in mind; some of this may become more useful later in the course.

    Some closing thoughts. Educational data mining and learning analytics methods are emerging for big data in education. In this class, you'll learn the key methods, and how to use them to promote scientific discovery and to drive intervention and improvement in educational software and systems. We'll discuss the strengths and weaknesses of methods for different applications, towards answering the question: is your analysis trustworthy, and is it applicable?

    So thank you very much for coming today. Welcome to Big Data in Education. I look forward to having you in the class. Thank you very much.