46
1

Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL

Embed Size (px)

DESCRIPTION

Presentation give at the dataTEl Marketplace at the RecSysTEL workshop a

Citation preview

Page 1: Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL

1

Page 2: Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL

2http://www.flickr.com/photos/traftery/4773457853/sizes/lby Tom Raftery

Free the data

Page 3: Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL

2http://www.flickr.com/photos/traftery/4773457853/sizes/lby Tom Raftery

Why ?

Page 4: Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL

2http://www.flickr.com/photos/traftery/4773457853/sizes/lby Tom Raftery

Because we will get new

insights

Page 5: Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL

Hendrik DrachslerCentre for Learning Sciences and Technology@ Open University of the Netherlands

Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL

3

29.09.2010 RecSysTEL workshop at RecSys and ECTEL conference, Barcelona, Spain

#dataTEL

Page 6: Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL

What is dataTEL

•dataTEL is a Theme Team funded by the STELLAR network of excellence.

•It address 2 STELLAR Grand Challenges 1. Connecting Learner2. Contextualisation

4

Page 7: Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL

Recommender in TEL

5

Page 8: Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL

The TEL recommender are a bit like this...

6

Page 9: Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL

The TEL recommender are a bit like this...

6

We need to select for each application an appropriate recsys that fits its needs.

Page 10: Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL

But...

7

Kaptain Koboldhttp://www.flickr.com/photos/kaptainkobold/3203311346/

“The performance results of different research efforts in recommender systems are hardly comparable.”

(Manouselis et al., 2010)

Page 11: Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL

But...

7

Kaptain Koboldhttp://www.flickr.com/photos/kaptainkobold/3203311346/

“The performance results of different research efforts in recommender systems are hardly comparable.”

(Manouselis et al., 2010)

The TEL recommender experiments lack transparency. They need to be repeatable to test:

• Validity• Verification• Compare results

Page 12: Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL

How others compare their recommenders

8

Page 13: Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL

What kind of data set we can expect in TEL

9

Find an appropriate recommendation system for a set of goals, tasks, limitations, and constraints.More accurately – choose the most appropriate

system from a set of candidatesMajor claim: there is no silver bullet – a

system that is both the most accurate, the fastest, the cheapest, …

We need to select for each application an appropriate recsys that fits its needs.

Graphic by J. Cross, Informal learning, Pfeifer (2006)

Page 14: Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL

The Long Tail

10Graphic Wilkins, D., (2009); Long tail concept Anderson, C. (2004)

Page 15: Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL

The Long Tail

10Graphic Wilkins, D., (2009); Long tail concept Anderson, C. (2004)

The Long Tail of Learning

Page 16: Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL

The Long Tail

10Graphic Wilkins, D., (2009); Long tail concept Anderson, C. (2004)

The Long Tail of LearningFormal Data

Informal Data

Page 17: Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL

11

Guidelines for Data Set Aggregation

Justin Marshall, Coded Ornament by rootoftwo http://www.flickr.com/photos/rootoftwo/267285816

Page 18: Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL

11

Guidelines for Data Set Aggregation

1. Create a data set that realistically reflects the variables of the learning setting.

Justin Marshall, Coded Ornament by rootoftwo http://www.flickr.com/photos/rootoftwo/267285816

Page 19: Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL

11

Guidelines for Data Set Aggregation

1. Create a data set that realistically reflects the variables of the learning setting.

2. Use a sufficiently large set of user profiles

Justin Marshall, Coded Ornament by rootoftwo http://www.flickr.com/photos/rootoftwo/267285816

Page 20: Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL

11

Guidelines for Data Set Aggregation

1. Create a data set that realistically reflects the variables of the learning setting.

2. Use a sufficiently large set of user profiles

3. Create data sets that are comparable to others

Justin Marshall, Coded Ornament by rootoftwo http://www.flickr.com/photos/rootoftwo/267285816

Page 21: Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL

11

Guidelines for Data Set Aggregation

1. Create a data set that realistically reflects the variables of the learning setting.

2. Use a sufficiently large set of user profiles

3. Create data sets that are comparable to others

Justin Marshall, Coded Ornament by rootoftwo http://www.flickr.com/photos/rootoftwo/267285816

Page 22: Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL

Standardize research on recommender systems in TEL

Five core questions:

1.How can data sets be shared according to privacy and legal protection rights?

2.How to development a policy to use and share data sets?3.How to pre-process data sets to make them suitable for

other researchers?4.How to define common evaluation criteria for TEL

recommender systems?5.How to develop overview methods to monitor the

performance of TEL recommender systems on data sets?

12

dataTEL::Objectives

Page 23: Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL

13

1. Protection Rights

Page 24: Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL

13

1. Protection Rights

Page 25: Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL

13

1. Protection Rights

O V E R S H A R I N G

Page 26: Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL

13

1. Protection Rights

O V E R S H A R I N GWere the founders of PleaseRobMe.com actually allowed to grab the data from the web and present it in that way?

Page 27: Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL

13

1. Protection Rights

O V E R S H A R I N GWere the founders of PleaseRobMe.com actually allowed to grab the data from the web and present it in that way?

Are we allowed to use data from social handles and reuse it for research purposes?

Page 28: Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL

13

1. Protection Rights

O V E R S H A R I N GWere the founders of PleaseRobMe.com actually allowed to grab the data from the web and present it in that way?

Are we allowed to use data from social handles and reuse it for research purposes?

and there is more company protection rights!

Page 29: Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL

14

2. Policies to use Data

Page 30: Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL

14

2. Policies to use Data

Page 31: Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL

14

2. Policies to use Data

Page 32: Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL

14

2. Policies to use Data

Page 33: Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL

14

2. Policies to use Data

Page 34: Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL

14

2. Policies to use Data

Page 35: Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL

14

2. Policies to use Data

Page 36: Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL

15

3. Pre-Process Data Sets

For formal data sets from LMS:

1. Data storing scripts2. Anonymisation scripts

For informal data sets:

1. Collect data2. Process data3. Share data

Page 37: Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL

16

4. Evaluation Criteria

Kirkpatrick model by Manouselis et al. 2010

Combine approach by Drachsler et al. 2008

Page 38: Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL

16

4. Evaluation Criteria

Kirkpatrick model by Manouselis et al. 2010

Combine approach by Drachsler et al. 2008

1. Accuracy2. Coverage3. Precision

Page 39: Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL

16

4. Evaluation Criteria

Kirkpatrick model by Manouselis et al. 2010

Combine approach by Drachsler et al. 2008

1. Accuracy2. Coverage3. Precision

1. Effectiveness of learning2. Efficiency of learning 3. Drop out rate4. Satisfaction

Page 40: Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL

16

4. Evaluation Criteria

Kirkpatrick model by Manouselis et al. 2010

Combine approach by Drachsler et al. 2008

1. Accuracy2. Coverage3. Precision

1. Effectiveness of learning2. Efficiency of learning 3. Drop out rate4. Satisfaction

1. Reaction of learner2. Learning improved 3. Behaviour 4. Results

Page 41: Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL

16

4. Evaluation Criteria

Kirkpatrick model by Manouselis et al. 2010

Combine approach by Drachsler et al. 2008

1. Accuracy2. Coverage3. Precision

1. Effectiveness of learning2. Efficiency of learning 3. Drop out rate4. Satisfaction

1. Reaction of learner2. Learning improved 3. Behaviour 4. Results

Page 42: Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL

16

4. Evaluation Criteria

Kirkpatrick model by Manouselis et al. 2010

Combine approach by Drachsler et al. 2008

1. Accuracy2. Coverage3. Precision

1. Effectiveness of learning2. Efficiency of learning 3. Drop out rate4. Satisfaction

1. Reaction of learner2. Learning improved 3. Behaviour 4. Results

Page 43: Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL

17

5. Data Set Framework to Monitor Performance

Formal Datasets Informal

Data A Data B Data C

Algorithms: Algoritmen A Algoritmen B Algoritmen C

Models: Learner Model A Learner Model B

Measured attributes: Attribute A Attribute B Attribute C

Algorithms: Algoritmen D Algoritmen E

Models: Learner Model C Learner Model E

Measured attributes: Attribute A Attribute B Attribute C

Algorithms: Algoritmen B Algoritmen D

Models: Learner Model A Learner Model C

Measured attributes: Attribute A Attribute B Attribute C

Page 44: Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL

18

Join us for a Coffee ...http://www.teleurope.eu/pg/groups/9405/datatel/

Page 45: Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL

19

Upcoming event ...Workshop: dataTEL- Data Sets for Technology Enhanced Learning

Date: March 30th to March 31st, 2011

Location: Ski resort La Clusaz in the French Alps, Massif des Aravis

Funding: Food and lodging for 3 nights for 10 selected participants

Submissions: For being funded, please send extended abstracts to http://www.easychair.org/conferences/?conf=datatel2011

Deadline for submissions: October 25th, 2010

CfP: http://www.teleurope.eu/pg/pages/view/46082/

Page 46: Issues and Considerations regarding Sharable Data Sets for Recommender Systems in TEL

This silde is available at:http://www.slideshare.com/Drachsler

Email: [email protected]

Skype: celstec-hendrik.drachsler

Blogging at: http://www.drachsler.de

Twittering at: http://twitter.com/HDrachsler

20

Many thanks for your interests