Temporal Patterns in Online Food Innovation

Tomasz Kuśmierczyk, Christoph Trattner, Kjetil Nørvåg

Temporal Patterns in Online Food Innovation (TempWeb2015; Companion)

Agenda

● Data set
● Measures
● Findings

Data set

● source: kochbar.de
● years: 2008-2014
● 200k users
● social connections
● groups
● 400k recipes
● 270 categories
● recipe feedback
○ ratings
○ comments

Basic statistics

Recipe representation

recipe = set of ingredients
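
A minimal sketch of this representation, assuming ingredient names are normalized by trimming whitespace and lowercasing. The slides do not describe any preprocessing, so that step and the helper name `make_recipe` are hypothetical.

```python
def make_recipe(ingredients):
    """Represent a recipe as an immutable set of normalized ingredient names."""
    # ASSUMPTION: strip + lowercase normalization is illustrative only.
    return frozenset(name.strip().lower() for name in ingredients)


soup = make_recipe(["Chicken", "onion", "carrots", "leek"])
same_soup = make_recipe(["leek", "carrots", "onion", "chicken", "onion"])

# In a set representation, ingredient order and repetition do not matter.
print(soup == same_soup)
```

A `frozenset` is hashable, so recipes represented this way can be counted or used as dictionary keys.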

Community Evolution

● #ingredients (almost) constant

● #recipes increases

● #combinations increases

● #recipes ≈ #combinations
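
The three quantities above could be counted roughly as follows. This is a sketch under the assumption that a "combination" is the full ingredient set of a recipe; the function name `community_counts` and the toy data are hypothetical.

```python
def community_counts(recipes):
    """Count recipes, distinct ingredients, and distinct ingredient combinations."""
    # ASSUMPTION: a combination is the complete ingredient set of one recipe.
    combinations = {frozenset(r) for r in recipes}
    ingredients = set().union(*combinations) if combinations else set()
    return {
        "recipes": len(recipes),
        "ingredients": len(ingredients),
        "combinations": len(combinations),
    }


toy_recipes = [
    {"chicken", "onion", "salt"},
    {"onion", "salt", "carrots"},
    {"chicken", "onion", "salt"},  # repeats the first combination
]
counts = community_counts(toy_recipes)
print(counts)
```

When almost every recipe uses an ingredient set that has not been seen before, the recipe count and the combination count stay close to each other.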

Measuring Innovation

Approach 1 to capture innovation

Methods: Entropy

● Captures distribution complexity:
○ Increases with number of possible outcomes
○ Increases when distribution becomes more uniform

X = ingredients / combinations
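
Both properties can be checked with a small sketch of Shannon entropy over an empirical distribution. The toy ingredient lists are illustrative and not taken from the data set.

```python
import math
from collections import Counter


def entropy(outcomes):
    """Shannon entropy (in bits) of the empirical distribution of `outcomes`."""
    counts = Counter(outcomes)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# More possible outcomes (uniform case) means higher entropy.
uniform_2 = entropy(["salt", "onion"])
uniform_4 = entropy(["salt", "onion", "leek", "oil"])

# Same four outcomes, but one dominates: entropy drops below the uniform case.
skewed_4 = entropy(["salt"] * 5 + ["onion", "leek", "oil"])

print(uniform_2, uniform_4, skewed_4)
```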

Methods: Conditional Entropy

● Measures how much additional information is needed to predict Y once X is known

Y = combinations
X = ingredients
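
A sketch of how H(Y | X) might be estimated, using the identity H(Y | X) = H(X, Y) - H(X). How the joint samples were formed is not stated on the slide; the version below assumes one (ingredient, combination) pair per ingredient occurrence in a recipe, and the helper names are hypothetical.

```python
import math
from collections import Counter


def entropy(outcomes):
    """Shannon entropy (in bits) of the empirical distribution of `outcomes`."""
    counts = Counter(outcomes)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def conditional_entropy(pairs):
    """H(Y | X) in bits from (x, y) samples, computed as H(X, Y) - H(X)."""
    pairs = list(pairs)
    return entropy(pairs) - entropy([x for x, _ in pairs])


def ingredient_combination_pairs(recipes):
    """ASSUMPTION: one (ingredient, combination) sample per ingredient use."""
    return [(ing, frozenset(r)) for r in recipes for ing in r]


recipes = [{"salt", "onion"}, {"salt", "leek"}, {"oil", "garlic"}]
h = conditional_entropy(ingredient_combination_pairs(recipes))
print(h)
```

In this toy data only "salt" appears in more than one combination, so knowing the ingredient leaves uncertainty about the combination only in that case.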

Community Evolution: Entropy

● constant entropy of ingredients
● continuous growth of ingredient combination complexity
● consequence: H(combination | ingredients) grows

Approach 2 to capture innovation

Methods: Innovation Factor

● innovation of recipe r is based on its similarity to all preceding recipes r’
● similarity measure: Jaccard index over the recipes’ ingredients
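
The Jaccard index itself is standard, but the slide's formula for combining the similarities to preceding recipes is not preserved in this transcript. The sketch below therefore treats the aggregation as a parameter and uses `max` (distance to the most similar earlier recipe) purely as a placeholder assumption; it should not be read as the authors' definition. The function names are hypothetical.

```python
def jaccard(a, b):
    """Jaccard index |a ∩ b| / |a ∪ b| of two ingredient sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def innovation_factor(recipe, preceding, aggregate=max):
    """Illustrative score: 1 - aggregate(similarity to each preceding recipe).

    ASSUMPTION: the aggregation over preceding recipes is not given in the
    slides; `max` is a placeholder, and any other callable can be passed.
    """
    similarities = [jaccard(recipe, earlier) for earlier in preceding]
    if not similarities:
        return 1.0  # assumed convention: no earlier recipe to resemble
    return 1.0 - aggregate(similarities)


earlier = [{"chicken", "onion", "salt"}, {"onion", "carrots", "leek"}]
new_recipe = {"chicken", "onion", "carrots", "salt"}
score = innovation_factor(new_recipe, earlier)
print(score)
```

Passing a different aggregate, for example `statistics.mean`, changes the scale of the scores, which is why the placeholder choice matters when comparing against the values reported in the deck.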

Two sample recipes from 2012-01-01

pumpkin pesto: rarely used ingredients
● hokkaido pumpkin pulp
● oil
● garlic cloves
● pumpkin seeds
● sunflower oil
● pumpkin seed oil
● grated parmesan

chicken soup à la Heiko: typical ingredients
● chicken
● onion
● carrots
● leek
● celery
● salt
● parsley

IF(pumpkin pesto) = 0.67
IF(chicken soup) = 0.33

Innovation in time

Two phases:
1. strong decline
2. slow but steady increase

[Figure: innovation over time; plot annotation: 2010]

Recipes innovation distribution

● shift towards higher values

● long tail

(after 2010-01-01)

Innovation seasonal & weekly trends

[Figure panels: Seasonal trends; Weekly trends]

Food categories innovation

● discrepancies between different categories

● no correlation between IF and recipes production

(after 2010-01-01)

Users’ innovation

Methods: User Innovation Factor

user innovation = mean innovation factor over the user’s recipes
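
A minimal sketch of this aggregation, assuming per-recipe innovation factors are already available as (user, score) pairs. The median is included alongside the mean because the deck compares both; the user names and scores are made up.

```python
from collections import defaultdict
from statistics import mean, median


def user_innovation(recipe_scores):
    """Aggregate per-recipe innovation factors into per-user mean and median.

    `recipe_scores` is an iterable of (user_id, innovation_factor) pairs.
    """
    by_user = defaultdict(list)
    for user_id, score in recipe_scores:
        by_user[user_id].append(score)
    return {u: {"mean": mean(s), "median": median(s)} for u, s in by_user.items()}


# Hypothetical toy data: three recipes by one user, one by another.
scores = [("anna", 0.2), ("anna", 0.4), ("anna", 0.9), ("ben", 0.5)]
profile = user_innovation(scores)
print(profile)
```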

Users innovation distribution

● means follow a normal distribution

● medians reveal potential irregularities

Users innovation LM fitting

● interesting outliers (changing innovation a lot)

What drives innovation?

● low values
● localization, gender...

Info. Gain = how many bits, on average, would I save if I knew the feature value?
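
The definition above corresponds to IG = H(target) - H(target | feature). The sketch below assumes the users' innovation factor has been discretized into classes (for example "high" and "low"), which the slide does not state; the feature values and names are toy examples.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of the empirical distribution of `labels`."""
    counts = Counter(labels)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(features, targets):
    """IG = H(target) - H(target | feature), in bits."""
    targets = list(targets)
    groups = defaultdict(list)
    for f, t in zip(features, targets):
        groups[f].append(t)
    n = len(targets)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(targets) - remainder


# ASSUMPTION: innovation is discretized into two classes for illustration.
innovation_level = ["high", "high", "low", "low"]
region = ["north", "north", "south", "south"]
gender = ["f", "m", "f", "m"]

ig_region = information_gain(region, innovation_level)
ig_gender = information_gain(gender, innovation_level)
print(ig_region, ig_gender)
```

In this toy data the region determines the class completely, so knowing it saves the full bit, while gender saves nothing.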

Future work

● what are the factors that drive online food recipe innovation?
● how does user’s success depend on innovativeness?
● why do we observe temporal patterns in innovation?
● more advanced representations of recipes

Thank you!

Questions?

Tomasz Kuśmierczyk

Data set

● 400k recipes (ingredients, steps)
● 200k users (5k with 10 recipes)
● ratings (recipe consumption proxy)
● timestamps

Findings 1: Community evolution

● Although the number of known ingredients remains relatively low and constant, the community continuously combines them into new and innovative recipes over time. Hence, after an initial phase in which the innovation factor decreases, the innovation factor not only stabilizes at a surprisingly high level but also grows slowly over time.

Findings 2: Temporality & Categories

● The food innovation factor depends on the season of the year and to some smaller extent also on the day of the week.
● We find significant differences in terms of the innovation factor between recipes from different categories that cannot be explained by the number of recipes produced in the category.

Findings 3: Users Innovation

● The temporal profiles of the users vary meaningfully, e.g., some users are more successful in innovating over time than others.
● Geographical origin is the most important factor driving users’ innovation factors, significantly more so than age, gender, or number of friends.

Contributions

● The first study of innovation in online food communities
● Findings about:
○ community evolution
○ temporality of innovation
○ users and categories innovation
● Novel large-scale dataset to study online food recipe consumption and production patterns