tomasz-kusmierczyk
Data set
● source: kochbar.de
● years: 2008-2014
● 200k users
● social connections
● groups
● 400k recipes
● 270 categories
● recipe feedback
○ ratings
○ comments
Community Evolution
● #ingredients (almost) constant
● #recipes increases
● #combinations increases
● #recipes ≈ #combinations
Methods: Entropy
● Captures distribution complexity:
○ Increases with number of possible outcomes
○ Increases when distribution becomes more uniform
X = ingredients / combinations
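The two properties listed on this slide can be checked with a minimal sketch of Shannon entropy over an empirical distribution. The function name `shannon_entropy` and the toy ingredient lists are hypothetical illustrations, not taken from the kochbar.de data.

```python
import math
from collections import Counter


def shannon_entropy(outcomes):
    """Shannon entropy (in bits) of the empirical distribution of `outcomes`."""
    counts = Counter(outcomes)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical toy data: each item is one ingredient occurrence.
uniform_two = ["salt", "onion"]
uniform_four = ["salt", "onion", "leek", "celery"]
skewed_four = ["salt"] * 5 + ["onion", "leek", "celery"]

# More possible outcomes -> higher entropy.
print(shannon_entropy(uniform_two), shannon_entropy(uniform_four))
# Same outcomes, less uniform distribution -> lower entropy.
print(shannon_entropy(skewed_four))
```

With X taken as ingredients or as combinations, the same function would be applied to the list of observed ingredients or observed combinations.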
Methods: Conditional Entropy
● Measures how much more information you need to predict Y once X is known
Y = combinations
X = ingredients
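A minimal sketch of conditional entropy, computed as H(Y | X) = H(X, Y) - H(X) from paired observations. The slides do not state how ingredients and combinations are paired, so the helper `ingredient_combination_pairs` (one pair per ingredient of each recipe, with the combination being the recipe's full ingredient set) is an assumption, and the toy recipes are hypothetical.

```python
import math
from collections import Counter


def entropy(counts):
    """Entropy in bits of a Counter of outcome frequencies."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def conditional_entropy(pairs):
    """H(Y | X) in bits from (x, y) observations, via H(X, Y) - H(X)."""
    pairs = list(pairs)
    joint = Counter(pairs)
    marginal_x = Counter(x for x, _ in pairs)
    return entropy(joint) - entropy(marginal_x)


def ingredient_combination_pairs(recipes):
    """Assumed pairing: (ingredient, full ingredient set) for every recipe."""
    for ingredients in recipes:
        combo = frozenset(ingredients)
        for ingredient in combo:
            yield ingredient, combo


# Hypothetical toy data: both sets use each ingredient equally often,
# but the second one spreads them over more distinct combinations.
few_combos = [{"salt", "onion"}, {"salt", "onion"},
              {"leek", "celery"}, {"leek", "celery"}]
many_combos = [{"salt", "onion"}, {"salt", "leek"},
               {"onion", "celery"}, {"leek", "celery"}]

h_few = conditional_entropy(ingredient_combination_pairs(few_combos))
h_many = conditional_entropy(ingredient_combination_pairs(many_combos))
print(h_few, h_many)
```

In this toy setup the ingredient distribution is identical in both cases, so any difference in H(Y | X) comes only from how the ingredients are combined.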
Community Evolution: Entropy
● constant entropy of ingredients
● continuous growth of the complexity of ingredient combinations
● consequence: H(combination | ingredients) grows
Methods: Innovation Factor
● similarity of recipe r to all preceding recipes r’
● Jaccard index over the recipes’ ingredients
Two sample recipes from 2012-01-01
pumpkin pesto (rarely used ingredients)
● hokkaido pumpkin pulp
● oil
● garlic cloves
● pumpkin seeds
● sunflower oil
● pumpkin seed oil
● grated parmesan
chicken soup à la Heiko (typical ingredients)
● chicken
● onion
● carrots
● leek
● celery
● salt
● parsley
IF(pumpkin pesto) = 0.67
IF(chicken soup) = 0.33
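The slides describe the innovation factor only as a similarity of a recipe to all preceding recipes, based on the Jaccard index over ingredients, and the exact formula is not reproduced here. The sketch below is therefore one plausible reading, not the authors' definition: it assumes IF(r) = 1 minus the highest Jaccard similarity to any preceding recipe. The `aggregate` parameter and the `preceding` recipes are hypothetical, so the printed values are not expected to match the 0.67 and 0.33 reported above.

```python
def jaccard(a, b):
    """Jaccard index of two ingredient collections (1.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def innovation_factor(recipe, preceding, aggregate=max):
    """Assumed form: 1 - aggregated Jaccard similarity to preceding recipes."""
    if not preceding:
        return 1.0  # nothing to compare against: treat as fully novel
    return 1.0 - aggregate(jaccard(recipe, earlier) for earlier in preceding)


# Ingredient lists of the two sample recipes from the slides.
pumpkin_pesto = {"hokkaido pumpkin pulp", "oil", "garlic cloves",
                 "pumpkin seeds", "sunflower oil", "pumpkin seed oil",
                 "grated parmesan"}
chicken_soup = {"chicken", "onion", "carrots", "leek",
                "celery", "salt", "parsley"}

# Hypothetical earlier recipes, invented for illustration only.
preceding = [
    {"chicken", "onion", "carrots", "salt"},
    {"onion", "leek", "celery", "salt", "parsley"},
]

if_pesto = innovation_factor(pumpkin_pesto, preceding)
if_soup = innovation_factor(chicken_soup, preceding)
print(if_pesto, if_soup)
```

Passing a different `aggregate` (for example a mean) changes how strongly a single near-duplicate among the preceding recipes lowers the score.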
Food categories innovation
● discrepancies between different categories
● no correlation between IF and recipe production
(after 2010-01-01)
Users innovation distribution
● means follow a normal distribution
● medians reveal potential irregularities
What drives innovation?
● low values
● localization, gender...
Information gain = how many bits, on average, would I save if I knew the feature value?
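The "bits saved" reading of information gain corresponds to H(target) - H(target | feature). A minimal sketch follows. The target labels (a coarse high/low innovation level per user) and the feature values are hypothetical and chosen only to show the two extreme cases.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy in bits of a list of labels."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(features, targets):
    """H(target) - H(target | feature), in bits."""
    groups = defaultdict(list)
    for feature, target in zip(features, targets):
        groups[feature].append(target)
    total = len(targets)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(targets) - conditional


# Hypothetical users: innovation level plus two candidate features.
innovation = ["high", "high", "low", "low"]
region = ["north", "north", "south", "south"]  # fully determines the target
gender = ["f", "m", "f", "m"]                  # independent of the target

ig_region = information_gain(region, innovation)
ig_gender = information_gain(gender, innovation)
print(ig_region, ig_gender)
```

A feature that fully determines the target saves all of the target's entropy, while an independent feature saves nothing.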
Future work
● what are the factors that drive online food recipe innovation?
● how does user’s success depend on innovativeness?
● why do we observe temporal patterns in innovation?
● more advanced representations of recipes
Data set: Contact to acquire
Data set
● 400k recipes (ingredients, steps)
● 200k users (5k with 10 recipes)
● ratings (recipe consumption proxy)
● timestamps
Findings 1: Community evolution
● Although the number of known ingredients remains relatively low and constant, the community continuously combines them into new and innovative recipes over time. Hence, after an initial phase in which the innovation factor decreases, it not only stabilizes at a surprisingly high level but also grows slowly over time.
Findings 2: Temporality & Categories
● The food innovation factor depends on the season of the year and, to a smaller extent, on the day of the week.
● We find significant differences in the innovation factor between recipes from different categories that cannot be explained by the number of recipes produced in each category.
Findings 3: Users Innovation
● The temporal profiles of users vary meaningfully, e.g., some users are more successful in innovating over time than others.
● Geographical origin is the most important driver of users’ innovation factors, significantly more so than age, gender or number of friends.