Topic      Label            Top words
Topic0     japanese         sushi, roll, pita, tuna, salmon
Topic1     mexican          tacos, mexican, salsa, burrito, chips
Topic2     brunch           breakfast, coffee, eggs, bacon, pancakes
Topic3     bar/atmosphere   great, beer, happyhour, bar, drinks
Topic4     pizza            pizza, crust, wings, thin, pepperoni
Topic5     compliment       great, best, love, good, like
Topic6     indian           indian, buffet, masala, naan, bianco
Topic7     asian            thai, pho, chinese, soup, curry
Topic8     fastfood         burger, fries, potato, onion rings, dog
Topic9     sweets           bagels, cheese, best, smoothies, love
Topic10    bbq              cheese, bbq, sauce, chicken, ribs
Topic11    service (bad)    service, didn’t, never, even, back
LDA (Latent Dirichlet Allocation) 12 topics
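A topic listing like the one above is usually read off a fitted LDA model by taking the highest-weighted words of each topic. The following is a minimal sketch of that step only, not the authors' pipeline. The function `top_words` and the mapping `example_weights` are hypothetical, and the weights are made-up illustrative values rather than the fitted model's.

```python
from typing import Dict, List


def top_words(topic_word_weights: Dict[str, Dict[str, float]], n: int = 5) -> Dict[str, List[str]]:
    """Return the n highest-weighted words for each topic, heaviest first."""
    result = {}
    for topic, weights in topic_word_weights.items():
        # Sort by descending weight; break ties alphabetically so output is deterministic.
        ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
        result[topic] = [word for word, _ in ranked[:n]]
    return result


# Hypothetical weights, for illustration only (not taken from the slides).
example_weights = {
    "Topic0": {"sushi": 0.09, "roll": 0.07, "tuna": 0.04, "salmon": 0.03, "pizza": 0.001},
    "Topic4": {"pizza": 0.10, "crust": 0.06, "wings": 0.05, "sushi": 0.001},
}
top = top_words(example_weights, n=3)
for topic, words in top.items():
    print(topic, ", ".join(words))
```

The human-readable labels (japanese, pizza, and so on) are not produced by LDA itself; they are assigned by inspecting these word lists.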
Good restaurant: average star > 3.5
Bad restaurant: average star <= 3.5
Classification features: weight for each topic

Classifier                     Linear SVM    Logistic Regression    Random Forest
Accuracy in cross validation   73.67%        81.19%                 77.7%
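The labeling rule and the feature layout described above can be sketched as follows. This is an illustration under stated assumptions: the names `label_restaurant` and `build_dataset` are hypothetical, and the sample topic weights are invented to show the shape of the data, not taken from the study.

```python
from typing import List, Sequence, Tuple

GOOD_THRESHOLD = 3.5  # from the slide: good if average star > 3.5, bad otherwise


def label_restaurant(average_star: float) -> int:
    """Return 1 for a good restaurant (average star > 3.5), else 0."""
    return 1 if average_star > GOOD_THRESHOLD else 0


def build_dataset(
    restaurants: Sequence[Tuple[Sequence[float], float]]
) -> Tuple[List[List[float]], List[int]]:
    """Turn (topic_weights, average_star) pairs into a feature matrix X and labels y."""
    X = [list(weights) for weights, _ in restaurants]
    y = [label_restaurant(star) for _, star in restaurants]
    return X, y


# Invented example rows: three topic weights per restaurant, plus its average star.
sample = [
    ([0.7, 0.2, 0.1], 4.0),
    ([0.1, 0.1, 0.8], 3.5),
    ([0.3, 0.4, 0.3], 2.5),
]
X, y = build_dataset(sample)
print(y)
```

The resulting `X` and `y` are in the shape that a linear SVM, logistic regression, or random forest implementation would typically accept.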
Evaluation of Recommendation Error Using
Normalized Distance-based Performance Measure (NDPM)
Recommended ranking: R_2, R_5, R_1, R_3, R_4
Actual ranking (ground truth): R_1, R_2, R_3, R_4, R_5
Goal: minimize the NDPM score
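The slide does not spell out the NDPM formula. A commonly given definition, assumed here, compares every pair of items that the user ranks strictly:

```latex
\mathrm{NDPM} = \frac{2\,C^{-} + C^{u}}{2\,C}
```

Here $C$ is the number of item pairs the user orders strictly, $C^{-}$ is the number of those pairs the recommender orders the opposite way, and $C^{u}$ is the number of those pairs the recommender ties. A score of 0 means perfect agreement and 1 means a fully reversed ranking. For the five-restaurant example above there are no ties, so $C = \binom{5}{2} = 10$ and $C^{u} = 0$. The recommender contradicts the user on four pairs, $(R_1, R_2)$, $(R_1, R_5)$, $(R_3, R_5)$ and $(R_4, R_5)$, giving $\mathrm{NDPM} = (2 \cdot 4 + 0) / (2 \cdot 10) = 0.4$.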
Assessment of LDA
                         BOW (Bag of Words)           LDA          Gain
Feature dimension        10000 words in dictionary    15 topics    >99% dimension reduction
Computation efficiency   2.5 hrs                      15 min       >90% computation time reduction
(2000 samples, 10 fold cross validation)
Topic     Label            Top words
Topic0    Japanese         sushi, roll, pita, tuna, salmon
Topic1    Mexican          tacos, mexican, salsa, burrito, chips
Topic2    brunch           breakfast, coffee, eggs, bacon, pancakes
Topic3    Bar/Atmosphere   great, beer, happyhour, bar, drinks
Topic4    pizza            pizza, crust, wings, thin, pepperoni
Topic5    Indian           Indian, buffet, masala, naan, bianco
Topic6    Asian            Thai, Pho, Chinese, soup, curry
Topic7    fastfood         burger, fries, potato, onion ring, dog
Topic8    sweets           bagels, cheese, best, smoothies, love
Topic9    bbq              cheese, bbq, sauce, chicken, ribs
LDA (Latent Dirichlet Allocation) 15 topics
Topic 10: Compliment
great, best, love, good, like
Topic 11: Service (Bad)
service, didn’t, never, even, back
Assessment of LDA
Dimension reduction: 99% reduction in dimension
  BOW (bag of words) features: 10,000
  LDA features: 15 topics
Computation efficiency: 2000 samples, 10 fold cross validation
  BOW features: 2.5 hrs
  LDA features: 15 min
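The two reduction figures follow directly from the numbers on the slide. A short check of the arithmetic (the helper `percent_reduction` is a hypothetical name introduced for illustration):

```python
def percent_reduction(before: float, after: float) -> float:
    """Percentage by which `after` is smaller than `before`."""
    return 100.0 * (before - after) / before


# Feature dimension: 10,000 bag-of-words features versus 15 LDA topics.
dimension_reduction = percent_reduction(10_000, 15)

# Computation time: 2.5 hours (150 minutes) versus 15 minutes.
time_reduction = percent_reduction(2.5 * 60, 15)

print(f"dimension reduction: {dimension_reduction:.2f}%")
print(f"time reduction: {time_reduction:.1f}%")
```

With the rounded timings given on the slide, the computation-time saving comes out at about 90%, and the dimension reduction at just over 99%.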
[Figure: pie chart of topic percentages. Legend: others, Japanese, Mexican, Brunch, bar, Service, compliment, Asian, fastfood, bbq. Slice values in extracted order: 3%, 4%, 10%, 4%, 5%, 29%, 20%, 6%, 3%, 16%.]
Evaluation of Recommendation Error Using
Normalized Distance-based Performance Measure (NDPM)
Rank by recommendation    Rank by user's actual ratings
Restaurant_1              Restaurant_1
Restaurant_2              Restaurant_3
Restaurant_3              Restaurant_7
Restaurant_4              Restaurant_2
Restaurant_5              Restaurant_4
Restaurant_6              Restaurant_9
Restaurant_7              Restaurant_8
Restaurant_8              Restaurant_5
Restaurant_9              Restaurant_10
Restaurant_10             Restaurant_6
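To make the comparison of the two rankings concrete, the following sketch computes NDPM for the ten-restaurant example above. It assumes the common definition restricted to strict rankings without ties, where the score reduces to the fraction of item pairs that the recommender orders opposite to the user. The function name `ndpm` is hypothetical and this is not necessarily the authors' implementation.

```python
from itertools import combinations
from typing import Sequence


def ndpm(recommended: Sequence[str], actual: Sequence[str]) -> float:
    """NDPM for two strict rankings (no ties) of the same items.

    0.0 means identical order, 1.0 means fully reversed order.
    """
    if len(recommended) != len(actual) or set(recommended) != set(actual):
        raise ValueError("both rankings must contain the same items")
    if len(set(actual)) != len(actual):
        raise ValueError("rankings must not contain duplicates")

    rec_pos = {item: i for i, item in enumerate(recommended)}
    # Each pair (a, b) has a ranked above b by the user.
    pairs = list(combinations(actual, 2))
    if not pairs:
        raise ValueError("need at least two items to compare")

    # A pair is contradictory when the recommender places b above a.
    contradictory = sum(1 for a, b in pairs if rec_pos[a] > rec_pos[b])
    return contradictory / len(pairs)


recommended = [f"Restaurant_{i}" for i in range(1, 11)]
actual = [f"Restaurant_{i}" for i in (1, 3, 7, 2, 4, 9, 8, 5, 10, 6)]
score = ndpm(recommended, actual)
print(f"NDPM = {score:.3f}")
```

Handling ties in the recommender's output would require the extra term for tied pairs from the general definition; it is left out here to keep the sketch short.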
Recommended