Yelp Presentation_Final

Transcript
Page 1: Yelp Presentation_Final

Yelp Data Analysis

Sugandha Goel, Nisha Nair, Liz Stapleton, Yiqun Xiang

Page 2: Yelp Presentation_Final

Our Data

• Data received from Yelp
• All Data – includes four countries (US, UK, DE, CA)
• Business – list of businesses; key variables included:
    • Business Category (multiple)
    • Review Count
    • # Stars
    • Location
• Tip – comments given by users about businesses
• User – list of users; key variables included:
    • Review Count
    • Average Stars Given
    • Yelping Since
    • # of Fans

Page 3: Yelp Presentation_Final

BUSINESS DATA

Page 4: Yelp Presentation_Final

Business Data – Cleaning the Data

• Initial dataset:
    • 61,184 records
    • 436 categories
• Removed:
    • Non-food-related categories, filtered using category1 and category2
        • 19,981 rows remaining
        • 113 categories remaining
    • Columns with fewer than 1,000 completed rows
        • Result: a more complete dataset
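The two cleaning steps on this slide can be sketched roughly as follows. This is a minimal Python illustration, not the team's actual procedure: the slide names only the category1 and category2 columns, so the function name, the `food_categories` set, and the sample rows are all hypothetical.

```python
def clean_businesses(rows, food_categories, min_filled=1000):
    """Keep food-related rows, then drop sparsely filled columns.

    rows: list of dicts, one per business (hypothetical layout).
    food_categories: set of category names treated as food related (assumption).
    min_filled: a column is kept only if at least this many rows have a value.
    """
    # Step 1: keep a row if either category column is food related.
    kept = [
        r for r in rows
        if r.get("category1") in food_categories
        or r.get("category2") in food_categories
    ]
    # Step 2: count non-empty values per column among the remaining rows.
    columns = sorted({key for r in kept for key in r})
    filled = {
        c: sum(1 for r in kept if r.get(c) not in (None, ""))
        for c in columns
    }
    dense = [c for c in columns if filled[c] >= min_filled]
    return [{c: r.get(c) for c in dense} for r in kept]


# Tiny hypothetical sample; the threshold is lowered to 2 so it has an effect.
sample_rows = [
    {"name": "A", "category1": "Pizza", "category2": "", "drive_thru": "no"},
    {"name": "B", "category1": "Dentists", "category2": "", "drive_thru": ""},
    {"name": "C", "category1": "Nightlife", "category2": "Burgers", "drive_thru": ""},
]
cleaned = clean_businesses(sample_rows, {"Pizza", "Burgers"}, min_filled=2)
```

With the real data the threshold would be the slide's 1,000 completed rows.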

Page 5: Yelp Presentation_Final

Business Data – Decision Tree Analysis (CHAID)

[Figure: CHAID decision tree for star rating. Unlabeled node percentages: 26%, 31%, 60%, 95%; 64%, 76%, 86%, 63%, 73%, 80%, 90%; 72%, 58%, 33%]

• With Drive Thru: 60% between 2.5 and 3.5 stars
• No Drive Thru: 58% between 3.5 and 4.0 stars
• No Street Parking: 69% > 3.5 stars
• With Street Parking: 83% > 3.5 stars

Page 6: Yelp Presentation_Final

Business Data – Decision Tree Analysis (CHAID)

• Important factors:
    • Drive Thru
    • Review Count
    • Parking (Lot/Street)
    • Noise Level
    • Takes Reservations
    • Outdoor Seating
• Businesses without a drive-thru are rated higher than those with one
• The greater the review count, the better the star rating
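CHAID chooses its splits with a chi-square test of independence between a candidate predictor and the target. The sketch below computes that statistic for one candidate split. The counts are hypothetical and are not taken from the Yelp data, and the function name is illustrative only.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table (list of rows).

    Assumes every row and every column contains at least one observation,
    so that no expected count is zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if predictor and target were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical counts. Rows: drive-thru, no drive-thru.
# Columns: rated 3.5 stars or lower, rated above 3.5 stars.
hypothetical_counts = [[60, 40], [30, 70]]
drive_thru_stat = chi_square_statistic(hypothetical_counts)
```

A larger statistic means a stronger association, which is why a predictor such as Drive Thru can end up near the top of the tree.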

Page 7: Yelp Presentation_Final


Business Data – Tableau Discovery

Page 8: Yelp Presentation_Final

Business Data – Tableau Discovery

            Population   # of Businesses   Avg Reviews per Business
NV          2.8 M        4,626             83
AZ          6.7 M        7,255             47
NV/AZ (%)   42%          64%               179%

• The average number of reviews per business in NV (83) is nearly twice that of AZ (47) and about five times that of SC (16).

• Potential reasons:
    • (1) NV has more Yelp users
    • (2) Yelp users in NV write reviews more frequently

• Conclusion: Yelp is more of a cultural norm in NV
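The NV/AZ comparison can be put on a per-resident basis with a few lines of arithmetic. This sketch uses only the figures from the table on this slide. Total reviews are approximated as businesses × average reviews, which is an assumption because the averages are rounded; that rounding may also be why 83/47 gives about 177% rather than the slide's 179%.

```python
# Figures from the slide's table; populations rounded as shown there.
states = {
    "NV": {"population": 2_800_000, "businesses": 4626, "avg_reviews": 83},
    "AZ": {"population": 6_700_000, "businesses": 7255, "avg_reviews": 47},
}


def per_capita(state):
    s = states[state]
    # Approximate total reviews from the rounded per-business average.
    total_reviews = s["businesses"] * s["avg_reviews"]
    return {
        "businesses_per_100k": s["businesses"] / s["population"] * 100_000,
        "reviews_per_resident": total_reviews / s["population"],
    }


nv = per_capita("NV")
az = per_capita("AZ")
review_ratio = nv["reviews_per_resident"] / az["reviews_per_resident"]
```

Under these assumptions NV has both more listed businesses and more reviews per resident than AZ, which is consistent with the slide's conclusion.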

Page 9: Yelp Presentation_Final

Business Data – Tableau Discovery

Page 10: Yelp Presentation_Final

TIP DATA

Page 11: Yelp Presentation_Final

Tip Data – Sentiment Analysis

• Completed sentiment analysis using RStudio
• Randomly chose 50,000 comments from the 500,000 available
• Conclusions:
    • People may be worried about writing negative reviews
    • People who are satisfied are more likely to spend the time giving the business a positive review
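The slide does not say which R package or scoring method was used. The Python sketch below is therefore a substitute illustration of the two steps named above, random sampling and sentiment scoring, using a tiny hypothetical word list rather than a real sentiment lexicon.

```python
import random
import re

# Hypothetical mini-lexicon; a real analysis would use a published lexicon.
POSITIVE = {"good", "great", "love", "best", "amazing", "delicious"}
NEGATIVE = {"bad", "worst", "terrible", "rude", "slow", "awful"}


def sentiment_score(text):
    """Count of positive words minus count of negative words."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def sample_and_score(tips, k, seed=0):
    """Draw k tips without replacement and score each one."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    return [(tip, sentiment_score(tip)) for tip in rng.sample(tips, k)]


# Hypothetical tips standing in for the 500,000 available comments.
example_tips = [
    "Great food, love it",
    "Terrible and slow service",
    "Open late on weekends",
]
scored = sample_and_score(example_tips, k=2)
```

For the sample size on the slide, `k` would be 50,000.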

Page 12: Yelp Presentation_Final

Tip Data – Word Clouds

Most frequent words (1-star reviews)

Page 13: Yelp Presentation_Final

Tip Data – Word Clouds

Most frequent words (5-star reviews)
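A word cloud is driven by word frequencies. The sketch below shows one way to compute the most frequent words for a group of reviews. The stop-word list and the sample texts are hypothetical, and the tool used for the original clouds is not stated in the slides.

```python
from collections import Counter
import re

# Hypothetical stop-word list; real analyses use a much longer one.
STOPWORDS = {"the", "a", "and", "is", "was", "to", "of", "it", "i"}


def top_words(texts, n=3):
    """Return the n most frequent non-stop-words as (word, count) pairs."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts.most_common(n)


# Hypothetical 5-star texts.
five_star_texts = ["The food was great", "Great service and great food"]
five_star_top = top_words(five_star_texts, n=2)
```

Running the same function separately on 1-star and 5-star texts gives the two frequency lists that the clouds visualize.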

Page 14: Yelp Presentation_Final

USER DATA

Page 15: Yelp Presentation_Final

User Data – Cleaning the Data

• Removed:
    • All users without a user ID
• Added:
    • # of years since users started yelping

Page 16: Yelp Presentation_Final

* $\text{Frequency of Review} = \dfrac{\text{Review Count}}{\text{\# of years yelping}}$

User Data – Regression Analysis

Conclusion
• All three independent variables are significant in this model
• The more frequently a user writes reviews, the fewer fans they have
• People care about the quality rather than the quantity of reviews
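The Frequency of Review ratio defined on this slide (review count divided by years yelping) can be computed per user as sketched below. The function names, the use of `datetime.date` for the Yelping Since field, and the reference date are assumptions; the slides do not say how years were measured.

```python
from datetime import date


def years_yelping(yelping_since, as_of):
    """Years between the join date and a reference date (fractional)."""
    return (as_of - yelping_since).days / 365.25


def review_frequency(review_count, yelping_since, as_of):
    """Review Count / # of years yelping; None if no time has elapsed."""
    years = years_yelping(yelping_since, as_of)
    if years <= 0:
        return None  # avoid dividing by zero for users who joined on the reference date
    return review_count / years


# Hypothetical user: 100 reviews, joined 2010-01-01, measured on 2015-01-01.
freq = review_frequency(100, date(2010, 1, 1), date(2015, 1, 1))
```

This derived value could then serve as one predictor in a regression of fan count, alongside whichever other variables the model used.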

Page 17: Yelp Presentation_Final

SUMMARY

Page 18: Yelp Presentation_Final

Advice to Improve your Yelp Rating

Do:
• Take reservations
• Offer a quieter atmosphere
• Offer sufficient parking
• Encourage customers to write reviews

Don’t:
• Have a drive-thru
• Have a noisy environment
• Be cash only

Page 19: Yelp Presentation_Final

Software Used in Our Analysis

Page 20: Yelp Presentation_Final

QUESTIONS?

Page 21: Yelp Presentation_Final

APPENDIX

Page 22: Yelp Presentation_Final

User Data – Cluster Analysis

Conclusion
• Cluster analysis does not provide any useful conclusions because 96% of the data falls into one cluster
• Most users are similar to one another

Page 23: Yelp Presentation_Final

Business Data – Tableau Discovery