RATE MY LISTING - UC Berkeley School of Information · Airbnb RATE MY LISTING 1 By Aysa Fan, Mohammad Habib, Janson Lui. Understanding the problem Project Objective Assumptions Demonstration

Predicting Customer Satisfaction on Airbnb

RATE MY LISTING


By Aysa Fan, Mohammad Habib, Janson Lui

TOC

Understanding the Problem

Project Objective

Assumptions

Demonstration

How It Works

● Exploratory Data Analysis

● Baseline Models

● Final Model

Takeaways

Understanding the Problem


● Challenge 1: Highly competitive rental market

● Challenge 2: Customer satisfaction matters

● Challenge 3: What drives customer satisfaction?

Project Objective

Predict a Customer Satisfaction Score from the features of an Airbnb listing (the higher, the better).

Target Audience

● Existing hosts

● Potential hosts

Prediction Result

1. Customer Satisfaction Score

2. Amenity List for Improvement

3. Listing Comparison


Assumptions

Regional Data

Our data is specific to San Francisco.

Listing Score as Customer Satisfaction Score

Listing score is an acceptable proxy for customer satisfaction.

Focusing on Changeable Features

We use fixed features, such as neighborhood and property type, to establish a baseline customer satisfaction score for each listing, and use changeable features, such as air conditioning, bed type, or parking, to further refine the score.

Let’s see it in action


How It Works


[Diagram: User → Web Form → (Post) → Gunicorn Server → (Dispatch) → Flask (init) → Load model → Model; the Response travels back through the Gunicorn Server to the Web Form.]
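The request flow in the diagram could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the route, the form field names (`bedrooms`, `breakfast`), and the `StubModel` class are hypothetical stand-ins, and a real deployment would load a trained model from disk at startup instead of building a stub.

```python
from flask import Flask, request

app = Flask(__name__)


class StubModel:
    """Hypothetical stand-in for the trained model (not the project's model)."""

    def predict(self, rows):
        # Toy rule for illustration only: a fixed base score plus a breakfast bonus.
        return [90.0 + 2.0 * row["breakfast"] for row in rows]


# Load the model once at import time so every request reuses it.
# A real app would deserialize a trained model here instead of building a stub.
model = StubModel()

FORM_HTML = """
<form method="post">
  Bedrooms: <input name="bedrooms" value="1">
  Breakfast: <input type="checkbox" name="breakfast">
  <button type="submit">Rate my listing</button>
</form>
"""


@app.route("/", methods=["GET", "POST"])
def rate_listing():
    if request.method == "GET":
        # Serve the web form the user fills in.
        return FORM_HTML
    # POST: turn the submitted form into one feature row and score it.
    row = {
        "bedrooms": int(request.form.get("bedrooms", 1)),
        "breakfast": 1 if "breakfast" in request.form else 0,
    }
    score = model.predict([row])[0]
    return f"Predicted customer satisfaction score: {score:.1f}"
```

Assuming the file is saved as `app.py`, Gunicorn can serve it with `gunicorn app:app`, which matches the Gunicorn-in-front-of-Flask arrangement shown on the slide.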

EDA: Dependent Variable (Ratings)

[Figures: Ratings Distribution; Transformations]

EDA: Independent Variables - Part 1

Some amenities have higher weights in the overall model than others.


EDA: Independent Variables - Part 2

We kept neighborhood in the model to establish a baseline score, over which the changeable features affect the score.


EDA: Omitted Data

We omitted 3% of the data (listings with a rating below 80) from the main model.


Baseline Models: Part 1

MODEL           MSE     R-Square
Lasso           21.67   0.13
Median          27.246  -0.12
Mean            24.84   0
Single-Feature  25.702  -0.012
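The Mean and Median rows can be read as constant-prediction baselines. The sketch below uses made-up ratings, not the project's data, to show how MSE and R-Square are computed for such baselines, and why predicting the mean gives an R-Square of exactly 0 when it is scored on the same data the mean was computed from.

```python
from statistics import mean, median


def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """R-Square = 1 - SS_res / SS_tot, with SS_tot taken around the mean of y_true."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Made-up ratings for illustration only (not the San Francisco data).
ratings = [80, 90, 95, 97, 98, 100, 100]

# Each baseline predicts the same constant for every listing.
mean_pred = [mean(ratings)] * len(ratings)
median_pred = [median(ratings)] * len(ratings)

mean_r2 = r_squared(ratings, mean_pred)
median_r2 = r_squared(ratings, median_pred)

print(f"Mean baseline:   MSE={mse(ratings, mean_pred):.2f}  R2={mean_r2:.2f}")
print(f"Median baseline: MSE={mse(ratings, median_pred):.2f}  R2={median_r2:.2f}")
```

Because the mean minimizes squared error among constants, any other constant, including the median, scores at or below 0 on the same data. A model is only useful if it clears that bar, as the Lasso row does.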


Baseline Models: Part 2

Polynomials with different degrees

As more data is added to the training set, the results start to improve.


Final Model Construction

[Diagram: Data → ElasticNet, Ridge, GradientBoostingRegressor, AdaBoostRegressor, Random Forest Regressor (vecstack StackingTransformer, 10-Fold Cross Validation) → XGBRegressor → Prediction]
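A two-level stack like the one named on this slide could be sketched as below. This is not the project's code: it swaps in scikit-learn's `StackingRegressor` for vecstack's `StackingTransformer`, uses `GradientBoostingRegressor` as the second-level model in place of `XGBRegressor` so the example depends only on scikit-learn, and runs on synthetic data with arbitrary hyperparameters.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import (
    AdaBoostRegressor,
    GradientBoostingRegressor,
    RandomForestRegressor,
    StackingRegressor,
)
from sklearn.linear_model import ElasticNet, Ridge
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic regression data standing in for the listing features and ratings.
X, y = make_regression(n_samples=300, n_features=10, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# First-level models named on the slide; hyperparameters here are arbitrary.
base_models = [
    ("elasticnet", ElasticNet(alpha=0.1)),
    ("ridge", Ridge(alpha=1.0)),
    ("gbr", GradientBoostingRegressor(n_estimators=50, random_state=0)),
    ("ada", AdaBoostRegressor(n_estimators=50, random_state=0)),
    ("rf", RandomForestRegressor(n_estimators=50, random_state=0)),
]

# cv=10: the second-level model is trained on out-of-fold predictions
# from 10-fold cross validation of the first-level models.
stack = StackingRegressor(
    estimators=base_models,
    final_estimator=GradientBoostingRegressor(n_estimators=50, random_state=0),
    cv=10,
)
stack.fit(X_train, y_train)

y_pred = stack.predict(X_test)
test_mse = mean_squared_error(y_test, y_pred)
test_r2 = r2_score(y_test, y_pred)
print(f"Test MSE: {test_mse:.2f}, test R2: {test_r2:.2f}")
```

Training the second-level model on out-of-fold predictions keeps it from seeing first-level outputs for rows those models were fitted on, which is the usual reason for pairing stacking with cross validation.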


Final Model Results

Feature Name          MSE - Training  R2 - Training  MSE - Test  R2 - Test
Numeric Features      12.80           0.5            15.03       0.39
Categorical Features  19.71           0.22           21.30       0.14
Amenities             5.37            0.79           6.19        0.75
Final Model Overall   4.89            0.81           5.75        0.77

We performed a permutation test on the model: its predictions on the actual data were significantly better than its predictions on randomly permuted data.
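One simple way to run such a check is to compare the model's error against the errors obtained when the true ratings are shuffled, which breaks any link between predictions and targets. The sketch below uses made-up numbers and a hypothetical helper, `permutation_p_value`; the slides do not say which permutation procedure was used, so treat this as an illustration of the idea only.

```python
import random


def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_p_value(y_true, y_pred, n_permutations=1000, seed=0):
    """Share of shuffles whose MSE is at least as low as the observed MSE.

    The +1 terms count the observed arrangement as one of the permutations,
    so the p-value is never exactly zero.
    """
    rng = random.Random(seed)
    observed = mse(y_true, y_pred)
    shuffled = list(y_true)
    at_least_as_good = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if mse(shuffled, y_pred) <= observed:
            at_least_as_good += 1
    return (at_least_as_good + 1) / (n_permutations + 1)


# Made-up ratings and predictions for illustration only.
actual = [82, 85, 88, 90, 92, 94, 95, 97, 98, 100]
predicted = [83, 86, 87, 91, 91, 95, 94, 96, 99, 99]

p_value = permutation_p_value(actual, predicted)
print(f"p-value: {p_value:.4f}")
```

A small p-value means that shuffled targets almost never fit the predictions as well as the real ones do, which is what "significantly better than on randomly permuted data" amounts to.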

In short...

Our tool predicts the customer satisfaction rating from the features of a listing (test R2 of 0.77).

Hosts can improve customer satisfaction by updating their properties and offering different amenities.

Takeaway

Breakfast, guests, pets, and a washer/dryer matter most to customers in San Francisco.

WiFi and Cable TV, not so much!

Thanks for Watching

Questions?
