DSL presentation_final

Page 1: DSL presentation_final

Hello! We are Group 2!

Alok Mani Singh (03), Bodhisattwa Prasad Majumder (11), Gautam Kumar (17), Pranita Khandelwal (31), Ramakrishna Ronanki (34)

PGDBA!

Page 2: DSL presentation_final

Opinion Mining: A complete storytelling algorithm

Page 3: DSL presentation_final

“Your most unhappy customers are your greatest source of learning.”

- Bill Gates

Page 4: DSL presentation_final

PROBLEM STATEMENT

• Detect the main causes of satisfaction and dissatisfaction, given a collection of customer reviews about a product

• Infer conclusions about the most liked and most disliked aspects of the product

• Determine how these insights can be used to improve sales

Page 5: DSL presentation_final

INTRODUCTION

• Knowing only the sentiment is not enough. We need to know more about what customers said and why they said it

• The reasons for their satisfaction and dissatisfaction are important

• Customers’ opinions about the features of the product are important

Page 6: DSL presentation_final

DATA

Our working data contains a set of reviews for Hotel Marriott (San Diego), taken from TripAdvisor.com

Page 7: DSL presentation_final

NATURAL LANGUAGE PARSER

Page 8: DSL presentation_final

Why?

We can estimate an aspect sentiment score for each aspect of the hotel

Page 9: DSL presentation_final

ALGORITHM

Aspect Segmenter

• Prepare an exhaustive equivalence class of seed words for each aspect

• Generate the word dependency graph using the NLP parser

• For nodes matching an aspect equivalence class, extract the 2-hop subgraph from the dependency graph

• Calculate the sentiment score for each aspect with the help of a sentiment lexicon
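The steps above can be sketched in a few lines of Python. This is only a rough illustration, not the pipeline behind the slides: the dependency edges are written by hand instead of coming from a parser, and the seed words, the lexicon values and all function names (`build_graph`, `within_hops`, `aspect_scores`) are hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical seed words and sentiment lexicon, for illustration only.
ASPECT_SEEDS = {"room": {"room", "bed"}, "location": {"location"}}
SENTIMENT_LEXICON = {"clean": 1.0, "comfortable": 1.0, "dirty": -1.0, "noisy": -1.0}

def build_graph(edges):
    """Build an undirected adjacency map from (head, dependent) word pairs."""
    graph = defaultdict(set)
    for head, dep in edges:
        graph[head].add(dep)
        graph[dep].add(head)
    return graph

def within_hops(graph, start, max_hops=2):
    """Breadth-first search: all words at most max_hops edges away from start."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue
        for neighbour in graph[node]:
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return set(distance)

def aspect_scores(sentences):
    """Average lexicon score of the words found in each aspect's 2-hop subgraph."""
    hits = defaultdict(list)
    for edges in sentences:  # one dependency graph per sentence
        graph = build_graph(edges)
        for aspect, seeds in ASPECT_SEEDS.items():
            for seed in seeds:
                if seed not in graph:
                    continue
                for word in within_hops(graph, seed):
                    if word in SENTIMENT_LEXICON:
                        hits[aspect].append(SENTIMENT_LEXICON[word])
    return {aspect: sum(v) / len(v) for aspect, v in hits.items()}

# Hand-written edges standing in for parser output of:
# "The room was very clean." and "The location was noisy."
sentences = [
    [("clean", "room"), ("clean", "was"), ("clean", "very"), ("room", "the")],
    [("noisy", "location"), ("noisy", "was"), ("location", "the")],
]
scores = aspect_scores(sentences)
```

Building one graph per sentence keeps shared function words such as "was" from linking unrelated sentences, which would otherwise let sentiment words leak across aspects.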

Page 10: DSL presentation_final

RESULTS

Figure: Sentiments over aspects (bar chart across Value, Room, Location, Cleanliness, Checkin, Service and Business Service; vertical axis from 0 to 0.25)

• Location has the most positive sentiment, whereas cleanliness has the most negative sentiment

Page 11: DSL presentation_final

What’s Missing?

This approach cannot tell us how customers weigh each aspect. It is desirable to know how much customers value each aspect, along with the aspect-wise sentiments.

Page 12: DSL presentation_final

LATENT ASPECT RATING ANALYSIS (LARA)

Page 13: DSL presentation_final

Why?

The key idea is to find the latent aspect-wise weights for each review, along with the aspect-wise ratings

Page 14: DSL presentation_final

ALGORITHM

Aspect Segmenter

• Prepare an initial aspect keyword vocabulary

• Split each review into sentences

• Count the matches of aspect keywords in each sentence

• Annotate each sentence with the aspect that has the highest keyword count (argmax)

• Calculate the Chi-square statistic of each word for each aspect

• Append the top 5 words for each aspect to the new bootstrapped vocabulary
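A minimal sketch of one bootstrapping pass is given below, assuming a sentence-level 2x2 contingency table for the Chi-square statistic. The exact counts used in the original work may differ, and the example review, the seed vocabulary and the helper names are hypothetical.

```python
import re

def split_sentences(review):
    """Split a review into lower-cased sentences on ., ! and ?."""
    return [s.strip().lower() for s in re.split(r"[.!?]+", review) if s.strip()]

def tokenize(sentence):
    return re.findall(r"[a-z']+", sentence)

def annotate(sentences, vocab):
    """Label each sentence with the aspect whose keywords match most often."""
    labels = []
    for sentence in sentences:
        words = tokenize(sentence)
        counts = {a: sum(w in kws for w in words) for a, kws in vocab.items()}
        best = max(counts, key=counts.get)
        labels.append(best if counts[best] > 0 else None)  # None: no keyword hit
    return labels

def chi_square(a, b, c, d):
    """2x2 Chi-square: a=word & aspect, b=word & other, c=no word & aspect, d=neither."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0

def bootstrap(sentences, vocab, top_k=5):
    """One pass: add the top_k most aspect-associated words to each aspect."""
    labels = annotate(sentences, vocab)
    token_sets = [set(tokenize(s)) for s in sentences]
    all_words = set().union(*token_sets)
    new_vocab = {a: set(kws) for a, kws in vocab.items()}
    for aspect in vocab:
        scores = {}
        for w in all_words:
            pairs = list(zip(token_sets, labels))
            a = sum(1 for t, l in pairs if w in t and l == aspect)
            b = sum(1 for t, l in pairs if w in t and l != aspect)
            c = sum(1 for t, l in pairs if w not in t and l == aspect)
            d = len(pairs) - a - b - c
            if a * d > b * c:  # keep only words positively associated with the aspect
                scores[w] = chi_square(a, b, c, d)
        top = sorted(scores, key=scores.get, reverse=True)[:top_k]
        new_vocab[aspect].update(top)
    return new_vocab

# Hypothetical review text and seed vocabulary.
review = ("The room was clean. The bed in the room was comfy. "
          "The location is great. Great location near town.")
seed_vocab = {"rooms": {"room"}, "location": {"location"}}
sentences = split_sentences(review)
labels = annotate(sentences, seed_vocab)
new_vocab = bootstrap(sentences, seed_vocab)
```

On such a tiny sample, function words like "was" also score highly, so a real run would probably filter stop words first and repeat the pass until the vocabulary stops changing.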

Page 15: DSL presentation_final

BOOTSTRAPPED VOCABULARY FOR EACH ASPECT

Business service | Check in | Cleanliness | Location | Rooms | Service | Value
wifi | stuff | clean | traffic | room | food | range
computer | check | dirty | minute | suite | breakfast | price
internet | help | maintain | restaurant | view | buffet | quality
business | reservation | smell | great | bed | good | per
centre | willing | rooms | town | facing | eat | night
position | staff | comfortable | old | king | customer | priceline

free friendly corridors walking clean worth

helpful carpet easily quiet paid

hotel upgraded

room

Page 16: DSL presentation_final

ALGORITHM

Latent Rating Regression

• The overall rating of each review is a weighted sum of its latent aspect ratings

• Each word in the vocabulary has an aspect-related sentiment (Beta)

• Aspect weights (Alpha) follow a multivariate normal distribution

• The vector of aspect weights for each review is estimated by maximum a posteriori (MAP) estimation

• The aspect-wise sentiment (Beta) of each word and the multivariate mean and covariance are estimated with an Expectation-Maximization algorithm that maximizes the overall likelihood
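The forward part of this model (from word sentiments to an overall rating) can be illustrated as follows. The sketch assumes the exponential link between word sentiments and aspect ratings described in the cited LARA paper, which the slides do not spell out. It does not perform the MAP or EM estimation, and all numbers and names are hypothetical.

```python
import math

def aspect_ratings(word_freq, beta):
    """Aspect rating s_i = exp(sum over words of beta[i][w] * frequency of w in aspect i)."""
    return {
        aspect: math.exp(sum(beta[aspect].get(w, 0.0) * f for w, f in freqs.items()))
        for aspect, freqs in word_freq.items()
    }

def overall_rating(alpha, ratings):
    """Overall rating as the alpha-weighted sum of the latent aspect ratings."""
    return sum(alpha[a] * ratings[a] for a in ratings)

# Hypothetical values: normalized word frequencies per aspect segment of one review,
# word sentiments (Beta) and aspect weights (Alpha) that sum to 1.
word_freq = {"rooms": {"clean": 0.5, "old": 0.5}, "location": {"great": 1.0}}
beta = {"rooms": {"clean": 1.0, "old": -1.0}, "location": {"great": 1.0}}
alpha = {"rooms": 0.4, "location": 0.6}

ratings = aspect_ratings(word_freq, beta)
overall = overall_rating(alpha, ratings)
```

In the full method, Alpha for each review and the shared Beta, mean and covariance are unknown and are fitted so that the predicted overall rating matches the rating the reviewer actually gave.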

Page 17: DSL presentation_final

GENERATIVE FRAMEWORK

Page 18: DSL presentation_final

RESULTS

Review: “We enjoyed this area of San Diego a lot. We were there visiting a niece in the Navy but the area offers a lot of good food, entertainment and sight seeing. Old town San Diego is with in walking distant. We suggest paying for the parking as it can be a bit of a problem in the area.”

Figure: Latent aspect weights for the review (bar chart across Value, Room, Location, Cleanliness, Checkin, Service and Business Service; vertical axis from 0 to 0.4)

Page 19: DSL presentation_final

RESULTS

Review: “We arrived to find the outside of the hotel presentable and the front desk staff was very kind and helpful. However the room is OLD.. Also the complimentary shuttle to the ZOO and cruise port were also appreciated. We decided to eat their breakfast buffet which was pretty good for the money.”

Figure: Latent aspect weights for the review (bar chart across Value, Room, Location, Cleanliness, Checkin, Service and Business Service; vertical axis from 0 to 0.25)

Page 20: DSL presentation_final

RESULTS

Review: “An excellent experience with a good location and excellent staff. The location in Old Town is not only a good site for exploring Old Town, but it is minutes by car from the Airport and Downtown San Diego. It is next to Interstate 5 and it is advisable to request a room facing away from the highway. ..we could not hear any highway noise. For those without a car, there is a nearby trolley going to the center of San Diego at minimal cost.”

Figure: Latent aspect weights for the review (bar chart across Value, Room, Location, Cleanliness, Checkin, Service and Business Service; vertical axis from 0 to 1.2)

Page 21: DSL presentation_final

NLP based sentiment mapping + Latent Aspect Rating Regression

Page 22: DSL presentation_final

RESULTS

Figure: Customer Satisfaction Index (bar chart across Value, Room, Location, Cleanliness, Checkin, Service and Business Service; vertical axis from 0 to 2.5)

Customer Satisfaction Index = Aspect-wise sentiment × Aspect weight
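As a small worked example of this product, the sketch below multiplies a sentiment score by a weight for each aspect and ranks the aspects from lowest to highest index. The numbers are made up for illustration and are not the values behind the chart.

```python
def customer_satisfaction_index(sentiment, weight):
    """Per-aspect index: aspect-wise sentiment multiplied by aspect weight."""
    return {aspect: sentiment[aspect] * weight[aspect] for aspect in sentiment}

# Hypothetical sentiment scores and aspect weights.
sentiment = {"Location": 0.2, "Cleanliness": 0.05, "Value": 0.1}
weight = {"Location": 0.5, "Cleanliness": 0.3, "Value": 0.2}

csi = customer_satisfaction_index(sentiment, weight)
# Aspects with the lowest index come first, as candidates for attention.
priority = sorted(csi, key=csi.get)
```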

Page 23: DSL presentation_final

What’s Missing?

This approach gives the overall sentiment along with the weight of each aspect, but the actual issues behind the opinions are still unclear. It is desirable to know the exact issues regarding the different aspects after knowing the overall satisfaction indices.

Page 24: DSL presentation_final

OPINION MINING WITH LexRANK

Page 25: DSL presentation_final

Why?

The key idea is to find the most important opinions expressed in the reviews.

Page 26: DSL presentation_final

ALGORITHM

• Tag each sentence with its corresponding aspect

• Generate similarity scores for all pairs of sentences within each aspect

• Generate a graph with sentences as nodes and similarities as edge weights

• Identify the node with the highest LexRank

The sentence with the highest LexRank is the most central node in the sentence graph. For better representativeness, we take the 3 most central nodes. Centrality here means eigenvector centrality.
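The ranking step can be sketched as a damped power iteration over a sentence similarity graph. This is a simplified stand-in rather than the exact LexRank formulation: it uses plain term-frequency cosine similarity instead of an idf-weighted one, drops self-similarity, and the damping value, the example sentences and the function names are assumptions.

```python
import math
import re
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def lexrank(sentences, damping=0.85, iterations=100):
    """Centrality score per sentence via power iteration on the similarity graph."""
    bags = [Counter(re.findall(r"[a-z0-9']+", s.lower())) for s in sentences]
    n = len(bags)
    sim = [[cosine(bags[i], bags[j]) if i != j else 0.0 for j in range(n)]
           for i in range(n)]
    # Row-normalize so each row is a probability distribution over neighbours.
    rows = []
    for row in sim:
        total = sum(row)
        rows.append([v / total for v in row] if total else [1.0 / n] * n)
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(scores[j] * rows[j][i] for j in range(n))
            for i in range(n)
        ]
    return scores

def top_sentences(sentences, k=3):
    """Return the k most central sentences, most central first."""
    scores = lexrank(sentences)
    order = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in order[:k]]

# Hypothetical sentences already tagged with the same aspect.
location_sentences = [
    "great location near old town",
    "the location was great for exploring old town",
    "location is great",
    "the breakfast buffet was cold",
]
central = top_sentences(location_sentences, k=3)
```

Sentences that share wording with many others accumulate score, so an off-topic sentence that was tagged with the wrong aspect tends to rank last.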

Page 27: DSL presentation_final

we walked two blocks into town for breakfast... all in all a comfortable stay.

"the location in old town is not only a good site for exploring old town, but it is minutes by car from the airport and downtown san diego.

two blocks from old town and all the restaurants and shops there.

great location

Page 28: DSL presentation_final

they have a nice lobby with free coffee and 3 computers for you to use.

free wine and cheese in the afternoon (between 4:00 and 6:00) was a nice touch, and free ice cream on friday was nice too.

a nice touch is the free wine and cheese most evenings around 5:00pm in the lobby.

Page 29: DSL presentation_final

"requested room away from freeway so my room was very quiet."

the room was clean and the carpet appears to have been recently updated

"rooms were clean and comfy, nice big tv, comfy bed."

Page 30: DSL presentation_final

RESULTS

LOCATION

◦ “The location is great for walking to things in old town and for easily getting onto the interstate to drive elsewhere.”

◦ “the location was great for exploring old town”

VALUE

◦ “the price was $189 plus tax per night, was not worth it”

◦ “for the price paid for a three night stay, i could have probably gotten a better room”

SERVICE

◦ “My kids had fun in hotel’s pool and the breakfast buffet was pretty good.”

◦ “we ate the breakfast buffet all 3 mornings and it was very good!”

Page 31: DSL presentation_final

RESULTS

CLEANLINESS

◦ “The carpet in the room was stained and dirty, so that was gross.”

◦ “It is in the middle of major building work, it is dirty, dusty”

ROOMS

◦ “Our room was a king suite, on the 3rd floor, one window facing the pool.”

◦ “nice room, big bed and even bigger chair facing into town”

CHECK-IN

◦ “Check-in was prompt and the staff were helpful and friendly the entire time.”

BUSINESS SERVICE

◦ “Free internet in the lobby and quiet considering its size and position near the highway on one side”

Page 32: DSL presentation_final

BUSINESS RECOMMENDATION

Page 33: DSL presentation_final

RETENTION STRATEGY (CONJOINT APPROACH)

• Based on the aspect ratings, we can separate satisfied and unsatisfied customers for each aspect

• Survey the existing customer base to find out its price sensitivity

• For a regular but unsatisfied customer with a high RFM (recency, frequency, monetary) score, a price discount can be offered such that the utility gained from the lower price outweighs the utility lost on the aspect-specific service

• This helps retain the customer

Page 34: DSL presentation_final

DIRECT MARKETING

• For customers with high customer lifetime value, a direct marketing campaign can be run through telemarketing or mailers.

• Advertisements should be customer-specific and focus on the aspects to which the customer gives the most importance.

• Mailers could focus on the customer’s most liked and most disliked aspects, according to their personal preferences, using the aspect weights found from LARA

• A feedback call mechanism should be activated, and the customer should be assured of better service next time based on the aspect ratings.

Page 35: DSL presentation_final

GAINING COMPETITIVE ADVANTAGE

• Expand the customer base by attracting customers from competitors

• Reviews of competitor hotels can be parsed and a profile of each user generated, i.e. based on the aspect weights, we can find out which hotel aspects a particular customer deems important

• Based on the aspect weights of each customer, customers can be segmented, and each segment can be targeted on its most-preferred aspects, which gains competitive advantage by attracting potential customers

Page 36: DSL presentation_final

ASPECT SPECIFIC IMPROVEMENT

• Train the in-house staff on the importance of cleanliness in the hospitality business. Train them to improve the overall cleanliness level of the property.

• In cases where hotels have a commuting problem, liaise with cab providers in the local area and make sure guests can commute easily.

• Make check-ins as easy and seamless as possible. Do away with heavy paper use such as printouts and photocopies at check-in. The hotel must do check-ins electronically. Highlight “Green check-ins” as a feature of the property. For example, scan ID cards rather than using printouts.

• Ensure well-groomed, dapper, helpful and communicative staff at the check-in desk.

Page 37: DSL presentation_final

KEEPING UP WITH CUSTOMER INFO

• The hotel must keep a watch on its online reviews and discuss them in daily employee meetings. Triage the issues based on the customer satisfaction index.

• Solve top-priority issues in a time-bound manner. Easy-to-solve issues must be acted upon the same day. Escalate costly and complex issues to higher management

• Maintain and update the profiles of new and existing customers.

Page 38: DSL presentation_final

REFERENCES

• Stanford NLP Parser

• Hongning Wang, Yue Lu, Chengxiang Zhai; Latent Aspect Rating Analysis on Review Text Data: A Rating Regression Approach

• Güneş Erkan, Dragomir R. Radev; LexRank: Graph-based Lexical Centrality as Salience in Text Summarization