Summary of Kavita Ganesan's PhD Thesis
Kavita Ganesan, PhD Thesis
Opinion-Driven Decision Support System
Committee: Prof. ChengXiang Zhai (UIUC), Prof. Jiawei Han (UIUC), Prof. Kevin C. Chang (UIUC), Prof. Ellen Riloff (University of Utah), Dr. Evelyne Viegas (Microsoft Research)
Visiting a new city… Which hotel to stay at? What attractions to visit? Online opinions can help.
Opinions are essential for decision making. Without opinions, decision making becomes difficult!
ODSS Components
1. Data: a comprehensive set of opinions to support search and analysis capabilities
2. Analysis Tools: tools to help digest opinions (e.g. summaries, opinion trend visualization)
3. Search Capabilities: the ability to find entities using existing opinions
4. Presentation: putting it all together, an easy way for users to explore the results of the search and analysis components (e.g. organizing and summarizing results)

Focus of existing work leveraging opinions: opinion summarization, i.e. structured summaries:
1. Sentiment summary (e.g. +ve/-ve on a piece of text)
2. Fine-grained sentiment summary (e.g. Battery life: 2 stars; Audio: 1 star)

Not a complete solution to support decision making based on opinions!
Need to address a broader set of problems to enable opinion-driven decision support.
To support effective decision making…
We need data: a large number of online opinions, to allow users to get a complete and unbiased picture (opinions are very subjective and can vary a lot).
Currently: there is no study on how to systematically collect opinions from the web.
To support effective decision making…
We need different analysis tools to help users analyze & digest opinions:
- Sentiment trend visualization (fluctuation over time)
- Aspect-level sentiment summaries
- Textual summaries, etc.
Currently: the focus is on structured summarization.
To support effective decision making…
We need to incorporate search: allow users to find different items or entities based on existing opinions. This can improve user productivity by cutting down on the time spent reading a large number of opinions.
To support effective decision making…
We also need to know how to organize & present the opinions at hand effectively.
- Aspect-level summaries: how to organize these summaries? Scores or visuals (stars)? Do we show supporting phrases?
- Full opinions: how to allow effective browsing of reviews/opinions without overwhelming users?
ODSS Components
Opinion-Driven Decision Support System
1. Data: a comprehensive set of opinions to support opinion-based search & analysis tasks
2. Analysis Tools: tools to help analyze & digest opinions (e.g. summaries, opinion trend visualization)
3. Search Capabilities: find items/entities based on existing opinions (e.g. show "clean" hotels only)
4. Presentation: organizing opinions to support effective decision making
Focus of Proposed Methods in Thesis
1. Should be general: works across different domains & possibly different content types
2. Should be practical & lightweight: can be integrated into existing applications and can potentially scale up to large amounts of data
Opinion-Based Entity Ranking (Ganesan & Zhai 2012, Information Retrieval)

Search
The Problem
Currently there is no direct way of finding entities based on online opinions: users need to read opinions about different entities to find the ones that fulfill their personal criteria. Time consuming & impairs user productivity!
Proposed Idea
Use existing opinions to rank entities based on a set of unstructured user preferences. Finding a hotel: "clean rooms, good service". Finding a restaurant: "authentic food, good ambience".
Most Intuitive Approach for such Ranking
Use the results of existing opinion mining methods: find sentiment ratings on different aspects, then rank entities based on the discovered aspect ratings.
Problem: not practical!
- Costly: requires mining large amounts of textual content
- Needs prior knowledge of the set of queriable aspects
- Most existing methods rely on supervision (e.g. an overall user rating)
Proposed Approach
Use existing text retrieval models for ranking entities based on preferences:
- Can scale up to large amounts of textual content
- Can be tweaked
- Do not require costly IE or text mining
Goal of this work
Investigate the use of text retrieval models for Opinion-Based Entity Ranking:
- Compare 3 state-of-the-art retrieval models (BM25, PL2, DirichletLM), shown to work best for TR tasks: which one works best for this ranking task?
- Explore some extensions over existing IR models: can ranking improve with these extensions?
- Compile the first test set & propose an evaluation method for this new ranking task
2 Proposed Extensions Over IR Models
Extension 1: Modeling Aspects in Query
Standard retrieval cannot distinguish multiple preferences in a query. E.g. the query "clean rooms, cheap, good service" is treated as one long keyword query, but it actually expresses 3 preferences. Problem: an entity may score highly because it matches one aspect extremely well.
To address this: score each preference separately (as multiple queries), then combine the results of each query using different strategies: score combination (works best), average rank, min rank or max rank.
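As a sketch, the per-preference scoring and combination step might look like the following (function and strategy names are illustrative, not from the thesis); it assumes each preference has already been issued as its own retrieval query, yielding a per-entity score map:

```python
def combine_preference_scores(per_pref_scores, strategy="score"):
    """Combine per-preference retrieval results into one entity ranking.

    per_pref_scores: list of {entity: score} dicts, one per preference,
    e.g. one retrieval run each for "clean rooms", "cheap", "good service".
    """
    entities = set()
    for run in per_pref_scores:
        entities.update(run)

    combined = {}
    for e in entities:
        if strategy == "score":
            # score combination: sum of per-run retrieval scores
            combined[e] = sum(run.get(e, 0.0) for run in per_pref_scores)
        else:
            # rank-based combination: average / min / max rank across runs
            ranks = []
            for run in per_pref_scores:
                ordering = sorted(run, key=run.get, reverse=True)
                ranks.append(ordering.index(e) + 1 if e in run else len(run) + 1)
            agg = {"avg_rank": lambda r: sum(r) / len(r),
                   "min_rank": min,
                   "max_rank": max}[strategy]
            combined[e] = -agg(ranks)  # negate so a higher value is better
    return combined
```

Score combination simply sums the per-run retrieval scores, which is the strategy reported to work best on the slide; the rank-based variants are included only for comparison.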
Extension 2: Opinion Expansion
In standard retrieval, matching an opinion word and matching a standard topic word are not distinguished. For Opinion-Based Entity Ranking it is important to match the opinion words in the query, because opinion words have more variation than topic words (e.g. great: excellent, good, fantastic, terrific…).
Intuition: expand the query with similar opinion words, to help emphasize the matching of opinions.
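A minimal sketch of opinion expansion, assuming a hand-built synonym lexicon (the lexicon contents here are illustrative; a real system might derive similar opinion words from a thesaurus or from co-occurrence statistics):

```python
# Hypothetical opinion-word lexicon (illustrative, not from the thesis).
OPINION_SYNONYMS = {
    "great": ["excellent", "good", "fantastic", "terrific"],
    "clean": ["spotless", "tidy"],
}

def expand_opinion_words(query, lexicon=OPINION_SYNONYMS):
    """Append similar opinion words to the query so that matching
    opinion words is emphasized during retrieval."""
    terms = query.lower().split()
    expanded = list(terms)
    for t in terms:
        expanded.extend(w for w in lexicon.get(t, []) if w not in expanded)
    return " ".join(expanded)
```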
Performance Improvement over Std. Retrieval using Extensions

[Charts: improvement over standard retrieval for PL2, LM (DirichletLM) and BM25, using QAM alone and QAM + OpinExp; Hotels (improvements up to ~9%) and Cars (up to ~2.5%).]

- QAM: any model can be used.
- QAM + OpinExp: BM25 is the most effective.
Abstractive Summarization of Opinions (Opinosis: Ganesan et al., COLING '10; Micropinion Generation: Ganesan et al., WWW '12)
Analysis
Opinion Summarization Today…
Current methods focus on generating structured summaries of opinions [Lu et al., 2009; Lerman et al., 2009; …]
[Figure: Opinion Summary for iPod]
But we need supporting textual summaries! Otherwise, to know more, users must read many redundant sentences.
Criteria for ideal textual summaries
- Summarize the major opinions: what are the major complaints/praise in the text?
- Concise: easily digestible; viewable on a smaller screen
- Readable: easily understood
Extractive Summarization
Widely studied for years [Radev et al. 2000; Erkan & Radev, 2004; Mihalcea & Tarau, 2004; …], but not suitable for generating concise summaries:
- Bias: with a limit on summary size, the selected sentences may miss critical information
- Verbose: sentences are not shortened
We need more of an abstractive approach.
Approaches to Generating Textual Opinion Summaries
2 abstractive summarization methods:
- Opinosis: a graph-based summarization framework that relies on structural redundancies in sentences
- WebNgram: an optimization framework based on readability & representativeness scoring, with phrases generated by combining words from the original text
Opinosis: High Level Overview

Input: a set of sentences, topic specific and POS annotated.
Step 1: Generate a graph representation of the text (the Opinosis-Graph).
[Figure: word graph with nodes such as "my", "the iphone is a", "great", "device", "phone calls frequently", "drop", "too", "with", "."]
Step 2: Find promising paths (candidate summaries) & score the candidates, e.g. candidate summary 1: "calls frequently drop" (score 3.2); candidate summary 2: "great device" (score 2.5).
Step 3: Select the top-scoring candidates as the final summary, e.g. "The iPhone is a great device, but calls drop frequently."
Example Opinosis-Graph

Assume 2 sentences about the call quality of the iPhone:
1. My phone calls drop frequently with the iPhone.
2. Great device, but the calls drop too frequently.

- One node for each unique word + POS combination
- Sentence id (sid) and position id (pid) maintained at each node
- Edges indicate the relationship between words in a sentence

Nodes (sid:pid): my (1:1), phone (1:2), calls (1:3, 2:6), drop (1:4, 2:7), frequently (1:5, 2:9), with (1:6), the (1:7, 2:5), iphone (1:8), . (1:9, 2:10), great (2:1), device (2:2), , (2:3), but (2:4), too (2:8)
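A simplified sketch of the graph construction (using plain words as node keys; the thesis uses word + POS-tag combinations and also tracks punctuation):

```python
from collections import defaultdict

def build_opinosis_graph(sentences):
    """Build a word-adjacency graph: one node per unique word, each node
    keeping its (sentence_id, position_id) labels; edges follow sentence
    order. Simplified sketch of the Opinosis-Graph construction."""
    nodes = defaultdict(list)   # word -> [(sid, pid), ...]
    edges = set()               # (word, next_word)
    for sid, sent in enumerate(sentences, start=1):
        words = sent.lower().split()
        for pid, w in enumerate(words, start=1):
            nodes[w].append((sid, pid))
            if pid > 1:
                edges.add((words[pid - 2], w))  # edge from previous word
    return nodes, edges

nodes, edges = build_opinosis_graph([
    "my phone calls drop frequently with the iphone",
    "great device but the calls drop too frequently",
])
```

Shared paths such as calls → drop emerge automatically, because both sentences contribute (sid, pid) labels to the same nodes.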
3 Unique Properties of Opinosis-Graph

Property 1: Naturally captures redundancies
The path calls (1:3, 2:6) → drop (1:4, 2:7) → frequently (1:5, 2:9) is shared by both sentences and is naturally captured by the shared nodes. This makes it easy to discover redundancies for high-confidence summaries.
Property 2: Captures gapped subsequences
In sentence 2 ("the calls drop too frequently"), the word "too" creates a gap between "drop" and "frequently" (gap between words = 2), yet calls → drop → frequently is still captured as a subsequence. Gapped subsequences allow redundancy enforcement and the discovery of new sentences.
Property 3: Captures collapsible structures
E.g. "Calls drop frequently with the iPhone" and "Calls drop frequently with the BlackBerry" share one common high-redundancy path (calls → drop → frequently → with → the) ending in a high fan-out, which can be collapsed into "calls drop frequently with the iphone and black berry".
Evaluation
Input: topic-specific sentences from user reviews.
Evaluation measure: automatic ROUGE evaluation.
Human vs. Opinosis vs. MEAD

[Charts: ROUGE Recall (ROUGE-1, ROUGE-SU4) and ROUGE Precision for HUMAN (17 words), OPINOSISbest (15 words) and MEAD (75 words).]

- MEAD: highest recall (ROUGE-1 recall ~0.49) but lowest precision (ROUGE-1 precision ~0.09), a result of its much longer sentences. MEAD does not do well at generating concise summaries.
- Opinosis: recall and precision reasonably similar to human performance.
WebNgram
Uses existing words from the original text to generate micropinion summaries: a set of short phrases.
Emphasis on 3 aspects:
- Compactness: use as few words as possible
- Representativeness: reflect the major opinions in the text
- Readability: fairly well formed
Optimization Framework to capture compactness, representativeness & readability

M* = argmax over M = {m_1, …, m_k} of Σ_{i=1}^{k} S_rep(m_i) + Σ_{i=1}^{k} S_read(m_i)

subject to:
1. Σ_{i=1}^{k} |m_i| ≤ σ_ss
2. S_rep(m_i) ≥ σ_rep, for i = 1, …, k
3. S_read(m_i) ≥ σ_read, for i = 1, …, k
4. sim(m_i, m_j) ≤ σ_sim, for all i ≠ j

where S_rep(m_i) is the representativeness score of phrase m_i and S_read(m_i) is its readability score.

- Objective function: optimize the representativeness & readability scores, ensuring that summaries reflect the key opinions and are reasonably well formed.
- Constraint 1: maximum length of the summary. User adjustable; captures compactness.
- Constraints 2 & 3: minimum representativeness & readability. Helps improve efficiency and does not affect performance.
- Constraint 4: maximum similarity of phrases. User adjustable; captures compactness by minimizing redundancies.
Similarity scoring, sim(m_i, m_j)
Measure used: standard Jaccard similarity.
Why important? It allows the user to control the amount of redundancy. E.g. a user who wants good coverage of information on a small device can request less redundancy.
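A direct implementation of the phrase-level Jaccard similarity over word sets:

```python
def jaccard_sim(m_i, m_j):
    """Jaccard similarity between two phrases, computed on their word sets:
    |A intersect B| / |A union B|."""
    a, b = set(m_i.lower().split()), set(m_j.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0
```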
Representativeness scoring, S_rep(m_i)
Purpose: measure how well a phrase represents opinions from the original text.
2 properties of a highly representative phrase:
1. Its words should be strongly associated in the text
2. Its words should be sufficiently frequent in the text
Both are captured by a modified pointwise mutual information (PMI) function that adds the frequency of co-occurrence within a window:

pmi'(w_i, w_j) = log2[ p(w_i, w_j) · c(w_i, w_j) / ( p(w_i) · p(w_j) ) ]
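The modified PMI can be sketched as follows, assuming the probability and count estimates are supplied by the caller (how they are estimated from the corpus and the co-occurrence window is not shown here):

```python
import math

def modified_pmi(w_i, w_j, p, p_joint, c_joint):
    """pmi'(w_i, w_j) = log2( p(w_i, w_j) * c(w_i, w_j) / (p(w_i) * p(w_j)) )

    p        : unigram probability function
    p_joint  : co-occurrence probability of (w_i, w_j) within a window
    c_joint  : raw co-occurrence count of (w_i, w_j) within a window

    The extra c(w_i, w_j) factor rewards word pairs that are frequent,
    not just strongly associated."""
    return math.log2(p_joint(w_i, w_j) * c_joint(w_i, w_j) /
                     (p(w_i) * p(w_j)))
```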
Readability scoring, S_read(m_i)
Purpose: measure the well-formedness of a phrase.
Readability scoring: use Microsoft's Web N-gram model (publicly available) to obtain conditional probabilities of phrases. Intuition: a readable phrase occurs more frequently according to the web than a non-readable phrase.

S_read(w_1 … w_n) = (1/K) Σ_k log p(w_k | w_{k-n+1} … w_{k-1})

i.e. the chain rule is used to compute the joint probability of the phrase in terms of conditional probabilities, averaged over the K scored terms.
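A sketch of the readability score, assuming a conditional-probability function `cond_prob` backed by an n-gram language model such as the Web N-gram service (the exact context length and normalization used in the thesis may differ):

```python
import math

def readability_score(phrase, cond_prob, order=3):
    """Average log conditional probability of a phrase under an n-gram LM.

    cond_prob(word, context) -> p(word | context), assumed to come from an
    external language model. Higher (less negative) scores indicate a more
    readable, better-formed phrase."""
    words = phrase.lower().split()
    total, k = 0.0, 0
    for i in range(1, len(words)):
        # context = up to (order - 1) preceding words
        context = tuple(words[max(0, i - order + 1):i])
        total += math.log(cond_prob(words[i], context))
        k += 1
    return total / k if k else 0.0
```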
Evaluation
Input: user reviews for 330 products (CNET).
Evaluation measure: automatic ROUGE evaluation.
Results: Performance comparison

[Chart: ROUGE-2 recall vs. summary size (5-30 max words) for KEA, Tfidf, Opinosis and WebNGram, with PROS, CONS and FULL REVIEW texts shown alongside.]

- WebNgram performs the best for this task.
- KEA is slightly better than Tfidf; Tfidf gives the worst performance.
OpinoFetch: An Unsupervised Approach to Collecting Opinions for Arbitrary Entities (to be submitted)

3. Data
The current problem
There is no easy way to obtain a comprehensive set of opinions about an entity.
Where to get opinions now? Rely on content providers, or crawl a few sources.
Problems: this can result in source-specific bias and in data sparseness for some entities.
Goal of this work
Automatically crawl online reviews for arbitrary entities, e.g. cars, restaurants, doctors. We target online reviews because they represent a big portion of online opinions.
Existing Focused Crawlers
- Meant to collect pages relevant to a topic, e.g. "Database Systems" or "Boston Terror Attack"; the page type (news article, review page, forum page, etc.) is not as important as the content.
- Most focused crawlers are supervised and require large amounts of training data for each topic.
- Not suitable for review collection on arbitrary entities: needing training data for each entity will not scale up to a large number of entities.
Propose: OpinoFetch
A focused crawler for collecting review pages on arbitrary entities.
- Unsupervised approach: does not require large amounts of training data
- Solves the crawling problem efficiently: uses a special data structure for relevance scoring
OpinoFetch: High Level Overview

Input: a set of entities in a domain (e.g. all hotels in a city): 1. Hampton Inn Champaign… 2. I Hotel Conference Center… 3. La Quinta Inn Champaign… 4. Drury Inn…
Step 1: For each entity, obtain an initial set of Candidate Review Pages (CRPs), e.g. tripadvisor.com/Hotel_Review-g36806-d903…, hamptoninn3.hilton.com/en/hotels/…
Step 2: Expand the list of CRPs by exploring links in the neighborhood of the initial CRPs.
Step 3: Score the CRPs by entity relevance (S_ent) and review page relevance (S_rev); select pages with S_rev > σ_rev and S_ent > σ_ent to collect the relevant review pages.
Step 1. Finding Initial CRPs
Use any general web search engine (e.g. Bing/Google), on a per-entity basis. Search engines do partial matching of entities to pages, so pages in the vicinity of the search results are more likely to be related to the entity.
Entity Query format: "entity name + brand/address" + "reviews", e.g. "Hampton Inn Champaign 1200 W University Ave Reviews".
Step 2. Expand CRP List
Follow the top-N URLs around the vicinity of the search results, using a URL prioritization strategy to bias the crawl path towards entity-related pages. Score each URL based on the similarity between (a) the URL and the Entity Query, Sim(URL, EQ), and (b) the anchor text and the Entity Query, Sim(Anchor, EQ).
Step 3a. Review Relevance, S_rev(p_i)

To determine if a page is indeed a review page, use a review vocabulary: a lexicon of the most commonly occurring words within review pages (details in thesis). Idea: score a page based on the number of review-page words it contains.

Raw review page relevance score:
S_rev_raw(p_i) = Σ_{t ∈ V} log2( c(t, p_i) · wt(t) )

Normalized to obtain the final review page relevance score:
S_rev(p_i) = S_rev_raw(p_i) / normalizer(p_i), with S_rev(p_i) ∈ [0, 1]

where t is a term in the review vocabulary V, c(t, p_i) is the frequency of t in page p_i (tf), and wt(t) is the importance weighting of t in the review vocabulary. The normalizer is needed to set proper thresholds.
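The scoring could be sketched as below, summing only over vocabulary terms that actually occur on the page (an assumption made here, since log2 of a zero count is undefined):

```python
import math
from collections import Counter

def raw_review_relevance(page_text, vocab_weights):
    """Sketch of S_rev_raw(p_i) = sum over t in V of log2(c(t, p_i) * wt(t)),
    restricted to vocabulary terms that occur on the page.

    vocab_weights: {term: importance weight} review vocabulary."""
    tf = Counter(page_text.lower().split())
    score = 0.0
    for t, wt in vocab_weights.items():
        if tf[t] > 0:
            score += math.log2(tf[t] * wt)
    return score

def review_relevance(page_text, vocab_weights, normalizer):
    """Normalize the raw score into [0, 1]."""
    raw = raw_review_relevance(page_text, vocab_weights)
    return min(raw / normalizer, 1.0) if normalizer else 0.0
```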
Step 3a. How to Normalize S_rev_raw(p_i)?
Explored 3 normalization options:
- SiteMax (SM): the max S_rev_raw(p_i) among all pages related to a particular site; normalizes based on site density
- EntityMax (EM): the max S_rev_raw(p_i) among all pages related to an entity; normalizes based on entity popularity
- EntityMax + GlobalMax (GM) or SiteMax + GlobalMax (GM): to help with cases where SM/EM are unreliable
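SiteMax and EntityMax are both "max raw score within a group" computations; a sketch (the GlobalMax combination is omitted, as its exact form is not given here):

```python
def site_max(raw_scores, page_site):
    """SiteMax: for each site, the max raw S_rev score among its pages."""
    best = {}
    for page, score in raw_scores.items():
        site = page_site[page]
        best[site] = max(best.get(site, 0.0), score)
    return best

def entity_max(raw_scores, page_entity):
    """EntityMax: for each entity, the max raw S_rev score among its pages."""
    best = {}
    for page, score in raw_scores.items():
        ent = page_entity[page]
        best[ent] = max(best.get(ent, 0.0), score)
    return best
```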
Step 3b. Entity Relevance, S_ent(p_i, e_k)
To determine if a page is about the target entity: based on the similarity between the page URL & the Entity Query.
Why it works: most review pages have highly descriptive URLs, and the Entity Query is a detailed description of the entity; the more the URL resembles the query, the more likely the page is relevant to the target entity.
Similarity measure: Jaccard similarity.
Usability of Crawler
The steps proposed so far can be implemented in a variety of different ways. Our goal: make the crawling framework usable in practice.
2 Aspects of Usability
1. Efficiency: allow review collection for a large number of entities; the task should terminate in reasonable time with reasonable accuracy. Problems arise when required information cannot be accessed quickly (e.g. repeated access to the term frequencies of different pages).
2. Rich Information Access (RIA): allow the client to access information beyond the crawled pages, e.g. "get all review pages from the top 10 popular sites for entity X". A DB is not suitable because it cannot naturally model the complex relationships and would yield large joins.
FetchGraph: A Rich Information Network
A heterogeneous graph data structure that models the complex relationships between the different components in a data collection problem.
FetchGraph: A Rich Information Network

[Figure: an example FetchGraph. Logical nodes: the Current Query (Q), the Review Vocabulary (V), and other logical nodes. Entity nodes E1…Ek (Hampton Inn Champaign, I-Hotel Conference Center, Drury Inn Champaign): the list of entities for which reviews are required. Page nodes P1…Pn, based on the set of CRPs found for each entity, linked via t = title, u = url, c = content edges. Site nodes S1…St (tripadvisor.com, hotels.com, local.yahoo.com). Term nodes t1…tz, one node per unique term, connected to the vocabulary by importance-weight (wt) edges: at the core, the graph is made up of terms.]
Benefits of the FetchGraph
- Maintain one simple data structure with access to various statistics, e.g. the TF of a word in a page = EdgeWT(content node → term node)
- Access to complex relationships and global information
- Compact: can be used as an in-memory data structure
- The network can be persisted and accessed later; client applications can use it to answer interesting application-related questions, e.g. "get all review pages for entity X from the top 10 popular sites"
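A minimal sketch of such a heterogeneous graph, with typed nodes and labeled, weighted edges (node and edge names here are illustrative, not the thesis implementation):

```python
class FetchGraph:
    """Heterogeneous graph sketch: typed nodes plus weighted, labeled edges,
    so that a statistic like a term's TF in a page is a single edge lookup
    instead of re-parsing the page."""

    def __init__(self):
        self.nodes = {}   # node_id -> node_type (e.g. "page", "site", "term")
        self.edges = {}   # (src, dst, label) -> weight
        self.out = {}     # src -> set of (dst, label)

    def add_node(self, node_id, node_type):
        self.nodes[node_id] = node_type

    def add_edge(self, src, dst, label, weight=1.0):
        self.edges[(src, dst, label)] = weight
        self.out.setdefault(src, set()).add((dst, label))

    def term_freq(self, page, term):
        # TF of a term in a page = weight of the "content" edge
        return self.edges.get((page, term, "content"), 0.0)

    def pages_of(self, node, label):
        # e.g. all pages linked to a site node or an entity node
        return [dst for dst, lab in self.out.get(node, set()) if lab == label]
```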
Computing S_rev_raw(p_i) using the FetchGraph

[Figure: page nodes P1…Pn with a Content node (logical node) whose edges to term nodes t1…tz carry term frequencies (tf), and a Review Vocabulary node V (logical node) whose edges carry importance weights (wt). Outgoing edges denote term ownership.]

To compute S_rev_raw(p_i): take the terms present in both the Content node and the Review Vocabulary node; the TFs and weights are obtained from the edges. Lookup of review-vocabulary words within a page is therefore fast, and there is no need to parse page contents each time a page is encountered.
Obtaining SiteMax from the FetchGraph

[Figure: the same example FetchGraph, highlighting a site node and its pages.]

Access all pages connected to the site node; this requires the complete graph.
Obtaining EntityMax from the FetchGraph

[Figure: the same example FetchGraph, highlighting an entity node and its pages.]

Access all pages connected to the entity node; this requires the complete graph.
Computing S_ent(p_i, e_k) using the FetchGraph

[Figure: page nodes P1…Pn and term nodes t1…tz with tf-weighted edges, plus an Entity Query node q1 (logical node, e.g. "Hampton Inn Champaign 1200 W Univ… Reviews") and a URL node U (logical node, e.g. tripadvisor.com/ShowUser…).]
Evaluation
Goal: evaluate accuracy & give insights into efficiency using the FetchGraph.
Evaluated in 3 domains: Electronics (5 entities), Hotels (5) and Attractions (4). Only 14 entities, as it is expensive to obtain judgments.
Gold standard: for each entity, explore the top 50 Google results & the links in the vicinity of the results (up to depth 3). 3 human judges determine the relevance of the collected links to the entity query (crowdsourcing); the final judgment is by majority voting.
Evaluation
Baseline: Google search results, deemed relevant to the entity query.
Evaluation measures: precision, and recall as an estimate of the coverage of review pages.

Prec(e_k) = #RelPages(e_k) / #RetrievedPages(e_k)
Recall(e_k) = #RelPages(e_k) / #GoldStdRelPages(e_k)
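These measures are straightforward to compute per entity:

```python
def precision_recall(retrieved, relevant_retrieved, gold_relevant):
    """Prec(e_k)   = #RelPages(e_k) / #RetrievedPages(e_k)
       Recall(e_k) = #RelPages(e_k) / #GoldStdRelPages(e_k)"""
    rel = len(relevant_retrieved)
    prec = rel / len(retrieved) if retrieved else 0.0
    rec = rel / len(gold_relevant) if gold_relevant else 0.0
    return prec, rec
```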
Results…
OpinoFetch vs. Google - Recall

[Chart: recall (0.00-0.25) vs. number of search results (10-50) for Google, OpinoFetch and OpinoFetchUnnormalized.]

- Google: recall consistently low. Search results are not always relevant to the entity query, or are not direct pointers to actual review pages.
- OpinoFetch: recall keeps improving. A lot of relevant content lies in the vicinity of the search results, and OpinoFetch is able to discover it.
- OpinoFetch achieves better recall with normalization: scores normalized with the special normalizers (e.g. EntityMax/SiteMax) make it easier to distinguish relevant review pages.
Best Normalizer for S_rev(p_i)

[Chart: % change in precision by normalizer] EntityMax + GlobalMax: 97.23%; EntityMax: 85.72%; SiteMax + GlobalMax: 36.23%; SiteMax: 19.62%.

- EM + GM gives the best precision; SM gives the lowest.
- SM is the worst performing because certain sites cover different classes of entities, so the max score from a site may be unreliable for sparse entities.
Growth of FetchGraph

[Chart: graph size vs. # pages crawled; roughly linear growth up to ~450,000 graph elements over 1,000 pages.]

Linear growth without any optimization/compression: it is possible to use the FetchGraph as an in-memory data structure.
FetchGraph and Efficiency

Avg. execution time with/without FetchGraph:

                       | With FetchGraph | Without FetchGraph
S_rev_raw(p_i)         | ~0.09 ms        | ~8.60 ms
EntityMax normalizer   | ~0.06 ms        | ~4.40 s

Without FetchGraph: page contents must be parsed each time S_rev_raw(p_i) is computed, and sets of pages must be loaded into memory to find the EntityMax normalizer.
With FetchGraph: a page is loaded into memory once and S_rev_raw(p_i) is computed from the graph; global information is tracked until the end, so obtaining the EntityMax normalizer only requires a lookup over the related set of pages.
OpinoFetch: Conclusion
- Proposed an unsupervised, practical method for collecting reviews on arbitrary entities; it works with reasonable accuracy without requiring large amounts of training data.
- Proposed the FetchGraph, which helps with efficient lookup of various statistics and is useful for answering application-related queries.
FindiLike Demo: Thesis Ideas into a Usable System (Ganesan & Zhai, WWW 2012)
FindiLike – Preference-Driven Entity Search
- Finds & ranks entities based on user preferences: unstructured opinion preferences (novel) and structured preferences (e.g. price, brand, etc.)
- Beyond search: support for the analysis of entities, with the ability to generate textual summaries of reviews and to display tag clouds of reviews
- The current version works in the hotels domain
FindiLike – Search Interface [Link]
E.g. finding "clean" hotels in Los Angeles close to "Universal Studios".
- Search: find entities based on unstructured opinion preferences, combined with structured preferences
- Ranking: how well are all preferences matched?
FindiLike – Review Tag Clouds for "Sportsmen's Lodge"
Tag clouds weighted by frequency; related snippets shown (e.g. "convenient location").
FindiLike – Review Summary for "Sportsmen's Lodge"
Opinion summaries: readable and well formed; related snippets shown.
Review summary using OpinoFetch-crawled reviews: "Hampton Inn Champaign" [link]
- Summary with initial reviews: 26 reviews in total, from 1-2 sources
- Summary with OpinoFetch reviews: 135 reviews (8 sources), extracted with a baseline extractor. Not all reviews were included: a filter based on the length of the review and its subjectivity score was applied.
Future Work

Opinion-Based Entity Ranking:
- Use click-through & query logs to further improve the ranking of entities (now possible since everything is logged by the demo system)
- Look into the use of phrasal search for ranking: limit deviation from the actual query (e.g. "close to university"); explore "back-off" style scoring, i.e. score based on the phrase, then remove the phrase restriction
…Future Work

Opinosis:
- How to scale up to very large amounts of text? Explore the use of a map-reduce framework.
- Would this approach work with other types of text, e.g. tweets or Facebook comments (shorter texts)?

Opinion Acquisition:
- Compare OpinoFetch with a supervised crawler: can it achieve comparable results?
- How to improve the recall of OpinoFetch? To evaluate at a reasonable scale: can judgments be approximated without relying on humans?
References
[Barzilay and Lee2003] Barzilay, Regina and Lillian Lee. 2003. Learning to paraphrase: an unsupervised approach using multiple-sequence alignment. In NAACL ’03: Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology, pages 16–23, Morristown, NJ, USA.
[DeJong1982] DeJong, Gerald F. 1982. An overview of the FRUMP system. In Lehnert, Wendy G. and Martin H. Ringle, editors, Strategies for Natural Language Processing, pages 149–176. Lawrence Erlbaum, Hillsdale, NJ.
[Erkan and Radev2004] Erkan, Günes and Dragomir R. Radev. 2004. LexRank: graph-based lexical centrality as salience in text summarization. J. Artif. Int. Res., 22(1):457–479.
[Finley and Harabagiu2002] Finley, Sanda Harabagiu and Sanda M. Harabagiu. 2002. Generating single and multi-document summaries with GISTexter. In Proceedings of the Workshop on Automatic Summarization, pages 30–38.
[Hu and Liu2004] Hu, Minqing and Bing Liu. 2004. Mining and summarizing customer reviews. In KDD, pages 168–177.
[Jing and McKeown2000] Jing, Hongyan and Kathleen R. McKeown. 2000. Cut and paste based text summarization. In Proceedings of the 1st North American Chapter of the Association for Computational Linguistics Conference, pages 178–185, San Francisco, CA, USA. Morgan Kaufmann Publishers Inc.
[Lerman et al.2009] Lerman, Kevin, Sasha Blair-Goldensohn, and Ryan Mcdonald. 2009. Sentiment summarization: Evaluating and learning user preferences. In 12th Conference of the European Chapter of the Association for Computational Linguistics (EACL-09).
[Mihalcea and Tarau2004] Mihalcea, R. and P. Tarau. 2004. TextRank: Bringing order into texts. In Proceedings of EMNLP-04, the 2004 Conference on Empirical Methods in Natural Language Processing, July.
[Pang and Lee2004] Pang, Bo and Lillian Lee. 2004. A sentimental education: Sentiment analysis using subjectivity summarization based on minimum cuts. In Proceedings of the ACL, pages 271–278.
[Pang et al.2002] Pang, Bo, Lillian Lee, and Shivakumar Vaithyanathan. 2002. Thumbs up? Sentiment classification using machine learning techniques. In Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 79–86.
[Radev and McKeown1998] Radev, DR and K. McKeown. 1998. Generating natural language summaries from multiple on-line sources. Computational Linguistics, 24(3):469–500.
[More in Thesis Report]