Summary of Kavita Ganesan's PhD Thesis
Kavita Ganesan, PhD Thesis
Opinion-Driven Decision Support System
Committee: Prof. ChengXiang Zhai (UIUC), Prof. Jiawei Han (UIUC), Prof. Kevin C. Chang (UIUC), Prof. Ellen Riloff (University of Utah), Dr. Evelyne Viegas (Microsoft Research)
Visiting a new city… Which hotel to stay at? What attractions to visit? Online opinions can help.
Opinions are essential for decision making. Without opinions, decision making becomes difficult!
ODSS Components
1. Data: a comprehensive set of opinions to support search and analysis capabilities
2. Analysis Tools: tools to help digest opinions (e.g. summaries, opinion trend visualization)
3. Search Capabilities: the ability to find entities using existing opinions
4. Presentation: putting it all together, an easy way for users to explore the results of the search and analysis components (e.g. organizing and summarizing results)

Focus of existing work leveraging opinions: opinion summarization, i.e. structured summaries:
1. Sentiment summary (e.g. +ve/-ve on a piece of text)
2. Fine-grained sentiment summary (e.g. Battery life: 2 stars; Audio: 1 star)

Not a complete solution to support decision making based on opinions!
Need to address a broader set of problems to enable opinion-driven decision support.
To support effective decision making…
We need data: a large number of online opinions, to allow users to get a complete and unbiased picture (opinions are very subjective and can vary a lot).
Currently: there is no study on how to systematically collect opinions from the web.
To support effective decision making…
We need different analysis tools to help users analyze & digest opinions:
- Sentiment trend visualization (fluctuation over time)
- Aspect-level sentiment summaries
- Textual summaries, etc.
Currently: the focus is on structured summarization.
To support effective decision making…
We need to incorporate search: allow users to find different items or entities based on existing opinions. This can improve user productivity by cutting down on the time spent reading a large number of opinions.
To support effective decision making…
We also need to know how to organize & present the opinions at hand effectively.
- Aspect-level summaries: how to organize these summaries? Scores or visuals (stars)? Do we show supporting phrases?
- Full opinions: how to allow effective browsing of reviews/opinions without overwhelming users?
ODSS Components
Opinion-Driven Decision Support System
1. Data: a comprehensive set of opinions to support opinion-based search & analysis tasks
2. Analysis Tools: tools to help analyze & digest opinions (e.g. summaries, opinion trend visualization)
3. Search Capabilities: find items/entities based on existing opinions (e.g. show "clean" hotels only)
4. Presentation: organizing opinions to support effective decision making
Focus of Proposed Methods in Thesis
1. Should be general: works across different domains & possibly different content types
2. Should be practical & lightweight: can be integrated into existing applications and can potentially scale up to large amounts of data
Opinion-Based Entity Ranking (Ganesan & Zhai 2012, Information Retrieval)

Search
The Problem
Currently there is no direct way of finding entities based on online opinions: users need to read opinions about different entities to find the ones that fulfill their personal criteria. Time consuming & impairs user productivity!
Proposed Idea
Use existing opinions to rank entities based on a set of unstructured user preferences. Finding a hotel: "clean rooms, good service". Finding a restaurant: "authentic food, good ambience".
Most Intuitive Approach for such Ranking
Use the results of existing opinion mining methods: find sentiment ratings on different aspects, then rank entities based on the discovered aspect ratings.
Problem: not practical!
- Costly: requires mining large amounts of textual content
- Needs prior knowledge of the set of queriable aspects
- Most existing methods rely on supervision (e.g. an overall user rating)
Proposed Approach
Use existing text retrieval models for ranking entities based on preferences:
- Can scale up to large amounts of textual content
- Can be tweaked
- Do not require costly IE or text mining
Goal of this work
Investigate the use of text retrieval models for Opinion-Based Entity Ranking:
- Compare 3 state-of-the-art retrieval models (BM25, PL2, DirichletLM), shown to work best for TR tasks: which one works best for this ranking task?
- Explore some extensions over existing IR models: can ranking improve with these extensions?
- Compile the first test set & propose an evaluation method for this new ranking task
2 Proposed Extensions Over IR Models
Extension 1: Modeling Aspects in Query
Standard retrieval cannot distinguish multiple preferences in a query. E.g. the query "clean rooms, cheap, good service" is treated as one long keyword query, but it actually expresses 3 preferences. Problem: an entity may score highly because it matches one aspect extremely well.
To address this: score each preference separately (as multiple queries), then combine the results of each query using different strategies: score combination (works best), average rank, min rank or max rank.
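As a sketch, the per-preference scoring and combination step might look like the following (function and strategy names are illustrative, not from the thesis); it assumes each preference has already been issued as its own retrieval query, yielding a per-entity score map:

```python
def combine_preference_scores(per_pref_scores, strategy="score"):
    """Combine per-preference retrieval results into one entity ranking.

    per_pref_scores: list of {entity: score} dicts, one per preference,
    e.g. one retrieval run each for "clean rooms", "cheap", "good service".
    """
    entities = set()
    for run in per_pref_scores:
        entities.update(run)

    combined = {}
    for e in entities:
        if strategy == "score":
            # score combination: sum of per-run retrieval scores
            combined[e] = sum(run.get(e, 0.0) for run in per_pref_scores)
        else:
            # rank-based combination: average / min / max rank across runs
            ranks = []
            for run in per_pref_scores:
                ordering = sorted(run, key=run.get, reverse=True)
                ranks.append(ordering.index(e) + 1 if e in run else len(run) + 1)
            agg = {"avg_rank": lambda r: sum(r) / len(r),
                   "min_rank": min,
                   "max_rank": max}[strategy]
            combined[e] = -agg(ranks)  # negate so a higher value is better
    return combined
```

Score combination simply sums the per-run retrieval scores, which is the strategy reported to work best on the slide; the rank-based variants are included only for comparison.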
Extension 2: Opinion Expansion
In standard retrieval, matching an opinion word and matching a standard topic word are not distinguished. For Opinion-Based Entity Ranking it is important to match the opinion words in the query, because opinion words have more variation than topic words (e.g. great: excellent, good, fantastic, terrific…).
Intuition: expand the query with similar opinion words, to help emphasize the matching of opinions.
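A minimal sketch of opinion expansion, assuming a hand-built synonym lexicon (the lexicon contents here are illustrative; a real system might derive similar opinion words from a thesaurus or from co-occurrence statistics):

```python
# Hypothetical opinion-word lexicon (illustrative, not from the thesis).
OPINION_SYNONYMS = {
    "great": ["excellent", "good", "fantastic", "terrific"],
    "clean": ["spotless", "tidy"],
}

def expand_opinion_words(query, lexicon=OPINION_SYNONYMS):
    """Append similar opinion words to the query so that matching
    opinion words is emphasized during retrieval."""
    terms = query.lower().split()
    expanded = list(terms)
    for t in terms:
        expanded.extend(w for w in lexicon.get(t, []) if w not in expanded)
    return " ".join(expanded)
```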
Performance Improvement over Std. Retrieval using Extensions

[Charts: improvement over standard retrieval for PL2, LM (DirichletLM) and BM25, using QAM alone and QAM + OpinExp; Hotels (improvements up to ~9%) and Cars (up to ~2.5%).]

- QAM: any model can be used.
- QAM + OpinExp: BM25 is the most effective.
Abstractive Summarization of Opinions (Opinosis: Ganesan et al., COLING '10; Micropinion Generation: Ganesan et al., WWW '12)
Analysis
Opinion Summarization Today…
Current methods focus on generating structured summaries of opinions [Lu et al., 2009; Lerman et al., 2009; …]
[Figure: Opinion Summary for iPod]
But we need supporting textual summaries! Otherwise, to know more, users must read many redundant sentences.
Criteria for ideal textual summaries
- Summarize the major opinions: what are the major complaints/praise in the text?
- Concise: easily digestible; viewable on a smaller screen
- Readable: easily understood
Extractive Summarization
Widely studied for years [Radev et al. 2000; Erkan & Radev, 2004; Mihalcea & Tarau, 2004; …], but not suitable for generating concise summaries:
- Bias: with a limit on summary size, the selected sentences may miss critical information
- Verbose: sentences are not shortened
We need more of an abstractive approach.
Approaches to Generating Textual Opinion Summaries
2 abstractive summarization methods:
- Opinosis: a graph-based summarization framework that relies on structural redundancies in sentences
- WebNgram: an optimization framework based on readability & representativeness scoring, with phrases generated by combining words from the original text
Opinosis: High Level Overview

Input: a set of sentences, topic specific and POS annotated.
Step 1: Generate a graph representation of the text (the Opinosis-Graph).
[Figure: word graph with nodes such as "my", "the iphone is a", "great", "device", "phone calls frequently", "drop", "too", "with", "."]
Step 2: Find promising paths (candidate summaries) & score the candidates, e.g. candidate summary 1: "calls frequently drop" (score 3.2); candidate summary 2: "great device" (score 2.5).
Step 3: Select the top-scoring candidates as the final summary, e.g. "The iPhone is a great device, but calls drop frequently."
Example Opinosis-Graph

Assume 2 sentences about the call quality of the iPhone:
1. My phone calls drop frequently with the iPhone.
2. Great device, but the calls drop too frequently.

- One node for each unique word + POS combination
- Sentence id (sid) and position id (pid) maintained at each node
- Edges indicate the relationship between words in a sentence

Nodes (sid:pid): my (1:1), phone (1:2), calls (1:3, 2:6), drop (1:4, 2:7), frequently (1:5, 2:9), with (1:6), the (1:7, 2:5), iphone (1:8), . (1:9, 2:10), great (2:1), device (2:2), , (2:3), but (2:4), too (2:8)
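A simplified sketch of the graph construction (using plain words as node keys; the thesis uses word + POS-tag combinations and also tracks punctuation):

```python
from collections import defaultdict

def build_opinosis_graph(sentences):
    """Build a word-adjacency graph: one node per unique word, each node
    keeping its (sentence_id, position_id) labels; edges follow sentence
    order. Simplified sketch of the Opinosis-Graph construction."""
    nodes = defaultdict(list)   # word -> [(sid, pid), ...]
    edges = set()               # (word, next_word)
    for sid, sent in enumerate(sentences, start=1):
        words = sent.lower().split()
        for pid, w in enumerate(words, start=1):
            nodes[w].append((sid, pid))
            if pid > 1:
                edges.add((words[pid - 2], w))  # edge from previous word
    return nodes, edges

nodes, edges = build_opinosis_graph([
    "my phone calls drop frequently with the iphone",
    "great device but the calls drop too frequently",
])
```

Shared paths such as calls → drop emerge automatically, because both sentences contribute (sid, pid) labels to the same nodes.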
3 Unique Properties of Opinosis-Graph

Property 1: Naturally captures redundancies
The path calls (1:3, 2:6) → drop (1:4, 2:7) → frequently (1:5, 2:9) is shared by both sentences and is naturally captured by the shared nodes. This makes it easy to discover redundancies for high-confidence summaries.
Property 2: Captures gapped subsequences
In sentence 2 ("the calls drop too frequently"), the word "too" creates a gap between "drop" and "frequently" (gap between words = 2), yet calls → drop → frequently is still captured as a subsequence. Gapped subsequences allow redundancy enforcement and the discovery of new sentences.
Property 3: Captures collapsible structures
E.g. "Calls drop frequently with the iPhone" and "Calls drop frequently with the BlackBerry" share one common high-redundancy path (calls → drop → frequently → with → the) ending in a high fan-out, which can be collapsed into "calls drop frequently with the iphone and black berry".
Evaluation
Input: topic-specific sentences from user reviews.
Evaluation measure: automatic ROUGE evaluation.
Human vs. Opinosis vs. MEAD

[Charts: ROUGE Recall (ROUGE-1, ROUGE-SU4) and ROUGE Precision for HUMAN (17 words), OPINOSISbest (15 words) and MEAD (75 words).]

- MEAD: highest recall (ROUGE-1 recall ~0.49) but lowest precision (ROUGE-1 precision ~0.09), a result of its much longer sentences. MEAD does not do well at generating concise summaries.
- Opinosis: recall and precision reasonably similar to human performance.
WebNgram
Uses existing words from the original text to generate micropinion summaries: a set of short phrases.
Emphasis on 3 aspects:
- Compactness: use as few words as possible
- Representativeness: reflect the major opinions in the text
- Readability: fairly well formed
Optimization Framework to capture compactness, representativeness & readability

M* = argmax over M = {m_1, …, m_k} of Σ_{i=1}^{k} S_rep(m_i) + Σ_{i=1}^{k} S_read(m_i)

subject to:
1. Σ_{i=1}^{k} |m_i| ≤ σ_ss
2. S_rep(m_i) ≥ σ_rep, for i = 1, …, k
3. S_read(m_i) ≥ σ_read, for i = 1, …, k
4. sim(m_i, m_j) ≤ σ_sim, for all i ≠ j

where S_rep(m_i) is the representativeness score of phrase m_i and S_read(m_i) is its readability score.

- Objective function: optimize the representativeness & readability scores, ensuring that summaries reflect the key opinions and are reasonably well formed.
- Constraint 1: maximum length of the summary. User adjustable; captures compactness.
- Constraints 2 & 3: minimum representativeness & readability. Helps improve efficiency and does not affect performance.
- Constraint 4: maximum similarity of phrases. User adjustable; captures compactness by minimizing redundancies.
Similarity scoring, sim(m_i, m_j)
Measure used: standard Jaccard similarity.
Why important? It allows the user to control the amount of redundancy. E.g. a user who wants good coverage of information on a small device can request less redundancy.
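A direct implementation of the phrase-level Jaccard similarity over word sets:

```python
def jaccard_sim(m_i, m_j):
    """Jaccard similarity between two phrases, computed on their word sets:
    |A intersect B| / |A union B|."""
    a, b = set(m_i.lower().split()), set(m_j.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0
```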
Representativeness scoring, S_rep(m_i)
Purpose: measure how well a phrase represents opinions from the original text.
2 properties of a highly representative phrase:
1. Its words should be strongly associated in the text
2. Its words should be sufficiently frequent in the text
Both are captured by a modified pointwise mutual information (PMI) function that adds the frequency of co-occurrence within a window:

pmi'(w_i, w_j) = log2[ p(w_i, w_j) · c(w_i, w_j) / ( p(w_i) · p(w_j) ) ]
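The modified PMI can be sketched as follows, assuming the probability and count estimates are supplied by the caller (how they are estimated from the corpus and the co-occurrence window is not shown here):

```python
import math

def modified_pmi(w_i, w_j, p, p_joint, c_joint):
    """pmi'(w_i, w_j) = log2( p(w_i, w_j) * c(w_i, w_j) / (p(w_i) * p(w_j)) )

    p        : unigram probability function
    p_joint  : co-occurrence probability of (w_i, w_j) within a window
    c_joint  : raw co-occurrence count of (w_i, w_j) within a window

    The extra c(w_i, w_j) factor rewards word pairs that are frequent,
    not just strongly associated."""
    return math.log2(p_joint(w_i, w_j) * c_joint(w_i, w_j) /
                     (p(w_i) * p(w_j)))
```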
Readability scoring, S_read(m_i)
Purpose: measure the well-formedness of a phrase.
Readability scoring: use Microsoft's Web N-gram model (publicly available) to obtain conditional probabilities of phrases. Intuition: a readable phrase occurs more frequently according to the web than a non-readable phrase.

S_read(w_1 … w_n) = (1/K) Σ_k log p(w_k | w_{k-n+1} … w_{k-1})

i.e. the chain rule is used to compute the joint probability of the phrase in terms of conditional probabilities, averaged over the K scored terms.
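A sketch of the readability score, assuming a conditional-probability function `cond_prob` backed by an n-gram language model such as the Web N-gram service (the exact context length and normalization used in the thesis may differ):

```python
import math

def readability_score(phrase, cond_prob, order=3):
    """Average log conditional probability of a phrase under an n-gram LM.

    cond_prob(word, context) -> p(word | context), assumed to come from an
    external language model. Higher (less negative) scores indicate a more
    readable, better-formed phrase."""
    words = phrase.lower().split()
    total, k = 0.0, 0
    for i in range(1, len(words)):
        # context = up to (order - 1) preceding words
        context = tuple(words[max(0, i - order + 1):i])
        total += math.log(cond_prob(words[i], context))
        k += 1
    return total / k if k else 0.0
```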
Evaluation
Input: user reviews for 330 products (CNET).
Evaluation measure: automatic ROUGE evaluation.
Results: Performance comparison

[Chart: ROUGE-2 recall vs. summary size (5-30 max words) for KEA, Tfidf, Opinosis and WebNGram, with PROS, CONS and FULL REVIEW texts shown alongside.]

- WebNgram performs the best for this task.
- KEA is slightly better than Tfidf; Tfidf gives the worst performance.
OpinoFetch: An Unsupervised Approach to Collecting Opinions for Arbitrary Entities (to be submitted)

3. Data
The current problem
There is no easy way to obtain a comprehensive set of opinions about an entity.
Where to get opinions now? Rely on content providers, or crawl a few sources.
Problems: this can result in source-specific bias and in data sparseness for some entities.
Goal of this work
Automatically crawl online reviews for arbitrary entities, e.g. cars, restaurants, doctors. We target online reviews because they represent a big portion of online opinions.
Existing Focused Crawlers
- Meant to collect pages relevant to a topic, e.g. "Database Systems" or "Boston Terror Attack"; the page type (news article, review page, forum page, etc.) is not as important as the content.
- Most focused crawlers are supervised and require large amounts of training data for each topic.
- Not suitable for review collection on arbitrary entities: needing training data for each entity will not scale up to a large number of entities.
Propose: OpinoFetch
A focused crawler for collecting review pages on arbitrary entities.
- Unsupervised approach: does not require large amounts of training data
- Solves the crawling problem efficiently: uses a special data structure for relevance scoring
OpinoFetch: High Level Overview

Input: a set of entities in a domain (e.g. all hotels in a city): 1. Hampton Inn Champaign… 2. I Hotel Conference Center… 3. La Quinta Inn Champaign… 4. Drury Inn…
Step 1: For each entity, obtain an initial set of Candidate Review Pages (CRPs), e.g. tripadvisor.com/Hotel_Review-g36806-d903…, hamptoninn3.hilton.com/en/hotels/…
Step 2: Expand the list of CRPs by exploring links in the neighborhood of the initial CRPs.
Step 3: Score the CRPs by entity relevance (S_ent) and review page relevance (S_rev); select pages with S_rev > σ_rev and S_ent > σ_ent to collect the relevant review pages.
Step 1. Finding Initial CRPs
Use any general web search engine (e.g. Bing/Google), on a per-entity basis. Search engines do partial matching of entities to pages, so pages in the vicinity of the search results are more likely to be related to the entity.
Entity Query format: "entity name + brand/address" + "reviews", e.g. "Hampton Inn Champaign 1200 W University Ave Reviews".
Step 2. Expand CRP List
Follow the top-N URLs around the vicinity of the search results, using a URL prioritization strategy to bias the crawl path towards entity-related pages. Score each URL based on the similarity between (a) the URL and the Entity Query, Sim(URL, EQ), and (b) the anchor text and the Entity Query, Sim(Anchor, EQ).
Step 3a. Review Relevance, S_rev(p_i)

To determine if a page is indeed a review page, use a review vocabulary: a lexicon of the most commonly occurring words within review pages (details in thesis). Idea: score a page based on the number of review-page words it contains.

Raw review page relevance score:
S_rev_raw(p_i) = Σ_{t ∈ V} log2( c(t, p_i) · wt(t) )

Normalized to obtain the final review page relevance score:
S_rev(p_i) = S_rev_raw(p_i) / normalizer(p_i), with S_rev(p_i) ∈ [0, 1]

where t is a term in the review vocabulary V, c(t, p_i) is the frequency of t in page p_i (tf), and wt(t) is the importance weighting of t in the review vocabulary. The normalizer is needed to set proper thresholds.
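The scoring could be sketched as below, summing only over vocabulary terms that actually occur on the page (an assumption made here, since log2 of a zero count is undefined):

```python
import math
from collections import Counter

def raw_review_relevance(page_text, vocab_weights):
    """Sketch of S_rev_raw(p_i) = sum over t in V of log2(c(t, p_i) * wt(t)),
    restricted to vocabulary terms that occur on the page.

    vocab_weights: {term: importance weight} review vocabulary."""
    tf = Counter(page_text.lower().split())
    score = 0.0
    for t, wt in vocab_weights.items():
        if tf[t] > 0:
            score += math.log2(tf[t] * wt)
    return score

def review_relevance(page_text, vocab_weights, normalizer):
    """Normalize the raw score into [0, 1]."""
    raw = raw_review_relevance(page_text, vocab_weights)
    return min(raw / normalizer, 1.0) if normalizer else 0.0
```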
Step 3a. How to Normalize S_rev_raw(p_i)?
Explored 3 normalization options:
- SiteMax (SM): the max S_rev_raw(p_i) among all pages related to a particular site; normalizes based on site density
- EntityMax (EM): the max S_rev_raw(p_i) among all pages related to an entity; normalizes based on entity popularity
- EntityMax + GlobalMax (GM) or SiteMax + GlobalMax (GM): to help with cases where SM/EM are unreliable
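SiteMax and EntityMax are both "max raw score within a group" computations; a sketch (the GlobalMax combination is omitted, as its exact form is not given here):

```python
def site_max(raw_scores, page_site):
    """SiteMax: for each site, the max raw S_rev score among its pages."""
    best = {}
    for page, score in raw_scores.items():
        site = page_site[page]
        best[site] = max(best.get(site, 0.0), score)
    return best

def entity_max(raw_scores, page_entity):
    """EntityMax: for each entity, the max raw S_rev score among its pages."""
    best = {}
    for page, score in raw_scores.items():
        ent = page_entity[page]
        best[ent] = max(best.get(ent, 0.0), score)
    return best
```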
Step 3b. Entity Relevance, S_ent(p_i, e_k)
To determine if a page is about the target entity: based on the similarity between the page URL & the Entity Query.
Why it works: most review pages have highly descriptive URLs, and the Entity Query is a detailed description of the entity; the more the URL resembles the query, the more likely the page is relevant to the target entity.
Similarity measure: Jaccard similarity.
Usability of Crawler
The steps proposed so far can be implemented in a variety of different ways. Our goal: make the crawling framework usable in practice.
2 Aspects of Usability
1. Efficiency: allow review collection for a large number of entities; the task should terminate in reasonable time with reasonable accuracy. Problems arise when required information cannot be accessed quickly (e.g. repeated access to the term frequencies of different pages).
2. Rich Information Access (RIA): allow the client to access information beyond the crawled pages, e.g. "get all review pages from the top 10 popular sites for entity X". A DB is not suitable because it cannot naturally model the complex relationships and would yield large joins.
FetchGraph: A Rich Information Network
A heterogeneous graph data structure that models the complex relationships between the different components in a data collection problem.
FetchGraph: A Rich Information Network

[Figure: an example FetchGraph. Logical nodes: the Current Query (Q), the Review Vocabulary (V), and other logical nodes. Entity nodes E1…Ek (Hampton Inn Champaign, I-Hotel Conference Center, Drury Inn Champaign): the list of entities for which reviews are required. Page nodes P1…Pn, based on the set of CRPs found for each entity, linked via t = title, u = url, c = content edges. Site nodes S1…St (tripadvisor.com, hotels.com, local.yahoo.com). Term nodes t1…tz, one node per unique term, connected to the vocabulary by importance-weight (wt) edges: at the core, the graph is made up of terms.]
Benefits of the FetchGraph
- Maintain one simple data structure with access to various statistics, e.g. the TF of a word in a page = EdgeWT(content node → term node)
- Access to complex relationships and global information
- Compact: can be used as an in-memory data structure
- The network can be persisted and accessed later; client applications can use it to answer interesting application-related questions, e.g. "get all review pages for entity X from the top 10 popular sites"
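A minimal sketch of such a heterogeneous graph, with typed nodes and labeled, weighted edges (node and edge names here are illustrative, not the thesis implementation):

```python
class FetchGraph:
    """Heterogeneous graph sketch: typed nodes plus weighted, labeled edges,
    so that a statistic like a term's TF in a page is a single edge lookup
    instead of re-parsing the page."""

    def __init__(self):
        self.nodes = {}   # node_id -> node_type (e.g. "page", "site", "term")
        self.edges = {}   # (src, dst, label) -> weight
        self.out = {}     # src -> set of (dst, label)

    def add_node(self, node_id, node_type):
        self.nodes[node_id] = node_type

    def add_edge(self, src, dst, label, weight=1.0):
        self.edges[(src, dst, label)] = weight
        self.out.setdefault(src, set()).add((dst, label))

    def term_freq(self, page, term):
        # TF of a term in a page = weight of the "content" edge
        return self.edges.get((page, term, "content"), 0.0)

    def pages_of(self, node, label):
        # e.g. all pages linked to a site node or an entity node
        return [dst for dst, lab in self.out.get(node, set()) if lab == label]
```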
Computing S_rev_raw(p_i) using the FetchGraph

[Figure: page nodes P1…Pn with a Content node (logical node) whose edges to term nodes t1…tz carry term frequencies (tf), and a Review Vocabulary node V (logical node) whose edges carry importance weights (wt). Outgoing edges denote term ownership.]

To compute S_rev_raw(p_i): take the terms present in both the Content node and the Review Vocabulary node; the TFs and weights are obtained from the edges. Lookup of review-vocabulary words within a page is therefore fast, and there is no need to parse page contents each time a page is encountered.
Obtaining SiteMax from the FetchGraph

[Figure: the same example FetchGraph, highlighting a site node and its pages.]

Access all pages connected to the site node; this requires the complete graph.
Obtaining EntityMax from the FetchGraph

[Figure: the same example FetchGraph, highlighting an entity node and its pages.]

Access all pages connected to the entity node; this requires the complete graph.
Computing S_ent(p_i, e_k) using the FetchGraph

[Figure: page nodes P1…Pn and term nodes t1…tz with tf-weighted edges, plus an Entity Query node q1 (logical node, e.g. "Hampton Inn Champaign 1200 W Univ… Reviews") and a URL node U (logical node, e.g. tripadvisor.com/ShowUser…).]
Evaluation
Goal: evaluate accuracy & give insights into efficiency using the FetchGraph.
Evaluated in 3 domains: Electronics (5 entities), Hotels (5) and Attractions (4). Only 14 entities, as it is expensive to obtain judgments.
Gold standard: for each entity, explore the top 50 Google results & the links in the vicinity of the results (up to depth 3). 3 human judges determine the relevance of the collected links to the entity query (crowdsourcing); the final judgment is by majority voting.
Evaluation
Baseline: Google search results, deemed relevant to the entity query.
Evaluation measures: precision, and recall as an estimate of the coverage of review pages.

Prec(e_k) = #RelPages(e_k) / #RetrievedPages(e_k)
Recall(e_k) = #RelPages(e_k) / #GoldStdRelPages(e_k)
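These measures are straightforward to compute per entity:

```python
def precision_recall(retrieved, relevant_retrieved, gold_relevant):
    """Prec(e_k)   = #RelPages(e_k) / #RetrievedPages(e_k)
       Recall(e_k) = #RelPages(e_k) / #GoldStdRelPages(e_k)"""
    rel = len(relevant_retrieved)
    prec = rel / len(retrieved) if retrieved else 0.0
    rec = rel / len(gold_relevant) if gold_relevant else 0.0
    return prec, rec
```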
Results…
OpinoFetch vs. Google - Recall

[Chart: recall (0.00-0.25) vs. number of search results (10-50) for Google, OpinoFetch and OpinoFetchUnnormalized.]

- Google: recall consistently low. Search results are not always relevant to the entity query, or are not direct pointers to actual review pages.
- OpinoFetch: recall keeps improving. A lot of relevant content lies in the vicinity of the search results, and OpinoFetch is able to discover it.
- OpinoFetch achieves better recall with normalization: scores normalized with the special normalizers (e.g. EntityMax/SiteMax) make it easier to distinguish relevant review pages.
Best Normalizer for S_rev(p_i)

[Chart: % change in precision by normalizer] EntityMax + GlobalMax: 97.23%; EntityMax: 85.72%; SiteMax + GlobalMax: 36.23%; SiteMax: 19.62%.

- EM + GM gives the best precision; SM gives the lowest.
- SM is the worst performing because certain sites cover different classes of entities, so the max score from a site may be unreliable for sparse entities.
Growth of FetchGraph

[Chart: graph size vs. # pages crawled; roughly linear growth up to ~450,000 graph elements over 1,000 pages.]

Linear growth without any optimization/compression: it is possible to use the FetchGraph as an in-memory data structure.
FetchGraph and Efficiency

Avg. execution time with/without FetchGraph:

                       | With FetchGraph | Without FetchGraph
S_rev_raw(p_i)         | ~0.09 ms        | ~8.60 ms
EntityMax normalizer   | ~0.06 ms        | ~4.40 s

Without FetchGraph: page contents must be parsed each time S_rev_raw(p_i) is computed, and sets of pages must be loaded into memory to find the EntityMax normalizer.
With FetchGraph: a page is loaded into memory once and S_rev_raw(p_i) is computed from the graph; global information is tracked until the end, so obtaining the EntityMax normalizer only requires a lookup over the related set of pages.
OpinoFetch: Conclusion
- Proposed an unsupervised, practical method for collecting reviews on arbitrary entities; it works with reasonable accuracy without requiring large amounts of training data.
- Proposed the FetchGraph, which helps with efficient lookup of various statistics and is useful for answering application-related queries.
FindiLike Demo: Thesis Ideas into a Usable System (Ganesan & Zhai, WWW 2012)
FindiLike – Preference-Driven Entity Search
- Finds & ranks entities based on user preferences: unstructured opinion preferences (novel) and structured preferences (e.g. price, brand, etc.)
- Beyond search: support for the analysis of entities, with the ability to generate textual summaries of reviews and to display tag clouds of reviews
- The current version works in the hotels domain
FindiLike – Search Interface [Link]
E.g. finding "clean" hotels in Los Angeles close to "Universal Studios".
- Search: find entities based on unstructured opinion preferences, combined with structured preferences
- Ranking: how well are all preferences matched?
FindiLike – Review Tag Clouds for "Sportsmen's Lodge"
Tag clouds weighted by frequency; related snippets shown (e.g. "convenient location").
FindiLike – Review Summary for "Sportsmen's Lodge"
Opinion summaries: readable and well formed; related snippets shown.
Review summary using OpinoFetch-crawled reviews: "Hampton Inn Champaign" [link]
- Summary with initial reviews: 26 reviews in total, from 1-2 sources
- Summary with OpinoFetch reviews: 135 reviews (8 sources), extracted with a baseline extractor. Not all reviews were included: a filter based on the length of the review and its subjectivity score was applied.
Future Work

Opinion-Based Entity Ranking:
- Use click-through & query logs to further improve the ranking of entities (now possible since everything is logged by the demo system)
- Look into the use of phrasal search for ranking: limit deviation from the actual query (e.g. "close to university"); explore "back-off" style scoring, i.e. score based on the phrase, then remove the phrase restriction
…Future Work

Opinosis:
- How to scale up to very large amounts of text? Explore the use of a map-reduce framework.
- Would this approach work with other types of text, e.g. tweets or Facebook comments (shorter texts)?

Opinion Acquisition:
- Compare OpinoFetch with a supervised crawler: can it achieve comparable results?
- How to improve the recall of OpinoFetch? To evaluate at a reasonable scale: can judgments be approximated without relying on humans?
References
[Barzilay and Lee2003] Barzilay, Regina and Lillian Lee. 2003. Learning to paraphrase: an unsupervised approach using multiple-sequence alignment. In NAACL ’03: Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology, pages 16–23, Morristown, NJ, USA.
[DeJong1982] DeJong, Gerald F. 1982. An overview of the FRUMP system. In Lehnert, Wendy G. and Martin H. Ringle, editors, Strategies for Natural Language Processing, pages 149–176. Lawrence Erlbaum, Hillsdale, NJ.
[Erkan and Radev2004] Erkan, Günes and Dragomir R. Radev. 2004. LexRank: graph-based lexical centrality as salience in text summarization. J. Artif. Int. Res., 22(1):457–479.
[Finley and Harabagiu2002] Finley, Sanda Harabagiu and Sanda M. Harabagiu. 2002. Generating single and multi-document summaries with GISTexter. In Proceedings of the Workshop on Automatic Summarization, pages 30–38.
[Hu and Liu2004] Hu, Minqing and Bing Liu. 2004. Mining and summarizing customer reviews. In KDD, pages 168–177.
[Jing and McKeown2000] Jing, Hongyan and Kathleen R. McKeown. 2000. Cut and paste based text summarization. In Proceedings of the 1st North American Chapter of the Association for Computational Linguistics Conference, pages 178–185, San Francisco, CA, USA. Morgan Kaufmann Publishers Inc.
[Lerman et al.2009] Lerman, Kevin, Sasha Blair-Goldensohn, and Ryan Mcdonald. 2009. Sentiment summarization: Evaluating and learning user preferences. In 12th Conference of the European Chapter of the Association for Computational Linguistics (EACL-09).
[Mihalcea and Tarau2004] Mihalcea, R. and P. Tarau. 2004. TextRank: Bringing order into texts. In Proceedings of EMNLP-04, the 2004 Conference on Empirical Methods in Natural Language Processing, July.
[Pang and Lee2004] Pang, Bo and Lillian Lee. 2004. A sentimental education: Sentiment analysis using subjectivity summarization based on minimum cuts. In Proceedings of the ACL, pages 271–278.
[Pang et al.2002] Pang, Bo, Lillian Lee, and Shivakumar Vaithyanathan. 2002. Thumbs up? Sentiment classification using machine learning techniques. In Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 79–86.
[Radev and McKeown1998] Radev, DR and K. McKeown. 1998. Generating natural language summaries from multiple on-line sources. Computational Linguistics, 24(3):469–500.
[More in Thesis Report]