
AN EFFECTIVE STATISTICAL APPROACH TO BLOG POST OPINION RETRIEVAL

Ben He

Craig Macdonald

Iadh Ounis

University of Glasgow

Jiyin He

University of Amsterdam

CIKM 2008

1

Introduction

Finding opinionated blog posts is still an open problem.

A popular solution is to utilize external resources and manual effort to identify subjective features.

The authors propose a dictionary-based statistical approach that automatically derives evidence for subjectivity from the blog collection itself, without requiring any manual effort.

2

TREC Opinion Finding Task (1/2)

Text REtrieval Conference. Goal: to identify sentiment at the document level. The dataset is composed of:

Feed documents: XML format, usually a short summary of the blog post.

Permalink documents: HTML format, the complete blog post and its comments.

Homepage documents: HTML format, the main entry point to the blog.

3

TREC Opinion Finding Task (2/2)

Sample query format:

<top>
<num> 863
<title> netflix
<desc> Identify documents that show customer opinions of Netflix.
<narr> A relevant document will indicate subscriber satisfaction with Netflix. Opinions about the Netflix DVD allocation system, promptness or delay in mailings are relevant. Indications of having been or intent to become a Netflix subscriber that do not state an opinion are not relevant.
</top>

4

Statistical Dictionary-based Approach

5

Dictionary Generation

The Skewed Query Model: rank all terms in the collection by term frequency in descending order. The terms whose ranks fall in the range (S·#terms, U·#terms) are selected into the dictionary.

#terms: the number of unique terms in the collection.

S, U: model parameters. S = 0.00007 and U = 0.001 in this paper.

6

Dictionary Generation

Ex: #terms = 200,000. #terms × 0.00007 = 14; #terms × 0.001 = 200. Only the terms ranked 14 to 200 will be preserved.

The dictionary is not necessarily opinionated.
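The selection rule above is simple enough to sketch in a few lines. A minimal Python illustration, assuming a hypothetical `collection_term_freqs` mapping from term to collection frequency (not the authors' code):

```python
# Minimal sketch of the skewed query model: keep the terms whose
# frequency rank falls in the range (S * #terms, U * #terms).
S = 0.00007
U = 0.001

def build_dictionary(collection_term_freqs):
    """collection_term_freqs: dict mapping term -> collection frequency."""
    # Rank all unique terms by frequency, descending.
    ranked = sorted(collection_term_freqs,
                    key=collection_term_freqs.get, reverse=True)
    n_terms = len(ranked)
    lo = int(S * n_terms)   # e.g. 200,000 terms -> 14
    hi = int(U * n_terms)   # e.g. 200,000 terms -> 200
    # Keep only the mid-frequency band, per the slide's worked example.
    return ranked[lo:hi]
```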

7

Term Weighting (1/2)

KL divergence method:

w(t) = (tf_x / c(D(opRel))) · log2[ (tf_x / c(D(opRel))) / (tf_rel / c(D(Rel))) ]

D(Rel): the set of relevant documents.

D(opRel): the set of opinionated and relevant documents.

c(D(opRel)): #tokens in the opinionated documents.

c(D(Rel)): #tokens in the relevant documents.

tf_x: the frequency of the term t in the opinionated documents.

tf_rel: the frequency of the term t in the relevant documents.
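A small sketch of this KL weighting, with inputs named after the slide's definitions; an illustration under those assumptions, not the paper's implementation:

```python
import math

def kl_weight(tf_x, tf_rel, c_oprel, c_rel):
    """KL divergence weight of a term: how much more probable the term is
    in the opinionated-and-relevant set D(opRel) than in the relevant
    set D(Rel)."""
    p_x = tf_x / c_oprel      # P(t | D(opRel))
    p_rel = tf_rel / c_rel    # P(t | D(Rel))
    if p_x == 0 or p_rel == 0:
        return 0.0            # term absent from one set carries no weight
    return p_x * math.log2(p_x / p_rel)
```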

8

Term Weighting (2/2)

Bose-Einstein statistics method: measures how informative a term is in the set D(opRel) against D(Rel).

w(t) = tf_x · log2((1 + λ) / λ) + log2(1 + λ), with λ = F / N

F: the frequency of the term t in D(Rel).

N: the number of documents in D(Rel).

tf_x: the frequency of the term t in D(opRel).
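A matching sketch of this weight; the formula above is the standard Bose-Einstein (Bo1) model from DFR query expansion, which is what the slide's variables suggest:

```python
import math

def bo1_weight(tf_x, F, N):
    """Bose-Einstein (Bo1) weight: informativeness of a term in D(opRel)
    against D(Rel). F: frequency of the term in D(Rel); N: #docs in
    D(Rel); tf_x: frequency of the term in D(opRel)."""
    if F == 0:
        return 0.0
    lam = F / N  # expected term frequency under the geometric model
    return tf_x * math.log2((1 + lam) / lam) + math.log2(1 + lam)
```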

9

Generating the Opinion Score

Take the top X weighted terms from the opinion dictionary. X will be tuned in the training step.

Submit them to the retrieval system as a query Qopn.

Score(d,Qopn): the opinion score of document d.

Score(d,Q): the initial ranking score.
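A hypothetical sketch of this step, where `weights` is the weighted dictionary and `retrieval_score(d, q)` stands in for the underlying retrieval system; both names are assumptions for illustration:

```python
def opinion_query(weights, X):
    """Take the top X weighted terms from the opinion dictionary as Qopn."""
    return sorted(weights, key=weights.get, reverse=True)[:X]

def opinion_score(d, weights, X, retrieval_score):
    """Score(d, Qopn): score document d against the opinion query,
    using the same retrieval function as the initial ranking."""
    q_opn = opinion_query(weights, X)
    return retrieval_score(d, q_opn)
```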

10

Score Combination

Linear combination:

Log combination:

a, k will be tuned in the training step.
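The combination formulas were figures in the original slides and are not recoverable verbatim. The sketch below shows two standard forms that are consistent with the parameter ranges tuned later (a in [0, 1], k in (0, 1000]): a convex linear mix and a log-dampened product. Treat the exact formulas as assumptions:

```python
import math

def linear_combination(score_rel, score_opn, a):
    """Convex mix of relevance and opinion scores; a in [0, 1].
    Assumed form, not verbatim from the paper."""
    return (1 - a) * score_rel + a * score_opn

def log_combination(score_rel, score_opn, k):
    """Log-dampened combination; k in (0, 1000].
    Assumed form, not verbatim from the paper."""
    return score_rel * math.log(k + score_opn)
```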

11

Experiment Settings (1/3)

TREC06: 50 topics for training. TREC07: 50 topics for testing.

Only the “title” field is used (1.74 words/topic on average).

Baseline 1: apply the InLB model, a variation of the BM25 ranking function. Goal: retrieve as many relevant documents as possible.

12

Experiment Settings (2/3)

Baseline 2: favor documents where the query terms appear in close proximity.

Q2: the set of all query term pairs in query Q.

N: #docs in the collection.

T: #tokens in the collection.

pfn: the normalized frequency of the tuple p.
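The proximity formula itself was a figure in the deck; the sketch below only illustrates the counting step such a model relies on, namely how often each query-term pair in Q2 co-occurs within a fixed window. The window size and function names are assumptions:

```python
from itertools import combinations

def pair_frequencies(doc_tokens, query_terms, window=5):
    """For every query-term pair in Q2, count how often the two terms
    co-occur within `window` tokens of each other in the document."""
    positions = {t: [i for i, tok in enumerate(doc_tokens) if tok == t]
                 for t in query_terms}
    pf = {}
    for t1, t2 in combinations(query_terms, 2):
        pf[(t1, t2)] = sum(1 for i in positions[t1] for j in positions[t2]
                           if abs(i - j) <= window)
    return pf
```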

13

Experiment Settings (3/3)

An external dictionary was manually collected from OpinionFinder and several other resources.

It contains approximately 12,000 English words, mostly adjectives, adverbs and nouns.

14

Experiment: Term Weighting (1/2)

Hypothesis: the most opinionated terms for one query set are also good indicators of opinion for other queries.

Sampling: draw 10 sample sets (Set1, Set2, …, Set10) from the training set (50 topics), each with 25 topics and a maximum overlap of 65% between any two samples.

For each sample set, calculate the weight of each term.

15

Experiment: Term Weighting (2/2)

Compute the cosine similarity between the weight vectors of the top 100 weighted terms from each pair of samples, as sketched below.
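A minimal sketch of that similarity computation, assuming each sample's output is a dict from term to weight (hypothetical representation):

```python
import math

def cosine_similarity(weights_a, weights_b, top_n=100):
    """Cosine similarity between the weight vectors of the top-N weighted
    terms from two samples, over the union of both top-N vocabularies."""
    top_a = dict(sorted(weights_a.items(),
                        key=lambda kv: kv[1], reverse=True)[:top_n])
    top_b = dict(sorted(weights_b.items(),
                        key=lambda kv: kv[1], reverse=True)[:top_n])
    vocab = set(top_a) | set(top_b)
    dot = sum(top_a.get(t, 0.0) * top_b.get(t, 0.0) for t in vocab)
    norm_a = math.sqrt(sum(v * v for v in top_a.values()))
    norm_b = math.sqrt(sum(v * v for v in top_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0
```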

16

Experiment: Validation (1/3)

Tune the parameters X, a and k mentioned before.

Choose X by maximizing the mean MAP of the 10 samples.

17

Experiment: Validation (2/3)

From the training set (50 topics), Set1 is used for assigning term weights and Set1’ for validation.

18

Experiment: Validation (3/3)

Fix X = 100, then tune a and k: a within [0, 1], step = 0.05; k within (0, 1000], step = 50. A grid-search sketch follows.
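A sketch of the exhaustive sweep this implies; `evaluate_map(a, k)` is a hypothetical hook that returns MAP on the validation topics:

```python
def grid_search(evaluate_map):
    """Exhaustive sweep over a and k as described on the slide.
    evaluate_map(a, k) -> MAP on the validation topics (assumed hook)."""
    best_params, best_map = None, -1.0
    for i in range(21):                  # a in [0, 1], step 0.05
        a = i * 0.05
        for k in range(50, 1001, 50):    # k in (0, 1000], step 50
            m = evaluate_map(a, k)
            if m > best_map:
                best_params, best_map = (a, k), m
    return best_params, best_map
```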

19

Experiment: Evaluation (1/3)

20

Experiment: Evaluation (2/3)

21

Experiment: Evaluation (3/3)

Comparison with OpinionFinder: all else being equal, replace the opinion score Score(d,Qopn) with an opinion score derived from the external dictionary.

22

Conclusion

An effective and practical approach to retrieving opinionated blog posts without manual effort.

Opinion scores are computed during indexing, so the computational cost at retrieval time is negligible.

The automatically generated internal dictionary performs as well as the external dictionary.

Different random samples from the collection reach a high consensus on the opinionated terms if the Bose-Einstein statistics given by the geometric distribution are applied.

23

Thank you for listening!

24
