25
1 Expanding Domain Sentiment Expanding Domain Sentiment Lexicon through Double Propagation Lexicon through Double Propagation Presentor Presentor : He : He Jiang Jiang Report based on the following materials: Report based on the following materials: 1) Bing Liu. Tutorial: Opinion Mining and Summarization - Sentiment Analysis, WWW-2008. 2) Guang Qiu, Bing Liu, Jiajun Bu, and Chun Chen. Expanding Domain Sentiment Lexicon through Double Propagation. IJCAI’09

Expanding Domain Sentiment Lexicon through Double Propagationxwu/wie/CourseSlides/Sentiment.pdf · 2010-02-24 · 1 Expanding Domain Sentiment Lexicon through Double Propagation Presentor:

Embed Size (px)

Citation preview

1

Expanding Domain Sentiment Expanding Domain Sentiment Lexicon through Double PropagationLexicon through Double Propagation

PresentorPresentor: He : He JiangJiang

Report based on the following materials:Report based on the following materials:

1) Bing Liu. Tutorial: Opinion Mining and Summarization - Sentiment Analysis, WWW-2008.2) Guang Qiu, Bing Liu, Jiajun Bu, and Chun Chen. Expanding Domain Sentiment Lexicon through Double Propagation. IJCAI’09

2

OutlineOutline

Part 1: Opinion Mining and Sentiment AnalysisPart 1: Opinion Mining and Sentiment Analysis

Part 2: Expanding Domain Sentiment Lexicon Part 2: Expanding Domain Sentiment Lexicon through Double Propagationthrough Double Propagation

3

Part 1: Opinion Mining and Sentiment AnalysisPart 1: Opinion Mining and Sentiment Analysis

Opinion on the WebOpinion on the WebApplicationsApplicationsOpinion SearchOpinion SearchRelated ProblemsRelated Problems

4

Part 1: Opinion Mining and Sentiment AnalysisPart 1: Opinion Mining and Sentiment Analysis

Opinion on the WebOpinion on the WebTwo main types of information on the webTwo main types of information on the web

factsfactsopinionopinion

WhereWhere’’s opinions opinionreview sitesreview sitesforumsforumsdiscussion groupsdiscussion groupsblogsblogs

5

Part 1: Opinion Mining and Sentiment AnalysisPart 1: Opinion Mining and Sentiment Analysis

Opinion on the WebOpinion on the WebTwo types of opinionsTwo types of opinions

Direct Opinions: sentiment expressions on some objects, e.g. Direct Opinions: sentiment expressions on some objects, e.g. products, events, topics, personsproducts, events, topics, persons

e.g. e.g. ““the picture quality of this camera is greatthe picture quality of this camera is great””

Comparisons: relations expressing similarities or differences ofComparisons: relations expressing similarities or differences ofmore than one objectmore than one object

e.g. e.g. ““car x is cheaper than car ycar x is cheaper than car y””

6

Part 1: Opinion Mining and Sentiment AnalysisPart 1: Opinion Mining and Sentiment Analysis

ApplicationsApplicationsBusinesses and organizations: product and service benchmarking.

Market intelligence.Business spends a huge amount of money to find consumer sentiments and opinions.Consultants, surveys and focused groups, etc

Individuals: interested in other’s opinions whenPurchasing a product or using a service,Finding opinions on political topics,

Ads placements: Placing ads in the user-generated contentPlace an ad when one praises a product.Place an ad from a competitor if one criticizes a product.

Opinion retrieval/search: providing general search for opinions.

7

Part 1: Opinion Mining and Sentiment AnalysisPart 1: Opinion Mining and Sentiment Analysis

Opinion SearchOpinion SearchFind the opinion of a person or organization (opinion

holder) on a particular object or a feature of the object.e.g., what is Bill Clinton’s opinion on abortion?

Find positive and/or negative opinions on a particularobject (or some features of the object), e.g.,

customer opinions on a digital camera.public opinions on a political topic.

Find how opinions on an object change over time.How object A compares with Object B?

Gmail vs. Hotmail

8

Part 1: Opinion Mining and Sentiment AnalysisPart 1: Opinion Mining and Sentiment Analysis

Related ProblemsRelated ProblemsSentiment classificationSentiment classification

classify wordsclassify words’’ sentiment expressed by authors (positive, negative, sentiment expressed by authors (positive, negative, neutral)neutral)

FeatureFeature--based opinion extraction and summarizationbased opinion extraction and summarizationA product consists of a few features (components), whatA product consists of a few features (components), what’’s the opinion s the opinion over each featureover each feature

Comparative sentence and relation extractionComparative sentence and relation extractionVisual Summarization & ComparisonVisual Summarization & Comparison

9

Part 1: Opinion Mining and Sentiment AnalysisPart 1: Opinion Mining and Sentiment Analysis

Related ProblemsRelated ProblemsVisual Summarization & ComparisonVisual Summarization & Comparison

10

Part 2: Expanding Domain Sentiment Lexicon through Part 2: Expanding Domain Sentiment Lexicon through Double PropagationDouble Propagation

MotivationMotivationRelated workRelated workBasic IdeaBasic IdeaSentiment Word ExtractionSentiment Word ExtractionPolarity AssignmentPolarity AssignmentExperiments and DiscussionsExperiments and Discussions

11

Part 2: Expanding Domain Sentiment Lexicon through Part 2: Expanding Domain Sentiment Lexicon through Double PropagationDouble Propagation

MotivationMotivationSentiment words: words that convey positive or negative sentimenSentiment words: words that convey positive or negative sentiment t polarities (attitude)polarities (attitude)

A comprehensive sentiment lexicon (dictionary) is essential A comprehensive sentiment lexicon (dictionary) is essential opinion expressions vary significantly among different domains, opinion expressions vary significantly among different domains, i.e., domain i.e., domain dependentdependentno universal sentiment lexicon to cover all domainsno universal sentiment lexicon to cover all domains

How to extract domainHow to extract domain--dependent sentiment words based on a set of dependent sentiment words based on a set of seeding sentiment words?seeding sentiment words?

12

Part 2: Expanding Domain Sentiment Lexicon through Part 2: Expanding Domain Sentiment Lexicon through Double PropagationDouble Propagation

Related WorkRelated WorkSentiment Analysis can be conducted atSentiment Analysis can be conducted at

word (like this paper)word (like this paper)expressionexpressionsentencesentence

Sentiment Word Analysis on Word LevelSentiment Word Analysis on Word LevelCorporaCorpora--based approaches (this paper falls in this category)based approaches (this paper falls in this category)

KanayamaKanayama and and NasukawaNasukawa 20062006» Extract domain specific sentiment words in Japanese text» Idea: 1) They exploit sentiment coherency within sentence and among

sentences to extract sentiment candidates; 2) then use a statistical method to determine whether a candidate is correct or not.

» Key difference with double propagation: Double propagation exploits the relationships between sentiment words and product features in extraction process.

dictionarydictionary--based approachesbased approaches

13

Part 2: Expanding Domain Sentiment Lexicon through Part 2: Expanding Domain Sentiment Lexicon through Double PropagationDouble Propagation

Basic IdeaBasic Ideainput: a set of sentiment wordsinput: a set of sentiment wordsoutput: a set of sentiment words & a set of featuresoutput: a set of sentiment words & a set of features

sentiment sentiment wordswords

sentiment sentiment wordswords

featuresfeatures

sentiment words sentiment words new sentiment wordsnew sentiment words

senetimentsenetiment wordswords new featuresnew features

features features new new senetimentsenetiment wordswords

features features new featuresnew features

14

Part 2: Expanding Domain Sentiment Lexicon through Part 2: Expanding Domain Sentiment Lexicon through Double PropagationDouble Propagation

Sentiment Word ExtractionSentiment Word Extraction4 tasks: 4 tasks:

Model:Model:

sentiment words sentiment words new sentiment wordsnew sentiment words

senetimentsenetiment wordswords new featuresnew features

features features new new senetimentsenetiment wordswords

features features new featuresnew features

( Extraction Rules based on Relations)dependency parser Minipar

sentiment sentiment wordswords

featuresfeatures

sentiment sentiment wordswords

15

Part 2: Expanding Domain Sentiment Lexicon through Part 2: Expanding Domain Sentiment Lexicon through Double PropagationDouble Propagation

Sentiment Word ExtractionSentiment Word ExtractionRelations of sentiment words and featuresRelations of sentiment words and features

e.g. e.g. ““I love I love iPodiPod””both both ““II”” and and ““iPodiPod”” depend on the verb depend on the verb ““lovelove”” with the relations of with the relations of subjsubj and and objobjrespectively. Here, respectively. Here, subjsubj: : ““II”” is the subject of is the subject of ““lovelove””, while , while objobj means that means that ““iPodiPod”” is the object of is the object of ““lovelove””..

16

Part 2: Expanding Domain Sentiment Lexicon through Part 2: Expanding Domain Sentiment Lexicon through Double PropagationDouble Propagation

Sentiment Word ExtractionSentiment Word ExtractionExtraction Rules based on Extraction Rules based on RelationsRelations

POS: partPOS: part--ofof--speechspeechs, f: word/feature to be extracteds, f: word/feature to be extracted{S},{F}: known sentiment words/ {S},{F}: known sentiment words/

featuresfeatures{JJ}: adjectives and their variants{JJ}: adjectives and their variantse.g. smart, smarter, smartest e.g. smart, smarter, smartest {NN}: nouns{NN}: nounse.g. picture, pictures e.g. picture, pictures {CONJ}: conjunction{CONJ}: conjunctione.g. and, ore.g. and, or{MR}: dependency relations between {MR}: dependency relations between

sentiment words and features, such sentiment words and features, such as mod, as mod, subjsubj, , objobj, , pnmodpnmod

17

Part 2: Expanding Domain Sentiment Lexicon through Part 2: Expanding Domain Sentiment Lexicon through Double PropagationDouble Propagation

Sentiment Word ExtractionSentiment Word Extractiona detailed modela detailed model

dependency parser Minipar

sentiment sentiment wordswords

featuresfeatures

sentiment sentiment wordswords stanford POS

tagger

corpuscorpus

tokenstokens

rulesrules

18

Part 2: Expanding Domain Sentiment Lexicon through Part 2: Expanding Domain Sentiment Lexicon through Double PropagationDouble Propagation

Polarity AssignmentPolarity AssignmentObservation 1Observation 1

Same polarity for same feature in a reviewSame polarity for same feature in a review

Observation 2Observation 2Same polarity for same sentiment word in a domain corpusSame polarity for same sentiment word in a domain corpus

19

Part 2: Expanding Domain Sentiment Lexicon through Part 2: Expanding Domain Sentiment Lexicon through Double PropagationDouble Propagation

Polarity AssignmentPolarity AssignmentRules:Rules:

Rule 1 : Heterogeneous ruleRule 1 : Heterogeneous ruleFor sentiment words extracted by known feature, and features extFor sentiment words extracted by known feature, and features extracted by racted by known sentiment words, known sentiment words, assign them the same polarities as the known onesassign them the same polarities as the known ones..

Rule 2: Homogeneous ruleRule 2: Homogeneous ruleFor sentiment words extracted by known sentiment words, and featFor sentiment words extracted by known sentiment words, and features ures extracted by known features, extracted by known features, also assign them the same polarities as the also assign them the same polarities as the known ones.known ones.

Rule 3: IntraRule 3: Intra--review rulereview ruleFor new sentiment words extracted by features which are extracteFor new sentiment words extracted by features which are extracted in d in other reviews, those sentiment words cannot be determined (by Ruother reviews, those sentiment words cannot be determined (by Rule1). If le1). If those sentiments words only appear in this review, Rule2 cannot those sentiments words only appear in this review, Rule2 cannot be applied be applied either. Those sentiment words are set to the same polarity of theither. Those sentiment words are set to the same polarity of the review. e review. The polarity of a review is determined by the sum of the polaritThe polarity of a review is determined by the sum of the polarities of words ies of words in this review (+1 for positive, in this review (+1 for positive, --1 for negative, 0 for neutral).1 for negative, 0 for neutral).

» If sum > 0, positive for the review» If sum < 0, negative

Conflict resolution: Conflict resolution: » For sentiment words with distinct polarities by calculation, use sum of

those polarities to determine the final polarity.

20

Part 2: Expanding Domain Sentiment Lexicon through Part 2: Expanding Domain Sentiment Lexicon through Double PropagationDouble Propagation

Experiments and DiscussionsExperiments and DiscussionsExperiment Set UpExperiment Set Up

Data set: 5 review data sets (2 for digital cameras, 1 for DVP pData set: 5 review data sets (2 for digital cameras, 1 for DVP player, 1 for layer, 1 for MP3, 1 for cell phone)MP3, 1 for cell phone)On average, each data set consists of 789 sentences and 63 revieOn average, each data set consists of 789 sentences and 63 reviews.ws.

4 4 algorihtmsalgorihtms::CRF (conditional random fields): a supervised algorithm (2001)CRF (conditional random fields): a supervised algorithm (2001)KN06 KN06 PropProp--depdepnoPropnoProp--depdep (non(non--propagation version of the new approach)propagation version of the new approach)

An initial positive and negative sentiment lists (654 and 1098 An initial positive and negative sentiment lists (654 and 1098 words, words, respectively)respectively)One data set is used as the training set for CRFOne data set is used as the training set for CRFwords appearing in both 10% (20%, 50, 80%) of initial lists and words appearing in both 10% (20%, 50, 80%) of initial lists and the chosen the chosen training data set are feed into KN06, Proptraining data set are feed into KN06, Prop--depdep, , noPropnoProp--depdep

21

Part 2: Expanding Domain Sentiment Lexicon through Part 2: Expanding Domain Sentiment Lexicon through Double PropagationDouble Propagation

Experiments and DiscussionsExperiments and DiscussionsResults of sentiment word extractionResults of sentiment word extraction

22

Part 2: Expanding Domain Sentiment Lexicon through Part 2: Expanding Domain Sentiment Lexicon through Double PropagationDouble Propagation

Experiments and DiscussionsExperiments and DiscussionsResults of sentiment word extractionResults of sentiment word extraction

23

Part 2: Expanding Domain Sentiment Lexicon through Part 2: Expanding Domain Sentiment Lexicon through Double PropagationDouble Propagation

Experiments and DiscussionsExperiments and DiscussionsResults of sentiment word extractionResults of sentiment word extraction

24

Part 2: Expanding Domain Sentiment Lexicon through Part 2: Expanding Domain Sentiment Lexicon through Double PropagationDouble Propagation

Experiments and DiscussionsExperiments and DiscussionsResults of polarity assignmentResults of polarity assignment

25

ConclusionsConclusions

Opinion mining is a promising topic in data mining, thereOpinion mining is a promising topic in data mining, there’’re re a lot of new problems in this area !a lot of new problems in this area !Sentiment word detection could be a starting point for Sentiment word detection could be a starting point for opinion miningopinion mining