
Page 1

Sentiment and Polarity Extraction

Arzucan Ozgur

SI/EECS 767, January 15, 2010

Page 2

Introduction

Suppose you would like to buy a digital camera. How do you decide which camera to buy?

- Product specification/price
- Ask friends for their opinions
- Read on-line product reviews

This is great camera. Its a very quick camera, the auto feature works very well as the red eye correction, picture quality is excellent and the image estabilization work great. I will recommend this camera to everyone looking for a great camera.

Normally I am a big fan of Canons but this model is horrible. The pictures are always out of focus and the image quality is so poor not nearly as good as some of the older models. No matter what you do, close ups, far away they all take crummy pictures. Don't waste your money.

Thumbs up

Thumbs down

Page 3

Introduction

Growth in on-line discussion groups and review sites. An important characteristic of posted articles is their sentiment (the opinion expressed toward the subject matter): is a product review positive or negative?

Sentiment and polarity extraction: identifying the sentiments, opinions, and emotions expressed in text.

Page 4

Applications

Classify reviews (e.g. movie reviews or product reviews) as positive or negative.

Improved search: summary statistics for search engines. For the query “Paris travel review”: 80% positive & 20% negative; or support queries like “Paris travel review: positive”.

Summarization of reviews: pick the sentences with the highest positive semantic orientation.

Filtering “flames” (abusive messages) in newsgroups.

Analysis of survey responses to open-ended questions.

Page 5

Approaches

Classifying words (or phrases) as having positive, negative, or neutral semantic orientation (polarity): (Hatzivassiloglou & McKeown, 1997), (Takamura et al., 2005), (Turney & Littman, 2003).
- positive semantic orientation -> praise (e.g. excellent, honest)
- negative semantic orientation -> criticism (e.g. bad, poor, negative)

Classifying documents (e.g. reviews) based on the overall sentiment expressed by the author as positive (thumbs up), negative (thumbs down), or neutral: (Turney, 2002), (Pang et al., 2002).

Subjectivity analysis: classifying sentences as subjective or objective: (Riloff & Wiebe, 2003).
- I bought this camera four days ago. (objective)
- This is a great camera. (subjective)

Page 6

Predicting the semantic orientation of adjectives

(Hatzivassiloglou & McKeown, 1997)

Page 7

Introduction

Task: classify adjectives as having positive or negative semantic orientation.

Motivation: use semantic orientation as a component in a larger system to identify antonyms or near synonyms.
- Antonyms usually have different semantic orientations (e.g. good vs. bad).
- Some near synonyms have different semantic orientations: one implies desirability and the other does not (e.g. simple vs. simplistic).

Approach: a corpus-based approach that infers semantic orientation from the conjunctions between adjectives.

Page 8

Conjunctions between adjectives -> semantic orientation

Adjectives conjoined with “and” are usually of the same orientation:
- fair and legitimate; corrupt and brutal
- fair and brutal (unnatural, semantically anomalous)

Adjectives conjoined with “but” are usually of different orientation:
- fair but brutal
- fair but legitimate (semantically anomalous)

The tax proposal was { simple and well-received / simplistic but well-received / *simplistic and well-received (incorrect) } by the public.

Page 9

System Overview

- All conjunctions of adjectives are extracted from the corpus (see the sketch after this list).
- A supervised learning algorithm classifies each pair of conjoined adjectives as having the same or different orientation. The result is a graph, where nodes are adjectives and links indicate the inferred same or different semantic orientation.
- A clustering algorithm is applied to the graph to separate the adjectives into two groups of different orientation (placing as many same-orientation words as possible into the same cluster).
- The group with the higher average frequency is labeled as having positive semantic orientation.
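A minimal sketch of the extraction step, assuming plain-text input and NLTK's part-of-speech tagger in place of the parser used in the paper:

```python
from collections import Counter

import nltk  # assumes the punkt and averaged_perceptron_tagger data are installed

def extract_conjoined_adjectives(text):
    """Count (adj1, adj2, conjunction) triples for 'ADJ and/but ADJ' patterns."""
    pairs = Counter()
    for sentence in nltk.sent_tokenize(text):
        tagged = nltk.pos_tag(nltk.word_tokenize(sentence))
        for (w1, t1), (w2, _), (w3, t3) in zip(tagged, tagged[1:], tagged[2:]):
            # JJ* marks adjectives; the paper also handles either-or / neither-nor
            if t1.startswith("JJ") and t3.startswith("JJ") and w2.lower() in ("and", "but"):
                pairs[(w1.lower(), w3.lower(), w2.lower())] += 1
    return pairs

print(extract_conjoined_adjectives("The plan was fair and legitimate but slow."))
```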

Page 10

Corpus

The 21-million-word 1987 Wall Street Journal corpus.

Training set:
- adjectives occurring more than 20 times
- remove adjectives that have no orientation (e.g. medical, domestic)
- remove adjectives for which a unique label can't be assigned because the label depends on context: cheap (positive when a synonym for inexpensive; negative when it implies inferior quality)

Final set: 1,336 adjectives (657 positive, 679 negative). Annotator agreement: 96.97%.

Page 11

Validation of Conjunction Hypothesis

15,048 conjunction tokens extracted from the corpus (4,024 with both members in the set of pre-selected adjectives); 9,296 distinct conjoined adjective pairs (2,748 with both members in the set of pre-selected adjectives).

Conjoined adjectives are not evenly split between same and different orientation: conjoined adjectives are usually of the same orientation (except with “but”).

Each conjunction token is classified by the parser according to three variables:
- the conjunction used (and, but, either-or, neither-nor)
- the type of modification (attributive, predicative, appositive, resultative)
- the number of the modified noun (singular or plural)

Page 12

Prediction of Link Type

Morphological relationships between adjectives (e.g. adequate - inadequate, thoughtful - thoughtless) predict a different-orientation link with 97.06% accuracy, but apply to only very few of the possible pairs.

Log-linear regression model:
- The feature vector for an adjective pair holds the observed counts in the various conjunction categories (e.g. conjunction used: and; type of modification: attributive; modified noun: singular).
- It gives only a small improvement, but rates each prediction with a value between 0 and 1. A rough illustration follows.
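As an illustration (not the paper's exact model), a logistic regression over conjunction-category counts can stand in for the log-linear classifier; the feature layout and counts below are invented for the example:

```python
from sklearn.linear_model import LogisticRegression

# Each row: how often the adjective pair was observed in each conjunction
# category, e.g. [and/attributive, and/predicative, but/attributive, but/predicative].
X_train = [
    [5, 2, 0, 0],  # e.g. "fair and legitimate": same orientation
    [0, 0, 3, 1],  # e.g. "fair but brutal": different orientation
    [4, 1, 0, 1],
    [1, 0, 2, 2],
]
y_train = [1, 0, 1, 0]  # 1 = same orientation, 0 = different orientation

model = LogisticRegression().fit(X_train, y_train)
# predict_proba yields the graded 0-1 score later turned into a dissimilarity
print(model.predict_proba([[2, 1, 0, 0]])[0, 1])
```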

Page 13

Clustering Adjectives

Construct a graph where nodes are adjectives and links carry dissimilarity values in [0, 1]:
- same-orientation links: low dissimilarity, 1 − P(classification correct)
- different-orientation links: high dissimilarity, P(classification correct)
- non-conjoined adjectives: neutral dissimilarity (0.5)
- with the log-linear model: dissimilarity = 1 − y, where y is the model's graded prediction (1 if same orientation)

Clustering: partition the adjective set A into clusters C1 and C2 by minimizing the objective function

Φ(C1, C2) = Σ_{i=1,2} (1/|C_i|) Σ_{x,y ∈ C_i} d(x, y)

subject to C1 ∪ C2 = A and C1 ∩ C2 = ∅. A toy sketch follows.
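A toy sketch of the partitioning step under this objective. The paper uses an iterative optimization of the same flavor; the greedy move-based search and the dissimilarity values below are illustrative:

```python
import itertools

def phi(clusters, d):
    """Objective: sum over clusters of the average within-cluster dissimilarity."""
    total = 0.0
    for c in clusters:
        if c:
            pairs = itertools.combinations(sorted(c), 2)
            total += sum(d.get(p, 0.5) for p in pairs) / len(c)  # 0.5 = neutral
    return total

def cluster(words, d, max_iters=50):
    """Greedy local search: move a word across the partition if it lowers phi."""
    c1, c2 = set(words[::2]), set(words[1::2])  # arbitrary initial partition
    best = phi((c1, c2), d)
    for _ in range(max_iters):
        improved = False
        for w in words:
            src, dst = (c1, c2) if w in c1 else (c2, c1)
            src.discard(w); dst.add(w)
            score = phi((c1, c2), d)
            if score < best:
                best, improved = score, True
            else:
                dst.discard(w); src.add(w)  # undo the move
        if not improved:
            break
    return c1, c2

# dissimilarities keyed by alphabetically sorted pairs; missing pairs default to 0.5
d = {("corrupt", "fair"): 0.9, ("brutal", "fair"): 0.9,
     ("fair", "legitimate"): 0.1, ("brutal", "corrupt"): 0.1}
print(cluster(["fair", "legitimate", "corrupt", "brutal"], d))
```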

Page 14

Results

- Experimented with different test sets by varying graph connectivity (denser and sparser test sets).
- Test set: all the adjectives that have at least alpha connections.
- Training set: the rest of the adjectives.

The goodness of fit of each word can be used as a quantitative measure of the strength of its orientation.

Page 15

Measuring praise and criticism: Inference of semantic orientation from association

(Turney & Littman, 2003)

Page 16

Introduction

Infer the semantic orientation of a word from its statistical association A with a set of positive and negative seed words.

Hypothesis: the semantic orientation of a word tends to correspond to the semantic orientation of its neighbors.

SO-A(word) = Σ_{p ∈ Pwords} A(word, p) − Σ_{n ∈ Nwords} A(word, n)

- SO-A(word) > 0: positive semantic orientation
- SO-A(word) < 0: negative semantic orientation
- |SO-A(word)|: strength of the semantic orientation

The 14 seed words (chosen for their lack of sensitivity to context, and forming opposing pairs): Pwords = {good, nice, excellent, positive, fortunate, correct, superior}, Nwords = {bad, nasty, poor, negative, unfortunate, wrong, inferior}.

An unsupervised approach: only 14 labeled seed words.

Page 17

Semantic Orientation from PMI

Uses Pointwise Mutual Information (PMI) to calculate the strength of the semantic association between words:

PMI(word1, word2) = log2 [ p(word1 & word2) / (p(word1) p(word2)) ]

- positive PMI: the words tend to co-occur
- negative PMI: the presence of one word makes it likely that the other is absent

PMI is estimated by issuing queries to the AltaVista search engine (~350 million web pages) using the NEAR operator (words within 10 words of each other); word occurrence is measured by the number of hits (matching documents). A sketch follows.
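A small sketch of SO-PMI from hit counts, following the hypothesis above. Since AltaVista no longer exists, `hits` is a hypothetical query-count function supplied by the caller:

```python
import math

PWORDS = ["good", "nice", "excellent", "positive", "fortunate", "correct", "superior"]
NWORDS = ["bad", "nasty", "poor", "negative", "unfortunate", "wrong", "inferior"]

def so_pmi(word, hits, smoothing=0.01):
    """Sum PMI with the positive seeds minus PMI with the negative seeds.
    hits(q) must return the number of documents matching query q."""
    score = 0.0
    for p, n in zip(PWORDS, NWORDS):
        score += math.log2(
            (hits(f"{word} NEAR {p}") + smoothing) * (hits(n) + smoothing)
            / ((hits(f"{word} NEAR {n}") + smoothing) * (hits(p) + smoothing))
        )
    return score  # > 0: positive orientation, < 0: negative orientation
```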

Page 18

Semantic Orientation from LSA

Applies Latent Semantic Analysis (LSA) to calculate the strength of the association between words.

- SVD decomposes the word-by-document matrix as X = U Σ V^T. The k largest singular values, with their corresponding singular vectors from U and V, give the rank-k approximation to X with the smallest error.
- This compressed version of the original matrix is expected to merge the dimensions associated with terms that have similar meanings; see the sketch below.
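A minimal LSA sketch with NumPy, assuming a small word-by-document count matrix; SO-LSA then sums cosine similarities to the positive seed words and subtracts those to the negative seed words:

```python
import numpy as np

def lsa_similarity(X, i, j, k=2):
    """Cosine similarity between words i and j in the rank-k LSA space.
    X is a word-by-document count matrix (one row per word)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    W = U[:, :k] * s[:k]  # word vectors in the reduced k-dimensional space
    wi, wj = W[i], W[j]
    return wi @ wj / (np.linalg.norm(wi) * np.linalg.norm(wj))
```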

Page 19

Evaluation

Two different lexicons:
- HM: adjectives from (Hatzivassiloglou & McKeown, 1997)
- GI (General Inquirer) lexicon: 3,596 adjectives, adverbs, nouns, and verbs (1,614 positive, 1,982 negative); ambiguous words eliminated, e.g. “mind” in the sense of intellect (positive) vs. “mind” in the sense of beware (negative)

Three corpora of different sizes:
- AV-ENG: ~350 million English web pages indexed in AltaVista
- AV-CA: ~7 million English web pages in the Canadian domain (~2 billion words)
- TASA: ~10 million words; short documents from novels, newspaper articles, etc.

Page 20

Evaluation

Accuracy of the HM algorithm is between 78.08% and 92.37%, comparable with SO-PMI on the medium-sized corpus; when the large corpus is used, SO-PMI is better.

Page 21

Effect of Corpus Size

The GI lexicon gives slightly lower results, but shows the same trend.

Page 22

Effect of Neighborhood Size

On the small TASA corpus, neighborhood sizes > 100 work better.

The NEAR operator works better than AND.

Page 23

LSA vs PMI

LSA is better for small corpora, but not scalable to large corpora.

Page 24

Effect of Seed Words

The selection of seed words is important.

Page 25

Effect of Seed Words

“pick”, “raise”, and “capital” are negative only in certain contexts (context dependent), e.g. “raise a protest”, “capital offense”.

Page 26

Extracting semantic orientations of words using spin model

(Takamura et al., 2005)

Page 27

Introduction

Each electron has a direction of spin (up or down); each word has a semantic orientation (positive or negative). Regard words as a set of electrons and use spin models for electrons to identify the semantic orientations of words.

Page 28

Spin Model

Also called the Ising spin model. A spin system is an array of N electrons, each with a spin of +1 (up) or -1 (down). Two electrons next to each other energetically tend to have the same spin value.

The energy function of a spin system:

E(x; W) = −(1/2) Σ_{i,j} w_ij x_i x_j

The variable x follows the Boltzmann distribution:

P(x|W) = exp(−βE(x; W)) / Z(W)

where Z(W) is the normalization factor and β is a constant called the inverse temperature. A spin configuration with higher energy has smaller probability. There are 2^N configurations of spins, making exact computation difficult.

Page 29

Mean Field Approximation

- Approximate P(x|W) with a simple factorized function Q(x; θ).
- Select the parameters θ such that P and Q are as similar to each other as possible.
- Distance between P and Q: the variational free energy F, the difference between the mean energy with respect to Q and the entropy of Q.

Mean field equation, solved by an iterative update rule (sketched below):

x̄_i = tanh(β Σ_j w_ij x̄_j)
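A sketch of the iterative mean-field update. Clamping the seed words to their known orientation is a simplification here; the paper incorporates the seed labels into the probabilistic model itself:

```python
import numpy as np

def mean_field(W, seeds, beta=1.0, iters=100):
    """Iterate the mean-field update on the lexical network.
    W: symmetric weight matrix; seeds: {word index: +1 or -1}."""
    x = np.zeros(len(W))
    for i, spin in seeds.items():
        x[i] = spin  # initialize seed words with their known orientation
    for _ in range(iters):
        x = np.tanh(beta * (W @ x))
        for i, spin in seeds.items():
            x[i] = spin  # keep seeds clamped during the updates
    return x  # sign(x[i]) is the predicted orientation of word i
```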

Page 30

Construction of Lexical Networks

Gloss (G) network: link two words if one appears in the gloss of the other word.
- SL: same-orientation links; DL: different-orientation links.
- If one word precedes a negation word in the gloss of the other word, the link is in DL.
- Links are weighted by w_ij = ±1 / sqrt(d(i) d(j)), where d(i) is the degree of word i.

Gloss-Thesaurus (GT) network: also link synonyms, antonyms, and hypernyms; only antonym links are in DL.

Gloss-Thesaurus-Corpus (GTC) network: additionally use the method of (Hatzivassiloglou & McKeown, 1997):
- if adjectives are connected with “and”, the link is in SL
- if adjectives are connected with “but”, the link is in DL

A simplified construction sketch follows.
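A simplified sketch of the gloss-network construction using NLTK's WordNet interface (no lemmatization, negation handling, or link weighting):

```python
from collections import defaultdict

from nltk.corpus import wordnet as wn  # assumes the WordNet data is installed

def gloss_links(words):
    """Link two words when one appears in a WordNet gloss of the other."""
    vocab = set(words)
    links = defaultdict(set)
    for w in words:
        for synset in wn.synsets(w):
            for token in synset.definition().lower().split():
                if token in vocab and token != w:
                    links[w].add(token)
                    links[token].add(w)  # keep the network symmetric
    return links

print(gloss_links(["good", "bad", "excellent", "poor"]))
```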

Page 31

Prediction of β: Magnetization

Magnetization: m = (1/N) Σ_i x̄_i

- At high temperatures, spins are randomly oriented (paramagnetic phase) => m ≈ 0.
- At low temperatures, most spins point in the same direction (ferromagnetic phase) => m ≠ 0.
- Phase transition: at some intermediate temperature, the ferromagnetic phase changes to the paramagnetic phase.
- Slightly before the phase transition, spins are locally polarized: strongly connected spins have the same polarity.
- The state of the lexical network is assumed to be locally polarized, so calculate m for different values of β and select the value just before the phase transition.

Page 32

Evaluation

- Construct an English lexical network using the glosses, synonyms, antonyms, and hypernyms of WordNet (~88,000 words), plus 804 conjunctive expressions from the Wall Street Journal.
- Gold standard: the GI lexicon. For comparison, (Turney & Littman, 2003) report 82.84% with 14 seed words.

Page 33

Evaluation

Page 34

Thumbs up or thumbs down?: Semantic orientation applied to unsupervised classification of reviews

(Turney, 2002)

Page 35

Introduction

An unsupervised learning algorithm for classifying reviews as recommended (thumbs up) or not recommended (thumbs down).

The semantic orientation of phrases containing adjectives or adverbs is calculated using PMI-IR (Pointwise Mutual Information - Information Retrieval).

Extract two-word phrases containing adjectives or adverbs, to capture more context than single words (see the sketch after this list):
- unpredictable steering (negative in a car review)
- unpredictable plot (positive in a movie review)

If the average semantic orientation of the phrases in a review is positive, the review is recommended; if negative, not recommended.
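A rough sketch of the phrase extraction with NLTK, using a subset of the paper's part-of-speech patterns and omitting its constraints on the word that follows (JJ = adjective, RB = adverb, NN = noun):

```python
import nltk

# A subset of the extraction patterns: (tag of first word, tag of second word).
PATTERNS = {("JJ", "NN"), ("JJ", "NNS"), ("RB", "JJ"), ("RBR", "JJ"), ("RBS", "JJ")}

def extract_phrases(text):
    """Extract candidate two-word phrases containing an adjective or adverb."""
    phrases = []
    tagged = nltk.pos_tag(nltk.word_tokenize(text))
    for (w1, t1), (w2, t2) in zip(tagged, tagged[1:]):
        if (t1, t2) in PATTERNS:
            phrases.append(f"{w1} {w2}")
    return phrases

print(extract_phrases("The unpredictable plot kept me guessing."))
```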

Page 36

Semantic Orientation of a Phrase

Estimate the semantic orientation of a phrase using PMI-IR. Reference words come from the five-star review rating system (5 stars: “excellent”; 1 star: “poor”):

SO(phrase) = PMI(phrase, “excellent”) − PMI(phrase, “poor”)

SO(phrase) is positive if the phrase is more strongly associated with “excellent”, and negative if more strongly associated with “poor”.

Estimated with the AltaVista search engine and its NEAR operator, where hits(query) is the number of hits returned:

SO(phrase) = log2 [ hits(phrase NEAR “excellent”) · hits(“poor”) / (hits(phrase NEAR “poor”) · hits(“excellent”)) ]

Page 37

Example processed reviews

Recommended review (of Bank of America)

Not recommended review (of Bank of America)

Page 38

Results

410 reviews from Epinions, randomly sampled from four different domains: 170 (41%) not recommended, 240 (59%) recommended.

- Little variation in accuracy within a domain (except travel).
- Strong positive correlation between the average semantic orientation and the author's rating out of five stars.

Page 39

Movie reviews hard to classify

- Positive reviews mention unpleasant things (e.g. violent scenes).
- Negative reviews mention pleasant things (e.g. a talented actor).
- Movie reviews cover different elements: actors, events, style, art; e.g. talented actors might not add up to a good movie.

Page 40

Thumbs up? Sentiment classification using machine learning techniques

(Pang et al., 2002)

Page 41

Introduction

Classifying documents (movie reviews) by overall sentiment (positive or negative).

Examines the effectiveness of applying machine learning techniques (Naïve Bayes, maximum entropy classification, support vector machines) to the sentiment classification problem.

Sentiment classification is more challenging than topic-based classification: “How could anyone sit through this movie?” contains no word that is obviously negative. Sentiment seems to require more understanding than usual topic-based classification.

Page 42

Movie Reviews Domain

Data source: the IMDb archive of the rec.arts.movies.reviews newsgroup.

- Selected reviews where the author's rating was expressed with stars or a numerical value, automatically converted to one of three categories: positive, negative, or neutral.
- Imposed a limit of fewer than 20 reviews per author per sentiment category.
- 1,301 positive reviews, 752 negative reviews, 144 reviewers.

Page 43

Closer Look at the Problem

Page 44

Results

- Randomly selected 700 positive and 700 negative documents.
- Accuracy reported with 3-fold cross-validation.

Modeling negation (“good” vs. “not very good”): add the tag NOT_ to every word between a negation word (“not”, “isn't”, “didn't”, etc.) and the first punctuation mark following the negation word. A minimal sketch:
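```python
import re

NEGATIONS = {"not", "isn't", "didn't", "no", "never", "can't", "won't"}

def tag_negation(text):
    """Prefix NOT_ to every word between a negation word and the next punctuation."""
    out, negating = [], False
    for token in re.findall(r"[\w']+|[.,!?;:]", text):
        if token in ".,!?;:":
            negating = False  # punctuation ends the negation scope
            out.append(token)
        elif token.lower() in NEGATIONS:
            negating = True
            out.append(token)
        else:
            out.append("NOT_" + token if negating else token)
    return " ".join(out)

print(tag_negation("This is not very good, but watchable."))
# -> This is not NOT_very NOT_good , but watchable .
```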

Features: unigrams, bigrams, part-of-speech tags, adjectives, and position (first quarter, last quarter, or middle half of the document).

- NB tends to do the worst and SVMs tend to do the best.
- Unigram presence information is the most effective (this contradicts topic-based classification results).

Page 45

Discussion

The “thwarted expectations” narrative: the author sets up a deliberate contrast to the earlier discussion.

"This film should be brilliant. It sounds like a great plot, the actors are first grade, and the supporting cast is good as well, and Stallone is attempting to deliver a good performance. However, it can’t hold up."

Some form of discourse analysis is necessary (using more sophisticated techniques than the positional features mentioned above), or at least some way of determining the focus of each sentence.

Page 46

Learning extraction patterns for subjective expressions

(Riloff & Wiebe, 2003)

Page 47

Introduction

Classify sentences as subjective or objective.

Subjective language can be expressed with a variety of words or phrases, some of which are very rare:
- strongly subjective adjectives: preposterous, unseemly
- metaphorical or idiomatic phrases: drives (someone) up the wall, swept off one's feet

To acquire a broad and comprehensive subjectivity vocabulary, subjectivity learning systems must be trained on large text collections. Use bootstrapping to allow subjectivity classifiers to learn from unannotated text.

Page 48

Bootstrapping Process

HP-Subj (high-precision subjective classifier): at least 2 strongly subjective clues (91.5% precision, 31.9% recall).
HP-Obj (high-precision objective classifier): no strongly subjective clues and at most one weakly subjective clue in the current, previous, and next sentences (82.6% precision, 16.4% recall).

Page 49

Learning Subjective Extraction Patterns

Apply syntactic templates to the training corpus to extract patterns (the slide shows the templates, example learned patterns, and patterns with interesting behavior).

Rank patterns by their conditional probability of occurring in subjective sentences:

Pr(subjective | pattern_i) = subjfreq(pattern_i) / freq(pattern_i)

A pattern is selected if freq(pattern_i) ≥ θ1 and Pr(subjective | pattern_i) ≥ θ2. A sketch of this selection rule follows.
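A small sketch of the selection rule; the thresholds and pattern counts below are illustrative, not the paper's values:

```python
def select_patterns(freq, subjfreq, theta1=5, theta2=0.9):
    """Keep patterns that are frequent enough and strongly tied to subjectivity."""
    selected = {}
    for pattern, f in freq.items():
        p_subj = subjfreq.get(pattern, 0) / f
        if f >= theta1 and p_subj >= theta2:
            selected[pattern] = p_subj
    return selected

freq = {"<subj> was asked": 11, "expressed <dobj>": 20}
subjfreq = {"<subj> was asked": 6, "expressed <dobj>": 19}
print(select_patterns(freq, subjfreq))
```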

Page 50

Evaluation

Evaluation of learned patterns

Evaluation of Bootstrapping

Page 51

Summary and Discussion

Three approaches to sentiment and polarity extraction.

Semantic orientation of words: (Hatzivassiloglou & McKeown, 1997), (Takamura et al., 2005), (Turney & Littman, 2003).
- The performances are comparable to each other.
- (Hatzivassiloglou & McKeown, 1997) applies only to adjectives. Can it be extended to other classes of words?
  - Adverbs: “He ran quickly but awkwardly.”
  - Nouns & verbs: “the rise and fall of the Roman Empire”, “love and death”
- (Turney & Littman, 2003): querying AltaVista takes time, and LSA is not scalable to large corpora.
- (Takamura et al., 2005): slightly lower performance, but a strong theoretical model.

Page 52

Summary and Discussion

None of the methods deal with ambiguous words, whose semantic orientation depends on the context:
- “lose one's mind”: negative; “right mind”: positive
- “unpredictable steering”: negative; “unpredictable plot”: positive

Can word sense disambiguation help? What methods can be used?

Page 53

Summary and Discussion

Classifying documents:
- (Turney, 2002): average semantic orientation of the phrases in the text; movie reviews are more difficult to classify.
- (Pang et al., 2002): traditional ML methods; sentiment classification is more difficult than topic-based classification.

Both assume that a document expresses either a positive or a negative sentiment about a subject, but a document might discuss several different aspects of an object (e.g. good actors, but a bad movie). What methods can be applied?

Page 54

Summary and Discussion

Subjectivity analysis: classify sentences as subjective or objective. Bootstrapping improved performance. Can it be used to improve the performance of word-based or document-based sentiment extraction? How?

Page 55

Thank you!