A Survey on Sentiment Mining Techniques

Survey paper submitted for CSE590 Networks and Data Mining Techniques on Sep 26, 2013

Khan Mostafa, Graduate Student, Computer Science, Stony Brook University, NY 11794, USA

Email: [email protected]

Student ID# 109365509

ABSTRACT

A survey on publications addressing challenges in and techniques of sentiment mining.

1 INTRODUCTION

Text conveys subjective and objective information, as well as the sentiments associated with it. Identifying the sentiment of a single text is an intuitive task for a human. However, identifying collective as well as individual sentiments across a large collection of textual data can be an enormous task. It requires data mining and classification techniques that automatically associate sentiments with textual data. Sentiment mining can be used to identify how people feel about a product, a topic or, more generally, an entity. This is useful to manufacturers from a business point of view. In recent years, there has been much academic research on sentiment analysis as well as practical commercial applications.

Generally, sentiment can be negative or positive. However, not every text conveys sentiment; some texts are merely objective statements. Thus, an application needs to classify texts as positive, negative or neutral while mining sentiment.

Sentiment analysis has been studied from the perspectives of data mining, machine learning, natural language processing and statistical analysis. In this article, I try to address several aspects of sentiment mining. I survey several papers, starting with a text that familiarizes readers with the basic ideas of automatic sentiment analysis. Then, I briefly address a well-cited paper that instigated much of the research on sentiment mining as a specialized classification task. The next paper discusses using microblogging sites like Twitter for sentiment analysis and opinion mining. The papers after that discuss different approaches to sentiment classification. One focuses specifically on mining large real-time streaming data, and the last paper touches on the case of ironic speech.

Each of the surveyed papers addresses a slightly different aspect of sentiment mining, and together they give a broad view of the overall problem domain.

2 SENTIMENT ANALYSIS AND OPINION MINING

2.1 Automatic Sentiment Analysis in On-line Text

I open the survey with a relatively old, though not ancient, text (Boiy, et al. 2007) on sentiment analysis, which introduces readers to the basic concepts, methodology, techniques and challenges of the topic. The authors first make sentiment concrete by introducing the concept of emotions. Emotions can occur in text as appraisal, direct expressions, elements of action and remarks.

They then introduce methodologies for identifying the emotion (and thus the sentiment) of a text, exploring both symbolic techniques and machine learning techniques. To employ machine learning techniques, we first need to select features. Common candidate features include parts of speech (POS), unigrams, n-grams, lemmas, negations, opinion words and adjectives. The authors then mention support vector machines, multinomial naïve Bayes and maximum entropy as three examples of supervised methods.
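
As an illustration of the feature selection step, the following minimal Python sketch turns a text into binary unigram, bigram and negation features. The tokenizer, the negation list and the feature names are assumptions made for this example; they are not taken from (Boiy, et al. 2007).

```python
import re

# Hypothetical negation cue list, chosen only for this illustration.
NEGATIONS = {"not", "no", "never"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def ngrams(tokens, n):
    """Return all contiguous n-grams as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_features(text):
    """Map a text to a dictionary of binary presence features."""
    tokens = tokenize(text)
    features = {}
    for gram in ngrams(tokens, 1):
        features["uni=" + gram] = True
    for gram in ngrams(tokens, 2):
        features["bi=" + gram] = True
    # A single flag recording whether any negation cue occurs.
    features["has_negation"] = any(t in NEGATIONS for t in tokens)
    return features
```

A dictionary of this kind could then be handed to any of the supervised methods named above; POS tags and lemmas would need an external tagger and are left out of the sketch.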

The authors also focus on several challenges. One challenge is that a single text often expresses sentiment about several topics, some negative and some positive. Therefore, it can be useful to investigate the relation between topic and sentiment. Moreover, many texts are not subjective but merely neutral, objective statements. So, before estimating sentiment polarity, it is useful to identify whether a text bears any sentiment at all. A similar challenge is cross-domain classification. Another important issue is text quality: text gathered from the web is often intertwined with a fair amount of junk, which requires a decent amount of filtering.

2.2 Thumbs up? Sentiment Classification using Machine Learning Techniques

Pang, et al. (Pang, Lee and Vaithyanathan 2002), among many others, investigated sentiment classification at an early stage and posed several challenges in the field. They aimed to “examine whether it suffices to treat sentiment classification simply as a special case of topic-based categorization or whether special sentiment-categorization methods need to be developed.” They employed three machine learning techniques that perform well in topic categorization, namely (a) naïve Bayes, (b) maximum entropy classification and (c) support vector machines, only to find that these do not perform as well on sentiment classification as they do on topic categorization. Thus, they ended with an open question for researchers to investigate further.

2.3 Twitter as a Corpus for Sentiment Analysis and Opinion Mining

Pak and Paroubek (Pak and Paroubek 2010) study how a microblogging platform can be used for sentiment analysis. They mined Twitter to automatically collect a corpus of posts with positive and negative sentiment (subjective) as well as objective (neutral) posts. They cleverly exploited emoticons to associate sentiment with tweets; a similar approach was demonstrated by J. Read (Read 2005). They queried Twitter for two types of emoticons:

Happy emoticons: “:-)”, “:)”, “=)”, “:D” etc.

Sad emoticons: “:-(”, “:(”, “=(”, “;(” etc.

In addition, they collected objective (neutral) posts by retrieving posts from the Twitter accounts of newspapers and magazines.
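
The emoticon-based labeling can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the emoticon lists are the examples quoted above (the “etc.” is not filled in), and the rule of discarding posts that contain both kinds of emoticon is an assumption of this sketch, not a detail confirmed by the paper.

```python
# Emoticons quoted in the text above; the original lists end with "etc.".
HAPPY = (":-)", ":)", "=)", ":D")
SAD = (":-(", ":(", "=(", ";(")


def label_by_emoticon(post):
    """Return 'positive', 'negative', or None for a post.

    Posts with both kinds of emoticon, or with none, get None here;
    that tie-breaking rule is an assumption of this sketch.
    """
    has_happy = any(e in post for e in HAPPY)
    has_sad = any(e in post for e in SAD)
    if has_happy and not has_sad:
        return "positive"
    if has_sad and not has_happy:
        return "negative"
    return None


def build_corpus(posts):
    """Keep only the posts that receive an unambiguous label."""
    labeled = []
    for post in posts:
        label = label_by_emoticon(post)
        if label is not None:
            labeled.append((post, label))
    return labeled
```

The point of the design is that no human annotation is needed: the author of each post has, in effect, labeled it already.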

Pak and Paroubek analyzed the collected corpus by first tagging its posts with TreeTagger (Schmid 1994) and then comparing the distributions of tags pairwise between two sets. For the subjective set versus the objective set, they observed that POS tags are not evenly distributed and postulated that this feature can be used to separate objective from subjective posts. They made a similar observation for positive versus negative posts.

For training a sentiment classifier, they used the presence of n-grams as a binary feature. They claimed that higher-order n-grams are better at capturing sentiment expressions, while unigrams have good coverage of the data. While constructing n-grams, they attached negations to adjacent terms. They then used a naïve Bayes classifier and claimed that it performs better than SVM or CRF (Lafferty, McCallum and Fernando 2001). They trained two naïve Bayes classifiers, one (a) n-gram based and the other (b) POS based. To attain a final result, they estimated sentiment using both classifiers and calculated the log-likelihood of each sentiment. To increase accuracy, they suggested discarding common n-grams; for this, they used only n-grams with low Shannon entropy values. They evaluated their system on hand-annotated real Twitter posts.
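
Two of the steps above, attaching negations and filtering n-grams by Shannon entropy, can be sketched as follows. This is a simplified illustration, not the authors' implementation: the negation list is hypothetical, the negation word is fused only with the term that follows it, and the entropy of an n-gram is computed over the sentiment labels of the documents in which it appears.

```python
import math
from collections import Counter, defaultdict

NEGATIONS = {"not", "no"}  # hypothetical cue list for this sketch


def attach_negation(tokens):
    """Fuse each negation word with the token that follows it."""
    out = []
    i = 0
    while i < len(tokens):
        if tokens[i] in NEGATIONS and i + 1 < len(tokens):
            out.append(tokens[i] + "+" + tokens[i + 1])
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out


def ngram_entropy(labeled_docs):
    """Shannon entropy (bits) of the label distribution of each n-gram.

    labeled_docs is a list of (ngram_list, label) pairs; presence is
    counted once per document, matching binary features.
    """
    counts = defaultdict(Counter)
    for grams, label in labeled_docs:
        for gram in set(grams):
            counts[gram][label] += 1
    entropy = {}
    for gram, by_label in counts.items():
        total = sum(by_label.values())
        h = 0.0
        for c in by_label.values():
            p = c / total
            h -= p * math.log2(p)
        entropy[gram] = h
    return entropy


def select_ngrams(labeled_docs, max_entropy):
    """Keep n-grams whose entropy does not exceed the threshold."""
    ent = ngram_entropy(labeled_docs)
    return {g for g, h in ent.items() if h <= max_entropy}
```

An n-gram that occurs equally often under every label has high entropy and says little about sentiment, which is why discarding such n-grams can help accuracy. The threshold `max_entropy` is a free parameter of the sketch.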

The methodology presented here is well suited to this particular case. In particular, building the training corpus automatically is a clever idea. Besides, the combination of n-gram based and POS based classification goes a long way toward addressing the challenge of the topic-sentiment relation. However, this methodology does not address how to handle streaming data that changes over time.

2.4 Using Appraisal Taxonomies for Sentiment Analysis

In their paper on sentiment analysis, Whitelaw, et al. (Whitelaw, Garg and Argamon 2005) suggest using appraisal taxonomies for sentiment classification. They argued that approaches to sentiment analysis should go beyond (a) bag of words and (b) mood-classified words. They identified the need for semantic analysis of attitude expressions and hypothesized that the atomic units of sentiment expression are not individual words but rather appraisal groups.

They adopted four main types of attributes for appraisal groups: Attitude, Orientation, Graduation and Polarity, taken from Martin and White’s Appraisal Theory (Martin and White 2005). They described a semi-automated technique to construct a lexicon of appraisal groups. To do so, they used terms from (Martin and White 2005) as seed terms and generated candidate expansions using WordNet and two other thesauri. They used a coarse ranking by relevance to list such terms, and then manually inspected each ranked list to produce the final set of terms. They then tested several feature sets, e.g. Words by Attitude, Systems by Attitude, and Appraisal Group by Attitude & Orientation. They evaluated the effectiveness of the feature sets for classifying IMDb movie reviews. They found that the union of bag-of-words and appraisal group by attitude & orientation (BoW+G:AO) yields the best result.

The approach demonstrated in this paper has several drawbacks in terms of scalability, especially because building the lexicon involves much manual effort and the objective function for classification tends to be computationally intensive. However, the work drew researchers’ attention to an important notion: sentiment analysis should concentrate on key terms rather than on the whole corpus. A similar observation was made by (Benamara, et al. 2007) and (Subrahmanian and Reforgiato 2008), stating that “Adjectives and Adverbs are better than Adjectives Alone”. Moreover, the essence of the outcome of (Whitelaw, Garg and Argamon 2005) is analogous to what (Pak and Paroubek 2010) exploit in their work by classifying sentiment based on both POS and word groups (n-grams).

2.5 Joint Sentiment/Topic Model for Sentiment Analysis

Lin and He (Lin and He 2009) addressed sentiment analysis from a slightly different perspective by combining it with topic detection. They proposed an extension of the topic model Latent Dirichlet Allocation (LDA) that adds a sentiment layer. Their model, called the Joint Sentiment/Topic (JST) model, is fully unsupervised and can detect sentiment and topic simultaneously at the document level.

They describe it as follows: “The existing framework of LDA has three hierarchical layers, where topics are associated with documents, and words are associated with topics. In order to model document sentiments, we propose a joint sentiment/topic (JST) model by adding an additional sentiment layer between the document and the topic layer. Hence, JST is effectively a four-layer model, where sentiment labels are associated with documents, under which topics are associated with sentiment labels and words are associated with both sentiment labels and topics.” They observed that the sentiment-document distribution plays an important role in determining the polarity of a document.
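
The four-layer structure in the quotation can be made concrete with a small generative sketch in Python. It is an illustration only: the symmetric Dirichlet draws, the hyperparameter names `gamma` and `alpha`, and the function names are assumptions in the usual LDA style, and the sketch covers only how a document would be generated, not the inference procedure of (Lin and He 2009).

```python
import random


def dirichlet(rng, alpha, k):
    """Draw a k-dimensional symmetric Dirichlet sample."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(rng, phi, n_words, gamma=1.0, alpha=1.0):
    """Generate one document under the four-layer structure.

    phi[s][t] is a word distribution for sentiment label s and topic t.
    Returns a list of (sentiment, topic, word_id) triples.
    """
    n_sent = len(phi)
    n_topics = len(phi[0])
    vocab = range(len(phi[0][0]))
    # Layer 1 -> 2: the document's distribution over sentiment labels.
    pi = dirichlet(rng, gamma, n_sent)
    # Layer 2 -> 3: one topic distribution per sentiment label.
    theta = [dirichlet(rng, alpha, n_topics) for _ in range(n_sent)]
    doc = []
    for _ in range(n_words):
        sentiment = rng.choices(range(n_sent), weights=pi)[0]
        topic = rng.choices(range(n_topics), weights=theta[sentiment])[0]
        # Layer 3 -> 4: the word depends on both sentiment and topic.
        word = rng.choices(vocab, weights=phi[sentiment][topic])[0]
        doc.append((sentiment, topic, word))
    return doc
```

In this picture, `pi` plays the role of the sentiment-document distribution mentioned above: a document whose `pi` puts most weight on one label would be classified with that polarity.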


They also examined an alternative model, called Tying-JST, which uses a single topic-document distribution, as opposed to the individual distribution for each document in JST. However, Tying-JST consistently performs worse than JST.

JST incorporates prior information into its model to enhance accuracy. They examined four model priors: (a) paradigm word list, (b) mutual information, (c) full subjectivity lexicon and (d) filtered subjectivity lexicon.

They evaluated the accuracy obtained with the different priors, which demonstrated significant improvement over the results obtained without prior information. The filtered subjectivity lexicon was found to be the best among the priors studied.

JST is presented as a novel text mining approach for sentiment analysis and topic extraction. By simultaneously identifying topics, the model addresses the problem of the domain dependence of subjectivity (i.e., a single word can have a negative connotation in one domain whereas the same word might be positive in another domain). However, the complexity of this approach can pose a major challenge to large-scale commercial implementation. The method also considers document-level sentiment, while many applications are interested in more granular sentiment, especially sentiment towards entities.

2.6 Sentiment Knowledge Discovery in Twitter Streaming Data

Yet another perspective on sentiment analysis is investigated by (Bifet and Frank 2010), who address challenges in mining streaming “data whose nature or distribution changes over time”. The paper specifically addresses the Twitter data stream, where data arrives at high speed and prediction algorithms are required to perform in real time. The paper also covers specifics of the Twitter API and other implementation details, which I leave out of this survey.

On the question of sentiment analysis, they note the challenges posed by the succinctness of tweets and the possibility of sarcasm and irony. They also leverage the fact that many tweets are annotated by their authors with emoticons, the same idea used by (Pak and Paroubek 2010), and use such tweets as training data for a sentiment classifier. Before training, however, they preprocess tweets by (a) replacing mentions with the tag USER and hyperlinks with the tag URL, and (b) removing emoticons.
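
The preprocessing step can be sketched with regular expressions. The patterns below are assumptions made for this illustration (the paper's exact rules are not reproduced here), and the emoticon list is limited to the examples quoted earlier in this survey.

```python
import re

# Assumed patterns: a mention is "@" plus word characters, a hyperlink
# starts with http:// or https:// and runs to the next whitespace.
MENTION = re.compile(r"@\w+")
LINK = re.compile(r"https?://\S+")
# Emoticons listed earlier in this survey; a real system would need more.
EMOTICONS = [":-)", ":)", "=)", ":D", ":-(", ":(", "=(", ";("]
EMOTICON = re.compile("|".join(re.escape(e) for e in EMOTICONS))


def normalize_tweet(text):
    """Replace links and mentions with tags and strip emoticons."""
    text = LINK.sub("URL", text)      # links first, so their text is not re-matched
    text = MENTION.sub("USER", text)
    text = EMOTICON.sub("", text)
    return " ".join(text.split())     # collapse leftover whitespace
```

Removing the emoticons matters because they were used to produce the labels; leaving them in would let the classifier read the answer directly from the input.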

The authors argue that a frequently used measure, “prequential accuracy is not well-suited for data streams with unbalanced data, and that a prequential estimate of Kappa should be used instead.” The reason the authors identify is that the classes are not balanced and their proportions can vary over time, with one class often much more frequent than the other. Hence, a more appropriate measure is one that normalizes a classifier’s accuracy by that of a chance predictor, such as the Kappa statistic (Cohen 1960). They build on a suggestion by (Gama, Sebastião and Rodrigues 2009), who proposed forgetting mechanisms for the estimate, either (a) a sliding window over the most recent observations or (b) weighting observations with fading factors. The authors indicate that the outputs of both approaches are very similar and thus suggest using a sliding window with the Kappa statistic. The authors then experimented with three fast incremental methods for mining this data stream: (a) multinomial naïve Bayes, (b) stochastic gradient descent (SGD) and (c) Hoeffding tree. On the basis of their experiments, the authors suggested using SGD.
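
To make the evaluation measure concrete, the sketch below computes Cohen's Kappa, kappa = (p0 - pc) / (1 - pc), over a sliding window of recent predictions, where p0 is the observed accuracy and pc is the accuracy expected from a chance predictor with the same class proportions. The class name, the window handling and the convention of returning 0.0 in degenerate cases are assumptions of this sketch, not details from (Bifet and Frank 2010).

```python
from collections import Counter, deque


class WindowedKappa:
    """Accuracy and Kappa over the most recent (predicted, actual) pairs."""

    def __init__(self, window_size):
        # deque with maxlen drops the oldest pair automatically.
        self.window = deque(maxlen=window_size)

    def add(self, predicted, actual):
        """Record one prediction; call this before training on the example."""
        self.window.append((predicted, actual))

    def accuracy(self):
        n = len(self.window)
        if n == 0:
            return 0.0
        return sum(p == a for p, a in self.window) / n

    def kappa(self):
        n = len(self.window)
        if n == 0:
            return 0.0
        p0 = self.accuracy()
        pred = Counter(p for p, _ in self.window)
        true = Counter(a for _, a in self.window)
        # Chance agreement: product of predicted and actual class shares.
        pc = sum(pred[c] * true[c] for c in pred) / (n * n)
        if pc == 1.0:
            return 0.0  # only one class seen; convention assumed here
        return (p0 - pc) / (1.0 - pc)
```

For example, a classifier that always answers “positive” on a window with nine positive posts and one negative post reaches an accuracy of 0.9 but a Kappa of 0, which is exactly the weakness of plain accuracy that the authors point out.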

This work successfully addresses the problem of streaming data, and its novel solution can be viewed as an ideal one for that setting.

2.7 The case of irony

The last paper I examine is a more recent one by Bosco, et al. (Bosco, Patti and Bolioli 2013), a portion of which addresses the case of irony. From the perspective relevant here, irony can be seen as a polarity reverser. That being said, the question arises of how to identify irony (and other figures of speech). The authors suggest that context knowledge is important for identifying irony. In Facebook comment threads, diagonal comments can be marked as ironic. But in context-less circumstances (e.g. Twitter), world knowledge is required. Moreover, the interpretation of ironic speech can be subjective. Hence, the authors see the need to develop manually annotated corpora for irony detection and pose this as an open question to investigate.

3 CONCLUSION

In this paper, I have tried to present the core ideas behind the surveyed texts. These texts all relate to the problem domain of sentiment mining and sentiment analysis and to its challenges. Each of them addresses a different aspect of this vast problem domain and provides insight into how to build a complete solution for mining large text data and extracting sentiment from it. This survey describes what sentiment is, how to classify it, and how to use data mining and machine learning techniques to extract opinion from large corpora. It also discusses a few approaches that address the challenges of domain dependence, ironic speech, streaming data and so forth. Finally, it offers insights into identifying opinions related to entities and tracing sentiment transitions over time.

4 REFERENCES

Benamara, Farah, Carmine Cesarano, Antonio Picariello, Diego Reforgiato, and VS Subrahmanian. 2007. "Sentiment Analysis: Adjectives and Adverbs are better than Adjectives Alone." International Conference on Weblogs and Social Media. Boulder, CO USA: ICWSM.

Bifet, Albert, and Eibe Frank. 2010. "Sentiment knowledge discovery in twitter streaming data." In Discovery Science, 1-15. Berlin Heidelberg: Springer.

Boiy, Erik, Pieter Hens, Koen Deschacht, and Marie-francine Moens. 2007. "Automatic Sentiment Analysis in On-line Text." Proceedings of Conference on Electronic Publishing. Vienna, Austria: ELPUB. 349-360.

Bosco, Cristina, Viviana Patti, and Andrea Bolioli. 2013. "Developing Corpora for Sentiment Analysis: The Case of Irony and Senti-TUT." IEEE Intelligent Systems (IEEE Computer Society) 55-63.

Cohen, Jacob. 1960. "A coefficient of agreement for nominal scales." Educational and Psychological Measurement 37-46.

Gama, João, Raquel Sebastião, and Pedro Pereira Rodrigues. 2009. "Issues in evaluation of stream learning algorithms." Proceedings of the 15th ACM SIGKDD International Conference. ACM. 329-338.

Lafferty, John D., Andrew McCallum, and N.C. Fernando. 2001. "Conditional random fields: Probabilistic." Proceedings of the Eighteenth International Conference on Machine Learning. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc. 282-289.

Lin, Chenghua, and Yulan He. 2009. "Joint sentiment/topic model for sentiment analysis." Proceedings of the 18th ACM conference on Information and knowledge management. ACM. 375-384.

Martin, J. R., and P. R. R. White. 2005. Language of Evaluation: Appraisal in English. London: Palgrave. http://grammatics.com/appraisal/.

Pak, Alexander, and Patrick Paroubek. 2010. "Twitter as a Corpus for Sentiment Analysis and Opinion Mining." Language Resources and Evaluation. 1320-1326.

Pang, Bo, Lillian Lee, and Shivakumar Vaithyanathan. 2002. "Thumbs up? Sentiment Classification using Machine Learning Techniques." Proceedings of the ACL-02 conference on Empirical methods in natural language processing. Philadelphia, PA, USA: Association for Computational Linguistics. 79-86.

Read, Jonathon. 2005. "Using emoticons to reduce dependency." The Association for Computer Linguistics.

Schmid, Helmut. 1994. "Probabilistic part-of-speech tagging using decision trees." Proceedings of the International. 44-49.

Subrahmanian, Venkatramana S., and Diego Reforgiato. 2008. "AVA: Adjective-verb-adverb combinations for sentiment analysis." Intelligent Systems (IEEE) 23 (4): 43-50.

Whitelaw, Casey, Navendu Garg, and Shlomo Argamon. 2005. "Using appraisal groups for sentiment analysis." Proceedings of the 14th ACM international conference on Information and knowledge management. ACM. 625-631.
