
Proceedings of the 2012 International Conference on Machine Learning and Cybernetics, Xian, 15-17 July, 2012

WORD SENTIMENT POLARITY DISAMBIGUATION

BASED ON OPINION LEVEL CONTEXT

HUAN ZHAO1,2, YUNQING XIA2, RAYMOND Y. K. LAU3, YI LIU4

1 School of Information and Communication, Beijing University of Posts and Telecommunications, China 2 Department of Computer Science, Tsinghua University, China

3 Department of Information Systems, City University of Hong Kong, Hong Kong SAR 4 IMSL Shenzhen Key Laboratory, China

E-MAIL: [email protected], yqxia@tsinghua.edu.cn, raylau@cityu.edu.hk, [email protected]

Abstract: Many opinion keywords carry different polarities when they are used in different contexts, posing huge challenges to opinion mining research. To address the word sentiment polarity disambiguation (WSPD) task, opinion level context information is studied in this paper, and an effective method is designed to make good use of the context information to resolve the sentiment polarity ambiguity. Different from the traditional way that considers surrounding n-grams, we specially consider the associated opinion target, modifying constituents and conjunctions as the context of a given sentiment keyword. To locate the context information precisely, we make use of dependency relations between words. We then devise a statistical equation to calculate the probability that the given keyword carries a certain sentiment polarity. Preliminary results show that the method yields encouraging accuracy.

Keywords: Word polarity disambiguation, sentiment analysis, opinion target, opinion mining.

1. Introduction

1.1. Problem statement

Opinion mining research has achieved significant progress in the past decade. Many systems have been developed, some of which have even been commercialized, to achieve either article level sentiment analysis or fine-grained opinion mining. Amongst the challenging issues that opinion mining research faces, opinion polarity ambiguity is deemed a hard one. Many opinion keywords carry different polarities when they are used in different contexts, posing huge challenges to opinion mining research.

In this word sentiment polarity disambiguation (WSPD) study, we focus on opinion mining from product reviews. Our study shows that more than 25 percent of opinion keyword occurrences are polarity ambiguous in product reviews. Some typical examples containing the sentiment keyword 低 (di, low) are given in Table 1.

TABLE 1. SOME SIMPLE EXAMPLE SENTENCES

Explanation                             Polarity
Mobile phone price is low               Positive
Screen resolution is relatively low     Negative
CPU frequency is low                    Negative
Repair rate is low                      Positive
Customer service efficiency is low      Negative
Performance price ratio is low          Negative
Mobile phone radiation value is low     Positive

Note that every sentiment keyword appearing in the examples in Table 1 is explicitly associated with an opinion target. For example, "Mobile phone price is low" associates the opinion target "mobile phone price" with the sentiment keyword 低 (di, low). We also find some important modifiers in these examples. For example, "Screen resolution is relatively low" contains the modifier "relatively", which is able to disclose a negative sentiment alone. In fact, we find 244 different combinations with the sentiment keyword 低 (di, low) in our dataset (see Section 4.1). This indicates that two types of context are useful for polarity disambiguation: the associated opinion target and the modifier. If the training data is large enough, word sentiment polarity can be resolved by an opinion target and modifier sensitive probabilistic model (see Section 3.1).

In fact, real cases can be more complicated. Table 2 gives some real examples in which the opinion target is missing but some important modifiers are present. A human reader is able to determine the sentiment polarity according to these modifiers alone. This indicates that some modifiers are strong enough to handle sentiment keywords that are not associated with any opinion target. With a training dataset, how the modifiers influence sentiment polarity can be depicted statistically in a model.

TABLE 2. SOME COMPLICATED EXAMPLE SENTENCES

Explanation                     Polarity
It is too small                 NEGATIVE
It should be a bit smaller      NEGATIVE
It is irresistibly small        POSITIVE
It is unendurably high          NEGATIVE

TABLE 3. SOME MORE COMPLICATED EXAMPLE SENTENCES

Explanation                                                     Polarity
Quality of the mobile phone is not bad, but it is not small.   NEGATIVE
Though being big, it is easy to carry.                         POSITIVE
Screen resolution is high, and it is big.                      POSITIVE

In our observation, we notice some more complicated cases (see Table 3). As seen from Table 3, the examples contain conjunctions that connect two comments. Furthermore, we find that coordinating conjunctions such as "and" organize two comments with the same polarity, while adversative conjunctions such as "but" connect two comments with different polarity. This indicates that conjunctions of different types are also helpful in resolving sentiment polarity. Enlightened by this observation, we design some conjunctional rules (see Section 3.1) that make use of the conjunctions to further improve the accuracy of sentiment polarity resolution.

To summarize, we find that the context information observed in this work, i.e., the opinion target, modifying constituent and conjunction, is helpful in resolving sentiment polarity ambiguity. To distinguish our work from previous work, we refer to the above context information as opinion level context.

1.2. Our proposal and contributions

Different from the traditional way that considers surrounding n-grams, we specially consider the collocating opinion target, modifying words and conjunctions as the context of a given sentiment keyword. To locate opinion targets, we adopt the opinion target lexicon developed by Xia et al. (2009) [1] and design some simple regular patterns to recognize opinion targets within reviews. To extract the context information precisely, we make use of dependency relations between words. With these relations, we are able to find how an opinion target is associated with the sentiment keyword, and how a modifying word or phrase is used on the sentiment keyword. In this case, the HIT LTP1 is adopted.

1 HIT LTP: http://ir.hit.edu.cn/demo/ltp/
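To illustrate how the dependency relations might be used, the sketch below collects the opinion target and modifying constituent for a sentiment keyword from a parsed sentence. The Token structure, the POS tag prefixes and the ATT label here are simplifying assumptions made for illustration; the actual HIT LTP output format differs in detail.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class Token:
    idx: int    # position of the token in the sentence
    word: str   # surface form
    pos: str    # part-of-speech tag, e.g. "n" (noun), "d" (adverb), "a" (adjective)
    head: int   # index of the head token (-1 for the root)
    rel: str    # dependency relation to the head, e.g. "ATT"

def extract_opinion_context(tokens: List[Token], kw_idx: int) -> Tuple[Optional[str], Optional[str]]:
    """Collect (opinion_target, modifier) for the sentiment keyword at kw_idx.

    Heuristic sketch: a noun linked to the keyword by an ATT-like relation is
    taken as the opinion target; an adverb attached to the keyword is taken
    as the modifying constituent.
    """
    target, modifier = None, None
    kw = tokens[kw_idx]
    for tok in tokens:
        if tok.head == kw.idx and tok.rel == "ATT":
            if tok.pos.startswith("n"):
                target = tok.word        # e.g. "price" in "price is low"
            elif tok.pos.startswith("d"):
                modifier = tok.word      # e.g. "relatively" in "relatively low"
    # the keyword itself may modify a noun; treat that noun as the target
    if target is None and kw.head >= 0 and tokens[kw.head].pos.startswith("n"):
        target = tokens[kw.head].word
    return target, modifier
```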

The following contributions are made in this work: (1) It is further observed that the opinion level context is helpful in resolving sentiment polarity ambiguity. (2) A probabilistic model is designed to calculate the probability that a given keyword carries a certain sentiment polarity; the model considers not only the associated opinion targets, but also the modifying words/phrases. (3) Conjunctions are considered in this work and their functions are represented by some conjunctional rules. Preliminary results show that the proposed method yields encouraging accuracy.

1.3. Paper structure

The rest of the paper is organized as follows. In Section 2 we briefly present the related work. In Section 3 we present our method, which incorporates the probabilistic WSPD model and the conjunction rules. In Section 4 a preliminary evaluation is reported. The paper is finally concluded in Section 5.

2. Literature Review

Related work is first reviewed on general sentiment analysis and opinion mining. We thereafter investigate recent work on sentiment polarity disambiguation in more depth.

2.1. Sentiment analysis and opinion mining

The sentiment analysis concept first arose from classifying text into positive, neutral and negative as an orientation determination problem [2]. Later, Turney (2002) proposed to classify positive and negative reviews by thumbs up and thumbs down [3]. Then SentiWordNet was compiled by Esuli et al. [4]. The sentiment lexicon was proved necessary and important, and sentiment lexicons for other languages then appeared, including Chinese [5], Japanese [6] and Thai [7]. Using a sentiment lexicon is indispensable in sentiment analysis. Some studies use sentiment keywords directly as features in machine learning algorithms for sentiment classification; others use statistics on sentiment keywords [8]. As many sentiment keywords present different polarities in different contexts, further NLP techniques have become the popular approach for polarity classification.

In fine-grained opinion mining on product reviews, the sentiment lexicon is a must. One sentence may carry one or more opinions, while a single sentence is often too short, which leads to a sparse data problem in sentiment classification. However, using sentiment keywords directly in fine-grained opinion mining encounters a serious polarity ambiguity challenge.

2.2. Word polarity disambiguation

Akkaya et al. (2009) introduced subjectivity word sense disambiguation (SWSD) [9], which seeks to automatically determine which word instances in a corpus are being used with subjective senses, and which are being used with objective senses. Further work was done by Akkaya et al. (2011) to improve the method [10]. This work concentrates on word sentiment polarity disambiguation. We assume that a keyword must carry a sentiment polarity. This can be guaranteed when the keyword appears in context with a specific part-of-speech tag. For example, the word 高 (high) in Chinese can act as an adjective (i.e., high) or a noun (i.e., height) in real sentences. This research only attempts to resolve the sentiment polarity of keywords that act as adjectives.

In word sentiment polarity disambiguation (WSPD), Yi et al. (2003) used a lexicon and manually developed patterns to classify contextual polarity [11]. Though the patterns are high-quality and yield quite high precision over the set of expressions, the recall is rather low. Popescu and Etzioni (2005) used a relaxation labeling algorithm to recognize the contextual polarity of words that are at the heads of selected opinion phrases [12]. Features are used to represent conjunctions and dependency relations between polarity words, and expressions are limited either to those that target specific items of interest, such as products and product features, or to tuples of adjectives and nouns. Wilson et al. (2009) proposed to recognize the contextual polarity of all instances of words from a large lexicon of subjectivity clues that appear in the corpus [13]. The lexicon includes not only adjectives, but also nouns, verbs, adverbs, and even modals. In their work, negations of longer-distance types are handled.

This work differs from the related work in two aspects. Firstly, features are combined in a probabilistic model; as its parameters can be estimated on development data, the method becomes more robust in cases that are not covered by the training data. Secondly, conjunctions are handled by rules, which is simple and effective. The conjunctional rules are applied before the statistical model, which helps to greatly improve accuracy.

Note that word sentiment polarity disambiguation is different from word sense disambiguation (WSD). Diana Maynard (2012) pointed out that WSPD is the problem of ensuring correct target-opinion matching2, and a rule-based approach using GATE3 is being developed [14]. Using term-level context in WSPD suffers from a serious sparse data problem, because a given piece of comment is normally very short and it is very often the case that a comment does not match any rule. Thus, in this work, we attempt to consider opinion-level context in WSPD.

2 http://www.linkedin.com/groups/Sentiment-Analysis-in-contextual-situation-115439.S.101502294
3 GATE: http://gate.ac.uk

3. The method

Targeting the task of word sentiment polarity disambiguation, this work is carried out in the following three steps.

Step 1: Texts are preprocessed. In this step, we perform word segmentation, part-of-speech tagging and dependency parsing.

Step 2: Sentences containing the predefined conjunctions are handled directly by the conjunction-inferring module, which infers the sentiment polarity of the sentiment keyword with the conjunctional rules. More details are given in Section 3.1. If no conjunction is found in a sentence, it is passed to Step 3.

Step 3: The sentiment polarity of the sentiment keyword is resolved by the probabilistic inferring module, which determines the polarity probabilistically with the opinion level context. To estimate the parameters, both training data and development data are used. More details are given in Section 3.2.
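The control flow of the three steps can be sketched as follows; the three callables are placeholders for the preprocessing, conjunction-inferring and probabilistic-inferring modules, and their signatures are assumptions made only for illustration.

```python
from typing import Callable, List, Optional

def disambiguate(sentence: str,
                 keyword: str,
                 preprocess: Callable[[str], List[dict]],
                 conjunction_module: Callable[[List[dict], str], Optional[int]],
                 probabilistic_module: Callable[[List[dict], str], int]) -> int:
    """Return 1 (positive) or 0 (negative) for the given sentiment keyword."""
    tokens = preprocess(sentence)                   # Step 1: segmentation, POS tagging, parsing
    polarity = conjunction_module(tokens, keyword)  # Step 2: conjunctional rules (may abstain)
    if polarity is not None:
        return polarity
    return probabilistic_module(tokens, keyword)    # Step 3: probabilistic WSPD model
```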

3.1. The conjunction rules

Conjunctions are usually found in complicated sentences that contain two sub-sentences, i.e., a dominating part and a subordinate part. We select certain conjunctions so that the part containing a conjunction is considered the dominating part. For description convenience, the sub-sentence that contains a conjunction is called the CUR sub-sentence, and the other sub-sentence is called the OTH sub-sentence.

Conjunctions appear in many types, and only two types are used in this work: coordinative and adversative, because conjunctions of these two types are deterministic in inferring the polarity of the two combined sub-sentences. For example, coordinating conjunctions connect two sub-sentences with the same polarity, while adversative conjunctions combine two sub-sentences with reversed polarity. This is the most important clue that the conjunction rules adopt to resolve sentiment polarity.

Every conjunction rule is manually compiled based on a conjunction word. The following elements (i.e., context) are involved.

(1) Type of the conjunction word, denoted by the boolean constant t_CONJ ∈ {1, 0}. We assign t_CONJ = 1 if the type is coordinative and t_CONJ = 0 if adversative.

(2) Sentiment polarity of the OTH sub-sentence, denoted by the boolean constant s_OTH ∈ {1, 0}. If the sentiment polarity is positive, we assign s_OTH = 1, and s_OTH = 0 if negative. There are some cases in which the other sub-sentence does not carry any sentiment polarity, i.e., it is neutral. In these cases, we skip this step and let the probabilistic inferring module handle the polarity.

(3) Presence of a negating constituent in the CUR sub-sentence that holds an ATT (attribute) dependency relation with the sentiment keyword, denoted by the boolean constant n_CUR ∈ {1, 0}. We assign n_CUR = 1 if a negating constituent is found, otherwise n_CUR = 0. The situation is usually complicated when the dependency relations entail more than one negation; we simply ignore such cases.

The polarity of the sentiment keyword, s_KWD, is inferred with the following rule:

s_KWD = s_OTH ⊙ t_CONJ ⊙ (¬n_CUR)    (1)

where ⊙ denotes logical equivalence (XNOR) over {1, 0}. If s_KWD is finally assigned 1, the polarity of the sentiment keyword is deemed positive, otherwise negative. For example, in the sentence "Quality of the mobile phone is not bad, but it is not small" (Table 3), we resolve the polarity of the sentiment keyword 小 (small) as follows:

s_OTH = 1, t_CONJ = 1, n_CUR = 1
s_KWD = s_OTH ⊙ t_CONJ ⊙ (¬n_CUR) = 0

This indicates that the polarity of the sentiment keyword 小 (small) is negative.
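A minimal sketch of Equation (1) under the {1, 0} convention above, reproducing the worked example; the function names are ours, not part of the method description.

```python
def xnor(a: int, b: int) -> int:
    """Logical equivalence on {1, 0}."""
    return 1 if a == b else 0

def keyword_polarity(s_oth: int, t_conj: int, n_cur: int) -> int:
    """Equation (1): s_KWD = s_OTH xnor t_CONJ xnor (not n_CUR)."""
    return xnor(xnor(s_oth, t_conj), 1 - n_cur)

# Values from the worked example above: positive OTH sub-sentence,
# t_CONJ = 1 and a negation attached to the keyword.
assert keyword_polarity(s_oth=1, t_conj=1, n_cur=1) == 0   # 0 = negative
```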

3.2. The probabilistic WSPD model

We assume that the sentiment polarity of a sentiment keyword w is locally determined by the following elements (i.e., context) in the review sentence (or sub-sentence):

(1) The opinion target that holds an ATT (attribute) dependency relation with the sentiment keyword, denoted by t. From the perspective of language dependency and opinion constitution, an opinion (sentiment) keyword is given on a certain opinion target explicitly or implicitly (see the examples in Tables 1, 2 and 3).

(2) The modifying constituent that holds an ATT (attribute) dependency relation with the sentiment keyword, denoted by m. Sometimes, the modifying constituent alone is helpful in disclosing the sentiment polarity of the opinion.

We propose to resolve the sentiment polarity of word w within the given context with a probabilistic model as follows:

s* = argmax_s p(s | t, m, w)    (2)

Assuming that opinion target t and modifying constituent m are independent of each other in terms of distribution, we obtain:

s* = argmax_s p(s | t, w) p(s | m, w)    (3)

Applying the Bayesian formula, we further obtain:

s* = argmax_s [p(s, w | t) / p(w | t)] [p(s, w | m) / p(w | m)]    (4)

In practice, we further classify the opinion target into entity and attribute. For example, when people talk about price, which is referred to as an attribute, they must be talking about a camera, battery and so on, which is referred to as an entity. Thus Equation (3) is further refined as follows:

s* = argmax_s max{p(s | e, w), p(s | a, w)} p(s | m, w)    (5)

in which e denotes the associated entity and a the associated attribute. In cases where the entity or attribute is omitted, we simply set p(s | e, w) = 0 or p(s | a, w) = 0 accordingly.

Parameter estimation

We notice that p(s, w | e), p(s, w | a) and p(s, w | m) must be estimated from the training dataset. For p(w | e), p(w | a) and p(w | m), the development dataset is a better choice, because no polarity annotation is required. Detecting keywords, entities, attributes and modifying constituents is error-prone, so we adopt dependency relations to exclude the errors. Some complicated errors might still occur, but we assume they do not significantly influence the distributions estimated on the large development dataset.
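As a rough illustration of Equations (4) and (5) and the estimation scheme above, the following sketch counts context occurrences in the training and development data and scores the two polarities. The data formats, the smoothing constant and the handling of a missing modifier are assumptions of this sketch; the paper only specifies the equations and which dataset each probability is estimated from.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple

SMOOTH = 1e-6  # small constant to avoid division by zero (assumption)

class WSPDModel:
    """Counting-based sketch of Equations (4) and (5)."""

    def __init__(self) -> None:
        self.train_joint = Counter()  # (ctx, value, s, w) counts from labeled training data
        self.train_total = Counter()  # (ctx, value) counts from labeled training data
        self.dev_joint = Counter()    # (ctx, value, w) counts from raw development data
        self.dev_total = Counter()    # (ctx, value) counts from raw development data

    def fit(self,
            labeled: Iterable[Tuple[str, Optional[str], Optional[str], Optional[str], int]],
            raw: Iterable[Tuple[str, Optional[str], Optional[str], Optional[str]]]) -> None:
        for w, e, a, m, s in labeled:                       # p(s, w | x) from training data
            for ctx, val in (("e", e), ("a", a), ("m", m)):
                if val is not None:
                    self.train_joint[(ctx, val, s, w)] += 1
                    self.train_total[(ctx, val)] += 1
        for w, e, a, m in raw:                              # p(w | x) from development data
            for ctx, val in (("e", e), ("a", a), ("m", m)):
                if val is not None:
                    self.dev_joint[(ctx, val, w)] += 1
                    self.dev_total[(ctx, val)] += 1

    def p_s_given(self, ctx: str, val: Optional[str], s: int, w: str) -> float:
        """p(s | x, w) = p(s, w | x) / p(w | x); defined as 0 when context x is missing."""
        if val is None or self.train_total[(ctx, val)] == 0 or self.dev_total[(ctx, val)] == 0:
            return 0.0
        p_sw_x = self.train_joint[(ctx, val, s, w)] / self.train_total[(ctx, val)]
        p_w_x = self.dev_joint[(ctx, val, w)] / self.dev_total[(ctx, val)]
        return p_sw_x / (p_w_x + SMOOTH)

    def predict(self, w: str, e: Optional[str] = None, a: Optional[str] = None,
                m: Optional[str] = None) -> int:
        """Equation (5): argmax_s max{p(s|e,w), p(s|a,w)} * p(s|m,w)."""
        scores = {}
        for s in (1, 0):
            target_term = max(self.p_s_given("e", e, s, w), self.p_s_given("a", a, s, w))
            modifier_term = self.p_s_given("m", m, s, w)
            # SMOOTH keeps a missing modifier from zeroing the whole score (assumption)
            scores[s] = target_term * (modifier_term + SMOOTH)
        return max(scores, key=scores.get)
```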

4. Evaluation

The primary goal of the experiments is to evaluate the effectiveness of our method in disambiguating word sentiment polarity in two domains.

4.1. The datasets

Different from the previous evaluation setup that uses a complete database (e.g., the MPQA corpus), we used only the reviews that contain the polarity-ambiguous sentiment keywords. This allows us to concentrate on word sentiment polarity disambiguation, and helps to demonstrate the contribution of the proposed method directly. Thus, we selected the top 20 most frequently occurring sentiment keywords from the lexicon. This work was conducted on Chinese reviews, so the annotation dataset is extracted from the Opinmine corpus [15]. To make the evaluation more convincing, we apply our method in two product domains: mobile phone and digital camera. The selected polarity-ambiguous keywords for the two domains are presented in Table 4 and Table 5.

TABLE 4. THE POLARITY-AMBIGUOUS SENTIMENT KEYWORDS IN MOBILE PHONE DOMAIN (ENGLISH GLOSSES OF THE CHINESE KEYWORDS)

a lot     little    improve    decrease
high      low       extra      sensitive
quick     thin      serious    decrease
small     light     increase   mean
simple    heavy     drop       surprising

TABLE 5. THE POLARITY-AMBIGUOUS SENTIMENT KEYWORDS IN DIGITAL CAMERA DOMAIN (ENGLISH GLOSSES OF THE CHINESE KEYWORDS)

a lot     quick     improve    thick
high      simple    heavy      deep
big       low       thin       hard
small     little    increase   drop
long      light     extra      decrease

We find that the polarity-ambiguous keywords in the two domains are slightly different. With these keywords, we constructed the following datasets.

Annotation datasets
From the Opinmine corpus, we extracted 5,000 reviews that contain the selected sentiment keywords. Basically, 5,000 reviews are not big enough for the evaluation, so we adopted the 5-fold cross validation approach in the experiments. That is, each dataset is randomly divided into five parts of equal size; in five runs, every part is used as test data once and the remaining four parts as training data.
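A small sketch of this 5-fold split, assuming the reviews are available as a simple list; the shuffling and fold layout are implementation choices, not prescribed by the paper.

```python
import random
from typing import List, Tuple

def five_fold_splits(reviews: List[str], seed: int = 0) -> List[Tuple[List[str], List[str]]]:
    """Return five (train, test) splits: each fold is the test set exactly once."""
    data = reviews[:]
    random.Random(seed).shuffle(data)          # random division into folds
    folds = [data[i::5] for i in range(5)]     # five parts of (nearly) equal size
    return [([r for j, f in enumerate(folds) if j != i for r in f], folds[i])
            for i in range(5)]
```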

Development datasets
We notice that the training datasets are rather small, so we decided to collect some raw development data. Assisted by a crawler, we downloaded 706,784 mobile phone reviews and 126,667 digital camera reviews from the Internet. Applying the selected sentiment keywords, we finally obtained 359,479 raw reviews in the mobile phone domain and 85,959 raw reviews in the digital camera domain. The two datasets are used in the evaluation as development datasets.

4.2. The evaluation metrics

The goal of the proposed method is to determine the positive or negative polarity of a sentiment keyword in a given context, so it is natural to adopt accuracy in this evaluation. Accuracy is defined as the proportion of correctly determined reviews among all test reviews.
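Using the column names of Tables 6 and 7, this can be written as:

Accuracy = (# of correct decisions) / (# of all test reviews)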

4.3. Results and discussions

The method was applied in the mobile phone domain and the digital camera domain, respectively. For each domain, we first estimated parameters from the training data and development data; then the method was executed on the test data. As the 5-fold cross-validation approach is adopted, five runs were conducted for each domain. Experimental results for the two domains are given in Table 6 and Table 7, respectively.

TABLE 6. EXPERIMENTAL RESULTS IN MOBILE PHONE DOMAIN

Run ID    # of all test reviews    # of correct decisions    Accuracy
Run1      984                      751                       0.763
Run2      984                      700                       0.711
Run3      984                      780                       0.793
Run4      984                      697                       0.708
Run5      1017                     743                       0.731
Average                                                      0.741

TABLE 7. EXPERIMENTAL RESULTS IN DIGITAL CAMERA DOMAIN

Run ID    # of all test reviews    # of correct decisions    Accuracy
Run1      949                      674                       0.710
Run2      949                      623                       0.656
Run3      949                      662                       0.698
Run4      949                      669                       0.705
Run5      1001                     736                       0.735
Average                                                      0.701

As seen from Table 6 and Table 7, the proposed method achieves an average accuracy of 0.741 in the mobile phone domain and 0.701 in the digital camera domain. We notice in Table 6 that the accuracy is relatively high in some runs (i.e., 0.793 in Run3) and relatively low in others (i.e., 0.708 in Run4). Similar results are found in Table 7. Looking into the test data, we find that this is because the proposed method yields different results on different sentiment keywords. We therefore present accuracy values for individual sentiment keywords in the two domains in Table 8 and Table 9, respectively.

As seen from Table 8 and Table 9, the proposed method performs best on the word "低 (low)" (i.e., 0.918) and worst on the word "sensitive" (i.e., 0.400) in the mobile phone domain. For the digital camera domain, the best case is "重 (heavy)" (i.e., 0.969) and the worst one is "轻 (light)" (i.e., 0.416). This leads to the performance deviation across different runs of the 5-fold cross-validation. We ascribe the performance variance of individual words in the two domains to the different distributions of sentiment keywords in different domains.


TABLE 8. ACCURACY VALUES OF INDIVIDUAL SENTIMENT KEYWORDS IN MOBILE PHONE DOMAIN

Keyword (EN)   Accuracy   Keyword (EN)   Accuracy
a lot          0.755      improve        0.568
high           0.539      extra          0.700
quick          0.804      serious        0.830
small          0.817      increase       0.834
simple         0.807      drop           0.700
little         0.759      decrease       0.540
low            0.918      sensitive      0.400
thin           0.710      decrease       0.700
light          0.683      mean           0.800
heavy          0.858      surprising     0.600

TABLE 9. ACCURACY VALUES OF INDIVIDUAL SENTIMENT KEYWORDS IN DIGITAL CAMERA DOMAIN

Keyword (EN)   Accuracy   Keyword (EN)   Accuracy
a lot          0.634      improve        0.673
high           0.671      heavy          0.969
big            0.687      thin           0.626
small          0.812      increase       0.487
long           0.587      extra          0.700
quick          0.819      thick          0.750
simple         0.761      deep           0.675
low            0.829      hard           0.701
little         0.878      drop           0.867
light          0.418      decrease       0.633

5. Conclusions

This paper addresses the word sentiment polarity disambiguation (WSPD) problem using opinion-level context, including opinion related elements and inter-sentence conjunctions that connect two opinions. This work is deemed novel due to the probabilistic WSPD model and the conjunction rules. Encouraging results are obtained in our experiments. However, the reported work is still preliminary. We plan the following future work. Firstly, more experiments will be conducted to compare our method against the related work. Secondly, we will evaluate how much the proposed method improves an opinion mining system.

Acknowledgements

This paper is supported by NSFC (60703051) and MOST of China (2009DFAI2970). We thank the reviewers for the valuable comments.

References

[1] Y. Xia, B. Hao, K.-F. Wong. "Opinion Target Network and Bootstrapping Method for Chinese Opinion Target Extraction". AIRS 2009: 339-350.

[2] V. Hatzivassiloglou and K. R. McKeown. "Predicting the Semantic Orientation of Adjectives". 35th ACL, pp. 174-181, 1997.

[3] P. D. Turney and M. L. Littman. "Measuring praise and criticism: Inference of semantic orientation from association". ACM Transactions on Information Systems (TOIS), 21(4):315-346, 2003.

[4] A. Esuli and F. Sebastiani. "SENTIWORDNET: A Publicly Available Lexical Resource for Opinion Mining". LREC-2006.

[5] Y. He, H. Alani and D. Zhou. Exploring English Lexicon Knowledge for Chinese Sentiment Analysis. In: CIPS-SIGHAN, pp 28-29, 2010.

[6] Y. Torii, D. Das, S. Bandyopadhyay and O. Manabu. "Developing Japanese WordNet Affect for Analyzing Emotions". WASSA 2011, ACL, pp. 80-86.

[7] H. Choochart, K. Alisa, P. Pompimon and S. Chatchawal. "Constructing Thai Opinion Mining Resource: A Case Study on Hotel Reviews". ALR, pp 64-71.

[8] Y. Xia, L. Wang, K.-F. Wong, M. Xu. "Lyric-based Song Sentiment Classification with Sentiment Vector Space Model". ACL (Short Papers) 2008: 133-136

[9] C. Akkaya, J. Wiebe, and R. Mihalcea. "Subjectivity Word Sense Disambiguation". EMNLP 2009.

[10] C. Akkaya, J. Wiebe, A. Conrad. "Improving the Impact of Subjectivity Word Sense Disambiguation on Contextual Opinion Analysis". CoNLL 2011.

[11] J. Yi, T. Nasukawa, R. Bunescu, and W. Niblack. "Sentiment analyzer: Extracting sentiments about a given topic using natural language processing techniques". IEEE ICDM-2003.

[12] A.-M. Popescu and O. Etzioni. "Extracting product features and opinions from reviews". HLT/EMNLP-2005: 339-346.

[13] T. Wilson, J. Wiebe and P. Hoffmann. "Recognizing Contextual Polarity: An Exploration of Features for Phrase-Level Sentiment Analysis". Computational Linguistics, 35(3):399-433, 2009.

[14] D. Maynard and K. Bontcheva and D. Rout. "Challenges in developing opinion mining tools for social media". @NLP can u tag #Usergeneratedcontent?! Workshop at LREC 2012.

[15] A. Weichselbraun, S. Gindl, A. Scharl. "A Context-Dependent Supervised Learning Approach to Sentiment Detection in Large Textual Databases". Journal of Information and Data Management, vol. 1, no. 3, pp. 329-342, 2010.

[16] R. Xu, Y. Xia, K.-F. Wong and W. Li. 2008. Opinion Annotation in On-line Chinese Product Reviews. In Proc. of LREC-2008.
