CS3730 Fall 2008 Subjectivity and Sentiment Analysis

CS3730 Fall 2008 Subjectivity and Sentiment Analysis

Lecture (Day 2): Introduction to linguistic subjectivity

2

Definitions and Annotation Scheme

• Manual annotation: human markup of corpora (bodies of text)

• Why?

– Understand the problem

– Create gold standards (and training data)

Wiebe, Wilson, Cardie LRE 2005

Wilson & Wiebe ACL-2005 workshop

Somasundaran, Wiebe, Hoffmann, Litman ACL-2006 workshop

Somasundaran, Ruppenhofer, Wiebe SIGdial 2007

Wilson 2008 PhD dissertation

3

What is Subjectivity?

• The linguistic expression of somebody’s opinions, sentiments, emotions, evaluations, beliefs, speculations (private states)

Private state: state that is not open to objective observation or verification

Quirk, Greenbaum, Leech, Svartvik (1985). A Comprehensive Grammar of the English Language.

4

Overview

• Fine-grained: expression-level rather than sentence or document level

• Annotate

– Subjective expressions

– Material attributed to a source, but presented objectively

5

Overview

• Focus on three ways private states are expressed in language

6

Direct Subjective Expressions

• Direct mentions of private states

The United States fears a spill-over from the anti-terrorist campaign.

• Private states expressed in speech events

“We foresaw electoral fraud but not daylight robbery,” Tsvangirai said.

7

Expressive Subjective Elements [Banfield 1982]

• “We foresaw electoral fraud but not daylight robbery,” Tsvangirai said

• The part of the US human rights report about China is full of absurdities and fabrications

8

Objective Speech Events

• Material attributed to a source, but presented as objective fact

The government, it added, has amended the Pakistan Citizenship Act 10 of 1951 to enable women of Pakistani descent to claim Pakistani nationality for their children born to foreign husbands.

9

10

Nested Sources

“The report is full of absurdities,” Xirao-Nima said the next day.

11

Nested Sources

“The report is full of absurdities,” Xirao-Nima said the next day.

(Writer)

12

Nested Sources

“The report is full of absurdities,” Xirao-Nima said the next day.

(Writer, Xirao-Nima)

13

Nested Sources

“The report is full of absurdities,” Xirao-Nima said the next day.

(Writer, Xirao-Nima) (Writer, Xirao-Nima)

14

Nested Sources

“The report is full of absurdities,” Xirao-Nima said the next day.

(Writer, Xirao-Nima) (Writer, Xirao-Nima)

(Writer)

15

“The report is full of absurdities,” Xirao-Nima said the next day.

Objective speech event
  anchor: the entire sentence
  source: <writer>
  implicit: true

Direct subjective
  anchor: said
  source: <writer, Xirao-Nima>
  intensity: high
  expression intensity: neutral

Expressive subjective element
  anchor: full of absurdities
  source: <writer, Xirao-Nima>
  intensity: high
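The three frames above can be written down as plain records. The following is a minimal sketch, not the annotation tool's actual data model: the class name `Frame`, its field names, and the helper `subjective_frames` are assumptions made for illustration, while the values come from the slide.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Frame:
    """One annotation frame (hypothetical record layout)."""
    kind: str                      # frame type named on the slide
    anchor: str                    # text span the frame is attached to
    source: Tuple[str, ...]        # nested source, outermost first
    intensity: Optional[str] = None
    expression_intensity: Optional[str] = None
    implicit: bool = False


# Values copied from the slide for the Xirao-Nima sentence.
frames = [
    Frame("objective speech event", "the entire sentence", ("writer",),
          implicit=True),
    Frame("direct subjective", "said", ("writer", "Xirao-Nima"),
          intensity="high", expression_intensity="neutral"),
    Frame("expressive subjective element", "full of absurdities",
          ("writer", "Xirao-Nima"), intensity="high"),
]


def subjective_frames(items):
    """Keep the frames that mark a private state (hypothetical helper)."""
    return [f for f in items if f.kind != "objective speech event"]


for frame in subjective_frames(frames):
    print(frame.kind, "|", frame.anchor, "|", ", ".join(frame.source))
```

Storing the source as a tuple keeps the nesting order explicit, so the last element is always the immediate holder of the private state.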

19

“The US fears a spill-over”, said Xirao-Nima, a professor of foreign affairs at the Central University for Nationalities.

20

“The US fears a spill-over”, said Xirao-Nima, a professor of foreign affairs at the Central University for Nationalities.

(Writer)

21

“The US fears a spill-over”, said Xirao-Nima, a professor of foreign affairs at the Central University for Nationalities.

(writer, Xirao-Nima)

22

“The US fears a spill-over”, said Xirao-Nima, a professor of foreign affairs at the Central University for Nationalities.

(writer, Xirao-Nima, US)

23

“The US fears a spill-over”, said Xirao-Nima, a professor of foreign affairs at the Central University for Nationalities.

(writer, Xirao-Nima, US) (writer, Xirao-Nima) (Writer)

24

Objective speech event
  anchor: the entire sentence
  source: <writer>
  implicit: true

Objective speech event
  anchor: said
  source: <writer, Xirao-Nima>

Direct subjective
  anchor: fears
  source: <writer, Xirao-Nima, US>
  intensity: medium
  expression intensity: medium

“The US fears a spill-over”, said Xirao-Nima, a professor of foreign affairs at the Central University for Nationalities.
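In the frames above, every nested source is a chain that starts at the writer and adds one holder per level of attribution. The sketch below checks a set of chains for that shape. The function names are hypothetical, and the rule that each chain's parent chain must also appear in the same sentence is an assumption drawn from the examples shown here, not a stated requirement of the scheme.

```python
def parent(source):
    """Chain with the innermost holder removed, e.g. (writer, X, US) -> (writer, X)."""
    return source[:-1]


def check_sources(sources):
    """Return chains that do not start at the writer or whose parent chain is absent.

    Assumption: a chain of length n > 1 is only expected when the chain of
    length n - 1 is also annotated in the same sentence.
    """
    present = set(sources)
    problems = []
    for chain in sources:
        if not chain or chain[0] != "writer":
            problems.append(chain)
        elif len(chain) > 1 and parent(chain) not in present:
            problems.append(chain)
    return problems


# Sources of the three frames for the "US fears a spill-over" sentence.
frame_sources = [
    ("writer",),
    ("writer", "Xirao-Nima"),
    ("writer", "Xirao-Nima", "US"),
]

print(check_sources(frame_sources))
```

Dropping the middle chain from `frame_sources` would make the check report the three-level chain, since its parent would no longer be present.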

25

The report has been strongly criticized and condemned by many countries.

26

Objective speech event
  anchor: the entire sentence
  source: <writer>
  implicit: true

Direct subjective
  anchor: strongly criticized and condemned
  source: <writer, many-countries>
  intensity: high
  expression intensity: high

The report has been strongly criticized and condemned by many countries.

27

As usual, the US State Department published its annual report on human rights practices in world countries last Monday.

And as usual, the portion about China contains little truth and many absurdities, exaggerations and fabrications.

28

Objective speech event
  anchor: the entire 1st sentence
  source: <writer>
  implicit: true

Direct subjective
  anchor: the entire 2nd sentence
  source: <writer>
  implicit: true
  intensity: high

As usual, the US State Department published its annual report on human rights practices in world countries last Monday.

And as usual, the portion about China contains little truth and many absurdities, exaggerations and fabrications.

Expressive subjective element
  anchor: little truth
  source: <writer>
  intensity: medium

…

Expressive subjective element
  anchor: many absurdities, exaggerations, and fabrications
  source: <writer>
  intensity: medium

Expressive subjective element
  anchor: And as usual
  source: <writer>
  intensity: low

29

Example

The Foreign Ministry said Thursday that it was “surprised, to put it mildly” by the U.S. State Department’s criticism of Russia’s human rights record and objected in particular to the “odious” section on Chechnya.

30

Example

The Foreign Ministry said Thursday that it was “surprised, to put it mildly” by the U.S. State Department’s criticism of Russia’s human rights record and objected in particular to the “odious” section on Chechnya.

(writer, FM, FM) (writer, FM) (writer, FM)

(writer, FM, FM, SD)

(writer, FM) (writer, FM)

32

(General) Subjectivity Types [Wilson 2008]

Other (including cognitive)

Note: similar ideas: polarity, semantic orientation, sentiment

33

Extensions [Wilson 2008]

I think people are happy because Chavez has fallen.

direct subjective
  span: are happy
  source: <writer, I, People>
  attitude:

inferred attitude
  span: are happy because Chavez has fallen
  type: neg sentiment
  intensity: medium
  target:

target
  span: Chavez has fallen

target
  span: Chavez

attitude
  span: are happy
  type: pos sentiment
  intensity: medium
  target:

direct subjective
  span: think
  source: <writer, I>
  attitude:

attitude
  span: think
  type: positive arguing
  intensity: medium
  target:

target
  span: people are happy because Chavez has fallen

34

As usual, the US State Department published its annual report on human rights practices in world countries last Monday.

GATE_objective-speech-event (2, 2) nested-source=w implicit=true []

GATE_agent (46, 108) id=report ['its', 'annual', 'report', 'on', 'human', 'right', 'practice', 'in', 'world', 'country']

And as usual, the portion about China contains little truth and many absurdities, exaggerations and fabrications.

GATE_expressive-subjectivity (128, 140) nested-source=w polarity=neutral intensity=low ['and', 'as', 'usual']

GATE_direct-subjective (128, 128) nested-source=w attitude-link=a100 intensity=high implicit=true []

GATE_target (142, 165) id=t100 ['the', 'portion', 'about', 'china']

GATE_agent (160, 165) id=china ['china']

GATE_attitude (166, 240) intensity=high id=a100 attitude-type=sentiment-neg target-link=t100 ['contain', 'little', 'truth', 'and', 'many', 'absurdity', 'exaggeration', 'and', 'fabrication']

GATE_expressive-subjectivity (175, 187) nested-source=w polarity=negative intensity=medium ['little', 'truth']

GATE_expressive-subjectivity (192, 240) nested-source=w polarity=negative intensity=high ['many', 'absurdity', 'exaggeration', 'and', 'fabrication']
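Each annotation line on this slide follows the same visible layout: a `GATE_` type, a pair of numbers in parentheses, `key=value` attributes, and a bracketed list of lemmas. The sketch below parses one such line. It is an assumption-laden illustration rather than the corpus tooling: `parse_line` is a hypothetical name, the number pair is treated as start and end offsets, and the line is assumed to sit on a single row with a token list that is a valid Python list literal.

```python
import ast
import re

# Layout assumed from the slide: GATE_<type> (<start>, <end>) key=value ... [tokens]
LINE = re.compile(
    r"^GATE_(?P<type>\S+)\s+"
    r"\((?P<start>\d+),\s*(?P<end>\d+)\)\s*"
    r"(?P<attrs>[^\[]*)"
    r"(?P<tokens>\[.*\])\s*$"
)


def parse_line(line):
    """Split one annotation line into type, offsets, attributes and tokens."""
    match = LINE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised annotation line: {line!r}")
    # Attributes are whitespace-separated key=value pairs.
    attrs = dict(item.split("=", 1) for item in match.group("attrs").split())
    return {
        "type": match.group("type"),
        "start": int(match.group("start")),
        "end": int(match.group("end")),
        "attrs": attrs,
        # The token list is read as a Python list literal.
        "tokens": ast.literal_eval(match.group("tokens")),
    }


example = ("GATE_expressive-subjectivity (175, 187) nested-source=w "
           "polarity=negative intensity=medium ['little', 'truth']")
parsed = parse_line(example)
print(parsed["type"], parsed["start"], parsed["end"], parsed["attrs"], parsed["tokens"])
```

Lines whose token list contains an elision mark would not parse as a list literal and would need separate handling.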

35

Its aim of the 2001 report is to tarnish China's image and exert political pressure on the Chinese Government, human rights experts said at a seminar held by the China Society for Study of Human Rights (CSSHR) on Friday.

GATE_objective-speech-event (248, 248) nested-source=w implicit=true []

GATE_direct-subjective (380, 384) nested-source=w,experts expression-intensity=neutral attitude-link=a110 intensity=medium ['say']

GATE_attitude (248, 357) intensity=medium-high id=a110 attitude-type=sentiment-neg target-link=t2 ['its', 'aim', 'of', 'the', 'report', … 'the', 'chinese', 'government']

GATE_target (259, 274) id=t2 ['the', 'report']

GATE_expressive-subjectivity (281, 288) nested-source=w,experts polarity=negative intensity=medium ['tarnish']

GATE_direct-subjective (252, 255) nested-source=w,experts,report polarity=neutral expression-intensity=medium attitude-link=a120,a130 intensity=medium ['aim']

GATE_attitude (252, 255) intensity=medium id=a120 attitude-type=intention-pos target-link=t3 ['aim']

GATE_target (278, 357) id=t3 ['to', 'tarnish', 'china', "'s", 'image', 'and', 'exert', 'political', 'pressure', 'on', 'the', 'chinese', 'government']

GATE_attitude (252, 288) intensity=medium id=a130 attitude-type=sentiment-neg target-link=t4 ['aim', 'of', 'the', 'report', 'be', 'to', 'tarnish']

GATE_target (289, 294) id=t4 ['china']

GATE_agent (359, 379) nested-source=w,experts id=experts ['human', 'right', 'expert']

GATE_agent (259, 274) nested-source=w,experts,report nested-target=w,experts,report ['the', 'report']

36

Continued on the next slide…

"The United States was slandering China again," said Xirao-Nima, a professor of Tibetan history at the Central University for Nationalities.

GATE_objective-speech-event (475, 475) nested-source=w implicit=true []

GATE_direct-subjective (523, 527) nested-source=w,nima expression-intensity=neutral attitude-link=a140 intensity=high ['say']

GATE_attitude (494, 508) intensity=high id=a140 attitude-type=sentiment-neg target-link=t5 ['be', 'slander']

GATE_target (476, 493) id=t5 ['the', 'unite', 'state']

GATE_expressive-subjectivity (498, 508) nested-source=w,nima polarity=negative intensity=high ['slander']
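The annotations on this slide are tied together by identifiers: the direct-subjective frame carries `attitude-link=a140`, the attitude with `id=a140` carries `target-link=t5`, and the target with `id=t5` holds the lemmas of the target span. The sketch below follows those links. The dictionary layout and the names `index_by_id` and `resolve` are assumptions for illustration; only the attribute names and values are taken from the slide.

```python
# Three of the annotations from this slide, as hand-built records (assumed layout).
annotations = [
    {"type": "direct-subjective",
     "attrs": {"nested-source": "w,nima", "attitude-link": "a140", "intensity": "high"},
     "tokens": ["say"]},
    {"type": "attitude",
     "attrs": {"id": "a140", "attitude-type": "sentiment-neg", "target-link": "t5"},
     "tokens": ["be", "slander"]},
    {"type": "target",
     "attrs": {"id": "t5"},
     "tokens": ["the", "unite", "state"]},
]


def index_by_id(items):
    """Map every annotation that has an id attribute to that id."""
    return {a["attrs"]["id"]: a for a in items if "id" in a["attrs"]}


def resolve(direct_subjective, by_id):
    """Follow attitude-link, then target-link, for one direct-subjective frame."""
    source = direct_subjective["attrs"]["nested-source"].split(",")
    results = []
    for attitude_id in direct_subjective["attrs"].get("attitude-link", "").split(","):
        attitude = by_id.get(attitude_id)
        if attitude is None:
            continue  # missing or empty link: nothing to follow
        target_ids = attitude["attrs"].get("target-link", "").split(",")
        targets = [by_id[t]["tokens"] for t in target_ids if t in by_id]
        results.append((source, attitude["attrs"]["attitude-type"], targets))
    return results


by_id = index_by_id(annotations)
print(resolve(annotations[0], by_id))
```

Splitting the link attributes on commas also covers frames that list more than one attitude, such as `attitude-link=a120,a130`.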

37

"The United States was slandering China again," said Xirao-Nima, a professor of Tibetan history at the Central University for Nationalities.

GATE_direct-subjective (494, 508) nested-source=w,nima,US polarity=negative expression-intensity=high attitude-link=a150 intensity=high ['be', 'slander']

GATE_attitude (494, 508) intensity=high id=a150 attitude-type=sentiment-neg target-link=t6 ['be', 'slander']

GATE_target (509, 514) id=t6 ['china']

GATE_agent (528, 538) nested-source=w,nima id=nima ['xirao-', 'nima']

GATE_agent (476, 493) nested-source=w,nima,US nested-target=w,nima,US id=US ['the', 'unite', 'state']

GATE_agent (509, 514) nested-target=w,nima,US,china ['china']

38

These are all the annotations

"It shows that these so-called truths are not true at all," said Xirao-Nima

GATE_objective-speech-event (3111, 3111) nested-source=w implicit=true []

GATE_direct-subjective (3170, 3174) attitude-type=negative intensity=high attitude-link=a350,a355 expression-intensity=neutral nested-source=w,nima attitude-toward=report ['say']

GATE_attitude (3111, 3167) intensity=medium-high id=a355 attitude-type=arguing-neg target-link=t101 ['it', 'show', 'that', 'these', 'so-call', 'truth', 'be', 'not', 'true', 'at', 'all']

GATE_attitude (3111, 3167) intensity=high id=a350 attitude-type=sentiment-neg target-link=t101 ['it', 'show', 'that', 'these', 'so-call', 'truth', 'be', 'not', 'true', 'at', 'all']

GATE_target (3125, 3147) id=t101 ['these', 'so-call', 'truth']

GATE_expressive-subjectivity (3131, 3147) nested-source=w,nima polarity=negative intensity=medium ['so-call', 'truth']

GATE_expressive-subjectivity (3152, 3167) nested-source=w,nima polarity=negative intensity=high ['not', 'true', 'at', 'all']

GATE_agent (3175, 3185) nested-source=w,nima ['xirao-', 'nima']

39

Layering with Other Annotation Schemes

• E.g. Time, Lexical Semantics, Discourse…

• Richer interpretations via combination

• Potential disambiguation both ways

• Example with the Penn Discourse Treebank (PDTB) Version 2, recently released through the Linguistic Data Consortium

Joshi, Webber, Prasad, Miltsakaki, …

http://www.seas.upenn.edu/~pdtb/

40

• Swapna will cover the following material later in the course in more detail. This is to give us an idea now.

41

The type “Cause” is used when the connective indicates that the situations described in Arg1 and Arg2 are causally influenced and the two are not in a conditional relation …

42

Polarity preserved across Result relation

Other firms "are dealing with the masses. I don't believe they have the culture" to adequately service high-net-worth individuals, he adds.

43

Polarity preserved across Result relation: PDTB

[Other firms "are dealing with the masses ARG1]. I don't believe IMPLICIT_SO [they have the culture" to adequately service high-net-worth individuals ARG2], he adds.

ARG2 is a result of ARG1

44

Polarity preserved across Result relation: PDTB

[Other firms "are dealing with the masses ARG1]. I don't believe IMPLICIT_SO [they have the culture" to adequately service high-net-worth individuals ARG2], he adds.

X said Y: “X said” introduces X’s belief space

“I don’t believe” explicit in second sentence

“Swartz said” implicit in first sentence

ARG spans: the discourse relation holds within Swartz’s belief space

45

Polarity preserved across Result relation: subjectivity

Other firms "[are dealing with the masses SENTIMENT-NEG]. I [don't believe they have the culture" to adequately service high-net-worth individuals SENTIMENT-NEG], he adds.

Attitude span includes “don’t believe”; schemes require different notions of spans

46

Polarity preserved across Result relation: subjectivity

Other firms "[are dealing with the masses SENTIMENT-NEG]. I [don't believe they have the culture" to adequately service high-net-worth individuals SENTIMENT-NEG], he adds.

Two negative properties, where the second is a result of the first

47

Polarity preserved across Result relation: subjectivity

Other firms "[are dealing with the masses SENTIMENT-NEG]. I [don't believe they have the culture" to adequately service high-net-worth individuals SENTIMENT-NEG], he adds.

Discourse relation between the ARGs inside his belief space

48

Polarity preserved across Result relation: subjectivity

Other firms "[are dealing with the masses SENTIMENT-NEG]. I [don't believe they have the culture" to adequately service high-net-worth individuals SENTIMENT-NEG], he adds.

Semantics of result: specific subtype, where a negative state of affairs is the result of another one

49

The class tag “COMPARISON” applies when the connective indicates that a discourse relation is established between Arg1 and Arg2 in order to highlight prominent differences between the two situations.

50

In that suit, the SEC accused Mr. Antar of engaging in a "massive financial fraud" to overstate the earnings of Crazy Eddie, Edison, N.J., over a three-year period.

Through his lawyers, Mr. Antar has denied allegations in the SEC suit and in civil suits previously filed by shareholders against Mr. Antar and others.

51

PDTB

[In that suit, the SEC accused Mr. Antar of engaging in a "massive financial fraud" to overstate the earnings of Crazy Eddie, Edison, N.J., over a three-year period. ARG1] IMPLICIT_CONTRAST [Through his lawyers, Mr. Antar has denied allegations in the SEC suit and in civil suits previously filed by shareholders against Mr. Antar and others. ARG2]

Contrast between the SEC accusing Mr. Antar of something, and his denying the accusation

52

Subjectivity

In that suit, the SEC [[accused SENTIMENT-NEG] Mr. Antar of engaging in a "massive financial fraud" to overstate the earnings of Crazy Eddie, Edison, N.J. ARGUING-POS], over a three-year period.

Through his lawyers, Mr. Antar [has denied AGREE-NEG] allegations in the SEC suit and in civil suits previously filed by shareholders against Mr. Antar and others.

Two attitudes combined into one large disagreement between two parties

53

Subjectivity

In that suit, the SEC [[accused SENTIMENT-NEG] Mr. Antar of engaging in a "massive financial fraud" to overstate the earnings of Crazy Eddie, Edison, N.J. ARGUING-POS], over a three-year period.

Through his lawyers, Mr. Antar [has denied AGREE-NEG] allegations in the SEC suit and in civil suits previously filed by shareholders against Mr. Antar and others.

Subjectivity: arguing-pos and agree-neg with different sources. Hypothesis: this pattern is common with contrast, so it could help recognize the implicit contrast.

54

Subjectivity

In that suit, the SEC [[accused SENTIMENT-NEG] Mr. Antar of engaging in a "massive financial fraud" to overstate the earnings of Crazy Eddie, Edison, N.J. ARGUING-POS], over a three-year period.

Through his lawyers, Mr. Antar [has denied AGREE-NEG] allegations in the SEC suit and in civil suits previously filed by shareholders against Mr. Antar and others.

Semantics of comparison: specific case of highlighting prominent differences in attitudes of different people
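The hypothesis on these slides, that attitudes of opposite polarity held by different sources tend to co-occur with a contrast relation, can be phrased as a simple test over a pair of attitude annotations. The sketch below only illustrates that hypothesis and is not a method presented in the lecture: the function names are hypothetical, and the label format (`TYPE-POS` or `TYPE-NEG`) is assumed from the tags shown on the slide.

```python
def split_attitude(label):
    """Split a label such as 'ARGUING-POS' into ('arguing', 'pos')."""
    kind, _, polarity = label.lower().rpartition("-")
    return kind, polarity


def candidate_contrast(first, second):
    """True when two (source, label) pairs have different sources and opposite polarity.

    This flags a candidate for an implicit contrast; it does not decide the relation.
    """
    (source_a, label_a), (source_b, label_b) = first, second
    _, polarity_a = split_attitude(label_a)
    _, polarity_b = split_attitude(label_b)
    return source_a != source_b and {polarity_a, polarity_b} == {"pos", "neg"}


# Attitudes from the SEC / Mr. Antar example.
sec = ("SEC", "ARGUING-POS")
antar = ("Mr. Antar", "AGREE-NEG")
print(candidate_contrast(sec, antar))
```

A pair such as the SEC's SENTIMENT-NEG and Mr. Antar's AGREE-NEG would not be flagged by this test, because both labels are negative.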
