Multilingual Opinion Holder Identification Using Author and Authority Viewpoints

Yohei Seki, Noriko Kando, Masaki Aono

Toyohashi University of Technology and National Institute of Informatics, Japan

Journal of Information Processing and Management, 2009

Reporter: Chia-Ying Lee

Advisor: Prof. Hsin-Hsi Chen

2009/04/27 Cicilia Chia-ying Lee

Outline

1. Problem Definition

2. Corpus: NTCIR-6 pilot

3. Approach in NTCIR-6

4. Revised Approach after NTCIR-6

5. Comparison and Discussion

6. Conclusion

Problem Definition (1/2)

Identify the opinion holder in an opinion sentence.

This is important because news articles contain many opinions from different opinion holders.

Opinion holder:

1. The explicit noun phrases in the sentence

2. The inexplicit noun phrases (e.g., anaphors)

3. The exophoric elements (e.g., the author)

Problem Definition (2/2)

Author: the writer of the document

Authority: the third parties

Focus: differences in writing style, that is, differences in syntactic constructs or term usage.

Corpus

NTCIR-6 Opinion Analysis Pilot Task

Evaluation method

Approach in NTCIR-6

Evaluation results in NTCIR-6

Author and Authority Opinion Extraction (1/4)

Three opinion types (Wiebe et al., 2005):

1. Explicit mentions of private states by a person, nation, or organization

2. Speech events expressing private states by an agent

3. Expressive subjective elements (author view)

Author and Authority Opinion Extraction (2/4)

Japanese

Train set: NTCIR-6, 4 training topics

Features:

Syntactic pairs of grammatical subjects and predicates such as pronouns

Subjects: named entities, semantic primitives, and key terms

Predicates: semantic primitives from a thesaurus

Parser: CaboCha

Author and Authority Opinion Extraction (3/4)

English

Train set: MPQA Corpus

Author view: annotations whose "nested source" attribute was "w" (writer) and not nested

Feature: syntactic pairs from syntactic patterns, such as nouns and adjectives/verbs

Parser: Minipar

Author and Authority Opinion Extraction(4/4)

Rule-based Holder Identification

(1) Bracketed elements of PER, ORG, LOC in the sentence.

(2) Grammatical subject elements of PER, ORG, LOC in the sentence.

(3) Grammatical subject elements of PER, ORG, LOC in the previous sentences.

(4) PER, ORG, LOC in the sentences other than those classified by (1) or (2).

Named entity extractor: NExT
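The four rules can be read as a fallback cascade. The sketch below is only an illustration of that reading: the `Entity` class, the `identify_holder` function, and the flags `is_bracketed` and `is_subject` are hypothetical names, and two things are assumptions not stated on the slide: that the rules are tried in the listed order, and that rule (3) looks at the nearest previous sentence first. Named entity tagging and parsing are assumed to have been done already.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

HOLDER_LABELS = {"PER", "ORG", "LOC"}


@dataclass
class Entity:
    """A named entity with hypothetical flags filled in by a tagger/parser."""
    text: str
    label: str                  # e.g. "PER", "ORG", "LOC"
    is_subject: bool = False    # grammatical subject of its sentence
    is_bracketed: bool = False  # appears inside brackets


def first_match(entities: List[Entity],
                predicate: Callable[[Entity], bool]) -> Optional[str]:
    """Return the first PER/ORG/LOC entity satisfying the predicate."""
    for entity in entities:
        if entity.label in HOLDER_LABELS and predicate(entity):
            return entity.text
    return None


def identify_holder(current: List[Entity],
                    previous: List[List[Entity]]) -> Optional[str]:
    """Apply rules (1)-(4) in order (assumed priority) and stop at the first hit."""
    # Rule (1): bracketed PER/ORG/LOC in the sentence.
    holder = first_match(current, lambda e: e.is_bracketed)
    if holder:
        return holder
    # Rule (2): grammatical subject PER/ORG/LOC in the sentence.
    holder = first_match(current, lambda e: e.is_subject)
    if holder:
        return holder
    # Rule (3): grammatical subject in previous sentences, nearest first (assumption).
    for sentence in reversed(previous):
        holder = first_match(sentence, lambda e: e.is_subject)
        if holder:
            return holder
    # Rule (4): any remaining PER/ORG/LOC in the sentence.
    return first_match(current, lambda e: True)
```

A call such as `identify_holder(entities_of_sentence, entities_of_earlier_sentences)` returns `None` when no rule fires, which a caller could treat as a cue to fall back to the author.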

Evaluation results in NTCIR-6(1/3)

Evaluation results in NTCIR-6 (2/3)

Opinion holder extraction

(1) Extraction using term sequences (Cornell, GATE)

(2) Lexicon-based heuristics (IIT)

(3) Named entity extraction approach (TUT and others)

Identify the author

(1) To utilize author-related clues such as verbs (ICU-IR)

(2) To detect author opinion holders when there were no holder candidates surrounding the opinionated sentences (EHBN, Cornell)

Evaluation results in NTCIR-6 (3/3)

English: Author-opinionated sentences appeared more often

Outline

1. Problem Definition

2. Corpus: NTCIR-6 pilot

3. Approach in NTCIR-6

4. Revised Approach after NTCIR-6

   1. More features

   2. Direct-subjective Classifier

5. Comparison and Discussion

6. Conclusion

More Features (1/3)

Extend by ICU-IR approach

  Phrase governed by “say”, “by”

  NP followed by “according to”, “by”

  Subjects governed by opinion verbs

Grammatical syntactic patterns

  Grammatical subject & verbs

  Auxiliary verb & verb
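As a rough illustration of the cue-phrase idea, the sketch below pulls holder candidates out of a sentence with two surface patterns. It is an approximation under stated assumptions, not the features used in the presented system: the slide describes phrases governed by cue words in a parse, whereas this sketch uses regular expressions, treats any run of capitalized words as a noun phrase, and covers only "according to" and forms of "say". The names `NP`, `PATTERNS`, and `holder_candidates` are hypothetical.

```python
import re
from typing import List

# Assumption: a run of capitalized words stands in for a noun phrase.
NP = r"(?:[A-Z][\w-]*(?:\s+[A-Z][\w-]*)*)"

PATTERNS = [
    # Noun phrase introduced by "according to".
    re.compile(rf"\b[Aa]ccording to (?P<holder>{NP})"),
    # Noun phrase acting as the subject of a form of "say".
    re.compile(rf"(?P<holder>{NP}) (?:said|says|say)\b"),
]


def holder_candidates(sentence: str) -> List[str]:
    """Collect distinct holder candidates in pattern order."""
    found: List[str] = []
    for pattern in PATTERNS:
        for match in pattern.finditer(sentence):
            holder = match.group("holder")
            if holder not in found:
                found.append(holder)
    return found
```

A parser-based version would replace the regular expressions with checks on dependency relations, which avoids false hits on sentence-initial capitalized words.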

More Features(2/3)

More Features (3/3)

Features selected based on χ-square tests on the MPQA corpus

Three count features: cntopnoun, cntopadj, and cntopadv, computed from the subjectivity lexicon (Wilson et al.)
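For readers unfamiliar with χ-square feature selection, here is a minimal sketch of the standard 2x2 statistic for one feature against one class. The slide does not give the counts, the significance level, or the implementation used, so the function names `chi_square` and `select_features` and the 3.841 cutoff (the critical value for p = 0.05 with one degree of freedom) are assumptions for illustration.

```python
from typing import Dict, List, Tuple


def chi_square(n11: int, n10: int, n01: int, n00: int) -> float:
    """Chi-square statistic for a 2x2 contingency table.

    n11: feature present, sentence in class
    n10: feature present, sentence not in class
    n01: feature absent,  sentence in class
    n00: feature absent,  sentence not in class
    """
    n = n11 + n10 + n01 + n00
    numerator = n * (n11 * n00 - n10 * n01) ** 2
    denominator = (n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00)
    return numerator / denominator if denominator else 0.0


def select_features(tables: Dict[str, Tuple[int, int, int, int]],
                    threshold: float = 3.841) -> List[str]:
    """Keep features whose statistic exceeds the threshold (assumed cutoff)."""
    return sorted(name for name, counts in tables.items()
                  if chi_square(*counts) > threshold)
```

A feature that is spread evenly over both classes scores 0 and is dropped, while one concentrated in a single class scores high and is kept.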

Direct-subjective Classifier(1/2)

Goal: Filtering the author-opinionated sentences

Method: Combine opinion type 1 and 2

Train set: MPQA

Classifier: SVM-light
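A minimal sketch of how training examples for such a classifier might be prepared, assuming opinion types 1 and 2 form the positive class and everything else the negative class (the slide does not say what the negative class is). The output follows the SVM-light input format of a target followed by ascending `index:value` pairs; the helper names and the feature numbering are hypothetical.

```python
from typing import Dict


def direct_subjective_label(opinion_type: int) -> int:
    """Merge opinion types 1 and 2 into one positive class (assumed labeling)."""
    return 1 if opinion_type in (1, 2) else -1


def to_svmlight_line(label: int, features: Dict[int, float]) -> str:
    """Format one example as '<target> <index>:<value> ...'.

    Feature indices are positive integers and are written in ascending
    order; zero-valued features are omitted because the format is sparse.
    """
    pairs = [f"{index}:{value:g}"
             for index, value in sorted(features.items())
             if value != 0]
    return " ".join([f"{label:+d}"] + pairs)
```

For example, a type 2 sentence with hypothetical features 1 and 3 active would be written as `to_svmlight_line(direct_subjective_label(2), {3: 2, 1: 0.5})`, one line per sentence in the training file.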

Direct-subjective Classifier(2/2)

[Results table not recovered; slide annotations: ↗0.1, ↗0.08]

Comparison and Discussion

Baseline: The algorithm from authority opinion

Features selected based on χ-square tests on the MPQA corpus for the opinionated sentence extraction

The 7 topics that contained more than 30% author-opinionated sentences attained higher F-values
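The slides report F-values without restating the formula. Assuming the usual balanced F-measure (the harmonic mean of precision and recall), a per-topic score can be sketched as follows; the function name `f_value` and its count-based arguments are hypothetical.

```python
def f_value(correct: int, proposed: int, gold: int) -> float:
    """Balanced F-measure from raw counts (assumed definition).

    correct:  holders the system identified correctly
    proposed: holders the system output
    gold:     holders in the reference annotation
    """
    precision = correct / proposed if proposed else 0.0
    recall = correct / gold if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

With 6 correct holders out of 10 proposed and 12 in the reference, precision is 0.6, recall is 0.5, and the F-value is 6/11.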

Conclusion

Proposed an opinion holder identification system in both Japanese and English

Features selected based on χ-square tests and direct-subjective classifier improve the result in English

Future work:

  Public opinion

  Multilingual blogs