

Intelligent Database Systems Lab

Presenter : JIAN-REN CHEN

Authors : Ahmed Abbasi, Stephen France, Zhu Zhang, and Hsinchun Chen

2011, IEEE TKDE

Selecting Attributes for Sentiment Classification Using Feature Relation Networks


Outline

• Motivation
• Objectives
• Methodology
• Experiments
• Conclusions
• Comments


Motivation

• Sentiment analysis has emerged as a method for mining opinions from text archives.

• It is a challenging problem:

1. It requires the use of large quantities of linguistic features.

2. These heterogeneous n-gram categories have to be integrated into a single feature set.

- This brings noise, redundancy, and computational limitations.

• Sentiment is described by 1) polarity and 2) intensity.

- Example: “I don’t like you” vs. “I hate you”


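n-gram (Markov model)

- Example: weather: sunny, cloudy, rainy

- Example: 美麗 (“beautiful”) vs 美痢 (same pronunciation, written with the wrong character)

“HAPAX” and “DIS” tags

- “I hate Jim” is replaced with “I hate HAPAX”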
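The tag replacement above can be sketched in a few lines. This is only an illustration under stated assumptions: HAPAX is taken to mark a word that occurs once in the corpus and DIS a word that occurs twice (the usual meaning of hapax and dis legomena), and the function name `tag_rare_words` and the toy corpus are made up for this example.

```python
from collections import Counter

def tag_rare_words(sentences, hapax_tag="HAPAX", dis_tag="DIS"):
    """Replace words seen once with HAPAX and words seen twice with DIS.

    Assumption: counts are case-insensitive and taken over the whole corpus.
    """
    tokenized = [s.split() for s in sentences]
    counts = Counter(w.lower() for tokens in tokenized for w in tokens)
    tagged = []
    for tokens in tokenized:
        out = []
        for w in tokens:
            c = counts[w.lower()]
            if c == 1:
                out.append(hapax_tag)   # word occurs exactly once
            elif c == 2:
                out.append(dis_tag)     # word occurs exactly twice
            else:
                out.append(w)           # frequent word, kept as-is
        tagged.append(" ".join(out))
    return tagged

# Hypothetical toy corpus: "Jim" occurs once, "rain" twice.
corpus = ["I hate Jim", "I hate rain", "I hate Mondays", "we like rain"]
tagged_corpus = tag_rare_words(corpus)
print(tagged_corpus[0])
```

Replacing rare words with a shared tag lets n-grams such as “I hate HAPAX” generalize across many different rare names.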


Objectives

• Feature Relation Network (FRN) considers semantic information and also leverages the syntactic relationships between n-gram features.

- enhanced sentiment classification on extended sets of heterogeneous n-gram features.


Methodology - Extended N-Gram Feature Set


Methodology - Subsumption Relations

A subsumes B (A → B)

Example: “I love chocolate”

  unigrams : I, LOVE, CHOCOLATE
  bigrams  : I LOVE, LOVE CHOCOLATE
  trigrams : I LOVE CHOCOLATE
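The n-gram lists for this example can be generated mechanically. A minimal sketch, assuming whitespace tokenization and upper-casing as on the slide; the helper name `word_ngrams` is hypothetical.

```python
def word_ngrams(text, n):
    """Return the contiguous word n-grams of `text`, upper-cased."""
    tokens = text.upper().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

sentence = "I love chocolate"
for n, name in [(1, "unigrams"), (2, "bigrams"), (3, "trigrams")]:
    print(name, ":", word_ngrams(sentence, n))
```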

What about the bigrams and trigrams?

It depends on their weight: they are kept only if their weight exceeds that of their more general lower-order counterparts by a threshold t.
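The threshold rule can be sketched as a filter over weighted features. This is an illustration, not the authors' implementation: the weights are made-up numbers, the helper names are hypothetical, and the comparison against the largest weight among all contained lower-order n-grams is an assumption about how “exceeds by t” is applied.

```python
def contains(higher, lower):
    """True if `lower` is a shorter n-gram occurring contiguously inside `higher`."""
    h, l = higher.split(), lower.split()
    return len(l) < len(h) and any(
        h[i:i + len(l)] == l for i in range(len(h) - len(l) + 1)
    )

def apply_subsumption(weights, t):
    """Keep a higher-order n-gram only if its weight beats every
    lower-order n-gram it contains by at least t (assumed rule)."""
    kept = {}
    for feat, w in weights.items():
        lower_weights = [lw for lf, lw in weights.items() if contains(feat, lf)]
        if not lower_weights or w >= max(lower_weights) + t:
            kept[feat] = w
    return kept

# Hypothetical feature weights (e.g. from a univariate scoring method).
weights = {
    "I": 0.01, "LOVE": 0.30, "CHOCOLATE": 0.05,
    "I LOVE": 0.31, "LOVE CHOCOLATE": 0.40,
    "I LOVE CHOCOLATE": 0.41,
}
kept = apply_subsumption(weights, t=0.05)
print(sorted(kept))
```

With these numbers “I LOVE” adds almost nothing over “LOVE” and is dropped, while “LOVE CHOCOLATE” clears the threshold and stays.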


Methodology - Parallel Relations

A parallel B (A - B)

  POS tag        : “ADMIRE_VP” → “like”
  semantic class : “SYN-Affection” → “love”

If A and B have a correlation coefficient greater than some threshold p, one of the attributes is removed to avoid redundancy.
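A minimal sketch of this redundancy check, assuming Pearson correlation over per-document feature counts and assuming the lower-weighted attribute of a correlated pair is the one removed (the slide does not say which one is dropped). The counts, weights, and function names are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def apply_parallel(columns, weights, p):
    """Drop one attribute from every pair whose correlation exceeds p.

    Assumption: the attribute with the lower weight is the one removed.
    """
    removed = set()
    names = list(columns)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if a in removed or b in removed:
                continue
            if pearson(columns[a], columns[b]) > p:
                removed.add(a if weights[a] < weights[b] else b)
    return [name for name in names if name not in removed]

# Hypothetical per-document counts for three features.
columns = {
    "like":      [1, 0, 2, 1, 0],
    "ADMIRE_VP": [1, 0, 2, 1, 0],   # fires on the same documents as "like"
    "hate":      [0, 2, 0, 0, 1],
}
weights = {"like": 0.20, "ADMIRE_VP": 0.25, "hate": 0.30}
kept_parallel = apply_parallel(columns, weights, p=0.90)
print(kept_parallel)
```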


Methodology - The Complete Network


Methodology - Incorporating Semantic Information


Experiments - Datasets


Experiments - FRN vs Univariate


Experiments - FRN vs Univariate (WithinOne)


Experiments - FRN vs Multivariate


Experiments - FRN vs Multivariate (WithinOne)


Experiments - FRN vs Hybrid


Experiments - FRN vs Hybrid (WithinOne)


Experiments - Ablation


Experiments - Parameters

- subsumption threshold t (0.0005, 0.005, 0.05, and 0.5)

- parallel relation threshold p (0.80, 0.90, and 1.00)
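If every t value is paired with every p value (an assumption; the slide only lists the values), the settings form a small grid that can be enumerated as follows. The variable names are illustrative.

```python
from itertools import product

t_values = [0.0005, 0.005, 0.05, 0.5]   # subsumption thresholds from the slide
p_values = [0.80, 0.90, 1.00]           # parallel relation thresholds from the slide

# Every (t, p) combination; a feature selection run would be evaluated per pair.
grid = list(product(t_values, p_values))
for t, p in grid:
    print(f"t={t}, p={p}")
```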


Experiments - Average Runtimes


Conclusions

• FRN had significantly higher best accuracy and best percentage within-one across three testbeds.

• The ablation and parameter testing results show the important role of the subsumption and parallel relation thresholds.
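The two measures named in the first conclusion can be sketched as below. The definition of within-one used here is an assumption: class labels are treated as ordinal (for example star ratings), and a prediction counts if it is at most one class away from the true label. The labels are made-up data.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true class exactly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def within_one(y_true, y_pred):
    """Fraction of predictions at most one class away from the true class
    (assumed definition of 'percentage within-one')."""
    return sum(abs(t - p) <= 1 for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical ordinal labels, e.g. 1 to 5 stars.
y_true = [1, 2, 3, 4, 5]
y_pred = [1, 3, 5, 4, 4]
print(accuracy(y_true, y_pred), within_one(y_true, y_pred))
```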


Comments

• Advantages

- accurate and computationally efficient

• Disadvantage

- sensitive to the ablation and parameter settings

• Applications

- sentiment classification

- feature selection method
