Intelligent Database Systems Lab, Presenter: Kung, Chien-Hao, Authors: Yoong Keok Lee and Hwee Tou Ng, EMNLP 2002, An Empirical Evaluation of Knowledge Sources and Learning Algorithms for Word Sense Disambiguation

Intelligent Database Systems Lab

Presenter: Kung, Chien-Hao

Authors: Yoong Keok Lee and Hwee Tou Ng

EMNLP 2002

An Empirical Evaluation of Knowledge Sources and Learning Algorithms for Word Sense Disambiguation

Outline
• Motivation
• Objectives
• Methodology
• Experiments
• Conclusions
• Comments

Motivation
• Natural language is inherently ambiguous.
• A word can have multiple meanings (or senses).

Objectives
• This paper evaluates a variety of knowledge sources and supervised learning algorithms for word sense disambiguation on SENSEVAL-2 and SENSEVAL-1 data.

Methodology
Knowledge Sources
• Part of Speech (POS) of Neighboring Words
• Single Words in the Surrounding Context
• Local Collocations
• Syntactic Relations

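Methodology
• Part-of-Speech (POS) of Neighboring Words
– This paper uses 7 features to encode this knowledge source: the POS tags of the target word and of the three tokens on either side of it.
– Sentence segmentation program (Reynar and Ratnaparkhi, 1997)
– POS tagger (Ratnaparkhi, 1996)

Example (target word: bars)
Reid saw me looking at the iron bars.
NNP VBD PRP VBG IN DT NN NNS .
Feature vector: {IN, DT, NN, NNS, ., (empty), (empty)}
The last two positions are empty because they fall beyond the end of the sentence.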
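The POS example above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: the function name `pos_window_features`, the window size of 3, and the use of an empty string for positions outside the sentence are assumptions chosen to reproduce the 7-slot vector on the slide.

```python
def pos_window_features(tags, target_index, window=3, empty=""):
    """Return the POS tags in a symmetric window around the target word.

    Positions that fall outside the sentence are filled with `empty`
    (an assumed placeholder; the slide shows blank slots).
    """
    features = []
    for offset in range(-window, window + 1):
        pos = target_index + offset
        if 0 <= pos < len(tags):
            features.append(tags[pos])
        else:
            features.append(empty)
    return features


# POS tags for "Reid saw me looking at the iron bars ." as given on the slide.
tags = ["NNP", "VBD", "PRP", "VBG", "IN", "DT", "NN", "NNS", "."]
target_index = tags.index("NNS")  # position of the target word "bars"
features = pos_window_features(tags, target_index)
print(features)
```

With a window of 3 the function always returns 2 * 3 + 1 = 7 values, which matches the number of features stated on the slide.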

Methodology
• Single Words in the Surrounding Context
– Feature selection method
• Parameter: M2

Example (target word: bars)
Selected words: {chocolate, iron, beer}
Reid saw me looking at the iron bars.
Feature vector: <0,1,0>
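The binary vector in this example can be reproduced with a short Python sketch. It assumes that each selected word contributes one 0/1 feature indicating whether it appears anywhere in the context of the target word; the function name `surrounding_word_features` and the lowercasing step are illustrative choices, and the feature selection step governed by M2 is not modelled here.

```python
def surrounding_word_features(tokens, target_index, vocabulary):
    """Return one binary feature per vocabulary word.

    A feature is 1 if the word occurs in the context of the target word
    (every token except the target itself), otherwise 0.
    """
    context = {
        token.lower()
        for index, token in enumerate(tokens)
        if index != target_index
    }
    return [1 if word in context else 0 for word in vocabulary]


tokens = ["Reid", "saw", "me", "looking", "at", "the", "iron", "bars", "."]
vocabulary = ["chocolate", "iron", "beer"]  # selected words from the slide
vector = surrounding_word_features(tokens, tokens.index("bars"), vocabulary)
print(vector)
```

Only "iron" occurs in the sentence, so only the second position of the vector is set.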

Methodology
• Local Collocations
– This paper extracted 11 features:
C(-1,-1), C(1,1), C(-2,-2), C(2,2), C(-2,-1), C(-1,1), C(1,2), C(-3,-1), C(-2,1), C(-1,2), C(1,3)

Example (target word: bars)
Collocations for C(-2,-1): {a_chocolate, the_wine, the_iron}
Reid saw me looking at the iron bars.
C(-2,-1) = the_iron
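A collocation feature such as the one in this example can be sketched as follows. The sketch assumes that the two indices are start and end offsets relative to the target word, that the target position itself (offset 0) is skipped, and that positions outside the sentence yield an empty string; these are assumptions made for illustration, and the helper name `collocation` is hypothetical.

```python
# The 11 (start, end) offset pairs listed on the slide.
OFFSETS = [
    (-1, -1), (1, 1), (-2, -2), (2, 2), (-2, -1), (-1, 1),
    (1, 2), (-3, -1), (-2, 1), (-1, 2), (1, 3),
]


def collocation(tokens, target_index, start, end, empty=""):
    """Join the tokens at offsets start..end (relative to the target) with "_".

    Offset 0 (the target word) is skipped, and out-of-sentence positions
    are replaced by `empty`. Both choices are assumptions of this sketch.
    """
    parts = []
    for offset in range(start, end + 1):
        if offset == 0:
            continue
        pos = target_index + offset
        if 0 <= pos < len(tokens):
            parts.append(tokens[pos].lower())
        else:
            parts.append(empty)
    return "_".join(parts)


tokens = ["Reid", "saw", "me", "looking", "at", "the", "iron", "bars", "."]
target_index = tokens.index("bars")
left_pair = collocation(tokens, target_index, -2, -1)
all_collocations = {
    pair: collocation(tokens, target_index, *pair) for pair in OFFSETS
}
print(left_pair)
```

Each of the 11 offset pairs produces one string per training or test instance; the string would then be matched against the collocations retained for that feature, as with {a_chocolate, the_wine, the_iron} on the slide.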

Methodology
• Syntactic Relations
(a) Show w and its POS
(b) Show the sentence where w occurs
(c) Show the feature vector corresponding to syntactic relations

Methodology
• Learning Algorithms
– Support Vector Machines
– AdaBoost
– Naïve Bayes
– Decision Trees
• Evaluation Data Sets
– SENSEVAL-2
– SENSEVAL-1

Experiments

Conclusions

• Using all of these knowledge sources with SVM achieves accuracy higher than the best official scores on both the SENSEVAL-2 and SENSEVAL-1 test data.

Comments
• Advantages
– This paper is easy to read.
• Applications
– Word sense disambiguation (WSD)