HowNet and Computation of Meaning Zhendong Dong [email protected] WWW.keenage.com GWC-06 Jeju, Korea 2006-01-22

HowNet and

Computation of Meaning

Zhendong Dong [email protected] WWW.keenage.com

GWC-06 Jeju, Korea 2006-01-22

Outlines

Bird’s-eye view of HowNet

Prominent features

Bird’s-eye view of HowNet

What is HowNet?
History of HowNet
Statistics on latest version
Composition of HowNet

What is HowNet?

HowNet is an on-line extralinguistic knowledge system for the computation of meaning in HLT.

HowNet unveils inter-concept relations and inter-attribute relations of the concepts as connoted in its Chinese-English lexicon.

History of HowNet

1988  Basic research started
1999  1st version released
2000  Revision of KDML started
2002  New version released

Statistics - general

Chinese word & expression    84102
English word & expression    80250
Chinese meaning              98530
English meaning             100071
Definition                   25295
Record                      161743

A record in HowNet dictionary

NO.=076856
W_C=买主
G_C=N [mai3 zhu3]
E_C=
W_E=buyer
G_E=N
E_E=
DEF={human|人:domain={commerce|商业},{buy|买:agent={~}}}
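A record like the one above is a sequence of FIELD=value pairs. The following is a minimal sketch of reading one into a mapping, assuming one field per line; the function name parse_record is hypothetical and is not part of any HowNet tool.

```python
def parse_record(text: str) -> dict[str, str]:
    """Split a HowNet-style record into a field -> value mapping."""
    record = {}
    for line in text.strip().splitlines():
        # Split on the first '=' only, because DEF values contain '=' themselves.
        key, _, value = line.strip().partition("=")
        record[key] = value
    return record


# The "buyer" record from the slide, one field per line (assumed layout).
RECORD = """\
NO.=076856
W_C=买主
G_C=N [mai3 zhu3]
E_C=
W_E=buyer
G_E=N
E_E=
DEF={human|人:domain={commerce|商业},{buy|买:agent={~}}}
"""

buyer = parse_record(RECORD)
print(buyer["W_E"], buyer["G_E"], buyer["DEF"])
```

Empty fields such as E_C and E_E are kept as empty strings, so every record exposes the same keys.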

Statistics - semantic

                  Chinese  English
Thing               58153    58096
Component            7025     7023
Time                 2238     2244
Space                1071     1071
Attribute            3776     4045
Attribute-value      9089     8478
Event               12634    10076

Statistics – main syntactic categories

        Chinese  English
ADJ       11705     9576
ADV        1516     2084
VERB      25929    21017
NOUN      46867    48342
PRON        112       71
NUM         225      242
PREP        128      113
AUX          77       49
CLA         424        0

Statistics – part of relations

Chinese synset: Set = 13463 Word Form = 54312

antonym: Set = 12777 converse: Set = 6753

English synset: Set = 18575 Word Form = 58488

antonym: Set = 12032 converse: Set = 6442

Composition

Database
Tools for computation of meaning

Database

Dictionary
Taxonomies
Axiomatic relations & role shifting

Dictionary

Taxonomies - 10

Entity
Event
Attribute
AttributeValue
Secondary features
Event roles
Typical actors of event roles
Event relations and role shifting
Antonymous sememe pairs
Converse sememe pairs

Tools for computation of meaning

Browser
Secondary resources

Prominent features

All syntactic classes of words included
Sememes and semantic roles
Defining concepts in KDML on the basis of sememes and semantic roles
Relations – the soul of HowNet
Relations obtained by computing rather than manually-coding
Identical representation in various linguistic structures

Sememes

Sememes                            2099
Entity                              151
    thing (physical, mental, fact)
    component (part, fitting)
    time
    space (direction, location)
Event (relation, state; action)     812
Attribute                           247
AttributeValue                      889
Secondary feature                   121

Semantic roles 91

(1) Main semantic roles
    (a) principal semantic roles: 6
    (b) affected semantic roles: 11
(2) Peripheral semantic roles
    (a) time: 12
    (b) space: 11
    (c) resultant: 8
    (d) manner: 11
    (e) modifier: 16
    (f) basis: 6
    (g) comparison: 2
    (h) coordination: 6
    (i) commentary: 2
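The counts on the Sememes and Semantic roles slides can be cross-checked with simple arithmetic. The sketch below only adds up the figures quoted on the slides; the variable names are made up for illustration. By this arithmetic the total of 2099 matches the four taxonomies Entity, Event, Attribute and AttributeValue, with the 121 secondary features counted on top of that.

```python
# Figures copied from the "Sememes" slide.
SEMEME_COUNTS = {"Entity": 151, "Event": 812, "Attribute": 247, "AttributeValue": 889}
SECONDARY_FEATURES = 121  # listed separately on the slide

# Figures copied from the "Semantic roles" slide.
ROLE_COUNTS = {
    "principal": 6, "affected": 11,                  # (1) main semantic roles
    "time": 12, "space": 11, "resultant": 8,         # (2) peripheral semantic roles
    "manner": 11, "modifier": 16, "basis": 6,
    "comparison": 2, "coordination": 6, "commentary": 2,
}

sememe_total = sum(SEMEME_COUNTS.values())
role_total = sum(ROLE_COUNTS.values())
print(sememe_total)  # 2099, the total given on the slide
print(role_total)    # 91, the total given on the slide
```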

Defining concepts (1)

W_E=doctor
G_E=V
DEF={doctor|医治}

W_E=doctor
G_E=N
DEF={human|人:HostOf={Occupation|职位},domain={medical|医},{doctor|医治:agent={~}}}

W_E=doctor
G_E=N
E_E=
DEF={human|人:{own|有:possession={Status|身分:domain={education|教育},modifier={HighRank|高等:degree={most|最}}},possessor={~}}}
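The DEF strings above are nested structures: a head sememe followed by role=concept pairs or bare embedded concepts. The following is a minimal sketch of a recursive parser for that shape. The grammar is an assumption inferred only from the examples on these slides, not the official KDML specification, and the names parse_def and _parse_concept are hypothetical.

```python
def parse_def(text):
    """Parse a DEF string into (head, [(role_or_None, child), ...])."""
    compact = "".join(text.split())  # drop stray whitespace inside the string
    node, end = _parse_concept(compact, 0)
    if end != len(compact):
        raise ValueError(f"unexpected text at position {end}")
    return node


def _parse_concept(s, i):
    # Assumed shape: '{' head [ ':' item (',' item)* ] '}'
    if s[i] != "{":
        raise ValueError(f"expected '{{' at position {i}")
    i += 1
    start = i
    while s[i] not in ":}":
        i += 1
    head = s[start:i]
    children = []
    if s[i] == ":":
        i += 1
        while True:
            role = None
            if s[i] != "{":  # an item of the form role={...}
                eq = s.index("=", i)
                role, i = s[i:eq], eq + 1
            child, i = _parse_concept(s, i)
            children.append((role, child))
            if s[i] != ",":
                break
            i += 1
    if s[i] != "}":
        raise ValueError(f"expected '}}' at position {i}")
    return (head, children), i + 1


doctor_n = parse_def(
    "{human|人:HostOf={Occupation|职位},domain={medical|医},{doctor|医治:agent={~}}}"
)
head, children = doctor_n
print(head)
for role, (child_head, _) in children:
    print(role, child_head)
```

Embedded concepts without a role name, such as the {doctor|医治:agent={~}} part, are stored with role None. Malformed input is not handled gracefully in this sketch.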

Defining concepts (2)

W_E=buy
G_E=V
DEF={buy|买}
cf. (WordNet) obtain by purchase; acquire by means of a financial transaction

W_E=buy
G_E=V
DEF={GiveAsGift|赠:manner={guilty|有罪},purpose={entice|勾引}}
cf. (WordNet) make illegal payments to in exchange for favors or influence

Relations – the soul of HowNet

Meaning is represented by relations
Computation of meaning is based on relations

1. Event Frame ~ Verb frame

- {event|事件}
  ├ {static|静态}   {event|事件}
  │ ├ {relation|关系}   {static|静态}
  │ │ ├ {possession|领属关系}   {relation|关系}
  │ │ │ ├ {own|有}   {possession|领属关系:possessor={*},possession={*}}
  │ │ │ │ ├ {obtain|得到}   {own|有:possessor={*},possession={*},source={*}}
  └ {act|行动}   {event|事件:agent={*}}
    ├ {ActGeneral|泛动}   {act|行动:agent={*}}
    └ {ActSpecific|实动}   {act|行动:agent={*}}
      └ {AlterSpecific|实变}   {ActSpecific|实动:agent={*}}
        ├ {AlterRelation|变关系}   {AlterSpecific|实变:agent={*}}
        │ ├ {AlterPossession|变领属}   {AlterRelation|变关系:agent={*},possession={*}}
        │ │ ├ {take|取}   {AlterPossession|变领属:agent={*},possession={*},source={*}}
        │ │ │ ├ {buy|买}   {take|取:agent={*},possession={*},source={*},cost={*},beneficiary={*}}
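In the event frame above, each event sememe is listed with its hypernym and the roles of its frame. The following is a minimal sketch of walking that hierarchy and finding which roles are new at each level. The FRAMES table copies only the sememes shown on the slide (English halves only); the dictionary layout and the function names are assumptions made for illustration.

```python
# sememe -> (hypernym, roles in its frame), copied from the slide.
FRAMES = {
    "static": ("event", []),
    "relation": ("static", []),
    "possession": ("relation", []),
    "own": ("possession", ["possessor", "possession"]),
    "obtain": ("own", ["possessor", "possession", "source"]),
    "act": ("event", ["agent"]),
    "ActGeneral": ("act", ["agent"]),
    "ActSpecific": ("act", ["agent"]),
    "AlterSpecific": ("ActSpecific", ["agent"]),
    "AlterRelation": ("AlterSpecific", ["agent"]),
    "AlterPossession": ("AlterRelation", ["agent", "possession"]),
    "take": ("AlterPossession", ["agent", "possession", "source"]),
    "buy": ("take", ["agent", "possession", "source", "cost", "beneficiary"]),
}


def hypernym_chain(sememe):
    """Follow hypernym links from a sememe up to the root."""
    chain = [sememe]
    while chain[-1] in FRAMES:
        chain.append(FRAMES[chain[-1]][0])
    return chain


def new_roles(sememe):
    """Roles in a sememe's frame that its hypernym's frame does not have."""
    parent, roles = FRAMES[sememe]
    inherited = FRAMES.get(parent, (None, []))[1]
    return [role for role in roles if role not in inherited]


print(" -> ".join(hypernym_chain("buy")))
print(new_roles("buy"))   # ['cost', 'beneficiary']
print(new_roles("take"))  # ['source']
```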

2. Typical actors of event roles ~ VerbNet

│ ├ {buy|买}   {take|取:agent={human|人}{group|群体->},
                  possession={artifact|人工物->},
                  source={human|人}{InstitutePlace|场所},
                  cost={money|货币},
                  beneficiary={human|人}{group|群体->},
                  domain={economy|经济}}

Axiomatic Relations & Role Shifting - 1

{buy|买} <----> {obtain|得到} [consequence]; agent OF {buy|买}=possessor OF {obtain|得到}; possession OF {buy|买}=possession OF {obtain|得到}.

{buy|买} <----> {obtain|得到} [consequence]; beneficiary OF {buy|买}=possessor OF {obtain|得到}; possession OF {buy|买}=possession OF {obtain|得到}.

{buy|买} <----> {obtain|得到} [consequence]; source OF {buy|买}=source OF {obtain|得到}; possession OF {buy|买}=possession OF {obtain|得到}.

Axiomatic Relations & Role Shifting - 2

{buy|买} [entailment] <----> {choose|选择}; agent OF {buy|买}=agent OF {choose|选择}; possession OF {buy|买}=content OF {choose|选择}; source OF {buy|买}=location OF {choose|选择}.

{buy|买} [entailment] <----> {pay|付}; agent OF {buy|买}=agent OF {pay|付}; cost OF {buy|买}=possession OF {pay|付}; source OF {buy|买}=target OF {pay|付}.

Axiomatic Relations & Role Shifting - 3

{buy|买 } (X) <----> {sell|卖 } (Y) [mutual implication]; agent OF {buy|买 }=target OF {sell|卖 }; source OF {buy|买 }=agent OF {sell|卖 }; possession OF {buy|买 }=possession OF {sell|卖 }; cost OF {buy|买 }=cost OF {sell|卖 }.
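Each axiomatic relation pairs two events and states which role of one corresponds to which role of the other. The following is a minimal sketch of applying such a rule as a role-renaming table. The two tables copy the buy/sell rule and the first buy/obtain rule from the slides; the function shift_roles and the example fillers ("Mary", "the shop", and so on) are invented for illustration and are not HowNet data.

```python
# role of {buy} -> role of {sell}  [mutual implication]
BUY_TO_SELL = {
    "agent": "target",
    "source": "agent",
    "possession": "possession",
    "cost": "cost",
}

# role of {buy} -> role of {obtain}  [consequence], first rule on the slide
BUY_TO_OBTAIN = {
    "agent": "possessor",
    "possession": "possession",
}


def shift_roles(fillers, mapping):
    """Rename the roles of one event into the roles of a related event."""
    return {mapping[role]: filler for role, filler in fillers.items() if role in mapping}


# Hypothetical fillers for a {buy} event.
buy_event = {
    "agent": "Mary",
    "source": "the shop",
    "possession": "a book",
    "cost": "10 dollars",
}

sell_event = shift_roles(buy_event, BUY_TO_SELL)
obtain_event = shift_roles(buy_event, BUY_TO_OBTAIN)
print(sell_event)
print(obtain_event)
```

Roles that a rule does not mention are simply dropped, so the inferred event carries only what the rule licenses.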

Identical representation - 1

W_E=smuggle
G_E=V
DEF={transport|运送:manner={guilty|有罪}}

W_E=drug
G_E=N
DEF={addictive|嗜好物:modifier={guilty|有罪}}

Identical representation - 2

W_E=smuggling of drugs
G_E=N
DEF={fact|事情:CoEvent={transport|运送:manner={guilty|有罪},patient={addictive|嗜好物:modifier={guilty|有罪}}}}

W_E=drug smuggler
G_E=N
DEF={community|团体:{transport|运送:agent={~},manner={unlawful|非法},patient={addictive|嗜好物},purpose={sell|卖}}}
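Because words and phrases are defined with the same sememes, the overlap between definitions can be inspected mechanically. The following is a rough sketch that pulls the English half of every sememe out of a DEF string and compares the sets. The regular expression and the helper name sememes are assumptions for illustration; this is not how the HowNet tools themselves work.

```python
import re


def sememes(definition):
    """Return the set of English sememe names in a DEF string."""
    # A sememe appears as '{name|中文', so capture the letters before '|'.
    return set(re.findall(r"\{([A-Za-z]+)\|", definition))


SMUGGLE = "{transport|运送:manner={guilty|有罪}}"
DRUG = "{addictive|嗜好物:modifier={guilty|有罪}}"
SMUGGLING_OF_DRUGS = (
    "{fact|事情:CoEvent={transport|运送:manner={guilty|有罪},"
    "patient={addictive|嗜好物:modifier={guilty|有罪}}}}"
)

parts = sememes(SMUGGLE) | sememes(DRUG)
phrase = sememes(SMUGGLING_OF_DRUGS)
print(sorted(parts))
print(sorted(phrase))
print(parts <= phrase)  # every sememe of the parts recurs in the phrase
```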

Types of relations

Motivation to develop secondary resources

To check HowNet knowledge data from different angles for precision and consistency

To provide users with tools for applications

Practicable for any sense of any word

Secondary resources

Concept Relevance Calculator (CRC)
Concept Similarity Measure (CSM)
Query Expansion Tool (QET)
Chinese Morphological Processor (CMP)
Chinese Message Analyzer (CMA)

Concept similarity

doctor 2 <> dentist    0.300000
doctor 1 <> dentist    0.883333
doctor 1 <> nurse 1    0.620000
doctor 1 <> nurse 2    0.454545
doctor 1 <> patient    0.203636

walk <> run     0.144444
walk <> jump    0.144444
walk <> swim    0.130159
walk <> fly     0.124444
walk <> buy     0.018605

Conclusion

Extralinguistic knowledge is indispensable for HLT

The knowledge should be a system which is computer-oriented

It should be large enough; a toy example is useless

It can conduct computation of meaning

Thank you

Welcome to www.keenage.com!

Download and try Mini-HowNet