HowNet and Computation of Meaning. Zhendong Dong [email protected] WWW.keenage.com GWC-06 Jeju, Korea 2006-01-22. Outlines. Bird’s-eye view of HowNet Prominent features. Bird’s-eye view of HowNet. What is HowNet? History of HowNet Statistics on latest version
HowNet and
Computation of Meaning
Zhendong Dong [email protected] WWW.keenage.com
GWC-06 Jeju, Korea 2006-01-22
Bird’s-eye view of HowNet
What is HowNet?
History of HowNet
Statistics on latest version
Composition of HowNet
What is HowNet?
HowNet is an on-line extralinguistic knowledge system for the computation of meaning in HLT.
HowNet unveils inter-concept relations and inter-attribute relations of the concepts as connoted in its Chinese-English lexicon.
History of HowNet
1988 Basic research started
1999 1st version released
2000 Revision of KDML started
2002 New version released
Statistics - general
Chinese word & expression   84102
English word & expression   80250
Chinese meaning             98530
English meaning            100071
Definition                  25295
Record                     161743
A record in HowNet dictionary
NO.=076856
W_C=买主
G_C=N [mai3 zhu3]
E_C=
W_E=buyer
G_E=N
E_E=
DEF={human|人:domain={commerce|商业},{buy|买:agent={~}}}
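A record like the one above is a flat list of `FIELD=value` pairs, so it can be read into a dictionary with very little code. The following is a minimal sketch, assuming the record is stored with one field per line; the function name `parse_record` is hypothetical and is not part of any HowNet tool.

```python
def parse_record(text):
    """Split a HowNet-style record into a {field: value} dictionary."""
    record = {}
    for line in text.strip().splitlines():
        # Split on the first "=" only: the DEF value itself contains "=".
        key, _, value = line.partition("=")
        record[key.strip()] = value.strip()
    return record


# The record shown on the slide, assumed to be laid out one field per line.
RECORD = """\
NO.=076856
W_C=买主
G_C=N [mai3 zhu3]
E_C=
W_E=buyer
G_E=N
E_E=
DEF={human|人:domain={commerce|商业},{buy|买:agent={~}}}
"""

record = parse_record(RECORD)
print(record["W_C"], record["W_E"], record["DEF"])
```

Splitting on the first `=` only is the one design choice that matters here, because every role assignment inside `DEF` also uses `=`.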
Statistics - semantic
                 Chinese  English
Thing              58153    58096
Component           7025     7023
Time                2238     2244
Space               1071     1071
Attribute           3776     4045
Attribute-value     9089     8478
Event              12634    10076
Statistics – main syntactic categories
        Chinese  English
ADJ       11705     9576
ADV        1516     2084
VERB      25929    21017
NOUN      46867    48342
PRON        112       71
NUM         225      242
PREP        128      113
AUX          77       49
CLA         424        0
Statistics – part of relations
Chinese synset: Set = 13463 Word Form = 54312
antonym: Set = 12777 converse: Set = 6753
English synset: Set = 18575 Word Form = 58488
antonym: Set = 12032 converse: Set = 6442
Taxonomies - 10
Entity
Event
Attribute
AttributeValue
Secondary features
Event roles
Typical actors of event roles
Event relations and role shifting
Antonymous sememe pairs
Converse sememe pairs
Prominent features
All syntactic classes of words included
Sememes and semantic roles
Defining concepts in KDML on the basis of sememes and semantic roles
Relations – the soul of HowNet
Relations obtained by computing rather than manual coding
Identical representation in various linguistic structures
Sememes
Sememes 2099
Entity 151
  thing (physical, mental, fact)
  component (part, fitting)
  time
  space (direction, location)
Event (relation, state; action) 812
Attribute 247
AttributeValue 889
Secondary feature 121
Semantic roles 91
(1) Main semantic roles
  (a) principal semantic roles: 6
  (b) affected semantic roles: 11
(2) Peripheral semantic roles
  (a) time: 12
  (b) space: 11
  (c) resultant: 8
  (d) manner: 11
  (e) modifier: 16
  (f) basis: 6
  (g) comparison: 2
  (h) coordination: 6
  (i) commentary: 2
Defining concepts (1)
W_E=doctor
G_E=V
DEF={doctor|医治}

W_E=doctor
G_E=N
DEF={human|人:HostOf={Occupation|职位},domain={medical|医},{doctor|医治:agent={~}}}

W_E=doctor
G_E=N
E_E=
DEF={human|人:{own|有:possession={Status|身分:domain={education|教育},modifier={HighRank|高等:degree={most|最}}},possessor={~}}}
Defining concepts (2)
W_E=buy
G_E=V
DEF={buy|买}
cf. (WordNet) obtain by purchase; acquire by means of financial transaction

W_E=buy
G_E=V
DEF={GiveAsGift|赠:manner={guilty|有罪},purpose={entice|勾引}}
cf. (WordNet) make illegal payments to in exchange for favors or influence
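Because every definition is a nested brace expression made of sememes and `role=value` pairs, it can be turned into a tree mechanically. The sketch below is a small recursive-descent parser written only from the examples on these slides; it is not the KDML specification, and the names `parse_def` and `sememes` are hypothetical. It assumes the DEF string contains no stray spaces.

```python
def parse_def(text):
    """Parse a DEF string into a tree of (head, [(role, child), ...]) tuples."""
    node, pos = _parse_node(text, 0)
    if pos != len(text):
        raise ValueError(f"unexpected trailing text at position {pos}")
    return node


def _parse_node(text, pos):
    if text[pos] != "{":
        raise ValueError(f"expected '{{' at position {pos}")
    pos += 1
    start = pos
    while text[pos] not in ":}":          # the head sememe ends at ':' or '}'
        pos += 1
    head = text[start:pos]
    children = []
    if text[pos] == ":":
        pos += 1
        while True:
            role = None                   # a bare {...} child carries no role name
            if text[pos] != "{":
                eq = text.index("=", pos)
                role = text[pos:eq]
                pos = eq + 1
            child, pos = _parse_node(text, pos)
            children.append((role, child))
            if text[pos] != ",":
                break
            pos += 1
    if text[pos] != "}":
        raise ValueError(f"expected '}}' at position {pos}")
    return (head, children), pos + 1


def sememes(node):
    """List the sememes in a parsed DEF, skipping the {~} self-reference."""
    head, children = node
    found = [] if head == "~" else [head]
    for _, child in children:
        found.extend(sememes(child))
    return found


bribe = parse_def("{GiveAsGift|赠:manner={guilty|有罪},purpose={entice|勾引}}")
print(bribe[0])          # head sememe of the second sense of "buy"
print(sememes(bribe))    # every sememe the definition is built from
```

The head of the tree is the sememe the concept is classified under, and the children are the role-labelled constraints that distinguish it from its neighbours.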
Relations – the soul of HowNet
Meaning is represented by relations
Computation of meaning is based on relations
1. Event Frame ~ Verb frame
- {event|事件}
  ├ {static|静态} {event|事件}
  │ ├ {relation|关系} {static|静态}
  │ │ ├ {possession|领属关系} {relation|关系}
  │ │ │ ├ {own|有} {possession|领属关系:possessor={*},possession={*}}
  │ │ │ │ ├ {obtain|得到} {own|有:possessor={*},possession={*},source={*}}
  └ {act|行动} {event|事件:agent={*}}
    ├ {ActGeneral|泛动} {act|行动:agent={*}}
    └ {ActSpecific|实动} {act|行动:agent={*}}
      └ {AlterSpecific|实变} {ActSpecific|实动:agent={*}}
        ├ {AlterRelation|变关系} {AlterSpecific|实变:agent={*}}
        │ ├ {AlterPossession|变领属} {AlterRelation|变关系:agent={*},possession={*}}
        │ │ ├ {take|取} {AlterPossession|变领属:agent={*},possession={*},source={*}}
        │ │ │ ├ {buy|买} {take|取:agent={*},possession={*},source={*},cost={*},beneficiary={*}}
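The event frame can be read as a hypernym tree in which each event sememe points to its parent and lists the roles it takes. The following is a minimal sketch of that reading, assuming the second column of each line names the parent sememe together with the role frame of the event on the left; the dictionaries `PARENT` and `ROLES` and both functions are hypothetical names, and only the fragment shown on the slide is encoded.

```python
# Parent links transcribed from the event-frame fragment on the slide.
PARENT = {
    "static|静态": "event|事件",
    "relation|关系": "static|静态",
    "possession|领属关系": "relation|关系",
    "own|有": "possession|领属关系",
    "obtain|得到": "own|有",
    "act|行动": "event|事件",
    "ActGeneral|泛动": "act|行动",
    "ActSpecific|实动": "act|行动",
    "AlterSpecific|实变": "ActSpecific|实动",
    "AlterRelation|变关系": "AlterSpecific|实变",
    "AlterPossession|变领属": "AlterRelation|变关系",
    "take|取": "AlterPossession|变领属",
    "buy|买": "take|取",
}

# Role frames transcribed from the same fragment (sememes without roles omitted).
ROLES = {
    "own|有": ["possessor", "possession"],
    "obtain|得到": ["possessor", "possession", "source"],
    "act|行动": ["agent"],
    "ActGeneral|泛动": ["agent"],
    "ActSpecific|实动": ["agent"],
    "AlterSpecific|实变": ["agent"],
    "AlterRelation|变关系": ["agent"],
    "AlterPossession|变领属": ["agent", "possession"],
    "take|取": ["agent", "possession", "source"],
    "buy|买": ["agent", "possession", "source", "cost", "beneficiary"],
}


def hypernym_chain(sememe):
    """Return the sememe followed by all of its ancestors up to the root."""
    chain = [sememe]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def new_roles(sememe):
    """Roles a sememe takes that its parent does not."""
    inherited = ROLES.get(PARENT.get(sememe), [])
    return [role for role in ROLES.get(sememe, []) if role not in inherited]


print(" -> ".join(hypernym_chain("buy|买")))
print(new_roles("buy|买"))
```

Walking the chain shows how {buy|买} accumulates its frame step by step: the agent comes from {act|行动}, the possession from {AlterPossession|变领属}, the source from {take|取}, and only cost and beneficiary are specific to buying.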
2. Typical actors of event roles ~ VerbNet
│ ├ {buy|买} {take|取:agent={human|人}{group|群体->},possession={artifact|人工物->},source={human|人}{InstitutePlace|场所},cost={money|货币},beneficiary={human|人}{group|群体->},domain={economy|经济}}
Axiomatic Relations & Role Shifting - 1
{buy|买} <----> {obtain|得到} [consequence]; agent OF {buy|买}=possessor OF {obtain|得到}; possession OF {buy|买}=possession OF {obtain|得到}.
{buy|买} <----> {obtain|得到} [consequence]; beneficiary OF {buy|买}=possessor OF {obtain|得到}; possession OF {buy|买}=possession OF {obtain|得到}.
{buy|买} <----> {obtain|得到} [consequence]; source OF {buy|买}=source OF {obtain|得到}; possession OF {buy|买}=possession OF {obtain|得到}.
Axiomatic Relations & Role Shifting - 2
{buy|买} [entailment] <----> {choose|选择}; agent OF {buy|买}=agent OF {choose|选择}; possession OF {buy|买}=content OF {choose|选择}; source OF {buy|买}=location OF {choose|选择}.
{buy|买} [entailment] <----> {pay|付}; agent OF {buy|买}=agent OF {pay|付}; cost OF {buy|买}=possession OF {pay|付}; source OF {buy|买}=target OF {pay|付}.
Axiomatic Relations & Role Shifting - 3
{buy|买 } (X) <----> {sell|卖 } (Y) [mutual implication]; agent OF {buy|买 }=target OF {sell|卖 }; source OF {buy|买 }=agent OF {sell|卖 }; possession OF {buy|买 }=possession OF {sell|卖 }; cost OF {buy|买 }=cost OF {sell|卖 }.
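Each axiomatic relation pairs two events and states which role of one corresponds to which role of the other, so applying it amounts to relabelling role fillers. The sketch below encodes two of the rules above as plain dictionaries; the function `shift_roles` and the example fillers (Mary, the shop, and so on) are hypothetical and serve only to illustrate the idea.

```python
# {buy} (X) <----> {sell} (Y) [mutual implication], as stated on the slide.
BUY_TO_SELL = {
    "agent": "target",
    "source": "agent",
    "possession": "possession",
    "cost": "cost",
}

# {buy} <----> {obtain} [consequence], first variant: the buyer ends up owning.
BUY_TO_OBTAIN = {
    "agent": "possessor",
    "possession": "possession",
}


def shift_roles(fillers, mapping):
    """Relabel role fillers of one event as roles of the related event.

    Roles that the rule does not mention are dropped.
    """
    return {mapping[role]: filler
            for role, filler in fillers.items()
            if role in mapping}


# Hypothetical instance: "Mary bought a book from the shop for 10 dollars."
buy_event = {
    "agent": "Mary",
    "source": "the shop",
    "possession": "a book",
    "cost": "10 dollars",
}

sell_event = shift_roles(buy_event, BUY_TO_SELL)
obtain_event = shift_roles(buy_event, BUY_TO_OBTAIN)
print(sell_event)    # the shop is now the agent, Mary the target
print(obtain_event)  # Mary is now the possessor of the book
```

With rules stored as data, inferring "the shop sold Mary a book" or "Mary has a book" from a buying event needs no event-specific code.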
Identical representation - 1
W_E=smuggle
G_E=V
DEF={transport|运送:manner={guilty|有罪}}

W_E=drug
G_E=N
DEF={addictive|嗜好物:modifier={guilty|有罪}}
Identical representation - 2
W_E=smuggling of drugs
G_E=N
DEF={fact|事情:CoEvent={transport|运送:manner={guilty|有罪},patient={addictive|嗜好物:modifier={guilty|有罪}}}}

W_E=drug smuggler
G_E=N
DEF={community|团体:{transport|运送:agent={~},manner={unlawful|非法},patient={addictive|嗜好物},purpose={sell|卖}}}
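One way to see the identical representation is to check that the definition of a phrase is built from the same sememes as the definitions of its parts. The following is a rough sketch of such a check using sememe counts only (it ignores structure and role names); the helper names `sememe_counts` and `embeds` are hypothetical, and the DEF strings are assumed to contain no stray spaces.

```python
import re
from collections import Counter


def sememe_counts(definition):
    """Count the sememes named in a DEF string, ignoring the {~} placeholder."""
    # A sememe is whatever follows "{" up to the next ":", "{" or "}".
    return Counter(re.findall(r"\{([^{}:~]+)", definition))


def embeds(phrase_def, part_defs):
    """True if every sememe of the parts occurs at least as often in the phrase."""
    combined = sum((sememe_counts(d) for d in part_defs), Counter())
    return not (combined - sememe_counts(phrase_def))


SMUGGLE = "{transport|运送:manner={guilty|有罪}}"
DRUG = "{addictive|嗜好物:modifier={guilty|有罪}}"
SMUGGLING_OF_DRUGS = (
    "{fact|事情:CoEvent={transport|运送:manner={guilty|有罪},"
    "patient={addictive|嗜好物:modifier={guilty|有罪}}}}"
)
DRUG_SMUGGLER = (
    "{community|团体:{transport|运送:agent={~},manner={unlawful|非法},"
    "patient={addictive|嗜好物},purpose={sell|卖}}}"
)

print(embeds(SMUGGLING_OF_DRUGS, [SMUGGLE, DRUG]))
print(embeds(DRUG_SMUGGLER, [SMUGGLE, DRUG]))
```

The check succeeds for "smuggling of drugs" and fails for "drug smuggler", because the latter definition as shown uses {unlawful|非法} where the word-level definitions use {guilty|有罪}.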
Motivation to develop secondary resources
To check HowNet knowledge data from different angles for precision and consistency
To provide users with tools for application
Practicable for any sense of any word
Secondary resources
Concept Relevance Calculator (CRC)
Concept Similarity Measure (CSM)
Query Expansion Tool (QET)
Chinese Morphological Processor (CMP)
Chinese Message Analyzer (CMA)
Concept similarity
doctor 2 <> dentist   0.300000
doctor 1 <> dentist   0.883333
doctor 1 <> nurse 1   0.620000
doctor 1 <> nurse 2   0.454545
doctor 1 <> patient   0.203636

walk <> run    0.144444
walk <> jump   0.144444
walk <> swim   0.130159
walk <> fly    0.124444
walk <> buy    0.018605
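The slides do not say how the Concept Similarity Measure computes these scores. As an illustration only, the sketch below uses a common stand-in, a similarity that decays with path distance in the sememe taxonomy, sim = alpha / (alpha + d). This is not the CSM algorithm and will not reproduce the figures above; the value of `alpha`, the function names, and the small taxonomy fragment (taken from the event frame shown earlier) are all assumptions.

```python
# Fragment of the event taxonomy, child -> parent (assumed reading of the slide).
PARENT = {
    "static|静态": "event|事件",
    "relation|关系": "static|静态",
    "possession|领属关系": "relation|关系",
    "own|有": "possession|领属关系",
    "obtain|得到": "own|有",
    "act|行动": "event|事件",
    "ActSpecific|实动": "act|行动",
    "AlterSpecific|实变": "ActSpecific|实动",
    "AlterRelation|变关系": "AlterSpecific|实变",
    "AlterPossession|变领属": "AlterRelation|变关系",
    "take|取": "AlterPossession|变领属",
    "buy|买": "take|取",
}


def ancestors(sememe):
    """Return the sememe followed by its ancestors up to the root."""
    chain = [sememe]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def path_distance(a, b):
    """Number of edges between two sememes via their lowest common ancestor."""
    up_a, up_b = ancestors(a), ancestors(b)
    for steps_a, node in enumerate(up_a):
        if node in up_b:
            return steps_a + up_b.index(node)
    raise ValueError("sememes are not in the same taxonomy")


def similarity(a, b, alpha=1.6):
    """Distance-based similarity in (0, 1]; alpha is a hypothetical tuning value."""
    return alpha / (alpha + path_distance(a, b))


print(similarity("buy|买", "take|取"))
print(similarity("buy|买", "obtain|得到"))
```

Under this stand-in, {buy|买} comes out closer to {take|取} than to {obtain|得到}, which matches the intuition behind the table: concepts that share more of their classification score higher.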
Conclusion
Extralinguistic knowledge is indispensable for HLT
The knowledge should form a computer-oriented system
It should be big enough; a toy example is useless
It can carry out computation of meaning