The Global Wordnet Grid: anchoring languages to universal meaning Piek Vossen Irion Technologies/Vrije Universiteit Amsterdam 6 th International Plain.

  • Published on
    27-Mar-2015

Transcript

<ul><li>Slide 1</li></ul> <p>The Global Wordnet Grid: anchoring languages to universal meaning. Piek Vossen, Irion Technologies/Vrije Universiteit Amsterdam. 6th International Plain Language Conference, October 11-14th, 2007, Amsterdam
Slide 2 Overview: Problem: effective language and communication: from human to human; from human to machine; from machine to machine; from human to machine and back to human, maybe via other machines... Solution: anchoring language to universal meaning. Wordnets: networks of words related through meaning. The Global Wordnet Grid: wordnets for languages connected to each other through an ontology. Future: equal access to the knowledge and information on the Internet for all people, regardless of language and background; systems that start to understand language.
Slide 3 Problem
Slide 4 Language is inherently vague and ambiguous. Communication through language mediates between the expectations of the Speaker and the Hearer =&gt; half a word is enough. Language is not fully descriptive but minimally sufficient: do not bother the Hearer with information that is already known =&gt; rely on background knowledge; use a minimal set of words and expressions to avoid memory overload =&gt; words and expressions have multiple meanings.
Slide 5 Concept in our head. Plato with beard. "gavagai": W.V.O. Quine (1964), inscrutability of reference: rabbit with carrots and rosemary; divine appearance announcing spring; sweet pet, wanna hug. Understanding is fundamentally impossible.
Slide 6 Full understanding is fundamentally impossible, BUT? People do communicate... People even communicate with computers...
As long as language is effective: meaning = to have the desired effect! Link language to useful content!
Slide 7 What is effective computer-mediated language? Computers store information and knowledge in textual form: people search for information and knowledge by 'querying' computers. Effective Computer Mediated Communication (CMC) = find what you need and nothing else. Computers analyze information and knowledge: collect data and send alerts, reports and facts. Computers connect people: support communication between people by analyzing communication or translating languages.
Slide 8 [Diagram labels: Information Seeker: Concept, Query, expression in language, words, strings; Information Provider: Concept, Information, expression in language, words, strings; Index of Strings: ape ... energy ... mass ... zebra]
Slide 9 [Diagram: seeker's expression "my cell phone", provider's expression "mobile"; Index of Strings: ape ... mobile ... zebra] Conceptual match, linguistic mismatch.
Slide 10 [Diagram: seeker's expression "my cell phone", provider's expression "nerve cells"; Index of Strings: ape ... cell ... zebra] Conceptual mismatch, linguistic match.
Slide 11 [Diagram: seeker's expression "police cell", provider's expression "nerve cells"; Index of Strings: ape ... cell ... zebra] Conceptual mismatch, linguistic match.
Slide 12 [Diagram: seeker's expression "neuron", provider's expression "nerve cells"; Index of Strings: ape ... cell ... zebra] Conceptual match, linguistic mismatch.
Slide 13 Recall &amp; Precision. Query: cell. Search engine for a database with all documents: cell phone, mobile phones, nerve cell, police cell. recall = intersection / relevant; precision = intersection / found, where the intersection is the set of documents that are both found and relevant. Recall &lt; 20% for basic search engines! (Blair &amp; Maron 1985)
Slide 14 Useless dialogues with Alice-bot
Slide 15 It is useful to anchor meaning! Anchoring already takes place all over the world through standardization: measures and units: meter, liter, kilo; terminological databases, legal definitions, contracts; international cooperation; ontologies: definition of the meaning of concepts in a formal knowledge representation system (1st-order logic), so that a computer can reason with it.
Slide 16 Solution
Slide 17 How can we anchor the meaning of words?
We can anchor words to each other: a semantic network or wordnet. We can anchor words to logical implications: a formal ontology.
Slide 18 Relational model of meaning. [Diagram, two networks: man, woman, boy, girl, cat, kitten, dog, puppy, animal; and man, woman, boy, meisje (Dutch for "girl"), cat, kitten, dog, puppy, animal]
Slide 19 Princeton WordNet. Developed by George Miller and his team at Princeton University, as the implementation of a mental model of the lexicon. Organized around the notion of a synset: a set of synonyms in a language that represent a single concept. Semantic relations between concepts. Covers over 100,000 concepts and over 120,000 English words.
Slide 20 Wordnet: a network of semantically related words. {conveyance; transport} {vehicle} {motor vehicle; automotive vehicle} {car; auto; automobile; machine; motorcar} {bumper} {car door} {car window} {car mirror} {armrest} {doorlock} {hinge; flexible joint} {cruiser; squad car; patrol car; police car; prowl car} {cab; taxi; hack; taxicab}
Slide 21 Concepts by ontological observations. Types and Roles among the hyponyms of dog in Wordnet: husky, lapdog; toy dog; hunting dog; working dog; dalmatian, coach dog, carriage dog; basenji; pug, pug-dog; Leonberg; Newfoundland; Great Pyrenees; spitz; griffon, Brussels griffon, Belgian griffon; corgi, Welsh corgi; poodle, poodle dog; Mexican hairless; pooch, doggie, doggy, barker, bow-wow; cur, mongrel, mutt. Current WordNet treatment: (1) a husky is a kind of dog; (2) a husky is a kind of working dog. What's wrong? (2) is defeasible, (1) is not: *This husky is not a dog =&gt; RIGID TYPE. This husky is not a working dog =&gt; ROLE, NON-RIGID.
Slide 31 Ontology versus wordnet. Hierarchy of disjoint types: Canine: PoodleDog; NewfoundlandDog; GermanShepherdDog; Husky. Wordnet: NAMES for TYPES: {poodle} EN, {poedel} NL, {pudoru} JP: (instance x Poodle). LABELS for ROLES: {watchdog} EN, {waakhond} NL, {banken} JP: ((instance x Canine) and (role x GuardingProcess))
Slide 32 Properties of the Ontology. Minimal: terms are distinguished by essential properties only. Comprehensive: includes all distinct concept types of all Grid languages. Allows definitions via KIF of all words that express non-rigid, non-essential properties of types. Logically valid, allows inferencing.
Slide 33 Ontology versus Wordnet. Not added to the type hierarchy: {straathond} NL (a dog that lives in the streets): ((instance x Canine) and (habitat x Street)). Added to the type hierarchy: {klunen} NL (to walk on skates from one frozen body of water to the next over land): KluunProcess =&gt; WalkProcess. Axioms: (and (instance x Human) (instance y Walk) (instance z Skates) (wear x z) (instance s1 Skate) (instance s2 Skate) (before s1 y) (before y s2) etc. National dishes, customs, games, ...
Slide 34 Ontology versus Wordnet. Words that refer to sets of types in specific circumstances, or to concepts that are dependent on these types. Next to {rivierwater} NL there are many others: {theewater} NL (water used for making tea), {koffiewater} NL (water used for making coffee), {bluswater} NL (water used for extinguishing fire). Words that relate to linguistic phenomena: gender, perspective, aspect, diminutives, politeness, pejoratives, part-of-speech constraints.
Slide 35 KIF expression for gender marking. {teacher} EN: ((instance x Human) and (agent x TeachingProcess)). {Lehrer} DE: ((instance x Man) and (agent x TeachingProcess)). {Lehrerin} DE: ((instance x Woman) and (agent x TeachingProcess)).
Slide 36 KIF expression for perspective. sell: subj(x), direct obj(z), indirect obj(y). buy: subj(y), direct obj(z), indirect obj(x). FinancialTransaction: (and (instance x Human) (instance y Human) (instance z Entity) (instance e FinancialTransaction) (source x e) (destination y e) (patient z e)). The same process but a different perspective through subject and object realization: "marry" corresponds to two verbs in Russian; apprendre in French can mean both "teach" and "learn".
Slide 37 Advantages of the Global Wordnet Grid. Shared and uniform world knowledge: universal inferencing; uniform text analysis and interpretation. More compact and less redundant databases. A clearer notion of how languages map to the knowledge: better criteria for expressing knowledge; better criteria for understanding variation.
Slide 38 Future
Slide 39 Language technology: a hole in one! [Diagram labels: synonyms, semantic network, thesaurus: golf club(s), Tiger Woods, golf sticks, golf clubs; linguistic analysis: "Golf at the club", "clubs for golf"]
Slide 40 Index concepts rather than words. Meaning of a word in context: Domain of the document: Juventus =&gt; football. Topic of the paragraph: transfer scandal =&gt; business, crime. Phrase (linguistically motivated combination of words): [wing player]: football player; in [police cell]: jail. Topic of the query: Can I order chicken wings? =&gt; food. Phrase: [chicken wings]: dish.
Slide 41 Expansion from a type to roles; expansion with clear hyponymy. [Diagram labels: dog, watchdog, poodle, street dog, dachshund, lapdog, short hair dachshund, long hair dachshund, hunting dog, puppy, bitch]
Slide 42 Expansion from a role to types and other roles; expansion with clear hyponymy. [Diagram labels: dog, watchdog, poodle, street dog, dachshund, lapdog, short hair dachshund, long hair dachshund, hunting dog, puppy, bitch]
Slide 43 [Diagram labels: ontology, texts, objects in reality, thought, expression (keitaidenwa), knowledge &amp; information] Useful and effective behavior: reason over knowledge; collect information and data; deliver services and be helpful.
Slide 44 Automotive ontology: (http://www.ontoprise.de)
Slide 45 Who uses ontologies?
Slide 46 Make word meanings effective!
Irion Technologies makes smart language technology solutions. Knowledge mining: automatic extraction of knowledge from text. Cooperative dialogue systems: access to information and services regardless of the choice of words and regardless of the structuring of the information, possibly using a given structuring. Cooperates with the user: asks the user for help, instructions, confirmation and explanations.
Slide 47 [Diagram labels: docs, text, phrases, semantic network, words, concepts, grammar, domain classifier, domain, parsing, concept detection, cell phone, telecommunication, concepts, accessories, repair, ontology, concepts, relations, facts, support, model, price, in stock, fact extraction, strings]
Slide 48 Dialogue system. [Diagram: Dialogue Manager with sample utterances: "Can I help you?" "My headphone is broken. I want to buy a new one." "Would you like repair or products?" "Can you say more about products?" "It is for my cell phone." "Can you give more details?" "It is a Nokia 6110." "I got the following accessories for you. Please have a look." "That is not what I want!" User Model: intention, satisfaction, emotion. Information State: positive, negative, relations. Other labels: question analysis, topic detection, search engine, repair, information, accessories, products, website, text analysis, word, mobile, headphone, concept]
Slide 49 Communicative dialog system. A dialogue system that cooperates with the user: detects intention: complaint, buy, support, information; measures satisfaction: happy, emotions; creates more context than simple keywords and delivers more precision: answers instead of hits.
Slide 50 Communicative dialog system. Prevents deadlocks: detects vagueness and ambiguity (which meaning of cell?); detects topic changes; uses negative feedback: No jails, I want cell phones! Can handle out-of-domain questions (users do not know what the system knows): "We do not have hotel rooms but we do have electronic equipment". "No, we do not have portophones but we do have other electronic equipment such as cell phones". [Diagram labels: hotel room, room, space, equipment, cell phone, portophone, object]
Slide 51 THANK YOU FOR YOUR ATTENTION </p>
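
The recall and precision definitions on Slide 13 can be illustrated with a minimal Python sketch. The four document titles come from that slide; the document ids, the assumption that the seeker means the telephone, and the function name recall_precision are made up for illustration and are not part of the talk.

```python
def recall_precision(found, relevant):
    """Return (recall, precision) for two sets of document ids."""
    hits = found & relevant  # documents that are both found and relevant
    recall = len(hits) / len(relevant) if relevant else 0.0
    precision = len(hits) / len(found) if found else 0.0
    return recall, precision


# Toy collection (ids are hypothetical): document id -> text
docs = {
    1: "cell phone",
    2: "mobile phones",
    3: "nerve cell",
    4: "police cell",
}

# Assumption: the seeker means the telephone, so documents 1 and 2 are relevant.
relevant = {1, 2}

# Plain string matching on the query word "cell".
found = {doc_id for doc_id, text in docs.items() if "cell" in text.split()}

recall, precision = recall_precision(found, relevant)
print(f"found={sorted(found)} recall={recall:.2f} precision={precision:.2f}")
```

String matching misses "mobile phones" (conceptual match, linguistic mismatch) and returns "nerve cell" and "police cell" (linguistic match, conceptual mismatch), which is the gap that indexing concepts rather than words is meant to close.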
