Automatic Extraction and Incorporation of Purpose Data into PurposeNet
P. Kiran Mayee, Rajeev Sangal, Soma Paul
SCONLI3 JNU NEW DELHI



Page 1: Automatic Extraction and Incorporation of  Purpose Data into PurposeNet

Automatic Extraction and Incorporation of Purpose Data into PurposeNet

P. Kiran Mayee, Rajeev Sangal, Soma Paul

SCONLI3 JNU NEW DELHI

Page 2: Automatic Extraction and Incorporation of  Purpose Data into PurposeNet

INTRODUCTION

Purpose

  • Need for a knowledge base of objects and actions in which the knowledge is organized around purpose.

Page 3: Automatic Extraction and Incorporation of  Purpose Data into PurposeNet

PurposeNet

PurposeNet is an intelligent knowledge-based system dealing with specialized attributes of artifacts: their purpose; the purpose of their types, components, and accessories; and data about their birth, processes, side effects, maintenance, and result on destruction.

Page 4: Automatic Extraction and Incorporation of  Purpose Data into PurposeNet

PurposeNet

Page 5: Automatic Extraction and Incorporation of  Purpose Data into PurposeNet

Building the PurposeNet

  • Template Designing
  • Revision & Refinement of template
  • Selection of Domain
  • Information Retrieval from Web
  • Ontology population
  • Testing

Page 6: Automatic Extraction and Incorporation of  Purpose Data into PurposeNet

Need for Automation

  • Acquisition bottleneck
  • Massive availability of text
  • Availability of purpose cues

Page 7: Automatic Extraction and Incorporation of  Purpose Data into PurposeNet

Purpose data required

Artifact -- garage
Purpose
    Action -- store
    Upon -- vehicle
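The garage example above can be written down as a small record type. This is only an illustrative sketch: the class name `PurposeRecord` and its field names are assumptions made for this example, not the actual PurposeNet schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PurposeRecord:
    """Hypothetical container for one purpose entry (not the PurposeNet schema)."""
    artifact: str  # the object being described, e.g. "garage"
    action: str    # what the artifact is used to do
    upon: str      # what the action is performed on


# The example from the slide: a garage is used to store a vehicle.
garage = PurposeRecord(artifact="garage", action="store", upon="vehicle")
print(garage)
```

Making the record frozen keeps entries hashable, so duplicates can be collapsed with a set if needed.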

Page 8: Automatic Extraction and Incorporation of  Purpose Data into PurposeNet

Purpose Cues

  • Word(s)
  • Lexical entities in a particular order
  • Classification
      • Sentences beginning with artifact name
      • Sentences ending with artifact name
      • Sentences containing artifact name
      • Hidden Cues
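The positional classes above could be assigned with a simple token check, sketched below. The function name `cue_class`, the label strings, and the rule of skipping a leading article are all assumptions for illustration. Hidden cues are not handled; sentences that do not name the artifact are only flagged as `"no_artifact"`. The garage sentence in the usage lines is a made-up example, not one taken from the corpora.

```python
import re

ARTICLES = ("a", "an", "the")


def cue_class(sentence: str, artifact: str) -> str:
    """Label a sentence by where the artifact name occurs (illustrative only)."""
    # Lowercase tokens; '+' is kept so compound names like "air+pump" survive.
    words = re.findall(r"[a-z0-9+]+", sentence.lower())
    name = artifact.lower()
    if name not in words:
        return "no_artifact"
    # Assumption: a leading article does not count as the sentence start.
    start = 1 if len(words) > 1 and words[0] in ARTICLES else 0
    if words[start] == name:
        return "begin"
    if words[-1] == name:
        return "ending"
    return "embedded"


print(cue_class("A garage is used to store vehicles.", "garage"))
print(cue_class("We cut trees with an axe.", "axe"))
print(cue_class("Use the air+pump to fill the tyre.", "air+pump"))
```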

Page 9: Automatic Extraction and Incorporation of  Purpose Data into PurposeNet

Sentences commencing with artifact name

Page 10: Automatic Extraction and Incorporation of  Purpose Data into PurposeNet

Sentences ending with artifact name

We cut trees with an axe.

action upon artifact
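One way to read the "action upon artifact" labelling of this sentence is as a regular expression with named groups. The pattern below is an assumption built from this single example: it presumes a one-word subject, a one-word action, and the artifact introduced by "with a/an/the". It is not the cue table used by the authors.

```python
import re

# Hypothetical cue: <subject> <action> <upon> with a/an/the <artifact>.
ENDING_CUE = re.compile(
    r"^\w+\s+(?P<action>\w+)\s+(?P<upon>.+?)\s+with\s+(?:a|an|the)\s+"
    r"(?P<artifact>[\w+]+)\.?$",
    re.IGNORECASE,
)


def extract_ending(sentence: str):
    """Return artifact/action/upon for an ending-cue sentence, else None."""
    match = ENDING_CUE.match(sentence.strip())
    return match.groupdict() if match else None


print(extract_ending("We cut trees with an axe."))
```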

Page 11: Automatic Extraction and Incorporation of  Purpose Data into PurposeNet

Sentences containing artifact name

Use the air+pump to fill the tyre.

Use the <artifact> to <action> the <upon>
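A template written in this angle-bracket form can be turned into a regular expression mechanically. The helper below is a sketch under assumptions: the name `template_to_regex` is hypothetical, each slot is taken to be a single token (letters, digits, underscore, or `+`), and the optional final full stop is a convenience added here.

```python
import re


def template_to_regex(template: str) -> "re.Pattern[str]":
    """Compile a cue template with <slot> markers into a named-group regex."""
    # re.split with one capture group alternates literal text and slot names.
    parts = re.split(r"<(\w+)>", template)
    pattern = ""
    for index, part in enumerate(parts):
        if index % 2 == 0:
            pattern += re.escape(part)                 # literal cue words
        else:
            pattern += "(?P<" + part + r">[\w+]+)"     # one-token slot
    return re.compile("^" + pattern + r"\.?$", re.IGNORECASE)


cue = template_to_regex("Use the <artifact> to <action> the <upon>")
match = cue.match("Use the air+pump to fill the tyre.")
print(match.groupdict() if match else None)
```

Keeping cues as template strings means new patterns can be added as data rather than as hand-written regular expressions.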

Page 12: Automatic Extraction and Incorporation of  Purpose Data into PurposeNet

Methodology for purpose data extraction

Page 13: Automatic Extraction and Incorporation of  Purpose Data into PurposeNet

Algorithm for Purpose Data Extraction

Algorithm PurpDataExtract(corpus)

Step1 : Read first sentence in Corpus.
Step2 : Loop until end-of-corpus
    2a. if contains(sentence, artifact) and match(sentence, cuetable) then
            extract(sentence, artifact)
            extract(sentence, to_action)
            extract(sentence, to_upon)
            add_to_ontology(artifact, to_action, to_upon)
        else
    2b.     goto step 3.
Step3 : Read next sentence
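The loop above might look as follows in Python. This is a minimal sketch, not the authors' implementation: the two-entry `CUE_TABLE`, the artifact lookup set, and the use of a plain list as the "ontology" are all assumptions introduced to make the example self-contained.

```python
import re

# Hypothetical cue table built from the two example sentences on the slides.
CUE_TABLE = [
    re.compile(
        r"^Use\s+the\s+(?P<artifact>[\w+]+)\s+to\s+(?P<action>\w+)\s+the\s+"
        r"(?P<upon>[\w+]+)\.?$",
        re.IGNORECASE,
    ),
    re.compile(
        r"^\w+\s+(?P<action>\w+)\s+(?P<upon>.+?)\s+with\s+(?:a|an|the)\s+"
        r"(?P<artifact>[\w+]+)\.?$",
        re.IGNORECASE,
    ),
]


def purp_data_extract(corpus, artifacts):
    """Collect (artifact, action, upon) triples from sentences matching a cue."""
    ontology = []  # stand-in for add_to_ontology
    for sentence in corpus:            # Step 1 / Step 3: read each sentence
        for cue in CUE_TABLE:          # Step 2a: match against the cue table
            match = cue.match(sentence.strip())
            if match and match.group("artifact").lower() in artifacts:
                ontology.append((
                    match.group("artifact").lower(),
                    match.group("action").lower(),
                    match.group("upon").lower(),
                ))
                break                  # one triple per sentence
    return ontology


corpus = [
    "We cut trees with an axe.",
    "Use the air+pump to fill the tyre.",
    "The sky is blue.",
]
print(purp_data_extract(corpus, artifacts={"axe", "air+pump"}))
```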

Page 14: Automatic Extraction and Incorporation of  Purpose Data into PurposeNet

Data

  • Wikipedia – 249 files
  • Wordnet – 81,837 descriptions
  • Princeton noun-artifact corpus – 82,115 sentences

Page 15: Automatic Extraction and Incorporation of  Purpose Data into PurposeNet

Observations – summary results

Corpus Name    Corpus size    purpsen    PurpData Density (%)
Wordnet        81837          1251       1.53
Princeton      82115          1023       1.25
Wikipedia      243            109        44.86
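The density column is consistent with purpsen divided by corpus size, expressed as a percentage. The slides do not state this formula, so it is an inference from the numbers; the check below simply recomputes it.

```python
# (corpus size, purpsen) as listed in the summary table.
corpora = {
    "Wordnet": (81837, 1251),
    "Princeton": (82115, 1023),
    "Wikipedia": (243, 109),
}


def density(corpus_size: int, purpsen: int) -> float:
    """Assumed definition: purpose sentences per 100 corpus units."""
    return 100.0 * purpsen / corpus_size


for name, (size, purpsen) in corpora.items():
    print(name, round(density(size, purpsen), 2))
```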

Page 16: Automatic Extraction and Incorporation of  Purpose Data into PurposeNet

Purpose Data Extraction Misses

Corpus Name    PurpHits    Purpmiss (artifact name absent)    Purpmiss (action_upon absent)
Wordnet        1251        nil                                4
Princeton      1023        41                                 17
Wikipedia      109         44                                 3

Page 17: Automatic Extraction and Incorporation of  Purpose Data into PurposeNet

IE Metrics for Extraction

Corpus Name    Precision    F-measure
Wordnet        99.6         99.79
Princeton      94.6         97.22
Wikipedia      69.8         82.21
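The slides give neither the formulas nor a recall figure. As an inference only, the reported values are consistent with precision computed as hits / (hits + both kinds of miss) and with the standard F-measure 2PR / (P + R) at a recall of 100. The sketch below recomputes the numbers under those assumptions, counting "nil" misses as 0.

```python
def precision(hits: int, misses: int) -> float:
    """Assumed definition: hits as a percentage of hits plus misses."""
    return 100.0 * hits / (hits + misses)


def f_measure(p: float, r: float) -> float:
    """Harmonic mean of precision and recall (both in percent)."""
    return 2.0 * p * r / (p + r)


# (PurpHits, total misses) from the misses table; "nil" is counted as 0.
counts = {"Wordnet": (1251, 0 + 4), "Princeton": (1023, 41 + 17), "Wikipedia": (109, 44 + 3)}
reported_precision = {"Wordnet": 99.6, "Princeton": 94.6, "Wikipedia": 69.8}

for name, (hits, misses) in counts.items():
    p = precision(hits, misses)
    # Recall of 100 is an assumption; it is not stated on the slides.
    print(name, round(p, 2), round(f_measure(reported_precision[name], 100.0), 2))
```

The recomputed precision differs from the reported figures only in the second decimal place, which suggests the slide values were truncated rather than rounded.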

Page 18: Automatic Extraction and Incorporation of  Purpose Data into PurposeNet

Result BreakUp per Cue Class

Corpus Name    Class1 (begin cue)    Class2 (ending cue)    Class3 (embedded cue)
Wordnet        70.19                 0.01                   24.7
Princeton      71.4                  1.21                   21.22
Wikipedia      84.2                  1.6                    12.21

Page 19: Automatic Extraction and Incorporation of  Purpose Data into PurposeNet

Comparison with manually built Ontology

  • Exponential increase in speed
  • High Error Rate

Page 20: Automatic Extraction and Incorporation of  Purpose Data into PurposeNet

Issues

  • Redundancy
  • Primary purpose not always obtained
  • Pronouns and brand names
  • Correctness and consistency not guaranteed
  • One-to-one mapping assumed
  • Other sentence manifestations

Page 21: Automatic Extraction and Incorporation of  Purpose Data into PurposeNet

Further Enhancements

  • Parsed input
  • Cues for hidden case
  • Better artifact lookup list
  • Multipage lookup for consistency
  • Cloud computing
  • Automating other attributes of PurposeNet

Page 22: Automatic Extraction and Incorporation of  Purpose Data into PurposeNet

Conclusions

  • A methodology was proposed for automated ontology population of PurposeNet.
  • The methodology was implemented on three corpora.
  • The time taken for PurposeNet 'purpose' ontology population was a fraction of that taken by manual methods.
  • The error rate was found to be high.

Page 23: Automatic Extraction and Incorporation of  Purpose Data into PurposeNet

Thank You