Amit Sheth, "The Mysteries of Metadata," Workshop (Tutorial) at Content World 2001, Burlingame, CA, May 15, 2001.
Confidential HP
The Mysteries of Metadata
Workshop at Content World 2001, Burlingame, CA, May 15, 2001
Amit Sheth, amit@taalee.com
Founder/CEO, Taalee (www.taalee.com) [Taalee is now Semagix, www.semagix.com]
Also Director, Large Scale Distributed Information Systems (LSDIS) Lab, University of Georgia (lsdis.cs.uga.edu)
Metadata Extraction is a patented technology of Taalee Inc. Semantic Engine and WorldModel are trademarks of Taalee Inc.
HP 2
Workshop Agenda
What is Metadata?
Metadata Descriptions and Standards
Metadata Storage/Exchange/Infrastructure
(Automated) Metadata Creation/Extraction/Tagging
Metadata Usage/Applications
HP 3
What is Metadata?
Data about data
Statements, contexts
Recursive: data about "data about data"
Applications
Content management
Cataloguing
Information retrieval, search, ...
"A Web content repository without metadata is like a library without an index." - Jack Jia, IWOV
HP 4
Information Interoperability: key metadata objective and benefit
System
Syntax
Structure
Semantics (Protocols, Metadata, Domain Modeling/Ontologies)
HP 5
Semantics
Meaning, Understanding
Facts, Context, Reasoning
Related to exchange, usage, application
HP 6
A metadata classification
Data (Heterogeneous Types/Media)
Content Independent Metadata (creation-date, location, type-of-sensor)
Content Dependent Metadata (size, max colors, rows, columns)
Direct Content Based Metadata (inverted lists, document vectors, WAIS, Glimpse, LSI)
Domain Independent (structural) Metadata (C++ class-subclass relationships, HTML/SGML Document Type Definitions, C program structure)
Domain Specific Metadata: area, population (Census); land-cover, relief (GIS); metadata concept descriptions from ontologies
Ontologies, Classifications, Domain Models
User
Move in this direction (from Data toward User) to tackle information overload
HP 7
Types of Metadata for digital media
Media type-specific metadata, e.g., texture of images, font size, ...
Media processing-specific metadata, e.g., search, retrieval, personalized filtering
Content-specific metadata, e.g., rocket-related video and documents
HP 8
Metadata for Digital Data
Metadata | Data Type | Metadata Type
Q-Features [Jain and Hampapur] | Image, Video | Domain Specific
R-Features [Jain and Hampapur] | Image, Video | Domain Independent
Meta-Features [Jain and Hampapur] | Image, Video | Content Independent
Impression Vector [Kiyoki et al.] | Image | Content Descriptive
NDVI, Spatial Registration [Anderson and Stonebraker] | Image | Domain Specific
Speech Feature Index [Glavitsch et al.] | Audio | Direct Content Based
Topic Change Indices [Chen et al.] | Audio | Direct Content Based
Document Vectors [Deerwester et al.] | Text | Direct Content Based
Inverted Indices [Kahle and Medlar] | Text | Direct Content Based
Content Classification Metadata [Bohm and Rakow] | MultiMedia | Domain Specific
Document Composition Metadata [Bohm and Rakow] | MultiMedia | Domain Independent
Metadata Templates [Ordille and Miller] | Media Independent | Domain Specific
Land Cover, Relief [Sheth and Kashyap] | Media Independent | Domain Specific
Parent Child Relationships [Shklar et al.] | Text | Domain Independent
Contexts [Sciore et al.; Kashyap and Sheth] | Structured | Domain Specific
Concepts from Cyc [Collet et al.] | Structured | Domain Specific
User's Data Attributes [Shoens et al.] | Text, Structured | Domain Specific
Domain Specific Ontologies [Mena et al.] | Media Independent | Domain Specific
HP 9
Types of Specs and Standards (or MetaModels)
Domain Independent: (MCF), RDF, MOF, Dublin Core
Media Specific: MPEG4, MPEG7, VoiceXML
Domain/Industry Specific (metamodels): MARC (Library), FGDC and UDK (Geographic), NewsML (News), PRISM (Publishing)
Application Specific: ICE (Syndication)
Exchange/Sharing: XCM, XMI
Orthogonal (Other): RDFS, namespaces, ontologies, domain models (DAML, OIL)
HP 10
What RDF can do for metadata
Designed to impose structural constraints on syntax to support consistent encoding, exchange, and processing of metadata
Domain Independent Metadata standard
HP 11
RDF (Resource Description Framework)
Property / Value / Resource
• RDF data consists of nodes and attached attribute/value pairs.
• Nodes can be any web resources (pages, servers, basically anything for which you can give a URI), even other instances of metadata.
• Attributes are named properties of the nodes, and their values are either atomic (text strings, numbers, etc.) or other resources or metadata instances.
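The node and attribute/value model described above can be sketched in a few lines of Python. This is only an illustration, not an RDF library: the class names `Resource` and `Statement` and the helper `properties_of` are hypothetical, and the URIs are the placeholders used in the workshop's own examples.

```python
from typing import NamedTuple, Union


class Resource(NamedTuple):
    """A node: anything that can be identified by a URI."""
    uri: str


# A property value is either atomic (string, number) or another resource.
Value = Union[str, int, float, Resource]


class Statement(NamedTuple):
    """One named property and its value, attached to a node."""
    subject: Resource
    prop: str
    value: Value


talk = Resource("URI:TALK")   # placeholder URI
amit = Resource("URI:AMIT")   # placeholder URI

statements = [
    Statement(talk, "dc:title", "Mysteries of Metadata"),  # atomic value
    Statement(talk, "dc:creator", amit),                   # value is a resource
]


def properties_of(node, stmts):
    """Collect the property/value pairs attached to one node."""
    return {s.prop: s.value for s in stmts if s.subject == node}


talk_props = properties_of(talk, statements)
```

Because a value may itself be a `Resource`, further statements can be attached to `amit`, which is how descriptions chain into a graph.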
HP 12
RDF Example 1
Graph: URI:TALK --dc:creator--> URI:AMIT
Graph: URI:TALK --dc:title--> "Mysteries of Metadata"

<?xml version='1.0'?>
<rdf:RDF xmlns:rdf = "http://www.w3.org/TR/REC-rdf-syntax"
         xmlns:dc = "http://purl.org/dc/elements/1.0/">
  <rdf:Description rdf:about = "URI:TALK">
    <dc:title>Mysteries of Metadata</dc:title>
    <dc:creator rdf:resource = "URI:AMIT"/>
  </rdf:Description>
</rdf:RDF>
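As a rough illustration of how such a description can be consumed, the sketch below reads an RDF/XML document with Python's standard `xml.etree.ElementTree` and lists its (subject, property, value) triples. It handles only this simple `rdf:Description` form. Note the assumptions: the namespace URIs are the current W3C RDF and Dublin Core 1.1 ones rather than those printed on the slide, and the function name `triples` is made up for this example.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"  # assumption: current RDF namespace
DC = "http://purl.org/dc/elements/1.1/"              # assumption: Dublin Core 1.1

doc = f"""<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="{RDF}" xmlns:dc="{DC}">
  <rdf:Description rdf:about="URI:TALK">
    <dc:title>Mysteries of Metadata</dc:title>
    <dc:creator rdf:resource="URI:AMIT"/>
  </rdf:Description>
</rdf:RDF>"""


def triples(xml_text):
    """Return (subject, property, value) for each property element."""
    root = ET.fromstring(xml_text)
    out = []
    for desc in root.findall(f"{{{RDF}}}Description"):
        subject = desc.get(f"{{{RDF}}}about")
        for prop in desc:
            # A property either points at another resource or holds text.
            target = prop.get(f"{{{RDF}}}resource")
            value = target if target is not None else (prop.text or "").strip()
            out.append((subject, prop.tag, value))
    return out


result = triples(doc)
```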
HP 13
RDF Example 2
[Graph figure]
Nodes: URI:TALK, URI:AMIT, URI:LIB
Properties: dc:title, dc:creator, BIB:Name, BIB:Email, BIB:Aff
Literal values: "Mysteries of Metadata", "Amit Sheth", "amit@taalee.com"
HP 14
RDFS (RDF Schema)
Enables resource description communities to define (and share) vocabularies (museum, library, e-commerce, ...)
Vocabulary (in RDFS) = the meaning, characteristics, and relationships of a set of properties
HP 15
RDF Based Web
HTML
Resources
RDF/XML Descriptions
RDF Schemas
Source: http://www.w3c.rl.ac.uk
HP 16
Dublin Core Metadata Initiative
Simple element set designed for resource description
International, inter-discipline, W3C community consensus
"Semantic" interface among resource description communities (very limited form of semantics)
Source: www.desire.org
HP 17
Dublin Core RDF
<xml>
<?namespace href = "http://w3.org/rdf-schema" as = "RDF"?>
<?namespace href = "http://metadata.net/DC" as = "DC"?>
<RDF:Abbreviated>
  <RDF:Assertion RDF:HREF = "http://www.mysite.com/mydoc.html"
    DC:Title = "I've Never Metadata I've Never Liked"
    DC:Creator = "Mary Crystal"
    DC:Subject = "Metadata, Dublin Core, Stuff"/>
</RDF:Abbreviated>
</xml>
HP 18
MOF (Meta-Object Facility) and XMI
MOF models metadata using a subset of UML that is relevant to modeling metadata (class models: classes, associations, and subtyping), together with a set of rules for mapping the elements of the MOF Core to CORBA IDL.
XML Metadata Interchange (XMI) is an extension of the MOF into the XML space.
HP 19
NewsML
NewsML is a packaging and metadata format for news content.
NewsML is developed by the International Press Telecommunications Council (IPTC), a consortium of news providers, mostly in the print or wire-service industries.
Since it deals only with packaging and metadata, NewsML is complementary both to news content formats like NITF and to syndication protocols like ICE.
HP 20
NewsML ...
It can be used by news providers to combine their pictures, video, text, graphics, and audio files in news output available on web sites, mobile phones, high-end desktops, interactive television, and any other device.
It provides an accurate, objective set of description tools which help qualify the information and make the search more precise.
NewsML allows a range of metadata to be attached to a multi-media story, including a detailed, computer-readable description of what an item is about.
HP 21
Example of the end-to-end flow - NewsML
The content provider supplies NewsML-packaged media content to the operator. The content is categorized as current events, finance, sport, etc., and updated hourly.
The operator receives NewsML data from the content provider. The content server automatically pushes updated news articles to all news service subscribers.
Consumers sign up for the news service directly on the device. When using the news service, the user browses through the categories and reads the news articles. The news articles are presented in a continuous flow (one after the other) without end-user interaction.
Source: http://www.mediabricks.com
HP 22
PRISM
Publishing Requirements for Industry Standard Metadata
Version 1.0, April 2001
Authors: IDEAlliance (Adobe, Vignette, Kinecta, et al.)
Idea: "a standard for interoperable content description, interchange, and reuse in both traditional and electronic publishing contexts"
Web site: http://www.prismstandard.org
HP 23
PRISM Design
Built on existing standards like Dublin Core (DC), RDF, XML
Designed to be used in a simple, straightforward way over the Internet
Compatible with NewsML
Integrates easily with ICE (for syndication)
Vocabulary:
Basic: DC
Extensions: "Controlled Vocabularies", e.g., "North American Industry Classification System" (NAICS)
HP 24
PRISM Example
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845"/>
    <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>
(Source: PRISM spec v1, http://www.prismstandard.org/techdev/prismspec1.asp)
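For readers who want to produce a description of this shape programmatically, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. It writes only a few Dublin Core elements inside an `rdf:Description`; it is not a PRISM implementation, and the helper name `describe` is an assumption made for this example.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"

# Ask ElementTree to use the conventional prefixes when serializing.
ET.register_namespace("rdf", RDF)
ET.register_namespace("dc", DC)


def describe(about, **dc_fields):
    """Build an rdf:RDF element holding one description with DC elements."""
    root = ET.Element(f"{{{RDF}}}RDF")
    desc = ET.SubElement(root, f"{{{RDF}}}Description",
                         {f"{{{RDF}}}about": about})
    for name, text in dc_fields.items():
        ET.SubElement(desc, f"{{{DC}}}{name}").text = text
    return root


record = describe(
    "http://wanderlust.com/2000/08/Corfu.jpg",
    title="Walking on the Beach in Corfu",
    creator="John Peterson",
    format="image/jpeg",
)
xml_text = ET.tostring(record, encoding="unicode")
```

Keeping the element names as keyword arguments keeps the sketch short; a fuller version would validate names against the Dublin Core element set.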
HP 25
VoiceXML
A language for specifying voice dialogs
Voice dialogs use audio prompts and text-to-speech (TTS) for output; touch-tone keys (DTMF) and automatic speech recognition (ASR) for input
Goal is to bring the advantages of web-based development and content delivery to interactive voice response applications
High-level, voice-specific language simplifies application development
Source: http://www.voicexml.org
HP 26
Voice Based Internet Applications
Source: http://www.voicexml.org
HP 27
VoiceXML Metadata
Voice-specific metadata
Supports syntactic interoperability
Text data to voice data
VoiceXML = XML + Voice Metadata
HP 28
VoiceXML - Possible Services
Information retrieval - News, sports, traffic, stock quotes
e-Transactions (e-commerce, e-tailing, etc.)
Financial: banking, stock trading
Catalog browsing (generally as an adjunct to paper)
Telephone services
Personal voice dialing, One-number "find-me" services
Intranet - Inventory, HR services, corporate portals
Unification - "My Whatever" personal portals, personal agents, unified messaging
Source: http://www.voicexml.org
HP 29
MPEG7
A set of description schemes and descriptors to describe the content of multimedia data
Provides a language to specify description schemes
A scheme for coding the description
HP 30
Application Examples for MPEG7
A few application examples are
Digital libraries (image catalog musical dictionary)
Multimedia directory services (e.g., yellow pages)
Broadcast media selection (radio channel TV channel)
HP 31
Information and Content Exchange (ICE)
Main Goal: efficient and extensible Content Syndication protocol for the Internet, using XML syntax
Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.
Status: latest spec version 1.1, May 2000; submitted to W3C for review
Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, ...
Web Site: http://www.icestandard.org
HP 32
What is the ICE Protocol?
Syndication protocol for communication between Syndicators and Subscribers
Metadata to define:
• roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver
• format and method of content exchange (e.g., sequenced packages, pull vs. push model)
HP 33
ICE Applications
ICE vocabulary + domain vocabulary = complete application
ICE establishes and manages the syndication, delivers data, and logs events => content-independent metadata
Industry-specific vocabulary defines the content => domain-specific metadata
Source: http://www.icestandard.org
HP 34
ICE Explained
ICE: Information and Content Exchange protocol.
Syndicator: A content aggregator and distributor.
Subscriber: A content consumer.
Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement.
Collection: The current content of a subscription.
ICE Package: A delivery of commands to update a collection, such as the addition of content items.
ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information.
Sources: InternetWeek; ICE Cookbook version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm
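The relationship between a subscription, its collection, and the packages that update it can be sketched as a small state model. This is an illustration of the vocabulary only, not the ICE wire protocol: the class and field names are hypothetical, and skipping an already applied package is an assumption made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Package:
    """A delivery of commands that update a subscriber's collection."""
    package_id: str
    add: dict = field(default_factory=dict)     # item-id -> content
    remove: list = field(default_factory=list)  # item-ids to drop


@dataclass
class Subscription:
    """Subscriber-side state for one agreement with a syndicator."""
    subscription_id: str
    collection: dict = field(default_factory=dict)  # current content
    applied: list = field(default_factory=list)     # package ids, in order

    def apply(self, pkg):
        if pkg.package_id in self.applied:  # assumption: ignore repeats
            return
        for item_id in pkg.remove:
            self.collection.pop(item_id, None)
        self.collection.update(pkg.add)
        self.applied.append(pkg.package_id)


sub = Subscription("4321")
sub.apply(Package("pkg-1", add={"item-1": "Cartoon A", "item-2": "Cartoon B"}))
sub.apply(Package("pkg-2", add={"item-3": "Cartoon C"}, remove=["item-1"]))
sub.apply(Package("pkg-2", add={"item-3": "Cartoon C"}, remove=["item-1"]))
```

After the three calls the collection holds `item-2` and `item-3`, and the repeated `pkg-2` delivery has no further effect.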
<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe"
             timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon"
                filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru"
                     copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC ... (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>
Callout: the comic-strip element is the content (domain-specific metadata)
HP 36
XCM (eXtended Content Management)
A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:
Content Development - Developing static content and managing the process of its subsequent approval, versioning, storage, and retrieval
Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle
Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability
Source: http://www.vignette.com/CDA/Site/0,2097,1-1-30-1458-1146-1743,00.html
HP 37
XCM: eXtended Content Management
Content Development Management: Content Authoring, Digital Asset Management, Software Configuration Management, Document Process Management
Application Content Management: Metadata Management, Recombination, Personalization
Content Delivery: Edge Network Delivery, Streaming Media Delivery, Caching
Source: http://www.vignette.com
HP 38
Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain
FGDC Metadata Model
  Theme keywords: digital line graph, hydrography, transportation
  Title: Dakota Aquifer
  Online linkage: http://gisdasc.kgs.ukans.edu/dasc/
  Direct Spatial Reference Method: Vector
  Horizontal Coordinate System Definition: Universal Transverse Mercator
  ...
UDK Metadata Model
  Search terms: digital line graph, hydrography, transportation
  Topic: Dakota Aquifer
  Address Id: http://gisdasc.kgs.ukans.edu/dasc/
  Measuring Techniques: Vector
  Co-ordinate System: Universal Transverse Mercator
  ...
Kansas State
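One simple way to reconcile two models that name the same data differently is a per-model mapping onto a shared vocabulary. The sketch below uses the tag names from the FGDC and UDK example; the shared field names (`keywords`, `title`, and so on) and the function `normalize` are assumptions introduced here, not part of either standard.

```python
# Tag names as shown for each model; right-hand names are hypothetical.
FGDC_TO_COMMON = {
    "Theme keywords": "keywords",
    "Title": "title",
    "Online linkage": "url",
    "Direct Spatial Reference Method": "spatial_representation",
    "Horizontal Coordinate System Definition": "coordinate_system",
}
UDK_TO_COMMON = {
    "Search terms": "keywords",
    "Topic": "title",
    "Address Id": "url",
    "Measuring Techniques": "spatial_representation",
    "Co-ordinate System": "coordinate_system",
}


def normalize(record, mapping):
    """Rename a record's tags to the shared vocabulary; drop unknown tags."""
    return {mapping[tag]: value for tag, value in record.items() if tag in mapping}


fgdc = {
    "Title": "Dakota Aquifer",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}
udk = {
    "Topic": "Dakota Aquifer",
    "Measuring Techniques": "Vector",
    "Co-ordinate System": "Universal Transverse Mercator",
}

same = normalize(fgdc, FGDC_TO_COMMON) == normalize(udk, UDK_TO_COMMON)
```

A renaming table handles only differences in tag names; differences in value encodings or in meaning need a domain model, which is where ontologies come in.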
HP 39
Different views of Metadata
Domain Independent Specifications (RDF)
Frameworks/Infrastructures (XCM)
Application Specific: ICE
Media Specific: MPEG7, VoiceXML
Domain Specific: NewsML, FGDC/UDK
HP 40
Creating and Serving Metadata to Power the Life-cycle of Content
Taalee Infrastructure Services / Taalee Content Applications
Where is the content? Whose is it? -> Produce/Aggregate, Catalog/Index
What other content is it related to? -> Integrate, Syndicate
What is the right content for this user? -> Personalize
What is the best way to monetize this interaction? -> Interactive Marketing
Channels: Broadcast, Wireline, Wireless, Interactive TV
Taalee Semantic MetaBase
HP 41
Taalee's Intelligent Content Process
HP 42
Metadata Creation and Semanticization
• Automatic Content Classification/Categorization
• Metadata Creation/Extraction
• Types of metadata created
HP 43
Forms/Types/Ingest of Content
Sources: Web Sites, Content Feeds, and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)
HP 44
Content Handling/Ingest
Infrastructure: Exchange, Feed Handlers, Crawlers, Screen Scrapers, Bots/Software Agents
Centralized, Distributed, Mobile/Migratory
HP 45
Information Extraction for Metadata Creation
Global/Enterprise/Web Repositories
METADATA EXTRACTORS
Digital Maps
Nexis, UPI, AP
Documents
Digital Audios
Data Stores
Digital Videos
Digital Images
HP 46
Extracting a Text Document: Syntactic approach
INCIDENT MANAGEMENT SITUATION REPORT
Friday August 1, 1997 - 0530 MDT
NATIONAL PREPAREDNESS LEVEL II
CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.
SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr. The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.
CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.
LAYOUT
Date => day month int ',' int
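The layout rule for dates (day name, month name, integer, comma, integer) maps naturally onto a regular expression. The sketch below is one possible rendering in Python; the pattern, the group names, and the choice to make the comma optional (so that text which lost its punctuation still matches) are assumptions of this example rather than the extractor used in the workshop.

```python
import re

DAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"
MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")

# Date => day month int ',' int   (comma treated as optional here)
DATE = re.compile(
    rf"\b(?P<day>{DAYS})\s+(?P<month>{MONTHS})\s+"
    rf"(?P<dom>\d{{1,2}}),?\s+(?P<year>\d{{4}})\b"
)


def extract_date(text):
    """Return the first date found as a dict of its parts, or None."""
    m = DATE.search(text)
    return m.groupdict() if m else None


header = "INCIDENT MANAGEMENT SITUATION REPORT Friday August 1, 1997 - 0530 MDT"
found = extract_date(header)
```

Rules like this are purely syntactic: they find the date reliably in reports that follow the layout, but they say nothing about what the report is about.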
Traditional Text Categorization
Statistical/AI Techniques, trained on a Customer Training Set
Customer Article Feed (Article 4715) -> Classify / Place in a taxonomy -> Routing/Distribution
Standard Metadata:
  Feed Source: iSyndicate
  Posted Date: 11/20/2000

Knowledge-base & Statistical/AI Techniques
Classification of Article 4715 -> Classify / Place in a taxonomy -> Metadata Catalog -> Content Manager -> Precise syndication/filtering
Article 4715 Metadata
  Standard metadata:
    Feed Source: iSyndicate
    Posted Date: 11/20/2000
  Semantic metadata:
    Company Name: France Telecom, Equant
    Ticker Symbol: FTE, ENT
    Exchange: NYSE
    Topic: Company News
Taxonomy placements:
  FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
  NYSE: Member Companies, Market News, IPOs
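The step from standard metadata to semantic metadata can be illustrated with a toy knowledge-base lookup: entities recognized in the article text are expanded with facts the text itself may not state, such as ticker symbol and exchange. Everything here is a simplified assumption: the `KNOWLEDGE_BASE` table holds only the values shown for Article 4715, the article text is made up, and plain substring matching stands in for real entity extraction.

```python
# Hypothetical knowledge base; values taken from the Article 4715 example.
KNOWLEDGE_BASE = {
    "France Telecom": {"ticker": "FTE", "exchange": "NYSE"},
    "Equant": {"ticker": "ENT", "exchange": "NYSE"},
}


def enrich(article):
    """Merge standard metadata with semantic metadata derived from the text."""
    text = article["text"]
    # Naive entity spotting: a known company name appears in the text.
    companies = [name for name in KNOWLEDGE_BASE if name in text]
    semantic = {
        "companies": companies,
        "tickers": [KNOWLEDGE_BASE[c]["ticker"] for c in companies],
        "exchanges": sorted({KNOWLEDGE_BASE[c]["exchange"] for c in companies}),
    }
    return {**article["standard"], **semantic}


article = {
    "standard": {"feed_source": "iSyndicate", "posted_date": "11/20/2000"},
    # Made-up sample sentence, used only to exercise the lookup.
    "text": "France Telecom discussed its plans for Equant with analysts.",
}
metadata = enrich(article)
```

The ticker and exchange never appear in the sample sentence; they come from the knowledge base, which is the difference between semantic metadata and what a purely statistical classifier could attach.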
Automated Content Enrichment (ACE)
Taalee Enterprise Customization Suite
Taalee's Categorization & Automatic Metadata Creation
Training sets: Taalee Training Set, Customer Training Set
Article Feed (Article 4715) -> Classification of Article 4715 -> Routing/Distribution
ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
Map to another taxonomy
HP 49
Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.
Video Segment with Associated Text
Segment Description
Semantic Metadata
Auto Categorization
HP 50
Automatic Categorization & Metadata Tagging (Web page)
Video with Editorialized Text on the Web
Auto Categorization
Semantic Metadata
HP 51
Automatic Categorization & Metadata Tagging (Feed)
Text from Bloomberg
Auto Categorization
Semantic Metadata
HP 52
Taalee Extraction and Knowledgebase Enhancement
Extraction Agent
Web Page Enhanced Metadata Asset
HP 53
Basis for Semantics
A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary
B. Facts with Relationships: Taxonomy (Categories), Ontology/Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)
Knowledge Base
HP 54
Basis for Semantics
C. Reasoning/Inference
(Statistical) (Information Retrieval)
Statistical Learning/AI (Bayesian, Neural Networks, HMM, ...)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methods: Cluster Analysis
Learning/AI and Collab. Filtering
Reference data: Concept-terms, Dictionary/Thesaurus; by topic/industry/subject/domain; Word or Phrase
Ontologies/Domain Models
KnowledgeBase: by Entities and Relationships
(toward deeper understanding)
HP 56
Open Directory Project (ODP): Classification/Taxonomy & Directory
HP 57
Ontology
Standardize meaning, description, representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes: Attributes, Domain Rules, Functional Dependencies
HP 59
An Ontology
Example: Interrelated ontologies
[Ontology diagram; concept labels recovered from the figure]
LAND USE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL
LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE
WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; processes: shredding, magnetic separation, screening, washing
NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI; several concepts are linked by "causes" relationships
HP 61
Large Vocabularies, Taxonomies/Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.
Confidential HP
Metadata-enabled Applications
HP 63
Metadata Usage: Impact on Search & Query processing
traditional queries based on keywords
attribute-based queries
content-based queries
HP 64
Oingo.com
Oingo Ontology (ODP based?) - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely-related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged, and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:
• Context of the Content
• Semantic Metadata describing entities (i.e., Company, Industry, etc.), and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
"Normal" Content can only be "found" if the user enters a keyword that exists within it
Intelligent Content = Content + related metadata and relationships
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:
Context of the Content
Semantic Metadata describing entities (i.e., Company, Industry, etc.), and
Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, ...
HP 70
Metadata Usage Keyword Attribute and Content Based Access
The VisualHarness system at LSDISUGA
HP 71
Keyword Search vs Attribute Search with Semantic Metadata
Virage search on "football touchdown":
  Jimmy Smith Interview Part Seven. Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline
  Brian Griese Interview Part Four. Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline
Metadata from Typical Cataloging of Football Assets
  Rich Media Reference Page: Baltimore 31, Pit 24
  http://www.nfl.com
  Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter
Taalee Metadata on Football Assets
  League: Professional
  Teams: Ravens, Steelers
  Score: Bal 31, Pit 24
  Players: Qadry Ismail, Tony Banks
  Event: Touchdown
  Produced by: NFL.com
  Posted date: 2022000
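The contrast between keyword search and attribute search can be made concrete with a small sketch. The three assets below mirror the football examples; the record layout and the functions `keyword_search` and `attribute_search` are hypothetical and only meant to show why structured metadata gives more precise answers than matching words in a description.

```python
assets = [
    {"title": "Jimmy Smith Interview Part Seven",
     "description": "Jimmy Smith explains his philosophy on showboating",
     "metadata": {}},
    {"title": "Brian Griese Interview Part Four",
     "description": "Brian Griese talks about the first touchdown he ever threw",
     "metadata": {}},
    {"title": "Baltimore 31, Pit 24",
     "description": "Ismail and Banks hook up for their third long touchdown",
     "metadata": {"league": "Professional",
                  "teams": ["Ravens", "Steelers"],
                  "players": ["Qadry Ismail", "Tony Banks"],
                  "event": "Touchdown"}},
]


def keyword_search(items, word):
    """Match a word anywhere in the title or description."""
    word = word.lower()
    return [a["title"] for a in items
            if word in (a["title"] + " " + a["description"]).lower()]


def attribute_search(items, **wanted):
    """Match on named metadata fields; list fields match by membership."""
    def matches(meta):
        for key, value in wanted.items():
            have = meta.get(key)
            if isinstance(have, list):
                if value not in have:
                    return False
            elif have != value:
                return False
        return True
    return [a["title"] for a in items if matches(a["metadata"])]


keyword_hits = keyword_search(assets, "touchdown")
attribute_hits = attribute_search(assets, event="Touchdown", teams="Ravens")
```

The keyword query returns the interview that merely mentions a touchdown alongside the game highlight, while the attribute query returns only the asset whose event is a touchdown involving the Ravens.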
HP 72
Taalee's Semantic Search
Highly customizable, precise, and freshest A/V search
Context and Domain Specific Attributes; Uniform Metadata for Content from Multiple Sources; Can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do?
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY amp CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content
Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries & Hot Topics
PERSONALIZATION
3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
4 Pink Floyd Collection
Personalized Queries
Set the Controls for the Heart of the Sun...
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
2 My Football Fantasy Team
Gators' Spurrier ready for big game
Tech's Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone...
Edwards explains reasons for leaving BYU
more...
more...
more...
more...
1 Election 2000
2 Middle East Peace Conflict
3 Napster Controversy
Video Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Cole's security
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II
more...
HOT Topics
more...
more...
HP 80
Metadata Targeting
Semantic/Interactive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
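A targeting list like this one can be derived from relationships held in a knowledge base: given the asset the user is viewing, follow its "cast" relationship and generate one offer per related entity. The table `MOVIE_CAST` and the function `offers_for` below are hypothetical stand-ins for that lookup, populated only with the names shown in the example.

```python
# Hypothetical relationship table: movie title -> cast members.
MOVIE_CAST = {
    "The Insider": [
        "Al Pacino",
        "Russell Crowe",
        "Christopher Plummer",
        "Diane Venora",
        "Philip Baker Hall",
    ],
}


def offers_for(title):
    """Generate offers for a movie's cast, then for the movie itself."""
    cast = MOVIE_CAST.get(title, [])
    if not cast:
        return []  # unknown title: nothing to target
    return [f"Buy {person} Videos" for person in cast] + [f"Buy {title} Video"]


offers = offers_for("The Insider")
```

The offers come from structured relationships rather than from words on the page, which is why they stay on target even when the page text never mentions the cast.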
HP 82
Web: Extreme Personalization
Inputs: Realtime Feeds, Web sites and Pages, Content Databases
User profile: Interests, Preferences
Time-Shifted Content Aggregator
Semantic Engine(TM)
Structured Hi-Quality Semantic Metabase
Output: Personalized Content
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site
User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and the knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., an Analyst Call listing focuses on the source and the analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst
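One way to picture semantic filtering under a screen constraint is a per-content-type priority list of metadata fields, filled in until the line no longer fits. The sketch below assumes the listing entries are date, source, and analyst; the `FIELD_PRIORITY` table, the `render` function, and the width values are all illustrative assumptions, not the product's logic.

```python
# Hypothetical priority of fields for each content type.
FIELD_PRIORITY = {
    "Analyst Call": ["date", "source", "analyst"],
}


def render(item, width):
    """Join the highest-priority fields that fit within `width` characters."""
    parts = []
    for name in FIELD_PRIORITY.get(item["type"], ["date"]):
        candidate = " ".join(parts + [item[name]])
        if len(candidate) > width:
            break  # the next field would overflow the line
        parts.append(item[name])
    return " ".join(parts)


call = {"type": "Analyst Call", "date": "11/08", "source": "ON24",
        "analyst": "Payne", "rating": "Strong Buy"}

wide = render(call, 16)
narrow = render(call, 10)
```

Fields that are not in the priority list, such as the rating, are left for secondary cues like the icons mentioned above.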
HP 87
iTV: Taalee's Extreme Personalization
Content Provider (DBS, DISH, Wink, AOL-TV)
Semantic Engine(TM)
Meta-Data Tagged Content
Content "Programs"
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-Quality Semantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in
This screen is customizable with an interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO
Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata
The Conference Call itself can have embedded metadata to support personalization and interactivity
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support (Sony)
Categorize
Catalog
Integrate
Collection, Processing
Network Content
Affiliate Feeds
Public Sources
Rich Data Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers
• Greatly enhances newsroom productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction, a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrate Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore
HP 92
Metadata's role in emerging iTV infrastructure
[Diagram labels: Video; MPEG Encoder (MPEG-2/4/7); Create Scene Description Tree; Scene Description Tree (Node = AVO Object); Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; Taalee Semantic Engine; Enhanced XML Description; Node "Cisco Systems"; Object Content Information (OCI); Metadata-rich, Value-added Node; GREAT USER EXPERIENCE]
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers, and Broadcasters
Sample node metadata: Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: John Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta
HP 93
Intelligent Metadata Creation
Extractor Agents: content which does contain the words the user asked for
+ Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for
+ Semantic Associations: content the user did not think to ask for, but which he needs to know
Metadata for Intelligent Content Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.
• If a keyword is not in the content, it cannot be found.
• The burden is on the user to think of and ask for the "right" keyword.
For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".
Understanding of the content is needed to create new metadata.
Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata that describes content more completely.
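The idea of adding metadata that the source text never mentions can be sketched in a few lines of Python. This is only an illustration, not Taalee's implementation: the `ASSOCIATIONS` table, the field names, and both helper functions are hypothetical, and only the Roger Clemens facts come from the slide.

```python
# Hypothetical mini knowledge base of semantic associations. The facts come
# from the slide's Roger Clemens example; the structure is an assumption.
ASSOCIATIONS = {
    "Roger Clemens": {"type": "PERSON", "sport": "Baseball", "team": "New York Yankees"},
}


def add_value_added_metadata(extracted):
    """Return a copy of the extracted metadata enriched with associated facts."""
    enriched = {key: list(values) for key, values in extracted.items()}
    for player in extracted.get("players", []):
        facts = ASSOCIATIONS.get(player, {})
        for field, key in (("teams", "team"), ("sports", "sport")):
            value = facts.get(key)
            if value and value not in enriched.setdefault(field, []):
                enriched[field].append(value)
    return enriched


def matches(metadata, query):
    """True if the query is a case-insensitive substring of any metadata value."""
    q = query.lower()
    return any(q in value.lower() for values in metadata.values() for value in values)


# Metadata extracted directly from a story that never mentions the team.
story = {"players": ["Roger Clemens"]}
enriched_story = add_value_added_metadata(story)
```

With only the extracted metadata, a search for "Yankees" misses the story; after enrichment the same search finds it, which is the behavior the slide describes.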
HP 96
Guided Demo for Value Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that neither Team = Atlanta Falcons nor League = NFL was present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that neither Team = Los Angeles Dodgers nor League = National League was present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 - Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Posted Date: date of asset posting. Extracted automatically
Producer Name: name of content provider that produced the asset
Player Names: names of players mentioned explicitly in the asset. Extracted automatically
Team Name: name of team for which the player plays. Not mentioned explicitly in asset; value-added using Taalee's semantic relationships
League Name: name of league to which the player's team belongs. Not mentioned explicitly in asset; value-added by Taalee's processing based on semantic associations
Sport: name of sport
Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships
The asset is richly and fully described in the many ways users choose to interact with it
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content.
• They do not understand the meaning, context, or relationships of keywords.
For example, a search engine may see that the words "Commerce One" occur, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
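A rough sketch of how typed relationships can surface associated content, assuming a small in-memory store. The `SemanticModel` class, the relation names, and the two headlines in `NEWS` are hypothetical placeholders; only the Commerce One / Ariba relationships come from the slide.

```python
from collections import defaultdict


class SemanticModel:
    """Tiny typed-relationship store: (subject, relation) -> set of objects."""

    def __init__(self):
        self._edges = defaultdict(set)

    def add(self, subject, relation, obj, symmetric=False):
        self._edges[(subject, relation)].add(obj)
        if symmetric:
            self._edges[(obj, relation)].add(subject)

    def related(self, subject, relation):
        return sorted(self._edges.get((subject, relation), ()))


model = SemanticModel()
model.add("Commerce One", "participates_in", "Corporate, Professional & Financial Software")
model.add("Commerce One", "competes_with", "Ariba", symmetric=True)

# Hypothetical catalog: headline -> companies the story is about.
NEWS = {
    "Commerce One signs new marketplace deal": ["Commerce One"],
    "Ariba reports quarterly results": ["Ariba"],
}


def news_for(company):
    """Headlines whose metadata names the company (what the user asked for)."""
    return sorted(h for h, companies in NEWS.items() if company in companies)


def associated_news(company):
    """News about the company's competitors (what the user did not ask for)."""
    return {rival: news_for(rival) for rival in model.related(company, "competes_with")}
```

A keyword search for "Commerce One" would stop at `news_for`; `associated_news` follows the COMPETES WITH relationship to reach the Ariba story as well.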
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata-centric Content Management Architecture
Sources: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters)
Components: Extractor Agent 1; Extractor Agent 2; Extractor Agent 3; Semantic Engine; World Model; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); Third-party Content Management and Syndication
Output: XCM-compliant metadata, XML or other format
1. Cisco story from PW Source 1 is passed on to add semantic associations
2. Consults Knowledge Base for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds is picked for publishing as "semantically related" to the Cisco story and passed on to the Dashboard
Story on Cisco
Story on Lucent
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What you asked for + What you need to know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with competing PRODUCTS; COMPANIES in the same or a related INDUSTRY
Regulations (SEC, EPA): impacting INDUSTRY or filed by COMPANY
Technology, Products: important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic collation of semantically related digital media information from multiple sources
Research inferred automatically
Semantically related news not specifically asked for
Semantic search, personalization, etc.
A vision for the future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility.
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic rules which allow us to draw inferences from this data.
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]
A personal definition: Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods.
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner.
Based on RDF + XML
Agent-readable tags
www.daml.org
DAML Example
[Figure not reproduced]
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-layered Architecture of the Semantic Web
Logical Layer: formal semantics and reasoning support - OIL, DAML-O
Schema Layer: definition of vocabulary - RDF Schema
Data Layer: simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL - Evolving towards the Semantic Web
OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Tests May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.
NuclearTest Causes Earthquake <= dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30 AND distance(NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude) < 10000
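The rule above can be read as an executable predicate. The sketch below is one possible reading, under stated assumptions: the date difference is in days, the distance is a great-circle (haversine) distance in miles, and the earthquake must not precede the test. The `Event` class and function names are hypothetical.

```python
import math
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Event:
    """A nuclear test or an earthquake (hypothetical record layout)."""
    event_date: date
    latitude: float
    longitude: float


EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def distance_miles(a, b):
    """Great-circle (haversine) distance between two events, in miles."""
    lat1, lon1, lat2, lon2 = map(
        math.radians, (a.latitude, a.longitude, b.latitude, b.longitude)
    )
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(h)))


def could_have_caused(test, quake, max_days=30, max_miles=10000):
    """Slide rule: quake within max_days after the test and within max_miles."""
    days_after = (quake.event_date - test.event_date).days
    return 0 <= days_after < max_days and distance_miles(test, quake) < max_miles
```

The thresholds are parameters because the slide's values (30 and 10000) are part of the hypothesis being explored rather than fixed constants.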
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900.
Increase in number of earthquakes since 1945
Knowledge Discovery - Example ...
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, find the average number of earthquakes per year, starting from 1900.
The number of earthquakes with magnitude > 7 is almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7.
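The grouping query can be sketched as follows. This is a minimal illustration, assuming earthquakes arrive as (year, magnitude) pairs and that each range includes its lower bound; the function names are hypothetical and no real catalog data is used.

```python
from collections import Counter

# Ranges from the slide: 5.8-6, 6-7, 7-8, 8-9, >9.
# Treating each lower bound as inclusive is an assumption.
BUCKETS = [
    (5.8, 6.0, "5.8-6"),
    (6.0, 7.0, "6-7"),
    (7.0, 8.0, "7-8"),
    (8.0, 9.0, "8-9"),
    (9.0, float("inf"), ">9"),
]


def bucket_of(magnitude):
    """Return the label of the magnitude range, or None below 5.8."""
    for low, high, label in BUCKETS:
        if low <= magnitude < high:
            return label
    return None


def average_per_year(quakes, first_year, last_year):
    """quakes: iterable of (year, magnitude). Returns label -> mean count per year."""
    counts = Counter()
    for year, magnitude in quakes:
        label = bucket_of(magnitude)
        if label and first_year <= year <= last_year:
            counts[label] += 1
    years = last_year - first_year + 1
    return {label: counts[label] / years for _, _, label in BUCKETS}
```

Running this once for the years before 1945 and once for the years after would give the per-range comparison the slide draws its conclusion from.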
Knowledge Discovery - Example ...
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and the test was within a radius of 10000 miles from the epicenter of the earthquake.
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NewsML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
Semantic Web: www.semanticweb.org
VoiceXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 6
A metadata classification
Data (Heterogeneous Types/Media)
Content Independent Metadata (creation-date, location, type-of-sensor)
Content Dependent Metadata (size, max colors, rows, columns)
Direct Content Based Metadata (inverted lists, document vectors, WAIS, Glimpse, LSI)
Domain Independent (structural) Metadata (C++ class-subclass relationships, HTML/SGML Document Type Definitions, C program structure)
Domain Specific Metadata: area, population (Census); land-cover, relief (GIS); metadata concept descriptions from ontologies
Ontologies, Classifications, Domain Models
User
Move in this direction to tackle information overload
HP 7
Types of Metadata for digital media
Media type-specific metadata, e.g., texture of images, font size, ...
Media processing-specific metadata, e.g., search, retrieval, personalized filtering
Content-specific metadata, e.g., rocket-related video and documents
HP 8
Metadata for Digital Data
Metadata | Data Type | Metadata Type
Q-Features [Jain and Hampapur] | Image, Video | Domain Specific
R-Features [Jain and Hampapur] | Image, Video | Domain Independent
Meta-Features [Jain and Hampapur] | Image, Video | Content Independent
Impression Vector [Kiyoki et al.] | Image | Content Descriptive
NDVI, Spatial Registration [Anderson and Stonebraker] | Image | Domain Specific
Speech Feature Index [Glavitsch et al.] | Audio | Direct Content Based
Topic Change Indices [Chen et al.] | Audio | Direct Content Based
Document Vectors [Deerwester et al.] | Text | Direct Content Based
Inverted Indices [Kahle and Medlar] | Text | Direct Content Based
Content Classification Metadata [Bohm and Rakow] | MultiMedia | Domain Specific
Document Composition Metadata [Bohm and Rakow] | MultiMedia | Domain Independent
Metadata Templates [Ordille and Miller] | Media Independent | Domain Specific
Land Cover, Relief [Sheth and Kashyap] | Media Independent | Domain Specific
Parent Child Relationships [Shklar et al.] | Text | Domain Independent
Contexts [Sciore et al., Kashyap and Sheth] | Structured | Domain Specific
Concepts from Cyc [Collet et al.] | Structured | Domain Specific
User's Data Attributes [Shoens et al.] | Text, Structured | Domain Specific
Domain Specific Ontologies [Mena et al.] | Media Independent | Domain Specific
HP 9
Types of Specs and Standards (or MetaModels)
Domain Independent: (MCF), RDF, MOF, Dublin Core
Media Specific: MPEG4, MPEG7, VoiceXML
Domain/Industry Specific (metamodels): MARC (Library), FGDC and UDK (Geographic), NewsML (News), PRISM (Publishing)
Application Specific: ICE (Syndication)
Exchange/Sharing: XCM, XMI
Orthogonal (Other): RDFS, namespaces, ontologies, domain models (DAML, OIL)
HP 10
What RDF can do for metadata
Designed to impose structural constraints on syntax, to support consistent encoding, exchange, and processing of metadata
Domain-independent metadata standard
HP 11
RDF (Resource Description Framework)
Resource, Property, Value
• RDF data consists of nodes and attached attribute/value pairs.
• Nodes can be any web resources (pages, servers, basically anything for which you can give a URI), even other instances of metadata.
• Attributes are named properties of the nodes, and their values are either atomic (text strings, numbers, etc.) or other resources or metadata instances.
HP 12
RDF Example 1
URI:TALK -- dc:title --> "Mysteries of Metadata"
URI:TALK -- dc:creator --> URI:AMIT
<?xml version='1.0'?>
<rdf:RDF xmlns:rdf="http://www.w3.org/TR/REC-rdf-syntax"
         xmlns:dc="http://purl.org/dc/elements/1.0/">
  <rdf:Description rdf:about="URI:TALK">
    <dc:title>Mysteries of Metadata</dc:title>
    <dc:creator rdf:resource="URI:AMIT"/>
  </rdf:Description>
</rdf:RDF>
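To make the node and attribute/value model concrete, the sketch below parses an RDF/XML description like Example 1 into (resource, property, value) triples using Python's standard `xml.etree.ElementTree`. It is a simplified reading that only handles flat `rdf:Description` elements, not a general RDF parser. One substitution is made on purpose: the code uses the standard RDF syntax namespace URI rather than the one printed on the slide, and `URI:TALK` and `URI:AMIT` remain the slide's placeholders.

```python
import xml.etree.ElementTree as ET

# Assumption: the standard RDF namespace is used here instead of the slide's URI.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.0/"

RDF_XML = f"""<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:dc="{DC_NS}">
  <rdf:Description rdf:about="URI:TALK">
    <dc:title>Mysteries of Metadata</dc:title>
    <dc:creator rdf:resource="URI:AMIT"/>
  </rdf:Description>
</rdf:RDF>"""


def triples(rdf_xml):
    """Yield (resource, property, value) for each property of each rdf:Description."""
    root = ET.fromstring(rdf_xml)
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # A value is either another resource (rdf:resource) or literal text.
            value = prop.get(f"{{{RDF_NS}}}resource") or (prop.text or "").strip()
            yield subject, prop.tag, value


for subject, prop, value in triples(RDF_XML):
    print(subject, prop, value)
```

Each printed row corresponds to one arc in the slide's graph: the title arc ends in an atomic string, while the creator arc ends in another resource.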
HP 13
RDF Example 2
URI:TALK -- dc:title --> "Mysteries of Metadata"
URI:TALK -- dc:creator --> URI:AMIT
URI:AMIT -- BIB:Name --> "Amit Sheth"
URI:AMIT -- BIB:Email --> amit@taalee.com
URI:AMIT -- BIB:Aff --> URI:LIB
HP 14
RDFS (RDF Schema)
Enables resource description communities to define (and share) vocabularies (museum, library, e-commerce, ...)
Vocabulary (in RDFS) = the meaning, characteristics, and relationships of a set of properties
HP 15
RDF Based Web
HTML Resources
RDF/XML Descriptions
RDF Schemas
Source: http://www.w3c.rl.ac.uk
HP 16
Dublin Core Metadata Initiative
Simple element set designed for resource description
International, inter-discipline, W3C community consensus
"Semantic" interface among resource description communities (a very limited form of semantics)
Source: www.desire.org
HP 17
Dublin Core RDF
<xml>
<?namespace href = "http://w3.org/rdf-schema" as = "RDF"?>
<?namespace href = "http://metadata.net/DC" as = "DC"?>
<RDF:Abbreviated>
  <RDF:Assertion RDF:HREF = "http://www.mysite.com/mydoc.html"
    DC:Title = "I've Never Metadata I've Never Liked"
    DC:Creator = "Mary Crystal"
    DC:Subject = "Metadata, Dublin Core, Stuff"/>
</RDF:Abbreviated>
</xml>
HP 18
MOF (Meta Object Facility) and XMI
MOF models metadata using a subset of UML that is relevant to modeling metadata (class models: classes, associations, and subtyping), plus a set of rules for mapping the elements of the MOF Core to CORBA IDL
XML Metadata Interchange (XMI) is an extension of the MOF into the XML space
HP 19
NewsML
NewsML is a packaging and metadata format for news content.
NewsML is developed by the International Press Telecommunications Council (IPTC), a consortium of news providers, mostly in the print or wire-service industries.
Since it deals only with packaging and metadata, NewsML is complementary both to news content formats like NITF and to syndication protocols like ICE.
HP 20
NewsML ...
It can be used by news providers to combine their pictures, video, text, graphics, and audio files in news output available on web sites, mobile phones, high-end desktops, interactive television, and any other device.
Accurate, objective set of description tools which help qualify the information and make the search more precise.
NewsML allows a range of metadata to be attached to a multimedia story, including a detailed computer-readable description of what an item is about.
HP 21
Example of the end-to-end flow - NewsML
The content provider supplies NewsML-packaged media content to the operator. The content is categorized as current events, finance, sport, etc., and updated hourly.
The operator receives NewsML data from the content provider. The content server automatically pushes updated news articles to all news service subscribers.
Consumers sign up for the news service directly on the device. When using the news service, the user browses through the categories and reads the news articles. The news articles are presented in a continuous flow (one after the other) without end-user interaction.
Source: http://www.mediabricks.com
HP 22
PRISM
Publishing Requirements for Industry Standard Metadata
Version 1.0, April 2001
Authors: IDEAlliance (Adobe, Vignette, Kinecta, et al.)
Idea: "a standard for interoperable content description, interchange, and reuse in both traditional and electronic publishing contexts"
Web site: http://www.prismstandard.org
HP 23
PRISM Design
Built on existing standards like Dublin Core (DC), RDF, XML
Designed to be used in a simple, straightforward way over the Internet
Compatible with NewsML
Integrates easily with ICE (for syndication)
Vocabulary:
• Basic: DC
• Extensions: "Controlled Vocabularies", e.g., "North American Industrial Classification System" (NAICS)
HP 24
PRISM Example
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845"/>
    <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>
(Source: PRISM spec v. 1, http://www.prismstandard.org/techdev/prismspec1.asp)
HP 25
VoiceXML
A language for specifying voice dialogs
Voice dialogs use audio prompts and text-to-speech (TTS) for output; touch-tone keys (DTMF) and automatic speech recognition (ASR) for input
Goal is to bring the advantages of web-based development and content delivery to interactive voice response applications
High-level, voice-specific language simplifies application development
Source: http://www.voicexml.org
HP 26
Voice Based Internet Applications
Source: http://www.voicexml.org
HP 27
Voice XML Metadata
Voice-specific metadata
Supports syntactic interoperability
Text data to voice data
VoiceXML = XML + Voice Metadata
HP 28
VoiceXML ndash Possible Services
Information retrieval - news, sports, traffic, stock quotes
e-Transactions (e-commerce, e-tailing, etc.)
Financial: banking, stock trading
Catalog browsing (generally as an adjunct to paper)
Telephone services
Personal voice dialing, one-number "find-me" services
Intranet - inventory, HR services, corporate portals
Unification - "My Whatever" personal portals, personal agents, unified messaging
Source: http://www.voicexml.org
HP 29
MPEG7
A set of description schemes and descriptors to describe the content of multimedia data
Provides a language to specify description schemes
A scheme for coding the description
HP 30
Application Examples for MPEG7
A few application examples are:
Digital libraries (image catalog, musical dictionary)
Multimedia directory services (e.g., yellow pages)
Broadcast media selection (radio channel, TV channel)
HP 31
Information and Content Exchange (ICE)
Main Goal: an efficient and extensible content syndication protocol for the Internet, using XML syntax
Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.
Status: latest spec version 1.1, May 2000; submitted to W3C for review
Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, ...
Web Site: http://www.icestandard.org
HP 32
What is the ICE Protocol
Syndication protocol for communication between Syndicators and Subscribers
Metadata to define:
• roles and responsibilities of the involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver
• format and method of content exchange (e.g., sequenced packages, pull vs. push model)
HP 33
ICE Applications
ICE vocabulary + domain vocabulary = complete application
ICE establishes and manages the syndication, delivers data, and logs events => content-independent metadata
Industry-specific vocabulary defines the content => domain-specific metadata
Source: http://www.icestandard.org
HP 34
ICE Explained
ICE: Information and Content Exchange protocol
Syndicator: a content aggregator and distributor
Subscriber: a content consumer
Subscription: an agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement
Collection: the current content of a subscription
ICE Package: a delivery of commands to update a collection, such as the addition of content items
ICE Payload: the XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information
Sources: InternetWeek; ICE Cookbook version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm
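The relationship between these terms can be illustrated with a small data model. This is only a conceptual sketch, not the ICE wire protocol: the class names mirror the glossary, but the fields, the integer sequence number, and the `apply` method are hypothetical simplifications of "sequenced packages" that update a collection.

```python
from dataclasses import dataclass, field


@dataclass
class Package:
    """A delivery of commands that update a collection."""
    sequence: int
    add: dict = field(default_factory=dict)     # item_id -> content to add
    remove: list = field(default_factory=list)  # item_ids to drop


@dataclass
class Subscription:
    """Agreement between a subscriber and a syndicator, plus its collection."""
    subscriber: str
    syndicator: str
    collection: dict = field(default_factory=dict)  # current content
    last_sequence: int = 0

    def apply(self, package):
        """Apply a package in sequence order; skip ones already applied."""
        if package.sequence <= self.last_sequence:
            return False
        for item_id in package.remove:
            self.collection.pop(item_id, None)
        self.collection.update(package.add)
        self.last_sequence = package.sequence
        return True


sub = Subscription(subscriber="example-subscriber", syndicator="example-syndicator")
sub.apply(Package(sequence=1, add={"4321": "Cartoon"}))
```

Tracking the last applied sequence number lets a subscriber ignore a redelivered package, whether it arrived by push or was pulled on request.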
<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe"
             timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321"
                subscription-element="4321" name="Cartoon" filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru" copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC ... (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>
Content (domain-specific metadata)
HP 36
XCM (eXtended Content Management)
A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:
Content Development - developing static content and managing the process of its subsequent approval, versioning, storage, and retrieval
Application Content Management (Vignette) - deploying content dynamically to a Web site and managing that content throughout its online lifecycle
Content Delivery - delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability
Source: http://www.vignette.com/CDA/Site/0,2097,1-1-30-1458-1146-1743,00.html
HP 37
XCM: eXtended Content Management
Content Development/Management: Content Authoring, Digital Asset Management, Software Configuration Management, Document Process Management
Application Content Management: Metadata Management, Recombination, Personalization
Content Delivery: Edge Network Delivery, Streaming Media Delivery, Caching
Source: http://www.vignette.com
HP 38
Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain
FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc/
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
...
UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Address Id: http://gisdasc.kgs.ukans.edu/dasc/
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
...
Kansas State
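One way to reconcile the two models is a per-model mapping from tag names to a shared vocabulary. The sketch below assumes such a mapping is maintained by hand; the tag names and values come from the slide, while the common field names (`keywords`, `title`, and so on) and the `normalize` helper are hypothetical.

```python
# Tag names are taken from the slide; the common field names are hypothetical.
FGDC_TO_COMMON = {
    "Theme keywords": "keywords",
    "Title": "title",
    "Online linkage": "url",
    "Direct Spatial Reference Method": "spatial_method",
    "Horizontal Coordinate System Definition": "coordinate_system",
}
UDK_TO_COMMON = {
    "Search terms": "keywords",
    "Topic": "title",
    "Address Id": "url",
    "Measuring Techniques": "spatial_method",
    "Co-ordinate System": "coordinate_system",
}


def normalize(record, mapping):
    """Rename a record's tags to the common vocabulary, dropping unmapped tags."""
    return {mapping[tag]: value for tag, value in record.items() if tag in mapping}


fgdc_record = {
    "Theme keywords": "digital line graph, hydrography, transportation",
    "Title": "Dakota Aquifer",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}
udk_record = {
    "Search terms": "digital line graph, hydrography, transportation",
    "Topic": "Dakota Aquifer",
    "Measuring Techniques": "Vector",
    "Co-ordinate System": "Universal Transverse Mercator",
}
```

Once both records are normalized they compare equal, so a single query over the common fields can serve either model.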
HP 39
Different views of Metadata
Domain Independent Specifications (RDF)
Frameworks/Infrastructures (XCM)
Metadata
Application Specific: ICE
Media Specific: MPEG7, VoiceXML
Domain Specific: NewsML, FGDC/UDK
HP 40
Creating and Serving Metadata to Power the Life-cycle of Content
Taalee Infrastructure Services; Taalee Content Applications
Where is the content? Whose is it?
Produce, Aggregate
Catalog/Index
What other content is it related to?
Integrate, Syndicate
What is the right content for this user?
Personalize
What is the best way to monetize this interaction?
Interactive Marketing
Broadcast, Wireline, Wireless, Interactive TV
Taalee Semantic MetaBase
HP 41
Taalee's Intelligent Content Process
HP 42
Metadata Creation and Semanticization
• Automatic Content Classification/Categorization
• Metadata Creation/Extraction
• Types of metadata created
Semantic Engine and WorldModel are trademarks of Taalee Inc. Metadata Extraction is a patented technology of Taalee Inc.
HP 43
Forms/Types/Ingest of Content
Sources: Web Sites, Content Feeds, and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+ Media); Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)
HP 44
Content Handling/Ingest
Infrastructure/Exchange: Feed Handlers, Crawlers, Screen Scrapers, Bots, Software Agents
Centralized, Distributed, Mobile/Migratory
HP 45
Information Extraction for Metadata Creation
Global/Enterprise/Web Repositories
METADATA EXTRACTORS
Digital Maps
Nexis, UPI, AP
Documents
Digital Audios
Data Stores
Digital Videos
Digital Images
HP 46
Extracting a Text DocumentExtracting a Text DocumentSyntactic approachSyntactic approach
INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT
NATIONAL PREPAREDNESS LEVEL II
CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.
SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr. The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.
CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.
LAYOUT
Date => day month int ',' int
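A layout rule of this kind can be sketched as a regular expression. The following is only an illustration of the syntactic approach, not the extractor used in the talk: the names DATE_RULE and extract_date are hypothetical, and the commas are treated as optional because the report header may arrive with its punctuation stripped.

```python
import re

DAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"
MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")

# Date => day month int ',' int
# Assumption: commas are optional so the rule also matches text whose
# punctuation was lost during ingest.
DATE_RULE = re.compile(
    rf"(?P<day>{DAYS}),?\s+(?P<month>{MONTHS})\s+"
    rf"(?P<dom>\d{{1,2}}),?\s+(?P<year>\d{{4}})"
)


def extract_date(text):
    """Return the first date matching the layout rule as a dict, or None."""
    match = DATE_RULE.search(text)
    if match is None:
        return None
    return {
        "day": match["day"],
        "month": match["month"],
        "day_of_month": int(match["dom"]),
        "year": int(match["year"]),
    }


header = "INCIDENT MANAGEMENT SITUATION REPORT Friday August 1 1997 - 0530 MDT"
print(extract_date(header))
```

A purely syntactic rule like this finds the date by its shape alone; it knows nothing about what the report is about, which is the limitation the later slides on semantic metadata address.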
Traditional Text Categorization
Statistical/AI Techniques
Classify/Place in a taxonomy
Customer Training Set
Customer Article Feed: Article 4715
Classification of Article 4715
Routing/Distribution
Standard Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Taalee's Categorization & Automatic Metadata Creation
Automated Content Enrichment (ACE)
Taalee Enterprise Customization Suite
Knowledge-base & Statistical/AI Techniques
Classify/Place in a taxonomy
Taalee Training Set
Customer Training Set
Article Feed: Article 4715
Classification of Article 4715
FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
NYSE: Member Companies, Market News, IPOs
Map to another taxonomy
Routing/Distribution
Metadata Catalog
Content Manager
Precise syndication/filtering
Article 4715 Metadata
Standard metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Semantic metadata
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News
HP 49
Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION AS OF TONIGHT THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49 IN WASHINGTON STATE THE SENATE RACE REMAINS TOO CLOSE TO CALL IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT THE SENATE WILL BE EVENLY DIVIDED IN MISSOURI REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN WHO DIED IN A CRASH THREE WEEKS AGO GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT HILLARY CLINTON WON THE NEW YORK SENATE SEAT SHE IS THE FIRST FIRST LADY TO RUN MUCH LESS WIN
Video Segment with Associated Text
Segment Description
Semantic Metadata
AutoCategorization
HP 50
Automatic Categorization & Metadata Tagging (Web page)
Video with Editorialized Text on the Web
AutoCategorization
Semantic Metadata
HP 51
Automatic Categorization & Metadata Tagging (Feed)
Text from Bloomberg
AutoCategorization
Semantic Metadata
HP 52
Taalee Extraction and Knowledgebase Enhancement
Extraction Agent
Web Page Enhanced Metadata Asset
HP 53
Basis for Semantics
A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary
B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)
Knowledge Base
HP 54
Basis for Semantics
C. Reasoning/Inference
(Statistical) (Information Retrieval)
Statistical Learning/AI (Bayesian, Neural Networks, HMM, …)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methods: Cluster Analysis
Learning/AI and Collab. Filtering
Reference data: Concept-terms, Dictionary, Thesaurus; by topic/industry/subject/domain
Word or Phrase
Ontologies/Domain Models
KnowledgeBase: by Entities and Relationships
deeper understanding
HP 56
Open Directory Project (ODP): Classification/Taxonomy & Directory
HP 57
Ontology
Standardize meaning, description, representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes:
Attributes
Domain Rules
Functional Dependencies
HP 59
An Ontology
Example: Interrelated ontologies
[Figure: node labels recovered from the diagram]
LANDUSE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL
LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE
WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; shredding, magnetic separation, screening, washing
NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI (linked by "causes" relationships)
HP 61
Large Vocabularies, Taxonomies, Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.
Confidential HP
Metadata-enabled Applications
HP 63
Metadata Usage: Impact on Search & Query processing
traditional queries based on keywords
attribute based queries
content-based queries
HP 64
Oingo.com
Oingo Ontology - ODP based (?): the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek: the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense: the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua: the language of meaning used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely-related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:
• Context of the Content
• Semantic Metadata describing entities (e.g., Company, Industry, etc.) and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
"Normal" Content can only be "found" if the user enters a keyword that exists within it
Intelligent Content = +
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:
Context of the Content
Semantic Metadata describing entities (e.g., Company, Industry, etc.) and
Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering; link and other analysis (e.g., Google's Link Flux analysis); classification as context; ontologies; metadata; knowledgebases; …
HP 70
Metadata Usage: Keyword, Attribute and Content Based Access
The VisualHarness system at LSDIS/UGA
HP 71
Keyword Search vs Attribute Search with Semantic metadata
Virage Search on "football touchdown"
Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline
Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline
Metadata from Typical Cataloging of Football Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31 Pit 24
http://www.nfl.com
Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Qadry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
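The contrast on this slide can be sketched in a few lines. This is a minimal illustration, not the Virage or Taalee implementation: the function names are hypothetical, the asset records are abbreviated from the slide, and the metadata attached to the interview clip is an assumption made for the example.

```python
# Two assets, abbreviated from the slide. The "Event" value on the
# interview clip is an assumption added for illustration.
ASSETS = [
    {
        "title": "Baltimore 31 Pit 24",
        "description": "third long touchdown, this time on a 76-yarder",
        "metadata": {
            "League": "Professional",
            "Teams": ["Ravens", "Steelers"],
            "Event": "Touchdown",
            "Produced by": "NFL.com",
        },
    },
    {
        "title": "Brian Griese Interview Part Four",
        "description": "Brian Griese talks about the first touchdown he ever threw",
        "metadata": {"Event": "Interview"},
    },
]


def keyword_search(assets, word):
    """Syntactic search: match the word anywhere in title or description."""
    word = word.lower()
    return [a["title"] for a in assets
            if word in (a["title"] + " " + a["description"]).lower()]


def attribute_search(assets, criteria):
    """Attribute search: every requested field must equal or contain the value."""
    def field_matches(metadata, field, wanted):
        value = metadata.get(field)
        if isinstance(value, list):
            return wanted in value
        return value == wanted

    return [a["title"] for a in assets
            if all(field_matches(a["metadata"], f, w) for f, w in criteria.items())]


print(keyword_search(ASSETS, "touchdown"))             # both assets match
print(attribute_search(ASSETS, {"Event": "Touchdown"}))  # only the game highlight
```

The keyword query returns the interview because the word appears in its description; the attribute query returns only the asset whose Event field says it actually is a touchdown.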
HP 72
Taalee's Semantic Search
Highly customizable, precise and freshest A/V search
Context and Domain Specific Attributes; Uniform Metadata for Content from Multiple Sources; Can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY amp CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content
Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries amp Hot Topics
PERSONALIZATION
3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
4 Pink Floyd Collection
Personalized Queries
Set the Controls for the Heart of the Sun…
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
2 My Football Fantasy Team
Gators' Spurrier ready for big game
Tech's Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone…
Edwards explains reasons for leaving BYU more…
more…
more…
more…
1 Election 2000
2 Middle East Peace Conflict
3 Napster Controversy
Video Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Cole's security
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II more…
HOT Topics
more…
more…
HP 80
Metadata Targeting
Semantic/Interactive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
Realtime Feeds
Web sites and Pages
Content Databases
Time-Shifted Content Aggregator
Interests, Preferences
Semantic Engine™
Structured Hi-Quality Semantic Metabase
Personalized Content
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site
User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and the knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions listen set alert on story add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst
HP 87
iTV: Taalee's Extreme Personalization
Content Provider
(DBS, DISH, Wink, AOL-TV)
Semantic Engine™
Meta-Data Tagged Content
Content "Programs"
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-Quality Semantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in
This screen is customizable with an interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO
Part of the screen can be automatically customized to show conference call specific information, including transcript, participation, etc., all of which are relevant metadata
The Conference Call itself can have embedded metadata to support personalization and interactivity
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support
Sony
Categorize
Catalog
Integrate
Collection, Processing
Network Content
Affiliate Feeds
Public Sources
Rich Data Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were
-- Breaking News for 11/30/2000 --
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
Customize Page: Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(133) - 120600 - ABC
(253) - 120600 - CBS
(516) - 120600 - ABC
(246) - 120600 - FOX
(133) - 120600 - NBC
(533) - 120600
(357) - 120600 - CBS
(427) - 120600 - ABC
(344) - 120600 - FOX
(724) - 120600 - CBS
(133) - 120600 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday, the judge in the
(133) - 120600 - ABC
(233) - 120600 - CBS
(312) - 120600 - NNS
(032) - 120600 - CBS
(133) - 120600 - CBS
Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore
HP 92
Metadata's role in emerging iTV infrastructure
[Figure: labels recovered from the diagram]
Video
MPEG-2/4/7
MPEG Encoder
Enhanced Digital Cable
MPEG Decoder
Create Scene Description Tree
Retrieve Scene Description Track
Scene Description Tree
Node = AVO Object
Node: "Cisco Systems"
Taalee Semantic Engine
Enhanced XML Description: "Cisco Systems"
Object Content Information (OCI)
Metadata-rich Value-added Node
Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: John Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta
GREAT USER EXPERIENCE
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers and Broadcasters
HP 93
Intelligent Metadata Creation
Metadata for Intelligent Content
Content which does contain the words the user asked for (Extractor Agents)
+
Content which does not contain the words the user asked for, but is about what he asked for (Value-added Metadata)
+
Content the user did not think to ask for, but which he needs to know (Semantic Associations)
Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example: If a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"
Understanding of the content is needed to create new metadata
Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
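The Roger Clemens example can be sketched as a lookup against a small knowledge base. This is an illustration under stated assumptions, not Taalee's Semantic Engine: the triple format, the relationship names, the field mapping and the enrich function are all hypothetical.

```python
# Hypothetical knowledge base of (subject, relationship, object) facts,
# taken from the Roger Clemens example on the slide.
KNOWLEDGE_BASE = [
    ("Roger Clemens", "is_a", "PERSON"),
    ("Roger Clemens", "plays_sport", "Baseball"),
    ("Roger Clemens", "plays_for", "New York Yankees"),
]

# Assumed mapping from relationship to the metadata field it fills.
RELATION_TO_FIELD = {"plays_sport": "Sport", "plays_for": "Team"}


def enrich(metadata):
    """Return a copy of the metadata with fields implied by the knowledge base."""
    enriched = {field: list(values) for field, values in metadata.items()}
    for player in metadata.get("Players", []):
        for subject, relation, obj in KNOWLEDGE_BASE:
            field = RELATION_TO_FIELD.get(relation)
            if subject == player and field is not None:
                values = enriched.setdefault(field, [])
                if obj not in values:
                    values.append(obj)
    return enriched


# Metadata extracted from a story that never mentions the Yankees.
extracted = {"Players": ["Roger Clemens"]}
story = enrich(extracted)
print(story)

# A search on the team now finds the story.
print("New York Yankees" in story.get("Team", []))
```

The extracted metadata is left untouched and the value-added fields are added alongside it, so it stays possible to tell what came from the asset and what came from the knowledge base.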
HP 96
Guided Demo for Value Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 - Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Producer Name: name of content provider that produced the asset
Posted Date: date of asset posting - extracted automatically
Player Names: names of players mentioned explicitly in the asset - extracted automatically
Team Name: name of team for which player plays - not mentioned explicitly in asset - value-added using Taalee's semantic relationships
League Name: name of league to which the player's team belongs - not mentioned explicitly in asset - value-added by Taalee's processing based on semantic associations
Sport: name of sport
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships
The asset is richly and fully described in the many ways the users choose to interact
Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context or relationships of keywords
For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
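The Commerce One example can be sketched as a query over stored associations. This is a hedged illustration only: the association triples come from the slide, but the relationship names, the placeholder news items and the search function are assumptions, and competes_with is treated as symmetric for simplicity.

```python
# Associations from the slide, stored as (subject, relationship, object).
ASSOCIATIONS = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Commerce One", "participates_in",
     "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]

# Placeholder news items, invented for the example.
NEWS = [
    {"headline": "Placeholder story about Commerce One", "companies": ["Commerce One"]},
    {"headline": "Placeholder story about Ariba", "companies": ["Ariba"]},
    {"headline": "Placeholder story about Cisco Systems", "companies": ["Cisco Systems"]},
]


def competitors(company):
    """Companies linked to `company` by competes_with, in either direction."""
    found = set()
    for subject, relation, obj in ASSOCIATIONS:
        if relation == "competes_with":
            if subject == company:
                found.add(obj)
            elif obj == company:
                found.add(subject)
    return sorted(found)


def search(company):
    """Return what was asked for plus news on the company's competitors."""
    asked_for = [n["headline"] for n in NEWS if company in n["companies"]]
    rivals = competitors(company)
    related = [n["headline"] for n in NEWS
               if company not in n["companies"]
               and any(rival in n["companies"] for rival in rivals)]
    return {"asked_for": asked_for, "related": related}


print(search("Commerce One"))
```

A keyword index could return only the first list; the second list exists because the association between the two companies is stored as data rather than left implicit in the text.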
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (To view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata-centric Content Management Architecture
[Figure: labels recovered from the diagram]
Sources: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters)
Extractor Agent 1, Extractor Agent 2, Extractor Agent 3
Semantic Engine
World Model
Taalee Metabase
XCM-compliant metadata, XML or other format
Semantic Application (ASP/Enterprise hosted)
Third-party Content Mgmt and Syndication
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults KnowledgeBase for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard
Story on Cisco
Story on Lucent
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
SEC, EPA Regulations: Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling obviously not enough
we need a "logic layer" on top of RDF
some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture" [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]
A personal definition
Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)
a proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF+XML
Agent readable Tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three layered Architecture Of Semantic Web
Logical Layer
Formal Semantics and Reasoning Support - OIL, DAML-O
Schema Layer
Definition of Vocabulary - RDF Schema
Data Layer
Simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL - Evolving towards Semantic Web
OIL Mission
OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Test May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake <=
  dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
  AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
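The rule above can be sketched as a predicate over two events. This is a minimal illustration, not the InfoQuilt implementation: the Event class and function names are hypothetical, the thresholds are read as days and miles (following the query wording later in the deck), the earthquake is required to come after the test as the prose states, and the great-circle (haversine) formula is an assumed choice for distance().

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


@dataclass
class Event:
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    # min() guards against rounding pushing sqrt(a) slightly above 1.
    return 2 * EARTH_RADIUS_MILES * asin(min(1.0, sqrt(a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """NuclearTest Causes Earthquake <= date and distance conditions hold."""
    days_after = (quake.event_date - test.event_date).days
    near = distance_miles(test.latitude, test.longitude,
                          quake.latitude, quake.longitude) < max_miles
    return 0 <= days_after < max_days and near


# Made-up events for illustration only.
nuclear_test = Event(date(2000, 1, 1), 0.0, 0.0)
earthquake = Event(date(2000, 1, 15), 0.0, 10.0)
print(may_have_caused(nuclear_test, earthquake))
```

Keeping the thresholds as parameters makes it easy to rerun the same relationship with a tighter window or radius when exploring the data.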
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year, starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example…
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale per year, starting from 1900, find the average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
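The two queries above (counts per year, then averages per magnitude range) can be sketched as follows. This is only an illustration: the function names and sample records are made up, each range is treated as including its lower bound and excluding its upper bound, and the top range starts at 9.0, which is an assumption about how the boundaries were handled.

```python
from collections import Counter

# Magnitude ranges from the slide; lower bound inclusive, upper exclusive.
BANDS = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, float("inf"))]


def band_of(magnitude):
    """Return the (low, high) range containing the magnitude, or None."""
    for low, high in BANDS:
        if low <= magnitude < high:
            return (low, high)
    return None


def count_by_year_and_band(quakes, start_year=1900):
    """Count earthquakes per (year, magnitude range) from start_year on."""
    counts = Counter()
    for year, magnitude in quakes:
        band = band_of(magnitude)
        if year >= start_year and band is not None:
            counts[(year, band)] += 1
    return counts


def average_per_year(counts, band, first_year, last_year):
    """Average yearly count for one magnitude range over a span of years."""
    total = sum(counts[(year, band)] for year in range(first_year, last_year + 1))
    return total / (last_year - first_year + 1)


# Made-up (year, magnitude) records for illustration only.
sample = [(1940, 6.1), (1940, 7.2), (1950, 6.5), (1950, 6.8), (1951, 5.9), (1899, 8.0)]
counts = count_by_year_and_band(sample)
print(counts[(1950, (6.0, 7.0))])
print(average_per_year(counts, (6.0, 7.0), 1950, 1951))
```

Comparing average_per_year for each range before and after a chosen year is one way to reproduce the kind of comparison the slide draws between magnitude groups.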
Knowledge Discovery - Example…
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/9910-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 3
What is Metadata
Data about dataStatements contextsRecursive ndash data about ldquodata about datardquo
ApplicationsContent managementCataloguingInformation retrieval searchhellip
A Web content repository without metadata is like a library without an index - Jack Jia IWOV
HP 4
Information Interoperabilitykey metadata objective and benefit
System
Syntax
Structure
Semantics Protocols Metadata Domain ModelingOntologies
HP 5
Semantics
Meaning Understanding
Facts Context Reasoning
Related to exchange usage application
HP 6
A metadata classification
Data (heterogeneous types/media)
Content-independent metadata (creation date, location, type of sensor)
Content-dependent metadata (size, max colors, rows, columns)
Direct content-based metadata (inverted lists, document vectors, WAIS, Glimpse, LSI)
Domain-independent (structural) metadata (C++ class-subclass relationships, HTML/SGML Document Type Definitions, C program structure)
Domain-specific metadata: area, population (Census); land cover, relief (GIS); metadata concept descriptions from ontologies
Ontologies, classifications, domain models
User
Move in this direction to tackle information overload
HP 7
Types of Metadata for digital media
Media type-specific metadata, e.g., texture of images, font size, ...
Media processing-specific metadata, e.g., search, retrieval, personalized filtering
Content-specific metadata, e.g., rocket-related video and documents
HP 8
Metadata for Digital Data
Metadata | Data Type | Metadata Type
Q-Features [Jain and Hampapur] | Image, Video | Domain Specific
R-Features [Jain and Hampapur] | Image, Video | Domain Independent
Meta-Features [Jain and Hampapur] | Image, Video | Content Independent
Impression Vector [Kiyoki et al.] | Image | Content Descriptive
NDVI, Spatial Registration [Anderson and Stonebraker] | Image | Domain Specific
Speech Feature Index [Glavitsch et al.] | Audio | Direct Content Based
Topic Change Indices [Chen et al.] | Audio | Direct Content Based
Document Vectors [Deerwester et al.] | Text | Direct Content Based
Inverted Indices [Kahle and Medlar] | Text | Direct Content Based
Content Classification Metadata [Bohm and Rakow] | MultiMedia | Domain Specific
Document Composition Metadata [Bohm and Rakow] | MultiMedia | Domain Independent
Metadata Templates [Ordille and Miller] | Media Independent | Domain Specific
Land Cover, Relief [Sheth and Kashyap] | Media Independent | Domain Specific
Parent Child Relationships [Shklar et al.] | Text | Domain Independent
Contexts [Sciore et al., Kashyap and Sheth] | Structured | Domain Specific
Concepts from Cyc [Collet et al.] | Structured | Domain Specific
User's Data Attributes [Shoens et al.] | Text, Structured | Domain Specific
Domain Specific Ontologies [Mena et al.] | Media Independent | Domain Specific
HP 9
Types of Specs and Standards (or MetaModels)
Domain Independent: (MCF), RDF, MOF, Dublin Core
Media Specific: MPEG4, MPEG7, VoiceXML
Domain/Industry Specific (metamodels): MARC (Library), FGDC and UDK (Geographic), NewsML (News), PRISM (Publishing)
Application Specific: ICE (Syndication)
Exchange/Sharing: XCM, XMI
Orthogonal (Other): RDFS, namespaces, ontologies, domain models (DAML, OIL)
HP 10
What RDF can do for metadata
Designed to impose structural constraints on syntax to support consistent encoding, exchange, and processing of metadata
Domain-independent metadata standard
HP 11
RDF (Resource Description Framework)
Property, Value, Resource
• RDF data consists of nodes and attached attribute/value pairs
• Nodes can be any web resources (pages, servers, basically anything for which you can give a URI), even other instances of metadata
• Attributes are named properties of the nodes, and their values are either atomic (text strings, numbers, etc.) or other resources or metadata instances
HP 12
RDF Example 1
Figure: RDF graph in which the resource URI:TALK has dc:title "Mysteries of Metadata" and dc:creator URI:AMIT

<?xml version='1.0'?>
<rdf:RDF xmlns:rdf = "http://www.w3.org/TR/REC-rdf-syntax"
         xmlns:dc = "http://purl.org/dc/elements/1.0/">
  <rdf:Description rdf:about = "URI:TALK">
    <dc:title>Mysteries of Metadata</dc:title>
    <dc:creator rdf:resource = "URI:AMIT"/>
  </rdf:Description>
</rdf:RDF>
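
As a hedged illustration of this example, the Python sketch below builds the same description with the standard library and reads it back as (resource, property, value) statements. The function names are hypothetical, the placeholder URIs URI:TALK and URI:AMIT come from the slide, and the namespace URIs are the commonly published RDF and Dublin Core 1.1 ones, which is an assumption because the slide shows earlier forms.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs (the slide shows earlier variants of both).
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"
PREFIXES = {RDF_NS: "rdf", DC_NS: "dc"}

ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)


def describe_talk(talk_uri, title, creator_uri):
    """Build an rdf:RDF element describing one resource (hypothetical helper)."""
    root = ET.Element("{%s}RDF" % RDF_NS)
    desc = ET.SubElement(root, "{%s}Description" % RDF_NS,
                         {"{%s}about" % RDF_NS: talk_uri})
    # Atomic value: the title is a plain text string.
    ET.SubElement(desc, "{%s}title" % DC_NS).text = title
    # Resource value: the creator is another node, referenced by URI.
    ET.SubElement(desc, "{%s}creator" % DC_NS,
                  {"{%s}resource" % RDF_NS: creator_uri})
    return root


def short_name(tag):
    """Turn '{namespace}local' into 'prefix:local' for readability."""
    namespace, local = tag[1:].split("}")
    return "%s:%s" % (PREFIXES.get(namespace, namespace), local)


def to_triples(root):
    """List (resource, property, value) statements found in the document."""
    triples = []
    for desc in root.findall("{%s}Description" % RDF_NS):
        subject = desc.get("{%s}about" % RDF_NS)
        for prop in desc:
            value = prop.get("{%s}resource" % RDF_NS, prop.text)
            triples.append((subject, short_name(prop.tag), value))
    return triples


talk = describe_talk("URI:TALK", "Mysteries of Metadata", "URI:AMIT")
for triple in to_triples(talk):
    print(triple)
```

Each printed tuple corresponds to one arc of the graph: the node, the named property, and either an atomic value or another resource.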
HP 13
RDF Example 2
Figure: RDF graph with nodes URI:TALK, URI:AMIT, and URI:LIB
Properties shown: dc:title ("Mysteries of Metadata"), dc:creator (URI:AMIT), BIB:Name ("Amit Sheth"), BIB:Email (amit@taalee.com), BIB:Aff
HP 14
RDFS (RDF Schema)
Enables resource description communities to define (and share) vocabularies (museum, library, e-commerce, ...)
Vocabulary (in RDFS) = the meaning, characteristics, and relationships of a set of properties
HP 15
RDF Based Web
HTML
Resources
RDF/XML Descriptions
RDF Schemas
Source: http://www.w3c.rl.ac.uk
HP 16
Dublin Core Metadata Initiative
Simple element set designed for resource description
International, inter-discipline, W3C community consensus
"Semantic" interface among resource description communities (very limited form of semantics)
Source: www.desire.org
HP 17
Dublin Core RDF
<xml>
<namespace href = "http://w3.org/rdf-schema" as = "RDF">
<namespace href = "http://metadata.net/DC" as = "DC">
<RDF:Abbreviated>
  <RDF:Assertion RDF:HREF = "http://www.mysite.com/mydoc.html"
    DC:Title = "I've Never Metadata I've Never Liked"
    DC:Creator = "Mary Crystal"
    DC:Subject = "Metadata, Dublin Core, Stuff">
</RDF:Abbreviated>
</xml>
HP 18
MOF (Meta Object Facility) and XMI
MOF models metadata using a subset of UML that is relevant to modeling metadata (class models: classes, associations, and subtyping), plus a set of rules for mapping the elements of the MOF Core to CORBA IDL
XML Metadata Interchange (XMI) is an extension of the MOF into the XML space
HP 19
NewsML
NewsML is a packaging and metadata format for news content.
NewsML is developed by the International Press Telecommunications Council (IPTC), a consortium of news providers, mostly in the print or wire-service industries.
Since it deals only with packaging and metadata, NewsML is complementary both to news content formats like NITF and to syndication protocols like ICE.
HP 20
NewsML...
It can be used by news providers to combine their pictures, video, text, graphics, and audio files in news output available on web sites, mobile phones, high-end desktops, interactive television, and any other device.
It offers an accurate, objective set of description tools which help qualify the information and make the search more precise.
NewsML allows a range of metadata to be attached to a multimedia story, including a detailed computer-readable description of what an item is about.
HP 21
Example of the end-to-end flow - NewsML
The content provider supplies NewsML-packaged media content to the operator. The content is categorized as current events, finance, sport, etc., and updated hourly.
The operator receives NewsML data from the content provider. The content server automatically pushes updated news articles to all news service subscribers.
Consumers sign up for the news service directly on the device. When using the news service, the user browses through the categories and reads the news articles. The news articles are presented in a continuous flow (one after the other) without end-user interaction.
Source: http://www.mediabricks.com
HP 22
PRISM
Publishing Requirements for Industry Standard Metadata
Version 1.0, April 2001
Authors: IDEAlliance (Adobe, Vignette, Kinecta, et al.)
Idea: "a standard for interoperable content description, interchange, and reuse in both traditional and electronic publishing contexts"
Web site: http://www.prismstandard.org
HP 23
PRISM Design
Built on existing standards like Dublin Core (DC), RDF, XML
Designed to be used in a simple, straightforward way over the Internet
Compatible with NewsML
Integrates easily with ICE (for syndication)
Vocabulary: basic DC; extensions; "Controlled Vocabularies", e.g., "North American Industry Classification System" (NAICS)
HP 24
PRISM Example
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845"/>
    <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>
(Source: PRISM spec v1, http://www.prismstandard.org/techdev/prismspec1.asp)
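
To show how a consumer might read such a description, here is a minimal Python sketch that pulls the Dublin Core fields out of a well-formed version of the PRISM example. The function name `dublin_core_fields` and the constant `PRISM_SAMPLE` are hypothetical, and the sample string is a reconstruction of the slide's example with its punctuation restored, so treat the exact URLs as assumptions.

```python
import xml.etree.ElementTree as ET

RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"
DC = "{http://purl.org/dc/elements/1.1/}"

# Reconstructed from the slide's example; the URLs are assumptions.
PRISM_SAMPLE = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845"/>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>
"""


def dublin_core_fields(xml_text):
    """Map each described resource to its Dublin Core fields (lists of values)."""
    root = ET.fromstring(xml_text)
    records = {}
    for desc in root.iter(RDF + "Description"):
        fields = {}
        for child in desc:
            if not child.tag.startswith(DC):
                continue  # skip properties from other vocabularies
            name = child.tag[len(DC):]
            # A value is either a referenced resource or literal text.
            value = child.get(RDF + "resource") or (child.text or "").strip()
            fields.setdefault(name, []).append(value)
        records[desc.get(RDF + "about")] = fields
    return records


for resource, fields in dublin_core_fields(PRISM_SAMPLE).items():
    print(resource, fields)
```

Keeping every field as a list is a design choice of this sketch: Dublin Core elements are repeatable, so a record with two contributors needs no special case.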
HP 25
VoiceXML
A language for specifying voice dialogs
Voice dialogs use audio prompts and text-to-speech (TTS) for output; touch-tone keys (DTMF) and automatic speech recognition (ASR) for input
Goal is to bring the advantages of web-based development and content delivery to interactive voice response applications
High-level, voice-specific language simplifies application development
Source: http://www.voicexml.org
HP 26
Voice Based Internet Applications
Source: http://www.voicexml.org
HP 27
Voice XML Metadata
Voice Specific metadata
Supports syntactic interoperability
Text data to voice data
Voice XML = XML + Voice Metadata
HP 28
VoiceXML - Possible Services
Information retrieval - news, sports, traffic, stock quotes
e-Transactions (e-commerce, e-tailing, etc.)
Financial: banking, stock trading
Catalog browsing (generally as an adjunct to paper)
Telephone services
Personal voice dialing, one-number "find-me" services
Intranet - inventory, HR services, corporate portals
Unification - "My Whatever" personal portals, personal agents, unified messaging
Source: http://www.voicexml.org
HP 29
MPEG7
A set of description schemes and descriptors to describe the content of multimedia data
Provides a language to specify description schemes
A scheme for coding the description
HP 30
Application Examples for MPEG7
A few application examples are
Digital libraries (image catalog, musical dictionary)
Multimedia directory services (e.g., yellow pages)
Broadcast media selection (radio channel, TV channel)
HP 31
Information and Content Exchange (ICE)
Main Goal: efficient and extensible Content Syndication protocol for the Internet, using XML syntax
Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.
Status: latest spec version 1.1, May 2000, submitted to W3C for review
Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, ...
Web Site: http://www.icestandard.org
HP 32
What is the ICE Protocol
Syndication Protocol for communication between Syndicators and Subscribers
Metadata to define:
roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver
format and method of content exchange (e.g., sequenced packages, pull vs. push model)
HP 33
ICE Applications
ICE vocabulary + domain vocabulary = complete application
ICE establishes and manages the syndication, delivers data, and logs events => content-independent metadata
Industry-specific vocabulary defines the content => domain-specific metadata
Source: http://www.icestandard.org
HP 34
ICE Explained
ICE: Information and Content Exchange protocol
Syndicator: A content aggregator and distributor
Subscriber: A content consumer
Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement
Collection: The current content of a subscription
ICE Package: A delivery of commands to update a collection, such as the addition of content items
ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information
Sources: InternetWeek; ICE Cookbook, version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm
<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe"
             timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon"
                filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru" copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC ... (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>
Figure callout on the comic-strip element: Content (domain-specific metadata)
HP 36
XCM (eXtended Content Management)
A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:
Content Development - Developing static content and managing the process of its subsequent approval versioning storage and retrieval
Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle
Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability
Source: http://www.vignette.com/CDA/Site/0,2097,1-1-30-1458-1146-1743,00.html
HP 37
XCM: eXtended Content Management
Content Development/Management: Content Authoring, Digital Asset Management, Software Configuration Management, Document Process Management
Application Content Management: Metadata Management, Recombination, Personalization
Content Delivery: Edge Network Delivery, Streaming Media Delivery, Caching
Source: http://www.vignette.com
HP 38
Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain
FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
...

UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Address Id: http://gisdasc.kgs.ukans.edu/dasc
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
...

Kansas State
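
One way to reconcile two models like these is a per-model mapping from tag names to a shared vocabulary. The Python sketch below is a minimal illustration of that idea, not a method taken from the slides: the common field names (`keywords`, `title`, and so on), the `normalize` helper, and the restored URL punctuation are all assumptions.

```python
# Hypothetical mappings from each model's tag names to shared field names.
FGDC_TO_COMMON = {
    "Theme keywords": "keywords",
    "Title": "title",
    "Online linkage": "url",
    "Direct Spatial Reference Method": "spatial_representation",
    "Horizontal Coordinate System Definition": "coordinate_system",
}
UDK_TO_COMMON = {
    "Search terms": "keywords",
    "Topic": "title",
    "Address Id": "url",
    "Measuring Techniques": "spatial_representation",
    "Co-ordinate System": "coordinate_system",
}


def normalize(record, mapping):
    """Rename the tags of one record into the shared vocabulary."""
    return {mapping[tag]: value for tag, value in record.items() if tag in mapping}


fgdc_record = {
    "Theme keywords": ["digital line graph", "hydrography", "transportation"],
    "Title": "Dakota Aquifer",
    "Online linkage": "http://gisdasc.kgs.ukans.edu/dasc",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}
udk_record = {
    "Search terms": ["digital line graph", "hydrography", "transportation"],
    "Topic": "Dakota Aquifer",
    "Address Id": "http://gisdasc.kgs.ukans.edu/dasc",
    "Measuring Techniques": "Vector",
    "Co-ordinate System": "Universal Transverse Mercator",
}

# Both records describe the same data set, so they normalize identically.
print(normalize(fgdc_record, FGDC_TO_COMMON) == normalize(udk_record, UDK_TO_COMMON))
```

A flat renaming like this only handles differences in tag names; models that disagree on structure or meaning need a richer domain model.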
HP 39
Different views of Metadata
Domain Independent Specifications (RDF)
Frameworks/Infrastructures (XCM)
Metadata
Application Specific: ICE
Media Specific: MPEG7, VoiceXML
Domain Specific: NewsML, FGDC/UDK
HP 40
Creating and Serving Metadata to Power the Life-cycle of Content
Taalee Infrastructure Services / Taalee Content Applications
Where is the content? Whose is it?
Produce/Aggregate
Catalog/Index
What other content is it related to?
Integrate/Syndicate
What is the right content for this user?
Personalize
What is the best way to monetize this interaction?
Interactive Marketing
Broadcast, Wireline, Wireless, Interactive TV
Taalee Semantic MetaBase
HP 41
Taalee's Intelligent Content Process
HP 42
Metadata Creation and Semanticization
• Automatic Content Classification/Categorization
• Metadata Creation/Extraction
Types of metadata created
Semantic Engine and WorldModel are trademarks of Taalee Inc. Metadata Extraction is a patented technology of Taalee Inc.
HP 43
Forms/Types/Ingest of Content
Sources: Web sites, content feeds, and private repositories
Types: Text, graphics, audio, video, multimedia
Forms: Unstructured text, semi-structured text, structured text (+ media); static or dynamic
Ingest: Feed (push), Web (pull), repository/database (usually pull)
HP 44
Content Handling/Ingest
Infrastructure: Exchange, Feed Handlers, Crawlers, Screen Scrapers, Bots, Software Agents
Centralized, Distributed, Mobile/Migratory
HP 45
Information Extraction for Metadata Creation
Global/Enterprise/Web Repositories
METADATA EXTRACTORS
Digital Maps
Nexis, UPI, AP
Documents
Digital Audios
Data Stores
Digital Videos
Digital Images
HP 46
Extracting a Text Document: Syntactic approach
INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT
NATIONAL PREPAREDNESS LEVEL II
CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.
SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.
CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.
LAYOUT
Date => day month int ',' int
Traditional Text Categorization
Figure: a customer article feed (article 4715) is classified/placed in a taxonomy using statistical/AI techniques and a customer training set; the classification of article 4715 drives routing/distribution
Standard Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000

Taalee's Categorization & Automatic Metadata Creation
Figure: the article feed (article 4715) is classified/placed in a taxonomy using knowledge-base & statistical/AI techniques, with a Taalee training set and a customer training set; the results feed the Metadata Catalog and Content Manager for precise syndication/filtering, routing/distribution, and mapping to another taxonomy
Components: Automated Content Enrichment (ACE), Taalee Enterprise Customization Suite
Article 4715 Metadata
Standard metadata: Feed Source: iSyndicate; Posted Date: 11/20/2000
Semantic metadata: Company Name: France Telecom, Equant; Ticker Symbol: FTE, ENT; Exchange: NYSE; Topic: Company News
Classification of Article 4715:
FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
NYSE: Member Companies, Market News, IPOs
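
The step from standard to semantic metadata can be sketched as a lookup against a knowledge base. The Python example below is only an illustration under stated assumptions: the tiny `KNOWLEDGE_BASE`, the `enrich` function, and the article sentence are hypothetical, while the company names, ticker symbols, exchange, feed source, and posted date are the ones shown for article 4715. It is not Taalee's patented extraction method.

```python
# Hypothetical knowledge base: entity name -> known facts about the entity.
KNOWLEDGE_BASE = {
    "France Telecom": {"ticker": "FTE", "exchange": "NYSE"},
    "Equant": {"ticker": "ENT", "exchange": "NYSE"},
}


def enrich(standard_metadata, article_text):
    """Add semantic metadata for every known company mentioned in the text."""
    lowered = article_text.lower()
    companies = [name for name in KNOWLEDGE_BASE if name.lower() in lowered]
    enriched = dict(standard_metadata)  # keep the standard fields untouched
    enriched["Company Name"] = companies
    enriched["Ticker Symbol"] = [KNOWLEDGE_BASE[c]["ticker"] for c in companies]
    enriched["Exchange"] = sorted({KNOWLEDGE_BASE[c]["exchange"] for c in companies})
    return enriched


standard = {"Feed Source": "iSyndicate", "Posted Date": "11/20/2000"}
# Hypothetical article text; the real article 4715 is not in the slides.
article = "France Telecom said it will take a stake in Equant."
print(enrich(standard, article))
```

The ticker symbols never appear in the article text; they come from the knowledge base, which is what lets a later query on "FTE" find the article.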
HP 49
Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION AS OF TONIGHT THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49 IN WASHINGTON STATE THE SENATE RACE REMAINS TOO CLOSE TO CALL IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT THE SENATE WILL BE EVENLY DIVIDED IN MISSOURI REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN WHO DIED IN A CRASH THREE WEEKS AGO GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT HILLARY CLINTON WON THE NEW YORK SENATE SEAT SHE IS THE FIRST FIRST LADY TO RUN MUCH LESS WIN
Video Segment with Associated Text
Segment Description
Semantic Metadata
Auto Categorization
HP 50
Automatic Categorization & Metadata Tagging (Web page)
Video with Editorialized Text on the Web
Auto Categorization
Semantic Metadata
HP 51
Automatic Categorization & Metadata Tagging (Feed)
Text from Bloomberg
Auto Categorization
Semantic Metadata
HP 52
Taalee Extraction and Knowledgebase Enhancement
Extraction Agent
Web Page Enhanced Metadata Asset
HP 53
Basis for Semantics
A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary
B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)
Knowledge Base
HP 54
Basis for Semantics
C. Reasoning/Inference
(Statistical) (Information Retrieval)
Statistical Learning/AI (Bayesian, Neural Networks, HMM, ...)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methods: Cluster Analysis
Learning/AI and Collab. Filtering
Reference data, Concept-terms, Dictionary/Thesaurus: by topic/industry/subject/domain
Word or Phrase
Ontologies/Domain Models
KnowledgeBase: by Entities and Relationships
deeper understanding
HP 56
Open Directory Project (ODP): Classification/Taxonomy & Directory
HP 57
Ontology
Standardize meaning, description, representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes: Attributes, Domain Rules, Functional Dependencies
HP 59
An Ontology
Example: interrelated ontologies (figure)
LANDUSE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL, LAND (SITE), CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE
WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; processes: shredding, magnetic separation, screening, washing
NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI, linked by "causes" relationships
HP 61
Large Vocabularies, Taxonomies, Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.
Confidential HP
Metadata-enabled Applications
HP 63
Metadata Usage: Impact on Search & Query Processing
traditional queries based on keywords
attribute-based queries
content-based queries
HP 64
Oingo.com
Oingo Ontology (ODP based?) - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning, used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged, and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:
• Context of the Content
• Semantic Metadata describing entities (e.g., Company, Industry, etc.) and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
"Normal" Content can only be "found" if the user enters a keyword that exists within it
Intelligent Content = "Normal" Content + related metadata and relationships
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:
Context of the Content
Semantic Metadata describing entities (e.g., Company, Industry, etc.) and
Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, ...
HP 70
Metadata Usage: Keyword, Attribute, and Content Based Access
The VisualHarness system at LSDIS/UGA
HP 71
Keyword Search vs Attribute Search with Semantic metadata
Virage search on "football touchdown"
Metadata from Typical Cataloging of Football Assets
Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline
Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline

Taalee Metadata on Football Assets
Rich Media Reference Page: Baltimore 31, Pit 24 (http://www.nfl.com)
Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
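
The contrast between the two kinds of search can be sketched in a few lines of Python. This is a toy illustration, not the Virage or Taalee implementation: the `ASSETS` list, `keyword_search`, and `attribute_search` are hypothetical names, the matching rules are assumptions, and the asset data is taken from the slide.

```python
ASSETS = [
    {"title": "Jimmy Smith Interview Part Seven",
     "description": "Jimmy Smith explains his philosophy on showboating",
     "metadata": {}},
    {"title": "Brian Griese Interview Part Four",
     "description": "Brian Griese talks about the first touchdown he ever threw",
     "metadata": {}},
    {"title": "Baltimore 31, Pit 24",
     "description": "Quandry Ismail and Tony Banks hook up for their third long touchdown",
     "metadata": {"League": "Professional",
                  "Teams": ["Ravens", "Steelers"],
                  "Players": ["Quandry Ismail", "Tony Banks"],
                  "Event": "Touchdown"}},
]


def keyword_search(assets, *words):
    """Return titles of assets whose text contains any of the words."""
    hits = []
    for asset in assets:
        text = (asset["title"] + " " + asset["description"]).lower()
        if any(word.lower() in text for word in words):
            hits.append(asset["title"])
    return hits


def attribute_search(assets, **criteria):
    """Return titles of assets whose metadata satisfies every criterion."""
    def satisfies(metadata, field, wanted):
        value = metadata.get(field)
        if isinstance(value, list):
            return wanted in value  # multi-valued attribute
        return value == wanted

    return [asset["title"] for asset in assets
            if all(satisfies(asset["metadata"], field, wanted)
                   for field, wanted in criteria.items())]


print(keyword_search(ASSETS, "football", "touchdown"))
print(attribute_search(ASSETS, Event="Touchdown", Teams="Ravens"))
```

The keyword query returns an interview that merely mentions a touchdown, while the attribute query returns only the asset whose metadata says a touchdown happened in a Ravens game.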
HP 72
Taalee's Semantic Search
Highly customizable, precise, and freshest A/V search
Context and Domain Specific Attributes
Uniform Metadata for Content from Multiple Sources; can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do?
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY amp CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content
Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you.
Please enter such semantic keywords below.
Change Context
Personalized Queries & Hot Topics
PERSONALIZATION
Personalized Queries
1 My Stock Portfolio: Microsoft suffers serious hack attack; Cisco Systems Inc; Analyst Safa Rashtchy on Yahoo; PeopleSoft Inc; AT&T Corp; more...
2 My Football Fantasy Team: Gators' Spurrier ready for big game; Tech's Vick looks to become complete QB; Bucs excited about Hamilton; Jasper Sanks rumbles into the end zone...; Edwards explains reasons for leaving BYU; more...
3 Julia Roberts Collection: Movie Trailer: Notting Hill; Trailer - Runaway Bride; Patrick; Movie Trailer: Stepmom; Conspiracy Theory; more...
4 Pink Floyd Collection: Set the Controls for the Heart of the Sun...; Wish You Were Here; Round And Around; Keep Talking; The Post War Dream; more...
HOT Topics
1 Election 2000: Video: Explaining the electoral map; Race for White House hots up; Seniors Give Gore Florida Edge; more...
2 Middle East Peace Conflict: More die as Israel steps up security; Israel braces for suicide bombs; Pentagon probes Cole's security; more...
3 Napster Controversy: The Brain Behind Napster; Napster Lawsuit; Creative Nomad II; more...
HP 80
Metadata Targeting
Semantic/Interactive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
Figure labels: Realtime Feeds; Time-Shifted Content; Web sites and Pages; Content Databases; Aggregator; Interests, Preferences; Semantic Engine(TM); Structured Hi-Quality Semantic Metabase; Content; Personalized Content
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site
User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and the knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst
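
A minimal Python sketch of this kind of filtering follows, assuming a per-content-type priority list of metadata fields and a character budget for the screen. The `FIELD_PRIORITY` table, the `compact_line` function, and the budget values are hypothetical; the sample item uses the values shown in the listing.

```python
# Hypothetical priority of metadata fields for each content type.
FIELD_PRIORITY = {
    "Analyst Call": ["date", "source", "analyst"],
}


def compact_line(item, content_type, max_chars):
    """Join the highest-priority fields that fit within max_chars."""
    parts = []
    for field in FIELD_PRIORITY[content_type]:
        candidate = " ".join(parts + [item[field]])
        if len(candidate) > max_chars:
            break  # the next field would overflow the screen line
        parts.append(item[field])
    return " ".join(parts)


call = {"date": "11/07", "source": "ON24", "analyst": "H&Q", "rating": "Strong Buy"}
print(compact_line(call, "Analyst Call", 20))  # roomy screen
print(compact_line(call, "Analyst Call", 10))  # narrow screen
```

Fields outside the priority list, such as the rating, stay in the metadata and can be surfaced another way, for example as an icon.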
HP 87
iTV: Taalee's Extreme Personalization
Content Provider (DBS, DISH, Wink, AOL-TV)
Semantic Engine(TM)
Meta-Data Tagged Content
Content "Programs"
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-Quality Semantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in
This screen is customizable with an interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO
Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata
The Conference Call itself can have embedded metadata to support personalization and interactivity
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support
Sony
Categorize, Catalog, Integrate
Collection, Processing
Network Content, Affiliate Feeds, Public Sources, Rich Data
Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were
-- Breaking News for 11/30/2000 --
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
Customize Page: Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers
• Greatly enhances newsroom productivity and time-to-market
HP 91
-- Breaking News --Gore Demands That Recount Restart
Gore Says Fla Cant Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(133) ndash 120600 - ABC
(253) - 120600 - CBS
(516) - 120600 - ABC
(246) - 120600 - FOX
(133) - 120600 - NBC
(533) - 120600
(357) - 120600 - CBS
(427) - 120600 - ABC
(344) - 120600 - FOX
(724) - 120600 - CBS
(133) - 120600 - CBS
TALLAHASSEE Florida (CNN) ndashThough the two presidential candidates have until noon Wednesday to file briefs in Al Gores appeal to the Florida Supreme Court the outcome of two trials set on the same day in Leon County Florida may offer Gore his best hope for the presidency Democrats in Seminole County are seeking to have 15000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5000 votes statewide Lawyers for the plaintiff Harry Jacobs claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2126 incomplete absentee ballot applications sent in by GOP voters while refusing to allow Democratic workers to do the same thing for Democratic voters
The GOP says that suit and one similar to it from Martin County demonstrates Democratic Party politics at its most desperate Gore is not a party to either of those lawsuits On Tuesday the judge in the
(133) - 120600 - ABC
(233) - 120600 - CBS
(312) - 120600 - NNS
(032) - 120600 - CBS
(133) - 120600 - CBS
Description
Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore
HP 92
Metadata's role in emerging iTV infrastructure
[Diagram labels: Video; MPEG Encoder (MPEG-2/4/7); Create Scene Description Tree; Scene Description Tree; Node = AVO Object; Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; Taalee Semantic Engine; Enhanced XML Description; "Cisco Systems" Node; Object Content Information (OCI); Metadata-rich Value-added Node; GREAT USER EXPERIENCE]
• License metadata decoder and semantic applications to device makers
• Channel sales through Video Server Vendors, Video App Servers and Broadcasters
Example node metadata: Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: John Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta
HP 93
Intelligent Metadata Creation
Extractor Agents: content which does contain the words the user asked for
+
Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for
+
Semantic Associations: content the user did not think to ask for, but which he needs to know
Metadata for Intelligent Content Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.
• If a keyword is not in the content, it cannot be found.
• The burden is on the user to think of and ask for the "right" keyword.
For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".
Understanding of the content is needed to create new metadata.
Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g. COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely.
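The idea of adding metadata that the content never mentions can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `PLAYS_FOR` and `TEAM_INFO` tables and the `add_value_added_metadata` function are hypothetical stand-ins for a knowledge base, not Taalee's actual WorldModel or API.

```python
# Hypothetical mini knowledge base (illustrative only, not Taalee's WorldModel).
PLAYS_FOR = {
    "Roger Clemens": "New York Yankees",
    "Jamal Anderson": "Atlanta Falcons",
}
TEAM_INFO = {
    "New York Yankees": {"sport": "Baseball", "league": "American League"},
    "Atlanta Falcons": {"sport": "Football", "league": "NFL"},
}


def add_value_added_metadata(extracted):
    """Return a copy of the extracted metadata with team, league and sport added."""
    enriched = {key: list(values) for key, values in extracted.items()}
    for player in extracted.get("players", []):
        team = PLAYS_FOR.get(player)
        if team is None:
            continue  # unknown player: nothing can be inferred
        additions = (
            ("teams", team),
            ("leagues", TEAM_INFO[team]["league"]),
            ("sports", TEAM_INFO[team]["sport"]),
        )
        for field, value in additions:
            if value not in enriched.setdefault(field, []):
                enriched[field].append(value)
    return enriched


# Only the player name was extracted from the story text.
story = {"players": ["Roger Clemens"]}
enriched_story = add_value_added_metadata(story)
```

With the enriched record, a keyword search on the team name can match a story whose text only names the player.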
HP 96
Guided Demo for Value Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player of the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player of the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 - Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
• Producer Name: name of content provider that produced the asset
• Posted Date: date of asset posting - extracted automatically
• Player Names: names of players mentioned explicitly in the asset - extracted automatically
• Team Name: name of team for which player plays - not mentioned explicitly in asset - value-added using Taalee's semantic relationships
• League Name: name of league to which the player's team belongs - not mentioned explicitly in asset - value-added by Taalee's processing based on semantic associations
• Sport: name of sport
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.
The asset is richly and fully described in the many ways users choose to interact with it.
Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context or relationships of keywords
For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
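A minimal sketch of how typed relationships let a lookup return associated content as well as what was asked for. The triple list, the relation names (`is_a`, `participates_in`, `competes_with`) and the placeholder headlines are assumptions made for illustration; the real Semantic Content Model is not described at this level of detail.

```python
# Illustrative (subject, relation, object) triples; relation names are hypothetical.
TRIPLES = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Ariba", "is_a", "COMPANY"),
    ("Commerce One", "participates_in", "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]

# Placeholder headlines keyed by company (invented for the example).
NEWS = {
    "Commerce One": ["Commerce One story"],
    "Ariba": ["Ariba story"],
}


def objects(subject, relation):
    """All objects linked to `subject` by `relation`."""
    return [o for s, r, o in TRIPLES if s == subject and r == relation]


def intelligent_results(company):
    """What the user asked for, plus news on the company's competitors."""
    competitors = objects(company, "competes_with")
    return {
        "asked_for": NEWS.get(company, []),
        "competitors": {rival: NEWS.get(rival, []) for rival in competitors},
    }


results = intelligent_results("Commerce One")
```

A keyword index alone would return only the `asked_for` list; the `competitors` entry exists only because the relationship is stored explicitly.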
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata centric Content Management Architecture
Sources: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g. Reuters)
Components: Extractor Agent 1, Extractor Agent 2, Extractor Agent 3; Semantic Engine; World Model; Knowledge Base; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication
Output: XCM-compliant metadata, XML or other format
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults Knowledge Base for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard
Stories shown: Story on Cisco; Story on Lucent
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
• Related Stock News
• Industry News: COMPANIES in Same or Related INDUSTRY
• Competition: COMPANIES in INDUSTRY with Competing PRODUCTS
• SEC/EPA Regulations: Impacting INDUSTRY or Filed By COMPANY
• Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for the future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g. InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling is obviously not enough
• we need a "logic layer" on top of RDF
• some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic rules which allow us to draw inferences from this data
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture" [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]
A personal definition
Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-layered Architecture of Semantic Web
Logical Layer: Formal Semantics and Reasoning Support - OIL, DAML-O
Schema Layer: Definition of Vocabulary - RDF Schema
Data Layer: Simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL - Evolving towards Semantic Web
OIL Mission
OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Test May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake <=
  dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
  AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
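The rule can be read as a predicate over a pair of records. The following Python sketch is one possible reading, assuming that dates are differenced in days, that distance is great-circle distance in miles (computed with the haversine formula), and that the earthquake must follow the test. The function names and the sample coordinates are invented for the example.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, via the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def could_have_caused(test, quake, max_days=30, max_miles=10000):
    """True if the quake follows the test within max_days and lies within max_miles."""
    days_after = (quake["eventDate"] - test["eventDate"]).days
    if not 0 <= days_after < max_days:
        return False
    separation = distance_miles(
        test["latitude"], test["longitude"], quake["latitude"], quake["longitude"]
    )
    return separation < max_miles


# Invented sample records, for illustration only.
nuclear_test = {"eventDate": date(1970, 1, 1), "latitude": 37.0, "longitude": -116.0}
later_quake = {"eventDate": date(1970, 1, 10), "latitude": 36.0, "longitude": -117.0}
earlier_quake = {"eventDate": date(1969, 12, 20), "latitude": 36.0, "longitude": -117.0}
```

Here `could_have_caused(nuclear_test, later_quake)` holds, while the earlier quake is rejected because it precedes the test.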
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year, starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example…
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9 and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
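The grouping step behind this query can be sketched as follows. The range labels follow the slide; treating each range as closed on the left and open on the right (so a magnitude of exactly 9 lands in ">9") is an assumption, as are the function names and the sample data.

```python
from collections import Counter

# (label, lower bound inclusive, upper bound exclusive); boundary handling is an assumption.
RANGES = [
    ("5.8-6", 5.8, 6.0),
    ("6-7", 6.0, 7.0),
    ("7-8", 7.0, 8.0),
    ("8-9", 8.0, 9.0),
    (">9", 9.0, float("inf")),
]


def magnitude_range(magnitude):
    """Label of the range containing `magnitude`, or None if it is below 5.8."""
    for label, low, high in RANGES:
        if low <= magnitude < high:
            return label
    return None


def counts_per_year(quakes):
    """Count earthquakes per (year, magnitude range) from (year, magnitude) pairs."""
    counts = Counter()
    for year, magnitude in quakes:
        label = magnitude_range(magnitude)
        if label is not None:
            counts[(year, label)] += 1
    return counts


# Invented sample data, for illustration only.
sample = [(1950, 5.9), (1950, 6.5), (1950, 6.7), (1951, 5.0)]
yearly = counts_per_year(sample)
```

Averaging the counts for one range over a span of years then gives the figure the slide compares before and after testing began.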
Knowledge Discovery - Example…
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted, and the test site lies within a radius of 10,000 miles of the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 4
Information Interoperability: key metadata objective and benefit
System
Syntax
Structure
Semantics
Protocols, Metadata, Domain Modeling/Ontologies
HP 5
Semantics
Meaning Understanding
Facts Context Reasoning
Related to exchange usage application
HP 6
A metadata classification
Data (Heterogeneous Types/Media)
Content Independent Metadata (creation-date, location, type-of-sensor)
Content Dependent Metadata (size, max colors, rows, columns)
Direct Content Based Metadata (inverted lists, document vectors, WAIS, Glimpse, LSI)
Domain Independent (structural) Metadata (C++ class-subclass relationships, HTML/SGML Document Type Definitions, C program structure)
Domain Specific Metadata: area, population (Census); land-cover, relief (GIS); metadata concept descriptions from ontologies
Ontologies: Classifications, Domain Models
User
Move in this direction to tackle information overload
HP 7
Types of Metadata for digital media
Media type-specific metadata, e.g. texture of images, font size, …
Media processing-specific metadata, e.g. search, retrieval, personalized filtering
Content-specific metadata, e.g. rocket-related video and documents
HP 8
Metadata for Digital Data
Metadata | Data Type | Metadata Type
Q-Features [Jain and Hampapur] | Image, Video | Domain Specific
R-Features [Jain and Hampapur] | Image, Video | Domain Independent
Meta-Features [Jain and Hampapur] | Image, Video | Content Independent
Impression Vector [Kiyoki et al] | Image | Content Descriptive
NDVI, Spatial Registration [Anderson and Stonebraker] | Image | Domain Specific
Speech Feature Index [Glavitsch et al] | Audio | Direct Content Based
Topic Change Indices [Chen et al] | Audio | Direct Content Based
Document Vectors [Deerwester et al] | Text | Direct Content Based
Inverted Indices [Kahle and Medlar] | Text | Direct Content Based
Content Classification Metadata [Bohm and Rakow] | MultiMedia | Domain Specific
Document Composition Metadata [Bohm and Rakow] | MultiMedia | Domain Independent
Metadata Templates [Ordille and Miller] | Media Independent | Domain Specific
Land Cover, Relief [Sheth and Kashyap] | Media Independent | Domain Specific
Parent Child Relationships [Shklar et al] | Text | Domain Independent
Contexts [Sciore et al, Kashyap and Sheth] | Structured | Domain Specific
Concepts from Cyc [Collet et al] | Structured | Domain Specific
User's Data Attributes [Shoens et al] | Text, Structured | Domain Specific
Domain Specific Ontologies [Mena et al] | Media Independent | Domain Specific
HP 9
Types of Specs and Standards (or MetaModels)
Domain Independent: (MCF), RDF, MOF, Dublin Core
Media Specific: MPEG4, MPEG7, VoiceXML
Domain/Industry Specific (metamodels): MARC (Library), FGDC and UDK (Geographic), NewsML (News), PRISM (Publishing)
Application Specific: ICE (Syndication)
Exchange/Sharing: XCM, XMI
Orthogonal (Other): RDFS, namespaces, ontologies, domain models (DAML, OIL)
HP 10
What RDF can do for metadata
Designed to impose structural constraints on syntax, to support consistent encoding, exchange and processing of metadata
Domain Independent Metadata standard
HP 11
RDF (Resource Description Framework)
Resource - Property - Value
• RDF data consists of nodes and attached attribute/value pairs
• Nodes can be any web resources (pages, servers, basically anything for which you can give a URI), even other instances of metadata
• Attributes are named properties of the nodes, and their values are either atomic (text strings, numbers, etc.) or other resources or metadata instances
HP 12
RDF Example 1
URI:TALK --dc:title--> "Mysteries of Metadata"
URI:TALK --dc:creator--> URI:AMIT
<?xml version='1.0'?>
<rdf:RDF xmlns:rdf = "http://www.w3.org/TR/REC-rdf-syntax"
         xmlns:dc = "http://purl.org/dc/elements/1.0/">
  <rdf:Description rdf:about = "URI:TALK">
    <dc:title>Mysteries of Metadata</dc:title>
    <dc:creator rdf:resource = "URI:AMIT"/>
  </rdf:Description>
</rdf:RDF>
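As a sketch of how such a description could be produced and read back programmatically, the following uses Python's standard `xml.etree.ElementTree`. It assumes the current standard RDF and Dublin Core 1.1 namespace URIs rather than the older ones shown on the slide, and the helper names `describe` and `triples` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: standard namespace URIs, not the older ones on the slide.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)


def describe(resource_uri, title, creator_uri):
    """Build an rdf:RDF element describing one resource with dc:title and dc:creator."""
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    desc = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": resource_uri}
    )
    ET.SubElement(desc, f"{{{DC_NS}}}title").text = title
    ET.SubElement(desc, f"{{{DC_NS}}}creator", {f"{{{RDF_NS}}}resource": creator_uri})
    return root


def triples(root):
    """Flatten the descriptions into (resource, property, value) triples."""
    result = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # A property value is either another resource or atomic text.
            value = prop.get(f"{{{RDF_NS}}}resource", prop.text)
            result.append((subject, prop.tag, value))
    return result


talk = describe("URI:TALK", "Mysteries of Metadata", "URI:AMIT")
xml_text = ET.tostring(talk, encoding="unicode")
talk_triples = triples(talk)
```

The `triples` helper makes the node/property/value model explicit: each child of a description becomes one triple.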
HP 13
RDF Example 2
URI:TALK --dc:title--> "Mysteries of Metadata"
URI:TALK --dc:creator--> URI:AMIT
URI:AMIT --BIB:Name--> "Amit Sheth"
URI:AMIT --BIB:Email--> amit@taalee.com
URI:AMIT --BIB:Aff--> URI:LIB
HP 14
RDFS (RDF Schema)
Enables resource description communities to define (and share) vocabularies (museum, library, e-commerce, …)
Vocabulary (in RDFS) = the meaning, characteristics and relationships of a set of properties
HP 15
RDF Based Web
HTML
Resources
RDF/XML Descriptions
RDF Schemas
Source: http://www.w3c.rl.ac.uk
HP 16
Dublin Core Metadata Initiative
Simple element set designed for resource description
International, inter-discipline, W3C community consensus
"Semantic" interface among resource description communities (very limited form of semantics)
Source: www.desire.org
HP 17
Dublin Core RDF
<xml>
<?namespace href = "http://w3.org/rdf-schema" as = "RDF"?>
<?namespace href = "http://metadata.net/DC" as = "DC"?>
<RDF:Abbreviated>
<RDF:Assertion RDF:HREF = "http://www.mysite.com/mydoc.html"
  DC:Title = "I've Never Metadata I've Never Liked"
  DC:Creator = "Mary Crystal"
  DC:Subject = "Metadata, Dublin Core, Stuff"/>
</RDF:Abbreviated>
</xml>
HP 18
MOF (Meta Object Facility) and XMI
MOF models metadata using a subset of UML that is relevant to modeling metadata (class models: classes, associations and subtyping), and defines a set of rules for mapping the elements of the MOF Core to CORBA IDL
XML Metadata Interchange (XMI) is an extension of the MOF into the XML space
HP 19
NewsML
NewsML is a packaging and metadata format for news content
NewsML is developed by the International Press Telecommunications Council (IPTC), a consortium of news providers, mostly in the print or wire-service industries
Since it deals only with packaging and metadata, NewsML is complementary both to news content formats like NITF and to syndication protocols like ICE
HP 20
NewsML…
It can be used by news providers to combine their pictures, video, text, graphics and audio files in news output available on web sites, mobile phones, high-end desktops, interactive television and any other device
Accurate, objective set of description tools which help qualify the information and make the search more precise
NewsML allows a range of metadata to be attached to a multi-media story, including a detailed computer-readable description of what an item is about
HP 21
Example of the end-to-end flow - NewsML
The content provider supplies NewsML-packaged media content to the operator. The content is categorized as current events, finance, sport, etc. and updated hourly
The operator receives NewsML data from the content provider. The content server automatically pushes updated news articles to all news service subscribers
Consumers sign up for the news service directly on the device. When using the news service, the user browses through the categories and reads the news articles. The news articles are presented in a continuous flow (one after the other) without end-user interaction
Source: http://www.mediabricks.com
HP 22
PRISM
Publishing Requirements for Industry Standard Metadata
Version 1.0, April 2001
Authors: IDEAlliance (Adobe, Vignette, Kinecta, et al.)
Idea: "a standard for interoperable content description, interchange and reuse in both traditional and electronic publishing contexts"
Web site: http://www.prismstandard.org
HP 23
PRISM Design
Built on existing standards like Dublin Core (DC), RDF, XML
Designed to be used in a simple, straightforward way over the Internet
Compatible with NewsML
Integrates easily with ICE (for syndication)
Vocabulary
Basic: DC
Extensions: "Controlled Vocabularies", e.g. "North American Industry Classification System" (NAICS)
HP 24
PRISM Example
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845"/>
    <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>
(Source: PRISM spec v1, http://www.prismstandard.org/techdev/prismspec1.asp)
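Because PRISM descriptions are plain RDF/XML built on Dublin Core, a consumer can read them with any XML parser. The sketch below uses Python's standard `xml.etree.ElementTree` on an abridged copy of the record; the function name `dublin_core_fields` is hypothetical, and a production consumer would more likely use a full RDF parser.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Abridged copy of the example record (XML declaration omitted).
RECORD = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>
"""


def dublin_core_fields(rdf_xml):
    """Map each described resource to a dict of its Dublin Core text fields."""
    root = ET.fromstring(rdf_xml)
    dc_prefix = "{%s}" % NS["dc"]
    records = {}
    for desc in root.findall("rdf:Description", NS):
        about = desc.get("{%s}about" % NS["rdf"])
        fields = {}
        for child in desc:
            if child.tag.startswith(dc_prefix):
                fields[child.tag[len(dc_prefix):]] = (child.text or "").strip()
        records[about] = fields
    return records


records = dublin_core_fields(RECORD)
```

Each resource URI maps to a flat dictionary, which is usually enough for indexing or search.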
HP 25
VoiceXML
A language for specifying voice dialogs
Voice dialogs use audio prompts and text-to-speech (TTS) for output; touch-tone keys (DTMF) and automatic speech recognition (ASR) for input
Goal is to bring the advantages of web-based development and content delivery to interactive voice response applications
High-level, voice-specific language simplifies application development
Source: http://www.voicexml.org
HP 26
Voice Based Internet Applications
Source: http://www.voicexml.org
HP 27
Voice XML Metadata
Voice-specific metadata
Supports syntactic interoperability
Text data to voice data
VoiceXML = XML + Voice Metadata
HP 28
VoiceXML ndash Possible Services
Information retrieval - News, sports, traffic, stock quotes
e-Transactions (e-commerce, e-tailing, etc.)
Financial: banking, stock trading
Catalog browsing (generally as an adjunct to paper)
Telephone services
Personal voice dialing, one-number find-me services
Intranet - Inventory, HR services, corporate portals
Unification - My Whatever personal portals, personal agents, unified messaging
Source: http://www.voicexml.org
HP 29
MPEG7
A set of description schemes and descriptors to describe the content of multimedia data
Provides a language to specify description schemes
A scheme for coding the description
HP 30
Application Examples for MPEG7
A few application examples are
Digital libraries (image catalog musical dictionary)
Multimedia directory services (eg yellow pages)
Broadcast media selection (radio channel TV channel)
HP 31
Information and Content Exchange (ICE)
Main Goal: efficient and extensible Content Syndication protocol for the Internet, using XML syntax
Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.
Status: latest spec version 1.1, May 2000; submitted to W3C for review
Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, …
Web Site: http://www.icestandard.org
HP 32
What is the ICE Protocol
Syndication Protocol for communication between Syndicators and Subscribers
Metadata to define
• roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver
• format and method of content exchange (e.g. sequenced packages, pull vs. push model)
HP 33
ICE Applications
ICE vocabulary + domain vocabulary = complete application
ICE
• establishes and manages the syndication
• delivers data
• logs events
=> content-independent metadata
Industry-specific vocabulary defines the content => domain-specific metadata
Source: http://www.icestandard.org
HP 34
ICE Explained
ICE: Information and Content Exchange protocol
Syndicator: A content aggregator and distributor
Subscriber: A content consumer
Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement
Collection: The current content of a subscription
ICE Package: A delivery of commands to update a collection, such as the addition of content items
ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs and other management information
Sources: InternetWeek, ICE Cookbook version 1.0, http://www.internetweek.com/ebizapps01/ebiz050701-3.htm
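The relationship between subscription, collection and package can be illustrated with a small data model. This is a conceptual sketch only: the class and field names are hypothetical, it does not reproduce the ICE wire format, and ignoring a package that was already applied is an assumption made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Package:
    """One delivery: commands that update the subscriber's collection."""
    package_id: str
    add: dict = field(default_factory=dict)     # item-id -> content to add or replace
    remove: list = field(default_factory=list)  # item-ids to delete


@dataclass
class Subscription:
    """Subscriber-side state: the collection is the current content."""
    subscription_id: str
    collection: dict = field(default_factory=dict)
    applied: list = field(default_factory=list)  # package ids, in arrival order

    def apply(self, package):
        """Apply a package once; return False for a repeated delivery."""
        if package.package_id in self.applied:
            return False
        for item_id in package.remove:
            self.collection.pop(item_id, None)
        self.collection.update(package.add)
        self.applied.append(package.package_id)
        return True


subscription = Subscription("4321")
first = Package("pkg-1", add={"strip-1": "demo.gif"})
second = Package("pkg-2", add={"strip-2": "demo2.gif"}, remove=["strip-1"])
subscription.apply(first)
subscription.apply(second)
repeated = subscription.apply(second)
```

After both packages the collection holds only `strip-2`, matching the glossary's view of a collection as the current content of a subscription.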
<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe"
             timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321"
                subscription-element="4321" name="Cartoon" filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru" copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC … (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>
Content (domain-specific metadata)
HP 36
XCM (eXtended Content Management)
A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:
Content Development - Developing static content and managing the process of its subsequent approval, versioning, storage and retrieval
Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle
Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability
Source: http://www.vignette.com/CDA/Site/0,2097,1-1-30-1458-1146-1743,00.html
HP 37
XCM: eXtended Content Management
• Content Development/Management: Content Authoring, Digital Asset Management, Software Configuration Management, Document Process Management
• Application Content Management: Metadata Management, Recombination, Personalization
• Content Delivery: Edge Network Delivery, Streaming Media Delivery, Caching
Source: http://www.vignette.com
HP 38
Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain
FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc/
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
…
UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Address Id: http://gisdasc.kgs.ukans.edu/dasc/
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
…
Kansas State
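One simple way to bridge two such models is a crosswalk table from one set of tag names to the other. The sketch below derives its mapping only from the two example records on this slide; it is not an official FGDC/UDK crosswalk, and the function name is hypothetical.

```python
# Tag correspondences read off the two example records; not an official mapping.
UDK_TO_FGDC = {
    "Search terms": "Theme keywords",
    "Topic": "Title",
    "Address Id": "Online linkage",
    "Measuring Techniques": "Direct Spatial Reference Method",
    "Co-ordinate System": "Horizontal Coordinate System Definition",
}


def udk_to_fgdc(record):
    """Rename UDK tags to FGDC tags; return (translated, unmapped) dictionaries."""
    translated, unmapped = {}, {}
    for tag, value in record.items():
        target = UDK_TO_FGDC.get(tag)
        if target is None:
            unmapped[tag] = value  # keep unknown tags visible instead of dropping them
        else:
            translated[target] = value
    return translated, unmapped


udk_record = {
    "Topic": "Dakota Aquifer",
    "Measuring Techniques": "Vector",
    "Co-ordinate System": "Universal Transverse Mercator",
}
fgdc_record, leftover = udk_to_fgdc(udk_record)
```

A flat rename like this handles only differences in tag names; differences in meaning or value encoding need a shared domain model, which is the motivation for ontologies later in the deck.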
HP 39
Different views of Metadata
Domain Independent Specifications (RDF)
Frameworks/Infrastructures (XCM)
Application Specific: ICE
Media Specific: MPEG7, VoiceXML
Domain Specific: NewsML, FGDC/UDK
HP 40
Creating and Serving Metadata to Power the Life-cycle of Content
Taalee Infrastructure Services; Taalee Content Applications
Where is the content? Whose is it? - Produce/Aggregate, Catalog/Index
What other content is it related to? - Integrate, Syndicate
What is the right content for this user? - Personalize
What is the best way to monetize this interaction? - Interactive Marketing
Broadcast, Wireline, Wireless, Interactive TV
Taalee Semantic MetaBase
HP 41
Taalee's Intelligent Content Process
HP 42
Metadata Creation and Semanticization
• Automatic Content Classification/Categorization
• Metadata Creation/Extraction; types of metadata created
Semantic Engine and WorldModel are trademarks of Taalee Inc. Metadata Extraction is a patented technology of Taalee Inc.
HP 43
Forms/Types/Ingest of Content
Sources: Web Sites, Content Feeds, and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)
HP 44
Content Handling/Ingest
Infrastructure/Exchange
Feed Handlers, Crawlers, Screen Scrapers, Bots, Software Agents
Centralized, Distributed, Mobile/Migratory
HP 45
Information Extraction for Metadata Creation
Global/Enterprise/Web Repositories
METADATA EXTRACTORS
Digital Maps
Nexis, UPI, AP
Documents
Digital Audios
Data Stores
Digital Videos
Digital Images
HP 46
Extracting a Text Document: Syntactic approach
INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT
NATIONAL PREPAREDNESS LEVEL II
CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.
SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr. The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.
CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.
LAYOUT
Date => day month int ',' int
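A layout rule such as `Date => day month int ',' int` maps naturally onto a regular expression. The sketch below is one possible reading in Python: the pattern name `DATE_RULE` and the function `extract_date` are hypothetical, and commas are treated as optional so the rule also matches text from which punctuation has been lost.

```python
import re
from datetime import date

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]

# day month int ',' int  (commas optional, an assumption of this sketch)
DATE_RULE = re.compile(
    r"\b(?P<day>%s),?\s+(?P<month>%s)\s+(?P<dom>\d{1,2}),?\s+(?P<year>\d{4})\b"
    % ("|".join(DAYS), "|".join(MONTHS))
)


def extract_date(text):
    """Return the first date matching the layout rule, or None."""
    match = DATE_RULE.search(text)
    if match is None:
        return None
    month = MONTHS.index(match.group("month")) + 1
    return date(int(match.group("year")), month, int(match.group("dom")))


report_header = "INCIDENT MANAGEMENT SITUATION REPORT Friday, August 1, 1997 - 0530 MDT"
report_date = extract_date(report_header)
```

The syntactic approach works as long as the report layout stays fixed; it carries no understanding of what the document is about.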
Traditional Text Categorization
[Diagram labels: Customer Article Feed (Article 4715); Customer Training Set; Statistical/AI Techniques; Classify/Place in a taxonomy; Routing/Distribution; Classification of Article 4715]
Standard Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Knowledge-base & Statistical/AI Techniques
[Diagram labels: Classify/Place in a taxonomy; Metadata Catalog; Content Manager; Precise syndication/filtering]
Article 4715 Metadata
Standard metadata: Feed Source: iSyndicate; Posted Date: 11/20/2000
Semantic metadata: Company Name: France Telecom, Equant; Ticker Symbol: FTE, ENT; Exchange: NYSE; Topic: Company News
Taxonomy nodes: FTE (Company Analysis, Conference Calls, Earnings, Stock Analysis); NYSE (Member Companies, Market News, IPOs)
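The difference between standard and semantic metadata for an article can be sketched with a dictionary-based tagger. The `COMPANIES` table is a hypothetical knowledge-base fragment modelled on the Article 4715 example, the article text is invented, and simple substring matching stands in for real entity extraction.

```python
# Hypothetical knowledge-base fragment, modelled on the Article 4715 example.
COMPANIES = {
    "France Telecom": {"ticker": "FTE", "exchange": "NYSE"},
    "Equant": {"ticker": "ENT", "exchange": "NYSE"},
}


def semantic_metadata(text, standard):
    """Add company, ticker and exchange fields for every known company in the text."""
    lowered = text.lower()
    found = [name for name in COMPANIES if name.lower() in lowered]
    metadata = dict(standard)
    if found:
        metadata["Company Name"] = found
        metadata["Ticker Symbol"] = [COMPANIES[name]["ticker"] for name in found]
        metadata["Exchange"] = sorted({COMPANIES[name]["exchange"] for name in found})
    return metadata


standard_metadata = {"Feed Source": "iSyndicate", "Posted Date": "11/20/2000"}
article_text = "France Telecom agreed to buy a stake in Equant."  # invented text
article_metadata = semantic_metadata(article_text, standard_metadata)
```

The ticker and exchange never appear in the article text; they come from the knowledge base, which is what allows precise routing into nodes such as FTE or NYSE.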
Automated Content Enrichment (ACE)
Taalee Enterprise Customization Suite
Taalee's Categorization & Automatic Metadata Creation
[Diagram labels: Taalee Training Set; Customer Training Set; Article Feed (Article 4715); Classification of Article 4715 (Company Analysis, Conference Calls, Earnings, Stock Analysis); Routing/Distribution; Map to another taxonomy]
HP 49
Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION AS OF TONIGHT THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49 IN WASHINGTON STATE THE SENATE RACE REMAINS TOO CLOSE TO CALL IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT THE SENATE WILL BE EVENLY DIVIDED IN MISSOURI REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN WHO DIED IN A CRASH THREE WEEKS AGO GOVERNOR CARNAHANS WIFE IS EXPECTED TO TAKE HIS PLACE IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT HILLARY CLINTON WON THE NEW YORK SENATE SEAT SHE IS THE FIRST FIRST LADY TO RUN MUCH LESS WIN
Video Segment with Associated Text
Segment Description
Semantic Metadata
Auto Categorization
HP 50
Automatic Categorization & Metadata Tagging (Web page)
Video with Editorialized Text on the Web
Auto Categorization
Semantic Metadata
HP 51
Automatic Categorization & Metadata Tagging (Feed)
Text From Bloomberg
Auto Categorization
Semantic Metadata
HP 52
Taalee Extraction and Knowledgebase Enhancement
Extraction Agent
Web Page Enhanced Metadata Asset
HP 53
Basis for Semantics
A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary
B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)
Knowledge Base
HP 54
Basis for Semantics
C. Reasoning/Inference: (Statistical) Information Retrieval; Statistical Learning/AI (Bayesian, Neural Networks, HMM, ...); Logic Based (Description Logic); Natural Language/Grammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methods/Cluster Analysis
Learning/AI and Collab. Filtering
Reference data/Concept-terms/Dictionary/Thesaurus, by topic/industry/subject/domain
Word or Phrase
Ontologies/Domain Models
KnowledgeBase, by Entities and Relationships
deeper understanding
HP 56
Open Directory Project (ODP): Classification/Taxonomy & Directory
HP 57
Ontology
Standardize meaning, description, representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes:
Attributes
Domain Rules
Functional Dependencies
HP 59
An Ontology
Example: Interrelated ontologies
[Figure: three interrelated ontologies.
LAND USE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL; related nodes LAND (SITE), CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE.
WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; processes shredding, magnetic separation, screening, washing.
NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI, linked by "causes" relations.]
HP 61
Large Vocabularies, Taxonomies, Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.
Confidential HP
Metadata-enabled Applications
HP 63
Metadata Usage: Impact on Search & Query Processing
Traditional queries based on keywords
Attribute-based queries
Content-based queries
HP 64
Oingo.com
Oingo Ontology - ODP based (?): the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely-related high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:
• Context of the Content
• Semantic Metadata describing entities (e.g., Company, Industry, etc.) and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
"Normal" Content can only be "found" if the user enters a keyword that exists within it
Intelligent Content=+
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:
Context of the Content
Semantic Metadata describing entities (e.g., Company, Industry, etc.) and
Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, ...
HP 70
Metadata Usage Keyword Attribute and Content Based Access
The VisualHarness system at LSDIS/UGA
HP 71
Keyword Search vs Attribute Search with Semantic metadata
Virage Search on football touchdown
Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL httpcbssportsline
Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL httpcbssportsline
Metadata from Typical Cataloging of Football Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31, Pit 24
http://www.nfl.com
Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Qadry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
HP 72
Taalee's Semantic Search
Highly customizable, precise and freshest A/V search
Context and Domain Specific Attributes
Uniform Metadata for Content from Multiple Sources
Can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY & CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content
Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries amp Hot Topics
PERSONALIZATION
3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
4 Pink Floyd Collection
Personalized Queries
Set the Controls for the Heart of the Sun...
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
2 My Football Fantasy Team
Gators' Spurrier ready for big game
Tech's Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone...
Edwards explains reasons for leaving BYU
more...
more...
more...
more...
1 Election 2000
2 Middle East Peace Conflict
3 Napster Controversy
Video Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Cole's security
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II
more...
HOT Topics
more...
more...
HP 80
Metadata Targeting
Semantic/Interactive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
Realtime Feeds
Interests Preferences
Time-Shifted Content Aggregator
Web sites and Pages
Content Databases
Personalized Content
Semantic Engine(TM)
Personalized Content
Content
Structured Hi-Quality Semantic Metabase
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site
User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically search can be done by typing one word or selecting from a dynamic personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata such as "Strong Buy" by H&Q Analyst
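The semantic filtering described for this screen can be sketched roughly as follows. This is only an illustration, not Taalee's or Voquette's implementation: the field table `FIELDS_BY_TYPE`, the `listing` function, the `*` marker and the 20-character width are assumptions made here; the three analyst-call entries are the ones shown on the slide.

```python
# Hypothetical sketch of "semantic filtering" for a small screen.
# FIELDS_BY_TYPE, listing() and the width limit are illustrative assumptions.

# Which metadata fields are displayed for each content type.
FIELDS_BY_TYPE = {
    "Analyst Call": ("date", "source", "analyst"),
}

# The three Cisco analyst calls listed on the slide. Only the H&Q rating is
# mentioned there; None means no additional metadata is assumed.
ANALYST_CALLS = [
    {"date": "11/06", "source": "CBS", "analyst": "Langlesis", "rating": None},
    {"date": "11/08", "source": "ON24", "analyst": "Payne", "rating": None},
    {"date": "11/07", "source": "ON24", "analyst": "H&Q", "rating": "Strong Buy"},
]


def listing(items, content_type, width=20):
    """Return one short display line per item, newest first."""
    fields = FIELDS_BY_TYPE[content_type]
    # Zero-padded MM/DD strings from the same year sort correctly as text.
    newest_first = sorted(items, key=lambda item: item["date"], reverse=True)
    lines = []
    for item in newest_first:
        text = " ".join(item[field] for field in fields)
        # "*" stands in for the icon that signals additional metadata.
        marker = "*" if item.get("rating") else ""
        lines.append((text + marker)[:width])
    return lines


for line in listing(ANALYST_CALLS, "Analyst Call"):
    print(line)
```

A different content type (for example Earnings) would simply get its own entry in the field table, so the same catalog can serve screens with different constraints.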
HP 87
iTV: Taalee's Extreme Personalization
Content Provider
(DBS, DISH, Wink, AOL-TV)
Semantic Engine(TM)
Meta-Data Tagged Content
Content "Programs"
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-Quality Semantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in
This screen is customizable with interactivity features using metadata such as whether there is a new Conference Call video on CSCO
Part of the screen can be automatically customized to show conference call specific information - including transcript, participation, etc. - all of which are relevant metadata
Conference Call itself can have embedded metadata to support personalization and interactivity
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support
Sony
Categorize
Catalog
Integrate
Collection
Processing
NetworkContent
AffiliateFeeds
Public Sources Rich Data
Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
-- Breaking News for 11/30/2000 --
Customize Page Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore
HP 92
Retrieve Scene Description Track
Enhanced Digital Cable
Video
MPEGDecoder
Node = AVO Object
Create Scene Description Tree
GREAT USER EXPERIENCE
Metadata's role in emerging iTV infrastructure
MPEG-2/4/7
MPEG Encoder
Scene Description Tree
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers and Broadcasters
Enhanced XML Description
"Cisco Systems"
Node
Taalee Semantic Engine
"Cisco Systems"
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: Jon Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta
Object Content Information (OCI)
Metadata-richValue-added Node
HP 93
Intelligent Metadata Creation
Content which does contain the words the user asked for
Extractor Agents
Content which does not contain the words the user asked for, but is about what he asked for
Value-added Metadata
Content the user did not think to ask for, but which he needs to know
Semantic Associations
Metadata for Intelligent Content
Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example: If a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"
Understanding of the content is needed to create new metadata
Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
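The Roger Clemens example can be sketched in a few lines. This is a minimal illustration of the idea, not Taalee's engine: the knowledge base contents, the relation names, the field names and the `enrich` and `matches` helpers are all hypothetical and chosen only to mirror the slide.

```python
# Hypothetical knowledge base: entity -> list of (relation, target) facts.
KNOWLEDGE_BASE = {
    "Roger Clemens": [("plays_sport", "Baseball"), ("plays_for", "New York Yankees")],
    "New York Yankees": [("member_of", "American League")],
}

# Hypothetical mapping from a relation to the metadata field it fills.
RELATION_TO_FIELD = {
    "plays_sport": "Sport",
    "plays_for": "Team",
    "member_of": "League",
}


def enrich(metadata, kb=KNOWLEDGE_BASE, max_hops=2):
    """Return a copy of metadata with values inferred from the knowledge base added."""
    enriched = {field: list(values) for field, values in metadata.items()}
    frontier = [value for values in metadata.values() for value in values]
    for _ in range(max_hops):
        next_frontier = []
        for entity in frontier:
            for relation, target in kb.get(entity, []):
                field = RELATION_TO_FIELD[relation]
                if target not in enriched.setdefault(field, []):
                    enriched[field].append(target)
                    next_frontier.append(target)
        frontier = next_frontier
    return enriched


def matches(metadata, query):
    """Plain keyword match against the metadata values."""
    return any(query.lower() in value.lower()
               for values in metadata.values() for value in values)


# Metadata extracted from a story that mentions only the player.
story = {"Player": ["Roger Clemens"]}
enriched_story = enrich(story)

print(matches(story, "Yankees"))           # keyword-only metadata
print(matches(enriched_story, "Yankees"))  # value-added metadata
```

The search itself stays a simple keyword match; what changes is that the catalog entry now carries the team, sport and league that the story never stated.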
HP 96
Guided Demo for Value Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player of the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player of the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 - Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Posted Date: Date of asset posting - extracted automatically
League Name: Name of league to which the player's team belongs - not mentioned explicitly in asset - value-added by Taalee's processing based on semantic associations
Team Name: Name of team for which player plays - not mentioned explicitly in asset - value-added using Taalee's semantic relationships
Producer Name: Name of content provider that produced the asset
Player Names: Names of players mentioned explicitly in the asset - extracted automatically
Sport: Name of sport
Some Metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships
The asset is richly, fully described in the many ways the users choose to interact
Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context or relationships of keywords
For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
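A rough sketch of how a search could return associated content alongside what was asked for. Everything here is illustrative: the fact list, the relation names, the `related` and `search` helpers and the story titles are assumptions made for the example; only the Commerce One / Ariba relationship comes from the slide.

```python
from collections import namedtuple

Story = namedtuple("Story", ["title", "company"])

# Hypothetical facts as (subject, relation, object) triples.
FACTS = [
    ("Commerce One", "participates_in", "Corporate, Professional & Financial Software"),
    ("Ariba", "participates_in", "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]


def related(entity, relation, facts=FACTS):
    """Entities linked to `entity` by `relation`; competes_with is treated as symmetric."""
    found = set()
    for subject, rel, obj in facts:
        if rel != relation:
            continue
        if subject == entity:
            found.add(obj)
        elif obj == entity and relation == "competes_with":
            found.add(subject)
    return found


def search(company, stories):
    """Return (stories asked for, stories about competitors)."""
    asked_for = [s for s in stories if s.company == company]
    competitors = related(company, "competes_with")
    associated = [s for s in stories if s.company in competitors]
    return asked_for, associated


# Placeholder catalog; the titles are invented for the example.
CATALOG = [
    Story("Story on Commerce One", "Commerce One"),
    Story("Story on Ariba", "Ariba"),
    Story("Story on Cisco", "Cisco"),
]

asked_for, associated = search("Commerce One", CATALOG)
print([s.title for s in asked_for])
print([s.title for s in associated])
```

A keyword index alone would return only the first list; the second list exists because the relationship between the two companies is stored as data.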
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata-centric Content Management Architecture
Sources: Internal Source 1 (Research), Internal Source 2, External feeds/Web (e.g., Reuters)
Components: Extractor Agent 1, Extractor Agent 2, Extractor Agent 3; Semantic Engine; World Model; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication
Output: XCM-compliant metadata, XML or other format
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults KnowledgeBase for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard
Story on Cisco
Story on Lucent
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF ndash one proposal (cf Ora Lassila)
Structural modeling obviously not enough
We need a "logic layer" on top of RDF
Some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]
A personal definition
Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three layered Architecture Of Semantic Web
Logical Layer
Formal Semantics and Reasoning Support - OIL, DAML-O
Schema Layer
Definition of Vocabulary - RDF Schema
Data Layer
Simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL ndash Evolving towards Semantic Web
OIL Mission
OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Tests May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude,
                  Earthquake.latitude, Earthquake.longitude ) < 10000
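The rule above can be written out as a small executable check. This is a sketch under stated assumptions, not the InfoQuilt implementation: the record layout (dictionaries with `eventDate`, `latitude`, `longitude`), the haversine formula for `distance`, miles as the distance unit, and the requirement that the earthquake not precede the test are all choices made here for illustration.

```python
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two points, in miles."""
    phi1, phi2 = radians(lat1), radians(lat2)
    half_dphi = (phi2 - phi1) / 2
    half_dlambda = radians(lon2 - lon1) / 2
    a = sin(half_dphi) ** 2 + cos(phi1) * cos(phi2) * sin(half_dlambda) ** 2
    # Clamp to guard against rounding slightly above 1.0.
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(min(1.0, a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """Apply the rule: quake within max_days after the test and within max_miles."""
    days_after = (quake["eventDate"] - test["eventDate"]).days
    apart = distance_miles(test["latitude"], test["longitude"],
                           quake["latitude"], quake["longitude"])
    return 0 <= days_after < max_days and apart < max_miles


def candidate_pairs(tests, quakes):
    """All (test, quake) pairs that satisfy the rule."""
    return [(t, q) for t in tests for q in quakes if may_have_caused(t, q)]


# Made-up records, for illustration only.
test_event = {"eventDate": date(1970, 1, 1), "latitude": 37.0, "longitude": -116.0}
quake_near = {"eventDate": date(1970, 1, 15), "latitude": 36.0, "longitude": -117.0}
quake_before = {"eventDate": date(1969, 12, 20), "latitude": 36.0, "longitude": -117.0}
quake_late = {"eventDate": date(1970, 3, 1), "latitude": 36.0, "longitude": -117.0}
quake_far = {"eventDate": date(1970, 1, 10), "latitude": -37.0, "longitude": 64.0}

pairs = candidate_pairs([test_event], [quake_near, quake_before, quake_late, quake_far])
print(len(pairs))
```

The rule only selects candidate pairs that are close in time and space; it says nothing about causation, which is why the slides go on to examine the counts.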
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example...
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale per year starting from 1900, find average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
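The two aggregate queries above amount to bucketing earthquakes by year and magnitude range. A minimal sketch follows; the `(year, magnitude)` record layout, the half-open bucket boundaries (the top bucket here starts at 9.0) and the sample values are assumptions, not data from the slides.

```python
from collections import Counter

# Magnitude ranges from the slide, treated here as half-open intervals [low, high).
RANGES = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, float("inf"))]


def magnitude_range(magnitude):
    """Return the range a magnitude falls in, or None if it is below 5.8."""
    for low, high in RANGES:
        if low <= magnitude < high:
            return (low, high)
    return None


def counts_per_year(quakes, start_year=1900):
    """Count earthquakes per (year, magnitude range) from start_year onward."""
    counts = Counter()
    for year, magnitude in quakes:
        bucket = magnitude_range(magnitude)
        if year >= start_year and bucket is not None:
            counts[(year, bucket)] += 1
    return counts


def average_per_year(counts, bucket, first_year, last_year):
    """Average yearly count for one magnitude range over an inclusive span of years."""
    years = range(first_year, last_year + 1)
    return sum(counts[(year, bucket)] for year in years) / len(years)


# Made-up (year, magnitude) records, for illustration only.
sample = [(1899, 7.1), (1950, 5.9), (1950, 5.95), (1950, 6.5), (1951, 5.0), (1951, 9.2)]
counts = counts_per_year(sample)
print(average_per_year(counts, (5.8, 6.0), 1950, 1951))
```

Comparing these averages before and after a chosen year, range by range, is the kind of comparison the slide's conclusion rests on.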
Knowledge Discovery - Example...
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and its epicenter lay within a radius of 10000 miles of the test site
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 5
Semantics
Meaning Understanding
Facts Context Reasoning
Related to exchange usage application
HP 6
A metadata classification
Data (Heterogeneous Types/Media)
Content Independent Metadata (creation-date, location, type-of-sensor)
Content Dependent Metadata (size, max colors, rows, columns)
Direct Content Based Metadata (inverted lists, document vectors, WAIS, Glimpse, LSI)
Domain Independent (structural) Metadata (C++ class-subclass relationships, HTML/SGML Document Type Definitions, C program structure)
Domain Specific Metadata: area, population (Census); land-cover, relief (GIS) metadata; concept descriptions from ontologies
Ontologies, Classifications, Domain Models
User
Move in this direction to tackle information overload
HP 7
Types of Metadata for digital media
Media type-specific metadata, e.g., texture of images, font size, ...
Media processing-specific metadata, e.g., search, retrieval, personalized filtering
Content-specific metadata, e.g., rocket-related video and documents
HP 8
Metadata for Digital Data
Metadata | Data Type | Metadata Type
Q-Features [Jain and Hampapur] | Image, Video | Domain Specific
R-Features [Jain and Hampapur] | Image, Video | Domain Independent
Meta-Features [Jain and Hampapur] | Image, Video | Content Independent
Impression Vector [Kiyoki et al] | Image | Content Descriptive
NDVI, Spatial Registration [Anderson and Stonebraker] | Image | Domain Specific
Speech Feature Index [Glavitsch et al] | Audio | Direct Content Based
Topic Change Indices [Chen et al] | Audio | Direct Content Based
Document Vectors [Deerwester et al] | Text | Direct Content Based
Inverted Indices [Kahle and Medlar] | Text | Direct Content Based
Content Classification Metadata [Bohm and Rakow] | MultiMedia | Domain Specific
Document Composition Metadata [Bohm and Rakow] | MultiMedia | Domain Independent
Metadata Templates [Ordille and Miller] | Media Independent | Domain Specific
Land Cover, Relief [Sheth and Kashyap] | Media Independent | Domain Specific
Parent Child Relationships [Shklar et al] | Text | Domain Independent
Contexts [Sciore et al, Kashyap and Sheth] | Structured | Domain Specific
Concepts from Cyc [Collet et al] | Structured | Domain Specific
User's Data Attributes [Shoens et al] | Text, Structured | Domain Specific
Domain Specific Ontologies [Mena et al] | Media Independent | Domain Specific
HP 9
Types of Specs and Standards (or MetaModels)
Domain Independent: (MCF), RDF, MOF, Dublin Core
Media Specific: MPEG4, MPEG7, VoiceXML
Domain/Industry Specific (metamodels): MARC (Library), FGDC and UDK (Geographic), NewsML (News), PRISM (Publishing)
Application Specific: ICE (Syndication)
Exchange/Sharing: XCM, XMI
Orthogonal (Other): RDFS, namespaces, ontologies, domain models (DAML, OIL)
HP 10
what RDF can do for metadata
Designed to impose structural constraints on syntax to support consistent encoding, exchange and processing of metadata
Domain Independent Metadata standard
HP 11
RDF (Resource Description Framework)
Property, Value, Resource
• RDF data consists of nodes and attached attribute/value pairs
• Nodes can be any web resources (pages, servers, basically anything for which you can give a URI), even other instances of metadata
• Attributes are named properties of the nodes, and their values are either atomic (text strings, numbers, etc.) or other resources or metadata instances
HP 12
RDF Example 1
[Graph: URITALK has dc:title "Mysteries of Metadata" and dc:creator URIAMIT]
<?xml version='1.0'?>
<rdf:RDF xmlns:rdf="http://www.w3.org/TR/REC-rdf-syntax#"
         xmlns:dc="http://purl.org/dc/elements/1.0/">
  <rdf:Description rdf:about="URITALK">
    <dc:title>Mysteries of Metadata</dc:title>
    <dc:creator rdf:resource="URIAMIT"/>
  </rdf:Description>
</rdf:RDF>
HP 13
RDF Example 2
[Graph: URITALK has dc:title "Mysteries of Metadata" and dc:creator URIAMIT; URIAMIT has BIB:Name "Amit Sheth", BIB:Email amit@taalee.com, and BIB:Aff URILIB]
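The node-and-property model in these two examples can be mimicked with plain tuples. This is only a sketch of the data model, not an RDF library: the `Resource` and `Statement` types and the two helper functions are invented for illustration, and the node labels URITALK and URIAMIT are the placeholders used on the slides.

```python
from typing import NamedTuple, Union


class Resource(NamedTuple):
    """A node identified by a URI (here, the slide's placeholder labels)."""
    uri: str


class Statement(NamedTuple):
    """One (resource, property, value) statement; the value is a literal or a Resource."""
    subject: Resource
    prop: str
    value: Union[str, Resource]


TALK = Resource("URITALK")
AMIT = Resource("URIAMIT")

statements = [
    Statement(TALK, "dc:title", "Mysteries of Metadata"),
    Statement(TALK, "dc:creator", AMIT),      # value is another resource
    Statement(AMIT, "BIB:Name", "Amit Sheth"),  # value is an atomic literal
]


def properties_of(node, graph):
    """Collect the property/value pairs attached to one node."""
    return {s.prop: s.value for s in graph if s.subject == node}


def creator_name(node, graph):
    """Follow dc:creator; if it points at a resource, return that resource's BIB:Name."""
    creator = properties_of(node, graph).get("dc:creator")
    if isinstance(creator, Resource):
        return properties_of(creator, graph).get("BIB:Name")
    return creator


print(creator_name(TALK, statements))
```

Because a value can itself be a resource, descriptions chain: the talk points to its creator, and the creator node carries its own properties.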
HP 14
RDFS (RDF Schema)
Enables resource description communities to define (and share) vocabularies (museum, library, e-commerce, ...)
Vocabulary (in RDFS) = the meaning, characteristics and relationships of a set of properties
HP 15
RDF Based Web
HTML
Resources
RDF/XML Descriptions
RDF Schemas
Source: http://www.w3c.rl.ac.uk
HP 16
Dublin Core Metadata Initiative
Simple element set designed for resource description
International, interdisciplinary, W3C community consensus
"Semantic" interface among resource description communities (very limited form of semantics)
Source: www.desire.org
HP 17
Dublin Core RDF
<xml>
<?namespace href="http://w3.org/rdf-schema" as="RDF"?>
<?namespace href="http://metadata.net/DC" as="DC"?>
<RDF:Abbreviated>
  <RDF:Assertion RDF:HREF="http://www.mysite.com/mydoc.html"
    DC:Title="I've Never Metadata I've Never Liked"
    DC:Creator="Mary Crystal"
    DC:Subject="Metadata, Dublin Core, Stuff"/>
</RDF:Abbreviated>
</xml>
HP 18
MOF (Meta Object Facility) and XMI
MOF models metadata using a subset of UML that is relevant to modeling metadata (class models: classes, associations, and subtyping), together with a set of rules for mapping the elements of the MOF Core to CORBA IDL
XML Metadata Interchange (XMI) is an extension of the MOF into the XML space
HP 19
NewsML
NewsML is a packaging and metadata format for news content.
NewsML is developed by the International Press Telecommunications Council (IPTC), a consortium of news providers, mostly in the print or wire-service industries.
Since it deals only with packaging and metadata, NewsML is complementary both to news content formats like NITF and to syndication protocols like ICE.
HP 20
NewsML...
It can be used by news providers to combine their pictures, video, text, graphics, and audio files in news output available on web sites, mobile phones, high-end desktops, interactive television, and any other device.
Accurate, objective set of description tools which help qualify the information and make the search more precise.
NewsML allows a range of metadata to be attached to a multimedia story, including a detailed computer-readable description of what an item is about.
HP 21
Example of the end-to-end flow - NewsML
The content provider supplies NewsML-packaged media content to the operator. The content is categorized as current events, finance, sport, etc., and updated hourly.
The operator receives NewsML data from the content provider. The content server automatically pushes updated news articles to all news service subscribers.
Consumers sign up for the news service directly on the device. When using the news service, the user browses through the categories and reads the news articles. The news articles are presented in a continuous flow (one after the other) without end-user interaction.
Source: http://www.mediabricks.com
HP 22
PRISM
Publishing Requirements for Industry Standard Metadata
Version 1.0, April 2001
Authors: IDEAlliance (Adobe, Vignette, Kinecta, et al.)
Idea: "a standard for interoperable content description, interchange, and reuse in both traditional and electronic publishing contexts"
Web site: http://www.prismstandard.org
HP 23
PRISM Design
Built on existing standards like Dublin Core (DC), RDF, XML
Designed to be used in a simple, straightforward way over the Internet
Compatible with NewsML
Integrates easily with ICE (for syndication)
Vocabulary:
Basic DC
Extensions: "Controlled Vocabularies", e.g. "North American Industrial Classification System" (NAICS)
HP 24
PRISM Example
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845"/>
    <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>
(Source: PRISM spec v. 1, http://www.prismstandard.org/techdev/prismspec1.asp)
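A record like the PRISM example is easy to consume programmatically because it is plain RDF/XML with Dublin Core elements. The sketch below, using only Python's standard `xml.etree.ElementTree`, shows one way to pull the Dublin Core fields out of such a record. It is an illustration, not part of the PRISM spec: the function name `dublin_core_fields` is hypothetical, the sample is an abbreviated copy of the slide's example, and the namespace URIs are the standard RDF and Dublin Core 1.1 ones.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Abbreviated copy of the PRISM example record.
PRISM_RECORD = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>"""


def dublin_core_fields(rdf_xml):
    """Map each described resource URI to its Dublin Core fields."""
    root = ET.fromstring(rdf_xml)
    records = {}
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        about = desc.get(f"{{{RDF_NS}}}about")
        fields = {}
        for child in desc:
            # ElementTree spells qualified names as "{namespace}local".
            if child.tag.startswith(f"{{{DC_NS}}}"):
                local_name = child.tag[len(DC_NS) + 2:]
                fields[local_name] = (child.text or "").strip()
        records[about] = fields
    return records


print(dublin_core_fields(PRISM_RECORD))
```

The same loop works for any vocabulary layered on RDF/XML; only the namespace constant changes, which is the practical benefit of building PRISM on RDF and Dublin Core.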
HP 25
VoiceXML
A language for specifying voice dialogs
Voice dialogs use audio prompts and text-to-speech (TTS) for output; touch-tone keys (DTMF) and automatic speech recognition (ASR) for input
Goal is to bring the advantages of web-based development and content delivery to interactive voice response applications
High-level, voice-specific language simplifies application development
Source: http://www.voicexml.org
HP 26
Voice Based Internet Applications
Source: http://www.voicexml.org
HP 27
Voice XML Metadata
Voice Specific metadata
Supports syntactic interoperability
Text data to voice data
Voice XML = XML + Voice Metadata
HP 28
VoiceXML - Possible Services
Information retrieval - News, sports, traffic, stock quotes
e-Transactions (e-commerce, e-tailing, etc.)
Financial: banking, stock trading
Catalog browsing (generally as an adjunct to paper)
Telephone services
Personal voice dialing, one-number "find-me" services
Intranet - Inventory, HR services, corporate portals
Unification - "My Whatever" personal portals, personal agents, unified messaging
Source: http://www.voicexml.org
HP 29
MPEG7
A set of description schemes and descriptors to describe the content of multimedia data
Provides a language to specify description schemes
A scheme for coding the description
HP 30
Application Examples for MPEG7
A few application examples are
Digital libraries (image catalog, musical dictionary)
Multimedia directory services (e.g., yellow pages)
Broadcast media selection (radio channel, TV channel)
HP 31
Information and Content Exchange (ICE)
Main Goal: efficient and extensible Content Syndication protocol for the Internet, using XML syntax
Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.
Status: latest spec version 1.1, May 2000; submitted to W3C for review
Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, ...
Web Site: http://www.icestandard.org
HP 32
What is the ICE Protocol
Syndication Protocol for communication between Syndicators and Subscribers
Metadata to define:
roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver
format and method of content exchange (e.g., sequenced packages, pull vs. push model)
HP 33
ICE Applications
ICE vocabulary + domain vocabulary = complete application
ICE establishes and manages the syndication, delivers data, and logs events => content-independent metadata
Industry-specific vocabulary defines the content => domain-specific metadata
Source: http://www.icestandard.org
HP 34
ICE Explained
ICE: Information and Content Exchange protocol
Syndicator: A content aggregator and distributor
Subscriber: A content consumer
Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement
Collection: The current content of a subscription
ICE Package: A delivery of commands to update a collection, such as the addition of content items
ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information
Sources: InternetWeek; ICE Cookbook version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm
<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe"
             timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon"
                filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru"
                     copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC ... (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>
Content (domain-specific metadata)
HP 36
XCM (eXtended Content Management)
A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:
Content Development - Developing static content and managing the process of its subsequent approval, versioning, storage, and retrieval
Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle
Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability
Source httpwwwvignettecomCDASite020971-1-30-1458-1146-174300html
HP 37
XCM: eXtended Content Management
Content Development: Content Authoring, Digital Asset Management, Software Configuration Management, Document Process Management
Application Content Management: Metadata Management, Recombination, Personalization
Content Delivery: Edge Network Delivery, Streaming Media Delivery, Caching
Source: http://www.vignette.com
HP 38
Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain (example: Kansas State)
FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc/
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
...
UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Address Id: http://gisdasc.kgs.ukans.edu/dasc/
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
...
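Since the two models describe the same data under different tag names, one simple bridge is a tag-to-tag crosswalk. The Python sketch below illustrates the idea using only the tag pairs visible on the slide; the table name `FGDC_TO_UDK` and the function `fgdc_to_udk` are hypothetical, and a real crosswalk between FGDC and UDK would need many more entries and value conversions that the slide does not show.

```python
# Hypothetical crosswalk built from the five tag pairs shown on the slide.
FGDC_TO_UDK = {
    "Theme keywords": "Search terms",
    "Title": "Topic",
    "Online linkage": "Address Id",
    "Direct Spatial Reference Method": "Measuring Techniques",
    "Horizontal Coordinate System Definition": "Co-ordinate System",
}


def fgdc_to_udk(record):
    """Rename FGDC tags to their UDK equivalents; unmapped tags are kept."""
    return {FGDC_TO_UDK.get(tag, tag): value for tag, value in record.items()}


fgdc_record = {
    "Title": "Dakota Aquifer",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}

print(fgdc_to_udk(fgdc_record))
```

A flat renaming like this handles only syntactic differences in tag names; where two models disagree on meaning or granularity, the ontology-based approaches discussed later in the deck are the more appropriate tool.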
HP 39
Different views of Metadata
Domain Independent Specifications (RDF)
Frameworks/Infrastructures (XCM)
Metadata
Application Specific: ICE
Media Specific: MPEG7, VoiceXML
Domain Specific: NewsML, FGDC/UDK
HP 40
Creating and Serving Metadata to Power the Life-cycle of Content
Taalee Infrastructure Services, Taalee Content Applications
Where is the content?
Whose is it?
Produce/Aggregate
Catalog/Index
What other content is it related to?
Integrate, Syndicate
What is the right content for this user?
Personalize
What is the best way to monetize this interaction?
Interactive Marketing
Broadcast, Wireline, Wireless, Interactive TV
Taalee Semantic MetaBase
HP 41
Taalee's Intelligent Content Process
HP 42
Metadata Creation and Semanticization
• Automatic Content Classification/Categorization
• Metadata Creation/Extraction
Types of metadata created
HP 43
Forms/Types/Ingest of Content
Sources: Web Sites, Content Feeds, and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)
HP 44
Content Handling/Ingest
Infrastructure: Exchange, Feed Handlers, Crawlers, Screen Scrapers, Bots, Software Agents
Centralized, Distributed, Mobile/Migratory
HP 45
Information Extraction for Metadata Creation
Global/Enterprise/Web Repositories
METADATA EXTRACTORS
Digital Maps
Nexis, UPI, AP
Documents
Digital Audios
Data Stores
Digital Videos
Digital Images
HP 46
Extracting a Text Document: Syntactic approach
INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT
NATIONAL PREPAREDNESS LEVEL II
CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.
SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr. The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.
CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.
LAYOUT
Date => day month int ',' int
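The layout rule "Date => day month int ',' int" is a small grammar, and a regular expression is one straightforward way to apply it. The Python sketch below is an illustration of the syntactic approach only, not Taalee's extractor; the names `DATE_RE` and `extract_dates` are hypothetical, and the commas are treated as optional because the extracted report text does not always preserve them.

```python
import re

DAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"
MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")

# Date => day month int ',' int   (commas made optional for robustness)
DATE_RE = re.compile(
    rf"\b(?P<day>{DAYS}),?\s+(?P<month>{MONTHS})\s+"
    rf"(?P<dom>\d{{1,2}}),?\s+(?P<year>\d{{4}})\b"
)


def extract_dates(text):
    """Return one dict per date found, keyed by the rule's components."""
    return [match.groupdict() for match in DATE_RE.finditer(text)]


report = "INCIDENT MANAGEMENT SITUATION REPORT Friday, August 1, 1997 - 0530 MDT"
print(extract_dates(report))
```

A purely syntactic rule like this finds the date but knows nothing about what it dates; linking it to the incident, the district, or the fire names is where the semantic techniques on the following slides come in.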
Traditional Text Categorization
Diagram elements: Customer Article Feed (article 4715); Customer Training Set; Statistical/AI Techniques; Classify / Place in a taxonomy; Routing/Distribution
Standard Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Classification of Article 4715
Taalee's Categorization & Automatic Metadata Creation
Automated Content Enrichment (ACE), Taalee Enterprise Customization Suite
Diagram elements: Article Feed (article 4715); Taalee Training Set; Customer Training Set; Knowledge-base & Statistical/AI Techniques; Classify / Place in a taxonomy; Map to another taxonomy; Metadata Catalog; Content Manager; Precise syndication/filtering; Routing/Distribution
Article 4715 Metadata
Standard metadata: Feed Source: iSyndicate; Posted Date: 11/20/2000
Semantic metadata: Company Name: France Telecom, Equant; Ticker Symbol: FTE, ENT; Exchange: NYSE; Topic: Company News
Classification of Article 4715
FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
NYSE: Member Companies, Market News, IPOs
ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
HP 49
Automatic Categorization & Metadata Tagging (unstructured text: transcript of A/V)
ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.
Video Segment with Associated Text
Segment Description
Semantic Metadata
Auto Categorization
HP 50
Automatic Categorization & Metadata Tagging (Web page)
Video with Editorialized Text on the Web
Auto Categorization
Semantic Metadata
HP 51
Automatic Categorization & Metadata Tagging (Feed)
Text from Bloomberg
Auto Categorization
Semantic Metadata
HP 52
Taalee Extraction and Knowledgebase Enhancement
Extraction Agent
Web Page Enhanced Metadata Asset
HP 53
Basis for Semantics
A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary
B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)
Knowledge Base
HP 54
Basis for Semantics
C. Reasoning/Inference
(Statistical) (Information Retrieval)
Statistical Learning/AI (Bayesian, Neural Networks, HMM, ...)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methods: Cluster Analysis
Learning/AI and Collab. Filtering
Reference data: Concept-terms, Dictionary/Thesaurus, by topic/industry/subject/domain
Word or Phrase
Ontologies/Domain Models
KnowledgeBase: by Entities and Relationships
deeper understanding
HP 56
Open Directory Project (ODP) Classification/Taxonomy & Directory
HP 57
Ontology
Standardize meaning, description, representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes:
Attributes
Domain Rules
Functional Dependencies
HP 59
An Ontology
Example Interrelated ontologies
LANDUSE
COMMERCIAL
INDUSTRIALRURAL
RESIDENTIAL
AGRICULTURAL
MILITARYRECREATIONAL
LAND(SITE)
CULTIVATEDAREA
GREENLANDAREA LAND
BANK
ZONING
LANDFILLSITE
WASTEDISPOSAL
RECYCLING
HAZARDOUS
LANDFILLRESOURCE REC
SOLID SEWAGE
shredding
magneticseparation
screening
washing
NATURALDISASTER
EARTHQUAKE
causes
LANDSLIDE
VOLCANO
STORMFLOOD
FIRE
AVALANCHE
TSUNAMI
causes
causes
causes
HP 61
Large Vocabularies, Taxonomies, Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.
Confidential HP
Metadata-Enabled Applications
HP 63
Metadata Usage: Impact on Search & Query processing
traditional queries based on keywords
attribute-based queries
content-based queries
HP 64
Oingo.com
Oingo Ontology - ODP based (?); the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning used to state intent; the basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged, and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:
• Context of the Content
• Semantic Metadata describing entities (e.g., Company, Industry, etc.) and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
"Normal" Content can only be "found" if the user enters a keyword that exists within it
Intelligent Content = Content + related metadata and relationships
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:
Context of the Content
Semantic Metadata describing entities (e.g., Company, Industry, etc.) and
Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, ...
HP 70
Metadata Usage Keyword Attribute and Content Based Access
The VisualHarness system at LSDIS/UGA
HP 71
Keyword Search vs Attribute Search with Semantic metadata
Virage search on "football touchdown"
Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline
Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline
Metadata from Typical Cataloging of Football Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31, Pit 24
http://www.nfl.com
Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Qadry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
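The contrast on this slide can be sketched in a few lines of Python: a keyword search matches any asset whose text happens to contain the word, while an attribute search filters on typed metadata fields. This is an illustrative toy, not the Virage or Taalee implementation; the record layout and the function names `keyword_search` and `attribute_search` are assumptions, and the field values are copied from the slide.

```python
ASSETS = [
    {
        "title": "Brian Griese Interview Part Four",
        "text": "Brian Griese talks about the first touchdown he ever threw",
        "metadata": {},  # typical cataloging: no structured attributes
    },
    {
        "title": "Baltimore 31, Pit 24",
        "text": "Ismail and Banks hook up for their third long touchdown",
        "metadata": {
            "League": ["Professional"],
            "Teams": ["Ravens", "Steelers"],
            "Players": ["Qadry Ismail", "Tony Banks"],
            "Event": ["Touchdown"],
        },
    },
]


def keyword_search(assets, word):
    """Match any asset whose free text contains the word."""
    return [a for a in assets if word.lower() in a["text"].lower()]


def attribute_search(assets, **criteria):
    """Match assets whose metadata holds every requested field value."""
    return [
        a for a in assets
        if all(value in a["metadata"].get(field, [])
               for field, value in criteria.items())
    ]


print([a["title"] for a in keyword_search(ASSETS, "touchdown")])
print([a["title"] for a in attribute_search(ASSETS, Event="Touchdown", Teams="Ravens")])
```

The keyword query returns the interview as well as the game highlight, because both mention the word; the attribute query returns only the asset whose Event really is a touchdown involving the Ravens.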
HP 72
Taalee's Semantic Search
Highly customizable, precise, and freshest A/V search
Context and Domain Specific Attributes
Uniform Metadata for Content from Multiple Sources; can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do?
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY amp CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content
Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries amp Hot Topics
PERSONALIZATION
3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
4 Pink Floyd Collection
Personalized Queries
Set the Controls for the Heart of the Sun...
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
2 My Football Fantasy Team
Gators' Spurrier ready for big game
Tech's Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone...
Edwards explains reasons for leaving BYU
1 Election 2000
2 Middle East Peace Conflict
3 Napster Controversy
Video Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Cole's security
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II
HOT Topics
HP 80
Metadata Targeting
SemanticInteractive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
Realtime Feeds
Interests Preferences
Time-ShiftedContent Aggregator
Web sites and Pages
ContentDatabases Personalized
Content
Semantic EngineTM
Personalized Content
Content
Structured Hi-Quality
Semantic Metabase
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site
User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories.
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide).
Search at the bottom is a semantic search that understands the financial domain and the knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu.
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst.
HP 87
iTV: Taalee's Extreme Personalization
Content Provider
(DBS DISH Wink AOL-TV)
Semantic EngineTM
Meta-DataTagged Content
ContentldquoProgramsrdquo
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-QualitySemantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in
This screen is customizable with an interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO
Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata
The Conference Call itself can have embedded metadata to support personalization and interactivity
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support
Sony
Categorize
Catalog
Integrate
Collection, Processing
Network Content
Affiliate Feeds
Public Sources, Rich Data
Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp. (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore
HP 92
Retrieve Scene Description Track
Enhanced Digital Cable
Video
MPEGDecoder
Node = AVO Object
Create Scene Description Tree
GREATUSER
EXPERIENCE
Metadata's role in emerging iTV infrastructure
MPEG-2/4/7
MPEG Encoder
SceneDescriptionTree
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers, and Broadcasters
Enhanced XML
Description
"Cisco Systems"
Node
Taalee Semantic Engine
"Cisco Systems"
Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: John Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta
Object Content Information (OCI)
Metadata-richValue-added Node
HP 93
Intelligent Metadata Creation
Content which does contain the words the user asked for: Extractor Agents
Content which does not contain the words the user asked for, but is about what he asked for: Value-added Metadata
Content the user did not think to ask for, but which he needs to know: Semantic Associations
Metadata for Intelligent Content
Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content:
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".
Understanding of the content is needed to create new metadata.
Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses such Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely.
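The idea of value-added metadata can be sketched as a lookup against a small knowledge base: once an entity is recognized in the content, the facts associated with it are attached as metadata even though the content never states them. The Python below is a toy illustration under stated assumptions, not Taalee's Semantic Engine; `KNOWLEDGE_BASE` and `enrich` are hypothetical names, entity recognition is reduced to a substring match, the sample story text is invented, and the facts are the ones given in this deck.

```python
# Hypothetical knowledge base: entity -> facts linked by semantic associations.
KNOWLEDGE_BASE = {
    "Roger Clemens": {"Sport": "Baseball", "Team": "New York Yankees"},
    "Jamal Anderson": {"League": "NFL", "Team": "Atlanta Falcons"},
}


def enrich(text, kb=KNOWLEDGE_BASE):
    """Attach facts about every known entity mentioned in the text."""
    metadata = {}
    for entity, facts in kb.items():
        if entity.lower() in text.lower():  # naive entity recognition
            metadata.setdefault("Player", []).append(entity)
            for field, value in facts.items():
                metadata.setdefault(field, []).append(value)
    return metadata


# Invented sample story that never mentions the team by name.
story = "Roger Clemens strikes out ten in a complete-game win"
print(enrich(story))
```

A search for "New York Yankees" over the enriched metadata now finds this story, which a keyword index over the text alone could not do.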
HP 96
Guided Demo for Value Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team = Atlanta Falcons or League = NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN   Posted Date: 3/03/2001   League: National League
Teams: Los Angeles Dodgers   Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 - Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Producer Name: name of the content provider that produced the asset
Posted Date: date of asset posting - extracted automatically
Player Names: names of players mentioned explicitly in the asset - extracted automatically
Team Name: name of the team for which the player plays - not mentioned explicitly in the asset - value-added using Taalee's semantic relationships
League Name: name of the league to which the player's team belongs - not mentioned explicitly in the asset - value-added by Taalee's processing based on semantic associations
Sport: name of the sport
Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships
The asset is richly and fully described in the many ways users choose to interact
HP 101
Intelligent Contentvia
Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context or relationships of keywords
For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
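A minimal sketch of how such associations could drive retrieval of related content follows. It assumes relationships are stored as simple (subject, relationship, object) triples; `TRIPLES`, `STORIES`, `competitors` and `semantically_related_stories` are hypothetical names for illustration, and the facts are the ones quoted in this deck (Commerce One and Ariba, Cisco and Lucent).

```python
# Hypothetical triple store: (subject, relationship, object).
TRIPLES = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Commerce One", "participates_in", "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
    ("Cisco", "competes_with", "Lucent"),
]

# Hypothetical catalog of stories, each tagged with the companies it mentions.
STORIES = [
    {"title": "Story on Cisco", "companies": ["Cisco"]},
    {"title": "Story on Lucent", "companies": ["Lucent"]},
    {"title": "Story on Ariba", "companies": ["Ariba"]},
]


def competitors(company):
    """Return competitors of a company, treating competes_with as symmetric."""
    found = set()
    for subject, relationship, obj in TRIPLES:
        if relationship != "competes_with":
            continue
        if subject == company:
            found.add(obj)
        elif obj == company:
            found.add(subject)
    return sorted(found)


def semantically_related_stories(company):
    """Return titles of stories about the company's competitors."""
    rivals = competitors(company)
    return [
        story["title"]
        for story in STORIES
        if any(name in rivals for name in story["companies"])
    ]
```

A keyword search for "Cisco" would return only the Cisco story; `semantically_related_stories("Cisco")` additionally surfaces the Lucent story, which the user did not ask for but may need.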
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata centric Content Management Architecture
Sources: Internal Source 1 (Research), Internal Source 2, External feeds / Web (e.g., Reuters)
Components: Extractor Agents 1-3, Semantic Engine, World Model, Taalee Metabase, Semantic Application (ASP / Enterprise hosted), Third-party Content Mgmt and Syndication
Exchange format: XCM-compliant metadata, XML or other format
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults Knowledge Base for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story, passed on to Dashboard
Output: Story on Cisco, Story on Lucent
HP 105
Semantic Associationssupported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application ExampleFinancial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic SearchPersonalization etc
A vision for future
Semantic Web Complex Relationships and Knowledge Discovery
Eg InfoQuilt project at LSDIS Lab Univ of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling obviously not enough
we need a "logic layer" on top of RDF
some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]
A personal definition
Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)
a proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF+XML
Agent readable Tags
wwwdamlorg
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three layered Architecture Of Semantic Web
Logical Layer
Formal Semantics and Reasoning Support - OIL, DAML-O
Schema Layer
Definition of Vocabulary RDF Schema
Data Layer
Simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL - Evolving towards Semantic Web
OIL Mission
OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Test May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake <=
    dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30
    AND distance(NuclearTest.latitude, NuclearTest.longitude,
                 Earthquake.latitude, Earthquake.longitude) < 10000
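The rule above can be sketched as an ordinary predicate. This is an illustrative reading, not the InfoQuilt implementation: it assumes the date difference is measured in days and the distance in miles (the units given in the later query slide), uses the haversine great-circle formula as a stand-in for the unspecified `distance` function, and represents events as plain dictionaries with hypothetical sample values.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))


def could_have_caused(test, quake, max_days=30, max_miles=10000):
    """True if the quake follows the test within max_days and lies within max_miles."""
    days_after = (quake["eventDate"] - test["eventDate"]).days
    if not 0 <= days_after < max_days:
        return False
    separation = distance_miles(
        test["latitude"], test["longitude"], quake["latitude"], quake["longitude"]
    )
    return separation < max_miles


# Hypothetical sample events, for illustration only.
nuclear_test = {"eventDate": date(1970, 1, 1), "latitude": 37.0, "longitude": -116.0}
near_quake = {"eventDate": date(1970, 1, 15), "latitude": 36.0, "longitude": -117.0}
late_quake = {"eventDate": date(1970, 3, 1), "latitude": 36.0, "longitude": -117.0}
early_quake = {"eventDate": date(1969, 12, 25), "latitude": 36.0, "longitude": -117.0}
```

Requiring a non-negative day difference encodes the "some time after" condition; the slide's rule states only the upper bound of 30.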
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale, per year, starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example…
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9 and >9 on the Richter scale, per year, starting from 1900, find the average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
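The grouping step of this query could look like the sketch below. It is a simplified illustration: earthquakes are assumed to be (year, magnitude) pairs, the ranges are treated as half-open intervals (so a magnitude of exactly 9.0 falls into the last group, which is an assumption about boundary handling), and the names `RANGES`, `magnitude_range` and `counts_per_year` are hypothetical.

```python
from collections import Counter

# Magnitude ranges from the slide, as half-open intervals [low, high).
RANGES = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, float("inf"))]


def magnitude_range(magnitude):
    """Return the (low, high) range containing the magnitude, or None if below 5.8."""
    for low, high in RANGES:
        if low <= magnitude < high:
            return (low, high)
    return None


def counts_per_year(earthquakes, start_year=1900):
    """Count earthquakes per (year, magnitude range), starting from start_year."""
    counts = Counter()
    for year, magnitude in earthquakes:
        bucket = magnitude_range(magnitude)
        if year >= start_year and bucket is not None:
            counts[(year, bucket)] += 1
    return counts


# Hypothetical sample data, for illustration only.
sample = [(1950, 5.9), (1950, 6.5), (1950, 6.9), (1951, 7.2), (1899, 8.0), (1950, 5.0)]
counts = counts_per_year(sample)
```

Averaging the per-year counts of each range before and after a chosen year would then give the comparison the slide describes.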
Knowledge Discovery - Example…
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and the test site lies within a radius of 10000 miles of the epicenter of the earthquake
Demo
ResourcesReferences
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 6
A metadata classification
Data (Heterogeneous Types/Media)
Content Independent Metadata (creation-date, location, type-of-sensor)
Content Dependent Metadata (size, max colors, rows, columns)
Direct Content Based Metadata (inverted lists, document vectors, WAIS, Glimpse, LSI)
Domain Independent (structural) Metadata (C++ class-subclass relationships, HTML/SGML Document Type Definitions, C program structure)
Domain Specific Metadata: area, population (Census); land-cover, relief (GIS); metadata concept descriptions from ontologies
Ontologies, Classifications, Domain Models
User
Move in this direction to tackle information overload
HP 7
Types of Metadata for digital media
Media type-specific metadata, e.g., texture of images, font size…
Media processing-specific metadata, e.g., search, retrieval, personalized filtering
Content Specific metadata, e.g., rocket related video and documents
HP 8
Metadata for Digital Data
Metadata | Data Type | Metadata Type
Q-Features [Jain and Hampapur] | Image, Video | Domain Specific
R-Features [Jain and Hampapur] | Image, Video | Domain Independent
Meta-Features [Jain and Hampapur] | Image, Video | Content Independent
Impression Vector [Kiyoki et al] | Image | Content Descriptive
NDVI, Spatial Registration [Anderson and Stonebraker] | Image | Domain Specific
Speech Feature Index [Glavitsch et al] | Audio | Direct Content Based
Topic Change Indices [Chen et al] | Audio | Direct Content Based
Document Vectors [Deerwester et al] | Text | Direct Content Based
Inverted Indices [Kahle and Medlar] | Text | Direct Content Based
Content Classification Metadata [Bohm and Rakow] | MultiMedia | Domain Specific
Document Composition Metadata [Bohm and Rakow] | MultiMedia | Domain Independent
Metadata Templates [Ordille and Miller] | Media Independent | Domain Specific
Land Cover, Relief [Sheth and Kashyap] | Media Independent | Domain Specific
Parent Child Relationships [Shklar et al] | Text | Domain Independent
Contexts [Sciore et al, Kashyap and Sheth] | Structured | Domain Specific
Concepts from Cyc [Collet et al] | Structured | Domain Specific
User's Data Attributes [Shoens et al] | Text, Structured | Domain Specific
Domain Specific Ontologies [Mena et al] | Media Independent | Domain Specific
HP 9
Types of Specs and Standards(or MetaModels)
Domain Independent (MCF) RDF MOF DublinCore
Media Specific MPEG4 MPEG7 VoiceXML
DomainIndustry Specific (metamodels) MARC (Library) FGDC and UDK (Geographic) NewsML (News) PRISM (Publishing)
Application Specific ICE (Syndication)
ExchangeSharing XCM XMI
Orthogonal(Other) RDFS namespaces ontologies domain models (DAML OIL)
HP 10
what RDF can do for metadata
Designed to impose structural constraints on syntax to support consistent encoding, exchange and processing of metadata
Domain Independent Metadata standard
HP 11
RDF (Resource Description Framework)
Resource, Property, Value
• RDF data consists of nodes and attached attribute/value pairs
• Nodes can be any web resources (pages, servers, basically anything for which you can give a URI), even other instances of metadata
• Attributes are named properties of the nodes, and their values are either atomic (text strings, numbers, etc.) or other resources or metadata instances
HP 12
RDF Example 1
URI:TALK --dc:title--> "Mysteries of Metadata"
URI:TALK --dc:creator--> URI:AMIT

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/TR/REC-rdf-syntax#"
         xmlns:dc="http://purl.org/dc/elements/1.0/">
  <rdf:Description rdf:about="URI:TALK">
    <dc:title>Mysteries of Metadata</dc:title>
    <dc:creator rdf:resource="URI:AMIT"/>
  </rdf:Description>
</rdf:RDF>
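To make the node/property/value reading of the example above concrete, here is a small Python sketch that turns such a description into (resource, property, value) triples using only the standard library. It is an illustration under assumptions: the namespace URIs are the commonly used RDF and Dublin Core 1.1 ones rather than those printed on the slide, `URI:TALK` and `URI:AMIT` are the slide's placeholder identifiers, and the function handles only this flat `rdf:Description` form, not RDF/XML in general.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs (standard RDF syntax and Dublin Core 1.1).
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"

DOCUMENT = f"""<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="{RDF}" xmlns:dc="{DC}">
  <rdf:Description rdf:about="URI:TALK">
    <dc:title>Mysteries of Metadata</dc:title>
    <dc:creator rdf:resource="URI:AMIT"/>
  </rdf:Description>
</rdf:RDF>"""


def triples(rdf_xml):
    """Yield (resource, property, value) for each property of each rdf:Description."""
    root = ET.fromstring(rdf_xml)
    for description in root.findall(f"{{{RDF}}}Description"):
        subject = description.get(f"{{{RDF}}}about")
        for prop in description:
            # A value is either another resource (rdf:resource) or literal text.
            value = prop.get(f"{{{RDF}}}resource") or (prop.text or "").strip()
            yield subject, prop.tag, value


statements = list(triples(DOCUMENT))
```

Each element of `statements` corresponds to one arc in the graph: the talk has a title (a literal) and a creator (another resource).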
HP 13
RDF Example 2
URI:TALK --dc:title--> "Mysteries of Metadata"
URI:TALK --dc:creator--> URI:AMIT
URI:AMIT --BIB:Name--> "Amit Sheth"
URI:AMIT --BIB:Email--> amit@taalee.com
URI:AMIT --BIB:Aff--> URI:LIB
HP 14
RDFS (RDF Schema)
Enables resource description communities to define (and share) vocabularies (museum, library, e-commerce…)
Vocabulary (in RDFS) = the meaning, characteristics and relationships of a set of properties
HP 15
RDF Based Web
HTML
Resources
RDFXMLDescriptions
RDFSchemas
Sourcehttpwwww3crlacuk
HP 16
Dublin Core Metadata Initiative
Simple element set designed for resource description
International inter-discipline W3C community consensus
ldquoSemanticrdquo interface among resource description communities (very limited form of semantics)
Sourcewwwdesireorg
HP 17
Dublin Core RDF
<xml>
<namespace href="http://w3.org/rdf-schema" as="RDF"/>
<namespace href="http://metadata.net/DC" as="DC"/>
<RDF:Abbreviated>
  <RDF:Assertion RDF:HREF="http://www.mysite.com/mydoc.html"
                 DC:Title="I've Never Metadata I've Never Liked"
                 DC:Creator="Mary Crystal"
                 DC:Subject="Metadata, Dublin Core, Stuff"/>
</RDF:Abbreviated>
</xml>
HP 18
MOF (Meta Object Facility) and XMI
MOF models metadata using a subset of UML that is relevant to modeling metadata (class models: classes, associations and subtyping), together with a set of rules for mapping the elements of the MOF Core to CORBA IDL
XML Metadata Interchange (XMI) is an extension of the MOF into the XML space
HP 19
NewsML
NewsML is a packaging and metadata format for news content
NewsML is developed by the International Press Telecommunications Council (IPTC), a consortium of news providers, mostly in the print or wire-service industries
Since it deals only with packaging and metadata, NewsML is complementary both to news content formats like NITF and to syndication protocols like ICE
HP 20
NewsMLhellip
It can be used by news providers to combine their pictures, video, text, graphics and audio files in news output available on web sites, mobile phones, high end desktops, interactive television and any other device
It provides an accurate, objective set of description tools which help qualify the information and make the search more precise
NewsML allows a range of metadata to be attached to a multi-media story, including a detailed computer-readable description of what an item is about
HP 21
Example of the end-to-end flow -NewsML
The content provider supplies NewsML packaged media content to the operator. The content is categorized as current events, finance, sport, etc. and updated hourly
The operator receives NewsML data from the content provider The content server automatically pushes updated news articles to all news service subscribers
Consumers sign up for the news service directly on the device When using the news service the user browses through the categories and reads the news articles The news articles are presented in a continuous flow (one after the other) without end-user interaction
Sourcehttpwwwmediabrickscom
HP 22
PRISM
Publishing Requirements for Industry Standard Metadata
Version 1.0, April 2001
Authors: IDEAlliance (Adobe, Vignette, Kinecta, et al.)
Idea: "a standard for interoperable content description, interchange and reuse in both traditional and electronic publishing contexts"
Web site: http://www.prismstandard.org
HP 23
PRISM Design
Built on existing standards like Dublin Core (DC), RDF, XML
Designed to be used in a simple, straightforward way over the Internet
Compatible with NewsML
Integrates easily with ICE (for syndication)
Vocabulary:
Basic: DC
Extensions: "Controlled Vocabularies", e.g., "North American Industry Classification System" (NAICS)
HP 24
PRISM Example
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845"/>
    <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>
(Source PRISM spec v 1 httpwwwprismstandardorgtechdevprismspec1asp)
HP 25
VoiceXML
A language for specifying voice dialogs
Voice dialogs use audio prompts and text-to-speech (TTS) for output; touch-tone keys (DTMF) and automatic speech recognition (ASR) for input
Goal is to bring the advantages of web-based development and content delivery to interactive voice response applications
High-level, voice-specific language simplifies application development
Source: http://www.voicexml.org
HP 26
Voice Based Internet Applications
Source httpwwwvoicexmlorg
HP 27
Voice XML Metadata
Voice Specific metadata
Supports syntactic interoperability
Text data to voice data
Voice XML = XML + Voice Metadata
HP 28
VoiceXML ndash Possible Services
Information retrieval - news, sports, traffic, stock quotes
e-Transactions (e-commerce, e-tailing, etc.)
Financial: banking, stock trading
Catalog browsing (generally as an adjunct to paper)
Telephone services
Personal voice dialing, one-number "find-me" services
Intranet - inventory, HR services, corporate portals
Unification - "My Whatever": personal portals, personal agents, unified messaging
Source httpwwwvoicexmlorg
HP 29
MPEG7
set of description scheme and descriptors to describe the content of multimedia data
Provides a language to specify description schemes
A scheme for coding the description
HP 30
Application Examples for MPEG7
A few application examples are
Digital libraries (image catalog musical dictionary)
Multimedia directory services (eg yellow pages)
Broadcast media selection (radio channel TV channel)
HP 31
Information and Content Exchange (ICE)
Main Goal efficient and extensible Content Syndication protocol for the Internet using XML syntax
Authors Adobe Kinecta MS Sun Vignette et al
Status: latest spec version 1.1, May 2000; submitted to W3C for review
Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, …
Web Site httpwwwicestandardorg
HP 32
What is the ICE Protocol
Syndication Protocol for communication between
Syndicators and Subscribers
Metadata to define
roles and responsibilities of involved parties Subscriber vs
Syndicator Requestor vs Responder Sender vs Receiver
format and method of content exchange (eg sequenced
packages pull vs push model)
HP 33
ICE Applications
ICE vocabulary + domain vocabulary = complete application
ICE
establishes and manages the syndication
delivers data
logs events
=> content-independent metadata
industry-specific vocabulary defines the content => domain-specific metadata
Source httpwwwicestandardorg
HP 34
ICE Explained
ICE: Information and Content Exchange protocol
Syndicator: A content aggregator and distributor
Subscriber: A content consumer
Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement
Collection: The current content of a subscription
ICE Package: A delivery of commands to update a collection, such as the addition of content items
ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs and other management information
Sources InternetWeek ICE Cookbook version 10 httpwwwinternetweekcomebizapps01ebiz050701-3htm
<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe"
             timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon"
                filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru"
                     copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC … (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>
The comic-strip element is the content (domain-specific metadata)
HP 36
XCM (eXtended Content Management)
a framework that allows customers to classify content management offerings according to the business problems they address The segments of XCM are
Content Development - Developing static content and managing the process of its subsequent approval versioning storage and retrieval
Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle
Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability
Source httpwwwvignettecomCDASite020971-1-30-1458-1146-174300html
HP 37
XCM
eXtended Content Management
Content DevelopmentManagement
Content DeliveryApplication ContentManagement
Content AuthoringDigital Asset Management
Software ConfigurationManagement
Document ProcessManagement
Metadata ManagementRecombinationPersonalization
Edge Network Delivery
Streaming Media DeliveryCaching
Source httpwwwvignettecom
HP 38
Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain (example: Kansas State)
FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc/
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
…
UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Adress Id: http://gisdasc.kgs.ukans.edu/dasc/
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
…
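One simple way to bridge two such models is a tag-name crosswalk. The sketch below is illustrative only: the five tag pairs come from this slide, the direction FGDC to UDK is an arbitrary choice, `FGDC_TO_UDK` and `to_udk` are hypothetical names, and tags without a known counterpart are passed through unchanged as an assumption.

```python
# Tag-name pairs taken from the slide (FGDC name -> UDK name).
FGDC_TO_UDK = {
    "Theme keywords": "Search terms",
    "Title": "Topic",
    "Online linkage": "Adress Id",
    "Direct Spatial Reference Method": "Measuring Techniques",
    "Horizontal Coordinate System Definition": "Co-ordinate System",
}


def to_udk(fgdc_record):
    """Rename FGDC tags to their UDK counterparts; unknown tags are kept as-is."""
    return {FGDC_TO_UDK.get(tag, tag): value for tag, value in fgdc_record.items()}


fgdc_record = {
    "Title": "Dakota Aquifer",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}
udk_record = to_udk(fgdc_record)
```

A flat rename like this covers only the syntactic mismatch; it says nothing about whether the two models mean the same thing by each field, which is the semantic question the rest of the workshop addresses.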
HP 39
Different views of Metadata
Domain Independent Specifications (RDF)
FrameworksInfrastructures (XCM)
MetadataApplication Specific
ICE
Media Specific
MPEG7 VoiceXML
Domain Specific
NewsML FGDCUDK
HP 40
Creating and Serving Metadata to Power the Life-cycle of Content
Taalee Infrastructure Services Taalee Content Applications
Where is the content
Whose is it
ProduceAggregate
CatalogIndex
What other content is it related to
Integrate Syndicate
What is the right content for this
user
Personalize
What is the best way to
monetize this interaction
Interactive Marketing
BroadcastWirelineWirelessInteractive TV
Taalee Semantic MetaBase
HP 41
Taalee's Intelligent Content Process
HP 42
Metadata Creation and Semanticization
• Automatic Content Classification/Categorization
• Metadata Creation/Extraction
Types of metadata created
Semantic Engine and WorldModel are trademarks of Taalee IncMetadata Extraction is a patented technology of Taalee Inc
HP 43
FormsTypesIngest of Content
Sources: Web Sites, Content Feeds and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)
HP 44
Content HandlingIngest
Infrastructure/Exchange
Feed Handlers
Crawlers/Screen Scrapers/Bots
Software Agents: Centralized, Distributed, Mobile/Migratory
HP 45
Information Extraction for Metadata Creation
GlobalEnterpriseWeb Repositories
METADATAMETADATA
EXTRACTORSEXTRACTORS
Digital Maps
NexisUPIAP
Documents
Digital Audios
Data Stores
Digital Videos
Digital Images
HP 46
Extracting a Text Document: Syntactic approach
INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT
NATIONAL PREPAREDNESS LEVEL II
CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.
SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr. The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.
CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. The fire is contained. Major areas of heat have been mopped-up. All crews and overhead will mop-up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.
LAYOUT
Date => day month int ',' int
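The layout rule for dates can be approximated with a regular expression. This is a sketch of the syntactic approach in general, not the extractor shown in the talk: the names `DATE_RULE` and `extract_date` are hypothetical, and the commas are made optional on the assumption that some sources drop punctuation.

```python
import re
from datetime import date

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# Date => day month int ',' int   (commas optional)
DATE_RULE = re.compile(
    r"\b(?:{days}),?\s+(?P<month>{months})\s+(?P<dom>\d{{1,2}}),?\s+(?P<year>\d{{4}})\b".format(
        days="|".join(DAYS), months="|".join(MONTHS)
    )
)


def extract_date(text):
    """Return the first date matching the layout rule, or None if there is none."""
    match = DATE_RULE.search(text)
    if match is None:
        return None
    month = MONTHS.index(match["month"]) + 1
    return date(int(match["year"]), month, int(match["dom"]))


report = "INCIDENT MANAGEMENT SITUATION REPORT\nFriday, August 1, 1997 - 0530 MDT"
report_date = extract_date(report)
```

The rule recognizes the form of a date but knows nothing about what the report is about, which is the limitation of a purely syntactic approach.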
Traditional TextCategorization
StatisticalAI Techniques
Classify Place ina taxonomy
feed
Customer Training
Set
RoutingDistribution
Customer Article Feed
4715
Standard Metadata
Feed Source iSyndicate
Posted Date 11202000
Classification of Article 4715
Knowledge-base amp StatisticalAI Techniques
ClassifyPlace ina taxonomy
MetadataCatalog
Content Manager
Precise syndicationfiltering
Article 4715 Metadata
Standard metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Semantic metadata
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News
FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
NYSE: Member Companies, Market News, IPOs
Automated Content Enrichment (ACE)
Taalee Enterprise Customization Suite
Taaleersquos Categorization amp Automatic Metadata Creation
Taalee Training
Set
Customer Training
Set ee ENTCompany AnalysisConference Calls
EarningsStock Analysis
Classification of Article 4715
Article Feed4715 RoutingDistribution
Map to another taxonomy
HP 49
Automatic Categorization amp Metadata Tagging (unstructured texttranscript of AV)
ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION AS OF TONIGHT THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49 IN WASHINGTON STATE THE SENATE RACE REMAINS TOO CLOSE TO CALL IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT THE SENATE WILL BE EVENLY DIVIDED IN MISSOURI REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN WHO DIED IN A CRASH THREE WEEKS AGO GOVERNOR CARNAHANS WIFE IS EXPECTED TO TAKE HIS PLACE IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT HILLARY CLINTON WON THE NEW YORK SENATE SEAT SHE IS THE FIRST FIRST LADY TO RUN MUCH LESS WIN
Video Segmentwith Associated Text
Segment Description
SemanticMetadata
AutoCategorization
HP 50
Automatic Categorization amp Metadata Tagging (Web page)
Video withEditorialized Text on the Web
AutoCategorization
AutoCategorization
Semantic MetadataSemantic Metadata
HP 51
Automatic Categorization amp Metadata Tagging (Feed)
Text from Bloomberg
AutoCategorization
AutoCategorization
Semantic MetadataSemantic Metadata
HP 52
Taalee Extraction and Knowledgebase Enhancement
Extraction Agent
Web Page Enhanced Metadata Asset
HP 53
Basis for Semantics
A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary
B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)
Knowledge Base
HP 54
Basis for Semantics
C. Reasoning/Inference
(Statistical) (Information Retrieval)
Statistical Learning/AI (Bayesian, Neural Networks, HMM…)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methodsCluster Analysis
LearningAI and Collab Filtering
Reference dataConcept-termsDictionaryThesaurusBy topicindustrysubjectdomain
Word or Phrase
OntologiesDomain Models
KnowledgeBaseBy Entities and Relationships
deeperunderstanding
HP 56
Open Directory Project (ODP) ClassificationTaxonomy amp Directory
HP 57
Ontology
Standardize meaning, description, representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes:
Attributes
Domain Rules
Functional Dependencies
HP 59
An Ontology
Example Interrelated ontologies
LANDUSE
COMMERCIAL
INDUSTRIALRURAL
RESIDENTIAL
AGRICULTURAL
MILITARYRECREATIONAL
LAND(SITE)
CULTIVATEDAREA
GREENLANDAREA LAND
BANK
ZONING
LANDFILLSITE
WASTEDISPOSAL
RECYCLING
HAZARDOUS
LANDFILLRESOURCE REC
SOLID SEWAGE
shredding
magneticseparation
screening
washing
NATURALDISASTER
EARTHQUAKE
causes
LANDSLIDE
VOLCANO
STORMFLOOD
FIRE
AVALANCHE
TSUNAMI
causes
causes
causes
HP 61
Large Vocabularies TaxonomiesOntologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms
Confidential HP
Metadata enabledApplications
HP 63
Metadata Usage: Impact on Search & Query processing
traditional queries based on keywords
attribute based queries
content-based queries
HP 64
Oingocom
Oingo Ontology - ODP based(?) - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning, used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely-related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more “intelligent” through automatic analysis of every individual asset to generate a catalog containing
• Context of the Content
• Semantic Metadata describing entities (e.g. Company, Industry) and
• Relationships (semantic associations) among all entities
Based on a “Semantic” or “domain” model describing how the user thinks about the subject matter, supported by a knowledgebase
“Normal” Content can only be “found” if the user enters a keyword that exists within it
Intelligent Content=+
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more “intelligent” through automatic analysis of every individual content item to create
Context of the Content
Semantic Metadata describing entities (e.g. Company, Industry) and
Relationships (semantic associations) among all entities
Based on a “Semantic” or “domain” model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g. Google’s Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …
HP 70
Metadata Usage Keyword Attribute and Content Based Access
The VisualHarness system at LSDISUGA
HP 71
Keyword Search vs Attribute Search with Semantic metadata
Virage Search on football touchdown
Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL httpcbssportsline
Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL httpcbssportsline
Metadata from Typical Cataloging of Football Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31, Pit 24
http://www.nfl.com
Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens’ lead to 31-24 in the third quarter
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
HP 72
Taalee’s Semantic Search
Highly customizable, precise and freshest A/V search
Context and Domain Specific Attributes
Uniform Metadata for Content from Multiple Sources. Can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY amp CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content
Personalization and Targeting; interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries & Hot Topics
PERSONALIZATION
Personalized Queries
1 My Stock Portfolio
  Microsoft suffers serious hack attack
  Cisco Systems Inc
  Analyst Safa Rashtchy on Yahoo
  PeopleSoft Inc
  AT&T Corp
2 My Football Fantasy Team
  Gators' Spurrier ready for big game
  Tech's Vick looks to become complete QB
  Bucs excited about Hamilton
  Jasper Sanks rumbles into the end zone…
  Edwards explains reasons for leaving BYU
3 Julia Roberts Collection
  Movie Trailer: Notting Hill
  Trailer - Runaway Bride
  Patrick
  Movie Trailer: Stepmom
  Conspiracy Theory
4 Pink Floyd Collection
  Set the Controls for the Heart of the Sun…
  Wish You Were Here
  Round And Around
  Keep Talking
  The Post War Dream
HOT Topics
1 Election 2000
  Video: Explaining the electoral map
  Race for White House hots up
  Seniors Give Gore Florida Edge
2 Middle East Peace Conflict
  More die as Israel steps up security
  Israel braces for suicide bombs
  Pentagon probes Cole's security
3 Napster Controversy
  The Brain Behind Napster
  Napster Lawsuit
  Creative Nomad II
HP 80
Metadata Targeting
SemanticInteractive Targeting
Buy Al Pacino VideosBuy Russell Crowe VideosBuy Christopher Plummer VideosBuy Diane Venora VideosBuy Philip Baker Hall VideosBuy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
Realtime Feeds
Interests Preferences
Time-ShiftedContent Aggregator
Web sites and Pages
ContentDatabases Personalized
Content
Semantic EngineTM
Personalized Content
Content
Structured Hi-Quality
Semantic Metabase
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette’s Enterprise Customer site
User’s “Wireless Home page” shows the categories for his interests. There is an alert (new content) for his stock and sports categories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down user’s Personal Portfolio list. The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and the knowledge of user’s portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions listen set alert on story add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g. Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as “Strong Buy” by H&Q Analyst
HP 87
iTV: Taalee’s Extreme Personalization
Content Provider
(DBS DISH Wink AOL-TV)
Semantic EngineTM
Meta-DataTagged Content
ContentldquoProgramsrdquo
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-QualitySemantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by personalization application to show only the stocks that user is interested in
This screen is customizable with interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO
Part of the screen can be automatically customized to show conference call specific information, including transcript, participation etc., all of which are relevant metadata
Conference Call itself can have embedded metadata to support personalization and interactivity
HP 89
Metadata in Enterprise Apps
Filter Search ConsolidatePersonalize ArchiveLicensing Syndication
Production SupportProduction SupportSony
Categorize
Catalog
Integrate
CollectionCollection ProcessingProcessing
NetworkContent
AffiliateFeeds
Public Sources Rich Data
Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday killing more than 60 people near Nigerias commercial capital of Lagos Many of the dead were fisherman in wooden canoes engulfed in the inferno
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko facing the central business district of Lagos across a lagoon
At least 60 people died in this needless fire senior local official Karimu Alabi said
Fire crews from state-run Nigerian National Petroleum Corp (NNPC) which owns the pipeline were joined by other firemen from construction company Julius Berger in battling the blaze
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak ravaging a cluster of huts and log houses
At about the same time a second fire razed Makoko shantytown where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in thewooden huts for days Many victims of the Ebute-Oke fire were
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee’s semantic metadata enables powerful access to content used by Enterprise’s customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --Gore Demands That Recount Restart
Gore Says Fla Cant Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(133) - 120600 - ABC
(253) - 120600 - CBS
(516) - 120600 - ABC
(246) - 120600 - FOX
(133) - 120600 - NBC
(533) - 120600
(357) - 120600 - CBS
(427) - 120600 - ABC
(344) - 120600 - FOX
(724) - 120600 - CBS
(133) - 120600 - CBS
TALLAHASSEE Florida (CNN) ndashThough the two presidential candidates have until noon Wednesday to file briefs in Al Gores appeal to the Florida Supreme Court the outcome of two trials set on the same day in Leon County Florida may offer Gore his best hope for the presidency Democrats in Seminole County are seeking to have 15000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5000 votes statewide Lawyers for the plaintiff Harry Jacobs claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2126 incomplete absentee ballot applications sent in by GOP voters while refusing to allow Democratic workers to do the same thing for Democratic voters
The GOP says that suit and one similar to it from Martin County demonstrates Democratic Party politics at its most desperate Gore is not a party to either of those lawsuits On Tuesday the judge in the
(133) - 120600 - ABC
(233) - 120600 - CBS
(312) - 120600 - NNS
(032) - 120600 - CBS
(133) - 120600 - CBS
Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore
HP 92
Retrieve Scene Description Track
Enhanced Digital Cable
Video
MPEGDecoder
Node = AVO Object
Create Scene Description Tree
GREATUSER
EXPERIENCE
Metadata’s role in emerging iTV infrastructure
MPEG-247MPEG
Encoder
SceneDescriptionTree
License metadata decoder and semantic applications to
device makers
Channel salesthrough Video Server Vendors
Video App Servers and Broadcasters
Enhanced XML
Description
“Cisco Systems” Node
Taalee Semantic Engine
“Cisco Systems”
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta
Object Content Information (OCI)
Metadata-richValue-added Node
HP 93
Intelligent Metadata Creation
Extractor Agents: Content which does contain the words the user asked for
+ Value-added Metadata: Content which does not contain the words the user asked for, but is about what he asked for
+ Semantic Associations: Content the user did not think to ask for, but which he needs to know
Metadata for Intelligent Content Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the “right” keyword
For example: if a story is about “Roger Clemens” but does not contain the words “New York Yankees”, that story cannot and will not be found if the user searches for “New York Yankees” or “Yankees”
Understanding of the content is needed to create new metadata
Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g. COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
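The enrichment step described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `KNOWLEDGE_BASE` dictionary, the relation names `plays_for` and `plays_sport`, and both function names are hypothetical and do not describe Taalee's actual implementation.

```python
# Hypothetical knowledge base: entity -> (entity type, {relationship: value}).
KNOWLEDGE_BASE = {
    "Roger Clemens": ("PERSON", {"plays_sport": "Baseball",
                                 "plays_for": "New York Yankees"}),
}


def add_value_added_metadata(extracted):
    """Return a copy of the extracted metadata, enriched from the knowledge base."""
    enriched = {key: list(values) for key, values in extracted.items()}
    for player in extracted.get("Players", []):
        _, relations = KNOWLEDGE_BASE.get(player, (None, {}))
        # Map each known relationship onto the metadata field it fills in.
        for relation, field in (("plays_for", "Teams"), ("plays_sport", "Sport")):
            value = relations.get(relation)
            if value and value not in enriched.setdefault(field, []):
                enriched[field].append(value)
    return enriched


def matches(metadata, query):
    """True if the query occurs in any metadata value (case-insensitive)."""
    q = query.lower()
    return any(q in v.lower() for values in metadata.values() for v in values)


# Metadata extracted from a story that only mentions the player by name.
story = {"Players": ["Roger Clemens"]}
enriched = add_value_added_metadata(story)
```

With only the extracted metadata, `matches(story, "Yankees")` is false; after enrichment the same query finds the story, which is the behavior the slide describes.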
HP 96
Guided Demo for Value Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled “Week 3 Top10 Anderson TD Run”) and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked “REAL”)
• View the source HTML page that has the original story and locate this story with the heading “Week 3 top 10 Anderson TD run”
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset’s existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player of the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled “I want out”) & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked “REAL”)
• View the source HTML page that has the original story and locate this story with the heading “I want out”
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset’s existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player of the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots (“Jamal Anderson”)
Search for ‘Jamal Anderson’ in ‘Football’
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee’s value-additions to the metadata to facilitate easier search
HP 99
Example 2 - Snapshots (“Gary Sheffield”)
Search for ‘Gary Sheffield’ in ‘Baseball’
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee’s value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Producer Name: Name of content provider that produced the asset
Posted Date: Date of asset posting - Extracted automatically
Player Names: Names of players mentioned explicitly in the asset - Extracted automatically
Team Name: Name of team for which player plays - Not mentioned explicitly in asset - Value-added using Taalee’s semantic relationships
League Name: Name of league to which the player’s team belongs - Not mentioned explicitly in asset - Value-added by Taalee’s processing based on semantic associations
Sport: Name of sport
Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset
Some Metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships
The asset is richly and fully described in the many ways users choose to interact
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context or relationships of keywords
For example, a search engine may see that the word “Commerce One” occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee’s Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
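A minimal sketch of how such associations might be stored and used at query time follows. The `KnowledgeBase` class, the relation names, the catalog entries and their titles are all hypothetical illustrations; the slides do not describe how the Semantic Content Model is actually implemented.

```python
from collections import defaultdict


class KnowledgeBase:
    """Typed entities plus named relationships between them (illustrative only)."""

    def __init__(self):
        self.entity_type = {}
        self.relations = defaultdict(set)  # (subject, relation) -> {objects}

    def add_entity(self, name, etype):
        self.entity_type[name] = etype

    def relate(self, subject, relation, obj, symmetric=False):
        self.relations[(subject, relation)].add(obj)
        if symmetric:
            self.relations[(obj, relation)].add(subject)

    def related(self, subject, relation):
        return sorted(self.relations.get((subject, relation), set()))


kb = KnowledgeBase()
kb.add_entity("Commerce One", "COMPANY")
kb.add_entity("Ariba", "COMPANY")
kb.add_entity("Corporate, Professional & Financial Software", "INDUSTRY")
kb.relate("Commerce One", "participates_in",
          "Corporate, Professional & Financial Software")
kb.relate("Commerce One", "competes_with", "Ariba", symmetric=True)

# Hypothetical catalog of assets, each tagged with the companies it is about.
catalog = [
    {"title": "Story A", "companies": ["Commerce One"]},
    {"title": "Story B", "companies": ["Ariba"]},
    {"title": "Story C", "companies": ["Cisco"]},
]


def search_with_associations(kb, catalog, company):
    """Return (what was asked for, news about the company's competitors)."""
    direct = [a["title"] for a in catalog if company in a["companies"]]
    competitors = kb.related(company, "competes_with")
    associated = [a["title"] for a in catalog
                  if any(c in a["companies"] for c in competitors)]
    return direct, associated
```

A keyword index alone would return only "Story A" for Commerce One; the competitor relationship is what brings in "Story B".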
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company ‘Commerce One’
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (To view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One’s competitors (Ariba) can be accessed easily and automatically
HP 104
Internal Source 1Research
Internal Source 2
External feedsWeb(eg Reuters)
1 Cisco story from PW Source 1 passed on to add semantic associations
2 Consults KnowledgeBase for Cisco’s competition
3 Returns result: Lucent is a competitor of Cisco
4 Lucent story from external feeds picked for publishing as “semantically related” to Cisco story, passed on to Dashboard
Story on Lucent
Story on Cisco
XCM-compliant metadata, XML or other format
Semantic Application
ASP/Enterprise hosted
Extractor Agent 1
Extractor Agent 2
Extractor Agent 3
Metadata-centric Content Management Architecture
Semantic Engine
World Model
Taalee Metabase
Third-party Content Mgmt and Syndication
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
SEC, EPA Regulations: Impacting INDUSTRY or Filed By COMPANY
Technology, Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic SearchPersonalization etc
A vision for future
Semantic Web Complex Relationships and Knowledge Discovery
Eg InfoQuilt project at LSDIS Lab Univ of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling obviously not enough: we need a “logic layer” on top of RDF; some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data
RDF + DL = “Frame System for WWW”
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
“A Web in which machine reasoning will be ubiquitous and devastatingly powerful” [Berners-Lee]
“A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture” [Berners-Lee]
“A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities” [Berners-Lee]
A personal definition: Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)
a proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable Tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three layered Architecture Of Semantic Web
Logical Layer
Formal Semantics and Reasoning Support - OIL, DAML-O
Schema Layer
Definition of Vocabulary RDF Schema
Data Layer
Simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL ndash Evolving towards Semantic Web
OIL Mission
OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Test May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake
<= dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
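The rule above can be written as an executable predicate. The sketch below rests on assumptions the slide does not state: dates are calendar dates, the distance is great-circle distance in miles computed with the haversine formula, and the earthquake must not precede the test. The `Event` class and the function names are hypothetical.

```python
import math
from dataclasses import dataclass
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, used by the haversine formula


@dataclass
class Event:
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in miles (haversine)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """NuclearTest Causes Earthquake, per the rule's date and distance limits."""
    days = (quake.event_date - test.event_date).days
    near = distance_miles(test.latitude, test.longitude,
                          quake.latitude, quake.longitude) < max_miles
    return 0 <= days < max_days and near
```

Note that a 10000-mile limit covers most of the globe, so in practice the date condition does nearly all the filtering; a smaller `max_miles` makes the "nearby region" condition meaningful.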
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year, starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example…
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9 and >9 on the Richter scale per year, starting from 1900, find average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
Knowledge Discovery - Example…
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake
Demo
ResourcesReferences
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 7
Types of Metadata for digital media
Media type-specific metadata, e.g. texture of images, font size, …
Media processing-specific metadata, e.g. search, retrieval, personalized filtering
Content Specific metadata, e.g. rocket-related video and documents
HP 8
Metadata for Digital Data
Metadata | Data Type | Metadata Type
Q-Features [Jain and Hampapur] | Image, Video | Domain Specific
R-Features [Jain and Hampapur] | Image, Video | Domain Independent
Meta-Features [Jain and Hampapur] | Image, Video | Content Independent
Impression Vector [Kiyoki et al.] | Image | Content Descriptive
NDVI, Spatial Registration [Anderson and Stonebraker] | Image | Domain Specific
Speech Feature Index [Glavitsch et al.] | Audio | Direct Content Based
Topic Change Indices [Chen et al.] | Audio | Direct Content Based
Document Vectors [Deerwester et al.] | Text | Direct Content Based
Inverted Indices [Kahle and Medlar] | Text | Direct Content Based
Content Classification Metadata [Bohm and Rakow] | MultiMedia | Domain Specific
Document Composition Metadata [Bohm and Rakow] | MultiMedia | Domain Independent
Metadata Templates [Ordille and Miller] | Media Independent | Domain Specific
Land Cover, Relief [Sheth and Kashyap] | Media Independent | Domain Specific
Parent Child Relationships [Shklar et al.] | Text | Domain Independent
Contexts [Sciore et al., Kashyap and Sheth] | Structured | Domain Specific
Concepts from Cyc [Collet et al.] | Structured | Domain Specific
User’s Data Attributes [Shoens et al.] | Text, Structured | Domain Specific
Domain Specific Ontologies [Mena et al.] | Media Independent | Domain Specific
HP 9
Types of Specs and Standards (or MetaModels)
Domain Independent: (MCF), RDF, MOF, Dublin Core
Media Specific: MPEG4, MPEG7, VoiceXML
Domain/Industry Specific (metamodels): MARC (Library), FGDC and UDK (Geographic), NewsML (News), PRISM (Publishing)
Application Specific: ICE (Syndication)
Exchange/Sharing: XCM, XMI
Orthogonal (Other): RDFS, namespaces, ontologies, domain models (DAML, OIL)
HP 10
what RDF can do for metadata
Designed to impose structural constraints on syntax to support consistent encoding, exchange and processing of metadata
Domain Independent Metadata standard
HP 11
RDF (Resource Description Framework)
Resource, Property, Value
• RDF data consists of nodes and attached attribute/value pairs
• Nodes can be any web resources (pages, servers, basically anything for which you can give a URI), even other instances of metadata
• Attributes are named properties of the nodes, and their values are either atomic (text strings, numbers, etc.) or other resources or metadata instances
HP 12
RDF Example 1
URI:TALK -- dc:creator --> URI:AMIT
URI:TALK -- dc:title --> Mysteries of Metadata
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.0/">
  <rdf:Description rdf:about="URI:TALK">
    <dc:title>Mysteries of Metadata</dc:title>
    <dc:creator rdf:resource="URI:AMIT"/>
  </rdf:Description>
</rdf:RDF>
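To make the resource/property/value model concrete, the example can be read back into triples with Python's standard library. This is a minimal sketch, not a full RDF parser: it assumes the standard RDF namespace `http://www.w3.org/1999/02/22-rdf-syntax-ns#`, handles only flat `rdf:Description` elements, and the `triples` helper is a name introduced here for illustration.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.0/"

RDF_XML = f"""<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:dc="{DC_NS}">
  <rdf:Description rdf:about="URI:TALK">
    <dc:title>Mysteries of Metadata</dc:title>
    <dc:creator rdf:resource="URI:AMIT"/>
  </rdf:Description>
</rdf:RDF>"""


def triples(rdf_xml):
    """Return (resource, property, value) triples from flat rdf:Description blocks."""
    root = ET.fromstring(rdf_xml)
    result = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # A value is either another resource or an atomic text string.
            resource = prop.get(f"{{{RDF_NS}}}resource")
            value = resource if resource is not None else (prop.text or "").strip()
            result.append((subject, prop.tag, value))
    return result


for subject, prop, value in triples(RDF_XML):
    print(subject, prop, value)
```

ElementTree reports each property as `{namespace}localname`, so `dc:title` appears as `{http://purl.org/dc/elements/1.0/}title`.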
HP 13
RDF Example 2
URI:TALK -- dc:creator --> URI:AMIT
URI:TALK -- dc:title --> Mysteries of Metadata
URI:AMIT -- BIB:Name --> Amit Sheth
URI:AMIT -- BIB:Email --> amit@taalee.com
URI:AMIT -- BIB:Aff --> URI:LIB
HP 14
RDFS (RDF Schema)
Enables resource description communities to define (and share) vocabularies (museum, library, e-commerce, …)
Vocabulary (in RDFS) = the meaning, characteristics and relationships of a set of properties
HP 15
RDF Based Web
HTML
Resources
RDF/XML Descriptions
RDF Schemas
Source: http://www.w3c.rl.ac.uk
HP 16
Dublin Core Metadata Initiative
Simple element set designed for resource description
International inter-discipline W3C community consensus
ldquoSemanticrdquo interface among resource description communities (very limited form of semantics)
Source: www.desire.org
HP 17
Dublin Core RDF
<xml>
  <namespace href="http://w3.org/rdf-schema" as="RDF"/>
  <namespace href="http://metadata.net/DC" as="DC"/>
  <RDF:Abbreviated>
    <RDF:Assertion RDF:HREF="http://www.mysite.com/mydoc.html"
                   DC:Title="I've Never Metadata I've Never Liked"
                   DC:Creator="Mary Crystal"
                   DC:Subject="Metadata, Dublin Core, Stuff"/>
  </RDF:Abbreviated>
</xml>
HP 18
MOF (Meta Object Facility) and XMI
MOF models metadata using a subset of UML that is relevant to modeling metadata (class models: classes, associations and subtyping), together with a set of rules for mapping the elements of the MOF Core to CORBA IDL
XML Metadata Interchange (XMI) is an extension of the MOF into the XML space
HP 19
NewsML
NewsML is a packaging and metadata format for news content
NewsML is developed by the International Press Telecommunications Council (IPTC), a consortium of news providers, mostly in the print or wire-service industries
Since it deals only with packaging and metadata, NewsML is complementary both to news content formats like NITF and to syndication protocols like ICE
HP 20
NewsMLhellip
It can be used by news providers to combine their pictures, video, text, graphics and audio files in news output available on web sites, mobile phones, high-end desktops, interactive television and any other device
An accurate, objective set of description tools which help qualify the information and make the search more precise
NewsML allows a range of metadata to be attached to a multi-media story, including a detailed computer-readable description of what an item is about
HP 21
Example of the end-to-end flow - NewsML
The content provider supplies NewsML-packaged media content to the operator. The content is categorized as current events, finance, sport, etc. and updated hourly
The operator receives NewsML data from the content provider. The content server automatically pushes updated news articles to all news service subscribers
Consumers sign up for the news service directly on the device. When using the news service, the user browses through the categories and reads the news articles. The news articles are presented in a continuous flow (one after the other) without end-user interaction
Source: http://www.mediabricks.com
HP 22
PRISM
Publishing Requirements for Industry Standard Metadata
Version 1.0, April 2001
Authors: IDEAlliance (Adobe, Vignette, Kinecta, et al.)
Idea: “a standard for interoperable content description, interchange and reuse in both traditional and electronic publishing contexts”
Web site: http://www.prismstandard.org
HP 23
PRISM Design
Built on existing standards like Dublin Core (DC), RDF, XML
Designed to be used in a simple, straightforward way over the Internet
Compatible with NewsML
Integrates easily with ICE (for syndication)
Vocabulary:
Basic: DC
Extensions: “Controlled Vocabularies”, e.g. “North American Industrial Classification System” (NAICS)
HP 24
PRISM Example
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845"/>
    <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>
(Source: PRISM spec v1, http://www.prismstandard.org/techdev/prismspec1.asp)
HP 25
VoiceXML
A language for specifying voice dialogs
Voice dialogs use audio prompts and text-to-speech (TTS) for output; touch-tone keys (DTMF) and automatic speech recognition (ASR) for input
Goal is to bring the advantages of web-based development and content delivery to interactive voice response applications
High-level, voice-specific language simplifies application development
Source: http://www.voicexml.org
HP 26
Voice Based Internet Applications
Source httpwwwvoicexmlorg
HP 27
Voice XML Metadata
Voice Specific metadata
Supports Syntactic interoperability
Text data to voice data
Voice XML = XML + Voice Metadata
HP 28
VoiceXML ndash Possible Services
Information retrieval - News, sports, traffic, stock quotes
e-Transactions (e-commerce, e-tailing, etc.)
Financial: banking, stock trading
Catalog browsing (generally as an adjunct to paper)
Telephone services
Personal voice dialing, One-number “find-me” services
Intranet - Inventory, HR services, corporate portals
Unification - “My Whatever” personal portals, personal agents, unified messaging
Source httpwwwvoicexmlorg
HP 29
MPEG7
A set of description schemes and descriptors to describe the content of multimedia data
Provides a language to specify description schemes
A scheme for coding the description
HP 30
Application Examples for MPEG7
A few application examples are
Digital libraries (image catalog musical dictionary)
Multimedia directory services (eg yellow pages)
Broadcast media selection (radio channel TV channel)
HP 31
Information and Content Exchange (ICE)
Main Goal: efficient and extensible Content Syndication protocol for the Internet, using XML syntax
Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.
Status: latest spec version 1.1, May 2000; submitted to W3C for review
Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, …
Web Site: http://www.icestandard.org
HP 32
What is the ICE Protocol
Syndication Protocol for communication between Syndicators and Subscribers
Metadata to define
  roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver
  format and method of content exchange (e.g. sequenced packages, pull vs. push model)
HP 33
ICE Applications
ICE vocabulary + domain vocabulary = complete application
ICE
  establishes and manages the syndication
  delivers data
  logs events
  => content-independent metadata
Industry-specific vocabulary defines the content => domain-specific metadata
Source: http://www.icestandard.org
HP 34
ICE Explained
ICE: Information and Content Exchange protocol.
Syndicator: A content aggregator and distributor.
Subscriber: A content consumer.
Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement.
Collection: The current content of a subscription.
ICE Package: A delivery of commands to update a collection, such as the addition of content items.
ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information.
Sources: InternetWeek; ICE Cookbook version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm
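The relationship between the terms defined above (subscription, collection, package) can be sketched as a small data model. This is only an illustration under stated assumptions: the class and field names (`Package`, `Subscription`, `adds`, `removes`, `apply`) are hypothetical and are not taken from the ICE specification, and the strict sequence check is one possible reading of "sequenced packages".

```python
from dataclasses import dataclass, field


@dataclass
class Package:
    """A delivery of commands that update a collection (hypothetical shape)."""
    package_id: int
    adds: dict = field(default_factory=dict)     # item_id -> content to add
    removes: list = field(default_factory=list)  # item_ids to drop


@dataclass
class Subscription:
    """Agreement between one subscriber and one syndicator."""
    subscriber: str
    syndicator: str
    collection: dict = field(default_factory=dict)  # current content
    last_package_id: int = 0

    def apply(self, package):
        # Assumption: packages are sequenced and must be applied in order.
        if package.package_id != self.last_package_id + 1:
            raise ValueError("package %d is out of sequence" % package.package_id)
        for item_id in package.removes:
            self.collection.pop(item_id, None)
        self.collection.update(package.adds)
        self.last_package_id = package.package_id


if __name__ == "__main__":
    sub = Subscription(subscriber="portal", syndicator="comics-syndicate")
    sub.apply(Package(package_id=1, adds={"4321": "comic strip"}))
    print(sub.collection)
```

The collection is never sent whole after the first delivery; each package carries only the commands needed to bring the subscriber's copy up to date.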
<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe"
             timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321"
                name="Cartoon" filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru"
                     copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC … (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>
Content (domain-specific
metadata)
HP 36
XCM (eXtended Content Management)
A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:
Content Development - Developing static content and managing the process of its subsequent approval, versioning, storage, and retrieval
Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle
Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability
Source: http://www.vignette.com/CDA/Site/0,2097,1-1-30-1458-1146-1743,00.html
HP 37
XCM
eXtended Content Management
Content Development
Application Content Management
Content Delivery
Content Authoring
Digital Asset Management
Software Configuration Management
Document Process Management
Metadata Management, Recombination, Personalization
Edge Network Delivery
Streaming Media Delivery
Caching
Source: http://www.vignette.com
HP 38
Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain
FGDC Metadata Model
  Theme keywords: digital line graph, hydrography, transportation
  Title: Dakota Aquifer
  Online linkage: http://gisdasc.kgs.ukans.edu/dasc/
  Direct Spatial Reference Method: Vector
  Horizontal Coordinate System Definition: Universal Transverse Mercator
  …
UDK Metadata Model
  Search terms: digital line graph, hydrography, transportation
  Topic: Dakota Aquifer
  Address Id: http://gisdasc.kgs.ukans.edu/dasc/
  Measuring Techniques: Vector
  Co-ordinate System: Universal Transverse Mercator
  …
Kansas State
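Reconciling two models that use different tag names for the same data can be sketched as a simple field-name crosswalk. The mapping below pairs the FGDC and UDK tags listed on this slide; the function name `translate` and the dictionary layout are hypothetical, and a real crosswalk would also have to reconcile value formats and semantics, not just names.

```python
# Tag-name crosswalk between the two models shown on the slide.
FGDC_TO_UDK = {
    "Theme keywords": "Search terms",
    "Title": "Topic",
    "Online linkage": "Address Id",
    "Direct Spatial Reference Method": "Measuring Techniques",
    "Horizontal Coordinate System Definition": "Co-ordinate System",
}
UDK_TO_FGDC = {udk: fgdc for fgdc, udk in FGDC_TO_UDK.items()}


def translate(record, mapping):
    """Rename the keys of a metadata record; unknown tags pass through unchanged."""
    return {mapping.get(tag, tag): value for tag, value in record.items()}


fgdc_record = {
    "Theme keywords": ["digital line graph", "hydrography", "transportation"],
    "Title": "Dakota Aquifer",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}

if __name__ == "__main__":
    print(translate(fgdc_record, FGDC_TO_UDK))
```

A query phrased against one model (for example, Topic = "Dakota Aquifer") can then be answered from records catalogued under the other.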
HP 39
Different views of Metadata
Domain Independent Specifications (RDF)
Frameworks/Infrastructures (XCM)
Metadata
Application Specific: ICE
Media Specific: MPEG7, VoiceXML
Domain Specific: NewsML, FGDC/UDK
HP 40
Creating and Serving Metadata to Power the Life-cycle of Content
Taalee Infrastructure Services, Taalee Content Applications
Where is the content? Whose is it?
Produce/Aggregate
Catalog/Index
What other content is it related to?
Integrate, Syndicate
What is the right content for this user?
Personalize
What is the best way to monetize this interaction?
Interactive Marketing
Broadcast, Wireline, Wireless, Interactive TV
Taalee Semantic MetaBase
HP 41
Taalee's Intelligent Content Process
HP 42
Metadata Creation and Semanticization
• Automatic Content Classification/Categorization
• Metadata Creation/Extraction: types of metadata created
Semantic Engine and WorldModel are trademarks of Taalee Inc. Metadata Extraction is a patented technology of Taalee Inc.
HP 43
Forms/Types/Ingest of Content
Sources: Web Sites, Content Feeds, and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)
HP 44
Content Handling/Ingest
Infrastructure/Exchange
Feed Handlers, Crawlers, Screen Scrapers, Bots, Software Agents
Centralized, Distributed, Mobile/Migratory
HP 45
Information Extraction for Metadata Creation
Global/Enterprise/Web Repositories
METADATA EXTRACTORS
Digital Maps
Nexis, UPI, AP
Documents
Digital Audio
Data Stores
Digital Videos
Digital Images
HP 46
Extracting a Text Document: Syntactic approach
INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT
NATIONAL PREPAREDNESS LEVEL II
CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.
SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.
CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. The fire is contained. Major areas of heat have been mopped-up. All crews and overhead will mop-up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.
LAYOUT
Date => day month int ',' int
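The layout rule above (a date is a day name, a month name, an integer, a comma, and another integer) can be sketched as a regular expression. This is an illustrative reading of the rule, not the extractor used in the talk; the name `extract_date` is hypothetical, and the commas are treated as optional as an assumption so that text which has lost its punctuation still matches.

```python
import re

DAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"
MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")

# Date => day month int ',' int
DATE_RULE = re.compile(
    rf"\b(?P<day>{DAYS}),?\s+(?P<month>{MONTHS})\s+"
    rf"(?P<day_of_month>\d{{1,2}}),?\s+(?P<year>\d{{4}})\b"
)


def extract_date(text):
    """Return the first date found as a dict of strings, or None."""
    match = DATE_RULE.search(text)
    return match.groupdict() if match else None


if __name__ == "__main__":
    header = "INCIDENT MANAGEMENT SITUATION REPORT Friday, August 1, 1997 - 0530 MDT"
    print(extract_date(header))
```

A purely syntactic rule like this finds the date wherever the pattern occurs, but it carries no knowledge of what the date refers to; that gap is what the semantic approaches later in the deck address.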
Traditional Text Categorization
Customer Article Feed: Article 4715
Statistical/AI Techniques, with Customer Training Set
Classify: Place in a taxonomy
Routing/Distribution
Standard Metadata
  Feed Source: iSyndicate
  Posted Date: 11/20/2000
Classification of Article 4715
Taalee's Categorization & Automatic Metadata Creation
Automated Content Enrichment (ACE)
Taalee Enterprise Customization Suite
Article Feed: Article 4715
Knowledge-base & Statistical/AI Techniques, with Taalee Training Set and Customer Training Set
Classify: Place in a taxonomy
Map to another taxonomy
Metadata Catalog
Content Manager
Routing/Distribution
Precise syndication/filtering
Article 4715 Metadata
  Standard metadata
    Feed Source: iSyndicate
    Posted Date: 11/20/2000
  Semantic metadata
    Company Name: France Telecom, Equant
    Ticker Symbol: FTE, ENT
    Exchange: NYSE
    Topic: Company News
Classification of Article 4715
  FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
  ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
  NYSE: Member Companies, Market News, IPOs
HP 49
Automatic Categorization & Metadata Tagging (unstructured text: transcript of A/V)
ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.
Video Segment with Associated Text
Segment Description
Semantic Metadata
Auto Categorization
HP 50
Automatic Categorization & Metadata Tagging (Web page)
Video with Editorialized Text on the Web
Auto Categorization
Semantic Metadata
HP 51
Automatic Categorization & Metadata Tagging (Feed)
Text from Bloomberg
Auto Categorization
Semantic Metadata
HP 52
Taalee Extraction and Knowledgebase Enhancement
Extraction Agent
Web Page Enhanced Metadata Asset
HP 53
Basis for Semantics
A. Facts/Concepts/Terms/Entities
  Dictionary, Thesaurus, Reference Data, Vocabulary
B. Facts with Relationships
  Taxonomy (Categories), Ontology, Domain Modeling (e.g. Golf = golfer, tournament name, golf course, event)
Knowledge Base
HP 54
Basis for Semantics
C. Reasoning/Inference
  (Statistical) (Information Retrieval)
  Statistical Learning/AI (Bayesian, Neural Networks, HMM, …)
  Logic Based (Description Logic)
  Natural Language/Grammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methods: Cluster Analysis
Learning/AI and Collaborative Filtering
Reference data: Concept-terms, Dictionary, Thesaurus; by topic, industry, subject, domain
Word or Phrase
Ontologies, Domain Models
KnowledgeBase: by Entities and Relationships
deeper understanding
HP 56
Open Directory Project (ODP): Classification/Taxonomy & Directory
HP 57
Ontology
Standardize meaning, description, representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes
  Attributes
  Domain Rules
  Functional Dependencies
HP 59
An Ontology
Example: Interrelated ontologies
[Figure: interrelated ontologies; node labels recovered from the diagram]
LANDUSE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL
LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE
WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; shredding, magnetic separation, screening, washing
NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI; four "causes" links between disaster types
HP 61
Large Vocabularies, Taxonomies, Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.
Confidential HP
Metadata-enabled Applications
HP 63
Metadata Usage: Impact on Search & Query Processing
traditional queries based on keywords
attribute-based queries
content-based queries
HP 64
Oingo.com
Oingo Ontology - ODP based() the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely-related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged, and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:
• Context of the Content
• Semantic Metadata describing entities (i.e., Company, Industry, etc.) and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
"Normal" Content can only be "found" if the user enters a keyword that exists within it
Intelligent Content=+
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:
Context of the Content
Semantic Metadata describing entities (i.e., Company, Industry, etc.) and
Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g. Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …
HP 70
Metadata Usage: Keyword, Attribute, and Content Based Access
The VisualHarness system at LSDIS, UGA
HP 71
Keyword Search vs. Attribute Search with Semantic Metadata
Virage: Search on football touchdown
  Jimmy Smith Interview, Part Seven. Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline
  Brian Griese Interview, Part Four. Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline
Metadata from Typical Cataloging of Football Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
  Baltimore 31, Pit 24
  http://www.nfl.com
  Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter
  League: Professional
  Teams: Ravens, Steelers
  Score: Bal 31, Pit 24
  Players: Qadry Ismail, Tony Banks
  Event: Touchdown
  Produced by: NFL.com
  Posted date: 2/02/2000
HP 72
Taalee's Semantic Search
Highly customizable, precise, and freshest A/V search
Context and Domain Specific Attributes; Uniform Metadata for Content from Multiple Sources; Can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY amp CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content
Personalization and Targeting: interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries amp Hot Topics
PERSONALIZATION
3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
4 Pink Floyd Collection
Personalized Queries
Set the Controls for the Heart of the Sun…
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
2 My Football Fantasy Team
Gators' Spurrier ready for big game
Tech's Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone…
Edwards explains reasons for leaving BYU more…
more…
more…
more…
1 Election 2000
2 Middle East Peace Conflict
3 Napster Controversy
Video Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Cole's security
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II more…
HOT Topics
more…
more…
HP 80
Metadata Targeting
Semantic/Interactive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web: Extreme Personalization
Realtime Feeds
Interests, Preferences
Time-Shifted Content Aggregator
Web sites and Pages
Content Databases
Semantic Engine(TM)
Structured Hi-Quality Semantic Metabase
Personalized Content
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site
User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and the knowledge of user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions listen set alert on story add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst
HP 87
iTV: Taalee's Extreme Personalization
Content Provider
(DBS, DISH, Wink, AOL-TV)
Semantic Engine(TM)
Meta-Data Tagged Content
Content "Programs"
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-Quality Semantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by personalization application to show only the stocks that user is interested in
This screen is customizable with interactivity feature, using metadata such as whether there is a new Conference Call video on CSCO
Part of the screen can be automatically customized to show conference call specific information, including transcript, participation, etc., all of which are relevant metadata
Conference Call itself can have embedded metadata to support personalization and interactivity
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support
Sony
Categorize
Catalog
Integrate
Collection, Processing
Network Content
Affiliate Feeds
Public Sources
Rich Data Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp. (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush
Video
bull Value-add for production broadcast amp syndication
bull Taalee's semantic metadata enables powerful access to content used by Enterprise's customers
bull Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday, the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore
HP 92
Retrieve Scene Description Track
Enhanced Digital Cable
Video
MPEGDecoder
Node = AVO Object
Create Scene Description Tree
GREATUSER
EXPERIENCE
Metadata's role in emerging iTV infrastructure
MPEG-2/4/7
MPEG Encoder
Scene Description Tree
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers, and Broadcasters
Enhanced XML Description
"Cisco Systems" Node
Taalee Semantic Engine
"Cisco Systems"
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta
Object Content Information (OCI)
Metadata-rich Value-added Node
HP 93
Intelligent Metadata Creation
Content which does contain the words the user asked for
Extractor Agents
Content which does not contain the words the user asked for, but is about what he asked for
Value-added Metadata
Content the user did not think to ask for, but which he needs to know
Semantic Associations
+ +
Metadata for Intelligent Content
Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"
Understanding of the content is needed to create new metadata
Taalee understands Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
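The Roger Clemens example can be sketched in a few lines: a small knowledge base supplies the team and sport for a recognized player, and those values are attached to the story as extra metadata. This is a toy illustration only; the knowledge-base layout, the sample story text, and the function names are all hypothetical and say nothing about how Taalee's engine is actually built.

```python
# Hypothetical miniature knowledge base: entity -> known facts about it.
KNOWLEDGE_BASE = {
    "Roger Clemens": {
        "Type": "PERSON",
        "Sport": "Baseball",
        "Team": "New York Yankees",
    },
}


def value_added_metadata(text):
    """Attach facts about every known entity that is mentioned in the text."""
    metadata = {}
    for entity, facts in KNOWLEDGE_BASE.items():
        if entity in text:
            metadata.setdefault("Players", []).append(entity)
            for attribute in ("Sport", "Team"):
                metadata.setdefault(attribute, []).append(facts[attribute])
    return metadata


def keyword_match(text, query):
    """Syntactic search: the query must literally occur in the content."""
    return query.lower() in text.lower()


def metadata_match(metadata, query):
    """Search over the attached metadata values instead of the raw text."""
    return any(query.lower() in value.lower()
               for values in metadata.values() for value in values)


if __name__ == "__main__":
    story = "Roger Clemens struck out ten batters last night."  # made-up sample text
    metadata = value_added_metadata(story)
    print(keyword_match(story, "Yankees"), metadata_match(metadata, "Yankees"))
```

The keyword search fails because the word never appears in the story, while the metadata search succeeds because the team name was added from the knowledge base rather than extracted from the text.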
HP 96
Guided Demo for Value Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
  Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL
  Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player of Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
  Produced by: ESPN; Posted Date: 3/03/2001; League: National League
  Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player of Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 - Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Producer Name: Name of content provider that produced the asset
Posted Date: Date of asset posting - Extracted automatically
Player Names: Name of players mentioned explicitly in the asset - Extracted automatically
Team Name: Name of team for which player plays - Not mentioned explicitly in asset - Value-added using Taalee's semantic relationships
League Name: Name of league to which the player's team belongs - Not mentioned explicitly in asset - Value-added by Taalee's processing based on semantic associations
Sport: Name of sport
Some Metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships
The asset is richly, fully described in the many ways the users choose to interact
Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context, or relationships of keywords
For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (To view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
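The Commerce One example can be sketched as a lookup over typed relationships: starting from the company the user asked for, follow COMPETES WITH links and collect news about the companies reached. Everything here is a hypothetical illustration: the triple layout, the relation names, and the placeholder news items are assumptions, not Taalee's data model or real headlines.

```python
# Hypothetical (subject, relation, object) facts taken from the example above.
TRIPLES = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Ariba", "is_a", "COMPANY"),
    ("Commerce One", "participates_in", "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]

# Placeholder catalog: company -> news items tagged with that company.
NEWS = {
    "Commerce One": ["placeholder story A about Commerce One"],
    "Ariba": ["placeholder story B about Ariba"],
}


def related(entity, relation):
    """Entities linked to `entity` by `relation`, treating the link as symmetric."""
    found = []
    for subject, rel, obj in TRIPLES:
        if rel != relation:
            continue
        if subject == entity:
            found.append(obj)
        elif obj == entity:
            found.append(subject)
    return found


def news_with_associations(entity):
    """What the user asked for, plus news on semantically associated competitors."""
    return {
        "requested": NEWS.get(entity, []),
        "competitors": {rival: NEWS.get(rival, [])
                        for rival in related(entity, "competes_with")},
    }


if __name__ == "__main__":
    print(news_with_associations("Commerce One"))
```

A keyword index alone could only return the first list; the competitor news appears because the relationship is stored explicitly and can be traversed.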
HP 104
Metadata-centric Content Management Architecture
Internal Source 1: Research
Internal Source 2
External feeds/Web (e.g. Reuters)
Extractor Agent 1
Extractor Agent 2
Extractor Agent 3
Semantic Engine
World Model
Taalee Metabase
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults KnowledgeBase for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story; passed on to Dashboard
Story on Lucent
Story on Cisco
XCM-compliant metadata, XML or other format
Semantic Application
ASP/Enterprise hosted
Third-party Content Mgmt and Syndication
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
SEC, EPA
Regulations: Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic collation of semantically related digital media information from multiple sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for the future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic rules which allow us to draw inferences from this data
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]
A personal definition: Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-layered Architecture of the Semantic Web
Logical Layer: Formal Semantics and Reasoning Support - OIL, DAML-O
Schema Layer: Definition of Vocabulary - RDF Schema
Data Layer: Simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL - Evolving towards Semantic Web
OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Test May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude,
                  Earthquake.latitude, Earthquake.longitude ) < 10000
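The rule above can be evaluated over plain records, as in the sketch below. Several points are assumptions rather than anything the slides state: the threshold units (30 days, 10000 miles) are taken from the query wording later in the deck, distance is computed with the haversine great-circle formula, and the earthquake is required to fall on or after the test date. All function and field names other than eventDate, latitude, and longitude are hypothetical.

```python
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an approximation


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, using the haversine formula."""
    phi1, phi2 = radians(lat1), radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = radians(lon2 - lon1)
    a = sin(d_phi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(d_lambda / 2) ** 2
    # min() guards against rounding pushing sqrt(a) slightly above 1.0
    return 2 * EARTH_RADIUS_MILES * asin(min(1.0, sqrt(a)))


def date_difference(test_date, quake_date):
    """Days from the nuclear test to the earthquake (negative if the quake came first)."""
    return (quake_date - test_date).days


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """Apply the 'NuclearTest Causes Earthquake' rule to one pair of records."""
    days = date_difference(test["eventDate"], quake["eventDate"])
    if days < 0 or days >= max_days:
        return False
    return distance_miles(
        test["latitude"], test["longitude"],
        quake["latitude"], quake["longitude"],
    ) < max_miles
```

A match only says the pair satisfies the rule; as the slides note, whether the relationship is real is exactly what the knowledge discovery queries are meant to probe.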
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example...
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale per year starting from 1900, find the average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
Knowledge Discovery - Example...
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NewsML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
Semantic Web: www.semanticweb.org
VoiceXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 8
Metadata for Digital Data
Metadata | Data Type | Metadata Type
Q-Features [Jain and Hampapur] | Image, Video | Domain Specific
R-Features [Jain and Hampapur] | Image, Video | Domain Independent
Meta-Features [Jain and Hampapur] | Image, Video | Content Independent
Impression Vector [Kiyoki et al.] | Image | Content Descriptive
NDVI, Spatial Registration [Anderson and Stonebraker] | Image | Domain Specific
Speech Feature Index [Glavitsch et al.] | Audio | Direct Content Based
Topic Change Indices [Chen et al.] | Audio | Direct Content Based
Document Vectors [Deerwester et al.] | Text | Direct Content Based
Inverted Indices [Kahle and Medlar] | Text | Direct Content Based
Content Classification Metadata [Bohm and Rakow] | MultiMedia | Domain Specific
Document Composition Metadata [Bohm and Rakow] | MultiMedia | Domain Independent
Metadata Templates [Ordille and Miller] | Media Independent | Domain Specific
Land Cover, Relief [Sheth and Kashyap] | Media Independent | Domain Specific
Parent Child Relationships [Shklar et al.] | Text | Domain Independent
Contexts [Sciore et al., Kashyap and Sheth] | Structured | Domain Specific
Concepts from Cyc [Collet et al.] | Structured | Domain Specific
User's Data Attributes [Shoens et al.] | Text, Structured | Domain Specific
Domain Specific Ontologies [Mena et al.] | Media Independent | Domain Specific
HP 9
Types of Specs and Standards (or MetaModels)
Domain Independent: (MCF), RDF, MOF, Dublin Core
Media Specific: MPEG4, MPEG7, VoiceXML
Domain/Industry Specific (metamodels): MARC (Library), FGDC and UDK (Geographic), NewsML (News), PRISM (Publishing)
Application Specific: ICE (Syndication)
Exchange/Sharing: XCM, XMI
Orthogonal (Other): RDFS, namespaces, ontologies, domain models (DAML, OIL)
HP 10
What RDF can do for metadata
Designed to impose structural constraints on syntax to support consistent encoding, exchange, and processing of metadata
Domain Independent Metadata standard
HP 11
RDF (Resource Description Framework)
Property, Value, Resource
• RDF data consists of nodes and attached attribute/value pairs
• Nodes can be any web resources (pages, servers, basically anything for which you can give a URI), even other instances of metadata
• Attributes are named properties of the nodes, and their values are either atomic (text strings, numbers, etc.) or other resources or metadata instances
HP 12
RDF Example 1
URI:TALK --dc:creator--> URI:AMIT
URI:TALK --dc:title--> "Mysteries of Metadata"
<?xml version='1.0'?>
<rdf:RDF xmlns:rdf="http://www.w3.org/TR/REC-rdf-syntax"
         xmlns:dc="http://purl.org/dc/elements/1.0/">
  <rdf:Description rdf:about="URI:TALK">
    <dc:title>Mysteries of Metadata</dc:title>
    <dc:creator rdf:resource="URI:AMIT"/>
  </rdf:Description>
</rdf:RDF>
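To make the node/property/value model concrete, the sketch below reads an RDF/XML description like Example 1 and lists its (resource, property, value) triples using only the Python standard library. The punctuation in the namespace URIs and in the identifiers URI:TALK and URI:AMIT is restored by assumption, and the function name triples is hypothetical. A real application would normally use a dedicated RDF parser; this handles only the simple rdf:Description form shown on the slide.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as they appear on the slide, with punctuation restored (an assumption).
RDF_NS = "http://www.w3.org/TR/REC-rdf-syntax"
DC_NS = "http://purl.org/dc/elements/1.0/"

RDF_XML = f"""<?xml version='1.0'?>
<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:dc="{DC_NS}">
  <rdf:Description rdf:about="URI:TALK">
    <dc:title>Mysteries of Metadata</dc:title>
    <dc:creator rdf:resource="URI:AMIT"/>
  </rdf:Description>
</rdf:RDF>"""


def triples(rdf_xml):
    """Return (resource, property, value) triples from simple rdf:Description blocks."""
    root = ET.fromstring(rdf_xml)
    result = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # A value is either another resource (rdf:resource) or an atomic literal.
            resource = prop.get(f"{{{RDF_NS}}}resource")
            value = resource if resource is not None else (prop.text or "").strip()
            result.append((subject, prop.tag, value))
    return result
```

Each property tag comes back in ElementTree's {namespace}name form, so dc:title appears as {http://purl.org/dc/elements/1.0/}title.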
HP 13
RDF Example 2
URI:TALK --dc:title--> "Mysteries of Metadata"
URI:TALK --dc:creator--> URI:AMIT
URI:AMIT --BIB:Name--> "Amit Sheth"
URI:AMIT --BIB:Email--> amit@taalee.com
URI:AMIT --BIB:Aff--> URI:LIB
HP 14
RDFS (RDF Schema)
Enables resource description communities to define (and share) vocabularies (museum, library, e-commerce, ...)
Vocabulary (in RDFS) = the meaning, characteristics, and relationships of a set of properties
HP 15
RDF Based Web
HTML
Resources
RDFXMLDescriptions
RDFSchemas
Source: http://www.w3c.rl.ac.uk
HP 16
Dublin Core Metadata Initiative
Simple element set designed for resource description
International, inter-discipline, W3C community consensus
"Semantic" interface among resource description communities (very limited form of semantics)
Source: www.desire.org
HP 17
Dublin Core RDF
<xml>
<?namespace href = "http://w3.org/rdf-schema" as = "RDF"?>
<?namespace href = "http://metadata.net/DC" as = "DC"?>
<RDF:Abbreviated>
  <RDF:Assertion RDF:HREF = "http://www.mysite.com/mydoc.html"
    DC:Title = "I've Never Metadata I've Never Liked"
    DC:Creator = "Mary Crystal"
    DC:Subject = "Metadata, Dublin Core, Stuff"/>
</RDF:Abbreviated>
</xml>
HP 18
MOF (Meta Object Facility) and XMI
MOF models metadata using a subset of UML that is relevant to modeling metadata (class models: classes, associations, and subtyping), plus a set of rules for mapping the elements of the MOF Core to CORBA IDL
XML Metadata Interchange (XMI) is an extension of the MOF into the XML space
HP 19
NewsML
NewsML is a packaging and metadata format for news content
NewsML is developed by the International Press Telecommunications Council (IPTC), a consortium of news providers, mostly in the print or wire-service industries
Since it deals only with packaging and metadata, NewsML is complementary both to news content formats like NITF and to syndication protocols like ICE
HP 20
NewsML...
It can be used by news providers to combine their pictures, video, text, graphics, and audio files in news output available on web sites, mobile phones, high-end desktops, interactive television, and any other device
Accurate, objective set of description tools which help qualify the information and make the search more precise
NewsML allows a range of metadata to be attached to a multimedia story, including a detailed computer-readable description of what an item is about
HP 21
Example of the end-to-end flow - NewsML
The content provider supplies NewsML-packaged media content to the operator. The content is categorized as current events, finance, sport, etc., and updated hourly.
The operator receives NewsML data from the content provider. The content server automatically pushes updated news articles to all news service subscribers.
Consumers sign up for the news service directly on the device. When using the news service, the user browses through the categories and reads the news articles. The news articles are presented in a continuous flow (one after the other) without end-user interaction.
Source: http://www.mediabricks.com
HP 22
PRISM
Publishing Requirements for Industry Standard Metadata
Version 1.0, April 2001
Authors: IDEAlliance (Adobe, Vignette, Kinecta, et al.)
Idea: "a standard for interoperable content description, interchange, and reuse in both traditional and electronic publishing contexts"
Web site: http://www.prismstandard.org
HP 23
PRISM Design
Built on existing standards like Dublin Core (DC), RDF, XML
Designed to be used in a simple, straightforward way over the Internet
Compatible with NewsML
Integrates easily with ICE (for syndication)
Vocabulary: Basic DC; Extensions; "Controlled Vocabularies", e.g., "North American Industrial Classification System" (NAICS)
HP 24
PRISM Example
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845"/>
    <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>
(Source: PRISM spec v. 1, http://www.prismstandard.org/techdev/prismspec1.asp)
HP 25
VoiceXML
A language for specifying voice dialogs
Voice dialogs use audio prompts and text-to-speech (TTS) for output; touch-tone keys (DTMF) and automatic speech recognition (ASR) for input
Goal is to bring the advantages of web-based development and content delivery to interactive voice response applications
High-level, voice-specific language simplifies application development
Source: http://www.voicexml.org
HP 26
Voice Based Internet Applications
Source: http://www.voicexml.org
HP 27
Voice XML Metadata
Voice Specific metadata
Supports syntactic interoperability
Text data to voice data
Voice XML = XML + Voice Metadata
HP 28
VoiceXML - Possible Services
Information retrieval - News, sports, traffic, stock quotes
e-Transactions (e-commerce, e-tailing, etc.)
Financial: banking, stock trading
Catalog browsing (generally as an adjunct to paper)
Telephone services
Personal voice dialing, one-number "find-me" services
Intranet - Inventory, HR services, corporate portals
Unification - "My Whatever" personal portals, personal agents, unified messaging
Source: http://www.voicexml.org
HP 29
MPEG7
A set of description schemes and descriptors to describe the content of multimedia data
Provides a language to specify description schemes
A scheme for coding the description
HP 30
Application Examples for MPEG7
A few application examples are
Digital libraries (image catalog musical dictionary)
Multimedia directory services (eg yellow pages)
Broadcast media selection (radio channel TV channel)
HP 31
Information and Content Exchange (ICE)
Main Goal: efficient and extensible Content Syndication protocol for the Internet, using XML syntax
Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.
Status: latest spec version 1.1, May 2000; submitted to W3C for review
Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, ...
Web Site: http://www.icestandard.org
HP 32
What is the ICE Protocol
Syndication Protocol for communication between Syndicators and Subscribers
Metadata to define:
roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver
format and method of content exchange (e.g., sequenced packages, pull vs. push model)
HP 33
ICE Applications
ICE vocabulary + domain vocabulary = complete application
ICE:
establishes and manages the syndication
delivers data
logs events
=> content-independent metadata
Industry-specific vocabulary defines the content => domain-specific metadata
Source: http://www.icestandard.org
HP 34
ICE Explained
ICE: Information and Content Exchange protocol
Syndicator: A content aggregator and distributor
Subscriber: A content consumer
Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement
Collection: The current content of a subscription
ICE Package: A delivery of commands to update a collection, such as the addition of content items
ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information
Sources: InternetWeek; ICE Cookbook version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm
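The relationship between a subscription, its collection, and the packages that update it can be sketched as follows. This is a toy model of the vocabulary only, not the ICE wire protocol: the class and field names are hypothetical, the removal command is an assumption (the glossary names only additions explicitly), and ignoring a package that was already applied is a simplification standing in for ICE's package sequencing.

```python
from dataclasses import dataclass, field


@dataclass
class Package:
    """A delivery of commands that update a collection (hypothetical shape)."""
    package_id: str
    adds: dict = field(default_factory=dict)     # item_id -> content to add
    removes: list = field(default_factory=list)  # item_ids to remove (assumed command)


@dataclass
class Subscription:
    """Subscriber-side view: the collection is the current content of the subscription."""
    subscription_id: str
    collection: dict = field(default_factory=dict)
    applied: list = field(default_factory=list)  # package ids already processed

    def apply(self, package):
        """Apply a package once; return False if it was already applied."""
        if package.package_id in self.applied:
            return False
        for item_id in package.removes:
            self.collection.pop(item_id, None)
        self.collection.update(package.adds)
        self.applied.append(package.package_id)
        return True
```

For example, applying Package("pkg-1", adds={"4321": "Cartoon"}) to an empty Subscription leaves a collection containing item 4321, and applying the same package again changes nothing.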
<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe"
             timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon"
                filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru"
                     copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC ... (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>
Content (domain-specific metadata)
HP 36
XCM (eXtended Content Management)
A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:
Content Development - Developing static content and managing the process of its subsequent approval, versioning, storage, and retrieval
Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle
Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability
Source: http://www.vignette.com/CDA/Site/0,2097,1-1-30-1458-1146-1743,00.html
HP 37
XCM
eXtended Content Management
Content DevelopmentManagement
Content DeliveryApplication ContentManagement
Content AuthoringDigital Asset Management
Software ConfigurationManagement
Document ProcessManagement
Metadata ManagementRecombinationPersonalization
Edge Network Delivery
Streaming Media DeliveryCaching
Source: http://www.vignette.com
HP 38
Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain
FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc/
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
...
UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Adress Id: http://gisdasc.kgs.ukans.edu/dasc/
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
...
Kansas State
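One simple way to reconcile the two models is a tag-name mapping onto a shared vocabulary, sketched below. The FGDC and UDK tag names are the ones listed on the slide; the common field names (keywords, title, and so on) and the function name normalize are hypothetical, and a real crosswalk between the standards would cover many more elements and would also need value conversions.

```python
# Tag names from the slide mapped onto a hypothetical common vocabulary.
FGDC_TO_COMMON = {
    "Theme keywords": "keywords",
    "Title": "title",
    "Online linkage": "url",
    "Direct Spatial Reference Method": "spatial_representation",
    "Horizontal Coordinate System Definition": "coordinate_system",
}

UDK_TO_COMMON = {
    "Search terms": "keywords",
    "Topic": "title",
    "Adress Id": "url",  # spelled as on the slide
    "Measuring Techniques": "spatial_representation",
    "Co-ordinate System": "coordinate_system",
}


def normalize(record, mapping):
    """Rename a record's tags to the common vocabulary, dropping unmapped tags."""
    return {mapping[tag]: value for tag, value in record.items() if tag in mapping}


fgdc_record = {
    "Theme keywords": ["digital line graph", "hydrography", "transportation"],
    "Title": "Dakota Aquifer",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}

udk_record = {
    "Search terms": ["digital line graph", "hydrography", "transportation"],
    "Topic": "Dakota Aquifer",
    "Measuring Techniques": "Vector",
    "Co-ordinate System": "Universal Transverse Mercator",
}
```

After normalization the two records compare equal, which is the point of the slide: the data is the same and only the tag names differ.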
HP 39
Different views of Metadata
Domain Independent Specifications (RDF)
FrameworksInfrastructures (XCM)
MetadataApplication Specific
ICE
Media Specific
MPEG7 VoiceXML
Domain Specific
NewsML FGDCUDK
HP 40
Creating and Serving Metadata to Power the Life-cycle of Content
Taalee Infrastructure Services Taalee Content Applications
Where is the content
Whose is it
ProduceAggregate
CatalogIndex
What other content is it related to
Integrate Syndicate
What is the right content for this
user
Personalize
What is the best way to
monetize this interaction
Interactive Marketing
BroadcastWirelineWirelessInteractive TV
Taalee Semantic MetaBase
HP 41
Taalee's Intelligent Content Process
HP 42
Metadata Creation and Semanticization
• Automatic Content Classification/Categorization
• Metadata Creation/Extraction
Types of metadata created
Semantic Engine and WorldModel are trademarks of Taalee Inc. Metadata Extraction is a patented technology of Taalee Inc.
HP 43
Forms/Types/Ingest of Content
Sources: Web Sites, Content Feeds, and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)
HP 44
Content Handling/Ingest
Infrastructure/Exchange
Feed Handlers
Crawlers/Screen Scrapers
Bots/Software Agents
Centralized, Distributed, Mobile/Migratory
HP 45
Information Extraction for Metadata Creation
GlobalEnterpriseWeb Repositories
METADATA EXTRACTORS
Digital Maps
Nexis, UPI, AP
Documents
Digital Audios
Data Stores
Digital Videos
Digital Images
HP 46
Extracting a Text Document: Syntactic approach
INCIDENT MANAGEMENT SITUATION REPORT
Friday August 1, 1997 - 0530 MDT
NATIONAL PREPAREDNESS LEVEL II
CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.
SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr. The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.
CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. The fire is contained. Major areas of heat have been mopped-up. All crews and overhead will mop-up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.
LAYOUT
Date => day month int ',' int
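A layout rule of this kind maps naturally onto a regular expression. The sketch below is one possible reading of the Date pattern (day name, month name, integer, comma, integer); the regular expression, the optional commas, and the function name extract_date are assumptions, not the extractor used in the original system.

```python
import re
from datetime import date

DAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"
MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]

# Date => day month int ',' int   (commas treated as optional, an assumption)
DATE_RE = re.compile(
    rf"(?P<day>{DAYS}),?\s+(?P<month>{'|'.join(MONTHS)})\s+"
    rf"(?P<dom>\d{{1,2}}),?\s+(?P<year>\d{{4}})"
)


def extract_date(text):
    """Return the first date matching the layout rule, or None if there is none."""
    match = DATE_RE.search(text)
    if match is None:
        return None
    month_number = MONTHS.index(match.group("month")) + 1
    return date(int(match.group("year")), month_number, int(match.group("dom")))
```

Applied to the report header above, the pattern picks out Friday August 1, 1997 and yields a structured date value that can be stored as metadata.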
Traditional TextCategorization
StatisticalAI Techniques
Classify Place ina taxonomy
feed
Customer Training
Set
RoutingDistribution
Customer Article Feed
4715
Standard Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Classification of Article 4715
Knowledge-base amp StatisticalAI Techniques
ClassifyPlace ina taxonomy
MetadataCatalog
Content Manager
Precise syndicationfiltering
Article 4715 Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News
Standard metadata
Semantic metadata
FTECompany AnalysisConference Calls
EarningsStock Analysis
NYSEMember Companies
Market NewsIPOs
Automated Content Enrichment (ACE)
Taalee Enterprise Customization Suite
Taalee's Categorization & Automatic Metadata Creation
Taalee Training
Set
Customer Training
Set ee ENTCompany AnalysisConference Calls
EarningsStock Analysis
Classification of Article 4715
Article Feed4715 RoutingDistribution
Map to another taxonomy
HP 49
Automatic Categorization & Metadata Tagging (unstructured text: transcript of A/V)
ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION AS OF TONIGHT THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49 IN WASHINGTON STATE THE SENATE RACE REMAINS TOO CLOSE TO CALL IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT THE SENATE WILL BE EVENLY DIVIDED IN MISSOURI REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN WHO DIED IN A CRASH THREE WEEKS AGO GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT HILLARY CLINTON WON THE NEW YORK SENATE SEAT SHE IS THE FIRST FIRST LADY TO RUN MUCH LESS WIN
Video Segmentwith Associated Text
Segment Description
SemanticMetadata
AutoCategorization
HP 50
Automatic Categorization & Metadata Tagging (Web page)
Video with Editorialized Text on the Web
AutoCategorization
Semantic Metadata
HP 51
Automatic Categorization & Metadata Tagging (Feed)
Text from Bloomberg
AutoCategorization
Semantic Metadata
HP 52
Taalee Extraction and Knowledgebase Enhancement
Extraction Agent
Web Page Enhanced Metadata Asset
HP 53
Basis for Semantics
A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary
B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)
Knowledge Base
HP 54
Basis for Semantics
C. Reasoning/Inference: (Statistical) (Information Retrieval); Statistical Learning/AI (Bayesian, Neural Networks, HMM, ...); Logic Based (Description Logic); Natural Language/Grammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methods: Cluster Analysis
Learning/AI and Collab. Filtering
Reference data: Concept-terms, Dictionary, Thesaurus; by topic/industry/subject/domain
Word or Phrase
Ontologies/Domain Models
KnowledgeBase: by Entities and Relationships
deeper understanding
HP 56
Open Directory Project (ODP): Classification/Taxonomy & Directory
HP 57
Ontology
Standardize meaning, description, and representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes: Attributes, Domain Rules, Functional Dependencies
HP 59
An Ontology
Example Interrelated ontologies
[Figure: interrelated ontologies. Node labels, in the order they appear: LANDUSE, COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL; LAND (SITE), CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE; WASTE DISPOSAL, RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE, shredding, magnetic separation, screening, washing; NATURAL DISASTER, EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI. Several natural-disaster nodes are linked by "causes" relationships.]
HP 61
Large Vocabularies, Taxonomies, Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.
Confidential HP
Metadata enabledApplications
HP 63
Metadata Usage: Impact on Search & Query processing
traditional queries based on keywords
attribute-based queries
content-based queries
HP 64
Oingo.com
Oingo Ontology - ODP based(?): the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning, used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged, and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:
• Context of the Content
• Semantic Metadata describing entities (i.e., Company, Industry, etc.) and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
"Normal" Content can only be "found" if the user enters a keyword that exists within it
Intelligent Content=+
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:
Context of the Content
Semantic Metadata describing entities (i.e., Company, Industry, etc.) and
Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, ...
HP 70
Metadata Usage Keyword Attribute and Content Based Access
The VisualHarness system at LSDIS/UGA
HP 71
Keyword Search vs Attribute Search with Semantic metadata
Virage Search on football touchdown
Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL httpcbssportsline
Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL httpcbssportsline
Metadata from Typical Cataloging of Football Assets
Taalee Metadata on Football Assets
Rich Media Reference Page: Baltimore 31, Pit 24 (http://www.nfl.com)
Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2022000
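The difference between the two kinds of search can be shown with a small sketch. The two assets and their field values come from the slide; the record layout and the functions keyword_search and attribute_search are hypothetical, and real systems would add ranking, stemming, and indexing.

```python
# Two assets from the slide; the record layout is an assumption.
ASSETS = [
    {
        "title": "Jimmy Smith Interview Part Seven",
        "description": "Jimmy Smith explains his philosophy on showboating",
        "metadata": {},
    },
    {
        "title": "Baltimore 31, Pit 24",
        "description": "Quandry Ismail and Tony Banks hook up for their third "
                       "long touchdown, this time on a 76-yarder",
        "metadata": {
            "League": "Professional",
            "Teams": ["Ravens", "Steelers"],
            "Players": ["Quandry Ismail", "Tony Banks"],
            "Event": "Touchdown",
        },
    },
]


def keyword_search(assets, *keywords):
    """Titles of assets whose title or description contains every keyword."""
    hits = []
    for asset in assets:
        text = f"{asset['title']} {asset['description']}".lower()
        if all(keyword.lower() in text for keyword in keywords):
            hits.append(asset["title"])
    return hits


def attribute_search(assets, **criteria):
    """Titles of assets whose semantic metadata matches every field=value criterion."""
    hits = []
    for asset in assets:
        metadata = asset["metadata"]

        def matches(name, wanted):
            value = metadata.get(name)
            return wanted in value if isinstance(value, list) else value == wanted

        if all(matches(name, wanted) for name, wanted in criteria.items()):
            hits.append(asset["title"])
    return hits
```

Here keyword_search(ASSETS, "football", "touchdown") finds nothing because neither text contains the word "football", while attribute_search(ASSETS, Event="Touchdown", Teams="Ravens") finds the Baltimore asset through its metadata.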
HP 72
Taalee's Semantic Search
Highly customizable, precise, and freshest A/V search
Context and Domain Specific Attributes
Uniform Metadata for Content from Multiple Sources; can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do?
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY amp CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content
Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries amp Hot Topics
PERSONALIZATION
3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
4 Pink Floyd Collection
Personalized Queries
Set the Controls for the Heart of the Sun...
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
2 My Football Fantasy Team
Gators' Spurrier ready for big game
Tech's Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone...
Edwards explains reasons for leaving BYU
1 Election 2000
2 Middle East Peace Conflict
3 Napster Controversy
Video Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Cole's security
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II
HOT Topics
HP 80
Metadata Targeting
SemanticInteractive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
Realtime Feeds
Interests Preferences
Time-ShiftedContent Aggregator
Web sites and Pages
ContentDatabases Personalized
Content
Semantic Engine™
Personalized Content
Content
Structured Hi-Quality
Semantic Metabase
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site
User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available.
The user clicks to see a listing of Analyst Calls on Cisco (next slide).
Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., an Analyst Call listing focuses on the source and the analyst name or company. The icons denote additional metadata, such as “Strong Buy” by the H&Q analyst.
HP 87
iTV: Taalee’s Extreme Personalization
Content Provider
(DBS DISH Wink AOL-TV)
Semantic EngineTM
Meta-DataTagged Content
ContentldquoProgramsrdquo
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-QualitySemantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.
This screen is customizable with an interactivity feature, using metadata such as whether there is a new Conference Call video on CSCO.
Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata.
The Conference Call itself can have embedded metadata to support personalization and interactivity.
HP 89
Metadata in Enterprise Apps
Filter Search ConsolidatePersonalize ArchiveLicensing Syndication
Production SupportProduction SupportSony
Categorize
Catalog
Integrate
CollectionCollection ProcessingProcessing
NetworkContent
AffiliateFeeds
Public Sources Rich Data
Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee’s semantic metadata enables powerful access to content used by the Enterprise’s customers
• Greatly enhances newsroom productivity and time-to-market
HP 91
-- Breaking News --Gore Demands That Recount Restart
Gore Says Fla Cant Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(133) - 120600 - ABC
(253) - 120600 - CBS
(516) - 120600 - ABC
(246) - 120600 - FOX
(133) - 120600 - NBC
(533) - 120600
(357) - 120600 - CBS
(427) - 120600 - ABC
(344) - 120600 - FOX
(724) - 120600 - CBS
(133) - 120600 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(133) - 120600 - ABC
(233) - 120600 - CBS
(312) - 120600 - NNS
(032) - 120600 - CBS
(133) - 120600 - CBS
Description
Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore
HP 92
Retrieve Scene Description Track
Enhanced Digital Cable
Video
MPEGDecoder
Node = AVO Object
Create Scene Description Tree
GREATUSER
EXPERIENCE
Metadata’s role in emerging iTV infrastructure
MPEG-2/4/7
MPEG Encoder
SceneDescriptionTree
License metadata decoder and semantic applications to
device makers
Channel salesthrough Video Server Vendors
Video App Servers and Broadcasters
Enhanced XML
Description
ldquoCisco Systemsrdquo
Node
TaaleeSemanticEngine
ldquoCisco Systemsrdquo
Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: John Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta
Object Content Information (OCI)
Metadata-richValue-added Node
HP 93
Intelligent Metadata Creation
Content which does contain the words the user asked for
Extractor Agents
Content which does not contain the words
the user asked for but is about what he asked
for
Value-added Metadata
Content the user did not think to ask for but
which he needs to know
Semantic Associations
+ +
Metadata for Intelligent Content
Usage
HP 94
Intelligent Contentvia
Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.
• If a keyword is not in the content, it cannot be found.
• The burden is on the user to think of and ask for the “right” keyword.
For example: if a story is about “Roger Clemens” but does not contain the words “New York Yankees”, that story cannot and will not be found if the user searches for “New York Yankees” or “Yankees”.
Understanding of the content is needed to create new metadata.
Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses such Semantic Associations (another example: COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely.
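The idea of adding missing metadata from known associations can be sketched in a few lines of Python. This is only an illustration, not Taalee's implementation: the dictionaries `PLAYS_FOR` and `MEMBER_OF` and the function `enrich` are hypothetical names, and the facts in them are the ones shown in the guided demos of this workshop.

```python
# Hypothetical knowledge base of semantic associations (illustrative only).
PLAYS_FOR = {                      # PERSON plays for TEAM
    "Jamal Anderson": "Atlanta Falcons",
    "Gary Sheffield": "Los Angeles Dodgers",
}
MEMBER_OF = {                      # TEAM is a member of LEAGUE
    "Atlanta Falcons": "NFL",
    "Los Angeles Dodgers": "National League",
}


def enrich(metadata):
    """Return a copy of `metadata` with Teams and League inferred from Players."""
    enriched = {key: list(values) for key, values in metadata.items()}
    teams = enriched.setdefault("Teams", [])
    leagues = enriched.setdefault("League", [])
    for player in enriched.get("Players", []):
        team = PLAYS_FOR.get(player)
        if team is None:
            continue               # unknown player: nothing to add
        if team not in teams:
            teams.append(team)
        league = MEMBER_OF.get(team)
        if league is not None and league not in leagues:
            leagues.append(league)
    return enriched


# Metadata extracted directly from the asset mentions only the player.
asset = {"Producer": ["NFL.com"], "Players": ["Jamal Anderson"]}
enriched = enrich(asset)
```

With the enriched record, a keyword search on the team name can match a story that never mentions the team.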
HP 96
Guided Demo for Value-Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html and search for Player = Jamal Anderson
• Click on the first result (titled “Week 3 Top10 Anderson TD Run”) and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked “REAL”)
• View the source HTML page that has the original story and locate the story with the heading “Week 3 top 10 Anderson TD run”
• Verify that Team = Atlanta Falcons or League = NFL was not present in the source content
• Taalee attached this value-added metadata to the asset’s existing metadata, so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who plays for the Atlanta Falcons
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value-Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html and search for Player = Gary Sheffield
• Click on the first result (titled “I want out”) and view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked “REAL”)
• View the source HTML page that has the original story and locate the story with the heading “I want out”
• Verify that Team = Los Angeles Dodgers or League = National League was not present in the source content
• Taalee attached this value-added metadata to the asset’s existing metadata, so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who plays for the Los Angeles Dodgers
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots (“Jamal Anderson”)
Search for ‘Jamal Anderson’ in ‘Football’
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata.
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee’s value-additions to the metadata to facilitate easier search.
HP 99
Example 2 - Snapshots (“Gary Sheffield”)
Search for ‘Gary Sheffield’ in ‘Baseball’
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata.
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee’s value-additions to the metadata to facilitate easier search.
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Producer Name: name of the content provider that produced the asset
Posted Date: date of asset posting - extracted automatically
Player Names: names of players mentioned explicitly in the asset - extracted automatically
Team Name: name of the team for which the player plays - not mentioned explicitly in the asset - value-added using Taalee’s semantic relationships
League Name: name of the league to which the player’s team belongs - not mentioned explicitly in the asset - value-added by Taalee’s processing based on semantic associations
Sport: name of the sport
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.
The asset is richly and fully described in the many ways users choose to interact.
Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset
HP 101
Intelligent Contentvia
Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context, or relationships of keywords
For example, a search engine may see that the phrase “Commerce One” occurs, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee’s Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company ‘Commerce One’
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One’s competitors (Ariba) can be accessed easily and automatically
HP 104
Internal Source 1Research
Internal Source 2
External feedsWeb(eg Reuters)
1. Cisco story from PW Source 1 is passed on to add semantic associations
2. Consults the Knowledge Base for Cisco’s competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds is picked for publishing as “semantically related” to the Cisco story and passed on to the Dashboard
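The four steps in this diagram can be sketched as a small lookup-and-filter routine. The sketch below is an assumption-laden illustration rather than the Semantic Engine itself: `Story`, `COMPETES_WITH`, and `related_stories` are hypothetical names, and the headlines are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Story:
    headline: str
    company: str   # the company the story is about
    source: str    # where the story came from


# Hypothetical knowledge base of COMPETES WITH associations.
COMPETES_WITH = {
    "Cisco": {"Lucent"},
    "Commerce One": {"Ariba"},
}


def related_stories(story, feed):
    """Pick stories from `feed` about competitors of the company in `story`."""
    competitors = COMPETES_WITH.get(story.company, set())   # steps 2 and 3
    return [item for item in feed if item.company in competitors]  # step 4


cisco_story = Story("Story on Cisco", "Cisco", "Internal Source 1")   # step 1
external_feed = [
    Story("Story on Lucent", "Lucent", "External feed"),
    Story("Story on IBM", "IBM", "External feed"),
]
picked = related_stories(cisco_story, external_feed)
```

Only the Lucent story is selected, even though neither story mentions the other company by name.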
Story onLucent
Story onCisco
XCM-compliant metadata XML or other format
SemanticApplication
ASPEnterprise hosted
Extractor Agent 1
Extractor Agent 2
Extractor Agent 3
Metadata centricContent Management Architecture
SemanticEngine
World Model
TaaleeMetabase
Third-partyContent Mgmt
AndSyndication
HP 105
Semantic Associationssupported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANYCOMPANYRelated Stock News
Related Stock News
IndustryNews
IndustryNews
CompetitionCompetitionCOMPANIES inINDUSTRY with Competing PRODUCTSCOMPANIES in Same or
Related INDUSTRY
SECEPAEPA
RegulationsRegulationsImpacting INDUSTRY or Filed By COMPANY
Technology Products
Technology ProductsImportant to INDUSTRY or COMPANY
HP 106
Semantic Web Application ExampleFinancial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic SearchPersonalization etc
A vision for future
Semantic Web Complex Relationships and Knowledge Discovery
Eg InfoQuilt project at LSDIS Lab Univ of Georgia
HP 108
Beyond RDF ndash one proposal (cf Ora Lassila)
Structural modeling is obviously not enough: we need a “logic layer” on top of RDF; some type of description logic is a possibility.
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic rules which allow us to draw inferences from this data.
RDF + DL = “Frame System for WWW”
Source: www.ontoknowledge.org/oil
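To make "logic rules which allow us to draw inferences" concrete, here is a minimal sketch of forward chaining over subject-property-object triples. It applies just two RDFS-style rules (transitivity of `subClassOf`, and inheritance of `type` along `subClassOf`); it is not a description-logic reasoner, and the class and instance names are illustrative.

```python
def infer(triples):
    """Apply two RDFS-style rules repeatedly until no new triple is derived."""
    facts = set(triples)
    while True:
        derived = set()
        for (sub, prop, sup) in facts:
            if prop != "subClassOf":
                continue
            for (s2, p2, o2) in facts:
                # Rule 1: A subClassOf B, B subClassOf C  =>  A subClassOf C
                if p2 == "subClassOf" and s2 == sup:
                    derived.add((sub, "subClassOf", o2))
                # Rule 2: x type A, A subClassOf B  =>  x type B
                if p2 == "type" and o2 == sub:
                    derived.add((s2, "type", sup))
        if derived <= facts:       # fixed point reached
            return facts
        facts |= derived


facts = infer({
    ("herbivore", "subClassOf", "animal"),
    ("cow", "subClassOf", "herbivore"),       # hypothetical class
    ("daisy", "type", "cow"),                 # hypothetical instance
})
```

The result contains facts that were never stated, such as `("daisy", "type", "animal")`.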
HP 109
Semantic Web - next step in Web evolution
“A Web in which machine reasoning will be ubiquitous and devastatingly powerful” [Berners-Lee]
“A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture” [Berners-Lee]
“A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities” [Berners-Lee]
A personal definition
Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three layered Architecture Of Semantic Web
Logical Layer
Formal Semantics and Reasoning Support ndash OIL DAML-O
Schema Layer
Definition of Vocabulary RDF Schema
Data Layer
Simple data model and syntax for metadata - RDF
OIL ndash as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL ndash Evolving towards Semantic Web
OIL Mission
OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources(USGS NEIC)
Nuclear Test Sources(Oklahoma Observatory etc)
Nuclear Test May Cause Earthquakes
Is it really true
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.
NuclearTest Causes Earthquake <= dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30 AND distance(NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude) < 10000
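The rule above can be read as an executable predicate. The following is a sketch under stated assumptions, not the InfoQuilt implementation: distance is taken in miles (as the later query slide states) and computed with the haversine formula, the earthquake is required to occur on or after the test date, and the event records are hypothetical dictionaries with made-up values.

```python
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, using the haversine formula."""
    phi1, phi2 = radians(lat1), radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = radians(lon2 - lon1)
    a = sin(d_phi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(min(1.0, a)))  # clamp rounding error


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """True if the quake follows the test within max_days and max_miles."""
    days = (quake["eventDate"] - test["eventDate"]).days
    if not 0 <= days < max_days:
        return False
    return distance_miles(test["latitude"], test["longitude"],
                          quake["latitude"], quake["longitude"]) < max_miles


# Hypothetical records; dates and coordinates are made up for illustration.
test_event = {"eventDate": date(1990, 1, 1), "latitude": 10.0, "longitude": 20.0}
quake_soon = {"eventDate": date(1990, 1, 11), "latitude": 12.0, "longitude": 22.0}
quake_late = {"eventDate": date(1990, 3, 1), "latitude": 12.0, "longitude": 22.0}
```

Here `may_have_caused(test_event, quake_soon)` holds and `may_have_caused(test_event, quake_late)` does not. The thresholds are keyword arguments so that stricter notions of "nearby" can be tried without changing the rule.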
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year, starting from 1900.
Increase in number of earthquakes since 1945
Knowledge Discovery - Example…
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes.
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7.
Knowledge Discovery - Example…
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake.
Demo
ResourcesReferences
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/9910-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NewsML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
Semantic Web: www.semanticweb.org
VoiceXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 9
Types of Specs and Standards(or MetaModels)
Domain Independent: (MCF), RDF, MOF, Dublin Core
Media Specific: MPEG4, MPEG7, VoiceXML
Domain/Industry Specific (metamodels): MARC (Library), FGDC and UDK (Geographic), NewsML (News), PRISM (Publishing)
Application Specific: ICE (Syndication)
Exchange/Sharing: XCM, XMI
Orthogonal (Other): RDFS, namespaces, ontologies, domain models (DAML, OIL)
HP 10
What RDF can do for metadata
Designed to impose structural constraints on syntax to support consistent encoding, exchange, and processing of metadata
Domain-independent metadata standard
HP 11
RDF (Resource Description Framework)
PropertyValueResource
• RDF data consists of nodes and attached attribute/value pairs
• Nodes can be any web resources (pages, servers, basically anything for which you can give a URI), even other instances of metadata
• Attributes are named properties of the nodes, and their values are either atomic (text strings, numbers, etc.) or other resources or metadata instances
HP 12
RDF Example 1
URIAMITdccreator
dctitleMysteries of Metadata
URITALK
<?xml version='1.0'?>
<rdf:RDF xmlns:rdf="http://www.w3.org/TR/REC-rdf-syntax#"
         xmlns:dc="http://purl.org/dc/elements/1.0/">
  <rdf:Description rdf:about="URI:TALK">
    <dc:title>Mysteries of Metadata</dc:title>
    <dc:creator rdf:resource="URI:AMIT"/>
  </rdf:Description>
</rdf:RDF>
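To show how such a description maps onto the resource/property/value model, here is a minimal Python sketch that reads a document shaped like Example 1 and lists its triples. It uses only the standard library and handles just this simple pattern (`rdf:Description` with `rdf:about` and direct property children); it is not a general RDF parser. The `rdf` namespace URI is the one used in the PRISM example later in this workshop, and `URI:TALK` and `URI:AMIT` are the placeholder identifiers from the slide.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
PREFIXES = {RDF: "rdf", "http://purl.org/dc/elements/1.0/": "dc"}

DOC = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.0/">
  <rdf:Description rdf:about="URI:TALK">
    <dc:title>Mysteries of Metadata</dc:title>
    <dc:creator rdf:resource="URI:AMIT"/>
  </rdf:Description>
</rdf:RDF>"""


def qname(tag):
    """Turn ElementTree's '{namespace}local' form into 'prefix:local'."""
    namespace, _, local = tag[1:].partition("}")
    return f"{PREFIXES.get(namespace, namespace)}:{local}"


def triples(xml_text):
    """List (resource, property, value) triples for each rdf:Description."""
    root = ET.fromstring(xml_text)
    result = []
    for description in root.iter(f"{{{RDF}}}Description"):
        subject = description.get(f"{{{RDF}}}about")
        for prop in description:
            target = prop.get(f"{{{RDF}}}resource")
            # A property value is either another resource or a literal.
            value = target if target is not None else (prop.text or "").strip()
            result.append((subject, qname(prop.tag), value))
    return result


TRIPLES = triples(DOC)
```

The two triples correspond to the two arcs leaving the talk's node in the diagram: a literal title and a creator that is itself a resource.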
HP 13
RDF Example 2
URIAMITdccreator
URILIB amittaaleecom
BIBEmailBIBName
BIBAff
dctitleMysteries of Metadata
URITALK
Amit Sheth
HP 14
RDFS (RDF Schema)
Enables resource description communities to define (and share) vocabularies (museum, library, e-commerce, …)
Vocabulary (in RDFS) = the meaning, characteristics, and relationships of a set of properties
HP 15
RDF Based Web
HTML
Resources
RDFXMLDescriptions
RDFSchemas
Source: http://www.w3c.rl.ac.uk
HP 16
Dublin Core Metadata Initiative
Simple element set designed for resource description
International inter-discipline W3C community consensus
ldquoSemanticrdquo interface among resource description communities (very limited form of semantics)
Source: www.desire.org
HP 17
Dublin Core RDF
<xml>
<?namespace href="http://w3.org/rdf-schema" as="RDF"?>
<?namespace href="http://metadata.net/DC" as="DC"?>
<RDF:Abbreviated>
  <RDF:Assertion RDF:HREF="http://www.mysite.com/mydoc.html"
                 DC:Title="I've Never Metadata I've Never Liked"
                 DC:Creator="Mary Crystal"
                 DC:Subject="Metadata, Dublin Core, Stuff"/>
</RDF:Abbreviated>
</xml>
HP 18
MOF (Meta Object Facility) and XMI
MOF models metadata using a subset of UML that is relevant to modeling metadata (class models: classes, associations, and subtyping), together with a set of rules for mapping the elements of the MOF Core to CORBA IDL
XML Metadata Interchange (XMI) is an extension of the MOF into the XML space
HP 19
NewsML
NewsML is a packaging and metadata format for news content.
NewsML is developed by the International Press Telecommunications Council (IPTC), a consortium of news providers, mostly in the print or wire-service industries.
Since it deals only with packaging and metadata, NewsML is complementary both to news content formats like NITF and to syndication protocols like ICE.
HP 20
NewsMLhellip
It can be used by news providers to combine their pictures, video, text, graphics, and audio files in news output available on web sites, mobile phones, high-end desktops, interactive television, and any other device.
It provides an accurate, objective set of description tools which help qualify the information and make the search more precise.
NewsML allows a range of metadata to be attached to a multimedia story, including a detailed computer-readable description of what an item is about.
HP 21
Example of the end-to-end flow -NewsML
The content provider supplies NewsML-packaged media content to the operator. The content is categorized as current events, finance, sport, etc., and updated hourly.
The operator receives NewsML data from the content provider. The content server automatically pushes updated news articles to all news service subscribers.
Consumers sign up for the news service directly on the device. When using the news service, the user browses through the categories and reads the news articles. The news articles are presented in a continuous flow (one after the other) without end-user interaction.
Source: http://www.mediabricks.com
HP 22
PRISM
Publishing Requirements for Industry Standard Metadata
Version 1.0, April 2001
Authors: IDEAlliance (Adobe, Vignette, Kinecta, et al.)
Idea: “a standard for interoperable content description, interchange, and reuse in both traditional and electronic publishing contexts”
Web site: http://www.prismstandard.org
HP 23
PRISM Design
Built on existing standards like Dublin Core (DC), RDF, XML
Designed to be used in a simple, straightforward way over the Internet
Compatible with NewsML
Integrates easily with ICE (for syndication)
Vocabulary:
Basic: DC
Extensions: “Controlled Vocabularies”, e.g. “North American Industrial Classification System” (NAICS)
HP 24
PRISM Example
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845"/>
    <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>
(Source: PRISM spec v1, http://www.prismstandard.org/techdev/prismspec1.asp)
HP 25
VoiceXML
A language for specifying voice dialogs
Voice dialogs use audio prompts and text-to-speech (TTS) for output; touch-tone keys (DTMF) and automatic speech recognition (ASR) for input
Goal is to bring the advantages of web-based development and content delivery to interactive voice response applications
High-level, voice-specific language simplifies application development
Source: http://www.voicexml.org
HP 26
Voice Based Internet Applications
Source httpwwwvoicexmlorg
HP 27
Voice XML Metadata
Voice Specific metadata
Supports syntactic interoperability
Text data to voice data
Voice XML = XML + Voice Metadata
HP 28
VoiceXML ndash Possible Services
Information retrieval - news, sports, traffic, stock quotes
e-Transactions (e-commerce, e-tailing, etc.)
Financial: banking, stock trading
Catalog browsing (generally as an adjunct to paper)
Telephone services
Personal voice dialing, one-number “find-me” services
Intranet - inventory, HR services, corporate portals
Unification - “My Whatever” personal portals, personal agents, unified messaging
Source httpwwwvoicexmlorg
HP 29
MPEG7
A set of description schemes and descriptors to describe the content of multimedia data
Provides a language to specify description schemes
A scheme for coding the description
HP 30
Application Examples for MPEG7
A few application examples are
Digital libraries (image catalog musical dictionary)
Multimedia directory services (eg yellow pages)
Broadcast media selection (radio channel TV channel)
HP 31
Information and Content Exchange (ICE)
Main Goal: efficient and extensible Content Syndication protocol for the Internet, using XML syntax
Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.
Status: latest spec version 1.1, May 2000; submitted to W3C for review
Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, …
Web Site: http://www.icestandard.org
HP 32
What is the ICE Protocol
Syndication Protocol for communication between Syndicators and Subscribers
Metadata to define:
• roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver
• format and method of content exchange (e.g. sequenced packages, pull vs. push model)
HP 33
ICE Applications
ICE vocabulary + domain vocabulary = complete application
ICE:
• establishes and manages the syndication
• delivers data
• logs events
=> content-independent metadata
Industry-specific vocabulary defines the content => domain-specific metadata
Source: http://www.icestandard.org
HP 34
ICE Explained
ICE: Information and Content Exchange protocol
Syndicator: a content aggregator and distributor
Subscriber: a content consumer
Subscription: an agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement
Collection: the current content of a subscription
ICE Package: a delivery of commands to update a collection, such as the addition of content items
ICE Payload: the XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information
Sources: InternetWeek; ICE Cookbook version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm
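The relationship between a subscription, its collection, and the packages that update it can be sketched with two small data classes. This models only the vocabulary defined above; it is not the ICE wire protocol, and the class names, field names, and the rule of ignoring a re-delivered package are assumptions made for the illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Package:
    """A delivery of commands that update a collection."""
    package_id: str
    add: dict = field(default_factory=dict)       # item-id -> content to add
    remove: list = field(default_factory=list)    # item-ids to drop


@dataclass
class Subscription:
    """Subscriber-side view of one subscription."""
    subscription_id: str
    collection: dict = field(default_factory=dict)   # current content
    applied: list = field(default_factory=list)      # package ids, in order

    def apply(self, package):
        if package.package_id in self.applied:
            return                                   # already processed
        for item_id in package.remove:
            self.collection.pop(item_id, None)
        self.collection.update(package.add)
        self.applied.append(package.package_id)


subscription = Subscription("sub-1")
subscription.apply(Package("pkg-1", add={"4321": "demo.gif"}))
subscription.apply(Package("pkg-2", add={"4322": "story.xml"}, remove=["4321"]))
subscription.apply(Package("pkg-1", add={"4321": "demo.gif"}))   # duplicate delivery
```

After the three calls the collection holds only item 4322, because the second package removed item 4321 and the repeated first package was skipped.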
<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe"
             timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321"
                subscription-element="4321" name="Cartoon" filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru" copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC … (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>
Content (domain-specific
metadata)
HP 36
XCM (eXtended Content Management)
A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:
Content Development - developing static content and managing the process of its subsequent approval, versioning, storage, and retrieval
Application Content Management (Vignette) - deploying content dynamically to a Web site and managing that content throughout its online lifecycle
Content Delivery - delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability
Source httpwwwvignettecomCDASite020971-1-30-1458-1146-174300html
HP 37
XCM
eXtended Content Management
Content DevelopmentManagement
Content DeliveryApplication ContentManagement
Content AuthoringDigital Asset Management
Software ConfigurationManagement
Document ProcessManagement
Metadata ManagementRecombinationPersonalization
Edge Network Delivery
Streaming Media DeliveryCaching
Source httpwwwvignettecom
HP 38
Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain (example: Kansas State)
FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
…
UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Adress Id: http://gisdasc.kgs.ukans.edu/dasc
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
…
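One simple way to bridge two models that name the same data differently is a tag-name mapping (a crosswalk). The sketch below uses exactly the tag pairs shown on this slide; the dictionary `FGDC_TO_UDK` and the function `fgdc_to_udk` are hypothetical names, and a real crosswalk would also have to reconcile value formats and tags with no counterpart.

```python
# Tag pairs taken from the slide; the mapping itself is an illustration.
FGDC_TO_UDK = {
    "Theme keywords": "Search terms",
    "Title": "Topic",
    "Online linkage": "Adress Id",
    "Direct Spatial Reference Method": "Measuring Techniques",
    "Horizontal Coordinate System Definition": "Co-ordinate System",
}


def fgdc_to_udk(record):
    """Rename FGDC tags to their UDK counterparts; unknown tags pass through."""
    return {FGDC_TO_UDK.get(tag, tag): value for tag, value in record.items()}


fgdc_record = {
    "Title": "Dakota Aquifer",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}
udk_record = fgdc_to_udk(fgdc_record)
```

A query written against UDK tag names can then be answered from an FGDC record without either side changing its vocabulary.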
HP 39
Different views of Metadata
Domain Independent Specifications (RDF)
FrameworksInfrastructures (XCM)
MetadataApplication Specific
ICE
Media Specific
MPEG7 VoiceXML
Domain Specific
NewsML FGDCUDK
HP 40
Creating and Serving Metadata to Power the Life-cycle of Content
Taalee Infrastructure Services Taalee Content Applications
Where is the content
Whose is it
ProduceAggregate
CatalogIndex
What other content is it related to
Integrate Syndicate
What is the right content for this
user
Personalize
What is the best way to
monetize this interaction
Interactive Marketing
BroadcastWirelineWirelessInteractive TV
Taalee Semantic MetaBase
HP 41
Taalee’s Intelligent Content Process
HP 42
Metadata Creation and Semanticization
• Automatic Content Classification/Categorization
• Metadata Creation/Extraction
• Types of metadata created
HP 43
FormsTypesIngest of Content
Sources: Web sites, content feeds, and private repositories
Types: text, graphics, audio, video, multimedia
Forms: unstructured text, semi-structured text, structured text (+ media); static or dynamic
Ingest: feed (push), Web (pull), repository/database (usually pull)
HP 44
Content HandlingIngest
Infrastructure/Exchange
Feed handlers, crawlers, screen scrapers, bots, software agents
Centralized, distributed, mobile/migratory
HP 45
Information Extraction for Metadata Creation
GlobalEnterpriseWeb Repositories
METADATAMETADATA
EXTRACTORSEXTRACTORS
Digital Maps
NexisUPIAP
Documents
Digital Audios
Data Stores
Digital Videos
Digital Images
HP 46
Extracting a Text Document: Syntactic approach
INCIDENT MANAGEMENT SITUATION REPORT
Friday August 1, 1997 - 0530 MDT
NATIONAL PREPAREDNESS LEVEL II
CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.
SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabit continues.
CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.
LAYOUT
Date =gt day month int lsquorsquo int
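The layout rule on this slide (a weekday, a month name, an integer, a comma, and an integer) can be approximated with a regular expression. The following is only a sketch of the syntactic approach, not Taalee's extractor: the names `DATE_RULE` and `extract_dates` are hypothetical, and the comma is treated as optional on the assumption that source text may have lost its punctuation.

```python
import re
from datetime import date

MONTHS = {name: number for number, name in enumerate(
    ["January", "February", "March", "April", "May", "June", "July",
     "August", "September", "October", "November", "December"], start=1)}
DAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"

# Layout rule: Date => day month int ',' int
# Assumption: the comma is optional, because extracted text often drops it.
DATE_RULE = re.compile(
    rf"\b(?P<day>{DAYS})\s+(?P<month>{'|'.join(MONTHS)})\s+"
    rf"(?P<dom>\d{{1,2}}),?\s+(?P<year>\d{{4}})\b")


def extract_dates(text):
    """Return (weekday, date) pairs for every match of the layout rule."""
    return [(m["day"], date(int(m["year"]), MONTHS[m["month"]], int(m["dom"])))
            for m in DATE_RULE.finditer(text)]


report = "INCIDENT MANAGEMENT SITUATION REPORT Friday August 1 1997 - 0530 MDT"
print(extract_dates(report))
```

A purely syntactic rule like this finds the date wherever the pattern occurs, but it has no notion of what the date refers to; that gap is what the later semantic slides address.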
Traditional Text Categorization
Statistical/AI Techniques
Classify / Place in a taxonomy
Customer Training Set
Routing/Distribution
Customer Article Feed (Article 4715)
Standard Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Classification of Article 4715
Knowledge-base & Statistical/AI Techniques
Classify / Place in a taxonomy
Metadata Catalog
Content Manager
Precise syndication/filtering
Article 4715 Metadata
Standard metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Semantic metadata
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News
FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
NYSE: Member Companies, Market News, IPOs
Automated Content Enrichment (ACE)
Taalee Enterprise Customization Suite
Taalee's Categorization & Automatic Metadata Creation
Taalee Training Set
Customer Training Set
ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
Classification of Article 4715
Article Feed (Article 4715)
Routing/Distribution
Map to another taxonomy
HP 49
Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.
Video Segment with Associated Text
Segment Description
Semantic Metadata
Auto Categorization
HP 50
Automatic Categorization & Metadata Tagging (Web page)
Video with Editorialized Text on the Web
Auto Categorization
Semantic Metadata
HP 51
Automatic Categorization & Metadata Tagging (Feed)
Text from Bloomberg
Auto Categorization
Semantic Metadata
HP 52
Taalee Extraction and Knowledgebase Enhancement
Extraction Agent
Web Page Enhanced Metadata Asset
HP 53
Basis for Semantics
A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary
B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)
Knowledge Base
HP 54
Basis for Semantics
C. Reasoning/Inference
(Statistical) (Information Retrieval)
Statistical Learning/AI (Bayesian, Neural Networks, HMM, …)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methods: Cluster Analysis
Learning/AI and Collaborative Filtering
Reference data: Concept-terms, Dictionary, Thesaurus (by topic/industry/subject/domain)
Word or Phrase
Ontologies/Domain Models
KnowledgeBase: By Entities and Relationships
deeper understanding
HP 56
Open Directory Project (ODP): Classification/Taxonomy & Directory
HP 57
Ontology
Standardize meaning, description, representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes: Attributes, Domain Rules, Functional Dependencies
HP 59
An Ontology
Example: Interrelated ontologies
LAND USE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL
LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE
WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE (shredding, magnetic separation, screening, washing)
NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI (linked by "causes" relationships)
HP 61
Large Vocabularies, Taxonomies, Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary, used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.
Confidential HP
Metadata-enabled Applications
HP 63
Metadata Usage: Impact on Search & Query processing
traditional queries based on keywords
attribute based queries
content-based queries
HP 64
Oingo.com
Oingo Ontology (ODP based?) - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged, and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:
• Context of the Content
• Semantic Metadata describing entities (e.g., Company, Industry), and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
"Normal" Content can only be "found" if the user enters a keyword that exists within it
Intelligent Content=+
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:
Context of the Content
Semantic Metadata describing entities (e.g., Company, Industry), and
Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …
HP 70
Metadata Usage Keyword Attribute and Content Based Access
The VisualHarness system at LSDIS/UGA
HP 71
Keyword Search vs. Attribute Search with Semantic metadata
Virage: Search on "football touchdown"
Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL httpcbssportsline
Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL httpcbssportsline
Metadata from Typical Cataloging of Football Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31, Pit 24
http://www.nfl.com
Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Raven's lead to 31-24 in the third quarter
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2022000
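The contrast on this slide between keyword search and attribute search can be illustrated with a small sketch. The record layout and the function names `keyword_search` and `attribute_search` are assumptions made for illustration; the field values are taken from the two football assets shown above, with descriptions shortened.

```python
# Two assets: one with only free text, one with structured semantic metadata.
ASSETS = [
    {"title": "Brian Griese Interview Part Four",
     "description": "Brian Griese talks about the first touchdown he ever threw",
     "metadata": {}},
    {"title": "Baltimore 31 Pit 24",
     "description": "Quandry Ismail and Tony Banks hook up for their third long touchdown",
     "metadata": {"League": "Professional",
                  "Teams": ["Ravens", "Steelers"],
                  "Players": ["Quandry Ismail", "Tony Banks"],
                  "Event": "Touchdown",
                  "Produced by": "NFL.com"}},
]


def keyword_search(assets, word):
    """Match only if the word literally occurs in the title or description."""
    word = word.lower()
    return [a["title"] for a in assets
            if word in (a["title"] + " " + a["description"]).lower()]


def attribute_search(assets, criteria):
    """Match on metadata fields; a list-valued field matches any of its items."""
    def matches(meta):
        for field, wanted in criteria.items():
            value = meta.get(field)
            values = value if isinstance(value, list) else [value]
            if wanted not in values:
                return False
        return True
    return [a["title"] for a in assets if matches(a["metadata"])]


print(keyword_search(ASSETS, "touchdown"))
print(keyword_search(ASSETS, "Steelers"))
print(attribute_search(ASSETS, {"Teams": "Steelers", "Event": "Touchdown"}))
```

In this sketch the keyword query for "touchdown" returns both assets without distinguishing an interview from a game highlight, and the keyword query for "Steelers" returns nothing because the word never appears in the text. The attribute query finds the highlight through its Teams and Event fields.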
HP 72
Taalee's Semantic Search
Highly customizable, precise, and freshest A/V search
Context and Domain Specific Attributes
Uniform Metadata for Content from Multiple Sources; can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY & CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content
Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries & Hot Topics
PERSONALIZATION
3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
4 Pink Floyd Collection
Personalized Queries
Set the Controls for the Heart of the Sun…
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
2 My Football Fantasy Team
Gators Spurrier ready for big game
Techs Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone…
Edwards explains reasons for leaving BYU
more…
more…
more…
more…
1 Election 2000
2 Middle East Peace Conflict
3 Napster Controversy
Video Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Coles security
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II
more…
HOT Topics
more…
more…
HP 80
Metadata Targeting
Semantic/Interactive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web: Extreme Personalization
Realtime Feeds
Interests, Preferences
Time-Shifted Content Aggregator
Web sites and Pages
Content Databases
Semantic Engine™
Personalized Content
Structured Hi-Quality Semantic Metabase
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site
User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and the knowledge of user's portfolio. Typically search can be done by typing one word or selecting from a dynamic personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst
HP 87
iTV: Taalee's Extreme Personalization
Content Provider
(DBS, DISH, Wink, AOL-TV)
Semantic Engine™
Meta-Data Tagged Content
Content "Programs"
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-Quality Semantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by personalization application to show only the stocks that user is interested in
This screen is customizable with interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO
Part of the screen can be automatically customized to show conference call specific information, including transcript, participation, etc., all of which are relevant metadata
Conference Call itself can have embedded metadata to support personalization and interactivity
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support
Sony
Categorize
Catalog
Integrate
Collection
Processing
Network Content
Affiliate Feeds
Public Sources, Rich Data
Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
-- Breaking News for 11/30/2000 --
Customize Page Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(133) - 120600 - ABC
(253) - 120600 - CBS
(516) - 120600 - ABC
(246) - 120600 - FOX
(133) - 120600 - NBC
(533) - 120600
(357) - 120600 - CBS
(427) - 120600 - ABC
(344) - 120600 - FOX
(724) - 120600 - CBS
(133) - 120600 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(133) - 120600 - ABC
(233) - 120600 - CBS
(312) - 120600 - NNS
(032) - 120600 - CBS
(133) - 120600 - CBS
Description
Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore
HP 92
Retrieve Scene Description Track
Enhanced Digital Cable
Video
MPEG Decoder
Node = AVO Object
Create Scene Description Tree
GREAT USER EXPERIENCE
Metadata's role in emerging iTV infrastructure
MPEG-2/4/7
MPEG Encoder
Scene Description Tree
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers, and Broadcasters
Enhanced XML Description
"Cisco Systems"
Node
Taalee Semantic Engine
"Cisco Systems"
Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: John Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta
Object Content Information (OCI)
Metadata-rich Value-added Node
HP 93
Intelligent Metadata Creation
Content which does contain the words the user asked for
Extractor Agents
Content which does not contain the words the user asked for, but is about what he asked for
Value-added Metadata
Content the user did not think to ask for, but which he needs to know
Semantic Associations
Metadata for Intelligent Content
Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"
Understanding of the content is needed to create new metadata
Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
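The idea of adding metadata that is absent from the content can be sketched as a knowledge-base lookup. This is an illustrative sketch only, not Taalee's implementation: `KNOWLEDGE_BASE` and `enrich` are hypothetical names, and the two entries simply restate facts given on these slides (Roger Clemens and, from the guided demo, Jamal Anderson).

```python
# Hypothetical knowledge base: entity -> known facts about that entity.
KNOWLEDGE_BASE = {
    "Roger Clemens": {"Sport": "Baseball", "Teams": "New York Yankees"},
    "Jamal Anderson": {"Sport": "Football", "Teams": "Atlanta Falcons",
                       "League": "NFL"},
}

VALUE_ADDED_FIELDS = ("Sport", "Teams", "League")


def enrich(metadata):
    """Return a copy of the extracted metadata with value-added fields.

    `metadata` maps a field name to a list of values that were found
    explicitly in the asset, e.g. {"Players": ["Roger Clemens"]}.
    """
    enriched = {field: list(values) for field, values in metadata.items()}
    for player in metadata.get("Players", []):
        facts = KNOWLEDGE_BASE.get(player, {})
        for field in VALUE_ADDED_FIELDS:
            if field in facts and facts[field] not in enriched.setdefault(field, []):
                enriched[field].append(facts[field])
    return enriched


story = {"Players": ["Roger Clemens"]}   # the story never mentions the Yankees
print(enrich(story))
```

With the team name attached as metadata, an attribute search on Teams = "New York Yankees" can return a story whose text only mentions the player.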
HP 96
Guided Demo for Value Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 - Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Posted Date: Date of asset posting. Extracted automatically
Producer Name: Name of content provider that produced the asset
Player Names: Names of players mentioned explicitly in the asset. Extracted automatically
Team Name: Name of team for which player plays. Not mentioned explicitly in asset. Value-added using Taalee's semantic relationships
League Name: Name of league to which the player's team belongs. Not mentioned explicitly in asset. Value-added by Taalee's processing based on semantic associations
Sport: Name of sport
Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships
The asset is richly, fully described in the many ways the users choose to interact
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context, or relationships of keywords
For example, a search engine may see that the words "Commerce One" occur, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
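A semantic association can be thought of as a typed relationship that a search follows at query time. The sketch below is an assumption-laden illustration, not the Taalee Semantic Content Model: the triple store, the news items, and the names `related` and `search_with_associations` are all hypothetical, and only the Commerce One facts come from the slide.

```python
# Hypothetical relationship triples: (subject, relationship, object).
RELATIONSHIPS = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Commerce One", "participates_in",
     "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]

# Hypothetical catalog of news items, each tagged with the companies it covers.
NEWS = [
    {"headline": "Story on Commerce One", "companies": ["Commerce One"]},
    {"headline": "Story on Ariba", "companies": ["Ariba"]},
    {"headline": "Unrelated story", "companies": ["Some Other Co"]},
]


def related(entity, relationship):
    """Objects linked to `entity` by the given relationship."""
    return [obj for subj, rel, obj in RELATIONSHIPS
            if subj == entity and rel == relationship]


def search_with_associations(company):
    """Return what was asked for plus news on the company's competitors."""
    asked_for = [n["headline"] for n in NEWS if company in n["companies"]]
    rivals = related(company, "competes_with")
    competition = [n["headline"] for n in NEWS
                   if any(rival in n["companies"] for rival in rivals)]
    return {"asked_for": asked_for, "competition": competition}


print(search_with_associations("Commerce One"))
```

A keyword index alone could return only the first list; the second list exists because the competes_with relationship is stored as data rather than left implicit in the text.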
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata-centric Content Management Architecture
Internal Source 1 (Research)
Internal Source 2
External feeds/Web (e.g., Reuters)
Extractor Agent 1, Extractor Agent 2, Extractor Agent 3
Semantic Engine
World Model
Taalee Metabase
XCM-compliant metadata (XML or other format)
Semantic Application (ASP/Enterprise hosted)
Third-party Content Mgmt and Syndication
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults KnowledgeBase for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story; passed on to Dashboard
Story on Lucent
Story on Cisco
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
SEC/EPA Regulations: Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for future
Semantic Web Complex Relationships and Knowledge Discovery
Eg InfoQuilt project at LSDIS Lab Univ of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture" [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]
A personal definition: Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF+XML
Agent readable Tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-layered Architecture of the Semantic Web
Logical Layer
Formal Semantics and Reasoning Support - OIL, DAML-O
Schema Layer
Definition of Vocabulary - RDF Schema
Data Layer
Simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL - Evolving towards Semantic Web
OIL Mission
OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Test May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake
<= dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
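The "Causes" rule above can be written as an executable predicate. This is a minimal sketch under stated assumptions: the thresholds are read as 30 days and 10,000 miles (the units given on a later slide), distance is computed with the haversine great-circle formula using an approximate Earth radius of 3,959 miles, and the earthquake is required to occur on or after the test, as the prose on the slide says. `Event`, `distance_miles`, and `may_have_caused` are hypothetical names, and the coordinates in the usage lines are made up.

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt


@dataclass
class Event:
    event_date: date
    latitude: float    # degrees
    longitude: float   # degrees


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance (haversine); Earth radius is approximated."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * 3959 * asin(sqrt(a))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """NuclearTest Causes Earthquake, per the rule on the slide."""
    days_after = (quake.event_date - test.event_date).days
    close_in_time = 0 <= days_after < max_days   # assumption: quake follows test
    close_in_space = distance_miles(test.latitude, test.longitude,
                                    quake.latitude, quake.longitude) < max_miles
    return close_in_time and close_in_space


# Made-up events, for illustration only.
test = Event(date(1998, 5, 11), 27.0, 71.7)
quake = Event(date(1998, 5, 30), 37.1, 70.1)
print(may_have_caused(test, quake))
```

Note that a 10,000-mile radius covers most of the globe, so under this reading the time window does most of the filtering; a tighter distance threshold would be needed for the rule to express "a nearby region".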
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year, starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example…
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
Knowledge Discovery - Example…
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 10
what RDF can do for metadata
Designed to impose structural constraint on syntax to support consistent encoding, exchange, and processing of metadata
Domain Independent Metadata standard
HP 11
RDF (Resource Description Framework)
Property, Value, Resource
• RDF data consists of nodes and attached attribute/value pairs
• Nodes can be any web resources (pages, servers, basically anything for which you can give a URI), even other instances of metadata
• Attributes are named properties of the nodes, and their values are either atomic (text strings, numbers, etc.) or other resources or metadata instances
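The data model described in these bullets can be sketched without any RDF library as a list of (resource, property, value) statements. The identifiers URI:TALK and URI:AMIT are the placeholder URIs used in the examples on the following slides; the helper names `values` and `is_resource` are assumptions made for this illustration.

```python
# Each statement is (resource, property, value). A value is either an atomic
# literal or the identifier of another resource that has its own statements.
STATEMENTS = [
    ("URI:TALK", "dc:title", "Mysteries of Metadata"),
    ("URI:TALK", "dc:creator", "URI:AMIT"),
    ("URI:AMIT", "BIB:Name", "Amit Sheth"),
]


def values(resource, prop):
    """All values attached to `resource` under the named property."""
    return [v for r, p, v in STATEMENTS if r == resource and p == prop]


def is_resource(value):
    """True if `value` is itself described by at least one statement."""
    return any(r == value for r, _, _ in STATEMENTS)


# Follow the creator link from the talk to the creator's name.
creator = values("URI:TALK", "dc:creator")[0]
print(creator, is_resource(creator), values(creator, "BIB:Name"))
```

Because a value can itself be a resource, descriptions chain together into a graph: the talk points to its creator, and the creator node carries its own properties.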
HP 12
RDF Example 1
URI:TALK dc:creator URI:AMIT
URI:TALK dc:title "Mysteries of Metadata"
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/TR/REC-rdf-syntax" xmlns:dc="http://purl.org/dc/elements/1.0/">
  <rdf:Description rdf:about="URI:TALK">
    <dc:title>Mysteries of Metadata</dc:title>
    <dc:creator rdf:resource="URI:AMIT"/>
  </rdf:Description>
</rdf:RDF>
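The RDF/XML in this example can be read back into (resource, property, value) statements with Python's standard `xml.etree.ElementTree` module. This is a sketch, not a conforming RDF parser: it handles only flat `rdf:Description` elements, the namespace URIs are reconstructed from the slide and are an assumption, and `triples` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as they appear on the slide (reconstructed; an assumption).
RDF = "http://www.w3.org/TR/REC-rdf-syntax"
DC = "http://purl.org/dc/elements/1.0/"

DOC = f"""<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="{RDF}" xmlns:dc="{DC}">
  <rdf:Description rdf:about="URI:TALK">
    <dc:title>Mysteries of Metadata</dc:title>
    <dc:creator rdf:resource="URI:AMIT"/>
  </rdf:Description>
</rdf:RDF>"""


def triples(xml_text):
    """Yield (resource, property, value) for each property element."""
    root = ET.fromstring(xml_text)
    for description in root.findall(f"{{{RDF}}}Description"):
        subject = description.get(f"{{{RDF}}}about")
        for prop in description:
            # A property either points at another resource or holds a literal.
            value = prop.get(f"{{{RDF}}}resource") or (prop.text or "").strip()
            name = prop.tag.replace(f"{{{DC}}}", "dc:")
            yield subject, name, value


for statement in triples(DOC):
    print(statement)
```

For real RDF data a dedicated RDF library would be the safer choice, since the full syntax allows nesting, abbreviations, and typed nodes that this sketch ignores.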
HP 13
RDF Example 2
URI:TALK dc:title "Mysteries of Metadata"
URI:TALK dc:creator URI:AMIT
URI:AMIT BIB:Name "Amit Sheth"
URI:AMIT BIB:Email amit@taalee.com
URI:AMIT BIB:Aff URI:LIB
HP 14
RDFS (RDF Schema)
Enables resource description communities to define (and share) vocabularies (museum, library, e-commerce, …)
Vocabulary (in RDFS) = the meaning, characteristics, and relationships of a set of properties
HP 15
RDF Based Web
HTML
Resources
RDF/XML Descriptions
RDF Schemas
Source: http://www.w3c.rl.ac.uk
HP 16
Dublin Core Metadata Initiative
Simple element set designed for resource description
International, inter-discipline, W3C community consensus
"Semantic" interface among resource description communities (very limited form of semantics)
Source: www.desire.org
HP 17
Dublin Core RDF
<xml>
<?namespace href = "http://w3.org/rdf-schema" as = "RDF"?>
<?namespace href = "http://metadata.net/DC" as = "DC"?>
<RDF:Abbreviated>
<RDF:Assertion RDF:HREF = "http://www.mysite.com/mydoc.html"
  DC:Title = "I've Never Metadata I've Never Liked"
  DC:Creator = "Mary Crystal"
  DC:Subject = "Metadata, Dublin Core, Stuff"/>
</RDF:Abbreviated>
</xml>
HP 18
MOF (Meta Object Facility) and XMI
MOF models metadata using a subset of UML that is relevant to modeling metadata (class models: classes, associations, and subtyping), together with a set of rules for mapping the elements of the MOF Core to CORBA IDL
XML Metadata Interchange (XMI) is an extension of the MOF into the XML space
HP 19
NewsML
NewsML is a packaging and metadata format for news content.
NewsML is developed by the International Press Telecommunications Council (IPTC), a consortium of news providers, mostly in the print or wire-service industries.
Since it deals only with packaging and metadata, NewsML is complementary both to news content formats like NITF and to syndication protocols like ICE.
HP 20
NewsML…
It can be used by news providers to combine their pictures, video, text, graphics, and audio files in news output available on web sites, mobile phones, high-end desktops, interactive television, and any other device.
It provides an accurate, objective set of description tools that help qualify the information and make search more precise.
NewsML allows a range of metadata to be attached to a multimedia story, including a detailed, computer-readable description of what an item is about.
HP 21
Example of the end-to-end flow - NewsML
The content provider supplies NewsML-packaged media content to the operator. The content is categorized as current events, finance, sport, etc. and updated hourly.
The operator receives NewsML data from the content provider. The content server automatically pushes updated news articles to all news service subscribers.
Consumers sign up for the news service directly on the device. When using the news service, the user browses through the categories and reads the news articles. The news articles are presented in a continuous flow (one after the other) without end-user interaction.
Source: http://www.mediabricks.com
HP 22
PRISM
Publishing Requirements for Industry Standard Metadata
Version: 1.0, April 2001
Authors: IDEAlliance (Adobe, Vignette, Kinecta, et al.)
Idea: "a standard for interoperable content description, interchange, and reuse in both traditional and electronic publishing contexts"
Web site: http://www.prismstandard.org
HP 23
PRISM Design
Built on existing standards like Dublin Core (DC), RDF, XML
Designed to be used in a simple, straightforward way over the Internet
Compatible with NewsML
Integrates easily with ICE (for syndication)
Vocabulary:
  Basic: DC
  Extensions: "Controlled Vocabularies", e.g. "North American Industry Classification System" (NAICS)
HP 24
PRISM Example
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845"/>
    <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>
(Source: PRISM spec v1, http://www.prismstandard.org/techdev/prismspec1.asp)
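To show how an application might consume a description like the PRISM example above, here is a minimal sketch that reads the Dublin Core fields out of RDF/XML with Python's standard `xml.etree.ElementTree`. The embedded sample is a shortened copy of the slide's example with its punctuation restored, which is an assumption about the original text. The function name `dublin_core_fields` is hypothetical, and the sketch handles only the simple, flat descriptions shown here.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Shortened copy of the slide's example (punctuation restored by assumption).
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845"/>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>"""

def dublin_core_fields(rdf_xml):
    """Map each described resource to its Dublin Core element values."""
    root = ET.fromstring(rdf_xml)
    records = {}
    dc_prefix = "{" + DC_NS + "}"
    for description in root.findall("{" + RDF_NS + "}Description"):
        about = description.get("{" + RDF_NS + "}about")
        fields = {}
        for child in description:
            if not child.tag.startswith(dc_prefix):
                continue  # skip non-Dublin-Core properties
            name = child.tag[len(dc_prefix):]
            # A value is either element text or an rdf:resource reference.
            text = (child.text or "").strip()
            fields[name] = text or child.get("{" + RDF_NS + "}resource")
        records[about] = fields
    return records

if __name__ == "__main__":
    for resource, fields in dublin_core_fields(SAMPLE).items():
        print(resource, fields)
```

A full RDF parser would also handle nested descriptions, repeated properties, and abbreviated syntax; this sketch deliberately does not.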
HP 25
VoiceXML
A language for specifying voice dialogs
Voice dialogs use audio prompts and text-to-speech (TTS) for output; touch-tone keys (DTMF) and automatic speech recognition (ASR) for input
Goal is to bring the advantages of web-based development and content delivery to interactive voice response applications
High-level, voice-specific language simplifies application development
Source: http://www.voicexml.org
HP 26
Voice Based Internet Applications
Source: http://www.voicexml.org
HP 27
Voice XML Metadata
Voice Specific metadata
Supports syntactic interoperability
Text data to voice data
Voice XML = XML + Voice Metadata
HP 28
VoiceXML - Possible Services
Information retrieval: news, sports, traffic, stock quotes
e-Transactions (e-commerce, e-tailing, etc.)
Financial: banking, stock trading
Catalog browsing (generally as an adjunct to paper)
Telephone services
Personal voice dialing, one-number "find-me" services
Intranet: inventory, HR services, corporate portals
Unification: "My Whatever" personal portals, personal agents, unified messaging
Source: http://www.voicexml.org
HP 29
MPEG7
A set of description schemes and descriptors to describe the content of multimedia data
Provides a language to specify description schemes
A scheme for coding the description
HP 30
Application Examples for MPEG7
A few application examples are
Digital libraries (image catalog musical dictionary)
Multimedia directory services (eg yellow pages)
Broadcast media selection (radio channel TV channel)
HP 31
Information and Content Exchange (ICE)
Main Goal: efficient and extensible content syndication protocol for the Internet, using XML syntax
Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.
Status: latest spec version 1.1, May 2000, submitted to W3C for review
Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, …
Web Site: http://www.icestandard.org
HP 32
What is the ICE Protocol
Syndication protocol for communication between Syndicators and Subscribers
Metadata to define:
  roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver
  format and method of content exchange (e.g. sequenced packages, pull vs. push model)
HP 33
ICE Applications
ICE vocabulary + domain vocabulary = complete application
ICE:
  establishes and manages the syndication
  delivers data
  logs events
  => content-independent metadata
Industry-specific vocabulary defines the content => domain-specific metadata
Source: http://www.icestandard.org
HP 34
ICE Explained
ICE: Information and Content Exchange protocol.
Syndicator: A content aggregator and distributor.
Subscriber: A content consumer.
Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement.
Collection: The current content of a subscription.
ICE Package: A delivery of commands to update a collection, such as the addition of content items.
ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information.
Sources: InternetWeek; ICE Cookbook version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm
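The relationship between a subscription, its collection, and the packages that update it can be sketched as follows. This is a toy model of the terms defined above, not the ICE wire protocol: the class names, fields, and the idea of an explicit remove command are assumptions made for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Package:
    """A delivery of commands that update a collection (hypothetical shape)."""
    package_id: str
    adds: dict = field(default_factory=dict)      # item id -> content
    removes: list = field(default_factory=list)   # item ids to drop

@dataclass
class Subscription:
    """Subscriber-side state: the collection is the current content."""
    subscription_id: str
    collection: dict = field(default_factory=dict)
    applied: list = field(default_factory=list)   # package ids, in order

    def apply(self, package):
        """Apply a package once; return False if it was already applied."""
        if package.package_id in self.applied:
            return False
        for item_id in package.removes:
            self.collection.pop(item_id, None)
        self.collection.update(package.adds)
        self.applied.append(package.package_id)
        return True

if __name__ == "__main__":
    subscription = Subscription("sub-1")
    subscription.apply(Package("pkg-1", adds={"4321": "Cartoon"}))
    print(subscription.collection)
```

Tracking applied package ids keeps a redelivered package from being applied twice, which matters when packages are sequenced.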
<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe" timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon" filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru" copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC … (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>
Content (domain-specific metadata)
HP 36
XCM (eXtended Content Management)
A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:
Content Development - Developing static content and managing the process of its subsequent approval, versioning, storage, and retrieval
Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle
Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability
Source httpwwwvignettecomCDASite020971-1-30-1458-1146-174300html
HP 37
XCM
eXtended Content Management
Content Development: Content Authoring, Digital Asset Management, Software Configuration Management, Document Process Management
Application Content Management: Metadata Management, Recombination, Personalization
Content Delivery: Edge Network Delivery, Streaming Media Delivery, Caching
Source: http://www.vignette.com
HP 38
Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain
FGDC Metadata Model
  Theme keywords: digital line graph, hydrography, transportation
  Title: Dakota Aquifer
  Online linkage: http://gisdasc.kgs.ukans.edu/dasc/
  Direct Spatial Reference Method: Vector
  Horizontal Coordinate System Definition: Universal Transverse Mercator
  …
UDK Metadata Model
  Search terms: digital line graph, hydrography, transportation
  Topic: Dakota Aquifer
  Adress Id: http://gisdasc.kgs.ukans.edu/dasc/
  Measuring Techniques: Vector
  Co-ordinate System: Universal Transverse Mercator
  …
Kansas State
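One simple way to bridge two metadata models that name the same data differently is an explicit tag-to-tag mapping. The sketch below uses the five tag pairs shown on this slide; the function name and the decision to return unmapped tags separately are assumptions for illustration. Real FGDC and UDK records are far richer, and not every element has a one-to-one counterpart.

```python
# Tag correspondences taken from the slide (FGDC name -> UDK name).
FGDC_TO_UDK = {
    "Theme keywords": "Search terms",
    "Title": "Topic",
    "Online linkage": "Adress Id",
    "Direct Spatial Reference Method": "Measuring Techniques",
    "Horizontal Coordinate System Definition": "Co-ordinate System",
}
UDK_TO_FGDC = {udk: fgdc for fgdc, udk in FGDC_TO_UDK.items()}

def translate(record, mapping):
    """Rename known tags; return (translated, unmapped) dictionaries."""
    translated, unmapped = {}, {}
    for tag, value in record.items():
        if tag in mapping:
            translated[mapping[tag]] = value
        else:
            unmapped[tag] = value   # keep what we cannot map, do not guess
    return translated, unmapped

if __name__ == "__main__":
    fgdc_record = {
        "Title": "Dakota Aquifer",
        "Direct Spatial Reference Method": "Vector",
    }
    print(translate(fgdc_record, FGDC_TO_UDK))
```

A flat rename like this only gives syntactic agreement on tag names; deciding that two tags really mean the same thing is the semantic problem the talk goes on to address.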
HP 39
Different views of Metadata
Domain-independent specifications (RDF)
Frameworks/infrastructures (XCM)
Metadata
Application specific: ICE
Media specific: MPEG7, VoiceXML
Domain specific: NewsML, FGDC/UDK
HP 40
Creating and Serving Metadata to Power the Life-cycle of Content
Taalee Infrastructure Services; Taalee Content Applications
Where is the content? Whose is it? -> Produce/Aggregate, Catalog/Index
What other content is it related to? -> Integrate, Syndicate
What is the right content for this user? -> Personalize
What is the best way to monetize this interaction? -> Interactive Marketing
Channels: Broadcast, Wireline, Wireless, Interactive TV
Taalee Semantic MetaBase
HP 41
Taalee's Intelligent Content Process
HP 42
Metadata Creation and Semanticization
• Automatic Content Classification/Categorization
• Metadata Creation/Extraction
• Types of metadata created
Semantic Engine and WorldModel are trademarks of Taalee Inc. Metadata Extraction is a patented technology of Taalee Inc.
HP 43
Forms/Types/Ingest of Content
Sources: Web sites, content feeds, and private repositories
Types: Text, graphics, audio, video, multimedia
Forms: Unstructured text, semi-structured text, structured text (+ media); static or dynamic
Ingest: Feed (push), Web (pull), repository/database (usually pull)
HP 44
Content Handling/Ingest
Infrastructure/Exchange
Feed handlers, crawlers, screen scrapers, bots, software agents
Centralized, distributed, mobile/migratory
HP 45
Information Extraction for Metadata Creation
Global/Enterprise/Web Repositories
METADATA EXTRACTORS
Digital Maps
Nexis, UPI, AP
Documents
Digital Audios
Data Stores
Digital Videos
Digital Images
HP 46
Extracting a Text Document: Syntactic approach
INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT
NATIONAL PREPAREDNESS LEVEL II
CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.
SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr… The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.
CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.
LAYOUT
Date => day month int ',' int
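The layout rule above (a date is a day name, a month name, an integer, a comma, and an integer) translates naturally into a regular expression. The sketch below is one possible reading of that rule in Python; the pattern and the function name `extract_date` are illustrative, and the commas are treated as optional so that the rule also matches text whose punctuation has been lost.

```python
import re

DAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"
MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")

# Date => day month int ',' int   (commas optional in this sketch)
DATE_RULE = re.compile(
    rf"\b(?P<day>{DAYS}),?\s+(?P<month>{MONTHS})\s+"
    rf"(?P<day_of_month>\d{{1,2}}),?\s+(?P<year>\d{{4}})\b"
)

def extract_date(text):
    """Return the first date matching the layout rule, or None."""
    match = DATE_RULE.search(text)
    if match is None:
        return None
    fields = match.groupdict()
    fields["day_of_month"] = int(fields["day_of_month"])
    fields["year"] = int(fields["year"])
    return fields

if __name__ == "__main__":
    print(extract_date("Friday, August 1, 1997 - 0530 MDT"))
```

A purely syntactic rule like this finds the report date reliably, but it knows nothing about what the date refers to, which is the limitation the semantic approach is meant to overcome.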
Traditional Text Categorization
[Diagram: Customer Article Feed (Article 4715) -> Statistical/AI Techniques, with Customer Training Set -> Classify (place in a taxonomy) -> Routing/Distribution]
Standard Metadata
  Feed Source: iSyndicate
  Posted Date: 11/20/2000
Classification of Article 4715
Taalee's Categorization & Automatic Metadata Creation
[Diagram: Article Feed (Article 4715) -> Knowledge-base & Statistical/AI Techniques, with Taalee Training Set and Customer Training Set -> Classify (place in a taxonomy) -> Metadata Catalog -> Content Manager -> Routing/Distribution; precise syndication/filtering; map to another taxonomy]
Automated Content Enrichment (ACE)
Taalee Enterprise Customization Suite
Article 4715 Metadata
  Standard metadata
    Feed Source: iSyndicate
    Posted Date: 11/20/2000
  Semantic metadata
    Company Name: France Telecom, Equant
    Ticker Symbol: FTE, ENT
    Exchange: NYSE
    Topic: Company News
Classification of Article 4715
  FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
  ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
  NYSE: Member Companies, Market News, IPOs
HP 49
Automatic Categorization & Metadata Tagging (unstructured text: transcript of A/V)
ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.
Video Segment with Associated Text
Segment Description
Semantic Metadata
Auto Categorization
HP 50
Automatic Categorization & Metadata Tagging (Web page)
Video with Editorialized Text on the Web
Auto Categorization
Semantic Metadata
HP 51
Automatic Categorization & Metadata Tagging (Feed)
Text from Bloomberg
Auto Categorization
Semantic Metadata
HP 52
Taalee Extraction and Knowledgebase Enhancement
Extraction Agent
Web Page Enhanced Metadata Asset
HP 53
Basis for Semantics
A. Facts/Concepts/Terms/Entities
  Dictionary, Thesaurus, Reference Data, Vocabulary
B. Facts with Relationships
  Taxonomy (Categories), Ontology, Domain Modeling (e.g. Golf = golfer, tournament name, golf course, event)
  Knowledge Base
HP 54
Basis for Semantics
C. Reasoning/Inference
  (Statistical) (Information Retrieval)
  Statistical Learning/AI (Bayesian, Neural Networks, HMM, …)
  Logic Based (Description Logic)
  Natural Language/Grammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methods: Cluster Analysis, Learning/AI, and Collaborative Filtering
Reference data: Concept-terms, Dictionary, Thesaurus; by topic/industry/subject/domain; word or phrase
Ontologies/Domain Models, Knowledge Base: by entities and relationships
(toward deeper understanding)
HP 56
Open Directory Project (ODP): Classification/Taxonomy & Directory
HP 57
Ontology
Standardize meaning, description, and representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes:
  Attributes
  Domain Rules
  Functional Dependencies
HP 59
An Ontology
Example: Interrelated ontologies
[Figure: four interrelated ontologies]
LANDUSE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL
LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE
WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; processes: shredding, magnetic separation, screening, washing
NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI; linked by "causes" relationships
HP 61
Large Vocabularies, Taxonomies, Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.
Metadata-enabled Applications
HP 63
Metadata Usage: Impact on Search & Query processing
traditional queries based on keywords
attribute-based queries
content-based queries
HP 64
Oingo.com
Oingo Ontology (ODP based?) - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning used to state intent; the basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged, and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:
• Context of the Content
• Semantic Metadata describing entities (e.g. Company, Industry), and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
"Normal" content can only be "found" if the user enters a keyword that exists within it
Intelligent Content=+
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:
• Context of the Content
• Semantic Metadata describing entities (e.g. Company, Industry), and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g. Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …
HP 70
Metadata Usage Keyword Attribute and Content Based Access
The VisualHarness system at LSDIS/UGA
HP 71
Keyword Search vs. Attribute Search with Semantic Metadata
Virage search on "football touchdown":
  Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline…
  Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline…
Metadata from Typical Cataloging of Football Assets
Taalee Metadata on Football Assets
Rich Media Reference Page: Baltimore 31, Pit 24 (http://www.nfl.com)
  Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter.
  League: Professional
  Teams: Ravens, Steelers
  Score: Bal 31, Pit 24
  Players: Qadry Ismail, Tony Banks
  Event: Touchdown
  Produced by: NFL.com
  Posted date: 2022000
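The difference between the two search styles on this slide can be sketched with a tiny in-memory catalog. The record layout and both function names are assumptions made for illustration; the sample values come from the slide. Keyword search matches any asset whose text happens to contain the word, while attribute search matches only assets whose structured metadata has the requested values.

```python
ASSETS = [
    {
        "title": "Brian Griese Interview Part Four",
        "text": "Brian Griese talks about the first touchdown he ever threw",
        "metadata": {},   # typical cataloging: no structured fields
    },
    {
        "title": "Baltimore 31 Pit 24",
        "text": "hook up for their third long touchdown on a 76-yarder",
        "metadata": {
            "League": "Professional",
            "Teams": ["Ravens", "Steelers"],
            "Event": "Touchdown",
        },
    },
]

def keyword_search(assets, keyword):
    """Return assets whose title or text contains the keyword."""
    needle = keyword.lower()
    return [a for a in assets
            if needle in a["title"].lower() or needle in a["text"].lower()]

def _matches(actual, wanted):
    """A field matches if it equals the wanted value or contains it."""
    if actual is None:
        return False
    if isinstance(actual, (list, tuple, set)):
        return wanted in actual
    return actual == wanted

def attribute_search(assets, **criteria):
    """Return assets whose metadata satisfies every field=value criterion."""
    return [a for a in assets
            if all(_matches(a["metadata"].get(name), wanted)
                   for name, wanted in criteria.items())]

if __name__ == "__main__":
    print([a["title"] for a in keyword_search(ASSETS, "touchdown")])
    print([a["title"] for a in attribute_search(ASSETS, Event="Touchdown",
                                                Teams="Ravens")])
```

Here the keyword query returns both assets, including an interview that merely mentions a touchdown, while the attribute query returns only the actual touchdown play.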
HP 72
Taalee's Semantic Search
Highly customizable, precise, and freshest A/V search
Context and domain-specific attributes; uniform metadata for content from multiple sources; can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do?
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY amp CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic applications for highly relevant and fresh content
Personalization and targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you.
Please enter such semantic keywords below.
Change Context
Personalized Queries & Hot Topics
PERSONALIZATION
Personalized Queries
1. My Stock Portfolio
   Microsoft suffers serious hack attack
   Cisco Systems Inc
   Analyst Safa Rashtchy on Yahoo
   PeopleSoft Inc
   AT&T Corp
   more…
2. My Football Fantasy Team
   Gators' Spurrier ready for big game
   Tech's Vick looks to become complete QB
   Bucs excited about Hamilton
   Jasper Sanks rumbles into the end zone…
   Edwards explains reasons for leaving BYU
   more…
3. Julia Roberts Collection
   Movie Trailer Notting Hill
   Trailer - Runaway Bride
   Patrick
   Movie Trailer Stepmom
   Conspiracy Theory
   more…
4. Pink Floyd Collection
   Set the Controls for the Heart of the Sun…
   Wish You Were Here
   Round And Around
   Keep Talking
   The Post War Dream
   more…
HOT Topics
1. Election 2000
   Video: Explaining the electoral map
   Race for White House hots up
   Seniors Give Gore Florida Edge
   more…
2. Middle East Peace Conflict
   More die as Israel steps up security
   Israel braces for suicide bombs
   Pentagon probes Cole's security
   more…
3. Napster Controversy
   The Brain Behind Napster
   Napster Lawsuit
   Creative Nomad II
   more…
HP 80
Metadata Targeting
SemanticInteractive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of structured metadata and integration from multiple sources
HP 82
Web Extreme Personalization
Realtime Feeds
Interests Preferences
Time-ShiftedContent Aggregator
Web sites and Pages
ContentDatabases Personalized
Content
Semantic EngineTM
Personalized Content
Content
Structured Hi-Quality
Semantic Metabase
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site.
User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories.
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide).
Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic, personalized menu.
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions listen set alert on story add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and the analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst.
HP 87
iTV: Taalee's Extreme Personalization
Content Provider
(DBS DISH Wink AOL-TV)
Semantic EngineTM
Meta-DataTagged Content
ContentldquoProgramsrdquo
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-QualitySemantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.
This screen is customizable with interactivity features using metadata, such as whether there is a new Conference Call video on CSCO.
Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata.
The Conference Call itself can have embedded metadata to support personalization and interactivity.
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support (Sony)
Categorize
Catalog
Integrate
Collection; Processing
Network Content
Affiliate Feeds
Public Sources
Rich Data Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were
-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
Sixty Die In Nigeria Blast
Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers
• Greatly enhances newsroom productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction, a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday, the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore
HP 92
Metadata's role in emerging iTV infrastructure
[Figure labels: Video; MPEG Encoder (MPEG-2/4/7); Create Scene Description Tree; Scene Description Tree; Node = AVO Object; Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; GREAT USER EXPERIENCE]
Taalee Semantic Engine: "Cisco Systems" node; Enhanced XML Description; metadata-rich, value-added node
Object Content Information (OCI):
  Produced by: Fox Sports
  Creation Date: 12/05/2000
  League: NFL
  Teams: Seattle Seahawks, Atlanta Falcons
  Players: John Kitna
  Coaches: Mike Holmgren, Dan Reeves
  Location: Atlanta
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers, and Broadcasters
HP 93
Intelligent Metadata Creation
Extractor Agents: content which does contain the words the user asked for
+ Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for
+ Semantic Associations: content the user did not think to ask for, but which he needs to know
Metadata for Intelligent Content
Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.
• If a keyword is not in the content, it cannot be found.
• The burden is on the user to think of and ask for the "right" keyword.
For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".
Understanding of the content is needed to create new metadata.
Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses such Semantic Associations (another example: COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely.
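The enrichment step described above can be sketched as a lookup against a small knowledge base. This is only an illustration of the idea, not Taalee's patented extraction technology: the knowledge-base layout, the field names, and the function name are all assumptions.

```python
# A toy knowledge base of semantic associations (hypothetical layout):
# each known person maps to the fields that can be inferred about them.
KNOWLEDGE_BASE = {
    "Roger Clemens": {"Teams": "New York Yankees", "Sport": "Baseball"},
}

def add_value_added_metadata(metadata, knowledge_base):
    """Return a copy of the metadata with fields inferred from known players."""
    enriched = {name: list(values) for name, values in metadata.items()}
    for player in metadata.get("Players", []):
        for name, value in knowledge_base.get(player, {}).items():
            bucket = enriched.setdefault(name, [])
            if value not in bucket:      # avoid duplicate values
                bucket.append(value)
    return enriched

def find(assets, field_name, value):
    """Return the assets whose metadata lists the value under the field."""
    return [a for a in assets if value in a.get(field_name, [])]

if __name__ == "__main__":
    story = {"Players": ["Roger Clemens"]}   # extracted from the text itself
    enriched = add_value_added_metadata(story, KNOWLEDGE_BASE)
    print(find([enriched], "Teams", "New York Yankees"))
```

Before enrichment, a search on the team finds nothing because the story never names it; after enrichment, the same search succeeds because the team was added from the knowledge base.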
HP 96
Guided Demo for Value-Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
  Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team = Atlanta Falcons or League = NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value-Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
  Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team = Los Angeles Dodgers or League = National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'.
Click on the first result for Jamal Anderson.
View the metadata. Note that Team name and League name are also included in the metadata.
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search.
HP 99
Example 2 - Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'.
Click on the first result for Gary Sheffield.
View the metadata. Note that Team name and League name are also included in the metadata.
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search.
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.
• Producer Name: name of the content provider that produced the asset
• Posted Date: date of asset posting - extracted automatically
• Player Names: names of players mentioned explicitly in the asset - extracted automatically
• Team Name: name of the team for which the player plays - not mentioned explicitly in the asset - value-added using Taalee's semantic relationships
• League Name: name of the league to which the player's team belongs - not mentioned explicitly in the asset - value-added by Taalee's processing based on semantic associations
• Sport: name of the sport
Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset.
The asset is richly and fully described in the many ways users choose to interact with it.
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content.
• They do not understand the meaning, context, or relationships of keywords.
For example, a search engine may see that the words "Commerce One" occur, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'.
The result page shows:
• Links to news on companies that compete against Commerce One
• Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically.
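A rough sketch of how a COMPETES WITH association can widen a search result is given below. The relation table, the sample story titles, and the function names are hypothetical and chosen only to mirror the Commerce One / Ariba example; they do not describe the actual directory implementation.

```python
# Hypothetical symmetric COMPETES_WITH relation (illustrative only).
COMPETES_WITH = {("Commerce One", "Ariba")}

# Hypothetical catalog: each story is tagged with the companies it is about.
NEWS = [
    {"title": "Story on Commerce One", "companies": ["Commerce One"]},
    {"title": "Story on Ariba", "companies": ["Ariba"]},
    {"title": "Story on an unrelated company", "companies": ["Cisco"]},
]


def competitors(company):
    """Return the sorted list of companies related to company by COMPETES_WITH."""
    found = set()
    for a, b in COMPETES_WITH:
        if a == company:
            found.add(b)
        elif b == company:
            found.add(a)
    return sorted(found)


def search_with_associations(company, stories):
    """Return stories about the company plus stories about its competitors."""
    rivals = competitors(company)
    asked_for = [s["title"] for s in stories if company in s["companies"]]
    competitor_news = [
        s["title"] for s in stories
        if any(r in s["companies"] for r in rivals)
    ]
    return {"asked_for": asked_for, "competitor_news": competitor_news}


result = search_with_associations("Commerce One", NEWS)
```

A keyword-only engine would return just the "asked_for" list; the association lookup is what surfaces the Ariba story without the user asking for it.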
HP 104
Metadata-centric Content Management Architecture
Content sources: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters)
Components: Extractor Agents 1-3; Semantic Engine; World Model; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication
Output: XCM-compliant metadata (XML or other format)
Flow:
1. Cisco story from PW Source 1 is passed on to add semantic associations
2. Consults Knowledge Base for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds is picked for publishing as "semantically related" to the Cisco story and passed on to the Dashboard
Result: Story on Cisco, with related Story on Lucent
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
• Related Stock News
• Industry News
• Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
• SEC/EPA Regulations: Impacting INDUSTRY or Filed By COMPANY
• Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic collation of semantically related digital media information from multiple sources
Research inferred automatically
Semantically related news not specifically asked for
Semantic Search, Personalization, etc.
A vision for the future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling is obviously not enough:
• we need a "logic layer" on top of RDF
• some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]
A personal definition: Semantic Web is the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods.
What is DAML (DARPA Agent Markup Language)?
• A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
• Based on RDF + XML
• Agent-readable tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-layered Architecture of the Semantic Web
Logical Layer: formal semantics and reasoning support - OIL, DAML-O
Schema Layer: definition of vocabulary - RDF Schema
Data Layer: simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL - Evolving towards Semantic Web
OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear tests may cause earthquakes. Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.
NuclearTest Causes Earthquake <=
    dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30
    AND distance(NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude) < 10000
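The rule can be evaluated with a short Python sketch. Several points are assumptions rather than part of the slide: the distance function is implemented here with the haversine great-circle formula in miles, the date difference is counted in days and required to be non-negative (the earthquake follows the test), and the event records are plain dictionaries using the attribute names from the rule.

```python
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an assumption for this sketch


def distance(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points (haversine formula)."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


def date_difference(test_date, quake_date):
    """Number of days from the nuclear test to the earthquake."""
    return (quake_date - test_date).days


def could_have_caused(test, quake, max_days=30, max_miles=10000):
    """Apply the 'NuclearTest Causes Earthquake' rule to one pair of events."""
    days = date_difference(test["eventDate"], quake["eventDate"])
    miles = distance(test["latitude"], test["longitude"],
                     quake["latitude"], quake["longitude"])
    return 0 <= days < max_days and miles < max_miles


# Hypothetical records, for illustration only.
test_event = {"eventDate": date(1970, 1, 1), "latitude": 37.0, "longitude": -116.0}
quake_soon = {"eventDate": date(1970, 1, 10), "latitude": 36.5, "longitude": -117.0}
quake_late = {"eventDate": date(1970, 3, 1), "latitude": 36.5, "longitude": -117.0}
```

Applying could_have_caused to every (test, earthquake) pair from the two sources yields the candidate pairs that the knowledge discovery queries then examine.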
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900.
Increase in number of earthquakes since 1945
Knowledge Discovery - Example...
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale per year, starting from 1900, find the average number of earthquakes.
Number of earthquakes with magnitude > 7 is almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7.
Knowledge Discovery - Example...
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake.
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NewsML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
Semantic Web: www.semanticweb.org
VoiceXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 11
RDF (Resource Description Framework)
Resource - Property - Value
• RDF data consists of nodes and attached attribute/value pairs
• Nodes can be any web resources (pages, servers, basically anything for which you can give a URI), even other instances of metadata
• Attributes are named properties of the nodes, and their values are either atomic (text strings, numbers, etc.) or other resources or metadata instances
HP 12
RDF Example 1
URI:TALK --dc:title--> "Mysteries of Metadata"
URI:TALK --dc:creator--> URI:AMIT

<?xml version='1.0'?>
<rdf:RDF xmlns:rdf="http://www.w3.org/TR/REC-rdf-syntax#"
         xmlns:dc="http://purl.org/dc/elements/1.0/">
  <rdf:Description rdf:about="URI:TALK">
    <dc:title>Mysteries of Metadata</dc:title>
    <dc:creator rdf:resource="URI:AMIT"/>
  </rdf:Description>
</rdf:RDF>
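To make the node / property / value reading of such a description concrete, the following Python sketch lists the (resource, property, value) triples of a small RDF/XML document using only the standard library. The namespace URIs, including the trailing "#" and "/" separators, and the "URI:TALK" / "URI:AMIT" identifiers are assumptions reconstructed for this sketch; a full RDF parser handles many more syntax forms than this does.

```python
import xml.etree.ElementTree as ET

# Namespace URIs are assumptions for this sketch.
RDF_NS = "http://www.w3.org/TR/REC-rdf-syntax#"
DC_NS = "http://purl.org/dc/elements/1.0/"

RDF_XML = f"""<?xml version='1.0'?>
<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:dc="{DC_NS}">
  <rdf:Description rdf:about="URI:TALK">
    <dc:title>Mysteries of Metadata</dc:title>
    <dc:creator rdf:resource="URI:AMIT"/>
  </rdf:Description>
</rdf:RDF>"""


def triples(xml_text):
    """Return (resource, property, value) triples from simple RDF/XML."""
    root = ET.fromstring(xml_text)
    result = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # A value is either another resource or an atomic text string.
            resource = prop.get(f"{{{RDF_NS}}}resource")
            value = resource if resource is not None else (prop.text or "").strip()
            result.append((subject, prop.tag, value))
    return result


for subject, predicate, value in triples(RDF_XML):
    print(subject, predicate, value)
```

Each property element becomes one triple: the title has an atomic string value, while the creator points to another resource.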
HP 13
RDF Example 2
URI:TALK --dc:title--> "Mysteries of Metadata"
URI:TALK --dc:creator--> URI:AMIT
URI:AMIT --BIB:Name--> "Amit Sheth"
URI:AMIT --BIB:Email--> "amit@taalee.com"
URI:AMIT --BIB:Aff--> URI:LIB
HP 14
RDFS (RDF Schema)
Enables resource description communities to define (and share) vocabularies (museum, library, e-commerce, ...)
Vocabulary (in RDFS) = the meaning, characteristics, and relationships of a set of properties
HP 15
RDF Based Web
HTML Resources
RDF/XML Descriptions
RDF Schemas
Source: http://www.w3c.rl.ac.uk
HP 16
Dublin Core Metadata Initiative
Simple element set designed for resource description
International, inter-discipline, W3C community consensus
"Semantic" interface among resource description communities (very limited form of semantics)
Source: www.desire.org
HP 17
Dublin Core RDF
<xml>
<?namespace href="http://w3.org/rdf-schema" as="RDF"?>
<?namespace href="http://metadata.net/DC" as="DC"?>
<RDF:Abbreviated>
  <RDF:Assertion RDF:HREF="http://www.mysite.com/mydoc.html"
    DC:Title="I've Never Metadata I've Never Liked"
    DC:Creator="Mary Crystal"
    DC:Subject="Metadata, Dublin Core, Stuff"/>
</RDF:Abbreviated>
</xml>
HP 18
MOF (Meta Object Facility) and XMI
MOF models metadata using a subset of UML that is relevant to modeling metadata (class models: classes, associations, and subtyping), together with a set of rules for mapping the elements of the MOF Core to CORBA IDL.
XML Metadata Interchange (XMI) is an extension of the MOF into the XML space.
HP 19
NewsML
NewsML is a packaging and metadata format for news content.
NewsML is developed by the International Press Telecommunications Council (IPTC), a consortium of news providers, mostly in the print or wire-service industries.
Since it deals only with packaging and metadata, NewsML is complementary both to news content formats like NITF and to syndication protocols like ICE.
HP 20
NewsML...
It can be used by news providers to combine their pictures, video, text, graphics, and audio files in news output available on web sites, mobile phones, high-end desktops, interactive television, and any other device.
It offers an accurate, objective set of description tools which help qualify the information and make the search more precise.
NewsML allows a range of metadata to be attached to a multimedia story, including a detailed computer-readable description of what an item is about.
HP 21
Example of the end-to-end flow - NewsML
The content provider supplies NewsML-packaged media content to the operator. The content is categorized as current events, finance, sport, etc., and updated hourly.
The operator receives NewsML data from the content provider. The content server automatically pushes updated news articles to all news service subscribers.
Consumers sign up for the news service directly on the device. When using the news service, the user browses through the categories and reads the news articles. The news articles are presented in a continuous flow (one after the other) without end-user interaction.
Source: http://www.mediabricks.com
HP 22
PRISM
Publishing Requirements for Industry Standard Metadata
Version 1.0, April 2001
Authors: IDEAlliance (Adobe, Vignette, Kinecta, et al.)
Idea: "a standard for interoperable content description, interchange, and reuse in both traditional and electronic publishing contexts"
Web site: http://www.prismstandard.org
HP 23
PRISM Design
Built on existing standards like Dublin Core (DC), RDF, XML
Designed to be used in a simple, straightforward way over the Internet
Compatible with NewsML
Integrates easily with ICE (for syndication)
Vocabulary:
• Basic: DC
• Extensions: "Controlled Vocabularies", e.g., "North American Industrial Classification System" (NAICS)
HP 24
PRISM Example
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845"/>
    <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>
(Source: PRISM spec v. 1, http://www.prismstandard.org/techdev/prismspec1.asp)
HP 25
VoiceXML
A language for specifying voice dialogs
Voice dialogs use audio prompts and text-to-speech (TTS) for output; touch-tone keys (DTMF) and automatic speech recognition (ASR) for input
Goal is to bring the advantages of web-based development and content delivery to interactive voice response applications
High-level, voice-specific language simplifies application development
Source: http://www.voicexml.org
HP 26
Voice Based Internet Applications
Source: http://www.voicexml.org
HP 27
Voice XML Metadata
Voice-specific metadata
Supports syntactic interoperability
Text data to voice data
VoiceXML = XML + Voice Metadata
HP 28
VoiceXML - Possible Services
Information retrieval - News, sports, traffic, stock quotes
e-Transactions (e-commerce, e-tailing, etc.)
Financial: banking, stock trading
Catalog browsing (generally as an adjunct to paper)
Telephone services
Personal voice dialing, one-number "find-me" services
Intranet - Inventory, HR services, corporate portals
Unification - "My Whatever" personal portals, personal agents, unified messaging
Source: http://www.voicexml.org
HP 29
MPEG7
A set of description schemes and descriptors to describe the content of multimedia data
Provides a language to specify description schemes
A scheme for coding the description
HP 30
Application Examples for MPEG7
A few application examples are
Digital libraries (image catalog musical dictionary)
Multimedia directory services (eg yellow pages)
Broadcast media selection (radio channel TV channel)
HP 31
Information and Content Exchange (ICE)
Main Goal: efficient and extensible Content Syndication protocol for the Internet using XML syntax
Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.
Status: latest spec version 1.1, May 2000; submitted to W3C for review
Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, ...
Web Site: http://www.icestandard.org
HP 32
What is the ICE Protocol
Syndication Protocol for communication between
Syndicators and Subscribers
Metadata to define
roles and responsibilities of involved parties Subscriber vs
Syndicator Requestor vs Responder Sender vs Receiver
format and method of content exchange (eg sequenced
packages pull vs push model)
HP 33
ICE Applications
ICE vocabulary + domain vocabulary = complete application
ICE:
• establishes and manages the syndication
• delivers data
• logs events
=> content-independent metadata
Industry-specific vocabulary defines the content => domain-specific metadata
Source: http://www.icestandard.org
HP 34
ICE Explained
ICE: Information and Content Exchange protocol.
Syndicator: A content aggregator and distributor.
Subscriber: A content consumer.
Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement.
Collection: The current content of a subscription.
ICE Package: A delivery of commands to update a collection, such as the addition of content items.
ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information.
Sources: InternetWeek; ICE Cookbook version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm
<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe"
             timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon"
                filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru"
                     copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC ... (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>
The comic-strip element is the content (domain-specific metadata).
HP 36
XCM (eXtended Content Management)
A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:
• Content Development - Developing static content and managing the process of its subsequent approval, versioning, storage, and retrieval
• Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle
• Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability
Source: http://www.vignette.com/CDA/Site/0,2097,1-1-30-1458-1146-1743,00.html
HP 37
XCM
eXtended Content Management
Segments: Content Development; Application Content Management; Content Delivery
Capabilities shown: Content Authoring; Digital Asset Management; Software Configuration Management; Document Process Management; Metadata Management; Recombination; Personalization; Edge Network Delivery; Streaming Media Delivery; Caching
Source: http://www.vignette.com
HP 38
Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain

FGDC Metadata Model (Kansas State)
  Theme keywords: digital line graph, hydrography, transportation
  Title: Dakota Aquifer
  Online linkage: http://gisdasc.kgs.ukans.edu/dasc/
  Direct Spatial Reference Method: Vector
  Horizontal Coordinate System Definition: Universal Transverse Mercator
  ...

UDK Metadata Model (Kansas State)
  Search terms: digital line graph, hydrography, transportation
  Topic: Dakota Aquifer
  Address Id: http://gisdasc.kgs.ukans.edu/dasc/
  Measuring Techniques: Vector
  Co-ordinate System: Universal Transverse Mercator
  ...
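One simple way to bridge two such models is a tag-name mapping, sketched below in Python. The field pairs are taken from the slide; the dictionary layout and the translate function are assumptions made for illustration, and a real mapping between FGDC and UDK would also have to deal with fields that have no one-to-one counterpart.

```python
# Tag correspondences read off the slide (FGDC name -> UDK name).
FGDC_TO_UDK = {
    "Theme keywords": "Search terms",
    "Title": "Topic",
    "Online linkage": "Address Id",
    "Direct Spatial Reference Method": "Measuring Techniques",
    "Horizontal Coordinate System Definition": "Co-ordinate System",
}
UDK_TO_FGDC = {udk: fgdc for fgdc, udk in FGDC_TO_UDK.items()}


def translate(record, mapping):
    """Rename the tags of a metadata record; unmapped tags are kept as-is."""
    return {mapping.get(tag, tag): value for tag, value in record.items()}


fgdc_record = {
    "Theme keywords": ["digital line graph", "hydrography", "transportation"],
    "Title": "Dakota Aquifer",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}
udk_record = translate(fgdc_record, FGDC_TO_UDK)
```

A query written against UDK tag names (for example Topic = Dakota Aquifer) can then be answered from a record that was originally cataloged with FGDC tags.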
HP 39
Different views of Metadata
Domain-Independent Specifications (RDF)
Frameworks/Infrastructures (XCM)
Application-Specific: ICE
Media-Specific: MPEG7, VoiceXML
Domain-Specific: NewsML, FGDC/UDK
HP 40
Creating and Serving Metadata to Power the Life-cycle of Content
Taalee Infrastructure Services / Taalee Content Applications
• Where is the content? Whose is it? - Produce/Aggregate, Catalog/Index
• What other content is it related to? - Integrate, Syndicate
• What is the right content for this user? - Personalize
• What is the best way to monetize this interaction? - Interactive Marketing
Delivery channels: Broadcast, Wireline, Wireless, Interactive TV
Taalee Semantic MetaBase
HP 41
Taalee's Intelligent Content Process
HP 42
Metadata Creation and Semanticization
• Automatic Content Classification/Categorization
• Metadata Creation/Extraction; types of metadata created
Semantic Engine and WorldModel are trademarks of Taalee Inc. Metadata Extraction is a patented technology of Taalee Inc.
HP 43
Forms/Types/Ingest of Content
Sources: Web Sites, Content Feeds, and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)
HP 44
Content Handling/Ingest
Infrastructure/Exchange
Feed Handlers
Crawlers/Screen Scrapers/Bots
Software Agents: Centralized, Distributed, Mobile/Migratory
HP 45
Information Extraction for Metadata Creation
Global/Enterprise/Web Repositories: Digital Maps, Nexis/UPI/AP, Documents, Digital Audios, Data Stores, Digital Videos, Digital Images
METADATA EXTRACTORS
HP 46
Extracting a Text Document: Syntactic Approach
INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT
NATIONAL PREPAREDNESS LEVEL II
CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.
SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.
CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. The fire is contained. Major areas of heat have been mopped-up. All crews and overhead will mop-up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.
LAYOUT
Date => day month int ',' int
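The layout rule for dates can be approximated with a regular expression, as in the Python sketch below. This is an illustration of the syntactic approach only: the lists of day and month names, the optional commas, and the function name extract_date are assumptions of this sketch, not the extractor described in the slides.

```python
import re

DAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"
MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")

# Date => day month int ',' int   (commas treated as optional here)
DATE_RE = re.compile(rf"\b({DAYS}),?\s+({MONTHS})\s+(\d{{1,2}}),?\s+(\d{{4}})\b")


def extract_date(text):
    """Return the first date matching the layout rule, or None."""
    match = DATE_RE.search(text)
    if match is None:
        return None
    day_name, month, day, year = match.groups()
    return {
        "day": day_name,
        "month": month,
        "day_of_month": int(day),
        "year": int(year),
    }


report = "INCIDENT MANAGEMENT SITUATION REPORT Friday, August 1, 1997 - 0530 MDT"
print(extract_date(report))
```

Such a rule relies purely on the surface layout of the document; it knows nothing about what the date refers to, which is the limitation the semantic approaches in the following slides address.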
Traditional Text Categorization
Customer Article Feed (Article 4715) → Statistical/AI Techniques (Customer Training Set) → Classify/Place in a taxonomy → Routing/Distribution
Standard Metadata for Article 4715:
  Feed Source: iSyndicate
  Posted Date: 11/20/2000

Taalee's Categorization & Automatic Metadata Creation
Automated Content Enrichment (ACE); Taalee Enterprise Customization Suite
Article Feed (Article 4715) → Knowledge-base & Statistical/AI Techniques (Taalee Training Set, Customer Training Set) → Classify/Place in a taxonomy → Metadata Catalog / Content Manager → Precise syndication/filtering, Routing/Distribution, Map to another taxonomy
Article 4715 Metadata:
  Standard metadata
    Feed Source: iSyndicate
    Posted Date: 11/20/2000
  Semantic metadata
    Company Name: France Telecom, Equant
    Ticker Symbol: FTE, ENT
    Exchange: NYSE
    Topic: Company News
Classification of Article 4715:
  FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
  ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
  NYSE: Member Companies, Market News, IPOs
HP 49
Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION AS OF TONIGHT THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49 IN WASHINGTON STATE THE SENATE RACE REMAINS TOO CLOSE TO CALL IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT THE SENATE WILL BE EVENLY DIVIDED IN MISSOURI REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN WHO DIED IN A CRASH THREE WEEKS AGO GOVERNOR CARNAHANS WIFE IS EXPECTED TO TAKE HIS PLACE IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT HILLARY CLINTON WON THE NEW YORK SENATE SEAT SHE IS THE FIRST FIRST LADY TO RUN MUCH LESS WIN
Video Segment with Associated Text
Segment Description
Semantic Metadata
Auto Categorization
HP 50
Automatic Categorization & Metadata Tagging (Web page)
Video with Editorialized Text on the Web
Auto Categorization
Semantic Metadata
HP 51
Automatic Categorization & Metadata Tagging (Feed)
Text from Bloomberg
Auto Categorization
Semantic Metadata
HP 52
Taalee Extraction and Knowledgebase Enhancement
Extraction Agent
Web Page Enhanced Metadata Asset
HP 53
Basis for Semantics
A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary
B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)
Knowledge Base
HP 54
Basis for Semantics
C. Reasoning/Inference:
• (Statistical) Information Retrieval
• Statistical Learning/AI (Bayesian, Neural Networks, HMM, ...)
• Logic Based (Description Logic)
• Natural Language/Grammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methods: Cluster Analysis
Learning/AI and Collaborative Filtering
Reference data/Concept-terms/Dictionary/Thesaurus: by topic/industry/subject/domain; word or phrase
Ontologies/Domain Models
KnowledgeBase: by entities and relationships
(Moving down the list gives deeper understanding)
HP 56
Open Directory Project (ODP): Classification/Taxonomy & Directory
HP 57
Ontology
Standardize meaning, description, and representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes:
• Attributes
• Domain Rules
• Functional Dependencies
HP 59
An Ontology
Example: Interrelated ontologies
LAND USE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL
LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE
WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE (processes: shredding, magnetic separation, screening, washing)
NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI
Several of the natural disaster concepts are linked to one another by "causes" relationships.
HP 61
Large Vocabularies, Taxonomies/Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.
Metadata-enabled Applications
HP 63
Metadata Usage: Impact on Search & Query Processing
• traditional queries based on keywords
• attribute-based queries
• content-based queries
HP 64
Oingo.com
• Oingo Ontology - ODP based(?): the database of millions of concepts and relationships that powers Oingo's semantic technology
• Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
• Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
• Oingo Lingua - the language of meaning used to state intent; the basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged, and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:
• Context of the Content
• Semantic Metadata describing entities (e.g., Company, Industry, etc.), and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
"Normal" Content can only be "found" if the user enters a keyword that exists within it
Intelligent Content=+
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:
• Context of the Content
• Semantic Metadata describing entities (e.g., Company, Industry, etc.), and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata amp Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, ...
HP 70
Metadata Usage Keyword Attribute and Content Based Access
The VisualHarness system at LSDIS/UGA
HP 71
Keyword Search vs Attribute Search with Semantic metadata
Virage search on "football touchdown" (metadata from typical cataloging of football assets):
• Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline...
• Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline...
Taalee Metadata on Football Assets (Rich Media Reference Page):
  Title: Baltimore 31, Pit 24
  Source: http://www.nfl.com
  Description: Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter
  League: Professional
  Teams: Ravens, Steelers
  Score: Bal 31, Pit 24
  Players: Qadry Ismail, Tony Banks
  Event: Touchdown
  Produced by: NFL.com
  Posted date: 2/02/2000
HP 72
Taaleersquos Semantic Search
Highly customizable, precise, and freshest A/V search
Context and Domain Specific Attributes; Uniform Metadata for Content from Multiple Sources; can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do?
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY amp CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content
Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries amp Hot Topics
PERSONALIZATION
3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
4 Pink Floyd Collection
Personalized Queries
Set the Controls for the Heart of the Sunhellip
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
ATampT Corp
2 My Football Fantasy Team
Gators Spurrier ready for big game
Techs Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zonehellip
Edwards explains reasons for leaving BYU morehellip
morehellip
morehellip
morehellip
1 Election 2000
2 Middle East Peace Conflict
3 Napster Controversy
Video Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Coles security
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II more...
HOT Topics
more...
more...
HP 80
Metadata Targeting
SemanticInteractive Targeting
Buy Al Pacino VideosBuy Russell Crowe VideosBuy Christopher Plummer VideosBuy Diane Venora VideosBuy Philip Baker Hall VideosBuy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
Realtime Feeds
Interests Preferences
Time-ShiftedContent Aggregator
Web sites and Pages
ContentDatabases Personalized
Content
Semantic EngineTM
Personalized Content
Content
Structured Hi-Quality
Semantic Metabase
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site.
User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories.
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide).
Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu.
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available.
The user clicks to see a listing of Analyst Calls on Cisco (next slide).
Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and the analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst.
HP 87
iTV: Taalee's Extreme Personalization
Content Provider
(DBS DISH Wink AOL-TV)
Semantic EngineTM
Meta-Data Tagged Content
Content "Programs"
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-QualitySemantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.
This screen is customizable with an interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO.
Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata.
The Conference Call itself can have embedded metadata to support personalization and interactivity.
HP 89
Metadata in Enterprise Apps
Filter Search ConsolidatePersonalize ArchiveLicensing Syndication
Production SupportProduction SupportSony
Categorize
Catalog
Integrate
CollectionCollection ProcessingProcessing
NetworkContent
AffiliateFeeds
Public Sources Rich Data
Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore
HP 92
Retrieve Scene Description Track
Enhanced Digital Cable
Video
MPEGDecoder
Node = AVO Object
Create Scene Description Tree
GREATUSER
EXPERIENCE
Metadata's role in emerging iTV infrastructure
MPEG-247MPEG
Encoder
SceneDescriptionTree
License metadata decoder and semantic applications to
device makers
Channel salesthrough Video Server Vendors
Video App Servers and Broadcasters
Enhanced XML
Description
"Cisco Systems"
Node
Taalee Semantic Engine
"Cisco Systems"
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta
Object Content Information (OCI)
Metadata-richValue-added Node
HP 93
Intelligent Metadata Creation
Content which does contain the words the user asked for (Extractor Agents)
+ Content which does not contain the words the user asked for but is about what he asked for (Value-added Metadata)
+ Content the user did not think to ask for but which he needs to know (Semantic Associations)
Metadata for Intelligent Content
Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"
Understanding of the content is needed to create new metadata
Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses such Semantic Associations (another example: COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely
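The enrichment idea on this slide can be sketched in a few lines of Python. This is only an illustration, not Taalee's implementation: the tiny `KNOWLEDGE_BASE` and `TEAM_LEAGUE` tables, and the function names `enrich` and `matches`, are hypothetical stand-ins for a real world model.

```python
# Hypothetical mini knowledge base: entity name -> associated facts.
KNOWLEDGE_BASE = {
    "Roger Clemens": {"type": "PERSON", "sport": "Baseball", "team": "New York Yankees"},
    "Jamal Anderson": {"type": "PERSON", "sport": "Football", "team": "Atlanta Falcons"},
}
# Hypothetical team -> league association.
TEAM_LEAGUE = {"New York Yankees": "American League", "Atlanta Falcons": "NFL"}


def enrich(text):
    """Return extracted metadata plus value-added metadata for a story."""
    metadata = {"players": set(), "teams": set(), "leagues": set(), "sports": set()}
    for name, facts in KNOWLEDGE_BASE.items():
        if name in text:                      # extracted: the name is in the text
            metadata["players"].add(name)
            metadata["teams"].add(facts["team"])            # value-added
            metadata["sports"].add(facts["sport"])          # value-added
            metadata["leagues"].add(TEAM_LEAGUE[facts["team"]])  # value-added
    return {field: sorted(values) for field, values in metadata.items()}


def matches(query, text):
    """A story matches if the query is in the text or in its enriched metadata."""
    query = query.lower()
    if query in text.lower():
        return True
    enriched = enrich(text)
    return any(query in value.lower() for values in enriched.values() for value in values)


story = "Roger Clemens strikes out ten in a dominant outing"
print(enrich(story)["teams"])
print(matches("Yankees", story))
```

A plain keyword index would miss this story for the query "Yankees"; the enriched metadata is what makes it findable.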
HP 96
Guided Demo for Value Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who plays for the Atlanta Falcons
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who plays for the Los Angeles Dodgers
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 - Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Producer Name: name of content provider that produced the asset
Posted Date: date of asset posting - extracted automatically
Player Names: names of players mentioned explicitly in the asset - extracted automatically
Team Name: name of team for which player plays - not mentioned explicitly in asset - value-added using Taalee's semantic relationships
League Name: name of league to which the player's team belongs - not mentioned explicitly in asset - value-added by Taalee's processing based on semantic associations
Sport: name of sport
Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships
The asset is richly and fully described in the many ways users may choose to interact with it
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context, or relationships of keywords
For example, a search engine may see that the phrase "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
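One way to picture typed associations is as a small set of (subject, relation, object) facts that can be followed from a query entity to related content. The sketch below is a hypothetical illustration only: the `ASSOCIATIONS` and `STORIES` tables and the story titles are invented for the example and are not taken from Taalee's Semantic Content Model.

```python
# Typed associations as (subject, relation, object) facts.
ASSOCIATIONS = [
    ("Commerce One", "PARTICIPATES_IN", "Corporate, Professional & Financial Software"),
    ("Ariba", "PARTICIPATES_IN", "Corporate, Professional & Financial Software"),
    ("Commerce One", "COMPETES_WITH", "Ariba"),
]

# Hypothetical catalog: entity -> story titles tagged with that entity.
STORIES = {
    "Commerce One": ["Commerce One quarterly update"],
    "Ariba": ["Ariba announces new marketplace"],
}

SYMMETRIC = {"COMPETES_WITH"}  # competition holds in both directions


def related(entity, relation):
    """Entities linked to `entity` by `relation`."""
    found = set()
    for subject, rel, obj in ASSOCIATIONS:
        if rel != relation:
            continue
        if subject == entity:
            found.add(obj)
        elif obj == entity and rel in SYMMETRIC:
            found.add(subject)
    return sorted(found)


def associated_stories(entity):
    """What the user asked for, plus stories about competitors."""
    return {
        "asked_for": STORIES.get(entity, []),
        "competitors": {c: STORIES.get(c, []) for c in related(entity, "COMPETES_WITH")},
    }


print(associated_stories("Commerce One"))
```

Querying "Ariba" returns Commerce One stories as well, because the competition relation is treated as symmetric here; that is a modeling choice of this sketch.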
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Internal Source 1Research
Internal Source 2
External feedsWeb(eg Reuters)
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults Knowledge Base for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard
Story on Lucent
Story on Cisco
XCM-compliant metadata XML or other format
SemanticApplication
ASPEnterprise hosted
Extractor Agent 1
Extractor Agent 2
Extractor Agent 3
Metadata centricContent Management Architecture
SemanticEngine
World Model
TaaleeMetabase
Third-partyContent Mgmt
AndSyndication
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
SEC, EPA
Regulations: Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example
Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]
A personal definition
Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three layered Architecture Of Semantic Web
Logical Layer
Formal Semantics and Reasoning Support - OIL, DAML-O
Schema Layer
Definition of Vocabulary RDF Schema
Data Layer
Simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL - Evolving towards Semantic Web
OIL Mission
OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Test May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake <=
  dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
  AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
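The rule above can be sketched as a Python predicate. This is a minimal illustration under stated assumptions, not the InfoQuilt implementation: `dateDifference` is taken to be in days, `distance` in miles computed with the haversine formula, the earthquake must not precede the test, and the sample events are invented.

```python
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(min(1.0, sqrt(a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """True if the quake follows the test within max_days and max_miles."""
    days = (quake["eventDate"] - test["eventDate"]).days
    if not 0 <= days < max_days:          # assumption: quake must come after the test
        return False
    miles = distance_miles(test["latitude"], test["longitude"],
                           quake["latitude"], quake["longitude"])
    return miles < max_miles


# Hypothetical sample events, for illustration only.
test_event = {"eventDate": date(1970, 1, 1), "latitude": 37.0, "longitude": -116.0}
near_quake = {"eventDate": date(1970, 1, 10), "latitude": 36.0, "longitude": -115.0}
late_quake = {"eventDate": date(1970, 3, 15), "latitude": 36.0, "longitude": -115.0}

print(may_have_caused(test_event, near_quake))
print(may_have_caused(test_event, late_quake))
```

With a 10000-mile threshold almost any two points on Earth qualify, so in practice the time window does most of the filtering; both thresholds are parameters here so they can be tightened.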
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example...
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes
Number of earthquakes with magnitude > 7 is almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
Knowledge Discovery - Example...
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and within a radius of 10000 miles from the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NewsML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
Semantic Web: www.semanticweb.org
VoiceXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 12
RDF Example 1
URI:TALK --dc:creator--> URI:AMIT
URI:TALK --dc:title--> Mysteries of Metadata
<?xml version='1.0'?>
<rdf:RDF xmlns:rdf="http://www.w3.org/TR/REC-rdf-syntax"
         xmlns:dc="http://purl.org/dc/elements/1.0/">
  <rdf:Description rdf:about="URI:TALK">
    <dc:title>Mysteries of Metadata</dc:title>
    <dc:creator rdf:resource="URI:AMIT"/>
  </rdf:Description>
</rdf:RDF>
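To show how such a description decomposes into subject-property-value statements, here is a small Python sketch using only the standard library. It is an illustration, not a full RDF parser: it handles only flat rdf:Description elements, and it assumes the W3C RDF syntax namespace `http://www.w3.org/1999/02/22-rdf-syntax-ns#` (which differs from the namespace string printed on the slide). The function name `triples` is hypothetical.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"  # assumed namespace
DC = "http://purl.org/dc/elements/1.0/"

DOC = f"""<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="{RDF}" xmlns:dc="{DC}">
  <rdf:Description rdf:about="URI:TALK">
    <dc:title>Mysteries of Metadata</dc:title>
    <dc:creator rdf:resource="URI:AMIT"/>
  </rdf:Description>
</rdf:RDF>"""


def triples(xml_text):
    """Return (subject, property, value) tuples from flat rdf:Description blocks."""
    root = ET.fromstring(xml_text)
    result = []
    for description in root.findall(f"{{{RDF}}}Description"):
        subject = description.get(f"{{{RDF}}}about")
        for prop in description:
            # A property value is either a referenced resource or literal text.
            value = prop.get(f"{{{RDF}}}resource") or (prop.text or "").strip()
            result.append((subject, prop.tag, value))
    return result


for triple in triples(DOC):
    print(triple)
```

Each tuple corresponds to one arc in the graph view of the example: the talk has a title (a literal) and a creator (another resource).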
HP 13
RDF Example 2
URI:AMIT dc:creator
URI:LIB amit@taalee.com
BIB:Email BIB:Name
BIB:Aff
dc:title Mysteries of Metadata
URI:TALK
Amit Sheth
HP 14
RDFS (RDF Schema)
Enables resource description communities to define (and share) vocabularies (museum, library, e-commerce...)
Vocabulary (in RDFS) = the meaning, characteristics, and relationships of a set of properties
HP 15
RDF Based Web
HTML
Resources
RDFXMLDescriptions
RDFSchemas
Source: http://www.w3c.rl.ac.uk
HP 16
Dublin Core Metadata Initiative
Simple element set designed for resource description
International, inter-discipline, W3C community consensus
"Semantic" interface among resource description communities (very limited form of semantics)
Source: www.desire.org
HP 17
Dublin Core RDF
<xml>
<namespace href = "http://w3.org/rdf-schema" as = "RDF">
<namespace href = "http://metadata.net/DC" as = "DC">
<RDF:Abbreviated>
<RDF:Assertion RDF:HREF = "http://www.mysite.com/mydoc.html"
  DC:Title = "I've Never Metadata I've Never Liked"
  DC:Creator = "Mary Crystal"
  DC:Subject = "Metadata, Dublin Core, Stuff">
</RDF:Abbreviated>
</xml>
HP 18
MOF (Meta Object Facility) and XMI
MOF models metadata using a subset of UML that is relevant to modeling metadata (class models: classes, associations, and subtyping), plus a set of rules for mapping the elements of the MOF Core to CORBA IDL
XML Metadata Interchange (XMI) is an extension of the MOF into the XML space
HP 19
NewsML
NewsML is a packaging and metadata format for news content.
NewsML is developed by the International Press Telecommunications Council (IPTC), a consortium of news providers, mostly in the print or wire-service industries.
Since it deals only with packaging and metadata, NewsML is complementary both to news content formats like NITF and to syndication protocols like ICE.
HP 20
NewsML...
It can be used by news providers to combine their pictures, video, text, graphics, and audio files in news output available on web sites, mobile phones, high-end desktops, interactive television, and any other device
Accurate, objective set of description tools which help qualify the information and make the search more precise
NewsML allows a range of metadata to be attached to a multimedia story, including a detailed computer-readable description of what an item is about
HP 21
Example of the end-to-end flow -NewsML
The content provider supplies NewsML-packaged media content to the operator. The content is categorized as current events, finance, sport, etc., and updated hourly.
The operator receives NewsML data from the content provider. The content server automatically pushes updated news articles to all news service subscribers.
Consumers sign up for the news service directly on the device. When using the news service, the user browses through the categories and reads the news articles. The news articles are presented in a continuous flow (one after the other) without end-user interaction.
Source: http://www.mediabricks.com
HP 22
PRISM
Publishing Requirements for Industry Standard Metadata
Version 1.0, April 2001
Authors: IDEAlliance (Adobe, Vignette, Kinecta, et al.)
Idea: "a standard for interoperable content description, interchange, and reuse in both traditional and electronic publishing contexts"
Web site: http://www.prismstandard.org
HP 23
PRISM Design
Built on existing standards like Dublin Core (DC), RDF, XML
Designed to be used in a simple, straightforward way over the Internet
Compatible with NewsML
Integrates easily with ICE (for syndication)
Vocabulary
Basic: DC
Extensions: "Controlled Vocabularies", e.g., "North American Industry Classification System" (NAICS)
HP 24
PRISM Example
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845"/>
    <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>
(Source: PRISM spec v. 1, http://www.prismstandard.org/techdev/prismspec1.asp)
HP 25
VoiceXML
A language for specifying voice dialogs
Voice dialogs use audio prompts and text-to-speech (TTS) for output; touch-tone keys (DTMF) and automatic speech recognition (ASR) for input
Goal is to bring the advantages of web-based development and content delivery to interactive voice response applications
High-level, voice-specific language simplifies application development
Source: http://www.voicexml.org
HP 26
Voice Based Internet Applications
Source httpwwwvoicexmlorg
HP 27
Voice XML Metadata
Voice Specific metadata
Supports syntactic interoperability
Text data to voice data
Voice XML = XML + Voice Metadata
HP 28
VoiceXML - Possible Services
Information retrieval - news, sports, traffic, stock quotes
e-Transactions (e-commerce, e-tailing, etc.)
Financial: banking, stock trading
Catalog browsing (generally as an adjunct to paper)
Telephone services
Personal voice dialing, one-number "find-me" services
Intranet - inventory, HR services, corporate portals
Unification - "My Whatever" personal portals, personal agents, unified messaging
Source: http://www.voicexml.org
HP 29
MPEG7
A set of description schemes and descriptors to describe the content of multimedia data
Provides a language to specify description schemes
A scheme for coding the description
HP 30
Application Examples for MPEG7
A few application examples are
Digital libraries (image catalog musical dictionary)
Multimedia directory services (eg yellow pages)
Broadcast media selection (radio channel TV channel)
HP 31
Information and Content Exchange (ICE)
Main Goal: efficient and extensible content syndication protocol for the Internet, using XML syntax
Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.
Status: latest spec version 1.1, May 2000; submitted to W3C for review
Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, ...
Web Site: http://www.icestandard.org
HP 32
What is the ICE Protocol
Syndication protocol for communication between Syndicators and Subscribers
Metadata to define:
roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver
format and method of content exchange (e.g., sequenced packages, pull vs. push model)
HP 33
ICE Applications
ICE vocabulary + domain vocabulary = complete application
ICE establishes and manages the syndication, delivers data, and logs events => content-independent metadata
Industry-specific vocabulary defines the content => domain-specific metadata
Source: http://www.icestandard.org
HP 34
ICE Explained
ICE: Information and Content Exchange protocol
Syndicator: a content aggregator and distributor
Subscriber: a content consumer
Subscription: an agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement
Collection: the current content of a subscription
ICE Package: a delivery of commands to update a collection, such as the addition of content items
ICE Payload: the XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information
Sources: InternetWeek; ICE Cookbook version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm
<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "httpicedtd">
<ice-payload payload-id="ipl-80a56cfe" timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon" filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru" copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC ... (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>
Content (domain-specific metadata)
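The split between content-independent (ICE) metadata and domain-specific metadata in the payload above can be illustrated with a short Python sketch. It is a simplified illustration, not an ICE implementation: the payload string is a trimmed-down version of the slide's example, and the function name `split_metadata` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified payload modeled on the slide's example (DOCTYPE and image data omitted).
PAYLOAD = """<ice-payload payload-id="ipl-80a56cfe" timestamp="05-15-2001T11:00:01">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321"
                name="Cartoon" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru" pubdate="20010515"/>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>"""


def split_metadata(xml_text):
    """For each ice-item, separate the ICE envelope from the domain content."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("ice-item"):
        envelope = dict(item.attrib)                            # content-independent
        domain = [(child.tag, dict(child.attrib)) for child in item]  # domain-specific
        items.append((envelope, domain))
    return items


for envelope, domain in split_metadata(PAYLOAD):
    print("ICE metadata:   ", envelope)
    print("Domain metadata:", domain)
```

The syndication layer only needs the envelope attributes; the comic-strip element is opaque to it and is interpreted by the domain application.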
HP 36
XCM (eXtended Content Management)
A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:
Content Development - developing static content and managing the process of its subsequent approval, versioning, storage, and retrieval
Application Content Management (Vignette) - deploying content dynamically to a Web site and managing that content throughout its online lifecycle
Content Delivery - delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability
Source httpwwwvignettecomCDASite020971-1-30-1458-1146-174300html
HP 37
XCM
eXtended Content Management
Content DevelopmentManagement
Content DeliveryApplication ContentManagement
Content AuthoringDigital Asset Management
Software ConfigurationManagement
Document ProcessManagement
Metadata ManagementRecombinationPersonalization
Edge Network Delivery
Streaming Media DeliveryCaching
Source httpwwwvignettecom
HP 38
Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain
FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc/
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
...
UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Address Id: http://gisdasc.kgs.ukans.edu/dasc/
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
...
Kansas State
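A common way to reconcile such models is a crosswalk that maps each model's tag names onto shared field names. The Python sketch below illustrates this for the tags shown on the slide; the common field names (`keywords`, `title`, and so on), the cleaned-up tag spellings, and the function name `normalize` are assumptions made for the example.

```python
# Crosswalk: model -> {model-specific tag: common field name}.
CROSSWALK = {
    "FGDC": {
        "Theme keywords": "keywords",
        "Title": "title",
        "Online linkage": "url",
        "Direct Spatial Reference Method": "spatial_method",
        "Horizontal Coordinate System Definition": "coordinate_system",
    },
    "UDK": {
        "Search terms": "keywords",
        "Topic": "title",
        "Address Id": "url",
        "Measuring Techniques": "spatial_method",
        "Co-ordinate System": "coordinate_system",
    },
}


def normalize(model, record):
    """Rename a record's tags to the common field names; unknown tags are dropped."""
    mapping = CROSSWALK[model]
    return {mapping[tag]: value for tag, value in record.items() if tag in mapping}


fgdc_record = {"Title": "Dakota Aquifer", "Direct Spatial Reference Method": "Vector"}
udk_record = {"Topic": "Dakota Aquifer", "Measuring Techniques": "Vector"}

print(normalize("FGDC", fgdc_record))
print(normalize("UDK", udk_record) == normalize("FGDC", fgdc_record))
```

A crosswalk handles only naming differences; it does not address cases where two tags carry subtly different meanings, which is where domain modeling comes in.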
HP 39
Different views of Metadata
Domain Independent Specifications (RDF)
FrameworksInfrastructures (XCM)
MetadataApplication Specific
ICE
Media Specific
MPEG7 VoiceXML
Domain Specific
NewsML FGDCUDK
HP 40
Creating and Serving Metadata to Power the Life-cycle of Content
Taalee Infrastructure Services Taalee Content Applications
Where is the content
Whose is it
ProduceAggregate
CatalogIndex
What other content is it related to
Integrate Syndicate
What is the right content for this
user
Personalize
What is the best way to
monetize this interaction
Interactive Marketing
BroadcastWirelineWirelessInteractive TV
Taalee Semantic MetaBase
HP 41
Taalee's Intelligent Content Process
HP 42
Metadata Creation and Semanticization
bull Automatic Content ClassificationCategorization
bull Metadata CreationExtractionTypes of metadata created
HP 43
Forms/Types/Ingest of Content
Sources: Web Sites, Content Feeds, and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)
HP 44
Content Handling/Ingest
Infrastructure: Exchange, Feed Handlers, Crawlers, Screen Scrapers, Bots, Software Agents
Centralized, Distributed, Mobile/Migratory
HP 45
Information Extraction for Metadata Creation
GlobalEnterpriseWeb Repositories
METADATAMETADATA
EXTRACTORSEXTRACTORS
Digital Maps
NexisUPIAP
Documents
Digital Audios
Data Stores
Digital Videos
Digital Images
HP 46
Extracting a Text Document: Syntactic approach
INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT
NATIONAL PREPAREDNESS LEVEL II
CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.
SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.
CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.
LAYOUT
Date => day month int ',' int
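The layout rule above (a weekday, a month name, an integer, a comma, an integer) can be read as a pattern over the report header. A minimal sketch, assuming English weekday and month names and treating the commas as optional; the names `DATE_RULE` and `extract_date` are illustrative and are not part of the Taalee extractor:

```python
import re
from datetime import date

# Month names mapped to month numbers; weekday names cover the "day" token of the rule.
MONTHS = {name: number for number, name in enumerate(
    ["January", "February", "March", "April", "May", "June", "July",
     "August", "September", "October", "November", "December"], start=1)}
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

# Date => day month int ',' int  (commas are optional in this sketch)
DATE_RULE = re.compile(
    r"(?P<day>" + "|".join(WEEKDAYS) + r"),?\s+"
    r"(?P<month>" + "|".join(MONTHS) + r")\s+"
    r"(?P<dom>\d{1,2}),?\s+(?P<year>\d{4})\b"
)


def extract_date(text):
    """Return the first date matching the layout rule, or None if there is none."""
    match = DATE_RULE.search(text)
    if match is None:
        return None
    return date(int(match.group("year")),
                MONTHS[match.group("month")],
                int(match.group("dom")))


report = "INCIDENT MANAGEMENT SITUATION REPORT Friday August 1 1997 - 0530 MDT"
print(extract_date(report))
```

A purely syntactic rule like this yields a "Posted Date"-style field but says nothing about what the report is about, which is the gap the later slides address.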
Traditional Text Categorization
Statistical/AI Techniques
Classify: Place in a taxonomy
Customer Training Set
Customer Article Feed: Article 4715
Routing/Distribution
Standard Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Classification of Article 4715
Taalee’s Categorization & Automatic Metadata Creation
Knowledge-base & Statistical/AI Techniques
Classify: Place in a taxonomy
Taalee Training Set; Customer Training Set
Automated Content Enrichment (ACE)
Taalee Enterprise Customization Suite
Metadata Catalog; Content Manager
Precise syndication/filtering
Article Feed: Article 4715
Routing/Distribution
Map to another taxonomy
Article 4715 Metadata
Standard metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Semantic metadata
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News
Classification of Article 4715
FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
NYSE: Member Companies, Market News, IPOs
HP 49
Automatic Categorization & Metadata Tagging (unstructured text: transcript of A/V)
ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN’S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.
Video Segment with Associated Text
Segment Description
Semantic Metadata
Auto Categorization
HP 50
Automatic Categorization & Metadata Tagging (Web page)
Video with Editorialized Text on the Web
Auto Categorization
Semantic Metadata
HP 51
Automatic Categorization & Metadata Tagging (Feed)
Text from Bloomberg
Auto Categorization
Semantic Metadata
HP 52
Taalee Extraction and Knowledgebase Enhancement
Extraction Agent
Web Page Enhanced Metadata Asset
HP 53
Basis for Semantics
A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary
B. Facts with Relationships: Taxonomy (Categories), Ontology/Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)
Knowledge Base
HP 54
Basis for Semantics
C. Reasoning/Inference
(Statistical) (Information Retrieval)
Statistical Learning/AI (Bayesian, Neural Networks, HMM, …)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methods: Cluster Analysis
Learning/AI and Collaborative Filtering
Reference data: Concept-terms, Dictionary/Thesaurus, by topic/industry/subject/domain (Word or Phrase)
Ontologies/Domain Models
KnowledgeBase: by Entities and Relationships
The further down this list, the deeper the understanding
HP 56
Open Directory Project (ODP): Classification/Taxonomy & Directory
HP 57
Ontology
Standardize meaning, description, representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes: Attributes, Domain Rules, Functional Dependencies
HP 59
An Ontology
Example: Interrelated ontologies
[Figure: four interrelated ontologies]
LANDUSE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL
LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE
WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; processes: shredding, magnetic separation, screening, washing
NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI; several disaster types are linked by “causes” relationships
HP 61
Large Vocabularies, Taxonomies/Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM’s controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.
Confidential HP
Metadata-enabled Applications
HP 63
Metadata Usage: Impact on Search & Query processing
traditional queries based on keywords
attribute based queries
content-based queries
HP 64
Oingo.com
Oingo Ontology - ODP based (?): the database of millions of concepts and relationships that powers Oingo’s semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo’s semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely-related high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user “think” about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged, and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more “intelligent” through automatic analysis of every individual asset to generate a catalog containing:
• Context of the Content
• Semantic Metadata describing entities (e.g., Company, Industry, etc.), and
• Relationships (semantic associations) among all entities
Based on a “Semantic” or “domain” model describing how the user thinks about the subject matter, supported by a knowledgebase
“Normal” Content can only be “found” if the user enters a keyword that exists within it
Intelligent Content = +
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more “intelligent” through automatic analysis of every individual content item to create:
Context of the Content
Semantic Metadata describing entities (e.g., Company, Industry, etc.), and
Relationships (semantic associations) among all entities
Based on a “Semantic” or “domain” model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata amp Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google’s Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …
HP 70
Metadata Usage Keyword Attribute and Content Based Access
The VisualHarness system at LSDIS/UGA
HP 71
Keyword Search vs. Attribute Search with Semantic metadata
Virage Search on “football touchdown”
Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline
Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline
Metadata from Typical Cataloging of Football Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31, Pit 24
http://www.nfl.com
Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens’ lead to 31-24 in the third quarter.
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Qadry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
HP 72
Taalee’s Semantic Search
Highly customizable, precise, and freshest A/V search
Context and Domain Specific Attributes
Uniform Metadata for Content from Multiple Sources; can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY & CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content
Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you.
Please enter such semantic keywords below
Change Context
PERSONALIZATION: Personalized Queries & Hot Topics
Personalized Queries
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
more…
2 My Football Fantasy Team
Gators’ Spurrier ready for big game
Tech’s Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone…
Edwards explains reasons for leaving BYU
more…
3 Julia Roberts Collection
Movie Trailer: Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer: Stepmom
Conspiracy Theory
more…
4 Pink Floyd Collection
Set the Controls for the Heart of the Sun…
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
more…
HOT Topics
1 Election 2000
Video: Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
more…
2 Middle East Peace Conflict
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Cole’s security
more…
3 Napster Controversy
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II
more…
HP 80
Metadata Targeting
Semantic/Interactive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web: Extreme Personalization
Inputs: Realtime Feeds, Web sites and Pages, Content Databases
Time-Shifted Content Aggregator
Semantic Engine™
Structured Hi-Quality Semantic Metabase
User: Interests, Preferences
Output: Personalized Content
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
Screen menu: MyStocks, News, Sports, Music, MyMedia, $
User has already completed Web-based registration and personalization at Voquette’s Enterprise Customer site.
User’s “Wireless Home page” shows the categories for his interests. There is an alert (new content) for his stock and sports categories.
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
Screen menu: MyStocks, News, Sports, Music, MyMedia, $
My Stocks: CSCO, NT, IBM, Market
Clicking on MyStocks brings down user’s Personal Portfolio list. The user wants to see news items about Cisco (see next slide).
Search at the bottom is a semantic search that understands the financial domain and the knowledge of user’s portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu.
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
Screen menu: MyStocks, News, Sports, Music, MyMedia, $
My Stocks: CSCO, NT, IBM, Market
CSCO: Analyst Call, Conf Call, Earnings
Different types of recent audio content about Cisco are available.
The user clicks to see a listing of Analyst Calls on Cisco (next slide).
Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
Screen menu: MyStocks, News, Sports, Music, MyMedia, $
My Stocks: CSCO, NT, IBM, Market
CSCO: Analyst Call, Conf Call, Earnings
CSCO Analysis: 11/08 ON24 Payne; 11/07 ON24 H&Q; 11/06 CBS Langlesis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as “Strong Buy” by H&Q Analyst.
HP 87
iTV: Taalee’s Extreme Personalization
Content Provider (DBS, DISH, Wink, AOL-TV)
Content “Programs”
Semantic Engine™
Meta-Data Tagged Content
Structured Hi-Quality Semantic Metabase
User: Immediate Interests, Preferences
Output: Personalized Content Capsules, Redirects and Programming
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by personalization application to show only the stocks that user is interested in.
This screen is customizable with interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO.
Part of the screen can be automatically customized to show conference call specific information, including transcript, participation, etc., all of which are relevant metadata.
Conference Call itself can have embedded metadata to support personalization and interactivity.
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support (Sony)
Categorize, Catalog, Integrate
Collection, Processing
Sources: Network Content, Affiliate Feeds, Public Sources
Rich Data Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria’s commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
“At least 60 people died in this needless fire,” senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were
-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can’t Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
Video: Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush
• Value-add for production, broadcast & syndication
• Taalee’s semantic metadata enables powerful access to content used by Enterprise’s customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can’t Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
Clips (duration - date - source):
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore’s appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday, the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore
HP 92
Metadata’s role in emerging iTV infrastructure
Enhanced Digital Cable: Video, MPEG Encoder (MPEG-2/4/7), MPEG Decoder
Create Scene Description Tree; Retrieve Scene Description Track
Scene Description Tree: Node = AVO Object
Taalee Semantic Engine: Enhanced XML Description attached to the “Cisco Systems” Node
Object Content Information (OCI): Metadata-rich Value-added Node
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta
GREAT USER EXPERIENCE
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers, and Broadcasters
HP 93
Intelligent Metadata Creation
Metadata for Intelligent Content
Extractor Agents: Content which does contain the words the user asked for
+ Value-added Metadata: Content which does not contain the words the user asked for, but is about what he asked for
+ Semantic Associations: Content the user did not think to ask for, but which he needs to know
Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the “right” keyword
For example: if a story is about “Roger Clemens” but does not contain the words “New York Yankees”, that story cannot and will not be found if the user searches for “New York Yankees” or “Yankees”
Understanding of the content is needed to create new metadata
Taalee understands that Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses such Semantic Associations (another example: COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely
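The Roger Clemens example can be sketched as a lookup against a small table of relationships. This is only an illustration under stated assumptions: the `KNOWLEDGE_BASE` dictionary, the relation names `plays_sport` and `plays_for`, and the functions `enrich` and `matches` are hypothetical and do not describe Taalee’s actual Semantic Engine or WorldModel.

```python
# Hypothetical mini knowledge base: (entity, relationship) -> related entity.
KNOWLEDGE_BASE = {
    ("Roger Clemens", "plays_sport"): "Baseball",
    ("Roger Clemens", "plays_for"): "New York Yankees",
}


def enrich(metadata, kb=KNOWLEDGE_BASE):
    """Return a copy of the metadata with Sport and Teams inferred from Players."""
    enriched = {field: list(values) for field, values in metadata.items()}
    for player in metadata.get("Players", []):
        for relation, field in (("plays_sport", "Sport"), ("plays_for", "Teams")):
            value = kb.get((player, relation))
            if value is not None and value not in enriched.setdefault(field, []):
                enriched[field].append(value)
    return enriched


def matches(metadata, query):
    """Keyword match of the query against every metadata value."""
    needle = query.lower()
    return any(needle in value.lower()
               for values in metadata.values() for value in values)


extracted = {"Players": ["Roger Clemens"]}   # what the story text itself yields
enriched = enrich(extracted)                 # plus value-added metadata

print(matches(extracted, "Yankees"), matches(enriched, "Yankees"))
```

The extracted record is left untouched, so the value-added fields can always be told apart from what was actually present in the asset.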
HP 96
Guided Demo for Value Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled “Week 3 Top10 Anderson TD Run”) and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked “REAL”)
• View the source HTML page that has the original story and locate this story with the heading “Week 3 top 10 Anderson TD run”
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset’s existing metadata, so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player of the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled “I want out”) & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked “REAL”)
• View the source HTML page that has the original story and locate this story with the heading “I want out”
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset’s existing metadata, so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player of the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots (“Jamal Anderson”)
Search for ‘Jamal Anderson’ in ‘Football’
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee’s value-additions to the metadata to facilitate easier search
HP 99
Example 2 - Snapshots (“Gary Sheffield”)
Search for ‘Gary Sheffield’ in ‘Baseball’
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee’s value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Producer Name: Name of content provider that produced the asset
Posted Date: Date of asset posting - Extracted automatically
Player Names: Names of players mentioned explicitly in the asset - Extracted automatically
Team Name: Name of team for which player plays - Not mentioned explicitly in asset - Value-added using Taalee’s semantic relationships
League Name: Name of league to which the player’s team belongs - Not mentioned explicitly in asset - Value-added by Taalee’s processing based on semantic associations
Sport: Name of sport
Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships
The asset is richly and fully described in the many ways users choose to interact with it
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context, or relationships of keywords
For example, a search engine may see that the word “Commerce One” occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee’s Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company ‘Commerce One’
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One’s competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata-centric Content Management Architecture
Sources: Internal Source 1 (Research), Internal Source 2, External feeds/Web (e.g., Reuters)
Extractor Agents 1, 2, 3 produce XCM-compliant metadata (XML or other format)
Components: Semantic Engine, World Model, Taalee Metabase, Semantic Application (ASP/Enterprise hosted), Third-party Content Mgmt and Syndication
1 Cisco story from PW Source 1 passed on to add semantic associations
2 Consults KnowledgeBase for Cisco’s competition
3 Returns result: Lucent is a competitor of Cisco
4 Lucent story from external feeds picked for publishing as “semantically related” to Cisco story; passed on to Dashboard
Story on Cisco; Story on Lucent
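The four numbered steps on this slide (story arrives, knowledge base is consulted for competitors, a competitor is returned, a related story is picked from another feed) can be sketched as follows. The `Story` class, the `COMPETES_WITH` table, and `related_stories` are hypothetical names used only for illustration; the sample stories are made up, and this is not the Taalee Semantic Engine’s API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Story:
    source: str        # e.g. an internal source or an external feed
    headline: str
    companies: tuple   # companies the extractor agents tagged in the story


# Hypothetical knowledge-base fragment: company -> competitors.
COMPETES_WITH = {
    "Cisco": {"Lucent"},
    "Commerce One": {"Ariba"},
}


def related_stories(story, candidates, competes_with=COMPETES_WITH):
    """Pick candidate stories about competitors of the companies in `story`."""
    rivals = set()
    for company in story.companies:                    # step 2: consult the knowledge base
        rivals |= competes_with.get(company, set())    # step 3: competitors returned
    # Step 4: keep stories that mention at least one competitor.
    return [c for c in candidates
            if c != story and rivals.intersection(c.companies)]


cisco = Story("Internal Source 1", "Story on Cisco", ("Cisco",))
feed = [
    Story("External feed", "Story on Lucent", ("Lucent",)),
    Story("External feed", "Story on IBM", ("IBM",)),
]
for hit in related_stories(cisco, feed):
    print(hit.headline)
```

Only the Lucent story is selected, although neither its text nor the user’s request needs to mention Cisco.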
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News: COMPANIES in Same or Related INDUSTRY
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS
SEC, EPA Regulations: Impacting INDUSTRY or Filed By COMPANY
Technology, Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for future
Semantic Web Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling is obviously not enough:
we need a “logic layer” on top of RDF
some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data
RDF + DL = “Frame System for WWW”
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
“A Web in which machine reasoning will be ubiquitous and devastatingly powerful.” [Berners-Lee]
“A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture.” [Berners-Lee]
“A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities.” [Berners-Lee]
A personal definition: Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)?
a proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF+XML
Agent readable Tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three layered Architecture Of Semantic Web
Logical Layer: Formal Semantics and Reasoning Support - OIL, DAML-O
Schema Layer: Definition of Vocabulary - RDF Schema
Data Layer: Simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL - Evolving towards Semantic Web
OIL Mission
OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Test May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake <=
  dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
  AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
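The rule above can be written as an executable predicate. A minimal sketch, assuming that dateDifference is measured in days, that distance is a great-circle distance in miles (the slide does not give units here), and that the earthquake must not precede the test; `Event`, `distance_miles`, and `may_have_caused` are illustrative names, not InfoQuilt functions, and the sample events are made up.

```python
import math
from dataclasses import dataclass
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, used by the haversine formula


@dataclass
class Event:
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two points, in miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """NuclearTest Causes Earthquake <= date difference < 30 AND distance < 10000."""
    days = (quake.event_date - test.event_date).days
    near = distance_miles(test.latitude, test.longitude,
                          quake.latitude, quake.longitude) < max_miles
    return 0 <= days < max_days and near


# Made-up events, for illustration only.
test_event = Event(date(1998, 5, 11), 0.0, 0.0)
quake_after = Event(date(1998, 5, 30), 0.0, 1.0)
quake_before = Event(date(1998, 5, 1), 0.0, 1.0)
print(may_have_caused(test_event, quake_after),
      may_have_caused(test_event, quake_before))
```

With miles as the unit, a 10000 threshold excludes only near-antipodal pairs, so the time window does most of the filtering in this sketch.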
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example…
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale per year starting from 1900, find average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
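The per-year, per-magnitude-range counts described above amount to a grouping query. A minimal sketch over made-up (year, magnitude) pairs; the bin edges follow the slide, except that the last bin is taken as 9 and above, which is an assumption, and `bin_label`, `counts_per_year`, and `average_per_year` are illustrative names.

```python
from collections import Counter

# Magnitude ranges from the slide; each bin is [low, high).
BINS = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, float("inf"))]


def bin_label(magnitude):
    """Return the label of the magnitude range, or None below 5.8."""
    for low, high in BINS:
        if low <= magnitude < high:
            return f">={low:g}" if high == float("inf") else f"{low:g}-{high:g}"
    return None


def counts_per_year(quakes, start_year=1900):
    """Count earthquakes per (year, magnitude range) from start_year onward."""
    counts = Counter()
    for year, magnitude in quakes:
        label = bin_label(magnitude)
        if year >= start_year and label is not None:
            counts[(year, label)] += 1
    return counts


def average_per_year(counts, label, first_year, last_year):
    """Average yearly count for one magnitude range over an inclusive year span."""
    total = sum(counts[(year, label)] for year in range(first_year, last_year + 1))
    return total / (last_year - first_year + 1)


# Made-up sample data: (year, magnitude).
quakes = [(1899, 7.5), (1950, 5.9), (1950, 6.5), (1950, 6.7), (1951, 6.1), (1951, 5.0)]
counts = counts_per_year(quakes)
print(counts[(1950, "6-7")], average_per_year(counts, "6-7", 1950, 1951))
```

Comparing such averages before and after a chosen year is the kind of evidence the slide uses to argue about which magnitude ranges changed.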
Knowledge Discovery - Example…
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and within a radius of 10000 miles of the test site
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/9910-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 13
RDF Example 2
[Figure: RDF graph]
URI:TALK has dc:title “Mysteries of Metadata” and dc:creator URI:AMIT
URI:AMIT has BIB:Name “Amit Sheth”, BIB:Email amit@taalee.com, and BIB:Aff URI:LIB
HP 14
RDFS (RDF Schema)
Enables resource description communities to define (and share) vocabularies (museum, library, e-commerce, …)
Vocabulary (in RDFS) = the meaning, characteristics, and relationships of a set of properties
HP 15
RDF Based Web
HTML Resources
RDF/XML Descriptions
RDF Schemas
Source: http://www.w3c.rl.ac.uk
HP 16
Dublin Core Metadata Initiative
Simple element set designed for resource description
International, inter-discipline, W3C community consensus
“Semantic” interface among resource description communities (very limited form of semantics)
Source: www.desire.org
HP 17
Dublin Core RDF
<xml>
<namespace href = "http://w3.org/rdf-schema" as = "RDF"/>
<namespace href = "http://metadata.net/DC" as = "DC"/>
<RDF:Abbreviated>
<RDF:Assertion RDF:HREF = "http://www.mysite.com/mydoc.html"
  DC:Title = "I've Never Metadata I've Never Liked"
  DC:Creator = "Mary Crystal"
  DC:Subject = "Metadata, Dublin Core, Stuff"/>
</RDF:Abbreviated>
</xml>
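The slide uses an early draft RDF syntax. For comparison, here is a sketch that generates the same three Dublin Core statements as RDF/XML with Python’s standard library. It assumes the namespace URIs of the later RDF and Dublin Core 1.1 element-set specifications rather than those on the slide, and `dublin_core_description` is an illustrative helper name.

```python
import xml.etree.ElementTree as ET

# Namespace URIs from the RDF and Dublin Core 1.1 specifications
# (these differ from the draft-era URIs shown on the slide).
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)


def dublin_core_description(about, **elements):
    """Build an rdf:RDF element describing `about` with Dublin Core elements."""
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": about})
    for name, value in elements.items():
        ET.SubElement(description, f"{{{DC_NS}}}{name}").text = value
    return root


doc = dublin_core_description(
    "http://www.mysite.com/mydoc.html",
    title="I've Never Metadata I've Never Liked",
    creator="Mary Crystal",
    subject="Metadata, Dublin Core, Stuff",
)
print(ET.tostring(doc, encoding="unicode"))
```

Each keyword argument becomes one property of the described resource, which mirrors the resource/property/value reading of the slide’s assertion.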
HP 18
MOF (Meta Object Facility) and XMI
MOF models metadata using a subset of UML that is relevant to modeling metadata (class models: classes, associations, and subtyping), plus a set of rules for mapping the elements of the MOF Core to CORBA IDL
XML Metadata Interchange (XMI) is an extension of the MOF into the XML space
HP 19
NewsML
NewsML is a packaging and metadata format for news content.
NewsML is developed by the International Press Telecommunications Council (IPTC), a consortium of news providers, mostly in the print or wire-service industries.
Since it deals only with packaging and metadata, NewsML is complementary both to news content formats like NITF and to syndication protocols like ICE.
HP 20
NewsML...
It can be used by news providers to combine their pictures, video, text, graphics, and audio files in news output available on web sites, mobile phones, high-end desktops, interactive television, and any other device.
Accurate, objective set of description tools which help qualify the information and make the search more precise.
NewsML allows a range of metadata to be attached to a multimedia story, including a detailed computer-readable description of what an item is about.
HP 21
Example of the end-to-end flow - NewsML
The content provider supplies NewsML-packaged media content to the operator. The content is categorized as current events, finance, sport, etc., and updated hourly.
The operator receives NewsML data from the content provider. The content server automatically pushes updated news articles to all news service subscribers.
Consumers sign up for the news service directly on the device. When using the news service, the user browses through the categories and reads the news articles. The news articles are presented in a continuous flow (one after the other) without end-user interaction.
Source: http://www.mediabricks.com
HP 22
PRISM
Publishing Requirements for Industry Standard Metadata
Version 1.0, April 2001
Authors: IDEAlliance (Adobe, Vignette, Kinecta, et al.)
Idea: "a standard for interoperable content description, interchange, and reuse in both traditional and electronic publishing contexts"
Web site: http://www.prismstandard.org
HP 23
PRISM Design
Built on existing standards like Dublin Core (DC), RDF, XML
Designed to be used in a simple, straightforward way over the Internet
Compatible with NewsML
Integrates easily with ICE (for syndication)
Vocabulary:
Basic: DC
Extensions: "Controlled Vocabularies", e.g., "North American Industry Classification System" (NAICS)
HP 24
PRISM Example
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845" />
    <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>
(Source: PRISM spec v. 1, http://www.prismstandard.org/techdev/prismspec1.asp)
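Because PRISM descriptions are ordinary RDF/XML using Dublin Core elements, a consumer can read them with any namespace-aware XML parser. The sketch below uses only Python's standard library; the function name `dublin_core_fields` is illustrative, and the embedded document is a shortened copy of the example above (the `prism` namespace declaration is omitted because no `prism:` element is used).

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}

PRISM_DOC = b"""<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845" />
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>"""


def dublin_core_fields(xml_bytes):
    """Map each described resource to a dict of its Dublin Core fields."""
    root = ET.fromstring(xml_bytes)
    rdf_about = "{%s}about" % NS["rdf"]
    rdf_resource = "{%s}resource" % NS["rdf"]
    dc_prefix = "{%s}" % NS["dc"]
    records = {}
    for desc in root.findall("rdf:Description", NS):
        fields = {}
        for child in desc:
            if child.tag.startswith(dc_prefix):
                name = child.tag[len(dc_prefix):]
                # A value is either literal text or a reference to another resource.
                text = (child.text or "").strip()
                fields[name] = text or child.get(rdf_resource)
        records[desc.get(rdf_about)] = fields
    return records


records = dublin_core_fields(PRISM_DOC)
corfu = records["http://wanderlust.com/2000/08/Corfu.jpg"]
print(corfu["title"], "|", corfu["format"])
```

This treats the file as plain XML rather than as an RDF graph, which is enough for flat descriptions like this one; nested or abbreviated RDF forms would need a real RDF parser.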
HP 25
VoiceXML
A language for specifying voice dialogs
Voice dialogs use audio prompts and text-to-speech (TTS) for output; touch-tone keys (DTMF) and automatic speech recognition (ASR) for input
Goal is to bring the advantages of web-based development and content delivery to interactive voice response applications
High-level, voice-specific language simplifies application development
Source: http://www.voicexml.org
HP 26
Voice Based Internet Applications
Source: http://www.voicexml.org
HP 27
VoiceXML Metadata
Voice-specific metadata
Supports syntactic interoperability
Text data to voice data
VoiceXML = XML + Voice Metadata
HP 28
VoiceXML - Possible Services
Information retrieval - News, sports, traffic, stock quotes
e-Transactions (e-commerce, e-tailing, etc.)
Financial: banking, stock trading
Catalog browsing (generally as an adjunct to paper)
Telephone services
Personal voice dialing, one-number "find-me" services
Intranet - Inventory, HR services, corporate portals
Unification - "My Whatever" personal portals, personal agents, unified messaging
Source: http://www.voicexml.org
HP 29
MPEG7
A set of description schemes and descriptors to describe the content of multimedia data
Provides a language to specify description schemes
A scheme for coding the description
HP 30
Application Examples for MPEG7
A few application examples are:
Digital libraries (image catalog, musical dictionary)
Multimedia directory services (e.g., yellow pages)
Broadcast media selection (radio channel, TV channel)
HP 31
Information and Content Exchange (ICE)
Main Goal: efficient and extensible content syndication protocol for the Internet, using XML syntax
Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.
Status: latest spec version 1.1, May 2000; submitted to W3C for review
Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, ...
Web Site: http://www.icestandard.org
HP 32
What is the ICE Protocol
Syndication protocol for communication between Syndicators and Subscribers
Metadata to define:
roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver
format and method of content exchange (e.g., sequenced packages, pull vs. push model)
HP 33
ICE Applications
ICE vocabulary + domain vocabulary = complete application
ICE establishes and manages the syndication, delivers data, and logs events => content-independent metadata
Industry-specific vocabulary defines the content => domain-specific metadata
Source: http://www.icestandard.org
HP 34
ICE Explained
ICE: Information and Content Exchange protocol
Syndicator: A content aggregator and distributor
Subscriber: A content consumer
Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement
Collection: The current content of a subscription
ICE Package: A delivery of commands to update a collection, such as the addition of content items
ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information
Sources: InternetWeek; ICE Cookbook version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm
<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe"
             timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321"
                subscription-element="4321" name="Cartoon" filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru" copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC ... (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>
Content (domain-specific metadata)
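The payload above mixes two layers: the ICE envelope (content-independent metadata such as item id and filename) and the embedded domain vocabulary (here a comic strip). A minimal sketch of how a subscriber might separate the two is shown below, using only the standard library; `split_metadata` is a hypothetical helper, and the sample payload is abbreviated from the slide with the DOCTYPE line omitted.

```python
import xml.etree.ElementTree as ET

PAYLOAD = """<ice-payload payload-id="ipl-80a56cfe"
    timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon"
                filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru"
                     copyright="Taalee Makeups" pubdate="20010515">(encoded image)</comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>"""


def split_metadata(payload_xml):
    """For each ice-item, separate ICE envelope attributes from domain elements."""
    root = ET.fromstring(payload_xml)
    items = []
    for item in root.iter("ice-item"):
        envelope = dict(item.attrib)  # content-independent: defined by ICE
        domain = [dict(child.attrib, element=child.tag) for child in item]  # defined by the industry vocabulary
        items.append({"envelope": envelope, "domain": domain})
    return items


items = split_metadata(PAYLOAD)
print(items[0]["envelope"]["filename"], "->", items[0]["domain"][0]["title"])
```

The split mirrors the slide's point: a generic ICE implementation only needs to understand the envelope, while the application interprets the domain element.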
HP 36
XCM (eXtended Content Management)
A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:
Content Development - Developing static content and managing the process of its subsequent approval, versioning, storage, and retrieval
Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle
Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability
Source: http://www.vignette.com/CDA/Site/0,2097,1-1-30-1458-1146-1743,00.html
HP 37
XCM
eXtended Content Management
[Figure: the three XCM segments and their components]
Content Development/Management: Content Authoring, Digital Asset Management, Software Configuration Management, Document Process Management
Application Content Management: Metadata Management, Recombination, Personalization
Content Delivery: Edge Network Delivery, Streaming Media Delivery, Caching
Source: http://www.vignette.com
HP 38
Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain
FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc/
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
...
UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Address Id: http://gisdasc.kgs.ukans.edu/dasc/
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
...
Kansas State
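The two records above carry the same values under different tag names. One common way to reconcile such models is to map each model's tags onto a shared vocabulary; the sketch below does this with plain dictionaries. The common field names (`keywords`, `title`, and so on) are invented for illustration and are not part of FGDC or UDK.

```python
# Tag names are taken from the slide; the common names on the right are hypothetical.
FGDC_TO_COMMON = {
    "Theme keywords": "keywords",
    "Title": "title",
    "Online linkage": "url",
    "Direct Spatial Reference Method": "spatial_method",
    "Horizontal Coordinate System Definition": "coordinate_system",
}

UDK_TO_COMMON = {
    "Search terms": "keywords",
    "Topic": "title",
    "Address Id": "url",
    "Measuring Techniques": "spatial_method",
    "Co-ordinate System": "coordinate_system",
}


def normalize(record, mapping):
    """Rename a record's tags to the common vocabulary, dropping unmapped tags."""
    return {mapping[tag]: value for tag, value in record.items() if tag in mapping}


fgdc_record = {
    "Theme keywords": ["digital line graph", "hydrography", "transportation"],
    "Title": "Dakota Aquifer",
    "Online linkage": "http://gisdasc.kgs.ukans.edu/dasc/",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}

udk_record = {
    "Search terms": ["digital line graph", "hydrography", "transportation"],
    "Topic": "Dakota Aquifer",
    "Address Id": "http://gisdasc.kgs.ukans.edu/dasc/",
    "Measuring Techniques": "Vector",
    "Co-ordinate System": "Universal Transverse Mercator",
}

same = normalize(fgdc_record, FGDC_TO_COMMON) == normalize(udk_record, UDK_TO_COMMON)
print(same)
```

A flat tag-to-tag mapping only works when the models agree on meaning and granularity; where they do not, the mapping needs the kind of domain model or ontology discussed later in the workshop.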
HP 39
Different views of Metadata
Domain Independent Specifications (RDF)
Frameworks/Infrastructures (XCM)
Metadata
Application Specific: ICE
Media Specific: MPEG7, VoiceXML
Domain Specific: NewsML, FGDC/UDK
HP 40
Creating and Serving Metadata to Power the Life-cycle of Content
Taalee Infrastructure Services / Taalee Content Applications
Where is the content? Whose is it?
Produce/Aggregate
Catalog/Index
What other content is it related to?
Integrate, Syndicate
What is the right content for this user?
Personalize
What is the best way to monetize this interaction?
Interactive Marketing
Broadcast, Wireline, Wireless, Interactive TV
Taalee Semantic MetaBase
HP 41
Taalee's Intelligent Content Process
HP 42
Metadata Creation and Semanticization
• Automatic Content Classification/Categorization
• Metadata Creation/Extraction: types of metadata created
Semantic Engine and WorldModel are trademarks of Taalee Inc. Metadata Extraction is a patented technology of Taalee Inc.
HP 43
Forms/Types/Ingest of Content
Sources: Web Sites, Content Feeds, and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)
HP 44
Content Handling/Ingest
Infrastructure/Exchange
Feed Handlers
Crawlers/Screen Scrapers
Bots/Software Agents
Centralized, Distributed, Mobile/Migratory
HP 45
Information Extraction for Metadata Creation
[Figure: METADATA EXTRACTORS operating over Global/Enterprise/Web Repositories. Source labels: Digital Maps; Nexis, UPI, AP; Documents; Digital Audios; Data Stores; Digital Videos; Digital Images]
HP 46
Extracting a Text Document: Syntactic approach
INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT
NATIONAL PREPAREDNESS LEVEL II
CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.
SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.
CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.
LAYOUT
Date => day month int ',' int
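The layout rule above (Date => day month int ',' int) is a small grammar over the surface form of the text, which is what makes the approach syntactic. One way to sketch it is as a regular expression; the pattern and the function name `extract_dates` below are illustrative assumptions, not the extractor used in the original system.

```python
import re

DAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"
MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")

# day month int ',' int, allowing an optional comma after the day name
DATE_RULE = re.compile(
    rf"\b(?P<day>{DAYS}),?\s+(?P<month>{MONTHS})\s+(?P<dom>\d{{1,2}})\s*,\s*(?P<year>\d{{4}})\b"
)


def extract_dates(text):
    """Return one dict per date matched by the layout rule."""
    return [m.groupdict() for m in DATE_RULE.finditer(text)]


report = (
    "INCIDENT MANAGEMENT SITUATION REPORT\n"
    "Friday, August 1, 1997 - 0530 MDT\n"
    "NATIONAL PREPAREDNESS LEVEL II\n"
)
dates = extract_dates(report)
print(dates)
```

The rule knows nothing about what a date means; it only recognizes the layout. Text whose punctuation has been lost (for example "August 1 1997") would not match, which illustrates how brittle purely syntactic extraction can be.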
Traditional Text Categorization
[Figure labels: Customer Article Feed (Article 4715); Customer Training Set; Statistical/AI Techniques; Classify/Place in a taxonomy; Classification of Article 4715; Routing/Distribution]
Standard Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Taalee's Categorization & Automatic Metadata Creation
[Figure labels: Article Feed (Article 4715); Taalee Training Set; Customer Training Set; Knowledge-base & Statistical/AI Techniques; Classify/Place in a taxonomy; Map to another taxonomy; Metadata/Catalog; Content Manager; Precise syndication/filtering; Routing/Distribution; Automated Content Enrichment (ACE); Taalee Enterprise Customization Suite]
Article 4715 Metadata
Standard metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Semantic metadata
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News
Classification of Article 4715
FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
NYSE: Member Companies, Market News, IPOs
HP 49
Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.
[Figure labels: Video Segment with Associated Text; Segment Description; Semantic Metadata; Auto Categorization]
HP 50
Automatic Categorization & Metadata Tagging (Web page)
[Figure labels: Video with Editorialized Text on the Web; Auto Categorization; Semantic Metadata]
HP 51
Automatic Categorization & Metadata Tagging (Feed)
[Figure labels: Text from Bloomberg; Auto Categorization; Semantic Metadata]
HP 52
Taalee Extraction and Knowledgebase Enhancement
Extraction Agent
Web Page Enhanced Metadata Asset
HP 53
Basis for Semantics
A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary
B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)
Knowledge Base
HP 54
Basis for Semantics
C. Reasoning/Inference
(Statistical) (Information Retrieval)
Statistical Learning/AI (Bayesian, Neural Networks, HMM, ...)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methods: Cluster Analysis, Learning/AI, and Collaborative Filtering
Reference data: Concept-terms, Dictionary, Thesaurus; by topic/industry/subject/domain; word or phrase
Ontologies/Domain Models, KnowledgeBase: by Entities and Relationships
(toward deeper understanding)
HP 56
Open Directory Project (ODP): Classification/Taxonomy & Directory
HP 57
Ontology
Standardize meaning, description, and representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes: Attributes, Domain Rules, Functional Dependencies
HP 59
An Ontology
Example: Interrelated ontologies
[Figure: four interrelated ontologies. Node labels as they appear, grouped by root concept]
LANDUSE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL
LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE
WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; shredding, magnetic separation, screening, washing
NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI
Relationship label shown on four arcs among the natural disasters: causes
HP 61
Large Vocabularies, Taxonomies, Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary, used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.
Confidential HP
Metadata-enabled Applications
HP 63
Metadata Usage: Impact on Search & Query Processing
traditional queries based on keywords
attribute-based queries
content-based queries
HP 64
Oingo.com
Oingo Ontology - (ODP based?) the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning, used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely-related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged, and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:
• Context of the Content
• Semantic Metadata describing entities (e.g., Company, Industry, etc.), and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
"Normal" Content can only be "found" if the user enters a keyword that exists within it
[Figure: equation-style graphic, Intelligent Content = ... + ...]
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:
Context of the Content
Semantic Metadata describing entities (e.g., Company, Industry, etc.), and
Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, ...
HP 70
Metadata Usage Keyword Attribute and Content Based Access
The VisualHarness system at LSDIS/UGA
HP 71
Keyword Search vs. Attribute Search with Semantic Metadata
Virage search on "football touchdown" (metadata from typical cataloging of football assets):
Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline
Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline
Taalee Metadata on Football Assets (Rich Media Reference Page):
Baltimore 31, Pit 24
http://www.nfl.com
Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
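The contrast on this slide can be made concrete with a small sketch. Keyword search matches words in the free text, while attribute search matches typed metadata fields. The asset records below are transcribed from the slide; the two search functions are simplified stand-ins written for illustration, not the Virage or Taalee implementations.

```python
assets = [
    {
        "title": "Jimmy Smith Interview Part Seven",
        "text": "Jimmy Smith explains his philosophy on showboating",
        "metadata": {},
    },
    {
        "title": "Brian Griese Interview Part Four",
        "text": "Brian Griese talks about the first touchdown he ever threw",
        "metadata": {},
    },
    {
        "title": "Baltimore 31, Pit 24",
        "text": "Quandry Ismail and Tony Banks hook up for their third long touchdown",
        "metadata": {
            "League": "Professional",
            "Teams": ["Ravens", "Steelers"],
            "Players": ["Quandry Ismail", "Tony Banks"],
            "Event": "Touchdown",
        },
    },
]


def keyword_search(assets, query):
    """Return titles of assets whose title or text contains any query word."""
    words = query.lower().split()
    hits = []
    for asset in assets:
        haystack = (asset["title"] + " " + asset["text"]).lower()
        if any(word in haystack for word in words):
            hits.append(asset["title"])
    return hits


def attribute_search(assets, **criteria):
    """Return titles of assets whose metadata satisfies every field=value criterion."""
    hits = []
    for asset in assets:
        metadata = asset["metadata"]
        matched = True
        for field, wanted in criteria.items():
            value = metadata.get(field)
            values = value if isinstance(value, list) else [value]
            if wanted not in values:
                matched = False
                break
        if matched:
            hits.append(asset["title"])
    return hits


by_keyword = keyword_search(assets, "football touchdown")
by_attribute = attribute_search(assets, Event="Touchdown", Teams="Ravens")
print(by_keyword)
print(by_attribute)
```

The keyword query returns an interview that merely mentions a touchdown, while the attribute query returns only the asset whose event is a touchdown involving the Ravens.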
HP 72
Taalee's Semantic Search
Highly customizable, precise, and freshest A/V search
Context and Domain Specific Attributes
Uniform Metadata for Content from Multiple Sources; can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do?
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY & CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content
Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries & Hot Topics
PERSONALIZATION
Personalized Queries
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
more...
2 My Football Fantasy Team
Gators' Spurrier ready for big game
Tech's Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone...
Edwards explains reasons for leaving BYU
more...
3 Julia Roberts Collection
Movie Trailer: Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer: Stepmom
Conspiracy Theory
more...
4 Pink Floyd Collection
Set the Controls for the Heart of the Sun...
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
more...
HOT Topics
1 Election 2000
Video: Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
more...
2 Middle East Peace Conflict
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Cole's security
more...
3 Napster Controversy
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II
more...
HP 80
Metadata Targeting
Semantic/Interactive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web: Extreme Personalization
[Figure labels: Realtime Feeds; Web sites and Pages; Content Databases; Time-Shifted Content Aggregator; Semantic Engine(TM); Structured Hi-Quality Semantic Metabase; Interests, Preferences; Content; Personalized Content]
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site
User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and the knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic, personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions listen set alert on story add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst
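The idea of choosing "just the right metadata" for a small screen can be sketched as a per-content-type priority list that is cut off when the line would exceed the display width. Everything below is a hypothetical illustration: the priority table, the function name `render_listing`, the character budget, and the year in the dates are assumptions; only the three analyst-call entries and the "Strong Buy" note come from the slide.

```python
from datetime import date

# Hypothetical priorities: which metadata fields matter most for each content type.
FIELD_PRIORITY = {
    "Analyst Call": ["date", "source", "analyst"],
}


def render_listing(items, content_type, max_chars):
    """Return one line per item, newest first, keeping as many priority fields as fit."""
    lines = []
    for item in sorted(items, key=lambda i: i["date"], reverse=True):
        parts = []
        for field in FIELD_PRIORITY[content_type]:
            value = item[field]
            shown = value.strftime("%m/%d") if field == "date" else str(value)
            if len(" ".join(parts + [shown])) > max_chars:
                break  # lower-priority fields are dropped on narrow screens
            parts.append(shown)
        lines.append(" ".join(parts))
    return lines


# The year is assumed; the slide shows only month and day.
calls = [
    {"date": date(2000, 11, 6), "source": "CBS", "analyst": "Langlesis"},
    {"date": date(2000, 11, 8), "source": "ON24", "analyst": "Payne"},
    {"date": date(2000, 11, 7), "source": "ON24", "analyst": "H&Q", "rating": "Strong Buy"},
]

wide = render_listing(calls, "Analyst Call", max_chars=20)
narrow = render_listing(calls, "Analyst Call", max_chars=10)
print(wide)
print(narrow)
```

Extra metadata such as the rating stays attached to the item even when it is not printed, so a device could surface it another way, for instance as the icon the slide mentions.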
HP 87
iTV: Taalee's Extreme Personalization
[Figure labels: Content Provider (DBS, DISH, Wink, AOL-TV); Content "Programs"; Semantic Engine(TM); Meta-Data Tagged Content; Structured Hi-Quality Semantic Metabase; Immediate Interests; Preferences; Personalized Content Capsules; Redirects and Programming]
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in
This screen is customizable with an interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO
Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata
The Conference Call itself can have embedded metadata to support personalization and interactivity
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
[Figure labels: Production Support (Sony); Categorize; Catalog; Integrate; Collection; Processing; Network Content; Affiliate Feeds; Public Sources; Rich Data; Metabase]
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were
-- Breaking News for 11/30/2000 --
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
Customize Page: Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore
HP 92
Metadata's role in emerging iTV infrastructure
[Figure labels: Video; MPEG Encoder; MPEG-2/4/7; Create Scene Description Tree; Scene Description Tree; Node = AVO Object; Taalee Semantic Engine; "Cisco Systems" Node; Enhanced XML Description; Object Content Information (OCI); Metadata-rich Value-added Node; Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; GREAT USER EXPERIENCE]
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers, and Broadcasters
Example node metadata:
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta
HP 93
Intelligent Metadata Creation
Content which does contain the words the user asked for
Extractor Agents
Content which does not contain the words the user asked for, but is about what he asked for
Value-added Metadata
Content the user did not think to ask for, but which he needs to know
Semantic Associations
+ +
Metadata for Intelligent Content
Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example: If a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"
Understanding of the content is needed to create new metadata
Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
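The enrichment step described above can be sketched as a lookup in a small knowledge base of semantic associations. The code below is a toy illustration under stated assumptions: the relation names (`plays_for`, `member_of`), the dictionary layout, and the story title are invented, and a real knowledge base would be far larger and maintained separately. The facts themselves are the ones given in this workshop.

```python
# Toy knowledge base: (entity, relation) -> related entity.
KNOWLEDGE_BASE = {
    ("Roger Clemens", "plays_for"): "New York Yankees",
    ("Jamal Anderson", "plays_for"): "Atlanta Falcons",
    ("Atlanta Falcons", "member_of"): "NFL",
}


def enrich(metadata):
    """Return a copy of the metadata with team and league added via associations."""
    enriched = {field: list(values) for field, values in metadata.items()}
    for player in metadata.get("Players", []):
        team = KNOWLEDGE_BASE.get((player, "plays_for"))
        if team is None:
            continue
        if team not in enriched.setdefault("Teams", []):
            enriched["Teams"].append(team)
        league = KNOWLEDGE_BASE.get((team, "member_of"))
        if league is not None and league not in enriched.setdefault("League", []):
            enriched["League"].append(league)
    return enriched


def search(assets, field, value):
    """Return titles of assets whose metadata field contains the value."""
    return [a["title"] for a in assets if value in a["metadata"].get(field, [])]


# Hypothetical asset: the extracted metadata mentions only the player.
story = {"title": "Clemens story", "metadata": {"Players": ["Roger Clemens"]}}

before = search([story], "Teams", "New York Yankees")
story_enriched = {"title": story["title"], "metadata": enrich(story["metadata"])}
after = search([story_enriched], "Teams", "New York Yankees")
print(before, after)
```

Before enrichment the team search finds nothing; afterwards the same story is returned, even though the team name never appeared in the content.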
HP 96
Guided Demo for Value Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 – Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included
in the metadata
View the original source HTML page. Verify that
the source page contains no mention of Team name and League name. They
were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 – Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included
in the metadata
View the original source HTML page. Verify that
the source page contains no mention of Team name and League name. They
were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content – Value-Added Metadata
Rich Media Sports Asset
Producer Name: Name of content provider that produced the asset
Posted Date: Date of asset posting – extracted automatically
Player Names: Names of players mentioned explicitly in the asset – extracted automatically
Sport: Name of sport
Team Name: Name of team for which player plays – not mentioned explicitly in asset – value-added using Taalee's semantic relationships
League Name: Name of league to which the player's team belongs – not mentioned explicitly in asset – value-added by Taalee's processing based on semantic associations
Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added
by Taalee using its semantic relationships.
The asset is richly and fully described in the many ways that users choose to interact.
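The value-added step described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionaries `PLAYER_TEAM` and `TEAM_LEAGUE` and the function `add_value_added_metadata` are hypothetical names, filled with the facts from the two guided demos, and do not represent Taalee's actual knowledge base or engine.

```python
# Hypothetical mini knowledge base holding only the facts shown in the guided demos.
PLAYER_TEAM = {
    "Jamal Anderson": "Atlanta Falcons",
    "Gary Sheffield": "Los Angeles Dodgers",
}
TEAM_LEAGUE = {
    "Atlanta Falcons": "NFL",
    "Los Angeles Dodgers": "National League",
}


def add_value_added_metadata(extracted):
    """Return a copy of the extracted metadata with Teams and League inferred.

    Players -> Teams -> League follows the "X -> Y" legend: X is used to add Y.
    """
    enriched = dict(extracted)
    teams = []
    for player in extracted.get("Players", []):
        team = PLAYER_TEAM.get(player)
        if team and team not in teams:
            teams.append(team)
    leagues = []
    for team in teams:
        league = TEAM_LEAGUE.get(team)
        if league and league not in leagues:
            leagues.append(league)
    # Only add fields that the asset did not already carry explicitly.
    if teams:
        enriched.setdefault("Teams", teams)
    if leagues:
        enriched.setdefault("League", leagues)
    return enriched


# Metadata extracted directly from the asset (no team or league mentioned).
asset = {
    "Produced by": "NFL.com",
    "Posted Date": "9/20/2000",
    "Players": ["Jamal Anderson"],
}
enriched = add_value_added_metadata(asset)
print(enriched["Teams"], enriched["League"])
```

With the enriched record, a search on Teams = Atlanta Falcons can return the story even though the source page never names the team.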
HP 101
Intelligent Content via
Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content.
• They do not understand the meaning, context, or relationships of keywords.
For example, a search engine may see that the word "Commerce One" occurs,
but it does not know that Commerce One is a COMPANY which Participates in
the Corporate, Professional & Financial Software INDUSTRY and COMPETESWITH Ariba.
As a result, search engines cannot go beyond returning a list (or directory view)
of what the user has asked for. Their ability to provide associated information is
extremely limited, static, and difficult to scale. Taalee's Semantic Content Model
goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against
Commerce One
Links to news on companies Commerce One competes
against (to view news on Ariba, click
on the link for Ariba)
Crucial news on Commerce One's
competitors (Ariba) can be accessed easily and
automatically
HP 104
Metadata-centric Content Management Architecture
Sources: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g. Reuters)
1 Cisco story from PW Source 1 passed on to add semantic associations
2 Consults KnowledgeBase for Cisco's competition
3 Returns result: Lucent is a competitor of Cisco
4 Lucent story from external feeds picked for publishing as "semantically related" to Cisco story – passed on to Dashboard
Output: Story on Cisco; Story on Lucent
Components: Extractor Agent 1; Extractor Agent 2; Extractor Agent 3; Semantic Engine; World Model; Taalee Metabase
XCM-compliant metadata, XML or other format
Semantic Application: ASP/Enterprise hosted
Third-party Content Mgmt and Syndication
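The four numbered steps of this flow (story arrives, knowledge base is consulted for competitors, a competitor is returned, a related story is picked from the external feed) can be sketched as follows. The relation set `COMPETES_WITH`, the story dictionaries, and both function names are hypothetical; treating the competes-with relationship as symmetric is also an assumption made for brevity.

```python
# Hypothetical knowledge-base fragment: pairs of companies that compete.
COMPETES_WITH = {("Cisco", "Lucent"), ("Commerce One", "Ariba")}


def competitors(company):
    """Steps 2-3: consult the knowledge base and return the company's competitors."""
    found = set()
    for a, b in COMPETES_WITH:
        if a == company:
            found.add(b)
        elif b == company:
            found.add(a)
    return found


def related_stories(story, feed):
    """Step 4: pick feed stories about competitors of the companies in `story`."""
    rivals = set()
    for company in story["companies"]:
        rivals |= competitors(company)
    return [s for s in feed if s is not story and rivals & set(s["companies"])]


# Step 1: a story about Cisco arrives from an internal source.
cisco_story = {"title": "Story on Cisco", "companies": ["Cisco"]}
external_feed = [
    {"title": "Story on Lucent", "companies": ["Lucent"]},
    {"title": "Story on IBM", "companies": ["IBM"]},
]
picked = related_stories(cisco_story, external_feed)
print([s["title"] for s in picked])
```

The Lucent story is selected without the user ever asking for Lucent, which is the "semantically related" behaviour the slide describes.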
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or
Related INDUSTRY
SEC, EPA
Regulations: Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search/Personalization etc.
A vision for future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g. InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF – one proposal (cf. Ora Lassila)
Structural modeling obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture" [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]
A personal definition
Semantic Web: The concept that Web-accessible
content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-layered Architecture of the Semantic Web
Logical Layer
Formal Semantics and Reasoning Support – OIL, DAML-O
Schema Layer
Definition of Vocabulary – RDF Schema
Data Layer
Simple data model and syntax for metadata – RDF
OIL – as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL – Evolving towards Semantic Web
OIL Mission
OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery – Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Test May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake <= dateDifference( NuclearTest.eventDate,
Earthquake.eventDate ) < 30 AND distance( NuclearTest.latitude,
NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
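The rule above can be written as an executable predicate. The sketch below makes several assumptions that are not on the slide: `dateDifference` is taken to be a day count, `distance` is taken to be great-circle distance in miles computed with the haversine formula, and the condition that the earthquake comes after the test (stated in prose but not in the rule) is added as `0 <=`. The event dictionaries and sample coordinates are hypothetical.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; assumed unit is miles


def distance(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in miles between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))


def date_difference(test_date, quake_date):
    """Days from the nuclear test to the earthquake (negative if the quake came first)."""
    return (quake_date - test_date).days


def could_have_caused(test, quake, max_days=30, max_miles=10000):
    """NuclearTest Causes Earthquake, using the slide's thresholds as defaults."""
    days = date_difference(test["eventDate"], quake["eventDate"])
    miles = distance(test["latitude"], test["longitude"],
                     quake["latitude"], quake["longitude"])
    return 0 <= days < max_days and miles < max_miles


# Hypothetical events, for illustration only.
test_event = {"eventDate": date(1970, 1, 1), "latitude": 37.0, "longitude": -116.0}
quake_soon = {"eventDate": date(1970, 1, 10), "latitude": 36.0, "longitude": -117.0}
quake_late = {"eventDate": date(1970, 3, 1), "latitude": 36.0, "longitude": -117.0}
print(could_have_caused(test_event, quake_soon), could_have_caused(test_event, quake_late))
```

Keeping the thresholds as parameters makes it easy to tighten "nearby region" when exploring the data.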
Knowledge Discovery – Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery – Example…
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale per year, starting from 1900, find the average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
Knowledge Discovery – Example…
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and within a radius of 10000 miles from the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NewsML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
Semantic Web: www.semanticweb.org
VoiceXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 14
RDFS (RDF Schema)
Enables resource description communities to define
(and share) vocabularies (museum, library, e-commerce, …)
Vocabulary (in RDFS) = the meaning, characteristics,
and relationships of a set of properties
HP 15
RDF Based Web
HTML
Resources
RDF/XML Descriptions
RDF Schemas
Source: http://www.w3c.rl.ac.uk
HP 16
Dublin Core Metadata Initiative
Simple element set designed for resource description
International, inter-discipline, W3C community consensus
"Semantic" interface among resource description communities (very limited form of semantics)
Source: www.desire.org
HP 17
Dublin Core RDF
<xml>
<?namespace href = "http://w3.org/rdf-schema" as = "RDF"?>
<?namespace href = "http://metadata.net/DC" as = "DC"?>
<RDF:Abbreviated>
  <RDF:Assertion RDF:HREF = "http://www.mysite.com/mydoc.html"
    DC:Title = "I've Never Metadata I've Never Liked"
    DC:Creator = "Mary Crystal"
    DC:Subject = "Metadata, Dublin Core, Stuff"/>
</RDF:Abbreviated>
</xml>
HP 18
MOF (Meta Object Facility) and XMI
MOF models metadata using a subset of UML that is
relevant to modeling metadata (class models: classes,
associations, and subtyping), plus a set of rules for mapping
the elements of the MOF Core to CORBA IDL
XML Metadata Interchange (XMI) is an extension of the
MOF into the XML space
HP 19
NewsML
NewsML is a packaging and metadata format for news content.
NewsML is developed by the International Press Telecommunications Council (IPTC), a consortium of news providers, mostly in the print or wire-service industries.
Since it deals only with packaging and metadata, NewsML is complementary both to news content formats like NITF and to syndication protocols like ICE.
HP 20
NewsML…
It can be used by news providers to combine their pictures, video, text, graphics, and audio files in news output available on web sites, mobile phones, high-end desktops, interactive television, and any other device.
It offers an accurate, objective set of description tools which help qualify the information and make the search more precise.
NewsML allows a range of metadata to be attached to a multi-media story, including a detailed computer-readable description of what an item is about.
HP 21
Example of the end-to-end flow – NewsML
The content provider supplies NewsML-packaged media content to the operator. The content is categorized as current events, finance, sport, etc., and updated hourly.
The operator receives NewsML data from the content provider. The content server automatically pushes updated news articles to all news service subscribers.
Consumers sign up for the news service directly on the device. When using the news service, the user browses through the categories and reads the news articles. The news articles are presented in a continuous flow (one after the other) without end-user interaction.
Source: http://www.mediabricks.com
HP 22
PRISM
Publishing Requirements for Industry Standard Metadata
Version 1.0, April 2001
Authors: IDEAlliance (Adobe, Vignette, Kinecta, et al.)
Idea: "a standard for interoperable content description, interchange, and reuse in both traditional and electronic publishing contexts"
Web site: http://www.prismstandard.org
HP 23
PRISM Design
Built on existing standards like Dublin Core (DC), RDF, XML
Designed to be used in a simple, straightforward way over the Internet
Compatible with NewsML
Integrates easily with ICE (for syndication)
Vocabulary
Basic: DC
Extensions: "Controlled Vocabularies", e.g. "North American Industrial Classification System" (NAICS)
HP 24
PRISM Example
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845" />
    <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>
(Source: PRISM spec v. 1, http://www.prismstandard.org/techdev/prismspec1.asp)
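To show how an application might consume such a record, here is a minimal Python sketch that reads the Dublin Core fields of the PRISM example with the standard library's `xml.etree.ElementTree`. The record text embedded in `RECORD` is a reconstruction of the slide's example with its markup punctuation restored, which is an assumption; the variable names are hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Reconstructed copy of the slide's PRISM record (punctuation restored by assumption).
RECORD = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845" />
    <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>"""

root = ET.fromstring(RECORD)
description = root.find(f"{{{RDF_NS}}}Description")
about = description.get(f"{{{RDF_NS}}}about")

# Collect each Dublin Core element: use its text, or its rdf:resource if it has no text.
fields = {}
for child in description:
    if child.tag.startswith(f"{{{DC_NS}}}"):
        name = child.tag.split("}", 1)[1]
        fields[name] = child.text or child.get(f"{{{RDF_NS}}}resource")

print(about)
print(fields["title"], "|", fields["creator"], "|", fields["format"])
```

Because PRISM builds on RDF and Dublin Core, a generic RDF/XML-aware reader like this needs no PRISM-specific code to recover the basic description.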
HP 25
VoiceXML
A language for specifying voice dialogs
Voice dialogs use audio prompts and text-to-speech
(TTS) for output; touch-tone keys (DTMF) and automatic speech recognition (ASR) for input
Goal is to bring the advantages of web-based development and content delivery to interactive voice response applications
High-level, voice-specific language simplifies application development
Source: http://www.voicexml.org
HP 26
Voice Based Internet Applications
Source httpwwwvoicexmlorg
HP 27
Voice XML Metadata
Voice Specific metadata
Supports syntactic interoperability
Text data to voice data
Voice XML = XML + Voice Metadata
HP 28
VoiceXML ndash Possible Services
Information retrieval – News, sports, traffic, stock quotes
e-Transactions (e-commerce, e-tailing, etc.)
Financial: banking, stock trading
Catalog browsing (generally as an adjunct to paper)
Telephone services
Personal voice dialing; one-number "find-me" services
Intranet – Inventory, HR services, corporate portals
Unification – "My Whatever" personal portals, personal agents, unified messaging
Source: http://www.voicexml.org
HP 29
MPEG7
A set of description schemes and descriptors to describe the content of multimedia data
Provides a language to specify description schemes
A scheme for coding the description
HP 30
Application Examples for MPEG7
A few application examples are
Digital libraries (image catalog musical dictionary)
Multimedia directory services (eg yellow pages)
Broadcast media selection (radio channel TV channel)
HP 31
Information and Content Exchange (ICE)
Main Goal: efficient and extensible Content Syndication protocol for the Internet, using XML syntax
Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.
Status: latest spec version 1.1, May 2000; submitted to W3C for review
Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, …
Web Site: http://www.icestandard.org
HP 32
What is the ICE Protocol
Syndication Protocol for communication between
Syndicators and Subscribers
Metadata to define
roles and responsibilities of involved parties Subscriber vs
Syndicator Requestor vs Responder Sender vs Receiver
format and method of content exchange (eg sequenced
packages pull vs push model)
HP 33
ICE Applications
ICE vocabulary + domain vocabulary = complete application
ICE
establishes and manages the syndication
delivers data
logs events
=> content-independent metadata
industry-specific vocabulary defines the content => domain-specific metadata
Source: http://www.icestandard.org
HP 34
ICE Explained
ICE: Information and Content Exchange protocol
Syndicator: A content aggregator and distributor
Subscriber: A content consumer
Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement
Collection: The current content of a subscription
ICE Package: A delivery of commands to update a collection, such as the addition of content items
ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information
Sources: InternetWeek; ICE Cookbook version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm
ltxml version=10gtltDOCTYPE ice-payload SYSTEM httpicedtdgtltice-payload payload-id=ipl-80a56cfe
timestamp=05-15-2001T110001 iceversion=10 gt
ltice-response response-id=irp-20010515181600gt ltice-item-group group-id= grp-8610gtltice-item item-id=4321
subscription-element=4321 name=Cartoon filename=demogif content-type=applicationxml gt
ltcomic-strip title=Looney City author=Amito Pateru copyright=Taalee Makeups pubdate=20010515gt
PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC hellip(ASCII-encoded image)
ltcomic-stripgtltice-itemgt ltice-item-groupgt
ltice-responsegt ltice-payloadgt
Content (domain-specific
metadata)
HP 36
XCM (eXtended Content Management)
a framework that allows customers to classify content management offerings according to the business problems they address The segments of XCM are
Content Development - Developing static content and managing the process of its subsequent approval versioning storage and retrieval
Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle
Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability
Source httpwwwvignettecomCDASite020971-1-30-1458-1146-174300html
HP 37
XCM
eXtended Content Management
Content DevelopmentManagement
Content DeliveryApplication ContentManagement
Content AuthoringDigital Asset Management
Software ConfigurationManagement
Document ProcessManagement
Metadata ManagementRecombinationPersonalization
Edge Network Delivery
Streaming Media DeliveryCaching
Source httpwwwvignettecom
HP 38
Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain
FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc/
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
… … …
UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Adress Id: http://gisdasc.kgs.ukans.edu/dasc/
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
… … …
Kansas State
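One simple way to bridge two models that name the same data differently is a tag crosswalk. The Python sketch below uses only the five tag pairs shown on this slide; the mapping table `FGDC_TO_UDK` and the two conversion functions are hypothetical names, and a real crosswalk would have to handle tags that do not map one-to-one.

```python
# Tag crosswalk built only from the five pairs shown on the slide.
FGDC_TO_UDK = {
    "Theme keywords": "Search terms",
    "Title": "Topic",
    "Online linkage": "Adress Id",
    "Direct Spatial Reference Method": "Measuring Techniques",
    "Horizontal Coordinate System Definition": "Co-ordinate System",
}
UDK_TO_FGDC = {udk: fgdc for fgdc, udk in FGDC_TO_UDK.items()}


def fgdc_to_udk(record):
    """Rename FGDC tags to their UDK equivalents; unknown tags pass through unchanged."""
    return {FGDC_TO_UDK.get(tag, tag): value for tag, value in record.items()}


def udk_to_fgdc(record):
    """Rename UDK tags to their FGDC equivalents; unknown tags pass through unchanged."""
    return {UDK_TO_FGDC.get(tag, tag): value for tag, value in record.items()}


fgdc_record = {
    "Theme keywords": ["digital line graph", "hydrography", "transportation"],
    "Title": "Dakota Aquifer",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}
udk_record = fgdc_to_udk(fgdc_record)
print(udk_record["Topic"], "|", udk_record["Measuring Techniques"])
```

A crosswalk like this resolves naming differences only; agreeing on what the values mean is the semantic problem the later slides address.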
HP 39
Different views of Metadata
Domain Independent Specifications (RDF)
Frameworks/Infrastructures (XCM)
Metadata
Application Specific: ICE
Media Specific: MPEG7, VoiceXML
Domain Specific: NewsML, FGDC/UDK
HP 40
Creating and Serving Metadata to Power the Life-cycle of Content
Taalee Infrastructure Services; Taalee Content Applications
Where is the content?
Whose is it?
Produce/Aggregate
Catalog/Index
What other content is it related to?
Integrate, Syndicate
What is the right content for this user?
Personalize
What is the best way to monetize this interaction?
Interactive Marketing
Broadcast, Wireline, Wireless, Interactive TV
Taalee Semantic MetaBase
HP 41
Taaleersquos Intelligent Content Process
HP 42
Metadata Creation and Semanticization
• Automatic Content Classification/Categorization
• Metadata Creation/Extraction; types of metadata created
Semantic Engine and WorldModel are trademarks of Taalee Inc.
Metadata Extraction is a patented technology of Taalee Inc.
HP 43
Forms/Types/Ingest of Content
Sources: Web Sites, Content Feeds, and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)
HP 44
Content Handling/Ingest
Infrastructure/Exchange
Feed Handlers
Crawlers/Screen Scrapers/Bots
Software Agents
Centralized, Distributed, Mobile/Migratory
HP 45
Information Extraction for Metadata Creation
Global/Enterprise/Web Repositories
METADATA
EXTRACTORS
Digital Maps
Nexis, UPI, AP
Documents
Digital Audios
Data Stores
Digital Videos
Digital Images
HP 46
Extracting a Text Document: Syntactic approach
INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT
NATIONAL PREPAREDNESS LEVEL II
CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.
SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.
CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. The fire is contained. Major areas of heat have been mopped-up. All crews and overhead will mop-up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.
LAYOUT
Date => day month int ',' int
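The layout rule on this slide (a Date is a day name, a month name, an integer, a comma, and an integer) maps naturally onto a regular expression. The Python sketch below is one possible reading of that rule, not the extractor used in the talk; the pattern `DATE_RE` and the function `extract_date` are hypothetical, and the commas are made optional because the extracted slide text has lost its punctuation.

```python
import re

DAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"
MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")

# Date => day month int ',' int   (commas treated as optional)
DATE_RE = re.compile(
    rf"\b(?P<day>{DAYS}),?\s+(?P<month>{MONTHS})\s+(?P<dom>\d{{1,2}}),?\s+(?P<year>\d{{4}})\b"
)


def extract_date(text):
    """Return the first date matching the layout rule as a dict, or None if absent."""
    match = DATE_RE.search(text)
    if match is None:
        return None
    return {
        "day": match["day"],
        "month": match["month"],
        "day_of_month": int(match["dom"]),
        "year": int(match["year"]),
    }


header = "INCIDENT MANAGEMENT SITUATION REPORT Friday, August 1, 1997 - 0530 MDT"
print(extract_date(header))
```

A purely syntactic rule like this finds the date string but knows nothing about what the report is about, which is the limitation the following slides on semantic metadata take up.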
Traditional Text Categorization
Customer Article Feed: Article 4715
Statistical/AI Techniques, with Customer Training Set
Classify/Place in a taxonomy
Routing/Distribution
Standard Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Classification of Article 4715

Taalee's Categorization & Automatic Metadata Creation
Article Feed: Article 4715
Knowledge-base & Statistical/AI Techniques, with Taalee Training Set and Customer Training Set
Classify/Place in a taxonomy
Metadata Catalog
Content Manager
Precise syndication/filtering
Routing/Distribution
Map to another taxonomy
Automated Content Enrichment (ACE)
Taalee Enterprise Customization Suite
Article 4715 Metadata
Standard metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Semantic metadata
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News
Classification of Article 4715
FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
NYSE: Member Companies, Market News, IPOs
HP 49
Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION AS OF TONIGHT THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49 IN WASHINGTON STATE THE SENATE RACE REMAINS TOO CLOSE TO CALL IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN IUMBENT THE SENATE WILL BE EVENLY DIVIDED IN MISSOURI REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN WHO DIED IN A CRASH THREE WEEKS AGO GOVERNOR CARNAHANS WIFE IS EXPECTED TO TAKE HIS PLACE IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT HILLARY CLINTON WON THE NEW YORK SENATE SEAT SHE IS THE FIRST FIRST LADY TO RUN MUCH LESS WIN
Video Segment with Associated Text
Segment Description
Semantic Metadata
Auto Categorization
HP 50
Automatic Categorization & Metadata Tagging (Web page)
Video with Editorialized Text on the Web
Auto Categorization
Semantic Metadata
HP 51
Automatic Categorization & Metadata Tagging (Feed)
Text from Bloomberg
Auto Categorization
Semantic Metadata
HP 52
Taalee Extraction and Knowledgebase Enhancement
Extraction Agent
Web Page Enhanced Metadata Asset
HP 53
Basis for Semantics
A. Facts/Concepts/Terms/Entities
Dictionary, Thesaurus, Reference Data, Vocabulary
B. Facts with Relationships
Taxonomy (Categories), Ontology, Domain Modeling (e.g. Golf = golfer, tournament name, golf course, event)
Knowledge Base
HP 54
Basis for Semantics
C. Reasoning/Inference
(Statistical) (Information Retrieval)
Statistical Learning/AI (Bayesian, Neural Networks, HMM, …)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methods/Cluster Analysis
Learning/AI and Collab. Filtering
Reference data/Concept-terms/Dictionary/Thesaurus: by topic/industry/subject/domain
Word or Phrase
Ontologies/Domain Models
KnowledgeBase: by Entities and Relationships
deeper understanding
HP 56
Open Directory Project (ODP): Classification/Taxonomy & Directory
HP 57
Ontology
Standardize meaning, description, representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes:
Attributes
Domain Rules
Functional Dependencies
HP 59
An Ontology
Example Interrelated ontologies
LANDUSE
COMERCIAL
INDUSTRIALRURAL
RESIDENTIAL
AGRICULTURAL
MILITARYRECREATIONAL
LAND(SITE)
CULTIVATEDAREA
GREENLANDAREA LAND
BANK
ZONING
LANDFILLSITE
WASTEDISPOSAL
RECYCLING
HAZARDOUS
LANDFILLRESOURCE REC
SOLID SEWAGE
shredding
magneticseparation
screening
washing
NATURALDISASTER
EARTHQUAKE
causes
LANDSLIDE
VOLCANO
STORMFLOOD
FIRE
AVALANCHE
TSUNAMI
causes
causes
causes
HP 61
Large Vocabularies, Taxonomies/Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.
Confidential HP
Metadata-enabled Applications
HP 63
Metadata Usage: Impact on Search & Query processing
traditional queries based on keywords
attribute-based queries
content-based queries
HP 64
Oingo.com
Oingo Ontology – ODP based(?): the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely-related, high-value information beyond what
was requested
Ability to explore any dimension around the immediate
point of interest
Intelligent content helps the user
"think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged, and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every
individual asset to generate a catalog containing:
• Context of the Content
• Semantic Metadata describing entities (i.e. Company, Industry, etc.), and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
"Normal" Content can only be "found" if the user enters a keyword that exists within it
Intelligent Content: adding related metadata and relationships dramatically increases the ability to
automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:
Context of the Content
Semantic Metadata describing entities (i.e. Company, Industry, etc.), and
Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g. Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …
HP 70
Metadata Usage Keyword Attribute and Content Based Access
The VisualHarness system at LSDIS/UGA
HP 71
Keyword Search vs Attribute Search with Semantic metadata
Virage Search on "football touchdown"
Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline…
Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline…
Metadata from Typical Cataloging of Football
Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31, Pit 24
http://www.nfl.com
Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
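The contrast on this slide between keyword search and attribute search over semantic metadata can be made concrete with a small Python sketch. The asset records below reuse the titles and field values from the slide, but the in-memory list `ASSETS`, the record layout, and both search functions are hypothetical illustrations rather than the Virage or Taalee implementations.

```python
ASSETS = [
    {"title": "Jimmy Smith Interview Part Seven",
     "description": "Jimmy Smith explains his philosophy on showboating",
     "metadata": {}},
    {"title": "Brian Griese Interview Part Four",
     "description": "Brian Griese talks about the first touchdown he ever threw",
     "metadata": {}},
    {"title": "Baltimore 31, Pit 24",
     "description": "Quandry Ismail and Tony Banks hook up for their third long touchdown",
     "metadata": {"League": "Professional",
                  "Teams": ["Ravens", "Steelers"],
                  "Players": ["Quandry Ismail", "Tony Banks"],
                  "Event": "Touchdown",
                  "Produced by": "NFL.com"}},
]


def keyword_search(assets, keyword):
    """Match assets whose title or description contains the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [a for a in assets
            if needle in a["title"].lower() or needle in a["description"].lower()]


def attribute_search(assets, criteria):
    """Match assets whose metadata satisfies every field/value pair in `criteria`."""
    def matches(asset):
        for field, wanted in criteria.items():
            value = asset["metadata"].get(field)
            values = value if isinstance(value, list) else [value]
            if wanted not in values:
                return False
        return True
    return [a for a in assets if matches(a)]


by_keyword = keyword_search(ASSETS, "touchdown")
by_attribute = attribute_search(ASSETS, {"Event": "Touchdown", "Teams": "Ravens"})
print([a["title"] for a in by_keyword])
print([a["title"] for a in by_attribute])
```

The keyword query also returns an interview that merely mentions a touchdown, while the attribute query returns only the asset whose Event field is a touchdown involving the Ravens.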
HP 72
Taaleersquos Semantic Search
Highly customizable, precise, and freshest A/V search
Context and Domain Specific Attributes. Uniform Metadata for Content from Multiple Sources. Can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY amp CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content: Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries amp Hot Topics
PERSONALIZATION
3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
4 Pink Floyd Collection
Personalized Queries
Set the Controls for the Heart of the Sunhellip
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
ATampT Corp
2 My Football Fantasy Team
Gators Spurrier ready for big game
Techs Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zonehellip
Edwards explains reasons for leaving BYU morehellip
morehellip
morehellip
morehellip
1 Election 2000
2 Middle East Peace Conflict
3 Napster Controversy
Video Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Coles security
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II morehellip
HOT Topics
morehellip
morehellip
HP 80
Metadata Targeting
SemanticInteractive Targeting
Buy Al Pacino VideosBuy Russell Crowe VideosBuy Christopher Plummer VideosBuy Diane Venora VideosBuy Philip Baker Hall VideosBuy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
Realtime Feeds
Interests Preferences
Time-ShiftedContent Aggregator
Web sites and Pages
ContentDatabases Personalized
Content
Semantic EngineTM
Personalized Content
Content
Structured Hi-Quality
Semantic Metabase
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
The user has already completed Web-based registration and personalization at Voquette's Enterprise Customer site.
The user's “Wireless Home page” shows the categories for his interests. There is an alert (new content) for his stock and sports categories.
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide).
Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, a search can be done by typing one word or selecting from a dynamic personalized menu.
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available.
The user clicks to see a listing of Analyst Calls on Cisco (next slide).
Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and the analyst name or company. The icons denote additional metadata, such as “Strong Buy” by the H&Q analyst.
HP 87
iTV: Taalee's Extreme Personalization
Content Provider
(DBS DISH Wink AOL-TV)
Semantic EngineTM
Meta-Data Tagged Content
Content “Programs”
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-Quality Semantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.
This screen is customizable with an interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO.
Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata.
The Conference Call itself can have embedded metadata to support personalization and interactivity.
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support
Sony
Categorize
Catalog
Integrate
Collection Processing
Network Content
Affiliate Feeds
Public Sources Rich Data
Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
“At least 60 people died in this needless fire,” senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla Cant Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) – Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore
HP 92
Retrieve Scene Description Track
Enhanced Digital Cable
Video
MPEGDecoder
Node = AVO Object
Create Scene Description Tree
GREAT USER EXPERIENCE
Metadata's role in emerging iTV infrastructure
MPEG-2/4/7
MPEG Encoder
Scene Description Tree
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers and Broadcasters
Enhanced XML Description
“Cisco Systems”
Node
Taalee Semantic Engine
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta
Object Content Information (OCI)
Metadata-rich Value-added Node
HP 93
Intelligent Metadata Creation
Content which does contain the words the user asked for
Extractor Agents
Content which does not contain the words the user asked for but is about what he asked for
Value-added Metadata
Content the user did not think to ask for but which he needs to know
Semantic Associations
+ +
Metadata for Intelligent Content Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the “right” keyword
For example, if a story is about “Roger Clemens” but does not contain the words “New York Yankees”, that story cannot and will not be found if the user searches for “New York Yankees” or “Yankees”.
Understanding of the content is needed to create new metadata.
Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses such Semantic Associations (another example: COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely.
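The enrichment step described on this slide can be sketched in a few lines of Python. This is only an illustrative sketch, not Taalee's implementation: the `KNOWLEDGE_BASE` dictionary, the metadata field names, and the `enrich` and `matches` functions are all hypothetical names introduced here.

```python
# Hypothetical knowledge base: entity name -> semantic associations.
KNOWLEDGE_BASE = {
    "Roger Clemens": {"type": "PERSON", "sport": "Baseball", "team": "New York Yankees"},
}


def enrich(metadata):
    """Return a copy of the metadata with value-added fields inferred from known people."""
    enriched = {field: list(values) for field, values in metadata.items()}
    for person in metadata.get("people", []):
        facts = KNOWLEDGE_BASE.get(person)
        if facts is None:
            continue  # unknown entity: nothing to add
        for field in ("sport", "team"):
            values = enriched.setdefault(field, [])
            if facts[field] not in values:
                values.append(facts[field])
    return enriched


def matches(metadata, query):
    """Case-insensitive substring match of the query against every metadata value."""
    needle = query.lower()
    return any(needle in value.lower() for values in metadata.values() for value in values)


# Metadata extracted syntactically: the story never mentions the team.
story = {"people": ["Roger Clemens"]}
enriched_story = enrich(story)
```

With only the extracted metadata, `matches(story, "Yankees")` is false; after enrichment, `matches(enriched_story, "Yankees")` is true, which is the behavior the slide attributes to value-added metadata.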
HP 96
Guided Demo for Value-Added Metadata – Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled “Week 3 Top10 Anderson TD Run”) and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL
Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked “REAL”)
• View the source HTML page that has the original story and locate this story with the heading “Week 3 top 10 Anderson TD run”
• Verify that neither Team = Atlanta Falcons nor League = NFL was present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value-Added Metadata – Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled “I want out”) & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League
Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked “REAL”)
• View the source HTML page that has the original story and locate this story with the heading “I want out”
• Verify that neither Team = Los Angeles Dodgers nor League = National League was present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 – Snapshots (“Jamal Anderson”)
Search for ‘Jamal Anderson’ in ‘Football’
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 – Snapshots (“Gary Sheffield”)
Search for ‘Gary Sheffield’ in ‘Baseball’
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content – Value-Added Metadata
Rich Media Sports Asset
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships. The asset is richly and fully described in the many ways users choose to interact with it.
Producer Name: name of the content provider that produced the asset
Posted Date: date of asset posting; extracted automatically
Player Names: names of players mentioned explicitly in the asset; extracted automatically
Team Name: name of the team for which the player plays; not mentioned explicitly in the asset; value-added using Taalee's semantic relationships
League Name: name of the league to which the player's team belongs; not mentioned explicitly in the asset; value-added by Taalee's processing based on semantic associations
Sport: name of the sport
Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context, or relationships of keywords
For example, a search engine may see that the phrase “Commerce One” occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
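A small Python sketch can illustrate how such associations let an application return related content the user did not ask for. It is a sketch under stated assumptions: the triple store `TRIPLES`, the relation names, the `NEWS` index, and both functions are hypothetical and are not taken from Taalee's Semantic Content Model.

```python
# Hypothetical semantic associations stored as (subject, relation, object) triples.
TRIPLES = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Commerce One", "participates_in", "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
    ("Ariba", "is_a", "COMPANY"),
]


def related(entity, relation):
    """Return every object linked to the entity by the given relation."""
    return [obj for subj, rel, obj in TRIPLES if subj == entity and rel == relation]


def associated_news(entity, news_index):
    """Return (stories asked for, stories about the entity's competitors)."""
    asked_for = list(news_index.get(entity, []))
    need_to_know = []
    for competitor in related(entity, "competes_with"):
        need_to_know.extend(news_index.get(competitor, []))
    return asked_for, need_to_know


# Hypothetical keyword index: company name -> story titles.
NEWS = {
    "Commerce One": ["Commerce One story"],
    "Ariba": ["Ariba story"],
}
asked_for, need_to_know = associated_news("Commerce One", NEWS)
```

A purely keyword-based lookup would return only `asked_for`; following the `competes_with` association also surfaces the Ariba story.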
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company ‘Commerce One’
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Internal Source 1 (Research)
Internal Source 2
External feeds/Web (e.g., Reuters)
1. Cisco story from PW Source 1 is passed on to add semantic associations
2. Consults the Knowledge Base for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds is picked for publishing as “semantically related” to the Cisco story and passed on to the Dashboard
Story on Lucent
Story on Cisco
XCM-compliant metadata, XML or other format
Semantic Application
ASP/Enterprise hosted
Extractor Agent 1
Extractor Agent 2
Extractor Agent 3
Metadata-centric Content Management Architecture
Semantic Engine
World Model
Taalee Metabase
Third-party Content Mgmt and Syndication
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF – one proposal (cf. Ora Lassila)
Structural modeling is obviously not enough: we need a “logic layer” on top of RDF; some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic rules which allow us to draw inferences from this data
RDF + DL = “Frame System for WWW”
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
“A Web in which machine reasoning will be ubiquitous and devastatingly powerful” [Berners-Lee]
“A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture” [Berners-Lee]
“A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities” [Berners-Lee]
A personal definition:
Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three layered Architecture Of Semantic Web
Logical Layer
Formal Semantics and Reasoning Support – OIL, DAML-O
Schema Layer
Definition of Vocabulary – RDF Schema
Data Layer
Simple data model and syntax for metadata – RDF
OIL – as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL ndash Evolving towards Semantic Web
OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources(USGS NEIC)
Nuclear Test Sources(Oklahoma Observatory etc)
Nuclear Test May Cause Earthquakes
Is it really true
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake <=
    dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30
    AND distance(NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude) < 10000
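The rule above can be sketched as an executable predicate in Python. This is an illustrative sketch, not the InfoQuilt implementation: the record layout, the function names, and the use of the haversine great-circle formula for `distance` are assumptions. The units (days and miles) follow the query wording later in this example, and requiring the earthquake to come on or after the test follows the prose statement of the rule.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, used by the haversine formula


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (latitude, longitude) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot near antipodal points.
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """Apply the slide's rule: quake within max_days after the test and within max_miles."""
    days = (quake["eventDate"] - test["eventDate"]).days
    if not 0 <= days < max_days:
        return False
    miles = distance_miles(test["latitude"], test["longitude"],
                           quake["latitude"], quake["longitude"])
    return miles < max_miles


# Made-up records, for illustration only.
nuclear_test = {"eventDate": date(1970, 3, 1), "latitude": 37.0, "longitude": -116.0}
nearby_quake = {"eventDate": date(1970, 3, 11), "latitude": 36.0, "longitude": -117.0}
earlier_quake = {"eventDate": date(1970, 2, 20), "latitude": 36.0, "longitude": -117.0}
late_quake = {"eventDate": date(1970, 4, 15), "latitude": 36.0, "longitude": -117.0}
far_quake = {"eventDate": date(1970, 3, 11), "latitude": -35.0, "longitude": 60.0}
```

Because half of the Earth's circumference is roughly 12,400 miles, the 10000-mile threshold excludes only earthquakes on nearly the opposite side of the globe, so in practice the date condition does most of the filtering.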
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example…
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale per year, starting from 1900, find the average number of earthquakes
Number of earthquakes with magnitude > 7 is almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
Knowledge Discovery - Example…
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and the test site lies within a radius of 10000 miles of the epicenter of the earthquake
Demo
ResourcesReferences
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NewsML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
Semantic Web: www.semanticweb.org
VoiceXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media. Amit Sheth and Wolfgang Klas, Eds. McGraw Hill, ISBN 0-07-057735-8, 1998.
HP 15
RDF Based Web
HTML
Resources
RDFXMLDescriptions
RDFSchemas
Source: http://www.w3c.rl.ac.uk
HP 16
Dublin Core Metadata Initiative
Simple element set designed for resource description
International, inter-discipline, W3C community consensus
“Semantic” interface among resource description communities (very limited form of semantics)
Source: www.desire.org
HP 17
Dublin Core RDF
<?xml:namespace href="http://w3.org/rdf-schema" as="RDF"?>
<?xml:namespace href="http://metadata.net/DC" as="DC"?>
<RDF:Abbreviated>
  <RDF:Assertion RDF:HREF="http://www.mysite.com/mydoc.html"
      DC:Title="I've Never Metadata I've Never Liked"
      DC:Creator="Mary Crystal"
      DC:Subject="Metadata, Dublin Core, Stuff"/>
</RDF:Abbreviated>
HP 18
MOF (Meta Object Facility) and XMI
MOF models metadata using a subset of UML that is relevant to modeling metadata (class models: classes, associations, and subtyping), together with a set of rules for mapping the elements of the MOF Core to CORBA IDL
XML Metadata Interchange (XMI) is an extension of the MOF into the XML space
HP 19
NewsML
NewsML is a packaging and metadata format for news content.
NewsML is developed by the International Press Telecommunications Council (IPTC), a consortium of news providers, mostly in the print or wire-service industries.
Since it deals only with packaging and metadata, NewsML is complementary both to news content formats like NITF and to syndication protocols like ICE.
HP 20
NewsMLhellip
It can be used by news providers to combine their pictures, video, text, graphics, and audio files in news output available on web sites, mobile phones, high-end desktops, interactive television, and any other device.
It provides an accurate, objective set of description tools which help qualify the information and make the search more precise.
NewsML allows a range of metadata to be attached to a multimedia story, including a detailed computer-readable description of what an item is about.
HP 21
Example of the end-to-end flow -NewsML
The content provider supplies NewsML-packaged media content to the operator. The content is categorized as current events, finance, sport, etc., and updated hourly.
The operator receives NewsML data from the content provider. The content server automatically pushes updated news articles to all news service subscribers.
Consumers sign up for the news service directly on the device. When using the news service, the user browses through the categories and reads the news articles. The news articles are presented in a continuous flow (one after the other) without end-user interaction.
Source: http://www.mediabricks.com
HP 22
PRISM
Publishing Requirements for Industry Standard Metadata
Version 1.0, April 2001
Authors: IDEAlliance (Adobe, Vignette, Kinecta, et al.)
Idea: “a standard for interoperable content description, interchange and reuse in both traditional and electronic publishing contexts”
Web site: http://www.prismstandard.org
HP 23
PRISM Design
Built on existing standards like Dublin Core (DC), RDF, XML
Designed to be used in a simple, straightforward way over the Internet
Compatible with NewsML
Integrates easily with ICE (for syndication)
Vocabulary:
Basic: DC
Extensions: “Controlled Vocabularies”, e.g., “North American Industry Classification System” (NAICS)
HP 24
PRISM Example
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845" />
    <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>
(Source: PRISM spec v. 1, http://www.prismstandard.org/techdev/prismspec1.asp)
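As an illustration of how an application might consume such a record, the following Python sketch reads the Dublin Core elements out of an RDF/XML description using only the standard library. It is a sketch under assumptions: `PRISM_RECORD` is a shortened, cleaned-up copy of the example (with markup punctuation restored and the XML declaration omitted), and `dublin_core_fields` is a hypothetical helper, not part of any PRISM tooling.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Shortened copy of the slide's record; punctuation restored by assumption.
PRISM_RECORD = """<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845" />
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>"""


def dublin_core_fields(xml_text):
    """Map each described resource to a dict of its Dublin Core element values."""
    root = ET.fromstring(xml_text)
    dc_prefix = "{" + DC_NS + "}"
    records = {}
    for description in root.findall("{" + RDF_NS + "}Description"):
        about = description.get("{" + RDF_NS + "}about")
        fields = {}
        for child in description:
            if child.tag.startswith(dc_prefix):
                name = child.tag[len(dc_prefix):]
                # Literal values are element text; references use rdf:resource.
                fields[name] = child.text or child.get("{" + RDF_NS + "}resource")
        records[about] = fields
    return records


corfu = dublin_core_fields(PRISM_RECORD)["http://wanderlust.com/2000/08/Corfu.jpg"]
```

A full RDF parser would handle more syntax variants; plain element traversal is used here only because the record has a simple, fixed shape.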
HP 25
VoiceXML
A language for specifying voice dialogs
Voice dialogs use audio prompts and text-to-speech (TTS) for output; touch-tone keys (DTMF) and automatic speech recognition (ASR) for input
Goal is to bring the advantages of web-based development and content delivery to interactive voice response applications
High-level, voice-specific language simplifies application development
Source: http://www.voicexml.org
HP 26
Voice Based Internet Applications
Source httpwwwvoicexmlorg
HP 27
Voice XML Metadata
Voice Specific metadata
Supports syntactic interoperability
Text data to voice data
VoiceXML = XML + Voice Metadata
HP 28
VoiceXML ndash Possible Services
Information retrieval – news, sports, traffic, stock quotes
e-Transactions (e-commerce, e-tailing, etc.)
Financial: banking, stock trading
Catalog browsing (generally as an adjunct to paper)
Telephone services
Personal voice dialing, one-number “find-me” services
Intranet – inventory, HR services, corporate portals
Unification – “My Whatever” personal portals, personal agents, unified messaging
Source: http://www.voicexml.org
HP 29
MPEG7
A set of description schemes and descriptors to describe the content of multimedia data
Provides a language to specify description schemes
Provides a scheme for coding the description
HP 30
Application Examples for MPEG7
A few application examples are
Digital libraries (image catalog musical dictionary)
Multimedia directory services (eg yellow pages)
Broadcast media selection (radio channel TV channel)
HP 31
Information and Content Exchange (ICE)
Main Goal: efficient and extensible content syndication protocol for the Internet using XML syntax
Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.
Status: latest spec version 1.1, May 2000; submitted to W3C for review
Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, …
Web Site: http://www.icestandard.org
HP 32
What is the ICE Protocol
Syndication protocol for communication between Syndicators and Subscribers
Metadata to define:
roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver
format and method of content exchange (e.g., sequenced packages, pull vs. push model)
HP 33
ICE Applications
ICE vocabulary + domain vocabulary = complete application
ICE establishes and manages the syndication, delivers data, and logs events => content-independent metadata
Industry-specific vocabulary defines the content => domain-specific metadata
Source: http://www.icestandard.org
HP 34
ICE Explained
ICE: Information and Content Exchange protocol
Syndicator: a content aggregator and distributor
Subscriber: a content consumer
Subscription: an agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement
Collection: the current content of a subscription
ICE Package: a delivery of commands to update a collection, such as the addition of content items
ICE Payload: the XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information
Sources: InternetWeek; ICE Cookbook, version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm
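The relationship between subscription, collection, and package in this glossary can be modeled with a short Python sketch. This illustrates the concepts only and does not implement the ICE wire protocol: the class names, the command tuples, and the `apply` method are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Package:
    """A delivery of commands that update a collection (hypothetical structure)."""
    package_id: str
    commands: list  # tuples of ("add", item_id, content) or ("remove", item_id, None)


@dataclass
class Subscription:
    """An agreement between a subscriber and a syndicator, plus its current collection."""
    subscriber: str
    syndicator: str
    collection: dict = field(default_factory=dict)  # item_id -> content
    applied: list = field(default_factory=list)     # ids of packages applied so far

    def apply(self, package):
        """Apply each command in order, then record the package as processed."""
        for operation, item_id, content in package.commands:
            if operation == "add":
                self.collection[item_id] = content
            elif operation == "remove":
                self.collection.pop(item_id, None)
            else:
                raise ValueError("unknown command: " + operation)
        self.applied.append(package.package_id)


subscription = Subscription(subscriber="portal", syndicator="news-wire")
subscription.apply(Package("pkg-1", [("add", "4321", "Cartoon"), ("add", "4322", "Story")]))
subscription.apply(Package("pkg-2", [("remove", "4322", None)]))
```

Keeping the list of applied package ids mirrors the idea of sequenced packages: a subscriber can tell which updates its collection already reflects.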
<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe"
    timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321"
          subscription-element="4321" name="Cartoon" filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru" copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC … (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>
Content (domain-specific metadata)
HP 36
XCM (eXtended Content Management)
A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:
Content Development: developing static content and managing the process of its subsequent approval, versioning, storage, and retrieval
Application Content Management (Vignette): deploying content dynamically to a Web site and managing that content throughout its online lifecycle
Content Delivery: delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability
Source httpwwwvignettecomCDASite020971-1-30-1458-1146-174300html
HP 37
XCM
eXtended Content Management
Content Development Management
Content Delivery
Application Content Management
Content Authoring
Digital Asset Management
Software Configuration Management
Document Process Management
Metadata Management
Recombination
Personalization
Edge Network Delivery
Streaming Media Delivery
Caching
Source: http://www.vignette.com
HP 38
Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain
FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc/
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
… … …
UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Adress Id: http://gisdasc.kgs.ukans.edu/dasc/
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
… … …
Kansas State
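One simple way to bridge two such models is an explicit tag-to-tag mapping. The Python sketch below assumes the five tag pairs shown on this slide correspond one-to-one; the `FGDC_TO_UDK` table and the `translate` function are hypothetical names, and real FGDC or UDK records contain many more elements than are listed here.

```python
# Tag correspondences read off the slide (assumed to be one-to-one).
FGDC_TO_UDK = {
    "Theme keywords": "Search terms",
    "Title": "Topic",
    "Online linkage": "Adress Id",
    "Direct Spatial Reference Method": "Measuring Techniques",
    "Horizontal Coordinate System Definition": "Co-ordinate System",
}
UDK_TO_FGDC = {udk: fgdc for fgdc, udk in FGDC_TO_UDK.items()}


def translate(record, tag_map):
    """Rename every tag found in tag_map; tags without a mapping are kept unchanged."""
    return {tag_map.get(tag, tag): value for tag, value in record.items()}


fgdc_record = {
    "Theme keywords": ["digital line graph", "hydrography", "transportation"],
    "Title": "Dakota Aquifer",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}
udk_record = translate(fgdc_record, FGDC_TO_UDK)
```

A flat rename only resolves naming differences; it does not address cases where the two models structure or interpret a value differently.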
HP 39
Different views of Metadata
Domain Independent Specifications (RDF)
FrameworksInfrastructures (XCM)
MetadataApplication Specific
ICE
Media Specific
MPEG7 VoiceXML
Domain Specific
NewsML FGDCUDK
HP 40
Creating and Serving Metadata to Power the Life-cycle of Content
Taalee Infrastructure Services Taalee Content Applications
Where is the content
Whose is it
ProduceAggregate
CatalogIndex
What other content is it related to
Integrate Syndicate
What is the right content for this
user
Personalize
What is the best way to
monetize this interaction
Interactive Marketing
BroadcastWirelineWirelessInteractive TV
Taalee Semantic MetaBase
HP 41
Taalee's Intelligent Content Process
HP 42
Metadata Creation and Semanticization
• Automatic Content Classification/Categorization
• Metadata Creation/Extraction: types of metadata created
Semantic Engine and WorldModel are trademarks of Taalee Inc. Metadata Extraction is a patented technology of Taalee Inc.
HP 43
Forms/Types/Ingest of Content
Sources: Web Sites, Content Feeds, and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)
HP 44
Content Handling/Ingest
Infrastructure: Exchange/Feed Handlers, Crawlers, Screen Scrapers, Bots, Software Agents
Centralized, Distributed, Mobile/Migratory
HP 45
Information Extraction for Metadata Creation
Global/Enterprise/Web Repositories
METADATA EXTRACTORS
Digital Maps
Nexis, UPI, AP
Documents
Digital Audios
Data Stores
Digital Videos
Digital Images
HP 46
Extracting a Text Document: Syntactic approach
INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT
NATIONAL PREPAREDNESS LEVEL II
CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.
SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats between Galena and McGr. The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.
CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. The fire is contained. Major areas of heat have been mopped-up. All crews and overhead will mop-up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.
LAYOUT
Date => day month int ',' int
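A layout rule of this kind maps naturally onto a regular expression. The Python sketch below is one possible reading of the rule, not the extractor used in the original system: the pattern, the `extract_date` function, and the sample header string (with commas restored) are assumptions, and commas are treated as optional so that the pattern also matches text whose punctuation has been lost.

```python
import re
from datetime import date

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# day month int ',' int, e.g. "Friday, August 1, 1997"
DATE_RE = re.compile(
    r"\b(?:" + "|".join(DAYS) + r"),?\s+"
    r"(?P<month>" + "|".join(MONTHS) + r")\s+"
    r"(?P<day>\d{1,2}),?\s+"
    r"(?P<year>\d{4})\b"
)


def extract_date(text):
    """Return the first date matching the layout rule, or None if there is no match."""
    match = DATE_RE.search(text)
    if match is None:
        return None
    month = MONTHS.index(match.group("month")) + 1
    return date(int(match.group("year")), month, int(match.group("day")))


REPORT_HEADER = "INCIDENT MANAGEMENT SITUATION REPORT Friday, August 1, 1997 - 0530 MDT"
report_date = extract_date(REPORT_HEADER)
```

This purely syntactic approach finds the date by its surface form alone; it says nothing about what the date refers to, which is the gap the semantic techniques in the following slides aim to fill.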
Traditional TextCategorization
StatisticalAI Techniques
Classify Place ina taxonomy
feed
Customer Training
Set
RoutingDistribution
Customer Article Feed
4715
Standard Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Classification of Article 4715
Knowledge-base amp StatisticalAI Techniques
ClassifyPlace ina taxonomy
MetadataCatalog
Content Manager
Precise syndicationfiltering
fd
Article 4715 Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News
Standard metadata
Semantic metadata
FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
NYSE: Member Companies, Market News, IPOs
Automated Content Enrichment (ACE)
Taalee Enterprise Customization Suite
Taalee's Categorization & Automatic Metadata Creation
Taalee Training
Set
Customer Training
Set ee ENTCompany AnalysisConference Calls
EarningsStock Analysis
Classification of Article 4715
Article Feed4715 RoutingDistribution
Map to another taxonomy
HP 49
Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.
Video Segment with Associated Text
Segment Description
Semantic Metadata
Auto Categorization
HP 50
Automatic Categorization & Metadata Tagging (Web page)
Video with Editorialized Text on the Web
Auto Categorization
Semantic Metadata
HP 51
Automatic Categorization & Metadata Tagging (Feed)
Text from Bloomberg
Auto Categorization
Semantic Metadata
HP 52
Taalee Extraction and Knowledgebase Enhancement
Extraction Agent
Web Page Enhanced Metadata Asset
HP 53
Basis for Semantics
A. Facts/Concepts/Terms/Entities
Dictionary, Thesaurus, Reference Data/Vocabulary
B. Facts with Relationships
Taxonomy (Categories), Ontology/Domain Modeling (e.g. Golf = golfer, tournament name, golf course, event)
Knowledge Base
HP 54
Basis for Semantics
C. Reasoning/Inference
(Statistical) (Information Retrieval)
Statistical Learning/AI (Bayesian, Neural Networks, HMM, ...)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methods/Cluster Analysis
Learning/AI and Collaborative Filtering
Reference data/Concept-terms/Dictionary/Thesaurus, by topic/industry/subject/domain
Word or Phrase
Ontologies/Domain Models
KnowledgeBase, by Entities and Relationships
deeper understanding
HP 56
Open Directory Project (ODP): Classification/Taxonomy & Directory
HP 57
Ontology
Standardize meaning, description, representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes:
Attributes
Domain Rules
Functional Dependencies
HP 59
An Ontology
Example Interrelated ontologies
LAND USE
COMMERCIAL
INDUSTRIAL
RURAL
RESIDENTIAL
AGRICULTURAL
MILITARY
RECREATIONAL
LAND(SITE)
CULTIVATEDAREA
GREENLANDAREA LAND
BANK
ZONING
LANDFILLSITE
WASTEDISPOSAL
RECYCLING
HAZARDOUS
LANDFILLRESOURCE REC
SOLID SEWAGE
shredding
magneticseparation
screening
washing
NATURALDISASTER
EARTHQUAKE
causes
LANDSLIDE
VOLCANO
STORMFLOOD
FIRE
AVALANCHE
TSUNAMI
causes
causes
causes
HP 61
Large Vocabularies/Taxonomies/Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.
Metadata-enabled Applications
HP 63
Metadata Usage: Impact on Search & Query processing
traditional queries based on keywords
attribute based queries
content-based queries
HP 64
Oingo.com
Oingo Ontology - ODP based() the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely-related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:
• Context of the Content
• Semantic Metadata describing entities (i.e. Company, Industry, etc.) and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
"Normal" Content can only be "found" if the user enters a keyword that exists within it
Intelligent Content=+
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:
Context of the Content
Semantic Metadata describing entities (i.e. Company, Industry, etc.) and
Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata amp Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g. Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, ...
HP 70
Metadata Usage Keyword Attribute and Content Based Access
The VisualHarness system at LSDIS/UGA
HP 71
Keyword Search vs Attribute Search with Semantic metadata
Virage Search on football touchdown
Jimmy Smith Interview Part SevenJimmy Smith explains his philosophy on showboating URL httpcbssportsline
Brian Griese Interview Part FourBrian Griese talks about the first touchdown he ever threw URL httpcbssportsline
Metadata from Typical Cataloging of Football
Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31 Pit 24
httpwwwnflcom
Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Qadry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
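The contrast between keyword search and attribute search over semantic metadata can be sketched as follows. This is an illustrative sketch only: the `assets` records paraphrase the examples on this slide, and `keyword_search` and `attribute_search` are hypothetical helpers, not Virage or Taalee APIs.

```python
# Toy catalog: two assets with free text only, one with structured metadata.
assets = [
    {"title": "Jimmy Smith Interview Part Seven",
     "description": "Jimmy Smith explains his philosophy on showboating",
     "metadata": {}},
    {"title": "Brian Griese Interview Part Four",
     "description": "Brian Griese talks about the first touchdown he ever threw",
     "metadata": {}},
    {"title": "Baltimore 31 Pit 24",
     "description": "Ismail and Banks hook up for their third long touchdown",
     "metadata": {"League": "Professional",
                  "Teams": ["Ravens", "Steelers"],
                  "Players": ["Qadry Ismail", "Tony Banks"],
                  "Event": "Touchdown"}},
]


def keyword_search(catalog, word):
    """Syntactic search: match the word anywhere in title or description."""
    word = word.lower()
    return [a["title"] for a in catalog
            if word in (a["title"] + " " + a["description"]).lower()]


def attribute_search(catalog, **criteria):
    """Attribute search: every requested field must contain the wanted value."""
    hits = []
    for asset in catalog:
        meta = asset["metadata"]
        matched = True
        for field, wanted in criteria.items():
            value = meta.get(field)
            values = value if isinstance(value, list) else [value]
            if wanted not in values:
                matched = False
                break
        if matched:
            hits.append(asset["title"])
    return hits


print(keyword_search(assets, "touchdown"))
print(attribute_search(assets, Event="Touchdown", Teams="Ravens"))
```

The keyword query returns any asset whose text happens to mention the word, including the interview, while the attribute query returns only the asset catalogued as a touchdown event involving the Ravens.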
HP 72
Taalee's Semantic Search
Highly customizable, precise and freshest A/V search
Context and Domain Specific Attributes
Uniform Metadata for Content from Multiple Sources
Can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY amp CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content
Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries amp Hot Topics
PERSONALIZATION
3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
4 Pink Floyd Collection
Personalized Queries
Set the Controls for the Heart of the Sun...
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
2 My Football Fantasy Team
Gators Spurrier ready for big game
Techs Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone...
Edwards explains reasons for leaving BYU more...
more...
more...
more...
1 Election 2000
2 Middle East Peace Conflict
3 Napster Controversy
Video Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Cole's security
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II more...
HOT Topics
more...
more...
HP 80
Metadata Targeting
SemanticInteractive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
Realtime Feeds
Interests Preferences
Time-ShiftedContent Aggregator
Web sites and Pages
ContentDatabases Personalized
Content
Semantic EngineTM
Personalized Content
Content
Structured Hi-Quality
Semantic Metabase
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site
User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and the knowledge of user's portfolio. Typically search can be done by typing one word or selecting from a dynamic personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions listen set alert on story add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g. Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata such as "Strong Buy" by H&Q Analyst
HP 87
iTV: Taalee's Extreme Personalization
Content Provider
(DBS DISH Wink AOL-TV)
Semantic EngineTM
Meta-DataTagged Content
ContentldquoProgramsrdquo
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-QualitySemantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by personalization application to show only the stocks that user is interested in
This screen is customizable with interactivity feature using metadata such as whether there is a new Conference Call video on CSCO
Part of the screen can be automatically customized to show conference call specific information - including transcript, participation, etc., all of which are relevant metadata
Conference Call itself can have embedded metadata to support personalization and interactivity
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support
Sony
Categorize
Catalog
Integrate
Collection
Processing
NetworkContent
AffiliateFeeds
Public Sources Rich Data
Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore
HP 92
Retrieve Scene Description Track
Enhanced Digital Cable
Video
MPEGDecoder
Node = AVO Object
Create Scene Description Tree
GREAT USER EXPERIENCE
Metadata's role in emerging iTV infrastructure
MPEG-2/4/7
MPEG Encoder
Scene Description Tree
License metadata decoder and semantic applications to
device makers
Channel salesthrough Video Server Vendors
Video App Servers and Broadcasters
Enhanced XML
Description
"Cisco Systems"
Node
Taalee Semantic Engine
"Cisco Systems"
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: Jon Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta
Object Content Information (OCI)
Metadata-richValue-added Node
HP 93
Intelligent Metadata Creation
Content which does contain the words the user asked for
Extractor Agents
Content which does not contain the words the user asked for, but is about what he asked for
Value-added Metadata
Content the user did not think to ask for, but which he needs to know
Semantic Associations
+ +
Metadata for Intelligent Content
Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"
Understanding of the content is needed to create new metadata
Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
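The idea of value-added metadata can be illustrated with a small sketch. It assumes a toy knowledge base held in a Python dictionary; `KNOWLEDGE_BASE`, its field names and the `enrich` function are hypothetical and are not Taalee's actual data model or API.

```python
# Toy knowledge base of entities and their semantic associations (assumed).
KNOWLEDGE_BASE = {
    "Roger Clemens": {"type": "PERSON", "plays_for": "New York Yankees"},
    "New York Yankees": {"type": "TEAM", "sport": "Baseball", "city": "New York"},
}


def enrich(metadata):
    """Return a copy of the metadata with team and sport added for known players."""
    enriched = {field: list(values) for field, values in metadata.items()}
    for player in metadata.get("Players", []):
        team = KNOWLEDGE_BASE.get(player, {}).get("plays_for")
        if team is None:
            continue  # nothing is known about this player
        teams = enriched.setdefault("Teams", [])
        if team not in teams:
            teams.append(team)
        sport = KNOWLEDGE_BASE.get(team, {}).get("sport")
        if sport is not None:
            sports = enriched.setdefault("Sport", [])
            if sport not in sports:
                sports.append(sport)
    return enriched


extracted = {"Players": ["Roger Clemens"]}  # what the story text mentions
print(enrich(extracted))
```

After enrichment, a search on Teams = "New York Yankees" would match the story even though the team name never appears in its text.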
HP 96
Guided Demo for Value Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com, Posted Date: 9/20/2000, League: NFL, Teams: Atlanta Falcons, Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN, Posted Date: 3/03/2001, League: National League, Teams: Los Angeles Dodgers, Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 - Snapshots ("Gary Sheffield")
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
Search for 'Gary Sheffield' in 'Baseball'
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Posted Date: Date of asset posting - Extracted automatically
League Name: Name of league to which the player's team belongs - Not mentioned explicitly in asset - Value-added by Taalee's processing based on semantic associations
Team Name: Name of team for which player plays - Not mentioned explicitly in asset - Value-added using Taalee's semantic relationships
Producer Name: Name of content provider that produced the asset
Player Names: Names of players mentioned explicitly in the asset - Extracted automatically
Sport: Name of sport
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships
The asset is richly, fully described in the many ways the users chose to interact
Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context or relationships of keywords
For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
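Following a semantic association from one entity to related content can be sketched as a lookup over subject-relation-object triples. The triple store, the relation names and the sample stories below are assumptions made for illustration; they are not Taalee's Semantic Content Model.

```python
# (subject, relation, object) triples; relation names are assumed.
TRIPLES = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Commerce One", "participates_in",
     "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
    ("Ariba", "is_a", "COMPANY"),
]

# Hypothetical catalogued stories, each tagged with the companies it covers.
STORIES = [
    {"headline": "Story on Commerce One", "companies": ["Commerce One"]},
    {"headline": "Story on Ariba", "companies": ["Ariba"]},
    {"headline": "Story on an unrelated company", "companies": ["Other Co"]},
]


def related(entity, relation):
    """Return all objects linked to the entity by the given relation."""
    return [o for s, r, o in TRIPLES if s == entity and r == relation]


def competitor_stories(entity, stories):
    """Return headlines about the entity's competitors (not about the entity)."""
    competitors = set(related(entity, "competes_with"))
    return [s["headline"] for s in stories
            if competitors.intersection(s["companies"])]


print(related("Commerce One", "competes_with"))
print(competitor_stories("Commerce One", STORIES))
```

A keyword index alone could only return the first story for the query "Commerce One"; the association lookup is what surfaces the Ariba story as related news.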
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Internal Source 1Research
Internal Source 2
External feedsWeb(eg Reuters)
1
2
3
4
Cisco story from PW Source 1 passed on to add semantic associations
Consults KnowledgeBase for Cisco's competition
Returns result: Lucent is a competitor of Cisco
Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard
Story on Lucent
Story on Cisco
XCM-compliant metadata, XML or other format
Semantic Application
ASP/Enterprise hosted
Extractor Agent 1
Extractor Agent 2
Extractor Agent 3
Metadata-centric Content Management Architecture
Semantic Engine
World Model
Taalee Metabase
Third-party Content Mgmt and Syndication
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
SEC, EPA
Regulations: Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search/Personalization etc.
A vision for future
Semantic Web Complex Relationships and Knowledge Discovery
E.g. InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]
A personal definition
Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)
a proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF+XML
Agent readable Tags
www.daml.org
DAML Example
Source httpwwwzdnetcompcweekstoriesjumps04270243294600html
Three layered Architecture Of Semantic Web
Logical Layer
Formal Semantics and Reasoning Support - OIL, DAML-O
Schema Layer
Definition of Vocabulary RDF Schema
Data Layer
Simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL - Evolving towards Semantic Web
OIL Mission
OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Test May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude,
                  Earthquake.latitude, Earthquake.longitude ) < 10000
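The relationship rule above can be read as a predicate over pairs of events. The following is a sketch under stated assumptions: dates are `datetime.date` values, the thresholds are 30 days and 10000 miles, `distance` is implemented with the haversine great-circle formula, and the earthquake must not precede the test. None of this is taken from the InfoQuilt implementation.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius (assumed unit: miles)


def distance(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, a)))


def date_difference(test_date, quake_date):
    """Days from the nuclear test to the earthquake (negative if quake is earlier)."""
    return (quake_date - test_date).days


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """NuclearTest Causes Earthquake, with 'after the test' made explicit."""
    days = date_difference(test["eventDate"], quake["eventDate"])
    miles = distance(test["latitude"], test["longitude"],
                     quake["latitude"], quake["longitude"])
    return 0 <= days < max_days and miles < max_miles


# Hypothetical events, invented purely to exercise the rule.
test_event = {"eventDate": date(1970, 1, 1), "latitude": 37.0, "longitude": -116.0}
quake_near = {"eventDate": date(1970, 1, 10), "latitude": 36.0, "longitude": -117.0}
quake_late = {"eventDate": date(1970, 3, 1), "latitude": 36.0, "longitude": -117.0}

print(may_have_caused(test_event, quake_near))
print(may_have_caused(test_event, quake_late))
```

Satisfying the predicate only marks a pair as a candidate for the relationship; it does not establish causation.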
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example...
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9 and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
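The grouping query above can be sketched as a simple aggregation. The sample records are invented for illustration (they are not USGS/NEIC data), the range boundaries are treated as half-open, which is an assumption, and `magnitude_bin` and `average_per_year` are hypothetical helper names.

```python
from collections import Counter

# (low, high, label): low <= magnitude < high (half-open ranges, assumed).
BINS = [(5.8, 6.0, "5.8-6"), (6.0, 7.0, "6-7"),
        (7.0, 8.0, "7-8"), (8.0, 9.0, "8-9")]


def magnitude_bin(magnitude):
    """Return the range label for a magnitude, or None if below 5.8."""
    if magnitude < 5.8:
        return None
    for low, high, label in BINS:
        if low <= magnitude < high:
            return label
    return "9+"


def average_per_year(quakes, first_year, last_year):
    """Average yearly count of earthquakes in each magnitude range."""
    n_years = last_year - first_year + 1
    totals = Counter()
    for year, magnitude in quakes:
        label = magnitude_bin(magnitude)
        if label is not None and first_year <= year <= last_year:
            totals[label] += 1
    return {label: total / n_years for label, total in totals.items()}


# Invented (year, magnitude) records, for illustration only.
sample = [(1900, 5.9), (1900, 6.5), (1901, 6.2), (1901, 7.4), (1901, 5.0)]
print(average_per_year(sample, 1900, 1901))
```

Running the same aggregation separately for the years before and after a chosen date would give the kind of comparison the slide draws.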
Knowledge Discovery - Example...
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 16
Dublin Core Metadata Initiative
Simple element set designed for resource description
International, inter-discipline, W3C community consensus
"Semantic" interface among resource description communities (very limited form of semantics)
Source: www.desire.org
HP 17
Dublin Core RDF
<xml>
  <namespace href="http://w3.org/rdf-schema" as="RDF"/>
  <namespace href="http://metadata.net/DC" as="DC"/>
  <RDF:Abbreviated>
    <RDF:Assertion RDF:HREF="http://www.mysite.com/mydoc.html"
        DC:Title="I've Never Metadata I've Never Liked"
        DC:Creator="Mary Crystal"
        DC:Subject="Metadata, Dublin Core, Stuff"/>
  </RDF:Abbreviated>
</xml>
HP 18
MOF (Meta Object Facility) and XMI
MOF models metadata using a subset of UML that is relevant to modeling metadata (class models: classes, associations and subtyping), together with a set of rules for mapping the elements of the MOF Core to CORBA IDL
XML Metadata Interchange (XMI) is an extension of the MOF into the XML space
HP 19
NewsML
NewsML is a packaging and metadata format for news content
NewsML is developed by the International Press Telecommunications Council (IPTC), a consortium of news providers, mostly in the print or wire-service industries
Since it deals only with packaging and metadata, NewsML is complementary both to news content formats like NITF and to syndication protocols like ICE
HP 20
NewsML...
It can be used by news providers to combine their pictures, video, text, graphics and audio files in news output available on web sites, mobile phones, high end desktops, interactive television and any other device
accurate, objective set of description tools which help qualify the information and make the search more precise
NewsML allows a range of metadata to be attached to a multi-media story, including a detailed computer-readable description of what an item is about
HP 21
Example of the end-to-end flow - NewsML
The content provider supplies NewsML packaged media content to the operator. The content is categorized as current events, finance, sport, etc. and updated hourly
The operator receives NewsML data from the content provider. The content server automatically pushes updated news articles to all news service subscribers
Consumers sign up for the news service directly on the device. When using the news service, the user browses through the categories and reads the news articles. The news articles are presented in a continuous flow (one after the other) without end-user interaction
Source: http://www.mediabricks.com
HP 22
PRISM
Publishing Requirements for Industry Standard Metadata
Version 1.0, April 2001
Authors: IDEAlliance (Adobe, Vignette, Kinecta, et al.)
Idea: "a standard for interoperable content description, interchange and reuse in both traditional and electronic publishing contexts"
Web site: http://www.prismstandard.org
HP 23
PRISM Design
Built on existing standards like Dublin Core (DC), RDF, XML
Designed to be used in a simple, straightforward way over the Internet
Compatible with NewsML
Integrates easily with ICE (for syndication)
Vocabulary:
Basic DC
Extensions
"Controlled Vocabularies", e.g. "North American Industrial Classification System" (NAICS)
HP 24
PRISM Example
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845" />
    <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>
(Source: PRISM spec v. 1, http://www.prismstandard.org/techdev/prismspec1.asp)
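A consumer of a PRISM record like the one above could read its Dublin Core elements with any standard XML parser. The Python sketch below is illustrative only: the function name dublin_core_fields and the trimmed sample record are assumptions made for this example, not part of the PRISM specification.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the slide's example record (assumed well-formed).
PRISM_RECORD = """<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>"""

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"


def dublin_core_fields(xml_text):
    """Return {resource URI: {dc element name: text}} for each rdf:Description."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    records = {}
    for description in root.findall(f"{{{RDF}}}Description"):
        about = description.get(f"{{{RDF}}}about")
        fields = {}
        for child in description:
            # ElementTree spells qualified names as "{namespace}local".
            if child.tag.startswith(f"{{{DC}}}") and child.text:
                fields[child.tag[len(DC) + 2:]] = child.text.strip()
        records[about] = fields
    return records


if __name__ == "__main__":
    for uri, fields in dublin_core_fields(PRISM_RECORD).items():
        print(uri, fields)
```

Because the elements come from the shared Dublin Core namespace, the same reader would work on any record that reuses DC, which is the interoperability point of building PRISM on existing standards.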
HP 25
VoiceXML
A language for specifying voice dialogs
Voice dialogs use audio prompts and text-to-speech (TTS) for output; touch-tone keys (DTMF) and automatic speech recognition (ASR) for input
Goal is to bring the advantages of web-based development and content delivery to interactive voice response applications
High-level, voice-specific language simplifies application development
Source: http://www.voicexml.org
HP 26
Voice Based Internet Applications
Source: http://www.voicexml.org
HP 27
Voice XML Metadata
Voice Specific metadata
Supports syntactic interoperability
Text data to voice data
Voice XML = XML + Voice Metadata
HP 28
VoiceXML - Possible Services
Information retrieval - News, sports, traffic, stock quotes
e-Transactions (e-commerce, e-tailing, etc.)
Financial: banking, stock trading
Catalog browsing (generally as an adjunct to paper)
Telephone services
Personal voice dialing, one-number "find-me" services
Intranet - Inventory, HR services, corporate portals
Unification - "My Whatever" personal portals, personal agents, unified messaging
Source: http://www.voicexml.org
HP 29
MPEG7
A set of description schemes and descriptors to describe the content of multimedia data
Provides a language to specify description schemes
A scheme for coding the description
HP 30
Application Examples for MPEG7
A few application examples are
Digital libraries (image catalog musical dictionary)
Multimedia directory services (eg yellow pages)
Broadcast media selection (radio channel TV channel)
HP 31
Information and Content Exchange (ICE)
Main Goal: efficient and extensible Content Syndication protocol for the Internet using XML syntax
Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.
Status: latest spec version 1.1, May 2000; submitted to W3C for review
Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, …
Web Site: http://www.icestandard.org
HP 32
What is the ICE Protocol
Syndication Protocol for communication between Syndicators and Subscribers
Metadata to define
roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver
format and method of content exchange (e.g. sequenced packages, pull vs. push model)
HP 33
ICE Applications
ICE vocabulary + domain vocabulary = complete application
ICE:
establishes and manages the syndication
delivers data
logs events
=> content-independent metadata
Industry-specific vocabulary defines the content => domain-specific metadata
Source: http://www.icestandard.org
HP 34
ICE Explained
ICE: Information and Content Exchange protocol.
Syndicator: A content aggregator and distributor.
Subscriber: A content consumer.
Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement.
Collection: The current content of a subscription.
ICE Package: A delivery of commands to update a collection, such as the addition of content items.
ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs and other management information.
Sources: InternetWeek; ICE Cookbook version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm
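The glossary above can be read as a small data model: a subscription owns a collection, and each package carries commands that update it. The Python sketch below illustrates only that reading. The class and field names (Package, Subscription, add, remove, applied) are hypothetical and are not taken from the ICE specification.

```python
from dataclasses import dataclass, field


@dataclass
class Package:
    """A delivery of commands that update a collection (hypothetical shape)."""
    package_id: str
    add: dict = field(default_factory=dict)     # item id -> content to add
    remove: list = field(default_factory=list)  # item ids to drop


@dataclass
class Subscription:
    """An agreement between a subscriber and a syndicator."""
    subscriber: str
    syndicator: str
    collection: dict = field(default_factory=dict)  # current content
    applied: list = field(default_factory=list)     # package ids, in order

    def apply(self, package):
        """Apply one package's commands to the collection."""
        for item_id in package.remove:
            self.collection.pop(item_id, None)
        self.collection.update(package.add)
        self.applied.append(package.package_id)


if __name__ == "__main__":
    sub = Subscription("subscriber.example", "syndicator.example")
    sub.apply(Package("pkg-1", add={"4321": "Cartoon"}))
    sub.apply(Package("pkg-2", add={"4322": "Column"}, remove=["4321"]))
    print(sub.collection)
```

Keeping the list of applied package ids reflects the idea of sequenced packages: the subscriber can tell the syndicator which package it saw last, whether delivery is push or pull.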
<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe"
    timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321"
          subscription-element="4321" name="Cartoon" filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru" copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC … (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>
Content (domain-specific metadata)
HP 36
XCM (eXtended Content Management)
A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:
Content Development - Developing static content and managing the process of its subsequent approval, versioning, storage and retrieval
Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle
Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability
Source httpwwwvignettecomCDASite020971-1-30-1458-1146-174300html
HP 37
XCM
eXtended Content Management
Segments: Content Development; Application Content Management; Content Delivery
Components shown: Content Authoring; Digital Asset Management; Software Configuration Management; Document Process Management; Metadata Management; Recombination; Personalization; Edge Network Delivery; Streaming Media Delivery; Caching
Source: http://www.vignette.com
HP 38
Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain
FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
…
UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Adress Id: http://gisdasc.kgs.ukans.edu/dasc
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
…
Kansas State
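One way to bridge two models that name the same data differently is a per-model mapping onto a shared vocabulary. The Python sketch below is illustrative only: the common field names (keywords, title, spatial_method, coordinate_system) and the normalize function are assumptions, not part of FGDC or UDK.

```python
# Tag names are taken from the slide; the right-hand names are invented.
FGDC_TO_COMMON = {
    "Theme keywords": "keywords",
    "Title": "title",
    "Direct Spatial Reference Method": "spatial_method",
    "Horizontal Coordinate System Definition": "coordinate_system",
}

UDK_TO_COMMON = {
    "Search terms": "keywords",
    "Topic": "title",
    "Measuring Techniques": "spatial_method",
    "Co-ordinate System": "coordinate_system",
}


def normalize(record, mapping):
    """Rename a record's tags to the common vocabulary, dropping unknown tags."""
    return {mapping[tag]: value for tag, value in record.items() if tag in mapping}


fgdc_record = {
    "Theme keywords": ["digital line graph", "hydrography", "transportation"],
    "Title": "Dakota Aquifer",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}

udk_record = {
    "Search terms": ["digital line graph", "hydrography", "transportation"],
    "Topic": "Dakota Aquifer",
    "Measuring Techniques": "Vector",
    "Co-ordinate System": "Universal Transverse Mercator",
}

if __name__ == "__main__":
    print(normalize(fgdc_record, FGDC_TO_COMMON) == normalize(udk_record, UDK_TO_COMMON))
```

A flat rename like this only handles one-to-one correspondences. Tags whose meanings overlap partially would need the ontology-based approaches discussed later in the workshop.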
HP 39
Different views of Metadata
Domain Independent Specifications (RDF)
Frameworks/Infrastructures (XCM)
Metadata
Application Specific: ICE
Media Specific: MPEG7, VoiceXML
Domain Specific: NewsML, FGDC/UDK
HP 40
Creating and Serving Metadata to Power the Life-cycle of Content
Taalee Infrastructure Services; Taalee Content Applications
Where is the content? Whose is it?
Produce/Aggregate
Catalog/Index
What other content is it related to?
Integrate, Syndicate
What is the right content for this user?
Personalize
What is the best way to monetize this interaction?
Interactive Marketing
Broadcast, Wireline, Wireless, Interactive TV
Taalee Semantic MetaBase
HP 41
Taalee's Intelligent Content Process
HP 42
Metadata Creation and Semanticization
• Automatic Content Classification/Categorization
• Metadata Creation/Extraction
Types of metadata created
HP 43
Forms/Types/Ingest of Content
Sources: Web Sites, Content Feeds and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)
HP 44
Content Handling/Ingest
Infrastructure/Exchange
Feed Handlers
Crawlers/Screen Scrapers/Bots/Software Agents
Centralized, Distributed, Mobile/Migratory
HP 45
Information Extraction for Metadata Creation
Global/Enterprise/Web Repositories
METADATA EXTRACTORS
Sources: Digital Maps; Nexis/UPI/AP; Documents; Digital Audios; Data Stores; Digital Videos; Digital Images
HP 46
Extracting a Text Document: Syntactic approach
INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT
NATIONAL PREPAREDNESS LEVEL II
CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.
SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr… The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.
CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.
LAYOUT
Date => day month int ',' int
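The layout rule for Date reads as a day name, a month name, an integer, a comma and another integer. A syntactic extractor can encode such a rule as a regular expression. The Python sketch below is one possible encoding, not the tool used in the workshop: the names DATE_RULE and extract_date are assumptions, and the commas are made optional so the rule also matches text whose punctuation was lost.

```python
import re

DAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"
MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")

# Date => day month int ',' int
DATE_RULE = re.compile(
    rf"\b({DAYS}),?\s+({MONTHS})\s+(\d{{1,2}}),?\s+(\d{{4}})\b"
)


def extract_date(text):
    """Return the first date matching the layout rule, or None."""
    match = DATE_RULE.search(text)
    if match is None:
        return None
    day, month, day_of_month, year = match.groups()
    return {
        "day": day,
        "month": month,
        "day_of_month": int(day_of_month),
        "year": int(year),
    }


if __name__ == "__main__":
    header = "INCIDENT MANAGEMENT SITUATION REPORT Friday, August 1, 1997 - 0530 MDT"
    print(extract_date(header))
```

The rule depends entirely on surface form: it finds the report date without knowing what the report is about, which is the limit of the syntactic approach.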
Traditional Text Categorization
Customer Article Feed (Article 4715)
Statistical/AI Techniques, using a Customer Training Set
Classify/Place in a taxonomy
Routing/Distribution
Standard Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Classification of Article 4715

Taalee's Categorization & Automatic Metadata Creation
Automated Content Enrichment (ACE)
Taalee Enterprise Customization Suite
Article Feed (Article 4715)
Knowledge-base & Statistical/AI Techniques, using a Taalee Training Set and a Customer Training Set
Classify/Place in a taxonomy
Map to another taxonomy
Metadata Catalog; Content Manager
Routing/Distribution; Precise syndication/filtering
Article 4715 Metadata
Standard metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Semantic metadata
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News
Classification of Article 4715
FTE, ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
NYSE: Member Companies, Market News, IPOs
HP 49
Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.
Video Segment with Associated Text
Segment Description
Semantic Metadata
Auto Categorization
HP 50
Automatic Categorization & Metadata Tagging (Web page)
Video with Editorialized Text on the Web
Auto Categorization
Semantic Metadata
HP 51
Automatic Categorization & Metadata Tagging (Feed)
Text from Bloomberg
Auto Categorization
Semantic Metadata
HP 52
Taalee Extraction and Knowledgebase Enhancement
Extraction Agent
Web Page Enhanced Metadata Asset
HP 53
Basis for Semantics
A. Facts/Concepts/Terms/Entities
Dictionary, Thesaurus, Reference Data, Vocabulary
B. Facts with Relationships
Taxonomy (Categories), Ontology/Domain Modeling (e.g. Golf = golfer, tournament name, golf course, event)
Knowledge Base
HP 54
Basis for Semantics
C. Reasoning/Inference
(Statistical) Information Retrieval
Statistical Learning/AI (Bayesian, Neural Networks, HMM, …)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methods: Cluster Analysis, Learning/AI and Collaborative Filtering
Reference data: Concept-terms, Dictionary, Thesaurus; by topic/industry/subject/domain; Word or Phrase
Ontologies/Domain Models
KnowledgeBase: by Entities and Relationships
(later alternatives give deeper understanding)
HP 56
Open Directory Project (ODP): Classification/Taxonomy & Directory
HP 57
Ontology
Standardize meaning, description, representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes:
Attributes
Domain Rules
Functional Dependencies
HP 59
An Ontology
Example: Interrelated ontologies
LAND USE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL
LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE
WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; processes: shredding, magnetic separation, screening, washing
NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI
Several disaster concepts are linked by "causes" relationships.
HP 61
Large Vocabularies, Taxonomies, Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.
Confidential HP
Metadata-enabled Applications
HP 63
Metadata Usage: Impact on Search & Query processing
traditional queries based on keywords
attribute-based queries
content-based queries
HP 64
Oingo.com
Oingo Ontology (ODP-based?) - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning used to state intent; the basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:
• Context of the Content
• Semantic Metadata describing entities (e.g. Company, Industry, etc.), and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
"Normal" Content can only be "found" if the user enters a keyword that exists within it
Intelligent Content = Content + Semantic Metadata + Relationships
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:
Context of the Content
Semantic Metadata describing entities (e.g. Company, Industry, etc.), and
Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g. Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …
HP 70
Metadata Usage Keyword Attribute and Content Based Access
The VisualHarness system at LSDIS/UGA
HP 71
Keyword Search vs. Attribute Search with Semantic metadata
Virage: Search on "football touchdown"
Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline…
Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline…
Metadata from Typical Cataloging of Football Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31, Pit 24
http://www.nfl.com
Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Qadry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
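The contrast between the two result sets can be sketched in a few lines of Python. This is an illustration under stated assumptions, not either vendor's search engine: the asset records, the metadata dictionary layout and both function names are invented for the example, using only values shown on the slide.

```python
ASSETS = [
    {"title": "Jimmy Smith Interview Part Seven",
     "description": "Jimmy Smith explains his philosophy on showboating",
     "metadata": {}},
    {"title": "Brian Griese Interview Part Four",
     "description": "Brian Griese talks about the first touchdown he ever threw",
     "metadata": {}},
    {"title": "Baltimore 31 Pit 24",
     "description": "Ismail and Banks hook up for their third long touchdown",
     "metadata": {"League": "Professional",
                  "Teams": ["Ravens", "Steelers"],
                  "Players": ["Qadry Ismail", "Tony Banks"],
                  "Event": "Touchdown"}},
]


def keyword_search(assets, query):
    """Return assets whose title or description contains every query word."""
    words = query.lower().split()
    results = []
    for asset in assets:
        text = (asset["title"] + " " + asset["description"]).lower()
        if all(word in text for word in words):
            results.append(asset)
    return results


def attribute_search(assets, **criteria):
    """Return assets whose metadata matches every attribute=value pair."""
    results = []
    for asset in assets:
        matched = True
        for name, wanted in criteria.items():
            value = asset["metadata"].get(name)
            values = value if isinstance(value, list) else [value]
            if wanted not in values:
                matched = False
                break
        if matched:
            results.append(asset)
    return results


if __name__ == "__main__":
    print([a["title"] for a in keyword_search(ASSETS, "touchdown")])
    print([a["title"] for a in attribute_search(ASSETS, Event="Touchdown", Teams="Ravens")])
```

The keyword query returns an interview that merely mentions a touchdown, while the attribute query returns only the asset catalogued as a touchdown event involving the named team.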
HP 72
Taalee's Semantic Search
Highly customizable, precise and freshest A/V search
Context and Domain Specific Attributes; Uniform Metadata for Content from Multiple Sources; can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do?
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY amp CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content
Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you.
Please enter such semantic keywords below.
Change Context
Personalized Queries & Hot Topics
PERSONALIZATION
Personalized Queries
1 My Stock Portfolio: Microsoft suffers serious hack attack; Cisco Systems Inc; Analyst Safa Rashtchy on Yahoo; PeopleSoft Inc; AT&T Corp; more…
2 My Football Fantasy Team: Gators' Spurrier ready for big game; Tech's Vick looks to become complete QB; Bucs excited about Hamilton; Jasper Sanks rumbles into the end zone…; Edwards explains reasons for leaving BYU; more…
3 Julia Roberts Collection: Movie Trailer Notting Hill; Trailer - Runaway Bride; Patrick; Movie Trailer Stepmom; Conspiracy Theory; more…
4 Pink Floyd Collection: Set the Controls for the Heart of the Sun…; Wish You Were Here; Round And Around; Keep Talking; The Post War Dream; more…
HOT Topics
1 Election 2000: Video: Explaining the electoral map; Race for White House hots up; Seniors Give Gore Florida Edge; more…
2 Middle East Peace Conflict: More die as Israel steps up security; Israel braces for suicide bombs; Pentagon probes Cole's security; more…
3 Napster Controversy: The Brain Behind Napster; Napster Lawsuit; Creative Nomad II; more…
HP 80
Metadata Targeting
Semantic/Interactive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web: Extreme Personalization
Inputs: Realtime Feeds; Web sites and Pages; Content Databases; Interests, Preferences
Time-Shifted Content Aggregator
Semantic Engine(TM)
Structured Hi-Quality Semantic Metabase
Output: Personalized Content
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site.
User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories.
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide).
Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu.
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available.
The user clicks to see a listing of Analyst Calls on Cisco (next slide).
Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst.
HP 87
iTV: Taalee's Extreme Personalization
Content Provider (DBS, DISH, Wink, AOL-TV)
Inputs: Content "Programs"; Immediate Interests; Preferences
Semantic Engine(TM)
Structured Hi-Quality Semantic Metabase
Metadata-Tagged Content
Outputs: Personalized Content Capsules; Redirects and Programming
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.
This screen is customizable with an interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO.
Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata.
The Conference Call itself can have embedded metadata to support personalization and interactivity.
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support (Sony)
Processing: Categorize, Catalog, Integrate
Collection: Network Content, Affiliate Feeds, Public Sources
Rich Data Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were
-- Breaking News for 11/30/2000 --
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
Customize Page: Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore
HP 92
Metadata's role in emerging iTV infrastructure
Components shown: Video; MPEG-2/4/7; MPEG Encoder; MPEG Decoder; Enhanced Digital Cable; Create Scene Description Tree; Retrieve Scene Description Track; Scene Description Tree (Node = AVO Object); Taalee Semantic Engine; Enhanced XML Description; Object Content Information (OCI); Metadata-rich Value-added Node
Example node: "Cisco Systems"
Example metadata:
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers and Broadcasters
GREAT USER EXPERIENCE
HP 93
Intelligent Metadata Creation
Extractor Agents: Content which does contain the words the user asked for
+ Value-added Metadata: Content which does not contain the words the user asked for, but is about what he asked for
+ Semantic Associations: Content the user did not think to ask for, but which he needs to know
Metadata for Intelligent Content
Usage
HP 94
Intelligent Content via
Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example: If a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"
Understanding of the content is needed to create new metadata
Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
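The Roger Clemens example can be sketched as a lookup against a small knowledge base. The Python below is a minimal illustration, not Taalee's patented extraction technology: the KNOWLEDGE_BASE structure and both function names are assumptions, and the entries hold only facts stated in this workshop.

```python
# Hypothetical knowledge base: entity -> facts that may be absent from a story.
KNOWLEDGE_BASE = {
    "Roger Clemens": {"Sport": "Baseball", "Team": "New York Yankees"},
    "Jamal Anderson": {"Sport": "Football", "Team": "Atlanta Falcons", "League": "NFL"},
}


def add_value_added_metadata(text):
    """Extract known players from text and add their associated facts."""
    metadata = {"Players": []}
    for person, facts in KNOWLEDGE_BASE.items():
        if person in text:  # extracted: mentioned explicitly in the content
            metadata["Players"].append(person)
            for attribute, value in facts.items():  # value-added: from the KB
                values = metadata.setdefault(attribute, [])
                if value not in values:
                    values.append(value)
    return metadata


def find(stories, attribute, value):
    """Return stories whose enriched metadata has the given attribute value."""
    return [s for s in stories
            if value in add_value_added_metadata(s).get(attribute, [])]


if __name__ == "__main__":
    story = "Roger Clemens struck out ten batters on Tuesday."
    print(add_value_added_metadata(story))
    print(find([story], "Team", "New York Yankees"))
```

The story never mentions the Yankees, yet a search on Team = New York Yankees finds it, because the team was attached as metadata rather than matched as a keyword.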
HP 96
Guided Demo for Value Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com
Posted Date: 9/20/2000
League: NFL
Teams: Atlanta Falcons
Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team = Atlanta Falcons or League = NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player of the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN
Posted Date: 3/03/2001
League: National League
Teams: Los Angeles Dodgers
Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team = Los Angeles Dodgers or League = National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player of the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 - Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset, described by:
Producer Name: Name of content provider that produced the asset
Posted Date: Date of asset posting - extracted automatically
Player Names: Names of players mentioned explicitly in the asset - extracted automatically
Team Name: Name of team for which player plays - not mentioned explicitly in asset - value-added using Taalee's semantic relationships
League Name: Name of league to which the player's team belongs - not mentioned explicitly in asset - value-added by Taalee's processing based on semantic associations
Sport: Name of sport
Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.
The asset is richly and fully described in the many ways the users choose to interact.
HP 101
Intelligent Content via
Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context or relationships of keywords
For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (To view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Sources: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters)
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults KnowledgeBase for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story; passed on to Dashboard
Output: Story on Cisco; Story on Lucent
Metadata-centric Content Management Architecture
Extractor Agent 1, Extractor Agent 2, Extractor Agent 3
Semantic Engine
World Model
Taalee Metabase
XCM-compliant metadata (XML or other format)
Semantic Application (ASP/Enterprise hosted)
Third-party Content Mgmt and Syndication
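The four-step flow above (story arrives, knowledge base is consulted for competitors, a competitor's story is picked from external feeds) can be sketched as follows. The `COMPETES_WITH` table, the story dictionaries, and both function names are hypothetical stand-ins for illustration; the slides do not show the real KnowledgeBase interface.

```python
# Hypothetical knowledge base of COMPETES WITH associations (illustrative only).
COMPETES_WITH = {
    "Cisco": {"Lucent"},
    "Commerce One": {"Ariba"},
}


def competitors_of(companies):
    """Step 2-3: look up every known competitor of the given companies."""
    rivals = set()
    for company in companies:
        rivals |= COMPETES_WITH.get(company, set())
    return rivals


def semantically_related(story, external_feed):
    """Step 4: pick feed stories that mention a competitor of the story's companies."""
    rivals = competitors_of(story["companies"])
    return [item for item in external_feed if rivals & set(item["companies"])]


if __name__ == "__main__":
    cisco_story = {"title": "Story on Cisco", "companies": ["Cisco"]}
    feed = [
        {"title": "Story on Lucent", "companies": ["Lucent"]},
        {"title": "Story on weather", "companies": []},
    ]
    for item in semantically_related(cisco_story, feed):
        print(item["title"])
```

Note that the Lucent story is selected even though it never mentions Cisco; the link exists only in the knowledge base.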
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for the future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility.
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data.
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]
A personal definition
Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods.
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-Layered Architecture of Semantic Web
Logical Layer: Formal Semantics and Reasoning Support - OIL, DAML-O
Schema Layer: Definition of Vocabulary - RDF Schema
Data Layer: Simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL - Evolving towards Semantic Web
OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.
Knowledge Discovery - Example
Earthquake Sources(USGS NEIC)
Nuclear Test Sources(Oklahoma Observatory etc)
Nuclear Test May Cause Earthquakes
Is it really true
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.
NuclearTest Causes Earthquake <=
    dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30
    AND distance(NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude) < 10000
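The rule above can be written as an executable predicate. The sketch below makes several assumptions the slide does not state: dates are compared in whole days, the earthquake must not precede the test, distance is great-circle distance in miles computed with the haversine formula, and events are plain dictionaries with hypothetical keys.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an assumed constant


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """Apply the slide's rule: quake within max_days after the test and within max_miles."""
    days = (quake["event_date"] - test["event_date"]).days
    if not 0 <= days < max_days:
        return False
    miles = distance_miles(test["latitude"], test["longitude"],
                           quake["latitude"], quake["longitude"])
    return miles < max_miles


if __name__ == "__main__":
    # Made-up sample events, for illustration only.
    test = {"event_date": date(1998, 5, 11), "latitude": 27.1, "longitude": 71.8}
    quake = {"event_date": date(1998, 5, 30), "latitude": 37.1, "longitude": 70.1}
    print(may_have_caused(test, quake))
```

Because no two points on Earth are more than roughly 12,400 miles apart, the 10000-mile threshold from the slide excludes very few pairs; a real analysis would presumably use a much tighter radius.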
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year, starting from 1900.
Increase in number of earthquakes since 1945
Knowledge Discovery - Example...
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes.
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7.
Knowledge Discovery - Example...
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake.
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NewsML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
Semantic Web: www.semanticweb.org
VoiceXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 17
Dublin Core RDF
<xml>
<?namespace href = "http://w3.org/rdf-schema" as = "RDF"?>
<?namespace href = "http://metadata.net/DC" as = "DC"?>
<RDF:Abbreviated>
  <RDF:Assertion RDF:HREF = "http://www.mysite.com/mydoc.html"
    DC:Title = "I've Never Metadata I've Never Liked"
    DC:Creator = "Mary Crystal"
    DC:Subject = "Metadata, Dublin Core, Stuff"/>
</RDF:Abbreviated>
</xml>
HP 18
MOF (Meta Object Facility) and XMI
MOF models metadata using a subset of UML that is relevant to modeling metadata (class models: classes, associations, and subtyping), together with a set of rules for mapping the elements of the MOF Core to CORBA IDL.
XML Metadata Interchange (XMI) is an extension of the MOF into the XML space.
HP 19
NewsML
NewsML is a packaging and metadata format for news content.
NewsML is developed by the International Press Telecommunications Council (IPTC), a consortium of news providers, mostly in the print or wire-service industries.
Since it deals only with packaging and metadata, NewsML is complementary both to news content formats like NITF and to syndication protocols like ICE.
HP 20
NewsML...
It can be used by news providers to combine their pictures, video, text, graphics, and audio files in news output available on web sites, mobile phones, high-end desktops, interactive television, and any other device.
An accurate, objective set of description tools helps qualify the information and makes the search more precise.
NewsML allows a range of metadata to be attached to a multimedia story, including a detailed computer-readable description of what an item is about.
HP 21
Example of the end-to-end flow - NewsML
The content provider supplies NewsML-packaged media content to the operator. The content is categorized as current events, finance, sport, etc., and updated hourly.
The operator receives NewsML data from the content provider. The content server automatically pushes updated news articles to all news service subscribers.
Consumers sign up for the news service directly on the device. When using the news service, the user browses through the categories and reads the news articles. The news articles are presented in a continuous flow (one after the other) without end-user interaction.
Source: http://www.mediabricks.com
HP 22
PRISM
Publishing Requirements for Industry Standard Metadata
Version 1.0, April 2001
Authors: IDEAlliance (Adobe, Vignette, Kinecta, et al.)
Idea: "a standard for interoperable content description, interchange, and reuse in both traditional and electronic publishing contexts"
Web site: http://www.prismstandard.org
HP 23
PRISM Design
Built on existing standards like Dublin Core (DC), RDF, XML
Designed to be used in a simple, straightforward way over the Internet
Compatible with NewsML
Integrates easily with ICE (for syndication)
Vocabulary: Basic DC; Extensions; "Controlled Vocabularies", e.g. "North American Industrial Classification System" (NAICS)
HP 24
PRISM Example
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845"/>
    <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>
(Source: PRISM spec v. 1, http://www.prismstandard.org/techdev/prismspec1.asp)
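To show how an application might consume such a description, here is a minimal sketch that reads the Dublin Core elements out of an RDF/XML record with Python's standard `xml.etree.ElementTree`. The embedded sample is a cut-down version of the example above, with URL punctuation restored on the assumption that it was lost in extraction; the function name `dublin_core_fields` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Standard RDF and Dublin Core 1.1 namespace URIs.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Cut-down sample based on the slide; punctuation in the URLs is assumed.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>"""


def dublin_core_fields(xml_text):
    """Map each described resource to a dict of its Dublin Core element values."""
    root = ET.fromstring(xml_text)
    dc_prefix = "{%s}" % NS["dc"]
    about_attr = "{%s}about" % NS["rdf"]
    records = {}
    for description in root.findall("rdf:Description", NS):
        fields = {}
        for child in description:
            if child.tag.startswith(dc_prefix):
                name = child.tag[len(dc_prefix):]
                fields[name] = (child.text or "").strip()
        records[description.get(about_attr)] = fields
    return records


if __name__ == "__main__":
    for resource, fields in dublin_core_fields(SAMPLE).items():
        print(resource, fields)
```

A full RDF toolkit would handle more of the syntax (nested resources, repeated properties); plain element-tree parsing is enough for flat descriptions like this one.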
HP 25
VoiceXML
A language for specifying voice dialogs
Voice dialogs use audio prompts and text-to-speech (TTS) for output; touch-tone keys (DTMF) and automatic speech recognition (ASR) for input
Goal is to bring the advantages of web-based development and content delivery to interactive voice response applications
High-level, voice-specific language simplifies application development
Source: http://www.voicexml.org
HP 26
Voice Based Internet Applications
Source: http://www.voicexml.org
HP 27
Voice XML Metadata
Voice Specific metadata
Supports syntactic interoperability
Text data to voice data
Voice XML = XML + Voice Metadata
HP 28
VoiceXML - Possible Services
Information retrieval - News, sports, traffic, stock quotes
e-Transactions (e-commerce, e-tailing, etc.)
Financial: banking, stock trading
Catalog browsing (generally as an adjunct to paper)
Telephone services
Personal voice dialing, one-number "find-me" services
Intranet - Inventory, HR services, corporate portals
Unification - "My Whatever" personal portals, personal agents, unified messaging
Source: http://www.voicexml.org
HP 29
MPEG7
A set of description schemes and descriptors to describe the content of multimedia data
Provides a language to specify description schemes
A scheme for coding the description
HP 30
Application Examples for MPEG7
A few application examples are
Digital libraries (image catalog musical dictionary)
Multimedia directory services (eg yellow pages)
Broadcast media selection (radio channel TV channel)
HP 31
Information and Content Exchange (ICE)
Main Goal: efficient and extensible Content Syndication protocol for the Internet, using XML syntax
Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.
Status: latest spec version 1.1, May 2000; submitted to W3C for review
Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, ...
Web Site: http://www.icestandard.org
HP 32
What is the ICE Protocol
Syndication Protocol for communication between Syndicators and Subscribers
Metadata to define:
roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver
format and method of content exchange (e.g., sequenced packages, pull vs. push model)
HP 33
ICE Applications
ICE vocabulary + domain vocabulary = complete application
ICE establishes and manages the syndication, delivers data, and logs events => content-independent metadata
Industry-specific vocabulary defines the content => domain-specific metadata
Source: http://www.icestandard.org
HP 34
ICE Explained
ICE: Information and Content Exchange protocol.
Syndicator: A content aggregator and distributor.
Subscriber: A content consumer.
Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement.
Collection: The current content of a subscription.
ICE Package: A delivery of commands to update a collection, such as the addition of content items.
ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information.
Sources: InternetWeek; ICE Cookbook, version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm
<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe" timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon" filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru" copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC ... (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>
Content (domain-specific metadata)
HP 36
XCM (eXtended Content Management)
A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:
Content Development - Developing static content and managing the process of its subsequent approval, versioning, storage, and retrieval
Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle
Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability
Source: http://www.vignette.com/CDA/Site/0,2097,1-1-30-1458-1146-1743,00.html
HP 37
XCM: eXtended Content Management
Segments: Content Development; Application Content Management; Content Delivery
Components shown: Content Authoring; Digital Asset Management; Software Configuration Management; Document Process Management; Metadata Management; Recombination; Personalization; Edge Network Delivery; Streaming Media Delivery; Caching
Source: http://www.vignette.com
HP 38
Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain
Kansas State
FGDC Metadata Model
  Theme keywords: digital line graph, hydrography, transportation
  Title: Dakota Aquifer
  Online linkage: http://gisdasc.kgs.ukans.edu/dasc/
  Direct Spatial Reference Method: Vector
  Horizontal Coordinate System Definition: Universal Transverse Mercator
  ...
UDK Metadata Model
  Search terms: digital line graph, hydrography, transportation
  Topic: Dakota Aquifer
  Adress Id: http://gisdasc.kgs.ukans.edu/dasc/
  Measuring Techniques: Vector
  Co-ordinate System: Universal Transverse Mercator
  ...
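The tag-name mismatch between the two models is commonly bridged with a crosswalk table. The sketch below assumes the five field pairs shown on this slide correspond one-to-one in the order listed; the function names and the sample record are illustrative, not part of FGDC or UDK.

```python
# Crosswalk built from the field pairs on the slide (assumed one-to-one).
FGDC_TO_UDK = {
    "Theme keywords": "Search terms",
    "Title": "Topic",
    "Online linkage": "Adress Id",
    "Direct Spatial Reference Method": "Measuring Techniques",
    "Horizontal Coordinate System Definition": "Co-ordinate System",
}
UDK_TO_FGDC = {udk: fgdc for fgdc, udk in FGDC_TO_UDK.items()}


def translate(record, crosswalk):
    """Rename the keys of a metadata record; unmapped keys are kept unchanged."""
    return {crosswalk.get(tag, tag): value for tag, value in record.items()}


if __name__ == "__main__":
    fgdc_record = {
        "Title": "Dakota Aquifer",
        "Theme keywords": ["digital line graph", "hydrography", "transportation"],
        "Direct Spatial Reference Method": "Vector",
    }
    udk_record = translate(fgdc_record, FGDC_TO_UDK)
    print(udk_record)
    # Translating back recovers the original FGDC tag names.
    print(translate(udk_record, UDK_TO_FGDC) == fgdc_record)
```

A renaming table only resolves syntactic differences in tag names; differences in value vocabularies or granularity would need the ontology-level mappings discussed elsewhere in the workshop.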
HP 39
Different views of Metadata
Domain Independent Specifications (RDF)
Frameworks/Infrastructures (XCM)
Application Specific: ICE
Media Specific: MPEG7, VoiceXML
Domain Specific: NewsML, FGDC/UDK
HP 40
Creating and Serving Metadata to Power the Life-cycle of Content
Taalee Infrastructure Services; Taalee Content Applications
Where is the content?
Whose is it?
Produce/Aggregate
Catalog/Index
What other content is it related to?
Integrate/Syndicate
What is the right content for this user?
Personalize
What is the best way to monetize this interaction?
Interactive Marketing
Broadcast, Wireline, Wireless, Interactive TV
Taalee Semantic MetaBase
HP 41
Taalee's Intelligent Content Process
HP 42
Metadata Creation and Semanticization
• Automatic Content Classification/Categorization
• Metadata Creation/Extraction
Types of metadata created
HP 43
Forms/Types/Ingest of Content
Sources: Web Sites, Content Feeds, and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)
HP 44
Content Handling/Ingest
Infrastructure/Exchange
Feed Handlers, Crawlers, Screen Scrapers, Bots, Software Agents
Centralized, Distributed, Mobile/Migratory
HP 45
Information Extraction for Metadata Creation
Global/Enterprise/Web Repositories
METADATA EXTRACTORS
Digital Maps
Nexis, UPI, AP
Documents
Digital Audios
Data Stores
Digital Videos
Digital Images
HP 46
Extracting a Text Document: Syntactic approach
INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT
NATIONAL PREPAREDNESS LEVEL II
CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.
SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.
CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.
LAYOUT
Date => day month int ',' int
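The layout rule above (a day name, a month name, an integer, a comma, and an integer) can be approximated with a regular expression. This is a minimal sketch of the syntactic approach, not the extractor used in the original system; the commas are treated as optional because the extracted report text may have lost them, and `extract_date` is a hypothetical helper name.

```python
import re
from datetime import date

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# Date => day month int ',' int   (commas optional in this sketch)
DATE_RE = re.compile(
    r"\b(%s),?\s+(%s)\s+(\d{1,2}),?\s+(\d{4})\b" % ("|".join(DAYS), "|".join(MONTHS))
)


def extract_date(text):
    """Return the first date matching the layout rule, or None if there is none."""
    match = DATE_RE.search(text)
    if match is None:
        return None
    _day_name, month_name, day, year = match.groups()
    return date(int(year), MONTHS.index(month_name) + 1, int(day))


if __name__ == "__main__":
    header = "INCIDENT MANAGEMENT SITUATION REPORT Friday, August 1, 1997 - 0530 MDT"
    print(extract_date(header))
```

Such a rule finds the report date purely from surface form; it knows nothing about what the date means, which is the limitation the later semantic slides address.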
Traditional Text Categorization
Customer Article Feed: Article 4715
Statistical/AI Techniques (Customer Training Set)
Classify: Place in a taxonomy
Routing/Distribution
Standard Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Classification of Article 4715
Taalee's Categorization & Automatic Metadata Creation
Article Feed: Article 4715
Knowledge-base & Statistical/AI Techniques (Taalee Training Set, Customer Training Set)
Classify: Place in a taxonomy
Map to another taxonomy
Metadata Catalog
Content Manager
Routing/Distribution
Precise syndication/filtering
Automated Content Enrichment (ACE)
Taalee Enterprise Customization Suite
Article 4715 Metadata
Standard metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Semantic metadata
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News
Classification of Article 4715
FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
NYSE: Member Companies, Market News, IPOs
HP 49
Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.
Video Segment with Associated Text
Segment Description
Semantic Metadata
Auto Categorization
HP 50
Automatic Categorization & Metadata Tagging (Web page)
Video with Editorialized Text on the Web
Auto Categorization
Semantic Metadata
HP 51
Automatic Categorization & Metadata Tagging (Feed)
Text from Bloomberg
Auto Categorization
Semantic Metadata
HP 52
Taalee Extraction and Knowledgebase Enhancement
Extraction Agent
Web Page Enhanced Metadata Asset
HP 53
Basis for Semantics
A. Facts/Concepts/Terms/Entities
Dictionary, Thesaurus, Reference Data, Vocabulary
B. Facts with Relationships
Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)
Knowledge Base
HP 54
Basis for Semantics
C. Reasoning/Inference
(Statistical) (Information Retrieval)
Statistical Learning/AI (Bayesian, Neural Networks, HMM, ...)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methods: Cluster Analysis, Learning/AI and Collab. Filtering
Reference data (Concept-terms, Dictionary, Thesaurus): by topic/industry/subject/domain; Word or Phrase
Ontologies/Domain Models
KnowledgeBase: by Entities and Relationships
(deeper understanding)
HP 56
Open Directory Project (ODP): Classification/Taxonomy & Directory
HP 57
Ontology
Standardize meaning, description, representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes: Attributes, Domain Rules, Functional Dependencies
HP 59
An Ontology
Example: Interrelated ontologies
(Figure: four linked ontologies; node labels recovered from the diagram)
LANDUSE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL
LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE
WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; shredding, magnetic separation, screening, washing
NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI, linked by "causes" relationships
HP 61
Large Vocabularies, Taxonomies, Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.
Confidential HP
Metadata-Enabled Applications
HP 63
Metadata Usage: Impact on Search & Query processing
traditional queries based on keywords
attribute-based queries
content-based queries
HP 64
Oingo.com
Oingo Ontology - ODP based(?); the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning, used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely-related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged, and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:
• Context of the Content
• Semantic Metadata describing entities (i.e., Company, Industry, etc.) and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
"Normal" Content can only be "found" if the user enters a keyword that exists within it
Intelligent Content=+
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:
Context of the Content
Semantic Metadata describing entities (i.e., Company, Industry, etc.) and
Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, ...
HP 70
Metadata Usage Keyword Attribute and Content Based Access
The VisualHarness system at LSDIS/UGA
HP 71
Keyword Search vs. Attribute Search with Semantic metadata
Virage Search on "football touchdown"
Metadata from Typical Cataloging of Football Assets
Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL httpcbssportsline
Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL httpcbssportsline
Taalee Metadata on Football Assets
Rich Media Reference Page: Baltimore 31, Pit 24 (http://www.nfl.com)
Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder to extend the Ravens' lead to 31-24 in the third quarter.
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
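The contrast between the two kinds of search can be sketched in a few lines. The asset records below reuse values from this slide, but the record layout and the functions `keyword_search` and `attribute_search` are assumptions made for illustration, not the Virage or Taalee interfaces.

```python
# Sample assets; field names and layout are illustrative assumptions.
ASSETS = [
    {"title": "Jimmy Smith Interview Part Seven",
     "description": "Jimmy Smith explains his philosophy on showboating",
     "metadata": {}},
    {"title": "Brian Griese Interview Part Four",
     "description": "Brian Griese talks about the first touchdown he ever threw",
     "metadata": {}},
    {"title": "Baltimore 31 Pit 24",
     "description": "Quandry Ismail and Tony Banks hook up for their third long touchdown",
     "metadata": {"league": "Professional", "teams": ["Ravens", "Steelers"],
                  "players": ["Quandry Ismail", "Tony Banks"], "event": "Touchdown"}},
]


def keyword_search(assets, *words):
    """Return assets whose title or description contains every keyword."""
    hits = []
    for asset in assets:
        text = (asset["title"] + " " + asset["description"]).lower()
        if all(word.lower() in text for word in words):
            hits.append(asset)
    return hits


def attribute_search(assets, **criteria):
    """Return assets whose semantic metadata matches every attribute given."""
    hits = []
    for asset in assets:
        meta = asset["metadata"]
        if all(_matches(meta.get(name), wanted) for name, wanted in criteria.items()):
            hits.append(asset)
    return hits


def _matches(actual, wanted):
    # List-valued attributes match when the wanted value is one of the items.
    if isinstance(actual, list):
        return wanted in actual
    return actual == wanted


if __name__ == "__main__":
    print([a["title"] for a in keyword_search(ASSETS, "touchdown")])
    print([a["title"] for a in attribute_search(ASSETS, event="Touchdown", teams="Ravens")])
```

The keyword query returns any asset that happens to contain the word, including an interview about a past touchdown, while the attribute query returns only the asset catalogued as a touchdown event involving the named team.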
HP 72
Taalee's Semantic Search
Highly customizable, precise, and freshest A/V search
Context and Domain Specific Attributes
Uniform Metadata for Content from Multiple Sources; can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY amp CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content
Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries & Hot Topics
PERSONALIZATION
Personalized Queries
1. My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
more...
2. My Football Fantasy Team
Gators' Spurrier ready for big game
Tech's Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone...
Edwards explains reasons for leaving BYU
more...
3. Julia Roberts Collection
Movie Trailer: Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer: Stepmom
Conspiracy Theory
more...
4. Pink Floyd Collection
Set the Controls for the Heart of the Sun...
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
more...
HOT Topics
1. Election 2000
Video: Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
more...
2. Middle East Peace Conflict
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Cole's security
more...
3. Napster Controversy
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II
more...
HP 80
Metadata Targeting
Semantic/Interactive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
Inputs: Realtime Feeds; Web sites and Pages; Content Databases
Time-Shifted Content Aggregator
User: Interests, Preferences
Semantic Engine(TM)
Structured Hi-Quality Semantic Metabase
Output: Personalized Content
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site
User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and the knowledge of user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions listen set alert on story add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst
HP 87
iTV: Taalee's Extreme Personalization
Content Provider
(DBS DISH Wink AOL-TV)
Semantic EngineTM
Meta-DataTagged Content
ContentldquoProgramsrdquo
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-QualitySemantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by personalization application to show only the stocks that user is interested in
This screen is customizable with interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO
Part of the screen can be automatically customized to show conference call specific information, including transcript, participation, etc., all of which are relevant metadata
Conference Call itself can have embedded metadata to support personalization and interactivity
HP 89
Metadata in Enterprise Apps
Filter Search ConsolidatePersonalize ArchiveLicensing Syndication
Production SupportProduction SupportSony
Categorize
Catalog
Integrate
Collection, Processing
Network Content
Affiliate Feeds
Public Sources Rich Data
Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp. (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore
HP 92
Retrieve Scene Description Track
Enhanced Digital Cable
Video
MPEGDecoder
Node = AVO Object
Create Scene Description Tree
GREAT USER EXPERIENCE
Metadata's role in emerging iTV infrastructure
MPEG-2/4/7
MPEG Encoder
Scene Description Tree
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers and Broadcasters
Enhanced XML
Description
"Cisco Systems"
Node
Taalee Semantic Engine
"Cisco Systems"
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta
Object Content Information (OCI)
Metadata-rich Value-added Node
HP 93
Intelligent Metadata Creation
Extractor Agents: Content which does contain the words the user asked for
+ Value-added Metadata: Content which does not contain the words the user asked for but is about what he asked for
+ Semantic Associations: Content the user did not think to ask for but which he needs to know
Metadata for Intelligent Content Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".
Understanding of the content is needed to create new metadata.
Taalee understands that Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely.
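The value-added metadata idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Taalee's implementation: the `KNOWLEDGE_BASE` dictionary, the relation names (`plays_for`, `plays_sport`), and the `enrich` and `matches` helpers are all hypothetical names invented for this sketch.

```python
# Hypothetical mini knowledge base: (subject, relation) -> object.
# The facts mirror the slide's example; the structure is an assumption.
KNOWLEDGE_BASE = {
    ("Roger Clemens", "is_a"): "PERSON",
    ("Roger Clemens", "plays_sport"): "Baseball",
    ("Roger Clemens", "plays_for"): "New York Yankees",
    ("New York Yankees", "is_a"): "TEAM",
}


def enrich(metadata):
    """Return a copy of metadata with team and sport added for each known player."""
    enriched = {field: list(values) for field, values in metadata.items()}
    for player in metadata.get("Players", []):
        for relation, field in (("plays_for", "Teams"), ("plays_sport", "Sport")):
            value = KNOWLEDGE_BASE.get((player, relation))
            if value and value not in enriched.setdefault(field, []):
                enriched[field].append(value)
    return enriched


def matches(metadata, query):
    """Case-insensitive substring match of query against any metadata value."""
    needle = query.lower()
    return any(needle in value.lower()
               for values in metadata.values() for value in values)


# Metadata extracted from a story that never mentions the team by name.
story = {"Players": ["Roger Clemens"]}
enriched_story = enrich(story)
```

With only the extracted metadata, `matches(story, "Yankees")` is false; after enrichment, the same query matches because the team name was added from the knowledge base rather than from the story text.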
HP 96
Guided Demo for Value Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL
Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League
Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 - Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Posted Date: Date of asset posting - extracted automatically
Producer Name: Name of content provider that produced the asset
Player Names: Names of players mentioned explicitly in the asset - extracted automatically
Team Name: Name of team for which player plays - not mentioned explicitly in asset - value-added using Taalee's semantic relationships
League Name: Name of league to which the player's team belongs - not mentioned explicitly in asset - value-added by Taalee's processing based on semantic associations
Sport: Name of sport
Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships
The asset is richly and fully described in the many ways users choose to interact
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context, or relationships of keywords
For example, a search engine may see that the words "Commerce One" occur, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
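A minimal sketch of how semantic associations could drive "related content" lookups, assuming a simple triple store. The triples come from the slide's Commerce One example; the relation identifiers, the `build_index` and `related_stories` helpers, and the story records are hypothetical placeholders, not Taalee's Semantic Content Model.

```python
from collections import defaultdict

# Hypothetical (subject, RELATION, object) triples based on the slide's example.
TRIPLES = [
    ("Commerce One", "IS_A", "COMPANY"),
    ("Commerce One", "PARTICIPATES_IN", "Corporate, Professional & Financial Software"),
    ("Commerce One", "COMPETES_WITH", "Ariba"),
]

# Assumption: competition is treated as a symmetric relation.
SYMMETRIC = {"COMPETES_WITH"}


def build_index(triples):
    """Index objects by (subject, relation), mirroring symmetric relations."""
    index = defaultdict(set)
    for subject, relation, obj in triples:
        index[(subject, relation)].add(obj)
        if relation in SYMMETRIC:
            index[(obj, relation)].add(subject)
    return index


def related_stories(entity, stories, index):
    """Return (titles about the entity, titles about its competitors only)."""
    competitors = index.get((entity, "COMPETES_WITH"), set())
    asked = [s["title"] for s in stories if entity in s["companies"]]
    related = [s["title"] for s in stories
               if entity not in s["companies"] and competitors & set(s["companies"])]
    return asked, related


# Placeholder catalog entries; titles are invented for the sketch.
STORIES = [
    {"title": "Commerce One story", "companies": ["Commerce One"]},
    {"title": "Ariba story", "companies": ["Ariba"]},
    {"title": "Unrelated story", "companies": ["Some Other Company"]},
]

INDEX = build_index(TRIPLES)
asked_for, need_to_know = related_stories("Commerce One", STORIES, INDEX)
```

A keyword search would return only the first list; the second list is what the association adds, since the Ariba story never has to mention Commerce One.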
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (To view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Internal Source 1Research
Internal Source 2
External feedsWeb(eg Reuters)
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults KnowledgeBase for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard
Story on Lucent
Story on Cisco
XCM-compliant metadata, XML, or other format
Semantic Application
ASP/Enterprise hosted
Extractor Agent 1
Extractor Agent 2
Extractor Agent 3
Metadata-centric Content Management Architecture
Semantic Engine
World Model
Taalee Metabase
Third-party Content Mgmt and Syndication
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News: COMPANIES in Same or Related INDUSTRY
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS
Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for the future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]
A personal definition:
Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three layered Architecture Of Semantic Web
Logical Layer
Formal Semantics and Reasoning Support - OIL, DAML-O
Schema Layer
Definition of Vocabulary RDF Schema
Data Layer
Simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL - Evolving towards Semantic Web
OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Tests May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake <=
    dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30
    AND distance(NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude) < 10000
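The rule above can be sketched as an executable predicate. Several details here are assumptions rather than anything the slides specify: the great-circle (haversine) distance, miles as the distance unit, the requirement that the earthquake not precede the test, and the dictionary field names; the event records are hypothetical.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; the unit choice is an assumption


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, using the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def could_have_caused(test, quake, max_days=30, max_miles=10000):
    """Apply the rule: the quake follows the test within max_days and max_miles."""
    days = (quake["eventDate"] - test["eventDate"]).days
    if not 0 <= days < max_days:
        return False
    return distance_miles(test["latitude"], test["longitude"],
                          quake["latitude"], quake["longitude"]) < max_miles


# Hypothetical events, invented purely to exercise the rule.
nuclear_test = {"eventDate": date(2000, 1, 1), "latitude": 0.0, "longitude": 0.0}
quake_soon = {"eventDate": date(2000, 1, 10), "latitude": 0.0, "longitude": 1.0}
quake_late = {"eventDate": date(2000, 3, 1), "latitude": 0.0, "longitude": 1.0}
quake_before = {"eventDate": date(1999, 12, 25), "latitude": 0.0, "longitude": 1.0}
```

Note that a match only says the pair satisfies the temporal and spatial conditions; it is a candidate relationship to investigate, not evidence of causation.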
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale, per year, starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example…
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year, starting from 1900, find the average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
Knowledge Discovery - Example…
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 18
MOF (Meta Object Facility) and XMI
MOF models metadata using a subset of UML that is relevant to modeling metadata (class models: classes, associations, and subtyping), together with a set of rules for mapping the elements of the MOF Core to CORBA IDL
XML Metadata Interchange (XMI) is an extension of the MOF into the XML space
HP 19
NewsML
NewsML is a packaging and metadata format for news content
NewsML is developed by the International Press Telecommunications Council (IPTC), a consortium of news providers, mostly in the print or wire-service industries
Since it deals only with packaging and metadata, NewsML is complementary both to news content formats like NITF and to syndication protocols like ICE
HP 20
NewsML…
It can be used by news providers to combine their pictures, video, text, graphics, and audio files in news output available on web sites, mobile phones, high-end desktops, interactive television, and any other device
Accurate, objective set of description tools which help qualify the information and make the search more precise
NewsML allows a range of metadata to be attached to a multimedia story, including a detailed computer-readable description of what an item is about
HP 21
Example of the end-to-end flow - NewsML
The content provider supplies NewsML-packaged media content to the operator. The content is categorized as current events, finance, sport, etc. and updated hourly
The operator receives NewsML data from the content provider. The content server automatically pushes updated news articles to all news service subscribers
Consumers sign up for the news service directly on the device. When using the news service, the user browses through the categories and reads the news articles. The news articles are presented in a continuous flow (one after the other) without end-user interaction
Source: http://www.mediabricks.com
HP 22
PRISM
Publishing Requirements for Industry Standard Metadata
Version 1.0, April 2001
Authors: IDEAlliance (Adobe, Vignette, Kinecta, et al.)
Idea: "a standard for interoperable content description, interchange, and reuse in both traditional and electronic publishing contexts"
Web site: http://www.prismstandard.org
HP 23
PRISM Design
Built on existing standards like Dublin Core (DC), RDF, XML
Designed to be used in a simple, straightforward way over the Internet
Compatible with NewsML
Integrates easily with ICE (for syndication)
Vocabulary:
Basic: DC
Extensions: "Controlled Vocabularies", e.g. "North American Industrial Classification System" (NAICS)
HP 24
PRISM Example
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845" />
    <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>
(Source: PRISM spec v1, http://www.prismstandard.org/techdev/prismspec1.asp)
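As a rough illustration of how an application might consume such a record, the sketch below reads the Dublin Core fields with Python's standard `xml.etree.ElementTree`. The `dublin_core_fields` helper is a hypothetical name, the embedded record is a shortened copy of the PRISM example with its punctuation restored, and a real consumer would more likely use a full RDF parser.

```python
import xml.etree.ElementTree as ET

# Namespace URIs for RDF and Dublin Core 1.1.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Shortened copy of the PRISM example record (assumed punctuation restored).
PRISM_RECORD = """<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845" />
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>
""".encode("utf-8")


def dublin_core_fields(xml_bytes):
    """Map each described resource to a dict of its Dublin Core element values."""
    root = ET.fromstring(xml_bytes)
    dc_prefix = "{%s}" % NS["dc"]
    records = {}
    for description in root.findall("rdf:Description", NS):
        about = description.get("{%s}about" % NS["rdf"])
        fields = {}
        for child in description:
            if child.tag.startswith(dc_prefix):
                name = child.tag[len(dc_prefix):]
                # Use the element text, or fall back to an rdf:resource reference.
                text = (child.text or "").strip()
                fields[name] = text or child.get("{%s}resource" % NS["rdf"])
        records[about] = fields
    return records


records = dublin_core_fields(PRISM_RECORD)
```

The result is keyed by the `rdf:about` URI, so `records["http://wanderlust.com/2000/08/Corfu.jpg"]["title"]` gives the asset title.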
HP 25
VoiceXML
A language for specifying voice dialogs
Voice dialogs use audio prompts and text-to-speech (TTS) for output; touch-tone keys (DTMF) and automatic speech recognition (ASR) for input
Goal is to bring the advantages of web-based development and content delivery to interactive voice response applications
High-level, voice-specific language simplifies application development
Source: http://www.voicexml.org
HP 26
Voice Based Internet Applications
Source: http://www.voicexml.org
HP 27
Voice XML Metadata
Voice Specific metadata
Supports syntactic interoperability
Text data to voice data
Voice XML = XML + Voice Metadata
HP 28
VoiceXML ndash Possible Services
Information retrieval - News, sports, traffic, stock quotes
e-Transactions (e-commerce, e-tailing, etc.)
Financial: banking, stock trading
Catalog browsing (generally as an adjunct to paper)
Telephone services
Personal voice dialing, One-number "find-me" services
Intranet - Inventory, HR services, corporate portals
Unification - "My Whatever" personal portals, personal agents, unified messaging
Source: http://www.voicexml.org
HP 29
MPEG7
A set of description schemes and descriptors to describe the content of multimedia data
Provides a language to specify description schemes
A scheme for coding the description
HP 30
Application Examples for MPEG7
A few application examples are
Digital libraries (image catalog musical dictionary)
Multimedia directory services (eg yellow pages)
Broadcast media selection (radio channel TV channel)
HP 31
Information and Content Exchange (ICE)
Main Goal: efficient and extensible Content Syndication protocol for the Internet, using XML syntax
Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.
Status: latest spec version 1.1, May 2000; submitted to W3C for review
Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, …
Web Site: http://www.icestandard.org
HP 32
What is the ICE Protocol
Syndication Protocol for communication between Syndicators and Subscribers
Metadata to define:
roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver
format and method of content exchange (e.g. sequenced packages, pull vs. push model)
HP 33
ICE Applications
ICE vocabulary + domain vocabulary = complete application
ICE:
establishes and manages the syndication
delivers data
logs events
=> content-independent metadata
Industry-specific vocabulary defines the content => domain-specific metadata
Source: http://www.icestandard.org
HP 34
ICE Explained
ICE: Information and Content Exchange protocol
Syndicator: A content aggregator and distributor
Subscriber: A content consumer
Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement
Collection: The current content of a subscription
ICE Package: A delivery of commands to update a collection, such as the addition of content items
ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information
Sources: InternetWeek; ICE Cookbook version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm
<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe"
             timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon"
                filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru"
                     copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC … (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>
Content (domain-specific metadata)
HP 36
XCM (eXtended Content Management)
A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:
Content Development - Developing static content and managing the process of its subsequent approval, versioning, storage, and retrieval
Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle
Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability
Source: http://www.vignette.com/CDA/Site/0,2097,1-1-30-1458-1146-1743,00.html
HP 37
XCM
eXtended Content Management
Segments: Content Development | Application Content Management | Content Delivery
Content Authoring, Digital Asset Management
Software Configuration Management
Document Process Management
Metadata Management, Recombination, Personalization
Edge Network Delivery
Streaming Media Delivery, Caching
Source: http://www.vignette.com
HP 38
Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain
FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
…
UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Address Id: http://gisdasc.kgs.ukans.edu/dasc
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
…
Kansas State
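One common way to reconcile two such models is a tag crosswalk. The sketch below is a minimal illustration using only the five tag pairs shown on the slide; the `FGDC_TO_UDK` table and the `translate` and `invert` helpers are hypothetical names, and a real crosswalk would also need to handle tags that have no one-to-one counterpart.

```python
# Hypothetical crosswalk from FGDC tag names to the UDK names shown on the slide.
FGDC_TO_UDK = {
    "Theme keywords": "Search terms",
    "Title": "Topic",
    "Online linkage": "Address Id",
    "Direct Spatial Reference Method": "Measuring Techniques",
    "Horizontal Coordinate System Definition": "Co-ordinate System",
}


def translate(record, crosswalk):
    """Rename the keys of record using crosswalk; unmapped keys are kept unchanged."""
    return {crosswalk.get(tag, tag): value for tag, value in record.items()}


def invert(crosswalk):
    """Build the reverse mapping (valid only because the crosswalk is one-to-one)."""
    return {target: source for source, target in crosswalk.items()}


fgdc_record = {
    "Title": "Dakota Aquifer",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}
udk_record = translate(fgdc_record, FGDC_TO_UDK)
```

Translating `udk_record` back with `invert(FGDC_TO_UDK)` recovers the original FGDC record, which is a quick sanity check that the mapping loses nothing for these fields.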
HP 39
Different views of Metadata
Domain Independent: Specifications (RDF), Frameworks/Infrastructures (XCM)
Metadata
Application Specific: ICE
Media Specific: MPEG7, VoiceXML
Domain Specific: NewsML, FGDC/UDK
HP 40
Creating and Serving Metadata to Power the Life-cycle of Content
Taalee Infrastructure Services | Taalee Content Applications
Where is the content?
Whose is it?
Produce/Aggregate
Catalog/Index
What other content is it related to?
Integrate/Syndicate
What is the right content for this user?
Personalize
What is the best way to monetize this interaction?
Interactive Marketing
Broadcast, Wireline, Wireless, Interactive TV
Taalee Semantic MetaBase
HP 41
Taalee's Intelligent Content Process
HP 42
Metadata Creation and Semanticization
• Automatic Content Classification/Categorization
• Metadata Creation/Extraction: types of metadata created
Semantic Engine and WorldModel are trademarks of Taalee Inc. Metadata Extraction is a patented technology of Taalee Inc.
HP 43
Forms/Types/Ingest of Content
Sources: Web Sites, Content Feeds, and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+Media), Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)
HP 44
Content Handling/Ingest
Infrastructure: Exchange, Feed Handlers, Crawlers, Screen Scrapers, Bots, Software Agents
Centralized, Distributed, Mobile/Migratory
HP 45
Information Extraction for Metadata Creation
Global/Enterprise/Web Repositories
METADATA EXTRACTORS
Digital Maps
NexisUPIAP
Documents
Digital Audios
Data Stores
Digital Videos
Digital Images
HP 46
Extracting a Text Document: Syntactic approach
INCIDENT MANAGEMENT SITUATION REPORT
Friday August 1, 1997 - 0530 MDT
NATIONAL PREPAREDNESS LEVEL II
CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.
SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.
CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. The fire is contained. Major areas of heat have been mopped-up. All crews and overhead will mop-up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.
LAYOUT
Date => day month int ',' int
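A layout rule of this kind maps naturally onto a regular expression. The sketch below is one possible reading of the rule, assuming `day` is a weekday name, `month` is a month name, and the two `int` tokens are day-of-month and year; the pattern and the `extract_date` helper are illustrative, and the comma is treated as optional because the extracted report text does not show it.

```python
import re

DAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"
MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")

# Date => day month int ',' int  (commas treated as optional in this sketch)
DATE_PATTERN = re.compile(
    rf"\b({DAYS}),?\s+({MONTHS})\s+(\d{{1,2}}),?\s+(\d{{4}})\b"
)


def extract_date(text):
    """Return the first date matching the layout rule as a dict, or None."""
    match = DATE_PATTERN.search(text)
    if match is None:
        return None
    day, month, day_of_month, year = match.groups()
    return {
        "day": day,
        "month": month,
        "day_of_month": int(day_of_month),
        "year": int(year),
    }


header = "INCIDENT MANAGEMENT SITUATION REPORT Friday August 1 1997 - 0530 MDT"
report_date = extract_date(header)
```

This is purely syntactic extraction: it finds the date because of its shape, without any understanding of what the report is about, which is the limitation the later slides on semantic metadata address.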
Traditional Text Categorization
Statistical/AI Techniques
Classify: Place in a taxonomy
feed
Customer Training Set
Routing/Distribution
Customer Article Feed
4715
Standard Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Classification of Article 4715
Knowledge-base & Statistical/AI Techniques
Classify: Place in a taxonomy
Metadata Catalog
Content Manager
Precise syndication/filtering
fd
Article 4715 Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News
Standard metadata
Semantic metadata
FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
NYSE: Member Companies, Market News, IPOs
Automated Content Enrichment (ACE)
Taalee Enterprise Customization Suite
Taalee's Categorization & Automatic Metadata Creation
Taalee Training Set
Customer Training Set
ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
Classification of Article 4715
Article Feed 4715
Routing/Distribution
Map to another taxonomy
HP 49
Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION AS OF TONIGHT THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49 IN WASHINGTON STATE THE SENATE RACE REMAINS TOO CLOSE TO CALL IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN IUMBENT THE SENATE WILL BE EVENLY DIVIDED IN MISSOURI REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN WHO DIED IN A CRASH THREE WEEKS AGO GOVERNOR CARNAHANS WIFE IS EXPECTED TO TAKE HIS PLACE IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT HILLARY CLINTON WON THE NEW YORK SENATE SEAT SHE IS THE FIRST FIRST LADY TO RUN MUCH LESS WIN
Video Segment with Associated Text
Segment Description
Semantic Metadata
Auto Categorization
HP 50
Automatic Categorization & Metadata Tagging (Web page)
Video with Editorialized Text on the Web
Auto Categorization
Semantic Metadata
HP 51
Automatic Categorization & Metadata Tagging (Feed)
Text from Bloomberg
Auto Categorization
Semantic Metadata
HP 52
Taalee Extraction and Knowledgebase Enhancement
Extraction Agent
Web Page Enhanced Metadata Asset
HP 53
Basis for Semantics
A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary
B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g. Golf = golfer, tournament name, golf course, event)
Knowledge Base
HP 54
Basis for Semantics
C. Reasoning/Inference
(Statistical) (Information Retrieval)
Statistical Learning/AI (Bayesian, Neural Networks, HMM, …)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methods: Cluster Analysis
Learning/AI and Collab Filtering
Reference data/Concept-terms/Dictionary/Thesaurus: by topic/industry/subject/domain
Word or Phrase
Ontologies/Domain Models
KnowledgeBase: by Entities and Relationships
deeper understanding
HP 56
Open Directory Project (ODP): Classification/Taxonomy & Directory
HP 57
Ontology
Standardize meaning, description, and representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes:
Attributes
Domain Rules
Functional Dependencies
HP 59
An Ontology
Example Interrelated ontologies
LANDUSE
COMMERCIAL
INDUSTRIALRURAL
RESIDENTIAL
AGRICULTURAL
MILITARYRECREATIONAL
LAND(SITE)
CULTIVATEDAREA
GREENLANDAREA LAND
BANK
ZONING
LANDFILLSITE
WASTEDISPOSAL
RECYCLING
HAZARDOUS
LANDFILLRESOURCE REC
SOLID SEWAGE
shredding
magneticseparation
screening
washing
NATURALDISASTER
EARTHQUAKE
causes
LANDSLIDE
VOLCANO
STORMFLOOD
FIRE
AVALANCHE
TSUNAMI
causes
causes
causes
HP 61
Large Vocabularies, Taxonomies, Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms
Metadata-enabled Applications
HP 63
Metadata Usage: Impact on Search & Query processing
traditional queries based on keywords
attribute-based queries
content-based queries
HP 64
Oingo.com
Oingo Ontology - ODP based(): the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely-related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged, and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing
• Context of the Content
• Semantic Metadata describing entities (i.e., Company, Industry, etc.) and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
"Normal" Content can only be "found" if the user enters a keyword that exists within it
Intelligent Content=+
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create
• Context of the Content
• Semantic Metadata describing entities (i.e., Company, Industry, etc.) and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, ...
HP 70
Metadata Usage Keyword Attribute and Content Based Access
The VisualHarness system at LSDISUGA
HP 71
Keyword Search vs Attribute Search with Semantic metadata
Virage Search on football touchdown
Jimmy Smith Interview Part SevenJimmy Smith explains his philosophy on showboating URL httpcbssportsline
Brian Griese Interview Part FourBrian Griese talks about the first touchdown he ever threw URL httpcbssportsline
Metadata from Typical Cataloging of Football
Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31 Pit 24
httpwwwnflcom
Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2022000
HP 72
Taalee's Semantic Search
Highly customizable, precise, and freshest AV search
Context and Domain Specific Attributes
Uniform Metadata for Content from Multiple Sources; can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY amp CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content: Personalization and Targeting, interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries amp Hot Topics
PERSONALIZATION
3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
4 Pink Floyd Collection
Personalized Queries
Set the Controls for the Heart of the Sun...
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
2 My Football Fantasy Team
Gators Spurrier ready for big game
Techs Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone...
Edwards explains reasons for leaving BYU more...
1 Election 2000
2 Middle East Peace Conflict
3 Napster Controversy
Video Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Coles security
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II more...
HOT Topics
HP 80
Metadata Targeting
SemanticInteractive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
Realtime Feeds
Interests Preferences
Time-ShiftedContent Aggregator
Web sites and Pages
ContentDatabases Personalized
Content
Semantic EngineTM
Personalized Content
Content
Structured Hi-Quality
Semantic Metabase
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site
User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and the knowledge of user's portfolio. Typically search can be done by typing one word or selecting from a dynamic personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions listen set alert on story add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
1108 ON24 Payne
1107 ON24 H&Q
1106 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata such as "Strong Buy" by H&Q Analyst
HP 87
iTV: Taalee's Extreme Personalization
Content Provider
(DBS DISH Wink AOL-TV)
Semantic EngineTM
Meta-DataTagged Content
ContentldquoProgramsrdquo
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-QualitySemantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by personalization application to show only the stocks that user is interested in
This screen is customizable with interactivity feature using metadata such as whether there is a new Conference Call video on CSCO
Part of the screen can be automatically customized to show conference call specific information, including transcript, participation, etc., all of which are relevant metadata
Conference Call itself can have embedded metadata to support personalization and interactivity
HP 89
Metadata in Enterprise Apps
Filter Search ConsolidatePersonalize ArchiveLicensing Syndication
Production SupportProduction SupportSony
Categorize
Catalog
Integrate
CollectionCollection ProcessingProcessing
NetworkContent
AffiliateFeeds
Public Sources Rich Data
Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday killing more than 60 people near Nigerias commercial capital of Lagos Many of the dead were fisherman in wooden canoes engulfed in the inferno
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko facing the central business district of Lagos across a lagoon
At least 60 people died in this needless fire senior local official Karimu Alabi said
Fire crews from state-run Nigerian National Petroleum Corp (NNPC) which owns the pipeline were joined by other firemen from construction company Julius Berger in battling the blaze
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak ravaging a cluster of huts and log houses
At about the same time a second fire razed Makoko shantytown where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days Many victims of the Ebute-Oke fire were
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla Cant Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --Gore Demands That Recount Restart
Gore Says Fla Cant Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(133) - 120600 - ABC
(253) - 120600 - CBS
(516) - 120600 - ABC
(246) - 120600 - FOX
(133) - 120600 - NBC
(533) - 120600
(357) - 120600 - CBS
(427) - 120600 - ABC
(344) - 120600 - FOX
(724) - 120600 - CBS
(133) - 120600 - CBS
TALLAHASSEE Florida (CNN) ndashThough the two presidential candidates have until noon Wednesday to file briefs in Al Gores appeal to the Florida Supreme Court the outcome of two trials set on the same day in Leon County Florida may offer Gore his best hope for the presidency Democrats in Seminole County are seeking to have 15000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5000 votes statewide Lawyers for the plaintiff Harry Jacobs claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2126 incomplete absentee ballot applications sent in by GOP voters while refusing to allow Democratic workers to do the same thing for Democratic voters
The GOP says that suit and one similar to it from Martin County demonstrates Democratic Party politics at its most desperate Gore is not a party to either of those lawsuits On Tuesday the judge in the
(133) - 120600 - ABC
(233) - 120600 - CBS
(312) - 120600 - NNS
(032) - 120600 - CBS
(133) - 120600 - CBS
Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore
HP 92
Retrieve Scene Description Track
Enhanced Digital Cable
Video
MPEGDecoder
Node = AVO Object
Create Scene Description Tree
GREATUSER
EXPERIENCE
Metadata's role in emerging iTV infrastructure
MPEG-2/4/7
MPEG Encoder
Scene Description Tree
License metadata decoder and semantic applications to
device makers
Channel salesthrough Video Server Vendors
Video App Servers and Broadcasters
Enhanced XML
Description
"Cisco Systems"
Node
Taalee Semantic Engine
"Cisco Systems"
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta
Object Content Information (OCI)
Metadata-richValue-added Node
HP 93
Intelligent Metadata Creation
Content which does contain the words the user asked for
Extractor Agents
Content which does not contain the words the user asked for, but is about what he asked for
Value-added Metadata
Content the user did not think to ask for, but which he needs to know
Semantic Associations
+ +
Metadata for Intelligent Content
Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example, if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"
Understanding of the content is needed to create new metadata
Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (compare: COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
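The Roger Clemens example can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Taalee's implementation: the knowledge base contents, the field names (`players`, `teams`, `sports`), and the functions `enrich` and `matches` are all hypothetical.

```python
# Hypothetical miniature knowledge base: each entity maps to typed relationships.
KNOWLEDGE_BASE = {
    "Roger Clemens": {
        "type": "PERSON",
        "plays_sport": "Baseball",
        "plays_for": "New York Yankees",
    },
}

# Which knowledge-base relationship fills which metadata field (assumed names).
RELATION_TO_FIELD = {"plays_for": "teams", "plays_sport": "sports"}


def enrich(metadata):
    """Return a copy of the metadata with team and sport added for known players."""
    enriched = {field: list(values) for field, values in metadata.items()}
    for player in metadata.get("players", []):
        facts = KNOWLEDGE_BASE.get(player, {})
        for relation, field in RELATION_TO_FIELD.items():
            value = facts.get(relation)
            if value is not None and value not in enriched.setdefault(field, []):
                enriched[field].append(value)
    return enriched


def matches(metadata, query):
    """True if the query equals any metadata value, ignoring case."""
    wanted = query.lower()
    return any(wanted == value.lower()
               for values in metadata.values() for value in values)


story = {"players": ["Roger Clemens"]}   # metadata extracted from the text alone
enriched_story = enrich(story)           # metadata after adding associated facts
```

With the extracted metadata alone, a query for "New York Yankees" misses the story; after enrichment the same query finds it, which is the effect the slide describes.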
HP 96
Guided Demo for Value Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player of the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player of the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 - Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Posted Date: date of asset posting. Extracted automatically
Producer Name: name of content provider that produced the asset
Player Names: names of players mentioned explicitly in the asset. Extracted automatically
Team Name: name of team for which player plays. Not mentioned explicitly in asset; value-added using Taalee's semantic relationships
League Name: name of league to which the player's team belongs. Not mentioned explicitly in asset; value-added by Taalee's processing based on semantic associations
Sport: name of sport
Legend: an arrow from X to Y means Taalee uses X to add Y as value-added metadata to the asset
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships
The asset is richly and fully described in the many ways the users choose to interact
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context, or relationships of keywords
For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
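A small Python sketch of how a "competes with" association could pull in related news. The triples, story records, and function names are hypothetical, and treating `competes_with` as symmetric is an assumption of this sketch rather than something the slides state.

```python
# Hypothetical relationship triples: (subject, relation, object).
RELATIONSHIPS = [
    ("Commerce One", "participates_in", "Corporate Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]

# Hypothetical catalog of stories, each tagged with the companies it mentions.
STORIES = [
    {"title": "Commerce One story", "companies": ["Commerce One"]},
    {"title": "Ariba story", "companies": ["Ariba"]},
    {"title": "Unrelated story", "companies": ["Cisco"]},
]


def related_entities(entity, relation):
    """Entities linked to `entity` by `relation`, read in both directions."""
    found = set()
    for subject, rel, obj in RELATIONSHIPS:
        if rel != relation:
            continue
        if subject == entity:
            found.add(obj)
        elif obj == entity:          # assumption: the relation is symmetric
            found.add(subject)
    return found


def search_with_associations(company):
    """Return stories about the company plus stories about its competitors."""
    direct = [s["title"] for s in STORIES if company in s["companies"]]
    competitors = related_entities(company, "competes_with")
    related = [s["title"] for s in STORIES if competitors & set(s["companies"])]
    return {"asked_for": direct, "competitors": related}


result = search_with_associations("Commerce One")
```

The `asked_for` list is what a keyword engine could return; the `competitors` list is the associated content reached only through the relationship.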
HP 104
Internal Source 1: Research
Internal Source 2
External feeds, Web (e.g., Reuters)
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults KnowledgeBase for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story, and passed on to Dashboard
Story onLucent
Story onCisco
XCM-compliant metadata XML or other format
SemanticApplication
ASPEnterprise hosted
Extractor Agent 1
Extractor Agent 2
Extractor Agent 3
Metadata centricContent Management Architecture
SemanticEngine
World Model
TaaleeMetabase
Third-partyContent Mgmt
AndSyndication
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
SEC, EPA
Regulations: Impacting INDUSTRY or Filed By COMPANY
Technology, Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic SearchPersonalization etc
A vision for future
Semantic Web Complex Relationships and Knowledge Discovery
Eg InfoQuilt project at LSDIS Lab Univ of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic rules which allow us to draw inference from this data
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture" [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]
A personal definition: Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)
a proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent readable Tags
www.daml.org
DAML Example
Source httpwwwzdnetcompcweekstoriesjumps04270243294600html
Three layered Architecture Of Semantic Web
Logical Layer
Formal Semantics and Reasoning Support - OIL, DAML-O
Schema Layer
Definition of Vocabulary RDF Schema
Data Layer
Simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL ndash Evolving towards Semantic Web
OIL Mission
OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Test May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake <=
    dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30
    AND distance(NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude) < 10000
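The rule above can be read as a predicate over two event records. The sketch below is one possible Python rendering, not the InfoQuilt implementation. It assumes the thresholds are in days and miles, that the earthquake must not precede the test, and that distance is great-circle (haversine) distance; the record layout and function names are hypothetical, and the sample events are invented.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in degrees (haversine)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """True if the quake follows the test within max_days and max_miles."""
    days = (quake["event_date"] - test["event_date"]).days
    if not 0 <= days < max_days:   # assumption: quake must not precede the test
        return False
    return distance_miles(test["latitude"], test["longitude"],
                          quake["latitude"], quake["longitude"]) < max_miles


# Invented sample events, for illustration only.
nuclear_test = {"event_date": date(2000, 1, 1), "latitude": 10.0, "longitude": 20.0}
quake_soon = {"event_date": date(2000, 1, 11), "latitude": 11.0, "longitude": 21.0}
quake_earlier = {"event_date": date(1999, 12, 25), "latitude": 11.0, "longitude": 21.0}
quake_late = {"event_date": date(2000, 3, 1), "latitude": 11.0, "longitude": 21.0}
```

Note that a 10000-mile radius covers most of the globe (the largest possible great-circle distance is roughly 12,400 miles), so under these assumptions the date condition does most of the filtering.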
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year, starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example...
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
Knowledge Discovery - Example...
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake
Demo
ResourcesReferences
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 19
NewsML
NewsML is a packaging and metadata format for news content
NewsML is developed by the International Press Telecommunications Council (IPTC), a consortium of news providers, mostly in the print or wire-service industries
Since it deals only with packaging and metadata, NewsML is complementary both to news content formats like NITF and to syndication protocols like ICE
HP 20
NewsML...
It can be used by news providers to combine their pictures, video, text, graphics, and audio files in news output available on web sites, mobile phones, high-end desktops, interactive television, and any other device
accurate, objective set of description tools which help qualify the information and make the search more precise
NewsML allows a range of metadata to be attached to a multi-media story, including a detailed computer-readable description of what an item is about
HP 21
Example of the end-to-end flow -NewsML
The content provider supplies NewsML packaged media content to the operator. The content is categorized as current events, finance, sport, etc., and updated hourly
The operator receives NewsML data from the content provider The content server automatically pushes updated news articles to all news service subscribers
Consumers sign up for the news service directly on the device When using the news service the user browses through the categories and reads the news articles The news articles are presented in a continuous flow (one after the other) without end-user interaction
Sourcehttpwwwmediabrickscom
HP 22
PRISM
Publishing Requirements for Industry Standard Metadata
Version 1.0, April 2001
Authors: IDEAlliance (Adobe, Vignette, Kinecta, et al.)
Idea: "a standard for interoperable content description, interchange, and reuse in both traditional and electronic publishing contexts"
Web site: http://www.prismstandard.org
HP 23
PRISM Design
Built on existing standards like Dublin Core (DC), RDF, XML
Designed to be used in a simple, straightforward way over the Internet
Compatible with NewsML
Integrates easily with ICE (for syndication)
Vocabulary: Basic DC
Extensions: "Controlled Vocabularies", e.g., "North American Industrial Classification System" (NAICS)
HP 24
PRISM Example
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845" />
    <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>
(Source: PRISM spec v1, http://www.prismstandard.org/techdev/prismspec1.asp)
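To show how an application might consume such a record, here is a minimal Python sketch that reads the Dublin Core fields of an RDF description using only the standard library. The embedded record is a shortened copy of the PRISM example with punctuation restored by assumption, and the function name `dublin_core_fields` is hypothetical. The `rdf` and `dc` namespace URIs are the standard ones.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Shortened copy of the example record (URLs reconstructed by assumption).
RECORD = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>
"""


def dublin_core_fields(rdf_xml):
    """Map each described resource to a dict of its Dublin Core text fields."""
    root = ET.fromstring(rdf_xml.strip())
    dc_prefix = "{%s}" % NS["dc"]
    about_attr = "{%s}about" % NS["rdf"]
    records = {}
    for description in root.findall("rdf:Description", NS):
        fields = {}
        for child in description:
            if child.tag.startswith(dc_prefix):
                # Strip the namespace to get the bare element name, e.g. "title".
                fields[child.tag[len(dc_prefix):]] = (child.text or "").strip()
        records[description.get(about_attr)] = fields
    return records


records = dublin_core_fields(RECORD)
```

A full RDF parser would also handle `rdf:resource` references and alternative serializations; this sketch only covers the simple literal-valued case shown on the slide.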
HP 25
VoiceXML
A language for specifying voice dialogs
Voice dialogs use audio prompts and text-to-speech (TTS) for output; touch-tone keys (DTMF) and automatic speech recognition (ASR) for input
Goal is to bring the advantages of web-based development and content delivery to interactive voice response applications
High-level, voice-specific language simplifies application development
Source: http://www.voicexml.org
HP 26
Voice Based Internet Applications
Source httpwwwvoicexmlorg
HP 27
Voice XML Metadata
Voice Specific metadata
Supports Syntactic interoperability
Text data to voice data
Voice XML = XML + Voice Metadata
HP 28
VoiceXML ndash Possible Services
Information retrieval - News, sports, traffic, stock quotes
e-Transactions (e-commerce, e-tailing, etc.)
Financial: banking, stock trading
Catalog browsing (generally as an adjunct to paper)
Telephone services
Personal voice dialing, One-number find-me services
Intranet - Inventory, HR services, corporate portals
Unification - My Whatever: personal portals, personal agents, unified messaging
Source httpwwwvoicexmlorg
HP 29
MPEG7
A set of description schemes and descriptors to describe the content of multimedia data
Provides a language to specify description schemes
A scheme for coding the description
HP 30
Application Examples for MPEG7
A few application examples are
Digital libraries (image catalog musical dictionary)
Multimedia directory services (eg yellow pages)
Broadcast media selection (radio channel TV channel)
HP 31
Information and Content Exchange (ICE)
Main Goal: efficient and extensible Content Syndication protocol for the Internet using XML syntax
Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.
Status: latest spec version 1.1, May 2000; submitted to W3C for review
Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, ...
Web Site: http://www.icestandard.org
HP 32
What is the ICE Protocol
Syndication Protocol for communication between Syndicators and Subscribers
Metadata to define
roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver
format and method of content exchange (e.g., sequenced packages, pull vs. push model)
HP 33
ICE Applications
ICE vocabulary + domain vocabulary = complete application
ICE establishes and manages the syndication, delivers data, and logs events => content-independent metadata
industry-specific vocabulary defines the content => domain-specific metadata
Source httpwwwicestandardorg
HP 34
ICE Explained
ICE: Information and Content Exchange protocol
Syndicator: A content aggregator and distributor
Subscriber: A content consumer
Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement
Collection: The current content of a subscription
ICE Package: A delivery of commands to update a collection, such as the addition of content items
ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information
Sources: InternetWeek; ICE Cookbook version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm
ltxml version=10gtltDOCTYPE ice-payload SYSTEM httpicedtdgtltice-payload payload-id=ipl-80a56cfe
timestamp=05-15-2001T110001 iceversion=10 gt
ltice-response response-id=irp-20010515181600gt ltice-item-group group-id= grp-8610gtltice-item item-id=4321
subscription-element=4321 name=Cartoon filename=demogif content-type=applicationxml gt
ltcomic-strip title=Looney City author=Amito Pateru copyright=Taalee Makeups pubdate=20010515gt
PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC hellip(ASCII-encoded image)
ltcomic-stripgtltice-itemgt ltice-item-groupgt
ltice-responsegt ltice-payloadgt
Content (domain-specific
metadata)
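The split between ICE (content-independent) metadata and domain-specific metadata in a payload like this one can be sketched in Python. This is a minimal illustration, not part of ICE itself: the payload below is a cleaned-up copy of the slide's example with the DOCTYPE omitted and the image data shortened, and the function name split_metadata is an assumption made for this sketch.

```python
import xml.etree.ElementTree as ET

# Cleaned-up copy of the slide's example payload (DOCTYPE omitted, image data shortened).
PAYLOAD = """<ice-payload payload-id="ipl-80a56cfe"
    timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon"
                filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru"
                     copyright="Taalee Makeups" pubdate="20010515">PdXIWZQ8...</comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>"""


def split_metadata(xml_text):
    """Return (ICE item attributes, domain-specific attributes of the carried content)."""
    root = ET.fromstring(xml_text)
    item = root.find(".//ice-item")
    envelope = dict(item.attrib)        # content-independent: set by the syndication protocol
    content = item[0]                   # first child carries the domain vocabulary
    domain = {"element": content.tag, **content.attrib}
    return envelope, domain


envelope, domain = split_metadata(PAYLOAD)
print("ICE (content-independent):", envelope)
print("Domain-specific:", domain)
```

The ICE attributes say how the item is delivered; only the comic-strip attributes say what the content is.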
HP 36
XCM (eXtended Content Management)
A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:
• Content Development - Developing static content and managing the process of its subsequent approval, versioning, storage, and retrieval
• Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle
• Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability
Source: http://www.vignette.com/CDA/Site/0,2097,1-1-30-1458-1146-1743,00.html
HP 37
XCM: eXtended Content Management
Content Development/Management: Content Authoring, Digital Asset Management, Software Configuration Management, Document Process Management
Application Content Management: Metadata Management, Recombination, Personalization
Content Delivery: Edge Network Delivery, Streaming Media Delivery, Caching
Source: http://www.vignette.com
HP 38
Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain

FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc/
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
…

UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Address Id: http://gisdasc.kgs.ukans.edu/dasc/
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
…

Kansas State
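One simple way to reconcile two models that use different tag names for the same data is an explicit tag mapping. The sketch below is only an illustration: the mapping table is read off the five tag pairs shown on this slide (with spellings cleaned up), and the function name fgdc_to_udk is an assumption, not part of either standard.

```python
# Tag correspondence read off the slide; a real crosswalk would cover many more tags.
FGDC_TO_UDK = {
    "Theme keywords": "Search terms",
    "Title": "Topic",
    "Online linkage": "Address Id",
    "Direct Spatial Reference Method": "Measuring Techniques",
    "Horizontal Coordinate System Definition": "Co-ordinate System",
}


def fgdc_to_udk(record):
    """Rename FGDC tags to their UDK counterparts; unmapped tags pass through unchanged."""
    return {FGDC_TO_UDK.get(tag, tag): value for tag, value in record.items()}


fgdc_record = {
    "Theme keywords": ["digital line graph", "hydrography", "transportation"],
    "Title": "Dakota Aquifer",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}
udk_record = fgdc_to_udk(fgdc_record)
print(udk_record)
```

A flat rename only works when the two models agree on the meaning of each field; where they differ, the mapping needs an ontology rather than a lookup table.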
HP 39
Different views of Metadata
Domain Independent Specifications (RDF)
Frameworks/Infrastructures (XCM)
Application Specific: ICE
Media Specific: MPEG7, VoiceXML
Domain Specific: NewsML, FGDC/UDK
HP 40
Creating and Serving Metadata to Power the Life-cycle of Content
Taalee Infrastructure Services, Taalee Content Applications
Where is the content? Whose is it?
Produce/Aggregate
Catalog/Index
What other content is it related to?
Integrate, Syndicate
What is the right content for this user?
Personalize
What is the best way to monetize this interaction?
Interactive Marketing
Broadcast, Wireline, Wireless, Interactive TV
Taalee Semantic MetaBase
HP 41
Taalee's Intelligent Content Process
HP 42
Metadata Creation and Semanticization
• Automatic Content Classification/Categorization
• Metadata Creation/Extraction: types of metadata created
Semantic Engine and WorldModel are trademarks of Taalee Inc. Metadata Extraction is a patented technology of Taalee Inc.
HP 43
Forms/Types/Ingest of Content
Sources: Web sites, content feeds, and private repositories
Types: Text, graphics, audio, video, multimedia
Forms: Unstructured text, semi-structured text, structured text (+ media); static or dynamic
Ingest: Feed (push), Web (pull), repository/database (usually pull)
HP 44
Content Handling/Ingest
Infrastructure/Exchange: feed handlers, crawlers, screen scrapers, bots, software agents
Centralized, distributed, mobile/migratory
HP 45
Information Extraction for Metadata Creation
Global/Enterprise/Web Repositories
METADATA EXTRACTORS
Sources: Digital Maps; Nexis, UPI, AP; Documents; Digital Audios; Data Stores; Digital Videos; Digital Images
HP 46
Extracting a Text Document: Syntactic approach
INCIDENT MANAGEMENT SITUATION REPORT
Friday August 1, 1997 - 0530 MDT
NATIONAL PREPAREDNESS LEVEL II
CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.
SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr. The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.
CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.
LAYOUT
Date => day month int ',' int
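A layout rule of this kind (day, month, integer, comma, integer) maps naturally onto a regular expression. The following is a minimal sketch of the syntactic approach, assuming English day and month names; the names DATE_RE and extract_date are illustrative and are not taken from any Taalee tool.

```python
import re
from datetime import date

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# Date => day month int ',' int   (commas are treated as optional here)
DATE_RE = re.compile(
    r"(?P<day>" + "|".join(DAYS) + r"),?\s+"
    r"(?P<month>" + "|".join(MONTHS) + r")\s+"
    r"(?P<dom>\d{1,2}),?\s+(?P<year>\d{4})"
)


def extract_date(text):
    """Return the first date matching the layout rule, or None if there is none."""
    m = DATE_RE.search(text)
    if m is None:
        return None
    month_number = MONTHS.index(m.group("month")) + 1
    return date(int(m.group("year")), month_number, int(m.group("dom")))


report = "INCIDENT MANAGEMENT SITUATION REPORT Friday August 1, 1997 - 0530 MDT"
print(extract_date(report))
```

Such patterns are cheap and precise for well-laid-out reports, but they capture form only; they do not know what the date refers to.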
Traditional Text Categorization
Figure labels: Customer Article Feed (Article 4715); Customer Training Set; Statistical/AI Techniques; Classify/Place in a taxonomy; Routing/Distribution
Standard Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Classification of Article 4715
Taalee's Categorization & Automatic Metadata Creation
Automated Content Enrichment (ACE); Taalee Enterprise Customization Suite
Figure labels: Article Feed (Article 4715); Taalee Training Set; Customer Training Set; Knowledge-base & Statistical/AI Techniques; Classify/Place in a taxonomy; Map to another taxonomy; Metadata Catalog; Content Manager; Routing/Distribution; Precise syndication/filtering
Article 4715 Metadata
Standard metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Semantic metadata
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News
Classification of Article 4715
FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
NYSE: Member Companies, Market News, IPOs
HP 49
Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.
Video Segment with Associated Text
Segment Description
Semantic Metadata
Auto Categorization
HP 50
Automatic Categorization & Metadata Tagging (Web page)
Video with Editorialized Text on the Web
Auto Categorization
Semantic Metadata
HP 51
Automatic Categorization & Metadata Tagging (Feed)
Text from Bloomberg
Auto Categorization
Semantic Metadata
HP 52
Taalee Extraction and Knowledgebase Enhancement
Extraction Agent
Web Page Enhanced Metadata Asset
HP 53
Basis for Semantics
A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary
B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)
Knowledge Base
HP 54
Basis for Semantics
C. Reasoning/Inference
(Statistical) (Information Retrieval)
Statistical Learning/AI (Bayesian, Neural Networks, HMM, …)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methods, Cluster Analysis
Learning/AI and Collaborative Filtering
Reference data/Concept-terms/Dictionary/Thesaurus: by topic/industry/subject/domain (word or phrase)
Ontologies/Domain Models
KnowledgeBase: by entities and relationships
Deeper understanding
HP 56
Open Directory Project (ODP): Classification/Taxonomy & Directory
HP 57
Ontology
Standardize meaning, description, and representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes: Attributes, Domain Rules, Functional Dependencies
HP 59
An Ontology
Example: Interrelated ontologies (figure)
LANDUSE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL; LAND (SITE), CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE
WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; shredding, magnetic separation, screening, washing
NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI; several of these are linked by "causes" relationships
HP 61
Large Vocabularies, Taxonomies, Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.
Metadata-enabled Applications
HP 63
Metadata Usage: Impact on Search & Query Processing
• traditional queries based on keywords
• attribute-based queries
• content-based queries
HP 64
Oingo.com
Oingo Ontology - ODP based (?): the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning used to state intent; the basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged, and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:
• Context of the Content
• Semantic Metadata describing entities (i.e., Company, Industry, etc.) and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
"Normal" Content can only be "found" if the user enters a keyword that exists within it
Intelligent Content=+
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:
• Context of the Content
• Semantic Metadata describing entities (i.e., Company, Industry, etc.) and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search.
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …
HP 70
Metadata Usage Keyword Attribute and Content Based Access
The VisualHarness system at LSDIS/UGA
HP 71
Keyword Search vs. Attribute Search with Semantic Metadata
Virage search on "football touchdown":
Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline…
Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline…
Metadata from Typical Cataloging of Football Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31, Pit 24
http://www.nfl.com
Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter.
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2022000
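The contrast between keyword search and attribute search over semantic metadata can be sketched as follows. This is an illustrative toy, not Virage's or Taalee's implementation: the three assets reuse the text and metadata shown on this slide, and the function names keyword_search and attribute_search are assumptions.

```python
ASSETS = [
    {"title": "Jimmy Smith Interview Part Seven",
     "text": "Jimmy Smith explains his philosophy on showboating",
     "metadata": {}},
    {"title": "Brian Griese Interview Part Four",
     "text": "Brian Griese talks about the first touchdown he ever threw",
     "metadata": {}},
    {"title": "Baltimore 31, Pit 24",
     "text": "Quandry Ismail and Tony Banks hook up for their third long touchdown",
     "metadata": {"League": "Professional", "Teams": ["Ravens", "Steelers"],
                  "Players": ["Quandry Ismail", "Tony Banks"], "Event": "Touchdown"}},
]


def keyword_search(assets, query):
    """Syntactic search: every query word must literally occur in the asset text."""
    words = query.lower().split()
    return [a["title"] for a in assets if all(w in a["text"].lower() for w in words)]


def attribute_search(assets, criteria):
    """Attribute search: every (field, value) pair must match the asset's metadata."""
    def matches(meta, field, wanted):
        value = meta.get(field)
        return wanted in value if isinstance(value, list) else value == wanted
    return [a["title"] for a in assets
            if all(matches(a["metadata"], f, w) for f, w in criteria.items())]


print(keyword_search(ASSETS, "touchdown"))
print(attribute_search(ASSETS, {"Event": "Touchdown", "Teams": "Ravens"}))
```

The keyword query returns anything that mentions the word, including an interview; the attribute query returns only the asset whose metadata says a touchdown event involving the Ravens took place.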
HP 72
Taalee's Semantic Search
Highly customizable, precise, and freshest A/V search
Context and Domain Specific Attributes. Uniform Metadata for Content from Multiple Sources. Can be sorted by any field.
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do?
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY amp CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content; Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
PERSONALIZATION: Personalized Queries & Hot Topics
Personalized Queries
1. My Stock Portfolio: Microsoft suffers serious hack attack; Cisco Systems Inc; Analyst Safa Rashtchy on Yahoo; PeopleSoft Inc; AT&T Corp; more…
2. My Football Fantasy Team: Gators' Spurrier ready for big game; Tech's Vick looks to become complete QB; Bucs excited about Hamilton; Jasper Sanks rumbles into the end zone…; Edwards explains reasons for leaving BYU; more…
3. Julia Roberts Collection: Movie Trailer: Notting Hill; Trailer - Runaway Bride; Patrick; Movie Trailer: Stepmom; Conspiracy Theory; more…
4. Pink Floyd Collection: Set the Controls for the Heart of the Sun…; Wish You Were Here; Round And Around; Keep Talking; The Post War Dream; more…
HOT Topics
1. Election 2000: Video: Explaining the electoral map; Race for White House hots up; Seniors Give Gore Florida Edge; more…
2. Middle East Peace Conflict: More die as Israel steps up security; Israel braces for suicide bombs; Pentagon probes Cole's security; more…
3. Napster Controversy: The Brain Behind Napster; Napster Lawsuit; Creative Nomad II; more…
HP 80
Metadata Targeting
Semantic/Interactive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
Figure labels: Realtime Feeds; Web sites and Pages; Content Databases; Time-Shifted Content Aggregator; Interests, Preferences; Semantic Engine(TM); Structured Hi-Quality Semantic Metabase; Personalized Content
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site.
User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories.
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide).
Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu.
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst.
HP 87
iTV: Taalee's Extreme Personalization
Figure labels: Content Provider (DBS, DISH, Wink, AOL-TV); Content "Programs"; Meta-Data Tagged Content; Semantic Engine(TM); Structured Hi-Quality Semantic Metabase; Immediate Interests, Preferences; Personalized Content Capsules; Redirects and Programming
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.
This screen is customizable with an interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO.
Part of the screen can be automatically customized to show conference call specific information, including transcript, participation, etc., all of which are relevant metadata.
The Conference Call itself can have embedded metadata to support personalization and interactivity.
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support (Sony)
Collection: Network Content, Affiliate Feeds, Public Sources
Processing: Categorize, Catalog, Integrate
Rich Data Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction, a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday, the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore
HP 92
Metadata's role in emerging iTV infrastructure
Figure labels: MPEG Encoder (MPEG-2/4/7); Create Scene Description Tree; Node = AVO Object; Taalee Semantic Engine; Enhanced XML Description; Object Content Information (OCI); Metadata-rich Value-added Node ("Cisco Systems"); Retrieve Scene Description Track; MPEG Decoder; Enhanced Digital Cable; Video; GREAT USER EXPERIENCE
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers, and Broadcasters
Example node metadata:
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta
HP 93
Intelligent Metadata Creation
Extractor Agents: Content which does contain the words the user asked for
+ Value-added Metadata: Content which does not contain the words the user asked for, but is about what he asked for
+ Semantic Associations: Content the user did not think to ask for, but which he needs to know
Metadata for Intelligent Content
Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.
• If a keyword is not in the content, it cannot be found.
• The burden is on the user to think of and ask for the "right" keyword.
For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".
Understanding of the content is needed to create new metadata.
Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses such Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely.
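The idea of adding metadata that is implied by, but absent from, the content can be sketched with a small lookup table of facts. This is a toy illustration, not Taalee's Semantic Engine: the KNOWLEDGE_BASE dictionary holds only relationships stated in this tutorial, and the field names and the function enrich are assumptions made for the sketch.

```python
# Toy knowledge base: entity -> {metadata field: related entity}.
KNOWLEDGE_BASE = {
    "Roger Clemens": {"Sport": "Baseball", "Teams": "New York Yankees"},
    "Jamal Anderson": {"Teams": "Atlanta Falcons"},
    "Atlanta Falcons": {"League": "NFL"},
}


def enrich(metadata, kb):
    """Follow known relationships from extracted entities and add the implied fields."""
    enriched = {field: list(values) for field, values in metadata.items()}
    queue = [value for values in metadata.values() for value in values]
    seen = set(queue)
    while queue:
        entity = queue.pop(0)
        for field, related in kb.get(entity, {}).items():
            bucket = enriched.setdefault(field, [])
            if related not in bucket:
                bucket.append(related)      # value-added metadata
            if related not in seen:         # follow chains such as player -> team -> league
                seen.add(related)
                queue.append(related)
    return enriched


extracted = {"Players": ["Jamal Anderson"]}   # what the extractor found in the story
print(enrich(extracted, KNOWLEDGE_BASE))
```

After enrichment, a search for the team or the league reaches a story that mentions only the player.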
HP 96
Guided Demo for Value-Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team = Atlanta Falcons or League = NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value-Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team = Los Angeles Dodgers or League = National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata.
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search.
HP 99
Example 2 - Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata.
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search.
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.
The asset is richly, fully described in the many ways the users choose to interact.
Producer Name: Name of content provider that produced the asset
Posted Date: Date of asset posting - extracted automatically
Player Names: Names of players mentioned explicitly in the asset - extracted automatically
Team Name: Name of team for which the player plays - not mentioned explicitly in asset - value-added using Taalee's semantic relationships
League Name: Name of league to which the player's team belongs - not mentioned explicitly in asset - value-added by Taalee's processing based on semantic associations
Sport: Name of sport
Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context, or relationships of keywords
For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
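A semantic association of this kind can be represented as subject, relationship, object triples and followed at query time. The sketch below is a hypothetical illustration rather than Taalee's Semantic Content Model: the three triples restate the Commerce One example, the stories are invented placeholders, and treating competes_with as symmetric is an assumption.

```python
# Facts from the example, as (subject, relationship, object) triples.
TRIPLES = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Commerce One", "participates_in", "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]
SYMMETRIC = {"competes_with"}   # assumption: if A competes with B, B competes with A

# Placeholder stories, invented purely for illustration.
STORIES = [
    {"headline": "Hypothetical story on Commerce One", "companies": ["Commerce One"]},
    {"headline": "Hypothetical story on Ariba", "companies": ["Ariba"]},
    {"headline": "Hypothetical story on an unrelated firm", "companies": ["Other Co"]},
]


def related(entity, relation):
    """Entities linked to `entity` by `relation`, in either direction if symmetric."""
    found = [o for s, r, o in TRIPLES if s == entity and r == relation]
    if relation in SYMMETRIC:
        found += [s for s, r, o in TRIPLES if o == entity and r == relation]
    return found


def stories_about(entities):
    return [s["headline"] for s in STORIES if set(s["companies"]) & set(entities)]


def semantic_search(company):
    """What the user asked for, plus news on the company's competitors."""
    return {
        "asked_for": stories_about([company]),
        "competition": stories_about(related(company, "competes_with")),
    }


print(semantic_search("Commerce One"))
```

A keyword index alone would return only the first list; the second list exists because the relationship is stored as data.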
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata-centric Content Management Architecture
Sources: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters)
Extractor Agent 1, Extractor Agent 2, Extractor Agent 3
Semantic Engine, World Model, Taalee Metabase
Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication
XCM-compliant metadata (XML or other format)
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults KnowledgeBase for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard
Story on Cisco; Story on Lucent
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked For + What You Need to Know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS
COMPANIES in Same or Related INDUSTRY
Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY
Technology/Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for the future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling is obviously not enough:
• we need a "logic layer" on top of RDF
• some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]
A personal definition
Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods.
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-layered Architecture of Semantic Web
Logical Layer: Formal Semantics and Reasoning Support - OIL, DAML-O
Schema Layer: Definition of Vocabulary - RDF Schema
Data Layer: Simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL - Evolving towards Semantic Web
OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Test May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake <=
  dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30
  AND distance(NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude) < 10000
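The rule above can be sketched in a few lines of Python. This is only an illustration, not the system shown on the slide: the `Event` class, the function names, the haversine distance, the use of miles, and the choice to count a same-day earthquake as "after" the test are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; the unit (miles) is an assumption


@dataclass
class Event:
    """Hypothetical record for a nuclear test or an earthquake."""
    event_date: date
    latitude: float
    longitude: float


def distance_miles(a: Event, b: Event) -> float:
    """Great-circle (haversine) distance between two events, in miles."""
    lat1, lon1, lat2, lon2 = map(
        radians, (a.latitude, a.longitude, b.latitude, b.longitude)
    )
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(h))


def could_have_caused(test: Event, quake: Event,
                      max_days: int = 30, max_miles: float = 10000) -> bool:
    """Apply the slide's rule: the quake follows the test within max_days
    and lies within max_miles of the test site."""
    days = (quake.event_date - test.event_date).days
    return 0 <= days < max_days and distance_miles(test, quake) < max_miles
```

Evaluating `could_have_caused` over every (test, earthquake) pair yields the candidate "causes" relationships that the later slides query.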
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example...
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale per year, starting from 1900, find the average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
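The per-year, per-magnitude-range query described above might look like the following sketch. The input shape (a list of `(year, magnitude)` pairs), the half-open treatment of the range boundaries, and all function names are assumptions for illustration; no real earthquake data is included.

```python
from collections import Counter

# Magnitude ranges from the slide, treated as half-open [low, high) by assumption.
BINS = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, float("inf"))]


def magnitude_bin(magnitude):
    """Return the (low, high) range containing magnitude, or None if below 5.8."""
    for low, high in BINS:
        if low <= magnitude < high:
            return (low, high)
    return None


def counts_per_year(quakes, start_year=1900):
    """Count earthquakes per (year, magnitude range) from (year, magnitude) pairs."""
    counts = Counter()
    for year, magnitude in quakes:
        bin_ = magnitude_bin(magnitude)
        if year >= start_year and bin_ is not None:
            counts[(year, bin_)] += 1
    return counts


def average_per_year(counts, bin_, first_year, last_year):
    """Average yearly count for one magnitude range over an inclusive year span."""
    total = sum(
        n for (year, b), n in counts.items()
        if b == bin_ and first_year <= year <= last_year
    )
    return total / (last_year - first_year + 1)
```

Comparing `average_per_year` for the spans before and after 1945 in each range is one way to check the slide's observation that only the lower-magnitude ranges show an increase.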
Knowledge Discovery - Example...
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 20
NewsML...
It can be used by news providers to combine their pictures, video, text, graphics and audio files in news output available on web sites, mobile phones, high-end desktops, interactive television and any other device
Accurate, objective set of description tools which help qualify the information and make the search more precise
NewsML allows a range of metadata to be attached to a multi-media story, including a detailed computer-readable description of what an item is about
HP 21
Example of the end-to-end flow - NewsML
The content provider supplies NewsML packaged media content to the operator. The content is categorized as current events, finance, sport, etc. and updated hourly
The operator receives NewsML data from the content provider. The content server automatically pushes updated news articles to all news service subscribers
Consumers sign up for the news service directly on the device. When using the news service, the user browses through the categories and reads the news articles. The news articles are presented in a continuous flow (one after the other) without end-user interaction
Source: http://www.mediabricks.com
HP 22
PRISM
Publishing Requirements for Industry Standard Metadata
Version 1.0, April 2001
Authors: IDEAlliance (Adobe, Vignette, Kinecta, et al.)
Idea: "a standard for interoperable content description, interchange and reuse in both traditional and electronic publishing contexts"
Web site: http://www.prismstandard.org
HP 23
PRISM Design
Built on existing standards like Dublin Core (DC), RDF, XML
Designed to be used in a simple, straightforward way over the Internet
Compatible with NewsML
Integrates easily with ICE (for syndication)
Vocabulary
Basic: DC
Extensions: "Controlled Vocabularies", e.g. "North American Industrial Classification System" (NAICS)
HP 24
PRISM Example
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845" />
    <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>
(Source: PRISM spec v 1, http://www.prismstandard.org/techdev/prismspec1.asp)
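To show how an application could consume a description like the PRISM example, here is a minimal Python sketch that reads the Dublin Core fields with the standard library. The function name and the `SAMPLE` record are hypothetical, and the URL punctuation in the sample is restored by assumption because the transcript lost it; only the RDF and Dublin Core namespace URIs are the standard ones.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"

# Trimmed-down record modeled on the slide's example (assumed punctuation).
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>"""


def dublin_core_fields(xml_text):
    """Map each described resource URI to its Dublin Core fields."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    records = {}
    for desc in root.findall(f"{{{RDF}}}Description"):
        about = desc.get(f"{{{RDF}}}about")
        fields = {}
        for child in desc:
            # ElementTree spells qualified names as "{namespace}local".
            if child.tag.startswith(f"{{{DC}}}"):
                name = child.tag[len(DC) + 2:]
                fields.setdefault(name, []).append((child.text or "").strip())
        records[about] = fields
    return records
```

Each field maps to a list because Dublin Core elements may repeat (for example several `dc:contributor` entries).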
HP 25
VoiceXML
A language for specifying voice dialogs
Voice dialogs use audio prompts and text-to-speech (TTS) for output; touch-tone keys (DTMF) and automatic speech recognition (ASR) for input
Goal is to bring the advantages of web-based development and content delivery to interactive voice response applications
High-level, voice-specific language simplifies application development
Source: http://www.voicexml.org
HP 26
Voice Based Internet Applications
Source httpwwwvoicexmlorg
HP 27
Voice XML Metadata
Voice Specific metadata
Supports syntactic interoperability
Text data to voice data
Voice XML = XML + Voice Metadata
HP 28
VoiceXML - Possible Services
Information retrieval - News, sports, traffic, stock quotes
e-Transactions (e-commerce, e-tailing, etc.)
Financial: banking, stock trading
Catalog browsing (generally as an adjunct to paper)
Telephone services
Personal voice dialing, one-number "find-me" services
Intranet - Inventory, HR services, corporate portals
Unification - "My Whatever" personal portals, personal agents, unified messaging
Source: http://www.voicexml.org
HP 29
MPEG7
A set of description schemes and descriptors to describe the content of multimedia data
Provides a language to specify description schemes
A scheme for coding the description
HP 30
Application Examples for MPEG7
A few application examples are
Digital libraries (image catalog musical dictionary)
Multimedia directory services (eg yellow pages)
Broadcast media selection (radio channel TV channel)
HP 31
Information and Content Exchange (ICE)
Main Goal: efficient and extensible Content Syndication protocol for the Internet using XML syntax
Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.
Status: latest spec version 1.1, May 2000, submitted to W3C for review
Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, ...
Web Site: http://www.icestandard.org
HP 32
What is the ICE Protocol
Syndication Protocol for communication between Syndicators and Subscribers
Metadata to define
roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver
format and method of content exchange (e.g. sequenced packages, pull vs. push model)
HP 33
ICE Applications
ICE vocabulary + domain vocabulary = complete application
ICE
establishes and manages the syndication
delivers data
logs events
=> content-independent metadata
Industry-specific vocabulary defines the content => domain-specific metadata
Source: http://www.icestandard.org
HP 34
ICE Explained
ICE: Information and Content Exchange protocol
Syndicator: A content aggregator and distributor
Subscriber: A content consumer
Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement
Collection: The current content of a subscription
ICE Package: A delivery of commands to update a collection, such as the addition of content items
ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information
Sources: InternetWeek; ICE Cookbook version 1.0; httpwwwinternetweekcomebizapps01ebiz050701-3htm
<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe"
    timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321"
          subscription-element="4321" name="Cartoon" filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru" copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC ... (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>
Content (domain-specific metadata)
HP 36
XCM (eXtended Content Management)
a framework that allows customers to classify content management offerings according to the business problems they address The segments of XCM are
Content Development - Developing static content and managing the process of its subsequent approval versioning storage and retrieval
Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle
Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability
Source httpwwwvignettecomCDASite020971-1-30-1458-1146-174300html
HP 37
XCM
eXtended Content Management
Content DevelopmentManagement
Content DeliveryApplication ContentManagement
Content AuthoringDigital Asset Management
Software ConfigurationManagement
Document ProcessManagement
Metadata ManagementRecombinationPersonalization
Edge Network Delivery
Streaming Media DeliveryCaching
Source httpwwwvignettecom
HP 38
Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain
FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
...
UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Adress Id: http://gisdasc.kgs.ukans.edu/dasc
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
...
Kansas State
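One simple way to bridge two models like these is a tag crosswalk. The sketch below is illustrative only: the pairing of FGDC and UDK tags is inferred from the fact that the two columns on the slide carry the same values, and the `translate` function is a hypothetical helper, not part of either standard.

```python
# Tag pairs inferred from the slide's side-by-side example (an assumption).
FGDC_TO_UDK = {
    "Theme keywords": "Search terms",
    "Title": "Topic",
    "Online linkage": "Adress Id",
    "Direct Spatial Reference Method": "Measuring Techniques",
    "Horizontal Coordinate System Definition": "Co-ordinate System",
}


def translate(record, mapping):
    """Rename the tags of a metadata record using a crosswalk.

    Returns (translated, unmapped) so that tags with no counterpart
    are reported rather than silently dropped.
    """
    translated, unmapped = {}, {}
    for tag, value in record.items():
        if tag in mapping:
            translated[mapping[tag]] = value
        else:
            unmapped[tag] = value
    return translated, unmapped
```

A pure renaming like this only resolves naming differences; it does not address cases where the two models disagree on what a field means, which is where the ontology-based approaches discussed elsewhere in the deck come in.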
HP 39
Different views of Metadata
Domain Independent Specifications (RDF)
FrameworksInfrastructures (XCM)
MetadataApplication Specific
ICE
Media Specific
MPEG7 VoiceXML
Domain Specific
NewsML FGDCUDK
HP 40
Creating and Serving Metadata to Power the Life-cycle of Content
Taalee Infrastructure Services Taalee Content Applications
Where is the content
Whose is it
ProduceAggregate
CatalogIndex
What other content is it related to
Integrate Syndicate
What is the right content for this
user
Personalize
What is the best way to
monetize this interaction
Interactive Marketing
BroadcastWirelineWirelessInteractive TV
Taalee Semantic MetaBase
HP 41
Taalee's Intelligent Content Process
HP 42
Metadata Creation and Semanticization
• Automatic Content Classification/Categorization
• Metadata Creation/Extraction
Types of metadata created
Semantic Engine and WorldModel are trademarks of Taalee Inc
Metadata Extraction is a patented technology of Taalee Inc
HP 43
Forms/Types/Ingest of Content
Sources: Web Sites, Content Feeds, and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)
HP 44
Content Handling/Ingest
Infrastructure/Exchange
Feed Handlers
Crawlers
Screen Scrapers
Bots
Software Agents
Centralized, Distributed, Mobile/Migratory
HP 45
Information Extraction for Metadata Creation
GlobalEnterpriseWeb Repositories
METADATA EXTRACTORS
Digital Maps
NexisUPIAP
Documents
Digital Audios
Data Stores
Digital Videos
Digital Images
HP 46
Extracting a Text Document: Syntactic approach
INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT
NATIONAL PREPAREDNESS LEVEL II
CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.
SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr. The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35 contained, while protection of the historic cabin continues.
CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.
LAYOUT
Date => day month int ',' int
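The layout rule above (day name, month name, integer, comma, integer) is the kind of pattern a syntactic extractor matches directly. A minimal sketch with a regular expression follows; the names are hypothetical, and the commas are made optional as an assumption, since the report text in this transcript has lost its punctuation.

```python
import re
from datetime import date

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

DAY_ALT = "|".join(DAYS)
MONTH_ALT = "|".join(MONTHS)

# Date => day month int ',' int  (commas optional by assumption)
DATE_RE = re.compile(
    rf"\b(?P<day>{DAY_ALT}),?\s+(?P<month>{MONTH_ALT})\s+"
    rf"(?P<dom>\d{{1,2}}),?\s+(?P<year>\d{{4}})\b"
)


def extract_date(text):
    """Return the first date matching the layout rule, or None if absent."""
    match = DATE_RE.search(text)
    if match is None:
        return None
    month_number = MONTHS.index(match["month"]) + 1
    return date(int(match["year"]), month_number, int(match["dom"]))
```

The extracted value becomes content-independent metadata (the report date) without any understanding of what the report says, which is the limitation the later semantic slides address.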
Traditional TextCategorization
StatisticalAI Techniques
Classify Place ina taxonomy
feed
Customer Training
Set
RoutingDistribution
Customer Article Feed
4715
Standard Metadata
Feed Source iSyndicate
Posted Date 11202000
Classification of Article 4715
Knowledge-base amp StatisticalAI Techniques
ClassifyPlace ina taxonomy
MetadataCatalog
Content Manager
Precise syndicationfiltering
fd
Article 4715 Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News
Standard metadata
Semantic metadata
FTECompany AnalysisConference Calls
EarningsStock Analysis
NYSEMember Companies
Market NewsIPOs
Automated Content Enrichment (ACE)
Taalee Enterprise Customization Suite
Taalee's Categorization & Automatic Metadata Creation
Taalee Training
Set
Customer Training
Set ee ENTCompany AnalysisConference Calls
EarningsStock Analysis
Classification of Article 4715
Article Feed4715 RoutingDistribution
Map to another taxonomy
HP 49
Automatic Categorization & Metadata Tagging (unstructured text: transcript of A/V)
ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.
Video Segmentwith Associated Text
Segment Description
SemanticMetadata
AutoCategorization
HP 50
Automatic Categorization & Metadata Tagging (Web page)
Video with Editorialized Text on the Web
Auto Categorization
Semantic Metadata
HP 51
Automatic Categorization & Metadata Tagging (Feed)
Text from Bloomberg
Auto Categorization
Semantic Metadata
HP 52
Taalee Extraction and Knowledgebase Enhancement
Extraction Agent
Web Page Enhanced Metadata Asset
HP 53
Basis for Semantics
A. Facts/Concepts/Terms/Entities
Dictionary, Thesaurus, Reference Data, Vocabulary
B. Facts with Relationships
Taxonomy (Categories), Ontology, Domain Modeling (e.g. Golf = golfer, tournament name, golf course, event)
Knowledge Base
HP 54
Basis for Semantics
C. Reasoning/Inference
(Statistical) (Information Retrieval)
Statistical, Learning/AI (Bayesian, Neural Networks, HMM, ...)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methodsCluster Analysis
LearningAI and Collab Filtering
Reference dataConcept-termsDictionaryThesaurusBy topicindustrysubjectdomain
Word or Phrase
OntologiesDomain Models
KnowledgeBaseBy Entities and Relationships
deeperunderstanding
HP 56
Open Directory Project (ODP): Classification/Taxonomy & Directory
HP 57
Ontology
Standardize meaning, description, representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes
Attributes
Domain Rules
Functional Dependencies
HP 59
An Ontology
Example Interrelated ontologies
LANDUSE
COMERCIAL
INDUSTRIALRURAL
RESIDENTIAL
AGRICULTURAL
MILITARYRECREATIONAL
LAND(SITE)
CULTIVATEDAREA
GREENLANDAREA LAND
BANK
ZONING
LANDFILLSITE
WASTEDISPOSAL
RECYCLING
HAZARDOUS
LANDFILLRESOURCE REC
SOLID SEWAGE
shredding
magneticseparation
screening
washing
NATURALDISASTER
EARTHQUAKE
causes
LANDSLIDE
VOLCANO
STORMFLOOD
FIRE
AVALANCHE
TSUNAMI
causes
causes
causes
HP 61
Large Vocabularies, Taxonomies, Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19000 main headings, 110000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300000 terms
Metadata-enabled Applications
HP 63
Metadata Usage: Impact on Search & Query processing
traditional queries based on keywords
attribute-based queries
content-based queries
HP 64
Oingo.com
Oingo Ontology - ODP based() - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning, used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely-related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing
• Context of the Content
• Semantic Metadata describing entities (i.e. Company, Industry, etc.) and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
"Normal" Content can only be "found" if the user enters a keyword that exists within it
Intelligent Content=+
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create
Context of the Content
Semantic Metadata describing entities (i.e. Company, Industry, etc.) and
Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g. Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, ...
HP 70
Metadata Usage Keyword Attribute and Content Based Access
The VisualHarness system at LSDISUGA
HP 71
Keyword Search vs Attribute Search with Semantic metadata
Virage Search on football touchdown
Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL httpcbssportsline
Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL httpcbssportsline
Metadata from Typical Cataloging of Football Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31, Pit 24
http://www.nfl.com
Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder to extend the Ravens' lead to 31-24 in the third quarter
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
HP 72
Taalee's Semantic Search
Highly customizable, precise and freshest A/V search
Context and Domain Specific Attributes; Uniform Metadata for Content from Multiple Sources; Can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY amp CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content
Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries amp Hot Topics
PERSONALIZATION
3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
4 Pink Floyd Collection
Personalized Queries
Set the Controls for the Heart of the Sun...
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
2 My Football Fantasy Team
Gators Spurrier ready for big game
Techs Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone...
Edwards explains reasons for leaving BYU
more...
1 Election 2000
2 Middle East Peace Conflict
3 Napster Controversy
Video Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Coles security
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II
more...
HOT Topics
HP 80
Metadata Targeting
SemanticInteractive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
Realtime Feeds
Interests Preferences
Time-ShiftedContent Aggregator
Web sites and Pages
ContentDatabases Personalized
Content
Semantic EngineTM
Personalized Content
Content
Structured Hi-Quality
Semantic Metabase
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site
User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and the knowledge of user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions listen set alert on story add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g. Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst
HP 87
iTV: Taalee's Extreme Personalization
Content Provider
(DBS DISH Wink AOL-TV)
Semantic EngineTM
Meta-DataTagged Content
ContentldquoProgramsrdquo
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-QualitySemantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in
This screen is customizable with interactivity features using metadata, such as whether there is a new Conference Call video on CSCO
Part of the screen can be automatically customized to show conference call specific information - including transcript, participation, etc., all of which are relevant metadata
Conference Call itself can have embedded metadata to support personalization and interactivity
HP 89
Metadata in Enterprise Apps
Filter Search ConsolidatePersonalize ArchiveLicensing Syndication
Production Support
Sony
Categorize
Catalog
Integrate
Collection Processing
NetworkContent
AffiliateFeeds
Public Sources Rich Data
Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --Gore Demands That Recount Restart
Gore Says Fla Cant Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(133) - 120600 - ABC
(253) - 120600 - CBS
(516) - 120600 - ABC
(246) - 120600 - FOX
(133) - 120600 - NBC
(533) - 120600
(357) - 120600 - CBS
(427) - 120600 - ABC
(344) - 120600 - FOX
(724) - 120600 - CBS
(133) - 120600 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit and one similar to it from Martin County demonstrates Democratic Party politics at its most desperate Gore is not a party to either of those lawsuits On Tuesday the judge in the
(133) - 120600 - ABC
(233) - 120600 - CBS
(312) - 120600 - NNS
(032) - 120600 - CBS
(133) - 120600 - CBS
Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore
HP 92
Retrieve Scene Description Track
Enhanced Digital Cable
Video
MPEGDecoder
Node = AVO Object
Create Scene Description Tree
GREATUSER
EXPERIENCE
Metadata's role in emerging iTV infrastructure
MPEG-247MPEG
Encoder
SceneDescriptionTree
License metadata decoder and semantic applications to
device makers
Channel salesthrough Video Server Vendors
Video App Servers and Broadcasters
Enhanced XML
Description
ldquoCisco Systemsrdquo
Node
TaaleeSemanticEngine
ldquoCisco Systemsrdquo
Produced by Fox Sports Creation Date 12052000 League NFLTeams Seattle Seahawks
Atlanta Falcons Players John KitnaCoaches Mike Holmgren
Dan ReevesLocation Atlanta
Object Content Information (OCI)
Metadata-richValue-added Node
HP 93
Intelligent Metadata Creation
Content which does contain the words the user asked for
Extractor Agents
Content which does not contain the words
the user asked for but is about what he asked
for
Value-added Metadata
Content the user did not think to ask for but
which he needs to know
Semantic Associations
+ +
Metadata for Intelligent Content
Usage
HP 94
Intelligent Contentvia
Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example: If a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"
Understanding of the content is needed to create new metadata
Taalee understands Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees
Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
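The idea above can be illustrated with a toy knowledge base and an enrichment step. This is a sketch under stated assumptions, not Taalee's patented extraction technology: the dictionary layout, the function names, and plain substring matching for entity recognition are all hypothetical simplifications.

```python
# Toy knowledge base: entity -> related facts (semantic associations).
KNOWLEDGE_BASE = {
    "Roger Clemens": {"type": "PERSON", "sport": "Baseball", "team": "New York Yankees"},
}


def enrich(text, kb):
    """Build value-added metadata for every known entity mentioned in text."""
    metadata = {}
    for entity, facts in kb.items():
        if entity not in text:
            continue
        metadata.setdefault("entities", []).append(entity)
        for relation, value in facts.items():
            values = metadata.setdefault(relation, [])
            if value not in values:
                values.append(value)
    return metadata


def matches(query, text, metadata):
    """Keyword match on the text itself, or on any attached metadata value."""
    if query in text:
        return True
    return any(query in value for values in metadata.values() for value in values)
```

With this, a story that mentions only "Roger Clemens" is attached the team "New York Yankees", so a search for "Yankees" matches through the metadata even though the word never appears in the story text.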
HP 96
Guided Demo for Value Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10: Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL
Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10: Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player of the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata – Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League
Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 – Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 – Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content – Value-Added Metadata
Rich Media Sports Asset
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships
Producer Name: name of the content provider that produced the asset
Posted Date: date of asset posting – extracted automatically
Player Names: names of players mentioned explicitly in the asset – extracted automatically
Team Name: name of the team for which the player plays – not mentioned explicitly in the asset – value-added using Taalee's semantic relationships
League Name: name of the league to which the player's team belongs – not mentioned explicitly in the asset – value-added by Taalee's processing based on semantic associations
Sport: name of the sport
Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset
The asset is richly and fully described in the many ways the users choose to interact
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content.
• They do not understand the meaning, context, or relationships of keywords.
For example, a search engine may see that the phrase "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
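A minimal Python sketch of how a COMPETES WITH association could widen a search result. The triple store, the story records, and the function names below are hypothetical and chosen only for illustration; the slide does not describe how Taalee stores or queries its associations.

```python
# Hypothetical (subject, relationship, object) triples taken from the slide's example.
ASSOCIATIONS = [
    ("Commerce One", "participates_in", "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]

# Hypothetical catalog entries, each already tagged with company metadata.
STORIES = [
    {"title": "Story on Commerce One", "companies": ["Commerce One"]},
    {"title": "Story on Ariba", "companies": ["Ariba"]},
    {"title": "Story on an unrelated company", "companies": ["Example Corp"]},
]


def related(entity, relationship, triples=ASSOCIATIONS):
    """Return every object linked to the entity by the given relationship."""
    return [obj for subj, rel, obj in triples if subj == entity and rel == relationship]


def search_with_associations(company, stories=STORIES):
    """Return what was asked for plus news on the company's competitors."""
    asked_for = [s["title"] for s in stories if company in s["companies"]]
    competitors = related(company, "competes_with")
    competitor_news = [
        s["title"] for s in stories if any(c in s["companies"] for c in competitors)
    ]
    return {"asked_for": asked_for, "competitor_news": competitor_news}


results = search_with_associations("Commerce One")
```

A keyword index alone would return only the first list; the second list exists because the relationship is modeled explicitly.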
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Internal Source 1 (Research)
Internal Source 2
External feeds/Web (e.g., Reuters)
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults KnowledgeBase for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story – passed on to Dashboard
Story on Lucent
Story on Cisco
XCM-compliant metadata, XML or other format
Semantic Application
ASP/Enterprise hosted
Extractor Agent 1
Extractor Agent 2
Extractor Agent 3
Metadata-centric Content Management Architecture
Semantic Engine
World Model
Taalee Metabase
Third-party Content Mgmt and Syndication
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
SEC, EPA Regulations: Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for the future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF – one proposal (cf. Ora Lassila)
Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]
A personal definition: Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF+XML
Agent-readable tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-layered Architecture of Semantic Web
Logical Layer
Formal Semantics and Reasoning Support – OIL, DAML-O
Schema Layer
Definition of Vocabulary: RDF Schema
Data Layer
Simple data model and syntax for metadata - RDF
OIL – as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL – Evolving towards Semantic Web
OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Tests May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake <=
  dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
  AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
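The rule above can be read as a predicate over a (nuclear test, earthquake) pair. The Python sketch below is one possible reading, not the InfoQuilt implementation: it assumes the date difference is measured in days and must be non-negative (the earthquake follows the test), that distance is great-circle distance in miles, and that events are plain dictionaries. The sample events are invented for illustration.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, used for the haversine formula


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two points, in miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def could_have_caused(test, quake, max_days=30, max_miles=10000):
    """Apply the slide's rule: quake within max_days after the test and within max_miles."""
    days_after = (quake["date"] - test["date"]).days
    near = distance_miles(test["lat"], test["lon"], quake["lat"], quake["lon"]) < max_miles
    return 0 <= days_after < max_days and near


# Invented sample events (not real data).
test_event = {"date": date(2000, 1, 1), "lat": 0.0, "lon": 0.0}
quake_soon = {"date": date(2000, 1, 20), "lat": 0.0, "lon": 90.0}
quake_late = {"date": date(2000, 3, 1), "lat": 0.0, "lon": 90.0}

soon_matches = could_have_caused(test_event, quake_soon)
late_matches = could_have_caused(test_event, quake_late)
```

The thresholds are parameters because the slide's values (30 and 10000) are part of the hypothesis being tested, not fixed constants.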
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale, per year, starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example…
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year, starting from 1900, find the average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
Knowledge Discovery - Example…
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 21
Example of the end-to-end flow - NewsML
The content provider supplies NewsML-packaged media content to the operator. The content is categorized as current events, finance, sport, etc., and updated hourly
The operator receives NewsML data from the content provider. The content server automatically pushes updated news articles to all news service subscribers
Consumers sign up for the news service directly on the device. When using the news service, the user browses through the categories and reads the news articles. The news articles are presented in a continuous flow (one after the other) without end-user interaction
Source: http://www.mediabricks.com
HP 22
PRISM
Publishing Requirements for Industry Standard Metadata
Version 1.0, April 2001
Authors: IDEAlliance (Adobe, Vignette, Kinecta, et al.)
Idea: "a standard for interoperable content description, interchange, and reuse in both traditional and electronic publishing contexts"
Web site: http://www.prismstandard.org
HP 23
PRISM Design
Built on existing standards like Dublin Core (DC), RDF, XML
Designed to be used in a simple, straightforward way over the Internet
Compatible with NewsML
Integrates easily with ICE (for syndication)
Vocabulary:
Basic: DC
Extensions: "Controlled Vocabularies", e.g., "North American Industrial Classification System" (NAICS)
HP 24
PRISM Example
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845" />
    <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>
(Source: PRISM spec v. 1, http://www.prismstandard.org/techdev/prismspec1.asp)
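To show how an application might consume a record like this, here is a small Python sketch that reads the Dublin Core elements with the standard library's xml.etree.ElementTree. The record is a reconstruction of the slide's example (punctuation in the namespace URIs is assumed), and the function name dublin_core_fields is hypothetical.

```python
import xml.etree.ElementTree as ET

# Reconstruction of the slide's PRISM/Dublin Core record (namespace URIs assumed).
PRISM_RECORD = """<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845" />
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>"""

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def dublin_core_fields(xml_text):
    """Collect the resource URI and every dc:* element that carries text."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    description = root.find("rdf:Description", NS)
    fields = {"about": description.get("{%s}about" % NS["rdf"])}
    for child in description:
        # ElementTree tags look like "{namespace}localname".
        namespace, _, local_name = child.tag[1:].partition("}")
        if namespace == NS["dc"] and child.text:
            fields[local_name] = child.text.strip()
    return fields


fields = dublin_core_fields(PRISM_RECORD)
```

Elements that carry their value in an attribute, such as dc:identifier here, are skipped by this sketch; a fuller reader would also inspect rdf:resource.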
HP 25
VoiceXML
A language for specifying voice dialogs
Voice dialogs use audio prompts and text-to-speech (TTS) for output; touch-tone keys (DTMF) and automatic speech recognition (ASR) for input
Goal is to bring the advantages of web-based development and content delivery to interactive voice response applications
High-level, voice-specific language simplifies application development
Source: http://www.voicexml.org
HP 26
Voice Based Internet Applications
Source httpwwwvoicexmlorg
HP 27
Voice XML Metadata
Voice Specific metadata
Supports syntactic interoperability
Text data to voice data
Voice XML = XML + Voice Metadata
HP 28
VoiceXML – Possible Services
Information retrieval – News, sports, traffic, stock quotes
e-Transactions (e-commerce, e-tailing, etc.)
Financial: banking, stock trading
Catalog browsing (generally as an adjunct to paper)
Telephone services
Personal voice dialing, one-number "find-me" services
Intranet – Inventory, HR services, corporate portals
Unification – "My Whatever" personal portals, personal agents, unified messaging
Source: http://www.voicexml.org
HP 29
MPEG7
A set of description schemes and descriptors to describe the content of multimedia data
Provides a language to specify description schemes
Provides a scheme for coding the description
HP 30
Application Examples for MPEG7
A few application examples are
Digital libraries (image catalog musical dictionary)
Multimedia directory services (eg yellow pages)
Broadcast media selection (radio channel TV channel)
HP 31
Information and Content Exchange (ICE)
Main Goal: efficient and extensible Content Syndication protocol for the Internet, using XML syntax
Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.
Status: latest spec version 1.1, May 2000; submitted to W3C for review
Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, …
Web Site: http://www.icestandard.org
HP 32
What is the ICE Protocol
Syndication Protocol for communication between Syndicators and Subscribers
Metadata to define:
roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver
format and method of content exchange (e.g., sequenced packages, pull vs. push model)
HP 33
ICE Applications
ICE vocabulary + domain vocabulary = complete application
ICE: establishes and manages the syndication, delivers data, logs events => content-independent metadata
Industry-specific vocabulary: defines the content => domain-specific metadata
Source: http://www.icestandard.org
HP 34
ICE Explained
ICE: Information and Content Exchange protocol
Syndicator: A content aggregator and distributor
Subscriber: A content consumer
Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement
Collection: The current content of a subscription
ICE Package: A delivery of commands to update a collection, such as the addition of content items
ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information
Sources: InternetWeek; ICE Cookbook version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm
<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe"
    timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321"
          subscription-element="4321" name="Cartoon" filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru" copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC … (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>
Content (domain-specific metadata)
HP 36
XCM (eXtended Content Management)
A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:
Content Development - Developing static content and managing the process of its subsequent approval, versioning, storage, and retrieval
Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle
Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability
Source: http://www.vignette.com/CDA/Site/0,2097,1-1-30-1458-1146-1743,00.html
HP 37
XCM
eXtended Content Management
Content DevelopmentManagement
Content DeliveryApplication ContentManagement
Content AuthoringDigital Asset Management
Software ConfigurationManagement
Document ProcessManagement
Metadata ManagementRecombinationPersonalization
Edge Network Delivery
Streaming Media DeliveryCaching
Source httpwwwvignettecom
HP 38
Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain
FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
…
UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Address Id: http://gisdasc.kgs.ukans.edu/dasc
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
…
Kansas State
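One common way to reconcile two metadata models like these is a per-model mapping onto shared field names. The Python sketch below is an assumption-laden illustration: the common field names (keywords, title, url, and so on) and the function normalize are hypothetical, and only the five tag pairs shown on the slide are covered.

```python
# Hypothetical mapping from each model's tag names to shared field names.
FIELD_MAPPINGS = {
    "FGDC": {
        "Theme keywords": "keywords",
        "Title": "title",
        "Online linkage": "url",
        "Direct Spatial Reference Method": "spatial_representation",
        "Horizontal Coordinate System Definition": "coordinate_system",
    },
    "UDK": {
        "Search terms": "keywords",
        "Topic": "title",
        "Address Id": "url",
        "Measuring Techniques": "spatial_representation",
        "Co-ordinate System": "coordinate_system",
    },
}


def normalize(record, model, mappings=FIELD_MAPPINGS):
    """Rename a record's tags to the shared field names, dropping unmapped tags."""
    mapping = mappings[model]
    return {mapping[tag]: value for tag, value in record.items() if tag in mapping}


fgdc_record = {
    "Theme keywords": "digital line graph, hydrography, transportation",
    "Title": "Dakota Aquifer",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}
udk_record = {
    "Search terms": "digital line graph, hydrography, transportation",
    "Topic": "Dakota Aquifer",
    "Measuring Techniques": "Vector",
    "Co-ordinate System": "Universal Transverse Mercator",
}

same_dataset = normalize(fgdc_record, "FGDC") == normalize(udk_record, "UDK")
```

A plain rename only handles differences in tag names; differences in meaning or value encoding between models would need an ontology or conversion rules on top.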
HP 39
Different views of Metadata
Domain Independent Specifications (RDF)
FrameworksInfrastructures (XCM)
Metadata
Application Specific
ICE
Media Specific
MPEG7 VoiceXML
Domain Specific
NewsML FGDCUDK
HP 40
Creating and Serving Metadata to Power the Life-cycle of Content
Taalee Infrastructure Services; Taalee Content Applications
Where is the content?
Whose is it?
Produce/Aggregate
Catalog/Index
What other content is it related to?
Integrate/Syndicate
What is the right content for this user?
Personalize
What is the best way to monetize this interaction?
Interactive Marketing
Broadcast, Wireline, Wireless, Interactive TV
Taalee Semantic MetaBase
HP 41
Taalee's Intelligent Content Process
HP 42
Metadata Creation and Semanticization
• Automatic Content Classification/Categorization
• Metadata Creation/Extraction: types of metadata created
HP 43
Forms/Types/Ingest of Content
Sources: Web Sites, Content Feeds, and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)
HP 44
Content Handling/Ingest
Infrastructure/Exchange, Feed Handlers, Crawlers, Screen Scrapers, Bots, Software Agents
Centralized, Distributed, Mobile/Migratory
HP 45
Information Extraction for Metadata Creation
Global/Enterprise/Web Repositories
METADATA EXTRACTORS
Digital Maps
NexisUPIAP
Documents
Digital Audios
Data Stores
Digital Videos
Digital Images
HP 46
Extracting a Text Document: Syntactic approach
INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT
NATIONAL PREPAREDNESS LEVEL II
CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.
SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr. The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.
CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.
LAYOUT
Date => day month int ',' int
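The layout rule Date => day month int ',' int can be approximated with a regular expression. The Python sketch below is one hedged reading of that rule, not the extractor used in the talk: it treats "day" as a weekday name and "month" as a month name, makes the commas optional, and uses a hypothetical function name extract_date.

```python
import re

DAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"
MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)

# day month int ',' int  ->  weekday, month name, day of month, four-digit year
DATE_PATTERN = re.compile(
    rf"(?P<day>{DAYS}),?\s+(?P<month>{MONTHS})\s+"
    rf"(?P<day_of_month>\d{{1,2}}),?\s+(?P<year>\d{{4}})"
)


def extract_date(text):
    """Return the first date matching the layout rule as a dict, or None."""
    match = DATE_PATTERN.search(text)
    return match.groupdict() if match else None


header = "INCIDENT MANAGEMENT SITUATION REPORT Friday, August 1, 1997 - 0530 MDT"
report_date = extract_date(header)
```

A purely syntactic rule like this finds the date's form but knows nothing about what the date refers to, which is the limitation the following slides address.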
Traditional Text Categorization
Statistical/AI Techniques
Classify: place in a taxonomy
Customer Training Set
Routing/Distribution
Customer Article Feed: Article 4715
Standard Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Classification of Article 4715
Knowledge-base & Statistical/AI Techniques
Classify: place in a taxonomy
Metadata Catalog
Content Manager
Precise syndication/filtering
Article 4715 Metadata
Standard metadata: Feed Source: iSyndicate; Posted Date: 11/20/2000
Semantic metadata: Company Name: France Telecom, Equant; Ticker Symbol: FTE, ENT; Exchange: NYSE; Topic: Company News
FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
NYSE: Member Companies, Market News, IPOs
Automated Content Enrichment (ACE)
Taalee Enterprise Customization Suite
Taalee's Categorization & Automatic Metadata Creation
Taalee Training Set
Customer Training Set
ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
Classification of Article 4715
Article Feed: Article 4715
Routing/Distribution
Map to another taxonomy
HP 49
Automatic Categorization & Metadata Tagging (unstructured text: transcript of A/V)
ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST-PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.
Video Segment with Associated Text
Segment Description
Semantic Metadata
Auto Categorization
HP 50
Automatic Categorization & Metadata Tagging (Web page)
Video with Editorialized Text on the Web
Auto Categorization
Semantic Metadata
HP 51
Automatic Categorization & Metadata Tagging (Feed)
Text from Bloomberg
Auto Categorization
Semantic Metadata
HP 52
Taalee Extraction and Knowledgebase Enhancement
Extraction Agent
Web Page Enhanced Metadata Asset
HP 53
Basis for Semantics
A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary
B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)
Knowledge Base
HP 54
Basis for Semantics
C. Reasoning/Inference
(Statistical) (Information Retrieval)
Statistical Learning/AI (Bayesian, Neural Networks, HMM, …)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methods: Cluster Analysis
Learning/AI and Collab. Filtering
Reference data/Concept-terms: Dictionary/Thesaurus, by topic/industry/subject/domain
Word or Phrase
Ontologies/Domain Models
KnowledgeBase: by Entities and Relationships
deeper understanding
HP 56
Open Directory Project (ODP) Classification: Taxonomy & Directory
HP 57
Ontology
Standardize meaning, description, representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes: Attributes, Domain Rules, Functional Dependencies
HP 59
An Ontology
Example Interrelated ontologies
LANDUSE
COMMERCIAL
INDUSTRIALRURAL
RESIDENTIAL
AGRICULTURAL
MILITARYRECREATIONAL
LAND(SITE)
CULTIVATEDAREA
GREENLANDAREA LAND
BANK
ZONING
LANDFILLSITE
WASTEDISPOSAL
RECYCLING
HAZARDOUS
LANDFILLRESOURCE REC
SOLID SEWAGE
shredding
magneticseparation
screening
washing
NATURALDISASTER
EARTHQUAKE
causes
LANDSLIDE
VOLCANO
STORMFLOOD
FIRE
AVALANCHE
TSUNAMI
causes
causes
causes
HP 61
Large Vocabularies, Taxonomies, Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms
Confidential HP
Metadata-enabled Applications
HP 63
Metadata Usage: Impact on Search & Query processing
traditional queries based on keywords
attribute-based queries
content-based queries
HP 64
Oingo.com
Oingo Ontology – ODP based (?): the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning, used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged, and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:
• Context of the Content
• Semantic Metadata describing entities (i.e., Company, Industry, etc.), and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
"Normal" Content can only be "found" if the user enters a keyword that exists within it
Intelligent Content = +
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:
Context of the Content
Semantic Metadata describing entities (i.e., Company, Industry, etc.), and
Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …
HP 70
Metadata Usage Keyword Attribute and Content Based Access
The VisualHarness system at LSDISUGA
HP 71
Keyword Search vs. Attribute Search with Semantic Metadata
Virage Search on "football touchdown"
Jimmy Smith Interview, Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline
Brian Griese Interview, Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline
Metadata from Typical Cataloging of Football Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31, Pit 24
http://www.nfl.com
Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Qadry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
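The difference between the two search styles on this slide can be sketched in Python as follows. This is an illustrative toy, not the Virage or Taalee search engine: the asset records, the metadata attached to the interview asset, and the function names are hypothetical.

```python
# Toy catalog; the interview's metadata values are invented for illustration.
ASSETS = [
    {
        "title": "Baltimore 31, Pit 24",
        "description": "Qadry Ismail and Tony Banks hook up for their third long touchdown",
        "metadata": {
            "league": "Professional",
            "teams": ["Ravens", "Steelers"],
            "players": ["Qadry Ismail", "Tony Banks"],
            "event": "Touchdown",
        },
    },
    {
        "title": "Brian Griese Interview, Part Four",
        "description": "Brian Griese talks about the first touchdown he ever threw",
        "metadata": {"players": ["Brian Griese"], "event": "Interview"},
    },
]


def keyword_search(assets, keyword):
    """Match any asset whose title or description contains the keyword."""
    keyword = keyword.lower()
    return [
        a["title"]
        for a in assets
        if keyword in (a["title"] + " " + a["description"]).lower()
    ]


def attribute_search(assets, **criteria):
    """Match only assets whose metadata satisfies every field=value criterion."""
    def matches(metadata):
        for field, wanted in criteria.items():
            value = metadata.get(field)
            values = value if isinstance(value, list) else [value]
            if wanted not in values:
                return False
        return True

    return [a["title"] for a in assets if matches(a["metadata"])]


by_keyword = keyword_search(ASSETS, "touchdown")
by_attribute = attribute_search(ASSETS, event="Touchdown", teams="Ravens")
```

The keyword query also returns the interview, because the word "touchdown" appears in its description; the attribute query returns only the asset whose event actually is a touchdown.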
HP 72
Taalee's Semantic Search
Highly customizable, precise, and freshest A/V search
Context and Domain Specific Attributes; Uniform Metadata for Content from Multiple Sources; can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do?
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY amp CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content
Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries amp Hot Topics
PERSONALIZATION
3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
4 Pink Floyd Collection
Personalized Queries
Set the Controls for the Heart of the Sun…
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
2 My Football Fantasy Team
Gators' Spurrier ready for big game
Tech's Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone…
Edwards explains reasons for leaving BYU
more…
more…
more…
more…
1 Election 2000
2 Middle East Peace Conflict
3 Napster Controversy
Video Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Cole's security
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II
more…
HOT Topics
more…
more…
HP 80
Metadata Targeting
SemanticInteractive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
Realtime Feeds
Interests Preferences
Time-ShiftedContent Aggregator
Web sites and Pages
ContentDatabases Personalized
Content
Semantic EngineTM
Personalized Content
Content
Structured Hi-Quality
Semantic Metabase
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site
User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic, personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available.
The user clicks to see a listing of Analyst Calls on Cisco (next slide).
Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and the analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst.
HP 87
iTV: Taalee's Extreme Personalization
Content Provider
(DBS, DISH, Wink, AOL-TV)
Semantic Engine™
Meta-Data Tagged Content
Content "Programs"
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-Quality Semantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.
This screen is customizable with an interactivity feature, using metadata such as whether there is a new Conference Call video on CSCO.
Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata.
The Conference Call itself can have embedded metadata to support personalization and interactivity.
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support
Sony
Categorize
Catalog
Integrate
Collection, Processing
Network Content
Affiliate Feeds
Public Sources Rich Data
Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla Cant Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore
HP 92
Retrieve Scene Description Track
Enhanced Digital Cable
Video
MPEGDecoder
Node = AVO Object
Create Scene Description Tree
GREAT USER EXPERIENCE
Metadata's role in emerging iTV infrastructure
MPEG-2/4/7
MPEG Encoder
Scene Description Tree
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers and Broadcasters
Enhanced XML Description
"Cisco Systems"
Node
Taalee Semantic Engine
"Cisco Systems"
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta
Object Content Information (OCI)
Metadata-rich Value-added Node
HP 93
Intelligent Metadata Creation
Content which does contain the words the user asked for
Extractor Agents
Content which does not contain the words the user asked for, but is about what he asked for
Value-added Metadata
Content the user did not think to ask for, but which he needs to know
Semantic Associations
+ +
Metadata for Intelligent Content
Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".
Understanding of the content is needed to create new metadata.
Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses such Semantic Associations (another example: COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely.
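The idea on this slide can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Taalee's implementation: the tiny knowledge base (`PLAYS_FOR`, `TEAM_INFO`) and the helper names `enrich` and `matches` are hypothetical.

```python
# Hypothetical miniature knowledge base; a real one would hold far more entities.
PLAYS_FOR = {
    "Roger Clemens": "New York Yankees",
    "Jamal Anderson": "Atlanta Falcons",
}
TEAM_INFO = {
    "New York Yankees": {"sport": "Baseball", "league": "American League"},
    "Atlanta Falcons": {"sport": "Football", "league": "NFL"},
}


def enrich(metadata):
    """Return a copy of `metadata` with team, league and sport added for known players."""
    enriched = {field: list(values) for field, values in metadata.items()}
    for player in metadata.get("players", []):
        team = PLAYS_FOR.get(player)
        if team is None:
            continue  # unknown player: nothing can be inferred
        info = TEAM_INFO[team]
        additions = (("teams", team), ("leagues", info["league"]), ("sports", info["sport"]))
        for field, value in additions:
            bucket = enriched.setdefault(field, [])
            if value not in bucket:
                bucket.append(value)
    return enriched


def matches(metadata, term):
    """Case-insensitive substring match of `term` against every metadata value."""
    term = term.lower()
    return any(term in value.lower() for values in metadata.values() for value in values)


# The story mentions only the player, as in the slide's example.
story = {"players": ["Roger Clemens"]}
print(matches(story, "Yankees"))          # keyword-only metadata misses the story
print(matches(enrich(story), "Yankees"))  # value-added metadata finds it
```

The original metadata is left untouched, so extracted and inferred fields can be stored or audited separately.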
HP 96
Guided Demo for Value-Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html and search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
  Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team = Atlanta Falcons or League = NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value-Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html and search for Player = Gary Sheffield
• Click on the first result (titled "I want out") and view the metadata on the following RMR page
• Here is what you see:
  Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team = Los Angeles Dodgers or League = National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 - Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Producer Name: name of content provider that produced the asset
Posted Date: date of asset posting; extracted automatically
Player Names: names of players mentioned explicitly in the asset; extracted automatically
Team Name: name of team for which player plays; not mentioned explicitly in asset; value-added using Taalee's semantic relationships
League Name: name of league to which the player's team belongs; not mentioned explicitly in asset; value-added by Taalee's processing based on semantic associations
Sport: name of sport
Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.
The asset is richly and fully described in the many ways the users choose to interact.
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context, or relationships of keywords
For example, a search engine may see that the phrase "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
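A small Python sketch of how typed relationships can surface associated content. It assumes a hand-written list of (subject, relation, object) triples and made-up story records; `TRIPLES`, `related` and `stories_for` are hypothetical names, and the real Semantic Content Model is not described at this level of detail in the slides.

```python
# Hypothetical knowledge-base fragment as (subject, relation, object) triples.
TRIPLES = [
    ("Commerce One", "participates_in", "Corporate, Professional & Financial Software"),
    ("Ariba", "participates_in", "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
    ("Cisco", "competes_with", "Lucent"),
]
SYMMETRIC = {"competes_with"}  # relations that hold in both directions

# Hypothetical catalogued stories with already-extracted company metadata.
STORIES = [
    {"title": "Story on Cisco", "companies": ["Cisco"]},
    {"title": "Story on Lucent", "companies": ["Lucent"]},
    {"title": "Story on Ariba", "companies": ["Ariba"]},
]


def related(entity, relation):
    """Entities linked to `entity` by `relation`, honoring symmetric relations."""
    found = set()
    for subject, rel, obj in TRIPLES:
        if rel != relation:
            continue
        if subject == entity:
            found.add(obj)
        elif obj == entity and rel in SYMMETRIC:
            found.add(subject)
    return sorted(found)


def stories_for(entity, stories):
    """Split results into what was asked for and what is semantically related."""
    rivals = related(entity, "competes_with")
    asked_for = [s["title"] for s in stories if entity in s["companies"]]
    associated = [
        s["title"]
        for s in stories
        if entity not in s["companies"] and any(r in s["companies"] for r in rivals)
    ]
    return {"asked_for": asked_for, "related": associated}


print(stories_for("Commerce One", STORIES))
```

A keyword index alone would return nothing for "Commerce One" here; the competitor relationship is what brings in the Ariba story.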
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Internal Source 1 (Research)
Internal Source 2
External feeds/Web (e.g., Reuters)
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults Knowledge Base for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story; passed on to Dashboard
Story on Lucent
Story on Cisco
XCM-compliant metadata, XML or other format
Semantic Application
ASP/Enterprise hosted
Extractor Agent 1
Extractor Agent 2
Extractor Agent 3
Metadata-centric Content Management Architecture
Semantic Engine
World Model
Taalee Metabase
Third-party Content Mgmt and Syndication
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What You Need to Know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with competing PRODUCTS; COMPANIES in same or related INDUSTRY
Regulations (SEC, EPA): impacting INDUSTRY or filed by COMPANY
Technology Products: important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic collation of semantically related digital media information from multiple sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for the future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]
A personal definition: Semantic Web is the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-layered Architecture of Semantic Web
Logical Layer
Formal Semantics and Reasoning Support - OIL, DAML-O
Schema Layer
Definition of Vocabulary - RDF Schema
Data Layer
Simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL - Evolving towards Semantic Web
OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Test May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake <=
    dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30
    AND distance(NuclearTest.latitude, NuclearTest.longitude,
                 Earthquake.latitude, Earthquake.longitude) < 10000
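The rule above can be read as an executable predicate. The Python sketch below is one possible reading, under assumptions the slide does not settle: distance is taken as great-circle distance in miles (haversine formula), the date difference is in days, and the earthquake must not precede the test. `Event`, `distance_miles`, `may_have_caused` and `candidate_pairs` are hypothetical names, not InfoQuilt's API.

```python
import math
from dataclasses import dataclass
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; miles are an assumption about the rule's units


@dataclass(frozen=True)
class Event:
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, using the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """True if the quake follows the test within `max_days` and lies within `max_miles`."""
    days_after = (quake.event_date - test.event_date).days
    if days_after < 0 or days_after >= max_days:
        return False
    separation = distance_miles(test.latitude, test.longitude, quake.latitude, quake.longitude)
    return separation < max_miles


def candidate_pairs(tests, quakes):
    """All (test, quake) pairs that satisfy the rule."""
    return [(t, q) for t in tests for q in quakes if may_have_caused(t, q)]
```

The thresholds are parameters because the rule only yields candidate pairs for further analysis; tightening `max_miles` is the obvious first experiment.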
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example…
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale per year, starting from 1900, find the average number of earthquakes
Number of earthquakes with magnitude > 7 is almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
Knowledge Discovery - Example…
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NewsML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
Semantic Web: www.semanticweb.org
VoiceXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 22
PRISM
Publishing Requirements for Industry Standard Metadata
Version 1.0, April 2001
Authors: IDEAlliance (Adobe, Vignette, Kinecta, et al.)
Idea: "a standard for interoperable content description, interchange and reuse in both traditional and electronic publishing contexts"
Web site: http://www.prismstandard.org
HP 23
PRISM Design
Built on existing standards like Dublin Core (DC), RDF, XML
Designed to be used in a simple, straightforward way over the Internet
Compatible with NewsML
Integrates easily with ICE (for syndication)
Vocabulary:
  Basic: DC
  Extensions: "Controlled Vocabularies", e.g., "North American Industry Classification System" (NAICS)
HP 24
PRISM Example
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845"/>
    <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>
(Source: PRISM spec v1, http://www.prismstandard.org/techdev/prismspec1.asp)
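To show how an application might consume such a description, here is a short Python sketch using only the standard library. It assumes the example's stripped punctuation is restored as standard RDF/XML with the usual RDF and Dublin Core namespace URIs; `PRISM_DESCRIPTION` and `dublin_core_fields` are hypothetical names, and a production system would more likely use a full RDF parser.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# The slide's example with punctuation restored (an assumption); unused namespaces omitted.
PRISM_DESCRIPTION = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845"/>
    <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>"""


def dublin_core_fields(rdf_xml):
    """Map each described resource URI to its Dublin Core fields (name -> list of values)."""
    root = ET.fromstring(rdf_xml)
    records = {}
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        about = description.get(f"{{{RDF_NS}}}about")
        fields = {}
        for child in description:
            if not child.tag.startswith(f"{{{DC_NS}}}"):
                continue  # skip non-Dublin-Core elements
            name = child.tag.split("}", 1)[1]
            text = (child.text or "").strip()
            # Literal values sit in element text; references sit in rdf:resource.
            value = text if text else child.get(f"{{{RDF_NS}}}resource")
            fields.setdefault(name, []).append(value)
        records[about] = fields
    return records


for uri, fields in dublin_core_fields(PRISM_DESCRIPTION).items():
    print(uri, fields["title"], fields["format"])
```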
HP 25
VoiceXML
A language for specifying voice dialogs
Voice dialogs use audio prompts and text-to-speech (TTS) for output; touch-tone keys (DTMF) and automatic speech recognition (ASR) for input
Goal is to bring the advantages of web-based development and content delivery to interactive voice response applications
High-level, voice-specific language simplifies application development
Source: http://www.voicexml.org
HP 26
Voice Based Internet Applications
Source: http://www.voicexml.org
HP 27
Voice XML Metadata
Voice Specific metadata
Supports syntactic interoperability
Text data to voice data
Voice XML = XML + Voice Metadata
HP 28
VoiceXML ndash Possible Services
Information retrieval - news, sports, traffic, stock quotes
e-Transactions (e-commerce, e-tailing, etc.)
  Financial: banking, stock trading
  Catalog browsing (generally as an adjunct to paper)
Telephone services
  Personal voice dialing; one-number find-me services
Intranet - inventory, HR services, corporate portals
Unification - My Whatever, personal portals, personal agents, unified messaging
Source: http://www.voicexml.org
HP 29
MPEG7
A set of description schemes and descriptors to describe the content of multimedia data
Provides a language to specify description schemes
A scheme for coding the description
HP 30
Application Examples for MPEG7
A few application examples are
Digital libraries (image catalog musical dictionary)
Multimedia directory services (eg yellow pages)
Broadcast media selection (radio channel TV channel)
HP 31
Information and Content Exchange (ICE)
Main Goal: efficient and extensible Content Syndication protocol for the Internet using XML syntax
Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.
Status: latest spec version 1.1 (May 2000) submitted to W3C for review
Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, …
Web Site: http://www.icestandard.org
HP 32
What is the ICE Protocol
Syndication protocol for communication between Syndicators and Subscribers
Metadata to define:
  roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver
  format and method of content exchange (e.g., sequenced packages, pull vs. push model)
HP 33
ICE Applications
ICE vocabulary + domain vocabulary = complete application
ICE establishes and manages the syndication, delivers data, logs events => content-independent metadata
Industry-specific vocabulary defines the content => domain-specific metadata
Source: http://www.icestandard.org
HP 34
ICE Explained
ICE: Information and Content Exchange protocol
Syndicator: a content aggregator and distributor
Subscriber: a content consumer
Subscription: an agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement
Collection: the current content of a subscription
ICE Package: a delivery of commands to update a collection, such as the addition of content items
ICE Payload: the XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information
Sources: InternetWeek; ICE Cookbook, version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm
<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe"
             timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321"
                name="Cartoon" filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru"
                     copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC … (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>
Content (domain-specific metadata)
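The split between ICE (content-independent) metadata and domain-specific metadata in this payload can be made explicit in code. The Python sketch below assumes the payload's stripped punctuation is restored as well-formed XML, omits the DOCTYPE, and keeps the encoded image truncated as on the slide; `ICE_PAYLOAD` and `split_metadata` are hypothetical names, not part of any ICE toolkit.

```python
import xml.etree.ElementTree as ET

# Slide example with punctuation restored (an assumption); image data left truncated.
ICE_PAYLOAD = """<ice-payload payload-id="ipl-80a56cfe"
             timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321"
                name="Cartoon" filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru"
                     copyright="Taalee Makeups" pubdate="20010515">PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC...</comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>"""


def split_metadata(payload_xml):
    """Separate ICE protocol attributes from the domain vocabulary carried inside each item."""
    root = ET.fromstring(payload_xml)
    items = []
    for item in root.iter("ice-item"):
        # Children of ice-item belong to the industry-specific vocabulary.
        domain = [{"element": child.tag, **child.attrib} for child in item]
        items.append({"ice": dict(item.attrib), "domain": domain})
    return {"payload": dict(root.attrib), "items": items}


result = split_metadata(ICE_PAYLOAD)
print(result["payload"])             # content-independent: ids, timestamp, version
print(result["items"][0]["domain"])  # domain-specific: comic-strip title, author, ...
```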
HP 36
XCM (eXtended Content Management)
A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:
Content Development: developing static content and managing the process of its subsequent approval, versioning, storage, and retrieval
Application Content Management (Vignette): deploying content dynamically to a Web site and managing that content throughout its online lifecycle
Content Delivery: delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability
Source: http://www.vignette.com/CDA/Site/0,2097,1-1-30-1458-1146-1743,00.html
HP 37
XCM
eXtended Content Management
Content Development / Management
Application Content Management
Content Delivery
Content Authoring
Digital Asset Management
Software Configuration Management
Document Process Management
Metadata Management
Recombination
Personalization
Edge Network Delivery
Streaming Media Delivery
Caching
Source: http://www.vignette.com
HP 38
Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain
FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
…

UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Adress Id: http://gisdasc.kgs.ukans.edu/dasc
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
…

Kansas State
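One common way to reconcile two such models is to map each model's tag names onto a shared vocabulary. The Python sketch below illustrates this with the tags shown on the slide; the common field names (`keywords`, `title`, and so on) and the function `normalize` are hypothetical choices for illustration, and the URL punctuation is restored as an assumption.

```python
# Tag-name mappings from each model to a hypothetical common vocabulary.
FGDC_TO_COMMON = {
    "Theme keywords": "keywords",
    "Title": "title",
    "Online linkage": "url",
    "Direct Spatial Reference Method": "spatial_reference_method",
    "Horizontal Coordinate System Definition": "coordinate_system",
}
UDK_TO_COMMON = {
    "Search terms": "keywords",
    "Topic": "title",
    "Adress Id": "url",  # spelling as on the slide
    "Measuring Techniques": "spatial_reference_method",
    "Co-ordinate System": "coordinate_system",
}


def normalize(record, mapping):
    """Rename a record's tags to the common vocabulary; fail loudly on unmapped tags."""
    unknown = sorted(set(record) - set(mapping))
    if unknown:
        raise KeyError(f"no mapping for tags: {unknown}")
    return {mapping[tag]: value for tag, value in record.items()}


fgdc_record = {
    "Theme keywords": "digital line graph, hydrography, transportation",
    "Title": "Dakota Aquifer",
    "Online linkage": "http://gisdasc.kgs.ukans.edu/dasc",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}
udk_record = {
    "Search terms": "digital line graph, hydrography, transportation",
    "Topic": "Dakota Aquifer",
    "Adress Id": "http://gisdasc.kgs.ukans.edu/dasc",
    "Measuring Techniques": "Vector",
    "Co-ordinate System": "Universal Transverse Mercator",
}

# Both descriptions of the same data set become identical once normalized.
print(normalize(fgdc_record, FGDC_TO_COMMON) == normalize(udk_record, UDK_TO_COMMON))
```

A renaming table only resolves naming differences; differences in value semantics would need an ontology-level mapping.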
HP 39
Different views of Metadata
Metadata
Domain Independent Specifications (RDF)
Frameworks/Infrastructures (XCM)
Application Specific: ICE
Media Specific: MPEG7, VoiceXML
Domain Specific: NewsML, FGDC/UDK
HP 40
Creating and Serving Metadata to Power the Life-cycle of Content
Taalee Infrastructure Services; Taalee Content Applications
Where is the content? Whose is it?
Produce/Aggregate
Catalog/Index
What other content is it related to?
Integrate, Syndicate
What is the right content for this user?
Personalize
What is the best way to monetize this interaction?
Interactive Marketing
Broadcast, Wireline, Wireless, Interactive TV
Taalee Semantic MetaBase
HP 41
Taalee's Intelligent Content Process
HP 42
Metadata Creation and Semanticization
• Automatic Content Classification/Categorization
• Metadata Creation/Extraction: types of metadata created
Semantic Engine and WorldModel are trademarks of Taalee Inc. Metadata Extraction is a patented technology of Taalee Inc.
HP 43
Forms/Types/Ingest of Content
Sources: Web sites, content feeds, and private repositories
Types: text, graphics, audio, video, multimedia
Forms: unstructured text, semi-structured text, structured text (+ media); static or dynamic
Ingest: feed (push), Web (pull), repository/database (usually pull)
HP 44
Content Handling/Ingest
Infrastructure/Exchange
Feed Handlers, Crawlers, Screen Scrapers, Bots, Software Agents
Centralized, Distributed, Mobile/Migratory
HP 45
Information Extraction for Metadata Creation
Global/Enterprise/Web Repositories
METADATA EXTRACTORS
Digital Maps
Nexis, UPI, AP
Documents
Digital Audios
Data Stores
Digital Videos
Digital Images
HP 46
Extracting a Text Document: Syntactic approach
INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT
NATIONAL PREPAREDNESS LEVEL II
CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.
SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats between Galena and McGr The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained while protection of the historic cabin continues.
CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.
LAYOUT
Date => day month int ',' int
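The layout rule for dates (a day name, a month name, an integer, a comma, and an integer) maps naturally onto a regular expression. The Python sketch below is one way to express it; commas are treated as optional because the report text may have lost them, and `DATE_PATTERN` and `extract_date` are hypothetical names.

```python
import re
from datetime import date

DAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"
MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]

# Date => day month int ',' int   (commas optional to tolerate stripped punctuation)
DATE_PATTERN = re.compile(
    r"\b(?P<day>" + DAYS + r"),?\s+"
    r"(?P<month>" + "|".join(MONTHS) + r")\s+"
    r"(?P<dom>\d{1,2}),?\s+"
    r"(?P<year>\d{4})\b"
)


def extract_date(text):
    """Return the first date matching the layout rule, or None if there is none."""
    match = DATE_PATTERN.search(text)
    if match is None:
        return None
    month_number = MONTHS.index(match["month"]) + 1
    return date(int(match["year"]), month_number, int(match["dom"]))


header = "INCIDENT MANAGEMENT SITUATION REPORT Friday August 1 1997 - 0530 MDT"
print(extract_date(header))
```

The pattern checks layout only; it does not verify that the day name agrees with the calendar date.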
Traditional Text Categorization
Statistical/AI Techniques
Classify: place in a taxonomy
Customer Training Set
Routing/Distribution
Customer Article Feed: Article 4715
Standard Metadata
  Feed Source: iSyndicate
  Posted Date: 11/20/2000
Classification of Article 4715
Knowledge-base & Statistical/AI Techniques
Classify: place in a taxonomy
Metadata Catalog
Content Manager
Precise syndication/filtering
Article 4715 Metadata
Standard metadata
  Feed Source: iSyndicate
  Posted Date: 11/20/2000
Semantic metadata
  Company Name: France Telecom, Equant
  Ticker Symbol: FTE, ENT
  Exchange: NYSE
  Topic: Company News
FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
NYSE: Member Companies, Market News, IPOs
Automated Content Enrichment (ACE)
Taalee Enterprise Customization Suite
Taalee's Categorization & Automatic Metadata Creation
Taalee Training Set
Customer Training Set
ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
Classification of Article 4715
Article Feed 4715
Routing/Distribution
Map to another taxonomy
HP 49
Automatic Categorization & Metadata Tagging (unstructured text: transcript of A/V)
ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION AS OF TONIGHT THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49 IN WASHINGTON STATE THE SENATE RACE REMAINS TOO CLOSE TO CALL IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT THE SENATE WILL BE EVENLY DIVIDED IN MISSOURI REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN WHO DIED IN A CRASH THREE WEEKS AGO GOVERNOR CARNAHANS WIFE IS EXPECTED TO TAKE HIS PLACE IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT HILLARY CLINTON WON THE NEW YORK SENATE SEAT SHE IS THE FIRST FIRST LADY TO RUN MUCH LESS WIN
Video Segment with Associated Text
Segment Description
Semantic Metadata
Auto Categorization
HP 50
Automatic Categorization & Metadata Tagging (Web page)
Video with Editorialized Text on the Web
Auto Categorization
Semantic Metadata
HP 51
Automatic Categorization & Metadata Tagging (Feed)
Text from Bloomberg
Auto Categorization
Semantic Metadata
HP 52
Taalee Extraction and Knowledgebase Enhancement
Extraction Agent
Web Page Enhanced Metadata Asset
HP 53
Basis for Semantics
A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary
B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)
Knowledge Base
HP 54
Basis for Semantics
C. Reasoning/Inference
  (Statistical) (Information Retrieval)
  Statistical Learning/AI (Bayesian, Neural Networks, HMM, …)
  Logic Based (Description Logic)
  Natural Language/Grammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methods: Cluster Analysis
Learning/AI and Collab. Filtering
Reference data: Concept-terms, Dictionary/Thesaurus; by topic/industry/subject/domain
Word or Phrase
Ontologies/Domain Models
Knowledge Base: by Entities and Relationships
deeper understanding
HP 56
Open Directory Project (ODP): Classification/Taxonomy & Directory
HP 57
Ontology
Standardize meaning, description, and representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes:
Attributes
Domain Rules
Functional Dependencies
HP 59
An Ontology
Example Interrelated ontologies
LAND USE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL
LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE
WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE
  shredding, magnetic separation, screening, washing
NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI
Relationship: causes (drawn between NATURAL DISASTER concepts)
HP 61
Large Vocabularies, Taxonomies, Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms
Confidential HP
Metadata-enabled Applications
HP 63
Metadata Usage: Impact on Search & Query Processing
Traditional queries based on keywords
Attribute-based queries
Content-based queries
HP 64
Oingo.com
Oingo Ontology - ODP based() the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely-related high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing
• Context of the Content
• Semantic Metadata describing entities (i.e., Company, Industry, etc.) and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
"Normal" Content can only be "found" if the user enters a keyword that exists within it
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create
Context of the Content
Semantic Metadata describing entities (i.e., Company, Industry, etc.) and
Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, ...
HP 70
Metadata Usage Keyword Attribute and Content Based Access
The VisualHarness system at LSDISUGA
HP 71
Keyword Search vs Attribute Search with Semantic metadata
Virage Search on football touchdown
Jimmy Smith Interview Part SevenJimmy Smith explains his philosophy on showboating URL httpcbssportsline
Brian Griese Interview Part FourBrian Griese talks about the first touchdown he ever threw URL httpcbssportsline
Metadata from Typical Cataloging of Football
Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31 Pit 24
httpwwwnflcom
Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFLcom
Posted date: 2022000
HP 72
Taalee's Semantic Search
Highly customizable, precise and freshest AV search
Context and Domain Specific Attributes
Uniform Metadata for Content from Multiple Sources
Can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY amp CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content
Personalization and Targeting, interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries amp Hot Topics
PERSONALIZATION
3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
4 Pink Floyd Collection
Personalized Queries
Set the Controls for the Heart of the Sun...
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
2 My Football Fantasy Team
Gators Spurrier ready for big game
Techs Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone...
Edwards explains reasons for leaving BYU more...
more...
more...
more...
1 Election 2000
2 Middle East Peace Conflict
3 Napster Controversy
Video Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Coles security
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II morehellip
HOT Topics
morehellip
morehellip
HP 80
Metadata Targeting
SemanticInteractive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
Realtime Feeds
Interests Preferences
Time-ShiftedContent Aggregator
Web sites and Pages
ContentDatabases Personalized
Content
Semantic EngineTM
Personalized Content
Content
Structured Hi-Quality
Semantic Metabase
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site
User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and the knowledge of user's portfolio. Typically search can be done by typing one word or selecting from a dynamic personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions listen set alert on story add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
1108 ON24 Payne
1107 ON24 H&Q
1106 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst
HP 87
iTV: Taalee's Extreme Personalization
Content Provider
(DBS DISH Wink AOL-TV)
Semantic EngineTM
Meta-DataTagged Content
ContentldquoProgramsrdquo
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-QualitySemantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by personalization application to show only the stocks that user is interested in
This screen is customizable with interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO
Part of the screen can be automatically customized to show conference call specific information, including transcript, participation etc., all of which are relevant metadata
Conference Call itself can have embedded metadata to support personalization and interactivity
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support
Sony
Categorize
Catalog
Integrate
Collection
Processing
NetworkContent
AffiliateFeeds
Public Sources Rich Data
Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko facing the central business district of Lagos across a lagoon
At least 60 people died in this needless fire senior local official Karimu Alabi said
Fire crews from state-run Nigerian National Petroleum Corp (NNPC) which owns the pipeline were joined by other firemen from construction company Julius Berger in battling the blaze
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak ravaging a cluster of huts and log houses
At about the same time a second fire razed Makoko shantytown where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were
Gore Demands That Recount Restart (940 PM)
Gore Says Fla Cant Name Electors (450 PM)
Bush Meets Colin Powell at Ranch (122 PM)
Market Tumbles on Earnings Warning (927 AM)
Barak Outlines His Peace Plan (630 AM)
-- Breaking News for 11302000 --
Customize Page Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11302000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --Gore Demands That Recount Restart
Gore Says Fla Cant Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(133) ndash 120600 - ABC
(253) - 120600 - CBS
(516) - 120600 - ABC
(246) - 120600 - FOX
(133) - 120600 - NBC
(533) - 120600
(357) - 120600 - CBS
(427) - 120600 - ABC
(344) - 120600 - FOX
(724) - 120600 - CBS
(133) - 120600 - CBS
TALLAHASSEE Florida (CNN) ndashThough the two presidential candidates have until noon Wednesday to file briefs in Al Gores appeal to the Florida Supreme Court the outcome of two trials set on the same day in Leon County Florida may offer Gore his best hope for the presidency Democrats in Seminole County are seeking to have 15000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5000 votes statewide Lawyers for the plaintiff Harry Jacobs claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2126 incomplete absentee ballot applications sent in by GOP voters while refusing to allow Democratic workers to do the same thing for Democratic voters
The GOP says that suit and one similar to it from Martin County demonstrates Democratic Party politics at its most desperate Gore is not a party to either of those lawsuits On Tuesday the judge in the
(133) - 120600 - ABC
(233) - 120600 - CBS
(312) - 120600 - NNS
(032) - 120600 - CBS
(133) - 120600 - CBS
Description
Produced by: CNN
Posted Date: 12072000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore
HP 92
Retrieve Scene Description Track
Enhanced Digital Cable
Video
MPEGDecoder
Node = AVO Object
Create Scene Description Tree
GREATUSER
EXPERIENCE
Metadata's role in emerging iTV infrastructure
MPEG-2/4/7
MPEG Encoder
Scene Description Tree
License metadata decoder and semantic applications to
device makers
Channel salesthrough Video Server Vendors
Video App Servers and Broadcasters
Enhanced XML
Description
"Cisco Systems"
Node
Taalee Semantic Engine
"Cisco Systems"
Produced by: Fox Sports
Creation Date: 12052000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta
Object Content Information (OCI)
Metadata-rich Value-added Node
HP 93
Intelligent Metadata Creation
Content which does contain the words the user asked for
Extractor Agents
Content which does not contain the words the user asked for but is about what he asked for
Value-added Metadata
Content the user did not think to ask for but which he needs to know
Semantic Associations
Metadata for Intelligent Content
Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"
Understanding of the content is needed to create new metadata
Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
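The Roger Clemens example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `KNOWLEDGE_BASE`, `FIELD_FOR`, `enrich` and `matches` are hypothetical, the knowledge base is a hand-built dictionary holding just the facts from the slide, and nothing here reflects how Taalee's engine is actually implemented.

```python
# Hypothetical knowledge base: entity -> {relationship: value}.
# The facts come from the slide; the data structure is an assumption.
KNOWLEDGE_BASE = {
    "Roger Clemens": {
        "type": "PERSON",
        "plays_sport": "Baseball",
        "plays_for_team": "New York Yankees",
    },
}

# Which metadata field each relationship contributes to (hypothetical names).
FIELD_FOR = {"plays_sport": "sports", "plays_for_team": "teams"}


def enrich(metadata):
    """Return a copy of the metadata with value-added fields from the knowledge base."""
    enriched = {key: list(values) for key, values in metadata.items()}
    for person in metadata.get("people", []):
        facts = KNOWLEDGE_BASE.get(person, {})
        for relationship, field_name in FIELD_FOR.items():
            value = facts.get(relationship)
            if value and value not in enriched.setdefault(field_name, []):
                enriched[field_name].append(value)
    return enriched


def matches(metadata, query):
    """True if the query occurs in any metadata value (case-insensitive)."""
    needle = query.lower()
    return any(needle in value.lower()
               for values in metadata.values() for value in values)


# Metadata extracted from a story that mentions only the player.
story = {"people": ["Roger Clemens"]}
enriched_story = enrich(story)
```

With the extracted metadata alone, `matches(story, "Yankees")` is false; after enrichment, `matches(enriched_story, "Yankees")` is true, which is the effect the slide describes.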
HP 96
Guided Demo for Value Added Metadata - Example One
• Go to httpwwwmediaanywherecomFootballhtml & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFLcom; Posted Date: 9202000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata - Example Two
• Go to httpwwwmediaanywherecomBaseballhtml & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3032001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 - Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships
The asset is richly, fully described in the many ways the users chose to interact
Producer Name: name of content provider that produced the asset
Posted Date: date of asset posting. Extracted automatically
Player Names: names of players mentioned explicitly in the asset. Extracted automatically
Team Name: name of team for which player plays. Not mentioned explicitly in asset. Value-added using Taalee's semantic relationships
League Name: name of league to which the player's team belongs. Not mentioned explicitly in asset. Value-added by Taalee's processing based on semantic associations
Sport: name of sport
Legend: an arrow from X to Y means Taalee uses X to add Y as value-added metadata to the asset
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context or relationships of keywords
For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
HP 103
Example (test on httpdirectorymediaanywherecom)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
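The Commerce One example can be sketched as a lookup over relationship triples. This is a minimal illustration, assuming a toy in-memory store: `TRIPLES`, `NEWS`, `related` and `search_with_associations` are hypothetical names, the news items are placeholders, and only the three facts stated on the slide are encoded.

```python
# Hypothetical (subject, relationship, object) triples; facts taken from the slide.
TRIPLES = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Commerce One", "participates_in", "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]

# Placeholder catalog of news assets, each tagged with one company.
NEWS = [
    {"title": "Story on Commerce One", "company": "Commerce One"},
    {"title": "Story on Ariba", "company": "Ariba"},
    {"title": "Story on Cisco", "company": "Cisco"},
]


def related(entity, relationship):
    """Return all objects linked to the entity by the given relationship."""
    return [obj for subj, rel, obj in TRIPLES
            if subj == entity and rel == relationship]


def search_with_associations(company):
    """Return (what the user asked for, semantically related news)."""
    asked_for = [item for item in NEWS if item["company"] == company]
    competitors = related(company, "competes_with")
    need_to_know = [item for item in NEWS if item["company"] in competitors]
    return asked_for, need_to_know


asked_for, need_to_know = search_with_associations("Commerce One")
```

A keyword index alone would return only `asked_for`; the `competes_with` relationship is what brings the Ariba story into `need_to_know`.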
HP 104
Internal Source 1Research
Internal Source 2
External feedsWeb(eg Reuters)
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults KnowledgeBase for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story, passed on to Dashboard
Story on Lucent
Story on Cisco
XCM-compliant metadata, XML or other format
Semantic Application
ASP / Enterprise hosted
Extractor Agent 1
Extractor Agent 2
Extractor Agent 3
Metadata-centric Content Management Architecture
Semantic Engine
World Model
Taalee Metabase
Third-party Content Mgmt and Syndication
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News
Competition
COMPANIES in INDUSTRY with Competing PRODUCTS
COMPANIES in Same or Related INDUSTRY
SEC, EPA
Regulations
Impacting INDUSTRY or Filed By COMPANY
Technology Products
Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application ExampleFinancial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic SearchPersonalization etc
A vision for future
Semantic Web Complex Relationships and Knowledge Discovery
Eg InfoQuilt project at LSDIS Lab Univ of Georgia
HP 108
Beyond RDF - one proposal (cf Ora Lassila)
Structural modeling obviously not enough
We need a "logic layer" on top of RDF
Some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic rules which allow us to draw inference from this data
RDF + DL = "Frame System for WWW"
Source wwwontoknowledgeorgoil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture" [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]
A personal definition
Semantic Web: The concept that Web-accessible content can be organized semantically rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable tags
wwwdamlorg
DAML Example
Source httpwwwzdnetcompcweekstoriesjumps04270243294600html
Three layered Architecture Of Semantic Web
Logical Layer
Formal Semantics and Reasoning Support - OIL, DAML-O
Schema Layer
Definition of Vocabulary - RDF Schema
Data Layer
Simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL - Evolving towards Semantic Web
OIL Mission
OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources(USGS NEIC)
Nuclear Test Sources(Oklahoma Observatory etc)
Nuclear Test May Cause Earthquakes
Is it really true
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake <= dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30 AND distance(NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude) < 10000
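The rule above can be written as an executable predicate. The sketch below is one possible reading, not the InfoQuilt implementation: it assumes `dateDifference` is measured in days from test to earthquake, that `distance` is a great-circle distance in miles (computed here with the haversine formula), and that events are plain dictionaries. The function names and the sample records are hypothetical.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in miles (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """Apply the rule: the quake follows the test by fewer than max_days days
    and lies fewer than max_miles miles from the test site."""
    days = (quake["event_date"] - test["event_date"]).days
    if not 0 <= days < max_days:
        return False
    return distance_miles(test["latitude"], test["longitude"],
                          quake["latitude"], quake["longitude"]) < max_miles


# Invented records, for illustration only (not real events).
sample_test = {"event_date": date(2000, 1, 1), "latitude": 0.0, "longitude": 0.0}
quake_near = {"event_date": date(2000, 1, 10), "latitude": 0.0, "longitude": 1.0}
quake_late = {"event_date": date(2000, 3, 1), "latitude": 0.0, "longitude": 1.0}
quake_before = {"event_date": date(1999, 12, 25), "latitude": 0.0, "longitude": 1.0}
```

Treating a negative date difference as a non-match is a design choice that follows the slide's wording ("some time after the nuclear test"); the rule as printed does not state it explicitly.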
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example...
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9 and >9 on the Richter scale, per year starting from 1900, find average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
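The grouping query can be sketched as a small aggregation. This is an illustration only: the slide does not say how range boundaries are handled, so the sketch assumes half-open ranges (lower bound included, upper bound excluded) and folds a magnitude of exactly 9 into the top group. The names `RANGES`, `bucket` and `average_per_year` are hypothetical, and the sample records are invented rather than real seismic data.

```python
from collections import Counter

# (lower bound inclusive, upper bound exclusive, label); boundaries are an assumption.
RANGES = [
    (5.8, 6.0, "5.8-6"),
    (6.0, 7.0, "6-7"),
    (7.0, 8.0, "7-8"),
    (8.0, 9.0, "8-9"),
    (9.0, float("inf"), ">9"),
]


def bucket(magnitude):
    """Return the label of the range containing the magnitude, or None if below 5.8."""
    for low, high, label in RANGES:
        if low <= magnitude < high:
            return label
    return None


def average_per_year(quakes, first_year, last_year):
    """Average yearly earthquake count per magnitude range over the given years.

    quakes is an iterable of (year, magnitude) pairs.
    """
    counts = Counter()
    for year, magnitude in quakes:
        label = bucket(magnitude)
        if label is not None and first_year <= year <= last_year:
            counts[label] += 1
    years = last_year - first_year + 1
    return {label: counts[label] / years for _, _, label in RANGES}


# Invented (year, magnitude) records, for illustration only.
sample_quakes = [(1900, 5.9), (1900, 6.5), (1901, 6.2), (1901, 9.1), (1901, 5.0)]
averages = average_per_year(sample_quakes, 1900, 1901)
```

Running the same aggregation separately for the years before and after the first test is one way to reproduce the comparison the slide draws.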
Knowledge Discovery - Example...
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and the test site lay within a radius of 10000 miles of the epicenter of the earthquake
Demo
ResourcesReferences
RDF: wwww3orgTRREC-rdf-syntax
ICE: wwwicestandardorg
Meta Object Facility (MOF) Specification, Version 13, September 27 1999: httpcgiomgorgcgi-bindocad99-09-05
XML Metadata Interchange (XMI) Specification, Version 11, October 25 1999: httpcgiomgorgcgi-bindocad9910-02 httpcgiomgorgcgi-bindocad99-10-03
DAML: wwwdamlorg
NEWSML: newsshowcasereuterscom
PRISM: wwwprismstandardorgtechdevprismspec1asp
XCM: wwwvignettecom
OIL: wwwontoknowledgeorgoil
SEMANTICWEB: wwwsemanticweborg
VOICEXML: wwwvoicexmlorg
MPEG7: wwwdarmstadtgmddemobileMPEG7
Taalee: wwwtaaleecom
Oingo: wwwoingocom
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 23
PRISM Design
Built on existing standards like Dublin Core (DC), RDF, XML
Designed to be used in a simple, straightforward way over the Internet
Compatible with NewsML
Integrates easily with ICE (for syndication)
Vocabulary:
Basic DC
Extensions
"Controlled Vocabularies", e.g. "North American Industrial Classification System" (NAICS)
HP 24
PRISM Example
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845" />
    <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>
(Source: PRISM spec v 1, httpwwwprismstandardorgtechdevprismspec1asp)
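As an illustration of how an application might consume such a record, the sketch below reads the Dublin Core elements out of an RDF/XML description using Python's standard `xml.etree.ElementTree` module. The embedded record is a shortened version of the example, with URL punctuation restored on the assumption that it was lost in extraction; the function name `dublin_core_fields` is hypothetical and this is not part of the PRISM specification.

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs for RDF and Dublin Core elements 1.1.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Shortened record; the restored URL punctuation is an assumption.
PRISM_RECORD = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>"""


def dublin_core_fields(xml_text):
    """Map each described resource to a dict of its Dublin Core element values."""
    root = ET.fromstring(xml_text)
    dc_prefix = "{%s}" % NS["dc"]
    records = {}
    for description in root.findall("rdf:Description", NS):
        about = description.get("{%s}about" % NS["rdf"])
        fields = {}
        for child in description:
            # Keep only elements in the Dublin Core namespace.
            if child.tag.startswith(dc_prefix):
                fields[child.tag[len(dc_prefix):]] = (child.text or "").strip()
        records[about] = fields
    return records


records = dublin_core_fields(PRISM_RECORD)
```

Because the descriptive fields use the shared Dublin Core vocabulary, the same function would work on records from any publisher that follows the same conventions.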
HP 25
VoiceXML
A language for specifying voice dialogs
Voice dialogs use audio prompts and text-to-speech (TTS) for output; touch-tone keys (DTMF) and automatic speech recognition (ASR) for input
Goal is to bring the advantages of web-based development and content delivery to interactive voice response applications
High-level, voice-specific language simplifies application development
Source httpwwwvoicexmlorg
HP 26
Voice Based Internet Applications
Source httpwwwvoicexmlorg
HP 27
Voice XML Metadata
Voice Specific metadata
Supports Syntactic interoperability
Text data to voice data
Voice XML = XML + Voice Metadata
HP 28
VoiceXML - Possible Services
Information retrieval - News, sports, traffic, stock quotes
e-Transactions (e-commerce, e-tailing, etc.)
Financial, banking, stock trading
Catalog browsing (generally as an adjunct to paper)
Telephone services
Personal voice dialing, One-number find-me services
Intranet - Inventory, HR services, corporate portals
Unification - My Whatever, personal portals, personal agents, unified messaging
Source httpwwwvoicexmlorg
HP 29
MPEG7
A set of description schemes and descriptors to describe the content of multimedia data
Provides a language to specify description schemes
A scheme for coding the description
HP 30
Application Examples for MPEG7
A few application examples are
Digital libraries (image catalog musical dictionary)
Multimedia directory services (eg yellow pages)
Broadcast media selection (radio channel TV channel)
HP 31
Information and Content Exchange (ICE)
Main Goal: efficient and extensible Content Syndication protocol for the Internet using XML syntax
Authors: Adobe, Kinecta, MS, Sun, Vignette, et al
Status: latest spec version 11, May 2000, submitted to W3C for review
Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, ...
Web Site: httpwwwicestandardorg
HP 32
What is the ICE Protocol
Syndication Protocol for communication between Syndicators and Subscribers
Metadata to define
roles and responsibilities of involved parties: Subscriber vs Syndicator, Requestor vs Responder, Sender vs Receiver
format and method of content exchange (e.g., sequenced packages, pull vs push model)
HP 33
ICE Applications
ICE vocabulary + domain vocabulary = complete application
ICE
establishes and manages the syndication
delivers data
logs events
=> content-independent metadata
Industry-specific vocabulary defines the content => domain-specific metadata
Source httpwwwicestandardorg
HP 34
ICE Explained
ICE: Information and Content Exchange protocol
Syndicator: A content aggregator and distributor
Subscriber: A content consumer
Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement
Collection: The current content of a subscription
ICE Package: A delivery of commands to update a collection, such as the addition of content items
ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information
Sources: InternetWeek; ICE Cookbook version 10; httpwwwinternetweekcomebizapps01ebiz050701-3htm
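The relationship between a subscription, its collection and the packages that update it can be pictured with a toy model. The sketch below is not the ICE wire protocol and uses no ICE library: the `Package` and `Subscription` classes, their field names, and the rule that a package is applied at most once are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Package:
    """A delivery of commands that update a collection (hypothetical shape)."""
    package_id: str
    add: dict = field(default_factory=dict)      # item_id -> content to add or replace
    remove: list = field(default_factory=list)   # item_ids to delete


@dataclass
class Subscription:
    """An agreement between a subscriber and a syndicator, plus its collection."""
    subscriber: str
    syndicator: str
    collection: dict = field(default_factory=dict)   # current content of the subscription
    applied: list = field(default_factory=list)      # ids of packages already processed

    def apply(self, package):
        """Apply a package once; return False if it was already processed."""
        if package.package_id in self.applied:
            return False
        for item_id in package.remove:
            self.collection.pop(item_id, None)
        self.collection.update(package.add)
        self.applied.append(package.package_id)
        return True


# Placeholder parties and items, for illustration only.
sub = Subscription(subscriber="example-subscriber", syndicator="example-syndicator")
sub.apply(Package("pkg-1", add={"4321": "Cartoon"}))
sub.apply(Package("pkg-2", add={"4322": "Headline"}, remove=["4321"]))
```

After both packages are applied, the collection holds only item 4322, which matches the glossary's description of a collection as the current content of a subscription.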
<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "httpicedtd">
<ice-payload payload-id="ipl-80a56cfe" timestamp="05-15-2001T110001" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon" filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru" copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC ... (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>
Content (domain-specific metadata)
HP 36
XCM (eXtended Content Management)
a framework that allows customers to classify content management offerings according to the business problems they address The segments of XCM are
Content Development - Developing static content and managing the process of its subsequent approval versioning storage and retrieval
Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle
Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability
Source httpwwwvignettecomCDASite020971-1-30-1458-1146-174300html
HP 37
XCM
eXtended Content Management
Content DevelopmentManagement
Content DeliveryApplication ContentManagement
Content AuthoringDigital Asset Management
Software ConfigurationManagement
Document ProcessManagement
Metadata ManagementRecombinationPersonalization
Edge Network Delivery
Streaming Media DeliveryCaching
Source httpwwwvignettecom
HP 38
Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain (Kansas State)
FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc/
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
...
UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Adress Id: http://gisdasc.kgs.ukans.edu/dasc/
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
...
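One common way to reconcile two models like these is a tag crosswalk. The Python sketch below is only an illustration under stated assumptions: the tag pairs are the five shown on the slide, while the names `FGDC_TO_UDK` and `fgdc_to_udk` and the pass-through behavior for unmapped tags are hypothetical.

```python
# Crosswalk built from the tag pairs on the slide (FGDC name -> UDK name).
FGDC_TO_UDK = {
    "Theme keywords": "Search terms",
    "Title": "Topic",
    "Online linkage": "Adress Id",
    "Direct Spatial Reference Method": "Measuring Techniques",
    "Horizontal Coordinate System Definition": "Co-ordinate System",
}


def fgdc_to_udk(record):
    """Rename FGDC tags to their UDK counterparts.

    Tags without a known counterpart are passed through unchanged
    (an assumption; a real crosswalk might reject or log them).
    """
    return {FGDC_TO_UDK.get(tag, tag): value for tag, value in record.items()}


if __name__ == "__main__":
    fgdc_record = {
        "Title": "Dakota Aquifer",
        "Direct Spatial Reference Method": "Vector",
    }
    print(fgdc_to_udk(fgdc_record))
```

A renaming table handles only the case where both models describe the same value under different names; differences in units, granularity, or meaning need the semantic approaches discussed later in the workshop.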
HP 39
Different views of Metadata
Domain Independent Specifications (RDF)
Frameworks/Infrastructures (XCM)
Application Specific: ICE
Media Specific: MPEG7, VoiceXML
Domain Specific: NewsML, FGDC/UDK
HP 40
Creating and Serving Metadata to Power the Life-cycle of Content
Taalee Infrastructure Services; Taalee Content Applications
Where is the content? Whose is it? Produce/Aggregate, Catalog/Index
What other content is it related to? Integrate, Syndicate
What is the right content for this user? Personalize
What is the best way to monetize this interaction? Interactive Marketing
Broadcast, Wireline, Wireless, Interactive TV
Taalee Semantic MetaBase
HP 41
Taalee's Intelligent Content Process
HP 42
Metadata Creation and Semanticization
• Automatic Content Classification/Categorization
• Metadata Creation/Extraction
• Types of metadata created
Semantic Engine and WorldModel are trademarks of Taalee Inc. Metadata Extraction is a patented technology of Taalee Inc.
HP 43
Forms/Types/Ingest of Content
Sources: Web Sites, Content Feeds, and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)
HP 44
Content Handling/Ingest
Infrastructure/Exchange, Feed Handlers, Crawlers, Screen Scrapers, Bots, Software Agents
Centralized, Distributed, Mobile/Migratory
HP 45
Information Extraction for Metadata Creation
Global/Enterprise/Web Repositories
METADATA EXTRACTORS
Digital Maps
Nexis, UPI, AP
Documents
Digital Audios
Data Stores
Digital Videos
Digital Images
HP 46
Extracting a Text Document: Syntactic approach
INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT
NATIONAL PREPAREDNESS LEVEL II
CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.
SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.
CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.
LAYOUT
Date => day month int ',' int
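The layout rule above (day name, month name, integer, comma, integer) maps naturally onto a regular expression. The following Python sketch is one possible reading of that rule, not the extractor used in the original system; the names `DATE_RE` and `extract_date` are hypothetical, and the commas are treated as optional because the punctuation of the source report is not reliable.

```python
import re

DAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"
MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")

# Date => day month int ',' int   (commas optional: an assumption)
DATE_RE = re.compile(
    rf"(?P<day>{DAYS}),?\s+(?P<month>{MONTHS})\s+"
    rf"(?P<dom>\d{{1,2}}),?\s+(?P<year>\d{{4}})"
)


def extract_date(text):
    """Return the first date matching the layout rule as a dict, or None."""
    match = DATE_RE.search(text)
    return match.groupdict() if match else None


if __name__ == "__main__":
    header = "INCIDENT MANAGEMENT SITUATION REPORT Friday, August 1, 1997 - 0530 MDT"
    print(extract_date(header))
```

A purely syntactic rule like this recovers the date fields but says nothing about what the report is about, which is the gap the semantic techniques on the following slides address.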
Traditional Text Categorization
Statistical/AI Techniques
Classify/Place in a taxonomy
Customer Training Set
Routing/Distribution
Customer Article Feed: Article 4715
Standard Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Classification of Article 4715
Knowledge-base & Statistical/AI Techniques
Classify/Place in a taxonomy
Metadata Catalog
Content Manager
Precise syndication/filtering
Article 4715 Metadata
Standard metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Semantic metadata
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News
FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
NYSE: Member Companies, Market News, IPOs
Automated Content Enrichment (ACE)
Taalee Enterprise Customization Suite
Taalee's Categorization & Automatic Metadata Creation
Taalee Training Set
Customer Training Set
ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
Classification of Article 4715
Article Feed: Article 4715
Routing/Distribution
Map to another taxonomy
HP 49
Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.
Video Segment with Associated Text
Segment Description
Semantic Metadata
Auto Categorization
HP 50
Automatic Categorization & Metadata Tagging (Web page)
Video with Editorialized Text on the Web
Auto Categorization
Semantic Metadata
HP 51
Automatic Categorization & Metadata Tagging (Feed)
Text from Bloomberg
Auto Categorization
Semantic Metadata
HP 52
Taalee Extraction and Knowledgebase Enhancement
Extraction Agent
Web Page Enhanced Metadata Asset
HP 53
Basis for Semantics
A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary
B. Facts with Relationships: Taxonomy (Categories), Ontology/Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)
Knowledge Base
HP 54
Basis for Semantics
C. Reasoning/Inference
(Statistical) (Information Retrieval)
Statistical Learning/AI (Bayesian, Neural Networks, HMM, ...)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methods/Cluster Analysis
Learning/AI and Collab. Filtering
Reference data/Concept-terms/Dictionary/Thesaurus, by topic/industry/subject/domain
Word or Phrase
Ontologies/Domain Models
KnowledgeBase, by Entities and Relationships
deeper understanding
HP 56
Open Directory Project (ODP): Classification/Taxonomy & Directory
HP 57
Ontology
Standardize meaning, description, and representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes: Attributes, Domain Rules, Functional Dependencies
HP 59
An Ontology
Example: Interrelated ontologies
[Figure: four interrelated ontologies; the exact links between nodes are not recoverable from the extracted text]
LANDUSE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL
LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE
WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; processes: shredding, magnetic separation, screening, washing
NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI; several disaster types are linked by "causes" relationships
HP 61
Large Vocabularies, Taxonomies, Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.
Metadata-enabled Applications
HP 63
Metadata Usage: Impact on Search & Query Processing
traditional queries based on keywords
attribute based queries
content-based queries
HP 64
Oingo.com
Oingo Ontology - ODP based (?): the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning, used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged, and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:
• Context of the Content
• Semantic Metadata describing entities (e.g., Company, Industry), and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
"Normal" Content can only be "found" if the user enters a keyword that exists within it
Intelligent Content = Content + related metadata and relationships
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:
Context of the Content
Semantic Metadata describing entities (e.g., Company, Industry), and
Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, ...
HP 70
Metadata Usage Keyword Attribute and Content Based Access
The VisualHarness system at LSDIS/UGA
HP 71
Keyword Search vs Attribute Search with Semantic metadata
Virage search on "football touchdown":
Jimmy Smith Interview, Part Seven: Jimmy Smith explains his philosophy on showboating. URL: httpcbssportsline
Brian Griese Interview, Part Four: Brian Griese talks about the first touchdown he ever threw. URL: httpcbssportsline
Metadata from Typical Cataloging of Football Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31, Pit 24
http://www.nfl.com
Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2022000
HP 72
Taalee's Semantic Search
Highly customizable, precise, and freshest A/V search
Context and Domain Specific Attributes
Uniform Metadata for Content from Multiple Sources; can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY & CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content
Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you. Please enter such semantic keywords below.
Change Context
PERSONALIZATION: Personalized Queries & Hot Topics
Personalized Queries
1. My Stock Portfolio: Microsoft suffers serious hack attack; Cisco Systems Inc; Analyst Safa Rashtchy on Yahoo; PeopleSoft Inc; AT&T Corp; more...
2. My Football Fantasy Team: Gators' Spurrier ready for big game; Tech's Vick looks to become complete QB; Bucs excited about Hamilton; Jasper Sanks rumbles into the end zone...; Edwards explains reasons for leaving BYU; more...
3. Julia Roberts Collection: Movie Trailer: Notting Hill; Trailer - Runaway Bride; Patrick; Movie Trailer: Stepmom; Conspiracy Theory; more...
4. Pink Floyd Collection: Set the Controls for the Heart of the Sun...; Wish You Were Here; Round And Around; Keep Talking; The Post War Dream; more...
HOT Topics
1. Election 2000: Video: Explaining the electoral map; Race for White House hots up; Seniors Give Gore Florida Edge; more...
2. Middle East Peace Conflict: More die as Israel steps up security; Israel braces for suicide bombs; Pentagon probes Cole's security; more...
3. Napster Controversy: The Brain Behind Napster; Napster Lawsuit; Creative Nomad II; more...
HP 80
Metadata Targeting
Semantic/Interactive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
Content sources: Realtime Feeds, Web Sites and Pages, Content Databases
Time-Shifted Content Aggregator
Semantic Engine(TM)
Structured Hi-Quality Semantic Metabase
User input: Interests, Preferences
Output: Personalized Content
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site
User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and the knowledge of user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst
HP 87
iTV: Taalee's Extreme Personalization
Content Provider
(DBS, DISH, Wink, AOL-TV)
Semantic Engine(TM)
Meta-Data Tagged Content
Content "Programs"
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-Quality Semantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by personalization application to show only the stocks that user is interested in
This screen is customizable with interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO
Part of the screen can be automatically customized to show conference call specific information, including transcript, participation, etc., all of which are relevant metadata
Conference Call itself can have embedded metadata to support personalization and interactivity
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support (Sony)
Categorize
Catalog
Integrate
Collection, Processing
Network Content
Affiliate Feeds
Public Sources, Rich Data
Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore
HP 92
Metadata's role in emerging iTV infrastructure
[Figure: iTV delivery pipeline; the connections between components are not recoverable from the extracted text]
Components shown: Video; MPEG Encoder (MPEG-2/4/7); Create Scene Description Tree; Scene Description Tree (Node = AVO Object); Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; GREAT USER EXPERIENCE
Taalee Semantic Engine: Node "Cisco Systems"; Enhanced XML Description; Metadata-rich Value-added Node
Object Content Information (OCI): Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: John Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers, and Broadcasters
HP 93
Intelligent Metadata Creation
Extractor Agents: Content which does contain the words the user asked for
+ Value-added Metadata: Content which does not contain the words the user asked for, but is about what he asked for
+ Semantic Associations: Content the user did not think to ask for, but which he needs to know
Metadata for Intelligent Content Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example: If a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"
Understanding of the content is needed to create new metadata
Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
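The Roger Clemens example can be sketched in a few lines of Python. This is only an illustration of the idea of value-added metadata, not Taalee's implementation: the `KNOWLEDGE_BASE` contents, the field names, and the functions `enrich` and `search` are all hypothetical.

```python
# Hypothetical knowledge base: entity -> facts known about it.
KNOWLEDGE_BASE = {
    "Roger Clemens": {"type": "PERSON", "sport": "Baseball", "team": "New York Yankees"},
}


def enrich(metadata):
    """Return a copy of the asset metadata with team and sport added
    for every player the knowledge base knows about."""
    enriched = {field: list(values) for field, values in metadata.items()}
    for player in metadata.get("Players", []):
        facts = KNOWLEDGE_BASE.get(player)
        if facts is None:
            continue  # unknown entity: nothing to add
        for field, key in (("Teams", "team"), ("Sport", "sport")):
            values = enriched.setdefault(field, [])
            if facts[key] not in values:
                values.append(facts[key])
    return enriched


def search(assets, field, value):
    """Attribute search: assets whose metadata field contains the value."""
    return [asset for asset in assets if value in asset.get(field, [])]


if __name__ == "__main__":
    story = {"Players": ["Roger Clemens"]}  # extracted from the text alone
    print(search([story], "Teams", "New York Yankees"))          # not found
    print(search([enrich(story)], "Teams", "New York Yankees"))  # found
```

The enrichment step happens at cataloging time, so a later search for the team succeeds even though the team name never appears in the story.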
HP 96
Guided Demo for Value Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player of the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player of the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 - Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Posted Date: Date of asset posting - extracted automatically
Producer Name: Name of content provider that produced the asset
Player Names: Names of players mentioned explicitly in the asset - extracted automatically
Team Name: Name of team for which player plays - not mentioned explicitly in asset - value-added using Taalee's semantic relationships
League Name: Name of league to which the player's team belongs - not mentioned explicitly in asset - value-added by Taalee's processing based on semantic associations
Sport: Name of sport
Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships
The asset is richly, fully described in the many ways the users chose to interact
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context, or relationships of keywords
For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
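The Commerce One example amounts to a small set of entity-relationship facts. The Python sketch below stores them as triples and looks up related entities; the relation names, the `TRIPLES` list, the `related` function, and the choice to treat `competes_with` as symmetric are assumptions made for illustration rather than a description of Taalee's Semantic Content Model.

```python
# Facts from the slide, written as (subject, relation, object) triples.
TRIPLES = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Ariba", "is_a", "COMPANY"),
    ("Commerce One", "participates_in", "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]

# Relations assumed to hold in both directions.
SYMMETRIC = {"competes_with"}


def related(entity, relation):
    """Return the entities linked to `entity` by `relation`, sorted by name."""
    found = {obj for subj, rel, obj in TRIPLES if subj == entity and rel == relation}
    if relation in SYMMETRIC:
        found |= {subj for subj, rel, obj in TRIPLES if obj == entity and rel == relation}
    return sorted(found)


if __name__ == "__main__":
    # A query about Commerce One can also surface news about its competitors.
    print(related("Commerce One", "competes_with"))
    print(related("Ariba", "competes_with"))
```

With such facts available, an application can attach links to competitor news alongside the results the user explicitly asked for.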
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata centric Content Management Architecture
Sources: Internal Source 1 (Research), Internal Source 2, External feeds/Web (e.g., Reuters)
Components: Extractor Agents 1-3, Semantic Engine, World Model, Taalee Metabase, Semantic Application (ASP/Enterprise hosted), Third-party Content Mgmt and Syndication
Output: XCM-compliant metadata, XML or other format
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults KnowledgeBase for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story; passed on to Dashboard
Story on Cisco; Story on Lucent
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What You Need to Know
COMPANY
Related Stock News
Industry News: COMPANIES in Same or Related INDUSTRY
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS
Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY
Technology, Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling obviously not enough
we need a "logic layer" on top of RDF
some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic rules which allow us to draw inference from this data
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]
A personal definition
Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)?
a proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF+XML
Agent readable Tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-layered Architecture of Semantic Web
Logical Layer: Formal Semantics and Reasoning Support - OIL, DAML-O
Schema Layer: Definition of Vocabulary - RDF Schema
Data Layer: Simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL - Evolving towards Semantic Web
OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Test May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake
  <= dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
  AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
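The relationship above can be read as an executable predicate. The Python sketch below is one hedged interpretation, not the InfoQuilt implementation: it assumes the date difference is measured in days and must be non-negative (the earthquake follows the test), that the distance is a great-circle distance in miles computed with the haversine formula, and that events are plain dictionaries. The function names and sample events are hypothetical.

```python
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; miles as the unit is an assumption


def distance(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points (haversine formula)."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


def date_difference(test_date, quake_date):
    """Days from the nuclear test to the earthquake (negative if the quake came first)."""
    return (quake_date - test_date).days


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """NuclearTest Causes Earthquake, using the thresholds given on the slide."""
    days = date_difference(test["eventDate"], quake["eventDate"])
    miles = distance(test["latitude"], test["longitude"],
                     quake["latitude"], quake["longitude"])
    return 0 <= days < max_days and miles < max_miles


if __name__ == "__main__":
    # Hypothetical sample events, for illustration only.
    test = {"eventDate": date(1970, 1, 1), "latitude": 37.0, "longitude": -116.0}
    quake = {"eventDate": date(1970, 1, 10), "latitude": 36.0, "longitude": -117.0}
    print(may_have_caused(test, quake))
```

Keeping the thresholds as parameters makes it easy to tighten the definition of "nearby" or "some time after" when exploring the data.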
Knowledge Discovery Knowledge Discovery --ExampleExample
When was the first recorded nuclear test conducted
1950Find the total number of earthquakes with a magnitude58 or higher on the Richter scale per year starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example...
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes
Number of earthquakes with magnitude > 7 is almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
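The grouping query above can be sketched as a small aggregation. This is a hedged illustration only: the input format (a list of `(year, magnitude)` pairs), the function names, and the choice of half-open bins (so that a magnitude of exactly 9.0 falls in the top bin) are assumptions, not details from the slides.

```python
from collections import Counter

# (label, inclusive lower bound, exclusive upper bound); boundaries are an assumption.
BINS = [
    ("5.8-6", 5.8, 6.0),
    ("6-7", 6.0, 7.0),
    ("7-8", 7.0, 8.0),
    ("8-9", 8.0, 9.0),
    (">9", 9.0, float("inf")),
]


def bin_label(magnitude):
    """Return the label of the magnitude range, or None if below 5.8."""
    for label, low, high in BINS:
        if low <= magnitude < high:
            return label
    return None


def average_per_year(quakes, first_year, last_year):
    """Average yearly earthquake count per magnitude range.

    quakes: iterable of (year, magnitude) pairs (hypothetical input format).
    """
    n_years = last_year - first_year + 1
    counts = Counter()
    for year, magnitude in quakes:
        if first_year <= year <= last_year:
            label = bin_label(magnitude)
            if label is not None:
                counts[label] += 1
    return {label: counts[label] / n_years for label, _, _ in BINS}
```

Running the same function over the years before and after the first recorded test would give the per-range comparison the slide draws its conclusion from.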
Knowledge Discovery - Example...
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted, and the test took place within a radius of 10,000 miles from the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NewsML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
Semantic Web: www.semanticweb.org
VoiceXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas (Eds.), McGraw-Hill, ISBN 0-07-057735-8, 1998
HP 24
PRISM Example
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:prism="http://prismstandard.org/1.0#"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">
    <dc:identifier rdf:resource="http://wanderlust.com/content/2357845" />
    <dc:description>Photograph taken at 6:00 am on Corfu with two models</dc:description>
    <dc:title>Walking on the Beach in Corfu</dc:title>
    <dc:creator>John Peterson</dc:creator>
    <dc:contributor>Sally Smith, lighting</dc:contributor>
    <dc:format>image/jpeg</dc:format>
  </rdf:Description>
</rdf:RDF>
(Source: PRISM spec v1, http://www.prismstandard.org/techdev/prismspec1.asp)
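To show how an application might consume a description like the one above, here is a minimal sketch that pulls the Dublin Core fields out of an RDF/XML record using Python's standard library. The function name `dublin_core` and the shortened sample record are illustrative assumptions; this is not part of PRISM or of any tool shown in this workshop, and a full RDF parser would be preferable for real data.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Shortened copy of the slide's example, with namespace punctuation restored.
RECORD = (
    '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" '
    'xmlns:dc="http://purl.org/dc/elements/1.1/">'
    '<rdf:Description rdf:about="http://wanderlust.com/2000/08/Corfu.jpg">'
    '<dc:title>Walking on the Beach in Corfu</dc:title>'
    '<dc:creator>John Peterson</dc:creator>'
    '<dc:format>image/jpeg</dc:format>'
    '</rdf:Description>'
    '</rdf:RDF>'
)


def dublin_core(xml_text):
    """Map each described resource URI to its Dublin Core {element: text} pairs."""
    root = ET.fromstring(xml_text)
    records = {}
    for desc in root.findall(f"{{{RDF}}}Description"):
        about = desc.get(f"{{{RDF}}}about")
        fields = {}
        for child in desc:
            # ElementTree spells qualified names as "{namespace}local".
            if child.tag.startswith(f"{{{DC}}}") and child.text:
                fields[child.tag[len(DC) + 2:]] = child.text.strip()
        records[about] = fields
    return records
```

Calling `dublin_core(RECORD)` yields one entry keyed by the image URL, with `title`, `creator`, and `format` as plain strings that a catalog or search index could store directly.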
HP 25
VoiceXML
A language for specifying voice dialogs
Voice dialogs use audio prompts and text-to-speech (TTS) for output; touch-tone keys (DTMF) and automatic speech recognition (ASR) for input
Goal is to bring the advantages of web-based development and content delivery to interactive voice response applications
High-level, voice-specific language simplifies application development
Source: http://www.voicexml.org
HP 26
Voice Based Internet Applications
Source: http://www.voicexml.org
HP 27
Voice XML Metadata
Voice Specific metadata
Supports syntactic interoperability
Text data to voice data
Voice XML = XML + Voice Metadata
HP 28
VoiceXML - Possible Services
Information retrieval - News, sports, traffic, stock quotes
e-Transactions (e-commerce, e-tailing, etc.)
Financial: banking, stock trading
Catalog browsing (generally as an adjunct to paper)
Telephone services
Personal voice dialing, One-number "find-me" services
Intranet - Inventory, HR services, corporate portals
Unification - "My Whatever" personal portals, personal agents, unified messaging
Source: http://www.voicexml.org
HP 29
MPEG7
A set of description schemes and descriptors to describe the content of multimedia data
Provides a language to specify description schemes
A scheme for coding the description
HP 30
Application Examples for MPEG7
A few application examples are
Digital libraries (image catalog musical dictionary)
Multimedia directory services (e.g., yellow pages)
Broadcast media selection (radio channel TV channel)
HP 31
Information and Content Exchange (ICE)
Main Goal: efficient and extensible Content Syndication protocol for the Internet, using XML syntax
Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.
Status: latest spec version 1.1, May 2000; submitted to W3C for review
Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, ...
Web Site: http://www.icestandard.org
HP 32
What is the ICE Protocol
Syndication Protocol for communication between Syndicators and Subscribers
Metadata to define:
roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver
format and method of content exchange (e.g., sequenced packages, pull vs. push model)
HP 33
ICE Applications
ICE vocabulary + domain vocabulary = complete application
ICE:
establishes and manages the syndication
delivers data
logs events
=> content-independent metadata
Industry-specific vocabulary defines the content => domain-specific metadata
Source: http://www.icestandard.org
HP 34
ICE Explained
ICE: Information and Content Exchange protocol
Syndicator: A content aggregator and distributor
Subscriber: A content consumer
Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement
Collection: The current content of a subscription
ICE Package: A delivery of commands to update a collection, such as the addition of content items
ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information
Sources: InternetWeek; ICE Cookbook version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm
<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe"
             timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321"
                subscription-element="4321" name="Cartoon" filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru" copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC ... (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>
Content (domain-specific metadata)
HP 36
XCM (eXtended Content Management)
A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:
Content Development - Developing static content and managing the process of its subsequent approval versioning storage and retrieval
Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle
Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability
Source: http://www.vignette.com/CDA/Site/0,2097,1-1-30-1458-1146-1743,00.html
HP 37
XCM
eXtended Content Management
Content DevelopmentManagement
Content DeliveryApplication ContentManagement
Content AuthoringDigital Asset Management
Software ConfigurationManagement
Document ProcessManagement
Metadata ManagementRecombinationPersonalization
Edge Network Delivery
Streaming Media DeliveryCaching
Source: http://www.vignette.com
HP 38
Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain
FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
...
UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Address Id: http://gisdasc.kgs.ukans.edu/dasc
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
...
Kansas State
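One simple way to bridge two models like these is a tag-to-tag mapping table. The sketch below is only an illustration of that idea under stated assumptions: the mapping dictionary is read off the five tag pairs shown on this slide, the function name `translate` is hypothetical, and real FGDC and UDK records have far more elements, with correspondences that are not always one-to-one.

```python
# Tag correspondences as they appear on the slide (FGDC tag -> UDK tag).
FGDC_TO_UDK = {
    "Theme keywords": "Search terms",
    "Title": "Topic",
    "Online linkage": "Address Id",
    "Direct Spatial Reference Method": "Measuring Techniques",
    "Horizontal Coordinate System Definition": "Co-ordinate System",
}
UDK_TO_FGDC = {udk: fgdc for fgdc, udk in FGDC_TO_UDK.items()}


def translate(record, mapping):
    """Rename the tags of a metadata record; unknown tags pass through unchanged."""
    return {mapping.get(tag, tag): value for tag, value in record.items()}


fgdc_record = {
    "Theme keywords": ["digital line graph", "hydrography", "transportation"],
    "Title": "Dakota Aquifer",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}

udk_record = translate(fgdc_record, FGDC_TO_UDK)
```

A flat renaming like this only gives syntactic agreement on tag names. Deciding that "Title" and "Topic" really mean the same thing is the semantic step that ontologies and domain models are meant to support.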
HP 39
Different views of Metadata
Domain Independent Specifications (RDF)
Frameworks/Infrastructures (XCM)
Metadata
Application Specific: ICE
Media Specific: MPEG7, VoiceXML
Domain Specific: NewsML, FGDC/UDK
HP 40
Creating and Serving Metadata to Power the Life-cycle of Content
Taalee Infrastructure Services Taalee Content Applications
Where is the content
Whose is it
ProduceAggregate
CatalogIndex
What other content is it related to
Integrate Syndicate
What is the right content for this user?
Personalize
What is the best way to monetize this interaction?
Interactive Marketing
BroadcastWirelineWirelessInteractive TV
Taalee Semantic MetaBase
HP 41
Taalee’s Intelligent Content Process
HP 42
Metadata Creation and Semanticization
• Automatic Content Classification/Categorization
• Metadata Creation/Extraction: types of metadata created
Semantic Engine and WorldModel are trademarks of Taalee Inc. Metadata Extraction is a patented technology of Taalee Inc.
HP 43
Forms/Types/Ingest of Content
Sources: Web Sites, Content Feeds, and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)
HP 44
Content Handling/Ingest
Infrastructure/Exchange
Feed Handlers
Crawlers/Screen Scrapers
Bots/Software Agents
Centralized, Distributed, Mobile/Migratory
HP 45
Information Extraction for Metadata Creation
GlobalEnterpriseWeb Repositories
METADATA EXTRACTORS
Digital Maps
Nexis, UPI, AP
Documents
Digital Audios
Data Stores
Digital Videos
Digital Images
HP 46
Extracting a Text Document: Syntactic approach
INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT
NATIONAL PREPAREDNESS LEVEL II
CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.
SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.
CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.
LAYOUT
Date => day month int ',' int
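The layout rule for the date line can be expressed as a regular expression. The following is a minimal sketch of the syntactic approach, assuming English weekday and month names and treating the commas as optional; the names `DATE_RULE` and `extract_date` are illustrative and do not come from any extractor described in this workshop.

```python
import re
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# Date => day month int ',' int   (commas optional in this sketch)
DATE_RULE = re.compile(
    r"\b(?P<day>" + "|".join(WEEKDAYS) + r"),?\s+"
    r"(?P<month>" + "|".join(MONTHS) + r")\s+"
    r"(?P<dom>\d{1,2})(?:\s*,\s*|\s+)"
    r"(?P<year>\d{4})\b"
)


def extract_date(text):
    """Return the first date matching the layout rule, or None if there is none."""
    match = DATE_RULE.search(text)
    if match is None:
        return None
    month_number = MONTHS.index(match.group("month")) + 1
    return date(int(match.group("year")), month_number, int(match.group("dom")))
```

A rule like this is cheap and precise for documents with a fixed layout, but it knows nothing about what the report says. That limitation is what the knowledge-based techniques on the following slides address.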
Traditional Text Categorization
Statistical/AI Techniques
Classify: Place in a taxonomy
feed
Customer Training
Set
RoutingDistribution
Customer Article Feed
4715
Standard Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Classification of Article 4715
Knowledge-base & Statistical/AI Techniques
Classify: Place in a taxonomy
MetadataCatalog
Content Manager
Precise syndicationfiltering
Article 4715 Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News
Standard metadata
Semantic metadata
FTECompany AnalysisConference Calls
EarningsStock Analysis
NYSEMember Companies
Market NewsIPOs
Automated Content Enrichment (ACE)
Taalee Enterprise Customization Suite
Taalee’s Categorization & Automatic Metadata Creation
Taalee Training Set
Customer Training Set
Classification of Article 4715
Article Feed 4715
Routing/Distribution
Map to another taxonomy
HP 49
Automatic Categorization & Metadata Tagging (unstructured text: transcript of A/V)
ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION AS OF TONIGHT THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49 IN WASHINGTON STATE THE SENATE RACE REMAINS TOO CLOSE TO CALL IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT THE SENATE WILL BE EVENLY DIVIDED IN MISSOURI REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN WHO DIED IN A CRASH THREE WEEKS AGO GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT HILLARY CLINTON WON THE NEW YORK SENATE SEAT SHE IS THE FIRST FIRST LADY TO RUN MUCH LESS WIN
Video Segment with Associated Text
Segment Description
Semantic Metadata
Auto Categorization
HP 50
Automatic Categorization & Metadata Tagging (Web page)
Video with Editorialized Text on the Web
Auto Categorization
Semantic Metadata
HP 51
Automatic Categorization & Metadata Tagging (Feed)
Text from Bloomberg
Auto Categorization
Semantic Metadata
HP 52
Taalee Extraction and Knowledgebase Enhancement
Extraction Agent
Web Page Enhanced Metadata Asset
HP 53
Basis for Semantics
A. Facts/Concepts/Terms/Entities
Dictionary, Thesaurus, Reference Data, Vocabulary
B. Facts with Relationships
Taxonomy (Categories), Ontology/Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)
Knowledge Base
HP 54
Basis for Semantics
C. Reasoning/Inference
(Statistical) (Information Retrieval)
Statistical Learning/AI (Bayesian, Neural Networks, HMM, ...)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methods: Cluster Analysis
Learning/AI and Collaborative Filtering
Reference data/Concept-terms/Dictionary/Thesaurus: by topic/industry/subject/domain
Word or Phrase
Ontologies/Domain Models
KnowledgeBase: by Entities and Relationships
deeper understanding
HP 56
Open Directory Project (ODP): Classification/Taxonomy & Directory
HP 57
Ontology
Standardize meaning, description, representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes:
Attributes
Domain Rules
Functional Dependencies
HP 59
An Ontology
Example Interrelated ontologies
LANDUSE
COMMERCIAL
INDUSTRIALRURAL
RESIDENTIAL
AGRICULTURAL
MILITARYRECREATIONAL
LAND(SITE)
CULTIVATEDAREA
GREENLANDAREA LAND
BANK
ZONING
LANDFILLSITE
WASTEDISPOSAL
RECYCLING
HAZARDOUS
LANDFILLRESOURCE REC
SOLID SEWAGE
shredding
magneticseparation
screening
washing
NATURALDISASTER
EARTHQUAKE
causes
LANDSLIDE
VOLCANO
STORMFLOOD
FIRE
AVALANCHE
TSUNAMI
causes
causes
causes
HP 61
Large Vocabularies, Taxonomies/Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary, used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms
Confidential HP
Metadata-enabled Applications
HP 63
Metadata Usage: Impact on Search & Query processing
traditional queries based on keywords
attribute-based queries
content-based queries
HP 64
Oingo.com
Oingo Ontology - ODP based(?); the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning used to state intent; the basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely-related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user “think” about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged, and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more “intelligent” through automatic analysis of every individual asset to generate a catalog containing:
• Context of the Content
• Semantic Metadata describing entities (e.g., Company, Industry, etc.) and
• Relationships (semantic associations) among all entities
Based on a “Semantic” or “domain” model describing how the user thinks about the subject matter, supported by a knowledgebase
“Normal” Content can only be “found” if the user enters a keyword that exists within it
Intelligent Content=+
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more “intelligent” through automatic analysis of every individual content item to create:
Context of the Content
Semantic Metadata describing entities (e.g., Company, Industry, etc.) and
Relationships (semantic associations) among all entities
Based on a “Semantic” or “domain” model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google’s Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, ...
HP 70
Metadata Usage Keyword Attribute and Content Based Access
The VisualHarness system at LSDIS/UGA
HP 71
Keyword Search vs Attribute Search with Semantic metadata
Virage Search on football touchdown
Jimmy Smith Interview Part SevenJimmy Smith explains his philosophy on showboating URL httpcbssportsline
Brian Griese Interview Part FourBrian Griese talks about the first touchdown he ever threw URL httpcbssportsline
Metadata from Typical Cataloging of Football
Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31 Pit 24
httpwwwnflcom
Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens’ lead to 31-24 in the third quarter
ProfessionalRavens SteelersBal 31 Pit 24Quandry Ismail Tony BanksTouchdownNFLcom2022000
LeagueTeamsScore
PlayersEvent
Produced byPosted date
HP 72
Taalee’s Semantic Search
Highly customizable, precise, and freshest A/V search
Context and Domain Specific Attributes
Uniform Metadata for Content from Multiple Sources; can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do?
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY amp CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content
Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries amp Hot Topics
PERSONALIZATION
3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
4 Pink Floyd Collection
Personalized Queries
Set the Controls for the Heart of the Sun...
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
2 My Football Fantasy Team
Gators Spurrier ready for big game
Techs Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zonehellip
Edwards explains reasons for leaving BYU morehellip
morehellip
morehellip
morehellip
1 Election 2000
2 Middle East Peace Conflict
3 Napster Controversy
Video Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Coles security
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II morehellip
HOT Topics
morehellip
morehellip
HP 80
Metadata Targeting
SemanticInteractive Targeting
Buy Al Pacino VideosBuy Russell Crowe VideosBuy Christopher Plummer VideosBuy Diane Venora VideosBuy Philip Baker Hall VideosBuy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
Realtime Feeds
Interests Preferences
Time-ShiftedContent Aggregator
Web sites and Pages
ContentDatabases Personalized
Content
Semantic EngineTM
Personalized Content
Content
Structured Hi-Quality
Semantic Metabase
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette’s Enterprise Customer site
User’s “Wireless Home page” shows the categories for his interests. There is an alert (new content) for his stock and sports categories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down user’s Personal Portfolio list. The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and the knowledge of user’s portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions listen set alert on story add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as “Strong Buy” by H&Q Analyst
HP 87
iTV: Taalee’s Extreme Personalization
Content Provider
(DBS DISH Wink AOL-TV)
Semantic EngineTM
Meta-DataTagged Content
ContentldquoProgramsrdquo
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-QualitySemantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by personalization application to show only the stocks that user is interested in
This screen is customizable with interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO
Part of the screen can be automatically customized to show conference call specific information, including transcript, participation, etc., all of which are relevant metadata
Conference Call itself can have embedded metadata to support personalization and interactivity
HP 89
Metadata in Enterprise Apps
Filter Search ConsolidatePersonalize ArchiveLicensing Syndication
Production SupportProduction SupportSony
Categorize
Catalog
Integrate
CollectionCollection ProcessingProcessing
NetworkContent
AffiliateFeeds
Public Sources Rich Data
Metabase
HP 90
t
A leaking gasoline pipeline burst into flames Thursday killing more than 60 people near Nigerias commercial capital of Lagos Many of the dead were fisherman in wooden canoes engulfed in the inferno
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko facing the central business district of Lagos across a lagoon
At least 60 people died in this needless fire senior local official Karimu Alabi said
Fire crews from state-run Nigerian National Petroleum Corp (NNPC) which owns the pipeline were joined by other firemen from construction company Julius Berger in battling the blaze
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak ravaging a cluster of huts and log houses
At about the same time a second fire razed Makoko shantytown where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in thewooden huts for days Many victims of the Ebute-Oke fire were
Gore Demands That Recount Restart (940 PM)Gore Says Fla Cant Name Electors (450 PM)Bush Meets Colin Powell at Ranch (122 PM)
Market Tumbles on Earnings Warning (927 AM)Barak Outlines His Peace Plan (630 AM)
-- Breaking News for 11302000 --Customize Page Settings | Content | Layout | Color
Sixty Die In Nigeria BlastProduced by Euronews Posted Date 11302000 Event Election 2000 Location Tallahassee Florida USAPeople Al Gore George W Bush
Video
• Value-add for production, broadcast & syndication
• Taalee’s semantic metadata enables powerful access to content used by Enterprise’s customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --Gore Demands That Recount Restart
Gore Says Fla Cant Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(133) ndash 120600 - ABC
(253) - 120600 - CBS
(516) - 120600 - ABC
(246) - 120600 - FOX
(133) - 120600 - NBC
(533) - 120600
(357) - 120600 - CBS
(427) - 120600 - ABC
(344) - 120600 - FOX
(724) - 120600 - CBS
(133) - 120600 - CBS
TALLAHASSEE Florida (CNN) ndashThough the two presidential candidates have until noon Wednesday to file briefs in Al Gores appeal to the Florida Supreme Court the outcome of two trials set on the same day in Leon County Florida may offer Gore his best hope for the presidency Democrats in Seminole County are seeking to have 15000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5000 votes statewide Lawyers for the plaintiff Harry Jacobs claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2126 incomplete absentee ballot applications sent in by GOP voters while refusing to allow Democratic workers to do the same thing for Democratic voters
The GOP says that suit and one similar to it from Martin County demonstrates Democratic Party politics at its most desperate Gore is not a party to either of those lawsuits On Tuesday the judge in the
(133) - 120600 - ABC
(233) - 120600 - CBS
(312) - 120600 - NNS
(032) - 120600 - CBS
(133) - 120600 - CBS
DescriptionProduced by CNNPosted Date 12072000 Reporter David Lewis Event Election 2000 Location Tallahassee Florida USAPeople Al Gore
HP 92
Retrieve Scene Description Track
Enhanced Digital Cable
Video
MPEGDecoder
Node = AVO Object
Create Scene Description Tree
GREATUSER
EXPERIENCE
Metadata’s role in emerging iTV infrastructure
MPEG-2/4/7
MPEG Encoder
SceneDescriptionTree
License metadata decoder and semantic applications to
device makers
Channel salesthrough Video Server Vendors
Video App Servers and Broadcasters
Enhanced XML
Description
ldquoCisco Systemsrdquo
Node
TaaleeSemanticEngine
ldquoCisco Systemsrdquo
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta
Object Content Information (OCI)
Metadata-richValue-added Node
HP 93
Intelligent Metadata Creation
Content which does contain the words the user asked for
Extractor Agents
Content which does not contain the words the user asked for, but is about what he asked for
Value-added Metadata
Content the user did not think to ask for, but which he needs to know
Semantic Associations
+ +
Metadata for Intelligent Content
Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the “right” keyword
For example: if a story is about “Roger Clemens” but does not contain the words “New York Yankees”, that story cannot and will not be found if the user searches for “New York Yankees” or “Yankees”
Understanding of the content is needed to create new metadata
Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
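The idea of value-added metadata can be sketched with a tiny lookup table. Everything below is a hypothetical illustration, not Taalee's patented extraction technology: the `KNOWLEDGE_BASE` dictionary, the field names, the function names, and the sample story title are assumptions chosen to mirror the Roger Clemens example.

```python
from copy import deepcopy

# Hypothetical knowledge base: entity -> facts the content itself may not state.
KNOWLEDGE_BASE = {
    "Roger Clemens": {"Sport": "Baseball", "Teams": "New York Yankees"},
}


def add_value_added_metadata(metadata, kb=KNOWLEDGE_BASE, entity_field="Players"):
    """Return a copy of the metadata enriched with facts about its known entities."""
    enriched = deepcopy(metadata)
    for entity in metadata.get(entity_field, []):
        for field, value in kb.get(entity, {}).items():
            values = enriched.setdefault(field, [])
            if value not in values:
                values.append(value)
    return enriched


def search(assets, field, value):
    """Attribute search: assets whose metadata field contains the value."""
    return [asset for asset in assets if value in asset.get(field, [])]


# Extracted metadata for a story that never mentions the team (made-up title).
story = {"Title": ["Clemens strikes out ten"], "Players": ["Roger Clemens"]}
enriched_story = add_value_added_metadata(story)
```

With only the extracted metadata, `search([story], "Teams", "New York Yankees")` finds nothing. After enrichment, the same query returns the story, which is the behavior the slide describes.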
HP 96
Guided Demo for Value Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled “Week 3 Top10: Anderson TD Run”) and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked “REAL”)
• View the source HTML page that has the original story and locate this story with the heading “Week 3 top 10: Anderson TD run”
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset’s existing metadata, so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player of Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled “I want out”) & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked “REAL”)
• View the source HTML page that has the original story and locate this story with the heading “I want out”
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset’s existing metadata, so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player of Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots (“Jamal Anderson”)
Search for ‘Jamal Anderson’ in ‘Football’
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee’s value-additions to the metadata to facilitate easier search
HP 99
Example 2 - Snapshots (“Gary Sheffield”)
Search for ‘Gary Sheffield’ in ‘Baseball’
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee’s value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships
The asset is richly and fully described in the many ways the users choose to interact
Posted Date: date of asset posting - extracted automatically
Producer Name: name of content provider that produced the asset
Player Names: names of players mentioned explicitly in the asset - extracted automatically
Team Name: name of team for which player plays - not mentioned explicitly in asset - value-added using Taalee’s semantic relationships
League Name: name of league to which the player’s team belongs - not mentioned explicitly in asset - value-added by Taalee’s processing based on semantic associations
Sport: name of sport
Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context, or relationships of keywords
For example, a search engine may see that the phrase "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
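The Commerce One example can be sketched as a handful of (subject, relation, object) triples plus a lookup. The `TRIPLES` list and the `related` function are hypothetical names chosen for illustration, and the relation labels are simply the ones named on this slide; this is not a description of Taalee's Semantic Content Model.

```python
# Hypothetical triples (subject, relation, object), taken from the slide's example.
TRIPLES = [
    ("Commerce One", "IS_A", "COMPANY"),
    ("Commerce One", "PARTICIPATES_IN", "Corporate, Professional & Financial Software"),
    ("Commerce One", "COMPETES_WITH", "Ariba"),
    ("Ariba", "IS_A", "COMPANY"),
]


def related(entity, relation, symmetric=False):
    """Return entities linked to `entity` by `relation`; optionally follow it both ways."""
    found = [o for s, r, o in TRIPLES if s == entity and r == relation]
    if symmetric:
        found += [s for s, r, o in TRIPLES if o == entity and r == relation]
    return found


print(related("Commerce One", "COMPETES_WITH"))
print(related("Ariba", "COMPETES_WITH", symmetric=True))
```

A keyword index sees only the string "Commerce One"; the triples let an application also surface news about Ariba when the user asks about Commerce One.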
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata-centric Content Management Architecture
Sources: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters)
Components: Extractor Agent 1, Extractor Agent 2, Extractor Agent 3; Semantic Engine; World Model; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication
Output: XCM-compliant metadata, XML or other format
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults Knowledge Base for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard
Story on Cisco; Story on Lucent
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
SEC, EPA Regulations: Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic rules which allow us to draw inferences from this data
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture" [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]
A personal definition
Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-layered Architecture of the Semantic Web
Logical Layer: Formal Semantics and Reasoning Support - OIL, DAML-O
Schema Layer: Definition of Vocabulary - RDF Schema
Data Layer: Simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL - Evolving towards Semantic Web
OIL Mission
OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Tests May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
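The "Causes" rule above can be sketched as a predicate over two event records. Several things here are assumptions rather than part of the slide: the units (days and miles, following the later slide that speaks of 30 days and 10000 miles), the use of the haversine great-circle formula for `distance`, the requirement that the earthquake not precede the test (taken from the prose, not the rule), and the sample records, which are made up purely to exercise the code.

```python
import math
from datetime import date


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles (haversine formula, mean Earth radius 3958.8 mi)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 3958.8 * math.asin(math.sqrt(a))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """Apply the rule: quake occurs less than max_days after the test, within max_miles."""
    days = (quake["eventDate"] - test["eventDate"]).days
    near = distance_miles(test["latitude"], test["longitude"],
                          quake["latitude"], quake["longitude"]) < max_miles
    return 0 <= days < max_days and near


# Made-up records, for illustration only.
test = {"eventDate": date(2000, 1, 1), "latitude": 0.0, "longitude": 0.0}
quake_soon = {"eventDate": date(2000, 1, 15), "latitude": 0.0, "longitude": 10.0}
quake_late = {"eventDate": date(2000, 3, 1), "latitude": 0.0, "longitude": 10.0}

print(may_have_caused(test, quake_soon))
print(may_have_caused(test, quake_late))
```

Such a predicate only proposes candidate pairs; whether the relationship is real is exactly what the knowledge-discovery queries are meant to probe.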
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example…
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and > 9 on the Richter scale per year starting from 1900, find average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
Knowledge Discovery - Example…
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NewsML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
Semantic Web: www.semanticweb.org
VoiceXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 25
VoiceXML
A language for specifying voice dialogs
Voice dialogs use audio prompts and text-to-speech (TTS) for output; touch-tone keys (DTMF) and automatic speech recognition (ASR) for input
Goal is to bring the advantages of web-based development and content delivery to interactive voice response applications
High-level, voice-specific language simplifies application development
Source: http://www.voicexml.org
HP 26
Voice Based Internet Applications
Source: http://www.voicexml.org
HP 27
VoiceXML Metadata
Voice-specific metadata
Supports syntactic interoperability
Text data to voice data
VoiceXML = XML + Voice Metadata
HP 28
VoiceXML - Possible Services
Information retrieval - News, sports, traffic, stock quotes
e-Transactions (e-commerce, e-tailing, etc.)
Financial: banking, stock trading
Catalog browsing (generally as an adjunct to paper)
Telephone services
Personal voice dialing, One-number find-me services
Intranet - Inventory, HR services, corporate portals
Unification - My Whatever: personal portals, personal agents, unified messaging
Source: http://www.voicexml.org
HP 29
MPEG7
A set of description schemes and descriptors to describe the content of multimedia data
Provides a language to specify description schemes
A scheme for coding the description
HP 30
Application Examples for MPEG7
A few application examples are
Digital libraries (image catalog musical dictionary)
Multimedia directory services (eg yellow pages)
Broadcast media selection (radio channel TV channel)
HP 31
Information and Content Exchange (ICE)
Main Goal: efficient and extensible Content Syndication protocol for the Internet using XML syntax
Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.
Status: latest spec version 1.1, May 2000, submitted to W3C for review
Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, …
Web Site: http://www.icestandard.org
HP 32
What is the ICE Protocol?
Syndication Protocol for communication between Syndicators and Subscribers
Metadata to define:
roles and responsibilities of involved parties (Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver)
format and method of content exchange (e.g., sequenced packages, pull vs. push model)
HP 33
ICE Applications
ICE vocabulary + domain vocabulary = complete application
ICE establishes and manages the syndication, delivers data, and logs events => content-independent metadata
Industry-specific vocabulary defines the content => domain-specific metadata
Source: http://www.icestandard.org
HP 34
ICE Explained
ICE: Information and Content Exchange protocol
Syndicator: A content aggregator and distributor
Subscriber: A content consumer
Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement
Collection: The current content of a subscription
ICE Package: A delivery of commands to update a collection, such as the addition of content items
ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information
Sources: InternetWeek; ICE Cookbook version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm
<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "httpicedtd">
<ice-payload payload-id="ipl-80a56cfe" timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon" filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru" copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC … (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>
Content: the comic-strip element (domain-specific metadata)
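To make the split between ICE's content-independent envelope and the domain-specific content concrete, the following sketch parses a trimmed copy of the payload with Python's standard `xml.etree.ElementTree`. The trimming (no XML declaration or DOCTYPE, image data replaced by a placeholder) and the helper name `split_metadata` are assumptions made for this illustration; it is not an ICE client.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the slide's payload (declaration, DOCTYPE and image data omitted).
PAYLOAD = """
<ice-payload payload-id="ipl-80a56cfe" timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon"
                filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru"
                     copyright="Taalee Makeups" pubdate="20010515">
          (ASCII-encoded image data omitted)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>
"""


def split_metadata(payload_xml):
    """Return (envelope, content) pairs: ICE item attributes vs. domain elements."""
    root = ET.fromstring(payload_xml.strip())
    results = []
    for item in root.iter("ice-item"):
        envelope = dict(item.attrib)  # content-independent metadata
        content = {child.tag: dict(child.attrib) for child in item}  # domain-specific
        results.append((envelope, content))
    return results


items = split_metadata(PAYLOAD)
for envelope, content in items:
    print("ICE item:", envelope["name"], envelope["content-type"])
    print("Domain content:", content)
```

The `ice-item` attributes are understood by any ICE implementation, while the `comic-strip` element only makes sense to applications that share the domain vocabulary.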
HP 36
XCM (eXtended Content Management)
a framework that allows customers to classify content management offerings according to the business problems they address The segments of XCM are
Content Development - Developing static content and managing the process of its subsequent approval versioning storage and retrieval
Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle
Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability
Source: http://www.vignette.com/CDA/Site/0,2097,1-1-30-1458-1146-1743,00.html
HP 37
XCM
eXtended Content Management
Content Development/Management
Content Delivery; Application Content Management
Content Authoring; Digital Asset Management
Software Configuration Management
Document Process Management
Metadata Management; Recombination; Personalization
Edge Network Delivery
Streaming Media Delivery; Caching
Source: http://www.vignette.com
HP 38
Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain (Kansas State)
FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc/
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
…
UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Adress Id: http://gisdasc.kgs.ukans.edu/dasc/
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
…
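One way to bridge the two models is a simple tag crosswalk. The sketch below uses only the five tag pairs shown on this slide; the dictionary name `FGDC_TO_UDK` and the function `fgdc_to_udk` are hypothetical, and a real mapping between the standards would need to cover far more elements and handle differences in structure, not just names.

```python
# Tag pairs taken from the slide; the crosswalk itself is an illustrative assumption.
FGDC_TO_UDK = {
    "Theme keywords": "Search terms",
    "Title": "Topic",
    "Online linkage": "Adress Id",
    "Direct Spatial Reference Method": "Measuring Techniques",
    "Horizontal Coordinate System Definition": "Co-ordinate System",
}


def fgdc_to_udk(record):
    """Rename FGDC tags to their UDK counterparts; unknown tags pass through unchanged."""
    return {FGDC_TO_UDK.get(tag, tag): value for tag, value in record.items()}


fgdc_record = {
    "Title": "Dakota Aquifer",
    "Theme keywords": ["digital line graph", "hydrography", "transportation"],
    "Direct Spatial Reference Method": "Vector",
}
udk_record = fgdc_to_udk(fgdc_record)
print(udk_record)
```

A purely syntactic crosswalk like this works only when both tags mean the same thing, which is why the talk argues for shared semantics on top of tag names.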
HP 39
Different views of Metadata
Domain Independent Specifications (RDF)
Frameworks/Infrastructures (XCM)
Metadata
Application Specific: ICE
Media Specific: MPEG7, VoiceXML
Domain Specific: NewsML, FGDC/UDK
HP 40
Creating and Serving Metadata to Power the Life-cycle of Content
Taalee Infrastructure Services; Taalee Content Applications
Where is the content? Whose is it? (Produce, Aggregate; Catalog, Index)
What other content is it related to? (Integrate, Syndicate)
What is the right content for this user? (Personalize)
What is the best way to monetize this interaction? (Interactive Marketing)
Broadcast, Wireline, Wireless, Interactive TV
Taalee Semantic MetaBase
HP 41
Taalee's Intelligent Content Process
HP 42
Metadata Creation and Semanticization
• Automatic Content Classification/Categorization
• Metadata Creation/Extraction
Types of metadata created
Semantic Engine and WorldModel are trademarks of Taalee Inc. Metadata Extraction is a patented technology of Taalee Inc.
HP 43
Forms/Types/Ingest of Content
Sources: Web Sites, Content Feeds, and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)
HP 44
Content Handling/Ingest
Infrastructure, Exchange, Feed Handlers, Crawlers, Screen Scrapers, Bots, Software Agents
Centralized, Distributed, Mobile/Migratory
HP 45
Information Extraction for Metadata Creation
Global/Enterprise/Web Repositories
METADATA EXTRACTORS
Digital Maps
Nexis, UPI, AP
Documents
Digital Audios
Data Stores
Digital Videos
Digital Images
HP 46
Extracting a Text Document: Syntactic approach
INCIDENT MANAGEMENT SITUATION REPORT
Friday August 1, 1997 - 0530 MDT
NATIONAL PREPAREDNESS LEVEL II
CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.
SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr. The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.
CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.
LAYOUT
Date => day month int ',' int
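The layout rule `Date => day month int ',' int` can be approximated with a regular expression. This is a minimal sketch of a syntactic extractor, assuming English day and month names and a report header like the one on this slide; `DATE_RE` and `extract_date` are hypothetical names, not part of any system described in the talk.

```python
import re
from datetime import datetime

# day month int ',' int, e.g. "Friday August 1, 1997"
DATE_RE = re.compile(
    r"(?P<day>Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday),?\s+"
    r"(?P<month>January|February|March|April|May|June|July|August|"
    r"September|October|November|December)\s+"
    r"(?P<dom>\d{1,2}),\s*(?P<year>\d{4})"
)


def extract_date(text):
    """Return the first date matching the layout rule, or None if there is none."""
    m = DATE_RE.search(text)
    if m is None:
        return None
    stamp = f"{m.group('month')} {m.group('dom')} {m.group('year')}"
    return datetime.strptime(stamp, "%B %d %Y").date()


header = "INCIDENT MANAGEMENT SITUATION REPORT Friday August 1, 1997 - 0530 MDT"
report_date = extract_date(header)
print(report_date)
```

A rule like this captures the form of the date but knows nothing about what the report means, which is the limitation of the purely syntactic approach.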
Traditional Text Categorization
Customer Article Feed (Article 4715) -> Statistical/AI Techniques (using Customer Training Set) -> Classify/Place in a taxonomy -> Routing/Distribution
Standard Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Classification of Article 4715
Taalee's Categorization & Automatic Metadata Creation
Automated Content Enrichment (ACE)
Taalee Enterprise Customization Suite
Article Feed (Article 4715) -> Knowledge-base & Statistical/AI Techniques (using Taalee Training Set and Customer Training Set) -> Classify/Place in a taxonomy -> Metadata Catalog, Content Manager -> Precise syndication/filtering, Routing/Distribution
Classification of Article 4715
FTE: Company Analysis; Conference Calls; Earnings; Stock Analysis
ENT: Company Analysis; Conference Calls; Earnings; Stock Analysis
NYSE: Member Companies; Market News; IPOs
Article 4715 Metadata
Standard metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Semantic metadata
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News
Map to another taxonomy
HP 49
Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.
Video Segment with Associated Text
Segment Description
Semantic Metadata
Auto Categorization
HP 50
Automatic Categorization & Metadata Tagging (Web page)
Video with Editorialized Text on the Web
Auto Categorization
Semantic Metadata
HP 51
Automatic Categorization & Metadata Tagging (Feed)
Text from Bloomberg
Auto Categorization
Semantic Metadata
HP 52
Taalee Extraction and Knowledgebase Enhancement
Extraction Agent
Web Page Enhanced Metadata Asset
HP 53
Basis for Semantics
A. Facts/Concepts/Terms/Entities
Dictionary, Thesaurus, Reference Data, Vocabulary
B. Facts with Relationships
Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)
Knowledge Base
HP 54
Basis for Semantics
C. Reasoning/Inference
(Statistical) (Information Retrieval)
Statistical Learning/AI (Bayesian, Neural Networks, HMM, …)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methods: Cluster Analysis; Learning/AI and Collab. Filtering
Reference data: Concept-terms, Dictionary, Thesaurus; by topic/industry/subject/domain
Word or Phrase
Ontologies/Domain Models
KnowledgeBase: by Entities and Relationships
(deeper understanding)
HP 56
Open Directory Project (ODP): Classification/Taxonomy & Directory
HP 57
Ontology
Standardize meaning, description, and representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes:
Attributes
Domain Rules
Functional Dependencies
HP 59
An Ontology
Example: Interrelated ontologies
LAND USE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL
LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE
WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; shredding, magnetic separation, screening, washing
NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI
Relationship shown between concepts: causes
HP 61
Large Vocabularies, Taxonomies, Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms
Confidential HP
Metadata-enabled Applications
HP 63
Metadata Usage: Impact on Search & Query processing
traditional queries based on keywords
attribute-based queries
content-based queries
HP 64
Oingo.com
Oingo Ontology - ODP based (?): the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning, used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely-related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged, and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:
• Context of the Content
• Semantic Metadata describing entities (i.e., Company, Industry, etc.) and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
"Normal" Content can only be "found" if the user enters a keyword that exists within it
Intelligent Content=+
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:
Context of the Content
Semantic Metadata describing entities (i.e., Company, Industry, etc.) and
Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata amp Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …
HP 70
Metadata Usage Keyword Attribute and Content Based Access
The VisualHarness system at LSDISUGA
HP 71
Keyword Search vs Attribute Search with Semantic Metadata
Virage: Search on "football touchdown"
Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL httpcbssportsline
Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL httpcbssportsline
Metadata from Typical Cataloging of Football Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31, Pit 24
http://www.nfl.com
Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Qadry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2022000
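The contrast on this slide can be sketched by running a keyword search and an attribute search over the same two assets. The record layout, the function names, and the `Event: Interview` value on the second asset are assumptions introduced for illustration; only the Ravens/Steelers metadata comes from the slide.

```python
ASSETS = [
    {
        "title": "Baltimore 31 Pit 24",
        "text": "Ismail and Banks hook up for their third long touchdown",
        "metadata": {"League": "Professional", "Teams": ["Ravens", "Steelers"],
                     "Players": ["Qadry Ismail", "Tony Banks"], "Event": "Touchdown"},
    },
    {
        "title": "Brian Griese Interview Part Four",
        "text": "Brian Griese talks about the first touchdown he ever threw",
        # Hypothetical metadata for the interview clip.
        "metadata": {"Players": ["Brian Griese"], "Event": "Interview"},
    },
]


def keyword_search(assets, *words):
    """Match assets whose title or text contains every keyword (case-insensitive)."""
    words = [w.lower() for w in words]
    return [a["title"] for a in assets
            if all(w in (a["title"] + " " + a["text"]).lower() for w in words)]


def attribute_search(assets, **criteria):
    """Match assets whose metadata fields equal, or contain, the requested values."""
    def matches(meta):
        for field, wanted in criteria.items():
            value = meta.get(field)
            values = value if isinstance(value, list) else [value]
            if wanted not in values:
                return False
        return True
    return [a["title"] for a in assets if matches(a["metadata"])]


print(keyword_search(ASSETS, "touchdown"))
print(attribute_search(ASSETS, Event="Touchdown"))
```

The keyword query returns the interview as well as the game highlight, while the attribute query on `Event` returns only the actual touchdown.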
HP 72
Taalee's Semantic Search
Highly customizable, precise, and freshest A/V search
Context and Domain Specific Attributes. Uniform Metadata for Content from Multiple Sources. Can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY amp CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content
Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
PERSONALIZATION
Personalized Queries & Hot Topics
Personalized Queries
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
more…
2 My Football Fantasy Team
Gators' Spurrier ready for big game
Tech's Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone…
Edwards explains reasons for leaving BYU
more…
3 Julia Roberts Collection
Movie Trailer: Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer: Stepmom
Conspiracy Theory
more…
4 Pink Floyd Collection
Set the Controls for the Heart of the Sun…
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
more…
HOT Topics
1 Election 2000
Video: Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
more…
2 Middle East Peace Conflict
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Cole's security
more…
3 Napster Controversy
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II
more…
HP 80
Metadata Targeting
Semantic/Interactive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web: Extreme Personalization
Realtime Feeds; Web sites and Pages; Content Databases
Time-Shifted Content Aggregator
Semantic Engine(TM)
Structured Hi-Quality Semantic Metabase
Interests, Preferences
Personalized Content
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site
User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and the knowledge of user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic, personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions listen set alert on story add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst
HP 87
iTV: Taalee's Extreme Personalization
Content Provider (DBS, DISH, Wink, AOL-TV)
Semantic Engine(TM)
Meta-Data Tagged Content
Content "Programs"
Immediate Interests, Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-Quality Semantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in
This screen is customizable with interactivity features using metadata, such as whether there is a new Conference Call video on CSCO
Part of the screen can be automatically customized to show conference call specific information, including transcript, participation, etc., all of which are relevant metadata
Conference Call itself can have embedded metadata to support personalization and interactivity
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support
Sony
Categorize
Catalog
Integrate
Collection, Processing
Network Content
Affiliate Feeds
Public Sources
Rich Data Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were
-- Breaking News for 11/30/2000 --
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
Customize Page: Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --Gore Demands That Recount Restart
Gore Says Fla Cant Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore
HP 92
Metadata's role in emerging iTV infrastructure
Video
MPEG Encoder (MPEG-2/4/7)
Create Scene Description Tree
Scene Description Tree
Enhanced Digital Cable
MPEG Decoder
Retrieve Scene Description Track
Node = AVO Object
GREAT USER EXPERIENCE
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers and Broadcasters
Enhanced XML Description
Node: "Cisco Systems"
Taalee Semantic Engine
"Cisco Systems"
Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: John Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta
Object Content Information (OCI)
Metadata-rich Value-added Node
HP 93
Intelligent Metadata Creation
Content which does contain the words the user asked for
Extractor Agents
Content which does not contain the words the user asked for, but is about what he asked for
Value-added Metadata
Content the user did not think to ask for, but which he needs to know
Semantic Associations
+ +
Metadata for Intelligent Content
Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".
Understanding of the content is needed to create new metadata.
Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses Semantic Associations of this kind (another example: COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely.
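The value-adding step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the knowledge base contents, the relation names (`plays_for`, `plays_sport`, `member_of`) and the functions `enrich` and `matches` are hypothetical and are not Taalee's actual engine or data model.

```python
# Hypothetical miniature knowledge base: (entity, relation) -> related entity.
KNOWLEDGE_BASE = {
    ("Roger Clemens", "plays_sport"): "Baseball",
    ("Roger Clemens", "plays_for"): "New York Yankees",
    ("Jamal Anderson", "plays_for"): "Atlanta Falcons",
    ("Atlanta Falcons", "member_of"): "NFL",
}

# Which metadata field each relation fills in (illustrative field names).
RELATION_TO_FIELD = {"plays_for": "Teams", "plays_sport": "Sport", "member_of": "League"}


def enrich(metadata):
    """Return a copy of the extracted metadata with value-added fields."""
    enriched = {field: list(values) for field, values in metadata.items()}
    # Players can imply teams and sports; teams can then imply leagues.
    for source_field in ("Players", "Teams"):
        for entity in list(enriched.get(source_field, [])):
            for relation, target_field in RELATION_TO_FIELD.items():
                value = KNOWLEDGE_BASE.get((entity, relation))
                if value is not None and value not in enriched.setdefault(target_field, []):
                    enriched[target_field].append(value)
    return enriched


def matches(metadata, query):
    """Plain keyword match over all metadata values."""
    needle = query.lower()
    return any(needle in value.lower() for values in metadata.values() for value in values)


extracted = {"Players": ["Roger Clemens"]}   # what the story itself mentions
enriched = enrich(extracted)                 # what the catalog stores
```

With only the extracted metadata a search for "Yankees" misses the story; with the enriched metadata the same keyword search finds it, which is the point the slide makes.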
HP 96
Guided Demo for Value Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL
Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team = Atlanta Falcons or League = NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League
Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team = Los Angeles Dodgers or League = National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 - Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Producer Name: name of content provider that produced the asset
Posted Date: date of asset posting - extracted automatically
Player Names: names of players mentioned explicitly in the asset - extracted automatically
Team Name: name of team for which player plays - not mentioned explicitly in asset - value-added using Taalee's semantic relationships
League Name: name of league to which the player's team belongs - not mentioned explicitly in asset - value-added by Taalee's processing based on semantic associations
Sport: name of sport
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.
The asset is richly and fully described in the many ways the users choose to interact.
Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context, or relationships of keywords
For example, a search engine may see that the phrase "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
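A rough Python sketch of how a COMPETES WITH association could turn a plain company search into "what you asked for plus what is related". The story list, the `COMPETES_WITH` table and both function names are hypothetical, chosen only to mirror the Commerce One / Ariba example; they do not describe Taalee's Semantic Content Model.

```python
# Hypothetical association table: company -> set of companies it competes with.
COMPETES_WITH = {
    "Commerce One": {"Ariba"},
    "Cisco": {"Lucent"},
}

# Hypothetical catalog of assets, each already tagged with a company entity.
STORIES = [
    {"title": "Story on Commerce One", "company": "Commerce One"},
    {"title": "Story on Ariba", "company": "Ariba"},
    {"title": "Story on Lucent", "company": "Lucent"},
]


def competitors(company):
    """Competitors of a company, treating the relation as symmetric (an assumption)."""
    found = set(COMPETES_WITH.get(company, ()))
    found.update(other for other, rivals in COMPETES_WITH.items() if company in rivals)
    return found


def search(company):
    """Return stories on the company plus stories on its competitors."""
    rivals = competitors(company)
    return {
        "asked_for": [s["title"] for s in STORIES if s["company"] == company],
        "semantically_related": [s["title"] for s in STORIES if s["company"] in rivals],
    }


result = search("Commerce One")
```

A keyword index alone would return only the first list; the second list exists only because the relationship between the two companies is stored explicitly.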
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Internal Source 1, Research
Internal Source 2
External feeds/Web (e.g., Reuters)
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults Knowledge Base for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard
Story on Lucent
Story on Cisco
XCM-compliant metadata, XML or other format
Semantic Application
ASP/Enterprise hosted
Extractor Agent 1
Extractor Agent 2
Extractor Agent 3
Metadata-centric Content Management Architecture
Semantic Engine
World Model
Taalee Metabase
Third-party Content Mgmt and Syndication
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
SEC, EPA
Regulations: Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic collation of semantically related digital media information from multiple sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search/Personalization etc.
A vision for the future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture" [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]
A personal definition: Semantic Web - the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-layered Architecture of Semantic Web
Logical Layer
Formal Semantics and Reasoning Support - OIL, DAML-O
Schema Layer
Definition of Vocabulary - RDF Schema
Data Layer
Simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL - Evolving towards Semantic Web
OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Test May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake
<= dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30
AND distance(NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude) < 10000
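The rule above can be read as a predicate over a (nuclear test, earthquake) pair. The Python sketch below is one possible reading, not the InfoQuilt implementation: it assumes the date difference is in days, that the earthquake must not precede the test, and that distance is great-circle distance in miles (computed here with the haversine formula). The record layout and function names are hypothetical.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, used by the haversine formula


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """Rule: the quake follows the test by fewer than max_days and lies within max_miles."""
    days_after = (quake["date"] - test["date"]).days
    close_in_time = 0 <= days_after < max_days
    close_in_space = distance_miles(
        test["lat"], test["lon"], quake["lat"], quake["lon"]
    ) < max_miles
    return close_in_time and close_in_space


# Made-up records, purely to exercise the rule.
nuclear_test = {"date": date(1970, 1, 1), "lat": 37.0, "lon": -116.0}
quake_soon = {"date": date(1970, 1, 15), "lat": 36.0, "lon": -117.0}
quake_late = {"date": date(1970, 2, 10), "lat": 36.0, "lon": -117.0}
quake_before = {"date": date(1969, 12, 25), "lat": 36.0, "lon": -117.0}
```

Keeping the thresholds as parameters makes it easy to tighten the notion of "nearby" or "soon after" when exploring the data.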
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale, per year, starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example...
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
Knowledge Discovery - Example...
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake
Demo
ResourcesReferences
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NewsML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
Semantic Web: www.semanticweb.org
VoiceXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 26
Voice Based Internet Applications
Source: http://www.voicexml.org
HP 27
Voice XML Metadata
Voice Specific metadata
Supports syntactic interoperability
Text data to voice data
Voice XML = XML + Voice Metadata
HP 28
VoiceXML - Possible Services
Information retrieval - news, sports, traffic, stock quotes
e-Transactions (e-commerce, e-tailing, etc.)
Financial: banking, stock trading
Catalog browsing (generally as an adjunct to paper)
Telephone services
Personal voice dialing, one-number find-me services
Intranet - inventory, HR services, corporate portals
Unification - "My Whatever" personal portals, personal agents, unified messaging
Source: http://www.voicexml.org
HP 29
MPEG7
A set of description schemes and descriptors to describe the content of multimedia data
Provides a language to specify description schemes
A scheme for coding the description
HP 30
Application Examples for MPEG7
A few application examples are
Digital libraries (image catalog musical dictionary)
Multimedia directory services (e.g., yellow pages)
Broadcast media selection (radio channel TV channel)
HP 31
Information and Content Exchange (ICE)
Main Goal: efficient and extensible Content Syndication protocol for the Internet, using XML syntax
Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.
Status: latest spec version 1.1, May 2000; submitted to W3C for review
Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, ...
Web Site: http://www.icestandard.org
HP 32
What is the ICE Protocol
Syndication protocol for communication between Syndicators and Subscribers
Metadata to define:
roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver
format and method of content exchange (e.g., sequenced packages, pull vs. push model)
HP 33
ICE Applications
ICE vocabulary + domain vocabulary = complete application
ICE establishes and manages the syndication, delivers data, and logs events => content-independent metadata
Industry-specific vocabulary defines the content => domain-specific metadata
Source: http://www.icestandard.org
HP 34
ICE Explained
ICE: Information and Content Exchange protocol
Syndicator: a content aggregator and distributor
Subscriber: a content consumer
Subscription: an agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement
Collection: the current content of a subscription
ICE Package: a delivery of commands to update a collection, such as the addition of content items
ICE Payload: the XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information
Sources: InternetWeek; ICE Cookbook version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm
<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "httpicedtd">
<ice-payload payload-id="ipl-80a56cfe" timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon" filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru" copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC … (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>
Content (domain-specific metadata)
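To make the split between ICE's content-independent envelope and the domain-specific content concrete, here is a small Python sketch that parses a payload shaped like the one on this slide. The payload string is a simplified, hypothetical stand-in (attribute names such as `ice.version` are assumed from the slide's shape, and the DOCTYPE and image data are omitted), and `split_metadata` is an illustrative helper, not part of any ICE toolkit.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical payload modelled on the slide's example.
PAYLOAD = """
<ice-payload payload-id="ipl-80a56cfe" timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon"
                filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" pubdate="20010515">image data omitted</comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>
"""


def split_metadata(payload_xml):
    """For each ice-item, separate ICE envelope attributes from domain content."""
    root = ET.fromstring(payload_xml.strip())
    items = []
    for item in root.iter("ice-item"):
        items.append({
            # Attributes ICE itself defines: content-independent metadata.
            "envelope": dict(item.attrib),
            # Child elements come from the domain vocabulary: domain-specific metadata.
            "domain": [{"tag": child.tag, **child.attrib} for child in item],
        })
    return items


items = split_metadata(PAYLOAD)
```

The syndication software only needs the `envelope` part to route and log the item; interpreting the `domain` part requires knowing the industry vocabulary (here an invented `comic-strip` element).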
HP 36
XCM (eXtended Content Management)
A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:
Content Development - developing static content and managing the process of its subsequent approval, versioning, storage, and retrieval
Application Content Management (Vignette) - deploying content dynamically to a Web site and managing that content throughout its online lifecycle
Content Delivery - delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability
Source: http://www.vignette.com/CDA/Site/0,2097,1-1-30-1458-1146-1743,00.html
HP 37
XCM
eXtended Content Management
Content DevelopmentManagement
Content DeliveryApplication ContentManagement
Content Authoring
Digital Asset Management
Software Configuration Management
Document Process Management
Metadata Management
Recombination
Personalization
Edge Network Delivery
Streaming Media Delivery
Caching
Source: http://www.vignette.com
HP 38
Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain
FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
...
UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Adress Id: http://gisdasc.kgs.ukans.edu/dasc
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
...
Kansas State
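One common way to cope with such heterogeneous tag names is a crosswalk that maps each model's tags onto a shared vocabulary. The Python sketch below uses the tag names shown on this slide; the common field names (`keywords`, `title`, and so on) and the `normalize` helper are hypothetical choices made for illustration, not part of FGDC or UDK.

```python
# Crosswalk from each model's tag names (as on the slide) to common field names.
CROSSWALK = {
    "FGDC": {
        "Theme keywords": "keywords",
        "Title": "title",
        "Online linkage": "url",
        "Direct Spatial Reference Method": "spatial_method",
        "Horizontal Coordinate System Definition": "coordinate_system",
    },
    "UDK": {
        "Search terms": "keywords",
        "Topic": "title",
        "Adress Id": "url",
        "Measuring Techniques": "spatial_method",
        "Co-ordinate System": "coordinate_system",
    },
}


def normalize(model, record):
    """Rename a record's tags to the common vocabulary, dropping unmapped tags."""
    mapping = CROSSWALK[model]
    return {mapping[tag]: value for tag, value in record.items() if tag in mapping}


fgdc_record = {
    "Title": "Dakota Aquifer",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}
udk_record = {
    "Topic": "Dakota Aquifer",
    "Measuring Techniques": "Vector",
    "Co-ordinate System": "Universal Transverse Mercator",
}
```

After normalization the two records compare equal, so a single query against the common fields can reach data described under either model.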
HP 39
Different views of Metadata
Domain Independent Specifications (RDF)
Frameworks/Infrastructures (XCM)
Metadata
Application Specific
ICE
Media Specific
MPEG7, VoiceXML
Domain Specific
NewsML, FGDC/UDK
HP 40
Creating and Serving Metadata to Power the Life-cycle of Content
Taalee Infrastructure Services, Taalee Content Applications
Where is the content?
Whose is it?
Produce/Aggregate
Catalog/Index
What other content is it related to?
Integrate/Syndicate
What is the right content for this user?
Personalize
What is the best way to monetize this interaction?
Interactive Marketing
Broadcast, Wireline, Wireless, Interactive TV
Taalee Semantic MetaBase
HP 41
Taalee's Intelligent Content Process
HP 42
Metadata Creation and Semanticization
• Automatic Content Classification/Categorization
• Metadata Creation/Extraction: types of metadata created
HP 43
Forms/Types/Ingest of Content
Sources: Web Sites, Content Feeds, and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)
HP 44
Content Handling/Ingest
Infrastructure/Exchange: Feed Handlers, Crawlers, Screen Scrapers, Bots, Software Agents
Centralized, Distributed, Mobile/Migratory
HP 45
Information Extraction for Metadata Creation
Global/Enterprise/Web Repositories
METADATA
EXTRACTORS
Digital Maps
Nexis, UPI, AP
Documents
Digital Audios
Data Stores
Digital Videos
Digital Images
HP 46
Extracting a Text Document: Syntactic approach
INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT
NATIONAL PREPAREDNESS LEVEL II
CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.
SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr. The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.
CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.
LAYOUT
Date => day month int ',' int
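The LAYOUT rule above (a date written as day name, month name, integer, separator, integer) is the kind of syntactic pattern a regular expression captures well. A minimal Python sketch follows; the pattern, the group names and `extract_date` are assumptions made for illustration, and commas are treated as optional because the report header appears both with and without them.

```python
import re

DAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"
MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)

# day month int ',' int  ->  e.g. "Friday, August 1, 1997"
DATE_PATTERN = re.compile(
    rf"\b(?P<day>{DAYS}),?\s+(?P<month>{MONTHS})\s+(?P<date>\d{{1,2}}),?\s+(?P<year>\d{{4}})\b"
)


def extract_date(text):
    """Return the first date matching the layout as a dict, or None if absent."""
    match = DATE_PATTERN.search(text)
    return match.groupdict() if match else None


header = "INCIDENT MANAGEMENT SITUATION REPORT Friday, August 1, 1997 - 0530 MDT"
found = extract_date(header)
```

A purely syntactic extractor like this recognizes that a date is present, but it has no notion of what the date refers to; that is the gap the semantic approaches in the following slides address.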
Traditional Text Categorization
Statistical/AI Techniques
Classify: place in a taxonomy
Customer Training Set
Routing/Distribution
Customer Article Feed: Article 4715
Standard Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Classification of Article 4715
Knowledge-base & Statistical/AI Techniques
Classify: place in a taxonomy
Metadata Catalog
Content Manager
Precise syndication/filtering
Article 4715 Metadata
Standard metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Semantic metadata
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News
FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
NYSE: Member Companies, Market News, IPOs
Automated Content Enrichment (ACE)
Taalee Enterprise Customization Suite
Taalee's Categorization & Automatic Metadata Creation
Taalee Training Set
Customer Training Set
ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
Classification of Article 4715
Article Feed: Article 4715
Routing/Distribution
Map to another taxonomy
HP 49
Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.
Video Segmentwith Associated Text
Segment Description
SemanticMetadata
AutoCategorization
HP 50
Automatic Categorization & Metadata Tagging (Web page)
Video with Editorialized Text on the Web
Auto Categorization
Semantic Metadata
HP 51
Automatic Categorization & Metadata Tagging (Feed)
Text from Bloomberg
Auto Categorization
Semantic Metadata
HP 52
Taalee Extraction and Knowledgebase Enhancement
Extraction Agent
Web Page Enhanced Metadata Asset
HP 53
Basis for Semantics
A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary
B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)
Knowledge Base
HP 54
Basis for Semantics
C. Reasoning/Inference: (Statistical) (Information Retrieval); Statistical Learning/AI (Bayesian, Neural Networks, HMM, ...); Logic Based (Description Logic); Natural Language/Grammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methods: Cluster Analysis, Learning/AI, and Collab. Filtering
Reference data: Concept-terms, Dictionary/Thesaurus, by topic/industry/subject/domain
Word or Phrase
Ontologies/Domain Models
Knowledge Base: by Entities and Relationships
deeper understanding
HP 56
Open Directory Project (ODP): Classification/Taxonomy & Directory
HP 57
Ontology
Standardize meaning, description, and representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes: Attributes, Domain Rules, Functional Dependencies
HP 59
An Ontology
Example: Interrelated ontologies
LANDUSE
COMMERCIAL
INDUSTRIAL
RURAL
RESIDENTIAL
AGRICULTURAL
MILITARY
RECREATIONAL
LAND (SITE)
CULTIVATED AREA
GREENLAND AREA
LAND BANK
ZONING
LANDFILL SITE
WASTE DISPOSAL
RECYCLING
HAZARDOUS
LANDFILL
RESOURCE REC
SOLID
SEWAGE
shredding
magnetic separation
screening
washing
NATURAL DISASTER
EARTHQUAKE
LANDSLIDE
VOLCANO
STORM
FLOOD
FIRE
AVALANCHE
TSUNAMI
causes (relationship shown four times between disaster types)
HP 61
Large Vocabularies, Taxonomies/Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.
Confidential HP
Metadata-enabled Applications
HP 63
Metadata Usage: Impact on Search & Query processing
traditional queries based on keywords
attribute-based queries
content-based queries
HP 64
Oingo.com
Oingo Ontology - ODP based(?); the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning, used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged, and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:
• Context of the Content
• Semantic Metadata describing entities (e.g., Company, Industry, etc.), and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
"Normal" Content can only be "found" if the user enters a keyword that exists within it
Intelligent Content = "Normal" Content + related metadata and relationships
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:
Context of the Content
Semantic Metadata describing entities (e.g., Company, Industry, etc.), and
Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, ...
HP 70
Metadata Usage: Keyword, Attribute, and Content Based Access
The VisualHarness system at LSDIS/UGA
HP 71
Keyword Search vs. Attribute Search with Semantic metadata
Virage Search on football touchdown
Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: httpcbssportsline
Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: httpcbssportsline
Metadata from Typical Cataloging of Football Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31, Pit 24
http://www.nfl.com
Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Qadry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
HP 72
Taalee's Semantic Search
Highly customizable, precise, and freshest A/V search
Context and Domain Specific Attributes
Uniform Metadata for Content from Multiple Sources; can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY amp CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content
Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries amp Hot Topics
PERSONALIZATION
3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
4 Pink Floyd Collection
Personalized Queries
Set the Controls for the Heart of the Sun…
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc.
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc.
AT&T Corp.
2 My Football Fantasy Team
Gators' Spurrier ready for big game
Tech's Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone…
Edwards explains reasons for leaving BYU
more…
HOT Topics
1 Election 2000
Video: Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
more…
2 Middle East Peace Conflict
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Cole's security
more…
3 Napster Controversy
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II
more…
HP 80
Metadata Targeting
SemanticInteractive Targeting
Buy Al Pacino VideosBuy Russell Crowe VideosBuy Christopher Plummer VideosBuy Diane Venora VideosBuy Philip Baker Hall VideosBuy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
Realtime Feeds
Interests Preferences
Time-ShiftedContent Aggregator
Web sites and Pages
ContentDatabases Personalized
Content
Semantic EngineTM
Personalized Content
Content
Structured Hi-Quality
Semantic Metabase
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed WebBased registration and personalization at VoquettersquosEnterprise Customer site
Userrsquos ldquoWireless Home pagerdquo shows the categories for his interests There is an alert (new content) for his stock and sportscategories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down userrsquos Personal Portfolio list The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and the knowledge of userrsquos portfolio Typically search can be done by typing one word or selecting from a dynamic personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions listen set alert on story add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
1108 ON24 Payne1107 ON24 HampQ 1106 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and the analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst
HP 87
iTV Taaleersquos Extreme Personalization
Content Provider
(DBS DISH Wink AOL-TV)
Semantic EngineTM
Meta-DataTagged Content
ContentldquoProgramsrdquo
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-QualitySemantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in
This screen is customizable with interactivity features using metadata, such as whether there is a new Conference Call video on CSCO
Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata
The Conference Call itself can have embedded metadata to support personalization and interactivity
HP 89
Metadata in Enterprise Apps
Filter Search ConsolidatePersonalize ArchiveLicensing Syndication
Production SupportProduction SupportSony
Categorize
Catalog
Integrate
CollectionCollection ProcessingProcessing
NetworkContent
AffiliateFeeds
Public Sources Rich Data
Metabase
HP 90
t
A leaking gasoline pipeline burst into flames Thursday killing more than 60 people near Nigerias commercial capital of Lagos Many of the dead were fisherman in wooden canoes engulfed in the inferno
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko facing the central business district of Lagos across a lagoon
At least 60 people died in this needless fire senior local official Karimu Alabi said
Fire crews from state-run Nigerian National Petroleum Corp (NNPC) which owns the pipeline were joined by other firemen from construction company Julius Berger in battling the blaze
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak ravaging a cluster of huts and log houses
At about the same time a second fire razed Makoko shantytown where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in thewooden huts for days Many victims of the Ebute-Oke fire were
Gore Demands That Recount Restart (940 PM)Gore Says Fla Cant Name Electors (450 PM)Bush Meets Colin Powell at Ranch (122 PM)
Market Tumbles on Earnings Warning (927 AM)Barak Outlines His Peace Plan (630 AM)
-- Breaking News for 11302000 --Customize Page Settings | Content | Layout | Color
Sixty Die In Nigeria BlastProduced by Euronews Posted Date 11302000 Event Election 2000 Location Tallahassee Florida USAPeople Al Gore George W Bush
Video
bull Value-add for production broadcast amp syndication
bull Taaleersquos semantic metadata enables powerful access to content used by Enterprisersquos customers
bull Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --Gore Demands That Recount Restart
Gore Says Fla Cant Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(133) ndash 120600 - ABC
(253) - 120600 - CBS
(516) - 120600 - ABC
(246) - 120600 - FOX
(133) - 120600 - NBC
(533) - 120600
(357) - 120600 - CBS
(427) - 120600 - ABC
(344) - 120600 - FOX
(724) - 120600 - CBS
(133) - 120600 - CBS
TALLAHASSEE Florida (CNN) ndashThough the two presidential candidates have until noon Wednesday to file briefs in Al Gores appeal to the Florida Supreme Court the outcome of two trials set on the same day in Leon County Florida may offer Gore his best hope for the presidency Democrats in Seminole County are seeking to have 15000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5000 votes statewide Lawyers for the plaintiff Harry Jacobs claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2126 incomplete absentee ballot applications sent in by GOP voters while refusing to allow Democratic workers to do the same thing for Democratic voters
The GOP says that suit and one similar to it from Martin County demonstrates Democratic Party politics at its most desperate Gore is not a party to either of those lawsuits On Tuesday the judge in the
(133) - 120600 - ABC
(233) - 120600 - CBS
(312) - 120600 - NNS
(032) - 120600 - CBS
(133) - 120600 - CBS
DescriptionProduced by CNNPosted Date 12072000 Reporter David Lewis Event Election 2000 Location Tallahassee Florida USAPeople Al Gore
HP 92
Retrieve Scene Description Track
Enhanced Digital Cable
Video
MPEGDecoder
Node = AVO Object
Create Scene Description Tree
GREATUSER
EXPERIENCE
Metadatarsquos role in emerging iTV infrastructure
MPEG-247MPEG
Encoder
SceneDescriptionTree
License metadata decoder and semantic applications to
device makers
Channel salesthrough Video Server Vendors
Video App Servers and Broadcasters
Enhanced XML
Description
ldquoCisco Systemsrdquo
Node
TaaleeSemanticEngine
ldquoCisco Systemsrdquo
Produced by Fox Sports Creation Date 12052000 League NFLTeams Seattle Seahawks
Atlanta Falcons Players John KitnaCoaches Mike Holmgren
Dan ReevesLocation Atlanta
Object Content Information (OCI)
Metadata-richValue-added Node
HP 93
Intelligent Metadata Creation
Content which doescontain the wordsthe user asked for
Extractor Agents
Content which does not contain the words
the user asked for but is about what he asked
for
Value-added Metadata
Content the user did not think to ask for but
which he needs to know
Semantic Associations
+ +
Metadata for Intelligent Content
Usage
HP 94
Intelligent Contentvia
Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"
Understanding of the content is needed to create new metadata
Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses such Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely
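The enrichment step described on this slide can be sketched roughly as follows. This is a minimal illustration, not Taalee's patented extraction technology: the `KNOWLEDGE_BASE` dictionary, the field names, and the `enrich` and `matches` helpers are all hypothetical, and the facts are hand-entered sample data taken from the slide's examples.

```python
# Hypothetical knowledge base: entity -> semantic associations (illustrative data only).
KNOWLEDGE_BASE = {
    "Roger Clemens": {"type": "PERSON", "sport": "Baseball", "team": "New York Yankees"},
    "Jamal Anderson": {"type": "PERSON", "sport": "Football",
                       "team": "Atlanta Falcons", "league": "NFL"},
}


def enrich(metadata):
    """Return a copy of the metadata with team/sport/league values implied by its players."""
    enriched = {key: list(values) for key, values in metadata.items()}
    for player in metadata.get("players", []):
        facts = KNOWLEDGE_BASE.get(player, {})
        for field in ("team", "sport", "league"):
            value = facts.get(field)
            # Only add a field when the knowledge base actually knows a value for it.
            if value is not None and value not in enriched.setdefault(field, []):
                enriched[field].append(value)
    return enriched


def matches(metadata, query):
    """Purely syntactic search: case-insensitive substring match over metadata values."""
    q = query.lower()
    return any(q in value.lower() for values in metadata.values() for value in values)


story = {"players": ["Roger Clemens"]}       # metadata extracted from the asset itself
print(matches(story, "Yankees"))             # keyword search on the raw metadata
print(matches(enrich(story), "Yankees"))     # same search after value-added metadata
```

The search function itself stays keyword-based in this sketch; only the metadata it searches over changes, which is the point the slide makes.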
HP 96
Guided Demo for Value Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL
Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team = Atlanta Falcons or League = NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League
Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team = Los Angeles Dodgers or League = National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 ndash Snapshots (ldquoJamal Andersonrdquo)
Search for lsquoJamal Andersonrsquo in lsquoFootballrsquo
Click on first result for Jamal Anderson
View metadata Note that Team name and League name are also included
in the metadata
View the original source HTML page Verify that
the source page contains no mention of Team nameand League name They
were Taaleersquos value-additions to the metadata to facilitate easier search
HP 99
Example 2 ndash Snapshots (ldquoGary Sheffieldrdquo)
Click on first result for Gary Sheffield
View metadata Note that Team name and League name are also included
in the metadata
Search for lsquoGary Sheffieldrsquo in lsquoBaseballrsquo
View the original source HTML page Verify that
the source page contains no mention of Team nameand League name They
were Taaleersquos value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content ndash Value-Added Metadata
Posted Date
Posted Date
Date of asset posting ndashExtracted automatically
League Name
Name of league to which the player's team belongs - Not mentioned explicitly in asset - Value-added by Taalee's processing based on semantic associations
Name of team for which player plays - Not mentioned explicitly in asset - Value-added using Taalee's semantic relationships
Team NameTeam Name
Producer Name
Producer Name
Rich MediaSports AssetRich Media
Sports Asset
Name of content provider that produced the asset
Some Metadata are obtained explicitly from the asset Others (not present in the asset) are added
by Taalee using its semantic relationships
The asset is richly fully described in the many ways the users chose to interact
Player NamesPlayer Names
SportSportName of
sport
Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset
Name of players mentioned explicitly in the asset ndash Extracted automatically
HP 101
Intelligent Contentvia
Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context, or relationships of keywords
For example, a search engine may see that the phrase "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
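One way to picture the difference is to store relationships as triples and follow them at query time. The sketch below is only an assumption-laden illustration: the `TRIPLES` list, the relation names, the made-up headlines in `NEWS`, and the `related` and `search` helpers are hypothetical and do not describe the actual Semantic Content Model.

```python
# Hypothetical (subject, relation, object) triples; relation names follow the slide.
TRIPLES = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Commerce One", "participates_in", "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
    ("Ariba", "is_a", "COMPANY"),
]

# Made-up headlines, each already tagged with the entity it is about.
NEWS = [
    {"title": "Commerce One announces results", "about": "Commerce One"},
    {"title": "Ariba signs new customer", "about": "Ariba"},
    {"title": "Unrelated story", "about": "Some Other Co"},
]


def related(entity, relation="competes_with"):
    """Entities linked to `entity` by `relation`, treating the relation as symmetric."""
    found = set()
    for subject, rel, obj in TRIPLES:
        if rel != relation:
            continue
        if subject == entity:
            found.add(obj)
        elif obj == entity:
            found.add(subject)
    return found


def search(entity):
    """Return what was asked for plus news about semantically associated entities."""
    competitors = related(entity)
    return {
        "asked_for": [n["title"] for n in NEWS if n["about"] == entity],
        "related": [n["title"] for n in NEWS if n["about"] in competitors],
    }


print(search("Commerce One"))
```

Treating `competes_with` as symmetric is a design choice of this sketch; a real knowledge base might record direction explicitly.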
HP 103
Example (test on httpdirectorymediaanywherecom)
Search for company lsquoCommerce Onersquo
Links to news on companies that compete against
Commerce One
Links to news on companies Commerce One competes
against(To view news on Ariba click
on the link for Ariba)
Crucial news on Commerce Onersquos
competitors (Ariba) can be accessed easily and
automatically
HP 104
Internal Source 1Research
Internal Source 2
External feedsWeb(eg Reuters)
1
2
3
4
Cisco story from PW Source 1 passed on to add semantic associations
Consults KnowledgeBase for Cisco's competition
Returns result: Lucent is a competitor of Cisco
Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard
Story on Lucent
Story on Cisco
XCM-compliant metadata XML or other format
SemanticApplication
ASPEnterprise hosted
Extractor Agent 1
Extractor Agent 2
Extractor Agent 3
Metadata centricContent Management Architecture
SemanticEngine
World Model
TaaleeMetabase
Third-partyContent Mgmt
AndSyndication
HP 105
Semantic Associationssupported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
SEC, EPA
Regulations: Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application ExampleFinancial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic SearchPersonalization etc
A vision for future
Semantic Web Complex Relationships and Knowledge Discovery
Eg InfoQuilt project at LSDIS Lab Univ of Georgia
HP 108
Beyond RDF ndash one proposal (cf Ora Lassila)
Structural modeling is obviously not enough
We need a "logic layer" on top of RDF
Some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic rules which allow us to draw inferences from this data
RDF + DL = "Frame System for WWW"
Source wwwontoknowledgeorgoil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal powerful mixture" [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]
A personal definition
Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable tags
wwwdamlorg
DAML Example
Source httpwwwzdnetcompcweekstoriesjumps04270243294600html
Three layered Architecture Of Semantic Web
Logical Layer
Formal Semantics and Reasoning Support ndash OIL DAML-O
Schema Layer
Definition of Vocabulary RDF Schema
Data Layer
Simple data model and syntax for metadata - RDF
OIL ndash as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL ndash Evolving towards Semantic Web
OIL Mission
OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery Knowledge Discovery --ExampleExample
Earthquake Sources(USGS NEIC)
Nuclear Test Sources(Oklahoma Observatory etc)
Nuclear Test May Cause Earthquakes
Is it really true
Complex RelationshipsComplex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
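The rule above can be sketched as an executable predicate. The slide does not define `dateDifference` or `distance`, so this sketch makes assumptions: dates are compared in days, the earthquake must not precede the test, and distance is the great-circle (haversine) distance in miles. The record layout and the sample coordinates are hypothetical.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an assumption of this sketch


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in miles (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """True if the quake falls within max_days after the test and within max_miles of it."""
    days = (quake["eventDate"] - test["eventDate"]).days
    close_in_time = 0 <= days < max_days  # assumption: the quake must not precede the test
    close_in_space = distance_miles(test["latitude"], test["longitude"],
                                    quake["latitude"], quake["longitude"]) < max_miles
    return close_in_time and close_in_space


# Hypothetical sample records, not real observations.
test_event = {"eventDate": date(1970, 1, 1), "latitude": 37.0, "longitude": -116.0}
near_quake = {"eventDate": date(1970, 1, 10), "latitude": 36.0, "longitude": -117.0}
late_quake = {"eventDate": date(1970, 3, 1), "latitude": 36.0, "longitude": -117.0}

print(may_have_caused(test_event, near_quake))
print(may_have_caused(test_event, late_quake))
```

Both thresholds are parameters, so the 30-day window and the 10000 limit from the slide can be tightened when exploring the data.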
Knowledge Discovery - Example
When was the first recorded nuclear test conducted
1950
Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example...
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9 and >9 on the Richter scale per year starting from 1900, find the average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
Knowledge Discovery - Example...
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake
Demo
ResourcesReferences
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management Using Metadata to Integrate and Apply Digital Media Amit Sheth and Wolfgang Klas Eds McGraw Hill ISBN 0-07-057735-8 1998
HP 27
Voice XML Metadata
Voice Specific metadata
Supports Syntactic interoperability
Text data to voice data
Voice XML = XML + Voice Metadata
HP 28
VoiceXML ndash Possible Services
Information retrieval ndash News sports traffic stock quotes
e- Transactions (e- commerce e- tailing etc)
Financial banking stock trading
Catalog browsing (generally as an adjunct to paper)
Telephone services
Personal voice dialing One- number find- me services
Intranet ndash Inventory HR services corporate portals
Unification ndash My Whatever personal portals personal agents unified messaging
Source httpwwwvoicexmlorg
HP 29
MPEG7
A set of description schemes and descriptors to describe the content of multimedia data
Provides a language to specify description schemes
A scheme for coding the description
HP 30
Application Examples for MPEG7
A few application examples are
Digital libraries (image catalog musical dictionary)
Multimedia directory services (eg yellow pages)
Broadcast media selection (radio channel TV channel)
HP 31
Information and Content Exchange (ICE)
Main Goal efficient and extensible Content Syndication protocol for the Internet using XML syntax
Authors Adobe Kinecta MS Sun Vignette et al
Status latest spec version 11 May 2000 submitted to W3C for review
Implementations Vignette Syndication Server MS BizTalk Kinecta Interact hellip
Web Site httpwwwicestandardorg
HP 32
What is the ICE Protocol
Syndication Protocol for communication between
Syndicators and Subscribers
Metadata to define
roles and responsibilities of involved parties Subscriber vs
Syndicator Requestor vs Responder Sender vs Receiver
format and method of content exchange (eg sequenced
packages pull vs push model)
HP 33
ICE Applications
ICE vocabulary + domain vocabulary = complete application
ICEestablishes and manages the syndication
delivers data
logs events
=gt content-independent metadata
industry-specific vocabulary defines the content =gt domain-specific metadata
Source httpwwwicestandardorg
HP 34
ICE Explained
ICE: Information and Content Exchange protocol
Syndicator: A content aggregator and distributor
Subscriber: A content consumer
Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement
Collection: The current content of a subscription
ICE Package: A delivery of commands to update a collection, such as the addition of content items
ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information
Sources InternetWeek ICE Cookbook version 10 httpwwwinternetweekcomebizapps01ebiz050701-3htm
<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "httpicedtd">
<ice-payload payload-id="ipl-80a56cfe"
             timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321"
                subscription-element="4321" name="Cartoon" filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru" copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC ...(ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>
Content (domain-specific
metadata)
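To make the split between the ICE envelope (content-independent metadata) and the carried content (domain-specific metadata) concrete, here is a small sketch that parses a payload shaped like the one on this slide. It is not an ICE implementation: the `split_metadata` helper is hypothetical, the sample omits the DOCTYPE line, and the attribute punctuation (for example `ice.version`) is reconstructed and should be checked against the ICE specification.

```python
import xml.etree.ElementTree as ET

# Sample payload modelled on the slide; the image data is replaced by a placeholder.
PAYLOAD = """<?xml version="1.0"?>
<ice-payload payload-id="ipl-80a56cfe" timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon"
                filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru"
                     copyright="Taalee Makeups" pubdate="20010515">...</comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>"""


def split_metadata(payload_xml):
    """Separate ICE envelope attributes from the domain-specific elements they carry."""
    root = ET.fromstring(payload_xml)
    envelope = dict(root.attrib)
    items = []
    for item in root.iter("ice-item"):
        items.append({
            "ice": dict(item.attrib),  # syndication-level metadata
            "domain": [{"tag": child.tag, **child.attrib} for child in item],  # content metadata
        })
    return envelope, items


envelope, items = split_metadata(PAYLOAD)
print(envelope)
print(items[0]["domain"])
```

The `ice-*` elements never need to understand `comic-strip`; that vocabulary belongs to the subscriber's domain, which is the division of labor the ICE slides describe.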
HP 36
XCM (eXtended Content Management)
a framework that allows customers to classify content management offerings according to the business problems they address The segments of XCM are
Content Development - Developing static content and managing the process of its subsequent approval versioning storage and retrieval
Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle
Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability
Source httpwwwvignettecomCDASite020971-1-30-1458-1146-174300html
HP 37
XCM
eXtended Content Management
Content DevelopmentManagement
Content DeliveryApplication ContentManagement
Content AuthoringDigital Asset Management
Software ConfigurationManagement
Document ProcessManagement
Metadata ManagementRecombinationPersonalization
Edge Network Delivery
Streaming Media DeliveryCaching
Source httpwwwvignettecom
HP 38
Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain
FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
...
UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Address Id: http://gisdasc.kgs.ukans.edu/dasc
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
...
Kansas State
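Because the two models use different tag names for the same values, one simple reconciliation strategy is to map each model's tags onto a shared vocabulary. The sketch below assumes such a shared vocabulary exists: the common field names and the `normalize` helper are hypothetical, and only the five tag pairs shown on the slide are covered.

```python
# Hypothetical mappings from each model's tag names to a shared vocabulary.
FGDC_TO_COMMON = {
    "Theme keywords": "keywords",
    "Title": "title",
    "Online linkage": "url",
    "Direct Spatial Reference Method": "spatial_method",
    "Horizontal Coordinate System Definition": "coordinate_system",
}
UDK_TO_COMMON = {
    "Search terms": "keywords",
    "Topic": "title",
    "Address Id": "url",
    "Measuring Techniques": "spatial_method",
    "Co-ordinate System": "coordinate_system",
}


def normalize(record, mapping):
    """Rename a record's tags to the shared vocabulary, dropping unmapped tags."""
    return {mapping[tag]: value for tag, value in record.items() if tag in mapping}


fgdc_record = {
    "Theme keywords": "digital line graph, hydrography, transportation",
    "Title": "Dakota Aquifer",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}
udk_record = {
    "Search terms": "digital line graph, hydrography, transportation",
    "Topic": "Dakota Aquifer",
    "Measuring Techniques": "Vector",
    "Co-ordinate System": "Universal Transverse Mercator",
}

# Once normalized, the two descriptions of the same dataset become directly comparable.
print(normalize(fgdc_record, FGDC_TO_COMMON) == normalize(udk_record, UDK_TO_COMMON))
```

A tag-level mapping like this only addresses naming differences; differences in value formats or granularity would need more than a lookup table.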
HP 39
Different views of Metadata
Domain Independent Specifications (RDF)
FrameworksInfrastructures (XCM)
MetadataApplication Specific
ICE
Media Specific
MPEG7 VoiceXML
Domain Specific
NewsML FGDCUDK
HP 40
Creating and Serving Metadata to Power the Life-cycle of Content
Taalee Infrastructure Services Taalee Content Applications
Where is the content
Whose is it
ProduceAggregate
CatalogIndex
What other content is it related to
Integrate Syndicate
What is the right content for this
user
Personalize
What is the best way to
monetize this interaction
Interactive Marketing
BroadcastWirelineWirelessInteractive TV
Taalee Semantic MetaBase
HP 41
Taaleersquos Intelligent Content Process
HP 42
Metadata Creation and Semanticization
bull Automatic Content ClassificationCategorization
bull Metadata CreationExtractionTypes of metadata created
HP 43
FormsTypesIngest of Content
Sources: Web Sites, Content Feeds, and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)
HP 44
Content HandlingIngest
InfrastructureExchangeFeed HandlersCrawlersScreen ScrapersBotsSoftware Agents
Centralized Distributed MobileMigratory
HP 45
Information Extraction for Metadata Creation
GlobalEnterpriseWeb Repositories
METADATAMETADATA
EXTRACTORSEXTRACTORS
Digital Maps
NexisUPIAP
Documents
Digital Audios
Data Stores
Digital Videos
Digital Images
HP 46
Extracting a Text Document: Syntactic approach
INCIDENT MANAGEMENT SITUATION REPORTFriday August 1 1997 - 0530 MDT
NATIONAL PREPAREDNESS LEVEL II
CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.
SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats between Galena and McGr The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.
CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.
LAYOUT
Date => day month int ',' int
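A layout rule like the one above can be approximated with a regular expression. The sketch below assumes the report header originally read "Friday, August 1, 1997" (commas were lost in this transcript, so they are optional in the pattern); the `DATE_RE` pattern and `extract_date` helper are illustrative and not the extractor shown in the workshop.

```python
import re
from datetime import date

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# Date => day month int ',' int  (commas optional, since the transcript dropped them)
DATE_RE = re.compile(
    rf"\b(?P<weekday>{'|'.join(DAYS)}),?\s+"
    rf"(?P<month>{'|'.join(MONTHS)})\s+"
    rf"(?P<dom>\d{{1,2}}),?\s+"
    rf"(?P<year>\d{{4}})\b"
)


def extract_date(text):
    """Return {'weekday': ..., 'date': datetime.date} for the first match, else None."""
    match = DATE_RE.search(text)
    if match is None:
        return None
    month_number = MONTHS.index(match["month"]) + 1
    return {
        "weekday": match["weekday"],
        "date": date(int(match["year"]), month_number, int(match["dom"])),
    }


header = "INCIDENT MANAGEMENT SITUATION REPORT\nFriday, August 1, 1997 - 0530 MDT"
print(extract_date(header))
```

This is purely syntactic: it finds text that looks like a date by its layout, with no understanding of what the date refers to, which is the limitation the later semantic slides address.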
Traditional TextCategorization
StatisticalAI Techniques
Classify Place ina taxonomy
feed
Customer Training
Set
RoutingDistribution
Customer Article Feed
4715
Standard Metadata
Feed Source iSyndicate
Posted Date 11202000
Classification of Article 4715
Knowledge-base amp StatisticalAI Techniques
ClassifyPlace ina taxonomy
MetadataCatalog
Content Manager
Precise syndicationfiltering
fd
Article 4715 MetadataFeed Source iSyndicatePosted Date 11202000 Company Name France Telecom
EquantTicker Symbol FTE ENTExchange NYSETopic Company News
Standard metadata
Semantic metadata
FTECompany AnalysisConference Calls
EarningsStock Analysis
NYSEMember Companies
Market NewsIPOs
Automated Content Enrichment (ACE)
Taalee Enterprise Customization Suite
Taaleersquos Categorization amp Automatic Metadata Creation
Taalee Training
Set
Customer Training
Set ee ENTCompany AnalysisConference Calls
EarningsStock Analysis
Classification of Article 4715
Article Feed4715 RoutingDistribution
Map to another taxonomy
HP 49
Automatic Categorization amp Metadata Tagging (unstructured texttranscript of AV)
ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION AS OF TONIGHT THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49 IN WASHINGTON STATE THE SENATE RACE REMAINS TOO CLOSE TO CALL IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT THE SENATE WILL BE EVENLY DIVIDED IN MISSOURI REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN WHO DIED IN A CRASH THREE WEEKS AGO GOVERNOR CARNAHANS WIFE IS EXPECTED TO TAKE HIS PLACE IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT HILLARY CLINTON WON THE NEW YORK SENATE SEAT SHE IS THE FIRST FIRST LADY TO RUN MUCH LESS WIN
Video Segmentwith Associated Text
Segment Description
SemanticMetadata
AutoCategorization
HP 50
Automatic Categorization amp Metadata Tagging (Web page)
Video withEditorialized Text on the Web
AutoCategorization
AutoCategorization
Semantic MetadataSemantic Metadata
HP 51
Automatic Categorization amp Metadata Tagging (Feed)
Text From Bloomberg
AutoCategorization
AutoCategorization
Semantic MetadataSemantic Metadata
HP 52
Taalee Extraction and Knowledgebase Enhancement
Extraction Agent
Web Page Enhanced Metadata Asset
HP 53
Basis for Semantics
A FactsConceptsTermsEntitiesDictionary Thesaurus Reference DataVocabulary
B Facts with RelationshipsTaxonomy(Categories) OntologyDomain Modeling (eg Golf = golfer tournament name golf course event)
Knowledge Base
HP 54
Basis for Semantics
C ReasoningInference(Statistical)(Information Retrieval)Statistical LearningAI (Bayesian Neural Networks HMMhellip)Logic Based (Description Logic)Natural LanguageGrammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methodsCluster Analysis
LearningAI and Collab Filtering
Reference dataConcept-termsDictionaryThesaurusBy topicindustrysubjectdomain
Word or Phrase
OntologiesDomain Models
KnowledgeBaseBy Entities and Relationships
deeperunderstanding
HP 56
Open Directory Project (ODP) ClassificationTaxonomy amp Directory
HP 57
Ontology
Standardize meaning description representation of involved attributes Capture the semantics involved via domain characteristicsAllow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includesAttributesDomain RulesFunctional Dependencies
HP 59
An Ontology
Example Interrelated ontologies
LANDUSE
COMMERCIAL
INDUSTRIALRURAL
RESIDENTIAL
AGRICULTURAL
MILITARYRECREATIONAL
LAND(SITE)
CULTIVATEDAREA
GREENLANDAREA LAND
BANK
ZONING
LANDFILLSITE
WASTEDISPOSAL
RECYCLING
HAZARDOUS
LANDFILLRESOURCE REC
SOLID SEWAGE
shredding
magneticseparation
screening
washing
NATURALDISASTER
EARTHQUAKE
causes
LANDSLIDE
VOLCANO
STORMFLOOD
FIRE
AVALANCHE
TSUNAMI
causes
causes
causes
HP 61
Large Vocabularies TaxonomiesOntologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms
Metadata-enabled Applications
HP 63
Metadata Usage: Impact on Search & Query processing
traditional queries based on keywords
attribute based queries
content-based queries
HP 64
Oingo.com
Oingo Ontology (ODP based) - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning used to state intent; the basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged, and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing
• Context of the Content
• Semantic Metadata describing entities (e.g. Company, Industry, etc.) and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
"Normal" Content can only be "found" if the user enters a keyword that exists within it
Intelligent Content=+
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create
• Context of the Content
• Semantic Metadata describing entities (e.g. Company, Industry, etc.) and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g. Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …
HP 70
Metadata Usage Keyword Attribute and Content Based Access
The VisualHarness system at LSDISUGA
HP 71
Keyword Search vs Attribute Search with Semantic metadata
Virage Search on "football touchdown"
Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL httpcbssportsline
Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL httpcbssportsline
Metadata from Typical Cataloging of Football Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31 Pit 24
http://www.nfl.com
Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Qadry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2022000
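The contrast on this slide can be sketched in a few lines of Python. This is only an illustration, not Taalee's or Virage's implementation: the `Asset` class, the field names (`event`, `teams`, and so on) and both search functions are hypothetical, and the asset texts are shortened from the examples above.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    title: str
    text: str
    metadata: dict = field(default_factory=dict)  # semantic metadata, if any


# Sample assets paraphrased from the slide; field names are assumptions.
ASSETS = [
    Asset("Jimmy Smith Interview Part Seven",
          "Jimmy Smith explains his philosophy on showboating"),
    Asset("Brian Griese Interview Part Four",
          "Brian Griese talks about the first touchdown he ever threw"),
    Asset("Baltimore 31 Pit 24",
          "Ismail and Tony Banks hook up for their third long touchdown",
          {"league": "Professional", "teams": ["Ravens", "Steelers"],
           "players": ["Ismail", "Banks"], "event": "Touchdown"}),
]


def keyword_search(assets, keyword):
    """Syntactic search: return assets whose text contains the keyword."""
    needle = keyword.lower()
    return [a for a in assets if needle in a.text.lower()]


def _matches(value, wanted):
    """A multi-valued field matches if it contains the wanted value."""
    if isinstance(value, (list, tuple, set)):
        return wanted in value
    return value == wanted


def attribute_search(assets, **criteria):
    """Attribute search: every requested metadata field must match."""
    return [a for a in assets
            if all(_matches(a.metadata.get(k), v) for k, v in criteria.items())]
```

A keyword search for "touchdown" returns both the Griese interview and the game clip, while `attribute_search(ASSETS, event="Touchdown", teams="Ravens")` returns only the clip that actually shows the event.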
HP 72
Taalee's Semantic Search
Highly customizable precise and freshest AV search
Context and Domain Specific Attributes Uniform Metadata for Content from Multiple Sources Can be sorted by any field
Delightful relevant informationexceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY amp CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content
Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries amp Hot Topics
PERSONALIZATION
3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
4 Pink Floyd Collection
Personalized Queries
Set the Controls for the Heart of the Sun…
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
2 My Football Fantasy Team
Gators Spurrier ready for big game
Techs Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone…
Edwards explains reasons for leaving BYU
more…
1 Election 2000
2 Middle East Peace Conflict
3 Napster Controversy
Video Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Coles security
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II
more…
HOT Topics
HP 80
Metadata Targeting
SemanticInteractive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
[Figure labels: Realtime Feeds; Time-Shifted Content Aggregator; Web sites and Pages; Content Databases; Semantic Engine(TM); Structured Hi-Quality Semantic Metabase; Interests, Preferences; Personalized Content]
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site
User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
Clicking on MyStocks brings down user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and the knowledge of user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g. Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst
HP 87
iTV: Taalee's Extreme Personalization
Content Provider
(DBS DISH Wink AOL-TV)
Semantic EngineTM
Meta-DataTagged Content
ContentldquoProgramsrdquo
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-QualitySemantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in
This screen is customizable with interactivity features using metadata, such as whether there is a new Conference Call video on CSCO
Part of the screen can be automatically customized to show conference call specific information, including transcript, participation, etc., all of which are relevant metadata
Conference Call itself can have embedded metadata to support personalization and interactivity
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support
Sony
Categorize
Catalog
Integrate
Collection
Processing
Network Content
Affiliate Feeds
Public Sources
Rich Data Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla Cant Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(133) - 120600 - ABC
(253) - 120600 - CBS
(516) - 120600 - ABC
(246) - 120600 - FOX
(133) - 120600 - NBC
(533) - 120600
(357) - 120600 - CBS
(427) - 120600 - ABC
(344) - 120600 - FOX
(724) - 120600 - CBS
(133) - 120600 - CBS
TALLAHASSEE Florida (CNN) ndashThough the two presidential candidates have until noon Wednesday to file briefs in Al Gores appeal to the Florida Supreme Court the outcome of two trials set on the same day in Leon County Florida may offer Gore his best hope for the presidency Democrats in Seminole County are seeking to have 15000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5000 votes statewide Lawyers for the plaintiff Harry Jacobs claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2126 incomplete absentee ballot applications sent in by GOP voters while refusing to allow Democratic workers to do the same thing for Democratic voters
The GOP says that suit and one similar to it from Martin County demonstrates Democratic Party politics at its most desperate Gore is not a party to either of those lawsuits On Tuesday the judge in the
(133) - 120600 - ABC
(233) - 120600 - CBS
(312) - 120600 - NNS
(032) - 120600 - CBS
(133) - 120600 - CBS
Description
Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore
HP 92
Retrieve Scene Description Track
Enhanced Digital Cable
Video
MPEGDecoder
Node = AVO Object
Create Scene Description Tree
GREATUSER
EXPERIENCE
Metadata's role in emerging iTV infrastructure
MPEG-2/4/7
MPEG Encoder
Scene Description Tree
License metadata decoder and semantic applications to
device makers
Channel salesthrough Video Server Vendors
Video App Servers and Broadcasters
Enhanced XML
Description
ldquoCisco Systemsrdquo
Node
TaaleeSemanticEngine
ldquoCisco Systemsrdquo
Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: John Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta
Object Content Information (OCI)
Metadata-richValue-added Node
HP 93
Intelligent Metadata Creation
Content which does contain the words the user asked for
Extractor Agents
+
Content which does not contain the words the user asked for, but is about what he asked for
Value-added Metadata
+
Content the user did not think to ask for, but which he needs to know
Semantic Associations
Metadata for Intelligent Content
Usage
HP 94
Intelligent Content via
Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"
Understanding of the content is needed to create new metadata
Taalee understands Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g. COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
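The enrichment step described above can be sketched as a lookup against a small knowledge base. This is a minimal illustration under stated assumptions, not Taalee's patented extraction: the `KNOWLEDGE_BASE` structure, the field names, and the `enrich` and `matches` helpers are all hypothetical, and the facts are limited to those mentioned in this workshop.

```python
# Hypothetical knowledge base: entity -> facts known about it.
KNOWLEDGE_BASE = {
    "Roger Clemens": {"type": "PERSON", "sport": "Baseball",
                      "team": "New York Yankees"},
    "Jamal Anderson": {"type": "PERSON", "sport": "Football",
                       "team": "Atlanta Falcons", "league": "NFL"},
}

# Which knowledge-base facts become which (multi-valued) metadata fields.
ADDED_ATTRIBUTES = {"team": "teams", "league": "leagues", "sport": "sports"}


def enrich(metadata):
    """Return a copy of the metadata with value-added fields filled in."""
    enriched = {key: list(values) for key, values in metadata.items()}
    for player in metadata.get("players", []):
        facts = KNOWLEDGE_BASE.get(player, {})
        for fact_name, field_name in ADDED_ATTRIBUTES.items():
            value = facts.get(fact_name)
            if value is not None and value not in enriched.setdefault(field_name, []):
                enriched[field_name].append(value)
    return enriched


def matches(metadata, term):
    """True if any metadata field carries the search term as a value."""
    return any(term in values for values in metadata.values())
```

With `story = {"players": ["Roger Clemens"]}`, `matches(story, "New York Yankees")` is false, but `matches(enrich(story), "New York Yankees")` is true, which is the behavior the slide describes.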
HP 96
Guided Demo for Value Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player of the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player of the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 - Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Posted Date: Date of asset posting - extracted automatically
Producer Name: Name of content provider that produced the asset
Player Names: Names of players mentioned explicitly in the asset - extracted automatically
Team Name: Name of team for which player plays - not mentioned explicitly in asset; value-added using Taalee's semantic relationships
League Name: Name of league to which the player's team belongs - not mentioned explicitly in asset; value-added by Taalee's processing based on semantic associations
Sport: Name of sport
Legend: "X -> Y" means Taalee uses X to add Y as value-added metadata to the asset
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships
The asset is richly and fully described in the many ways the users choose to interact
HP 101
Intelligent Content via
Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context, or relationships of keywords
For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
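One way to picture such associations is as subject-relation-object triples that a search front end can follow. The sketch below is an assumption-laden illustration, not the Semantic Engine itself: the triple store, the relation names, the story titles, and both functions are hypothetical; only the Commerce One, Ariba, Cisco, and Lucent facts come from the slides.

```python
# Hypothetical triple store: (subject, relation, object).
TRIPLES = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Commerce One", "participates_in",
     "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
    ("Cisco", "competes_with", "Lucent"),
]

# Hypothetical catalog: entity -> titles of stories about that entity.
STORIES = {
    "Commerce One": ["Commerce One story (placeholder title)"],
    "Ariba": ["Ariba story (placeholder title)"],
}


def related(entity, relation):
    """Objects linked to the entity by the given relation."""
    return [obj for subj, rel, obj in TRIPLES
            if subj == entity and rel == relation]


def stories_with_associations(entity):
    """What the user asked for, plus stories about the entity's competitors."""
    result = {"asked_for": STORIES.get(entity, []), "competitors": {}}
    for competitor in related(entity, "competes_with"):
        result["competitors"][competitor] = STORIES.get(competitor, [])
    return result
```

Calling `stories_with_associations("Commerce One")` returns the Commerce One stories together with a `competitors` entry for Ariba, which a keyword index alone could not supply.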
HP 103
Example (test on httpdirectorymediaanywherecom)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata centric Content Management Architecture
Internal Source 1 (Research)
Internal Source 2
External feeds/Web (e.g. Reuters)
Extractor Agent 1
Extractor Agent 2
Extractor Agent 3
1 Cisco story from PW Source 1 passed on to add semantic associations
2 Consults KnowledgeBase for Cisco's competition
3 Returns result: Lucent is a competitor of Cisco
4 Lucent story from external feeds picked for publishing as "semantically related" to Cisco story; passed on to Dashboard
Story on Lucent
Story on Cisco
XCM-compliant metadata, XML or other format
Semantic Application
ASP/Enterprise hosted
Semantic Engine
World Model
Taalee Metabase
Third-party Content Mgmt and Syndication
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY
Technology/Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application ExampleFinancial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic SearchPersonalization etc
A vision for future
Semantic Web Complex Relationships and Knowledge Discovery
Eg InfoQuilt project at LSDIS Lab Univ of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture" [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]
A personal definition
Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF+XML
Agent readable Tags
www.daml.org
DAML Example
Source httpwwwzdnetcompcweekstoriesjumps04270243294600html
Three-layered Architecture of Semantic Web
Logical Layer
Formal Semantics and Reasoning Support - OIL, DAML-O
Schema Layer
Definition of Vocabulary - RDF Schema
Data Layer
Simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL - Evolving towards Semantic Web
OIL Mission
OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Test May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
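The rule above can be sketched as an executable predicate. This is an illustration only, not the InfoQuilt implementation: the `Event` class and function names are hypothetical, the distance is assumed to be a great-circle (haversine) distance in miles, and the thresholds of 30 days and 10000 are taken directly from the slide.

```python
import math
from dataclasses import dataclass
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, used by the haversine formula


@dataclass
class Event:
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in miles (haversine)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """The slide's rule: quake follows the test within max_days and max_miles."""
    days_after = (quake.event_date - test.event_date).days
    close_enough = distance_miles(test.latitude, test.longitude,
                                  quake.latitude, quake.longitude) < max_miles
    return 0 <= days_after < max_days and close_enough
```

The check `0 <= days_after` encodes the slide's condition that the earthquake occurred after the test; an earthquake that precedes the test is never attributed to it.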
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year, starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example…
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and > 9 on the Richter scale per year, starting from 1900, find average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
Knowledge Discovery - Example…
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and the test site lies within a radius of 10000 miles of the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 28
VoiceXML - Possible Services
Information retrieval - News, sports, traffic, stock quotes
e-Transactions (e-commerce, e-tailing, etc.)
Financial: banking, stock trading
Catalog browsing (generally as an adjunct to paper)
Telephone services
Personal voice dialing, One-number find-me services
Intranet - Inventory, HR services, corporate portals
Unification - My Whatever, personal portals, personal agents, unified messaging
Source http://www.voicexml.org
HP 29
MPEG7
A set of description schemes and descriptors to describe the content of multimedia data
Provides a language to specify description schemes
A scheme for coding the description
HP 30
Application Examples for MPEG7
A few application examples are
Digital libraries (image catalog musical dictionary)
Multimedia directory services (eg yellow pages)
Broadcast media selection (radio channel TV channel)
HP 31
Information and Content Exchange (ICE)
Main Goal: efficient and extensible Content Syndication protocol for the Internet using XML syntax
Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.
Status: latest spec version 1.1, May 2000; submitted to W3C for review
Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, …
Web Site: http://www.icestandard.org
HP 32
What is the ICE Protocol
Syndication Protocol for communication between Syndicators and Subscribers
Metadata to define
roles and responsibilities of involved parties: Subscriber vs Syndicator, Requestor vs Responder, Sender vs Receiver
format and method of content exchange (e.g. sequenced packages, pull vs push model)
HP 33
ICE Applications
ICE vocabulary + domain vocabulary = complete application
ICE
establishes and manages the syndication
delivers data
logs events
=> content-independent metadata
industry-specific vocabulary defines the content => domain-specific metadata
Source http://www.icestandard.org
HP 34
ICE Explained
ICE: Information and Content Exchange protocol
Syndicator: A content aggregator and distributor
Subscriber: A content consumer
Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement
Collection: The current content of a subscription
ICE Package: A delivery of commands to update a collection, such as the addition of content items
ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information
Sources InternetWeek ICE Cookbook version 10 httpwwwinternetweekcomebizapps01ebiz050701-3htm
<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe"
             timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon"
                filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru"
                     copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC … (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>
Content (domain-specific metadata)
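To make the split between ICE's content-independent metadata and the domain-specific content concrete, the payload can be read with Python's standard `xml.etree.ElementTree`. This is a sketch under assumptions: the XML below is a cleaned-up reading of the slide's example (the DOCTYPE is omitted and the attribute spellings, such as `ice.version`, are assumed), and `split_metadata` is a hypothetical helper, not part of any ICE toolkit.

```python
import xml.etree.ElementTree as ET

# Cleaned-up reading of the slide's payload; attribute spellings are assumed.
PAYLOAD = """<?xml version="1.0"?>
<ice-payload payload-id="ipl-80a56cfe"
             timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon"
                filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru"
                     copyright="Taalee Makeups" pubdate="20010515">
          (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>"""


def split_metadata(xml_text):
    """Return (ICE item attributes, domain tag, domain attributes)."""
    root = ET.fromstring(xml_text)
    item = root.find(".//ice-item")      # ICE envelope for one content item
    domain = item[0]                     # first child: domain vocabulary element
    return dict(item.attrib), domain.tag, dict(domain.attrib)


ice_attrs, domain_tag, domain_attrs = split_metadata(PAYLOAD)
```

Here `ice_attrs` holds delivery details such as `name` and `content-type`, while `domain_tag` and `domain_attrs` carry the comic-strip vocabulary (title, author, publication date) that ICE itself does not interpret.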
HP 36
XCM (eXtended Content Management)
a framework that allows customers to classify content management offerings according to the business problems they address The segments of XCM are
Content Development - Developing static content and managing the process of its subsequent approval versioning storage and retrieval
Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle
Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability
Source httpwwwvignettecomCDASite020971-1-30-1458-1146-174300html
HP 37
XCM: eXtended Content Management
Content Development/Management: Content Authoring, Digital Asset Management, Software Configuration Management, Document Process Management
Application Content Management: Metadata Management, Recombination, Personalization
Content Delivery: Edge Network Delivery, Streaming Media Delivery, Caching
Source http://www.vignette.com
HP 38
Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain
FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
…
UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Adress Id: http://gisdasc.kgs.ukans.edu/dasc
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
…
Kansas State
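Reconciling the two models amounts to mapping each model's tag names onto a shared vocabulary. The sketch below assumes a hypothetical common vocabulary (`keywords`, `title`, `representation`, `coordinate_system`) invented for this illustration; only the FGDC and UDK tag names and the sample values come from the slide.

```python
# Tag names from the slide mapped onto hypothetical common field names.
FGDC_TO_COMMON = {
    "Theme keywords": "keywords",
    "Title": "title",
    "Direct Spatial Reference Method": "representation",
    "Horizontal Coordinate System Definition": "coordinate_system",
}
UDK_TO_COMMON = {
    "Search terms": "keywords",
    "Topic": "title",
    "Measuring Techniques": "representation",
    "Co-ordinate System": "coordinate_system",
}


def normalize(record, mapping):
    """Rename the tags of one record into the common vocabulary."""
    return {mapping[tag]: value for tag, value in record.items() if tag in mapping}


fgdc_record = {
    "Theme keywords": ["digital line graph", "hydrography", "transportation"],
    "Title": "Dakota Aquifer",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}
udk_record = {
    "Search terms": ["digital line graph", "hydrography", "transportation"],
    "Topic": "Dakota Aquifer",
    "Measuring Techniques": "Vector",
    "Co-ordinate System": "Universal Transverse Mercator",
}
```

After normalization the two records compare equal, so a query written against the common field names works for both sources.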
HP 39
Different views of Metadata
Domain Independent Specifications (RDF)
FrameworksInfrastructures (XCM)
MetadataApplication Specific
ICE
Media Specific
MPEG7 VoiceXML
Domain Specific
NewsML FGDCUDK
HP 40
Creating and Serving Metadata to Power the Life-cycle of Content
Taalee Infrastructure Services Taalee Content Applications
Where is the content
Whose is it
ProduceAggregate
CatalogIndex
What other content is it related to
Integrate Syndicate
What is the right content for this
user
Personalize
What is the best way to
monetize this interaction
Interactive Marketing
BroadcastWirelineWirelessInteractive TV
Taalee Semantic MetaBase
HP 41
Taalee's Intelligent Content Process
HP 42
Metadata Creation and Semanticization
bull Automatic Content ClassificationCategorization
• Metadata Creation/Extraction
Types of metadata created
HP 43
Forms/Types/Ingest of Content
Sources: Web Sites, Content Feeds, and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)
HP 44
Content Handling/Ingest
Infrastructure/Exchange
Feed Handlers
Crawlers/Screen Scrapers/Bots/Software Agents
Centralized, Distributed, Mobile/Migratory
HP 45
Information Extraction for Metadata Creation
Global/Enterprise/Web Repositories
METADATA EXTRACTORS
Digital Maps
NexisUPIAP
Documents
Digital Audios
Data Stores
Digital Videos
Digital Images
HP 46
Extracting a Text Document: Syntactic approach
INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT
NATIONAL PREPAREDNESS LEVEL II
CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.
SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats between Galena and McGr… The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.
CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.
LAYOUT
Date => day month int ',' int
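The layout rule above can be read as a small pattern over the report header. A minimal sketch in Python, assuming the rule means "weekday name, month name, integer, comma, integer"; the names `DATE_RULE` and `extract_report_date` are hypothetical and are not part of any extractor described in these slides.

```python
import re
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

WEEKDAY_ALT = "|".join(WEEKDAYS)
MONTH_ALT = "|".join(MONTHS)

# Date => day month int ',' int
# Commas are optional so the rule also matches text whose punctuation was lost.
DATE_RULE = re.compile(
    rf"(?P<weekday>{WEEKDAY_ALT}),?\s+(?P<month>{MONTH_ALT})\s+"
    rf"(?P<dom>\d{{1,2}}),?\s+(?P<year>\d{{4}})"
)


def extract_report_date(text):
    """Return the first date matching the layout rule, or None."""
    match = DATE_RULE.search(text)
    if match is None:
        return None
    month_number = MONTHS.index(match["month"]) + 1
    return date(int(match["year"]), month_number, int(match["dom"]))


header = "INCIDENT MANAGEMENT SITUATION REPORT Friday, August 1, 1997 - 0530 MDT"
report_date = extract_report_date(header)
```

A purely syntactic rule like this finds the date wherever the layout holds, but it knows nothing about what the report is about.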
Traditional Text Categorization
[Figure labels: Customer Article Feed (Article 4715); Customer Training Set; Statistical/AI Techniques; Classify/Place in a taxonomy; Routing/Distribution]
Standard Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Classification of Article 4715
Taalee's Categorization & Automatic Metadata Creation
[Figure labels: Automated Content Enrichment (ACE); Taalee Enterprise Customization Suite; Article Feed (Article 4715); Taalee Training Set; Customer Training Set; Knowledge-base & Statistical/AI Techniques; Classify/Place in a taxonomy; Map to another taxonomy; Metadata Catalog; Content Manager; Routing/Distribution; Precise syndication/filtering]
Article 4715 Metadata
Standard metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Semantic metadata
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News
Classification of Article 4715
FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
NYSE: Member Companies, Market News, IPOs
HP 49
Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.
Video Segment with Associated Text
Segment Description
Semantic Metadata
Auto Categorization
HP 50
Automatic Categorization & Metadata Tagging (Web page)
Video with Editorialized Text on the Web
Auto Categorization
Semantic Metadata
HP 51
Automatic Categorization & Metadata Tagging (Feed)
Text From Bloomberg
Auto Categorization
Semantic Metadata
HP 52
Taalee Extraction and Knowledgebase Enhancement
Extraction Agent
Web Page Enhanced Metadata Asset
HP 53
Basis for Semantics
A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary
B. Facts with Relationships: Taxonomy (Categories), Ontology/Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)
Knowledge Base
HP 54
Basis for Semantics
C. Reasoning/Inference
(Statistical) Information Retrieval
Statistical Learning/AI (Bayesian, Neural Networks, HMM, …)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methods/Cluster Analysis
Learning/AI and Collaborative Filtering
Reference data/Concept-terms/Dictionary/Thesaurus: By topic/industry/subject/domain
Word or Phrase
Ontologies/Domain Models
KnowledgeBase: By Entities and Relationships
deeper understanding
HP 56
Open Directory Project (ODP): Classification/Taxonomy & Directory
HP 57
Ontology
Standardize meaning, description, and representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes: Attributes, Domain Rules, Functional Dependencies
HP 59
An Ontology
Example: Interrelated ontologies
[Figure: four interrelated ontologies; node labels as recovered from the diagram]
LANDUSE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL
LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE
WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; shredding, magnetic separation, screening, washing
NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI; several nodes are linked by "causes" relationships
HP 61
Large Vocabularies, Taxonomies/Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.
Metadata-enabled Applications
HP 63
Metadata Usage: Impact on Search & Query processing
traditional queries based on keywords
attribute based queries
content-based queries
HP 64
Oingo.com
Oingo Ontology – ODP based(?): the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely-related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged, and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:
• Context of the Content
• Semantic Metadata describing entities (e.g., Company, Industry) and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
"Normal" Content can only be "found" if the user enters a keyword that exists within it
Intelligent Content=+
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:
Context of the Content
Semantic Metadata describing entities (e.g., Company, Industry) and
Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata amp Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …
HP 70
Metadata Usage Keyword Attribute and Content Based Access
The VisualHarness system at LSDIS/UGA
HP 71
Keyword Search vs Attribute Search with Semantic metadata
Virage Search on football touchdown
Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline…
Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline…
Metadata from Typical Cataloging of Football Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31, Pit 24
http://www.nfl.com
Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Qadry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
HP 72
Taalee's Semantic Search
Highly customizable, precise, and freshest A/V search
Context and Domain Specific Attributes
Uniform Metadata for Content from Multiple Sources; can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do?
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY amp CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content
Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries amp Hot Topics
PERSONALIZATION
3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
4 Pink Floyd Collection
Personalized Queries
Set the Controls for the Heart of the Sun…
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
2 My Football Fantasy Team
Gators' Spurrier ready for big game
Tech's Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone…
Edwards explains reasons for leaving BYU more…
more…
more…
more…
1 Election 2000
2 Middle East Peace Conflict
3 Napster Controversy
Video Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Cole's security
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II more…
HOT Topics
more…
more…
HP 80
Metadata Targeting
SemanticInteractive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
Realtime Feeds
Interests, Preferences
Time-Shifted Content Aggregator
Web sites and Pages
Content Databases
Personalized Content
Semantic Engine™
Personalized Content
Content
Structured Hi-Quality Semantic Metabase
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site
User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and the knowledge of user's portfolio. Typically search can be done by typing one word or selecting from a dynamic personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst
HP 87
iTV: Taalee's Extreme Personalization
Content Provider
(DBS, DISH, Wink, AOL-TV)
Semantic Engine™
Meta-Data Tagged Content
Content "Programs"
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-Quality Semantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by personalization application to show only the stocks that user is interested in
This screen is customizable with interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO
Part of the screen can be automatically customized to show conference call specific information – including transcript, participation, etc., all of which are relevant metadata
Conference Call itself can have embedded metadata to support personalization and interactivity
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support
Sony
Categorize
Catalog
Integrate
Collection, Processing
Network Content
Affiliate Feeds
Public Sources
Rich Data Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) – 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) – Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore
HP 92
Metadata's role in emerging iTV infrastructure
[Figure labels: Video; MPEG-2/4/7 MPEG Encoder; Create Scene Description Tree; Scene Description Tree; Node = AVO Object; Taalee Semantic Engine; Enhanced XML Description; Node "Cisco Systems"; Object Content Information (OCI); Metadata-rich Value-added Node; Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; GREAT USER EXPERIENCE]
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers, and Broadcasters
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta
HP 93
Intelligent Metadata Creation
Extractor Agents: Content which does contain the words the user asked for
+ Value-added Metadata: Content which does not contain the words the user asked for, but is about what he asked for
+ Semantic Associations: Content the user did not think to ask for, but which he needs to know
Metadata for Intelligent Content
Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example: If a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"
Understanding of the content is needed to create new metadata
Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
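The Roger Clemens example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `KNOWLEDGE_BASE`, `enrich`, `keyword_match`, and `metadata_match` are hypothetical names, the one-entry knowledge base is made up from the facts on this slide, and nothing here reflects how Taalee's engine is actually built.

```python
# Hypothetical miniature knowledge base: entity name -> known facts.
KNOWLEDGE_BASE = {
    "Roger Clemens": {
        "type": "PERSON",
        "sport": "Baseball",
        "team": "New York Yankees",
    },
}


def enrich(text, kb=KNOWLEDGE_BASE):
    """Build metadata from entities found in the text plus facts the text omits."""
    metadata = {"players": [], "teams": [], "sports": []}
    for name, facts in kb.items():
        if name not in text:
            continue
        metadata["players"].append(name)           # extracted from the text
        if facts["team"] not in metadata["teams"]:
            metadata["teams"].append(facts["team"])    # value-added
        if facts["sport"] not in metadata["sports"]:
            metadata["sports"].append(facts["sport"])  # value-added
    return metadata


def keyword_match(text, query):
    """Syntactic search: the query must literally occur in the content."""
    return query.lower() in text.lower()


def metadata_match(metadata, query):
    """Attribute search over the enriched metadata values."""
    return any(query.lower() in value.lower()
               for values in metadata.values() for value in values)


story = "Roger Clemens struck out ten batters last night."
story_metadata = enrich(story)
```

Here a keyword search for "Yankees" misses the story, while the same query against `story_metadata` finds it through the value-added team name.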
HP 96
Guided Demo for Value Added Metadata – Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL
Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player of Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata – Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League
Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player of Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 – Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 – Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content – Value-Added Metadata
Rich Media Sports Asset
Posted Date: Date of asset posting – Extracted automatically
Producer Name: Name of content provider that produced the asset
Player Names: Names of players mentioned explicitly in the asset – Extracted automatically
Team Name: Name of team for which player plays – Not mentioned explicitly in asset – Value-added using Taalee's semantic relationships
League Name: Name of league to which the player's team belongs – Not mentioned explicitly in asset – Value-added by Taalee's processing based on semantic associations
Sport: Name of sport
Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset
Some Metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships
The asset is richly, fully described in the many ways the users chose to interact
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context, or relationships of keywords
For example, a search engine may see that the phrase "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
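A small Python sketch of how such associations could be stored and used to surface related stories. The triple layout, the relation names, and the functions `competitors` and `related_stories` are assumptions made for illustration, and the story titles are invented; only the Commerce One facts come from this slide.

```python
# Hypothetical triples: (subject, relationship, object).
RELATIONS = [
    ("Commerce One", "IS_A", "COMPANY"),
    ("Commerce One", "PARTICIPATES_IN",
     "Corporate, Professional & Financial Software"),
    ("Commerce One", "COMPETES_WITH", "Ariba"),
]


def competitors(company, relations=RELATIONS):
    """COMPETES_WITH is treated as symmetric: follow it in both directions."""
    found = set()
    for subject, relationship, obj in relations:
        if relationship != "COMPETES_WITH":
            continue
        if subject == company:
            found.add(obj)
        elif obj == company:
            found.add(subject)
    return sorted(found)


def related_stories(company, stories, relations=RELATIONS):
    """Return titles of stories about the company's competitors."""
    rivals = set(competitors(company, relations))
    return [title for title, companies in stories if rivals & set(companies)]


# Made-up catalog entries: (title, companies tagged in the metadata).
stories = [
    ("Ariba signs marketplace deal", ["Ariba"]),
    ("Commerce One quarterly results", ["Commerce One"]),
    ("Weekend weather outlook", []),
]
rival_news = related_stories("Commerce One", stories)
```

A user who searched only for Commerce One would also be offered the Ariba story, which a keyword index alone would not connect.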
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (To view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata-centric Content Management Architecture
[Figure labels: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters); Extractor Agent 1; Extractor Agent 2; Extractor Agent 3; Semantic Engine; World Model; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication; Story on Cisco; Story on Lucent; XCM-compliant metadata, XML or other format]
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults KnowledgeBase for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story; passed on to Dashboard
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News: COMPANIES in Same or Related INDUSTRY
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS
Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF – one proposal (cf. Ora Lassila)
Structural modeling obviously not enough
we need a "logic layer" on top of RDF
some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture" [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]
A personal definition
Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)?
a proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF+XML
Agent readable Tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three layered Architecture Of Semantic Web
Logical Layer
Formal Semantics and Reasoning Support – OIL, DAML-O
Schema Layer
Definition of Vocabulary – RDF Schema
Data Layer
Simple data model and syntax for metadata – RDF
OIL – as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL – Evolving towards Semantic Web
OIL Mission
OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Test May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake
<= dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
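The rule above translates almost directly into code. The Python sketch below assumes that `dateDifference` is measured in days, that `distance` is a great-circle distance in miles (computed here with the haversine formula), and that the earthquake must not precede the test; the record layout and the function names are hypothetical, not InfoQuilt's.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, via the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """NuclearTest Causes Earthquake <= dateDifference < 30 AND distance < 10000."""
    days = (quake["eventDate"] - test["eventDate"]).days
    if not 0 <= days < max_days:
        return False
    return distance_miles(test["latitude"], test["longitude"],
                          quake["latitude"], quake["longitude"]) < max_miles


# Made-up records, for illustration only.
test_event = {"eventDate": date(1998, 5, 11), "latitude": 27.0, "longitude": 72.0}
quake_event = {"eventDate": date(1998, 5, 30), "latitude": 37.0, "longitude": 70.0}
candidate = may_have_caused(test_event, quake_event)
```

The thresholds are kept as parameters so the same predicate can be re-run with a tighter time window or radius.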
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year starting from 1900
Increase in number of earthquakes since 1945
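The per-year query is a filter followed by a group-and-count. A minimal Python sketch, assuming each earthquake record carries a `year` and a `magnitude` field; the field names, the function `quakes_per_year`, and the sample records are made up for illustration.

```python
from collections import Counter


def quakes_per_year(quakes, min_magnitude=5.8, start_year=1900):
    """Count earthquakes at or above min_magnitude for each year >= start_year."""
    counts = Counter(
        quake["year"]
        for quake in quakes
        if quake["magnitude"] >= min_magnitude and quake["year"] >= start_year
    )
    return dict(sorted(counts.items()))


# Made-up sample records, not real seismic data.
sample = [
    {"year": 1899, "magnitude": 7.1},
    {"year": 1901, "magnitude": 6.0},
    {"year": 1901, "magnitude": 5.2},
    {"year": 1946, "magnitude": 5.8},
    {"year": 1946, "magnitude": 6.4},
]
yearly = quakes_per_year(sample)
```

Comparing the counts before and after a given year is then a matter of summing the relevant entries of `yearly`.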
Knowledge Discovery - Example…
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
Knowledge Discovery - Example…
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 29
MPEG7
A set of description schemes and descriptors to describe the content of multimedia data
Provides a language to specify description schemes
A scheme for coding the description
HP 30
Application Examples for MPEG7
A few application examples are
Digital libraries (image catalog musical dictionary)
Multimedia directory services (e.g., yellow pages)
Broadcast media selection (radio channel, TV channel)
HP 31
Information and Content Exchange (ICE)
Main Goal: efficient and extensible Content Syndication protocol for the Internet using XML syntax
Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.
Status: latest spec version 1.1, May 2000; submitted to W3C for review
Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, …
Web Site: http://www.icestandard.org
HP 32
What is the ICE Protocol
Syndication Protocol for communication between Syndicators and Subscribers
Metadata to define:
roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver
format and method of content exchange (e.g., sequenced packages, pull vs. push model)
HP 33
ICE Applications
ICE vocabulary + domain vocabulary = complete application
ICE establishes and manages the syndication, delivers data, and logs events => content-independent metadata
Industry-specific vocabulary defines the content => domain-specific metadata
Source: http://www.icestandard.org
HP 34
ICE Explained
ICE: Information and Content Exchange protocol
Syndicator: A content aggregator and distributor
Subscriber: A content consumer
Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement
Collection: The current content of a subscription
ICE Package: A delivery of commands to update a collection, such as the addition of content items
ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information
Sources: InternetWeek; ICE Cookbook, version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm
<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe"
    timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321"
          subscription-element="4321" name="Cartoon" filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru" copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC … (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>
Content (domain-specific metadata)
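To make the split between content-independent and domain-specific metadata concrete, the following sketch parses a payload shaped like the one on this slide and separates the ICE item attributes from the attributes of the embedded domain element. It is an illustration only: the `split_metadata` function is hypothetical, the payload is abridged (no DOCTYPE, shortened image data), and only Python's standard `xml.etree.ElementTree` module is used.

```python
import xml.etree.ElementTree as ET

# Abridged payload modelled on the slide; the DOCTYPE line is omitted.
PAYLOAD = """<?xml version="1.0"?>
<ice-payload payload-id="ipl-80a56cfe" timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon"
                filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru"
                     copyright="Taalee Makeups" pubdate="20010515">PdXIWZQ8</comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>"""


def split_metadata(payload_xml):
    """Return one (ice_metadata, domain_metadata) pair per ice-item.

    ice_metadata holds the attributes of the ICE item itself
    (content-independent); domain_metadata holds the attributes of each
    child element supplied by the domain vocabulary.
    """
    root = ET.fromstring(payload_xml)
    results = []
    for item in root.iter("ice-item"):
        ice_metadata = dict(item.attrib)
        domain_metadata = [{"element": child.tag, **child.attrib} for child in item]
        results.append((ice_metadata, domain_metadata))
    return results
```

Calling `split_metadata(PAYLOAD)` yields the item's delivery attributes on one side and the `comic-strip` description (title, author, copyright, publication date) on the other.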
HP 36
XCM (eXtended Content Management)
a framework that allows customers to classify content management offerings according to the business problems they address The segments of XCM are
Content Development - Developing static content and managing the process of its subsequent approval versioning storage and retrieval
Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle
Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability
Source: http://www.vignette.com/CDA/Site/0,2097,1-1-30-1458-1146-1743,00.html
HP 37
XCM: eXtended Content Management
[Figure: the three XCM segments (Content Development, Application Content Management, Content Delivery) with component labels: Content Authoring, Digital Asset Management, Software Configuration Management, Document Process Management, Metadata Management, Recombination, Personalization, Edge Network Delivery, Streaming Media Delivery, Caching]
Source: http://www.vignette.com
HP 38
Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain (example data: Kansas State)

FGDC Metadata Model
  Theme keywords: digital line graph, hydrography, transportation
  Title: Dakota Aquifer
  Online linkage: http://gisdasc.kgs.ukans.edu/dasc
  Direct Spatial Reference Method: Vector
  Horizontal Coordinate System Definition: Universal Transverse Mercator
  …

UDK Metadata Model
  Search terms: digital line graph, hydrography, transportation
  Topic: Dakota Aquifer
  Adress Id: http://gisdasc.kgs.ukans.edu/dasc
  Measuring Techniques: Vector
  Co-ordinate System: Universal Transverse Mercator
  …
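One common way to cope with such heterogeneity is to map each model's tag names onto a shared vocabulary. The sketch below does this for the two records on this slide; the common field names (`keywords`, `title`, `url`, and so on) and the `normalize` function are assumptions made for illustration and are not part of FGDC or UDK.

```python
# Tag-name mappings from each metadata model to a hypothetical common vocabulary.
FGDC_TO_COMMON = {
    "Theme keywords": "keywords",
    "Title": "title",
    "Online linkage": "url",
    "Direct Spatial Reference Method": "spatial_representation",
    "Horizontal Coordinate System Definition": "coordinate_system",
}

UDK_TO_COMMON = {
    "Search terms": "keywords",
    "Topic": "title",
    "Adress Id": "url",
    "Measuring Techniques": "spatial_representation",
    "Co-ordinate System": "coordinate_system",
}

# The same data set described in both models, with values as given on the slide.
FGDC_RECORD = {
    "Theme keywords": "digital line graph, hydrography, transportation",
    "Title": "Dakota Aquifer",
    "Online linkage": "http://gisdasc.kgs.ukans.edu/dasc",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}

UDK_RECORD = {
    "Search terms": "digital line graph, hydrography, transportation",
    "Topic": "Dakota Aquifer",
    "Adress Id": "http://gisdasc.kgs.ukans.edu/dasc",
    "Measuring Techniques": "Vector",
    "Co-ordinate System": "Universal Transverse Mercator",
}


def normalize(record, mapping):
    """Rename the tags of a record to the common vocabulary, dropping unmapped tags."""
    return {mapping[tag]: value for tag, value in record.items() if tag in mapping}
```

After normalization the two records compare equal, which is what lets a single query run against both catalogs.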
HP 39
Different views of Metadata
Domain Independent Specifications (RDF)
Frameworks/Infrastructures (XCM)
Application Specific: ICE
Media Specific: MPEG7, VoiceXML
Domain Specific: NewsML, FGDC/UDK
HP 40
Creating and Serving Metadata to Power the Life-cycle of Content
Taalee Infrastructure Services; Taalee Content Applications
Where is the content?
Whose is it?
Produce/Aggregate
Catalog/Index
What other content is it related to?
Integrate, Syndicate
What is the right content for this user?
Personalize
What is the best way to monetize this interaction?
Interactive Marketing
Broadcast/Wireline/Wireless/Interactive TV
Taalee Semantic MetaBase
HP 41
Taalee's Intelligent Content Process
HP 42
Metadata Creation and Semanticization
• Automatic Content Classification/Categorization
• Metadata Creation/Extraction: types of metadata created
Semantic Engine and WorldModel are trademarks of Taalee Inc. Metadata Extraction is a patented technology of Taalee Inc.
HP 43
Forms/Types/Ingest of Content
Sources: Web Sites, Content Feeds, and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)
HP 44
Content Handling/Ingest
Infrastructure/Exchange: Feed Handlers, Crawlers, Screen Scrapers, Bots/Software Agents
Centralized, Distributed, Mobile/Migratory
HP 45
Information Extraction for Metadata Creation
Global/Enterprise/Web Repositories
METADATA EXTRACTORS
Digital Maps
Nexis, UPI, AP
Documents
Digital Audios
Data Stores
Digital Videos
Digital Images
HP 46
Extracting a Text Document: Syntactic approach
INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT
NATIONAL PREPAREDNESS LEVEL II
CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.
SIMELS, Galena District, BLM: This fire is on the east side of the Innoko Flats, between Galena and McGr. The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.
CHINIKLIK MOUNTAIN, Galena District, BLM: A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.
LAYOUT
Date => day month int ',' int
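A layout rule of this kind maps naturally onto a regular expression. The sketch below is one possible rendering of the Date rule, not the extractor used in the original system: the `extract_date` function and the group names are hypothetical, and the commas are treated as optional so that the rule also matches text whose punctuation was lost.

```python
import re

DAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"
MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")

# Date => day month int ',' int
DATE_RULE = re.compile(
    rf"(?P<day>{DAYS}),?\s+(?P<month>{MONTHS})\s+(?P<day_of_month>\d{{1,2}}),?\s+(?P<year>\d{{4}})"
)


def extract_date(text):
    """Return the first date matching the layout rule as a dict, or None."""
    match = DATE_RULE.search(text)
    return match.groupdict() if match else None
```

Applied to the report header, `extract_date("Friday, August 1, 1997 - 0530 MDT")` returns the day name, month, day of month, and year as separate metadata fields. The approach is purely syntactic: it recognizes the pattern without knowing what the date refers to.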
Traditional Text Categorization
[Figure: a customer article feed (Article 4715) is classified with statistical/AI techniques, trained on a customer training set, and placed in a taxonomy for routing/distribution]
Standard metadata for Article 4715:
  Feed Source: iSyndicate
  Posted Date: 11/20/2000

Taalee's Categorization & Automatic Metadata Creation
Automated Content Enrichment (ACE); Taalee Enterprise Customization Suite
[Figure: Article 4715 from the article feed is classified with knowledge-base & statistical/AI techniques, trained on the Taalee training set and the customer training set, placed in a taxonomy (optionally mapped to another taxonomy), and recorded in a metadata catalog used by the content manager for precise syndication/filtering and routing/distribution]
Article 4715 metadata:
  Standard metadata
    Feed Source: iSyndicate
    Posted Date: 11/20/2000
  Semantic metadata
    Company Name: France Telecom, Equant
    Ticker Symbol: FTE, ENT
    Exchange: NYSE
    Topic: Company News
Classification of Article 4715:
  FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
  ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
  NYSE: Member Companies, Market News, IPOs
HP 49
Automatic Categorization & Metadata Tagging (unstructured text/transcript of AV)
ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION AS OF TONIGHT THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49 IN WASHINGTON STATE THE SENATE RACE REMAINS TOO CLOSE TO CALL IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT THE SENATE WILL BE EVENLY DIVIDED IN MISSOURI REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN WHO DIED IN A CRASH THREE WEEKS AGO GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT HILLARY CLINTON WON THE NEW YORK SENATE SEAT SHE IS THE FIRST FIRST LADY TO RUN MUCH LESS WIN
Video Segment with Associated Text
Segment Description
Semantic Metadata
Auto Categorization
HP 50
Automatic Categorization & Metadata Tagging (Web page)
Video with Editorialized Text on the Web
Auto Categorization
Semantic Metadata
HP 51
Automatic Categorization & Metadata Tagging (Feed)
Text From Bloomberg
Auto Categorization
Semantic Metadata
HP 52
Taalee Extraction and Knowledgebase Enhancement
Extraction Agent
Web Page Enhanced Metadata Asset
HP 53
Basis for Semantics
A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary
B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g. Golf = golfer, tournament name, golf course, event), Knowledge Base
HP 54
Basis for Semantics
C. Reasoning/Inference
(Statistical) Information Retrieval
Statistical Learning/AI (Bayesian, Neural Networks, HMM, …)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methods/Cluster Analysis
Learning/AI and Collaborative Filtering
Reference data/Concept-terms/Dictionary/Thesaurus, by topic/industry/subject/domain (Word or Phrase)
Ontologies/Domain Models
KnowledgeBase, by Entities and Relationships (deeper understanding)
HP 56
Open Directory Project (ODP): Classification/Taxonomy & Directory
HP 57
Ontology
Standardize meaning, description, and representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes: Attributes, Domain Rules, Functional Dependencies
HP 59
An Ontology
Example: Interrelated ontologies
[Figure: four interrelated ontologies, with the following labels
LANDUSE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL
LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE
WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; processes: shredding, magnetic separation, screening, washing
NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI, connected by "causes" relationships]
HP 61
Large Vocabularies, Taxonomies, Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.
Confidential HP
Metadata-enabled Applications
HP 63
Metadata Usage: Impact on Search & Query processing
traditional queries based on keywords
attribute-based queries
content-based queries
HP 64
Oingo.com
Oingo Ontology - ODP based(?); the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning used to state intent; the basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged, and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:
• Context of the Content
• Semantic Metadata describing entities (e.g. Company, Industry, etc.), and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
"Normal" Content can only be "found" if the user enters a keyword that exists within it
[Figure: Intelligent Content = content + metadata]
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:
Context of the Content
Semantic Metadata describing entities (e.g. Company, Industry, etc.), and
Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g. Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …
HP 70
Metadata Usage: Keyword, Attribute, and Content Based Access
The VisualHarness system at LSDIS/UGA
HP 71
Keyword Search vs Attribute Search with Semantic metadata
Virage: search on "football touchdown"
  Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: httpcbssportsline
  Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: httpcbssportsline
Metadata from Typical Cataloging of Football Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
  Baltimore 31, Pit 24
  http://www.nfl.com
  Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter
  League: Professional
  Teams: Ravens, Steelers
  Score: Bal 31, Pit 24
  Players: Quandry Ismail, Tony Banks
  Event: Touchdown
  Produced by: NFL.com
  Posted date: 2/02/2000
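The difference between the two search styles can be shown with a small sketch. Everything below is illustrative: the `ASSETS` list paraphrases the three assets on this slide, and `keyword_search` and `attribute_search` are hypothetical helpers, not the Virage or Taalee interfaces.

```python
# Three assets modelled on the slide; only the last one carries semantic metadata.
ASSETS = [
    {"title": "Jimmy Smith Interview Part Seven",
     "description": "Jimmy Smith explains his philosophy on showboating",
     "metadata": {}},
    {"title": "Brian Griese Interview Part Four",
     "description": "Brian Griese talks about the first touchdown he ever threw",
     "metadata": {}},
    {"title": "Baltimore 31, Pit 24",
     "description": "Quandry Ismail and Tony Banks hook up for their third long touchdown",
     "metadata": {"League": "Professional", "Teams": ["Ravens", "Steelers"],
                  "Players": ["Quandry Ismail", "Tony Banks"],
                  "Event": "Touchdown", "Produced by": "NFL.com"}},
]


def keyword_search(assets, words):
    """Match assets whose title or description contains any of the words."""
    hits = []
    for asset in assets:
        text = (asset["title"] + " " + asset["description"]).lower()
        if any(word.lower() in text for word in words):
            hits.append(asset["title"])
    return hits


def attribute_search(assets, criteria):
    """Match assets whose metadata satisfies every attribute=value criterion."""
    hits = []
    for asset in assets:
        meta = asset["metadata"]

        def satisfied(attr, wanted):
            value = meta.get(attr)
            return wanted in value if isinstance(value, list) else value == wanted

        if all(satisfied(attr, wanted) for attr, wanted in criteria.items()):
            hits.append(asset["title"])
    return hits
```

A keyword search for "touchdown" returns both the interview and the game highlight, since both mention the word; an attribute search for `{"Event": "Touchdown", "Teams": "Ravens"}` returns only the highlight, because it is the only asset catalogued with those attributes.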
HP 72
Taalee's Semantic Search
Highly customizable, precise, and freshest A/V search
Context and Domain Specific Attributes
Uniform Metadata for Content from Multiple Sources; can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do?
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY amp CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content
Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries & Hot Topics
PERSONALIZATION
Personalized Queries
1 My Stock Portfolio: Microsoft suffers serious hack attack; Cisco Systems Inc; Analyst Safa Rashtchy on Yahoo; PeopleSoft Inc; AT&T Corp
2 My Football Fantasy Team: Gators' Spurrier ready for big game; Tech's Vick looks to become complete QB; Bucs excited about Hamilton; Jasper Sanks rumbles into the end zone…; Edwards explains reasons for leaving BYU
3 Julia Roberts Collection: Movie Trailer Notting Hill; Trailer - Runaway Bride; Patrick; Movie Trailer Stepmom; Conspiracy Theory
4 Pink Floyd Collection: Set the Controls for the Heart of the Sun…; Wish You Were Here; Round And Around; Keep Talking; The Post War Dream
HOT Topics
1 Election 2000: Video Explaining the electoral map; Race for White House hots up; Seniors Give Gore Florida Edge
2 Middle East Peace Conflict: More die as Israel steps up security; Israel braces for suicide bombs; Pentagon probes Cole's security
3 Napster Controversy: The Brain Behind Napster; Napster Lawsuit; Creative Nomad II
HP 80
Metadata Targeting
Semantic/Interactive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web: Extreme Personalization
[Figure labels: Realtime Feeds; Web sites and Pages; Content Databases; Time-Shifted Content Aggregator; Interests, Preferences; Semantic Engine(TM); Structured Hi-Quality Semantic Metabase; Content; Personalized Content]
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site
User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g. Analyst Call focuses on the source and the analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst
HP 87
iTV: Taalee's Extreme Personalization
[Figure labels: Content Provider (DBS, DISH, Wink, AOL-TV); Semantic Engine(TM); Meta-Data Tagged Content; Content "Programs"; Immediate Interests; Preferences; Personalized Content Capsules; Redirects and Programming; Structured Hi-Quality Semantic Metabase]
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in
This screen is customizable with interactivity features using metadata, such as whether there is a new Conference Call video on CSCO
Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata
The Conference Call itself can have embedded metadata to support personalization and interactivity
HP 89
Metadata in Enterprise Apps
[Figure labels: Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication; Production Support; Sony; Categorize, Catalog, Integrate; Collection, Processing; Network Content; Affiliate Feeds; Public Sources; Rich Data Metabase]
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction, a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore
HP 92
Metadata's role in emerging iTV infrastructure
[Figure: enhanced digital cable pipeline, with labels: Video; MPEG Encoder (MPEG-2/4/7); Create Scene Description Tree; Scene Description Tree (Node = AVO Object); Taalee Semantic Engine; Enhanced XML Description; "Cisco Systems" Node; Metadata-rich Value-added Node; Object Content Information (OCI); MPEG Decoder; Retrieve Scene Description Track; GREAT USER EXPERIENCE]
Example node metadata: Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: John Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers and Broadcasters
HP 93
Intelligent Metadata Creation
Extractor Agents: content which does contain the words the user asked for
+ Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for
+ Semantic Associations: content the user did not think to ask for, but which he needs to know
Metadata for Intelligent Content
Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"
Understanding of the content is needed to create new metadata
Taalee understands that Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses such Semantic Associations (another example: COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely
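The Roger Clemens example can be sketched as a lookup against a small knowledge base. This is an assumption-laden illustration, not Taalee's implementation: `KNOWLEDGE_BASE`, `enrich`, and `matches` are hypothetical names, and the knowledge base holds only the facts stated on the slide.

```python
# Hypothetical knowledge base holding the facts stated on the slide.
KNOWLEDGE_BASE = {
    "Roger Clemens": {"Type": "PERSON", "Sport": "Baseball", "Team": "New York Yankees"},
}


def enrich(extracted):
    """Return a copy of the extracted metadata with value-added fields.

    For every person found in the content, the sport and team known to the
    knowledge base are added even though the content never mentions them.
    """
    enriched = {field: list(values) for field, values in extracted.items()}
    for person in extracted.get("People", []):
        facts = KNOWLEDGE_BASE.get(person, {})
        for field in ("Sport", "Team"):
            if field in facts and facts[field] not in enriched.setdefault(field, []):
                enriched[field].append(facts[field])
    return enriched


def matches(metadata, query):
    """True if any metadata value contains the query string (case-insensitive)."""
    query = query.lower()
    return any(query in value.lower() for values in metadata.values() for value in values)
```

With only the extracted metadata `{"People": ["Roger Clemens"]}`, a search for "Yankees" finds nothing; after `enrich`, the same search succeeds because the team was added from the knowledge base.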
HP 96
Guided Demo for Value Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team = Atlanta Falcons or League = NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team = Los Angeles Dodgers or League = National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 - Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Producer Name: name of content provider that produced the asset
Posted Date: date of asset posting; extracted automatically
Player Names: names of players mentioned explicitly in the asset; extracted automatically
Team Name: name of team for which the player plays; not mentioned explicitly in the asset; value-added using Taalee's semantic relationships
League Name: name of league to which the player's team belongs; not mentioned explicitly in the asset; value-added by Taalee's processing based on semantic associations
Sport: name of sport
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships
The asset is richly and fully described in the many ways the users choose to interact
Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context, or relationships of keywords
For example, a search engine may see that the phrase "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
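The Commerce One example can be written down as a handful of subject-predicate-object facts. The sketch below is illustrative only: the predicate names, the `STORIES` list, and the helper functions are hypothetical, and the competes-with relation is assumed to be symmetric.

```python
# Facts from the slide, written as (subject, predicate, object) triples.
TRIPLES = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Commerce One", "participates_in", "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
    ("Ariba", "is_a", "COMPANY"),
]

# Hypothetical catalogued stories, each tagged with the company it is about.
STORIES = [
    {"headline": "Commerce One story (hypothetical)", "about": "Commerce One"},
    {"headline": "Ariba story (hypothetical)", "about": "Ariba"},
    {"headline": "Unrelated story (hypothetical)", "about": "Cisco"},
]


def objects(subject, predicate):
    """All objects linked to the subject by the given predicate."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def competitors(company):
    """Competitors of a company, treating competes_with as symmetric."""
    found = set(objects(company, "competes_with"))
    found.update(s for s, p, o in TRIPLES if p == "competes_with" and o == company)
    return found


def related_stories(company):
    """Stories about the company itself plus stories about its competitors."""
    wanted = {company} | competitors(company)
    return [story["headline"] for story in STORIES if story["about"] in wanted]
```

A query for Commerce One then returns the Ariba story as well, which a keyword index would miss unless the story happened to mention Commerce One by name.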
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata-centric Content Management Architecture
[Figure: content sources (Internal Source 1: Research; Internal Source 2; External feeds/Web, e.g. Reuters) are processed by Extractor Agents 1-3 into the Taalee Metabase; the Semantic Engine and World Model support a Semantic Application (ASP/Enterprise hosted) and Third-party Content Mgmt and Syndication; output is XCM-compliant metadata, XML, or other format]
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults KnowledgeBase for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to the Cisco story; passed on to Dashboard
Result: Story on Cisco, Story on Lucent
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
  Related Stock News
  Industry News
  Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
  Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY
  Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for the future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g. InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]
A personal definition: Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-layered Architecture of Semantic Web
Logical Layer: Formal Semantics and Reasoning Support - OIL, DAML-O
Schema Layer: Definition of Vocabulary - RDF Schema
Data Layer: Simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
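Read informally, the OIL fragment above defines a herbivore as an animal that is not a carnivore. The sketch below mirrors that definition as a plain membership test. It is only an illustration: it assumes a closed world (a class that is not asserted is treated as not holding), which is weaker than what a description-logic reasoner provides, and `is_herbivore` is a hypothetical helper name.

```python
def is_herbivore(asserted_classes):
    """herbivore = animal AND NOT carnivore, under a closed-world assumption."""
    return "animal" in asserted_classes and "carnivore" not in asserted_classes

cow = {"animal"}
lion = {"animal", "carnivore"}
fern = {"plant"}
results = [is_herbivore(cow), is_herbivore(lion), is_herbivore(fern)]
```

A real reasoner would also derive consequences in the other direction, for example that anything classified as a herbivore is an animal.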
DAML and OIL - Evolving towards Semantic Web
OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Tests May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake <=
    dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30
    AND distance(NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude) < 10000
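The rule above can be written as an executable predicate. The sketch below is one possible reading, not the InfoQuilt implementation: it assumes `dateDifference` is measured in days from the test to the earthquake, that `distance` is great-circle distance in miles, and that events are plain dictionaries. The sample events are made-up values for illustration only.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8

def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, using the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))

def could_have_caused(test, quake, max_days=30, max_miles=10000):
    """True if the quake follows the test within max_days and lies within max_miles."""
    days = (quake["eventDate"] - test["eventDate"]).days
    if not 0 <= days < max_days:
        return False
    d = distance_miles(test["latitude"], test["longitude"],
                       quake["latitude"], quake["longitude"])
    return d < max_miles

# Made-up events, for illustration only.
test_event = {"eventDate": date(2000, 1, 1), "latitude": 10.0, "longitude": 20.0}
quake_after = {"eventDate": date(2000, 1, 15), "latitude": 12.0, "longitude": 22.0}
quake_before = {"eventDate": date(1999, 12, 20), "latitude": 12.0, "longitude": 22.0}
```

Here `could_have_caused(test_event, quake_after)` holds, while the earthquake dated before the test is rejected by the date condition.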
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example…
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale per year, starting from 1900, find the average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
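The grouping query above can be sketched as a bucketed count divided by the number of years. This is a minimal illustration: the `(year, magnitude)` record layout and the function names are assumptions, the handling of range boundaries (lower bound inclusive) is a choice made here, and the sample data is made up.

```python
from collections import Counter

# Magnitude ranges from the slide; the last one is open-ended.
RANGES = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, float("inf"))]

def magnitude_range(magnitude):
    """Return the (low, high) range containing magnitude, or None if below 5.8."""
    for low, high in RANGES:
        if low <= magnitude < high:
            return (low, high)
    return None

def average_per_year(quakes, first_year, last_year):
    """Average yearly earthquake count per magnitude range over the given years."""
    totals = Counter()
    for year, magnitude in quakes:
        bucket = magnitude_range(magnitude)
        if bucket is not None and first_year <= year <= last_year:
            totals[bucket] += 1
    n_years = last_year - first_year + 1
    return {bucket: totals[bucket] / n_years for bucket in RANGES}

# Made-up (year, magnitude) records, for illustration only.
quakes = [(1900, 5.9), (1900, 6.5), (1901, 6.2), (1901, 7.4), (1901, 4.0)]
averages = average_per_year(quakes, 1900, 1901)
```

Comparing these averages for the years before and after a chosen cutoff is one way to check the slide's observation for each magnitude range.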
Knowledge Discovery - Example…
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and the test site lay within a radius of 10000 miles of the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NewsML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
Semantic Web: www.semanticweb.org
VoiceXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media. Amit Sheth and Wolfgang Klas, Eds. McGraw Hill, ISBN 0-07-057735-8, 1998
HP 30
Application Examples for MPEG7
A few application examples are
Digital libraries (image catalog musical dictionary)
Multimedia directory services (eg yellow pages)
Broadcast media selection (radio channel TV channel)
HP 31
Information and Content Exchange (ICE)
Main Goal: efficient and extensible Content Syndication protocol for the Internet, using XML syntax
Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.
Status: latest spec version 1.1, May 2000; submitted to W3C for review
Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, …
Web Site: http://www.icestandard.org
HP 32
What is the ICE Protocol?
Syndication Protocol for communication between Syndicators and Subscribers
Metadata to define:
roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver
format and method of content exchange (e.g., sequenced packages, pull vs. push model)
HP 33
ICE Applications
ICE vocabulary + domain vocabulary = complete application
ICE establishes and manages the syndication, delivers data, logs events => content-independent metadata
industry-specific vocabulary defines the content => domain-specific metadata
Source: http://www.icestandard.org
HP 34
ICE Explained
ICE: Information and Content Exchange protocol
Syndicator: A content aggregator and distributor
Subscriber: A content consumer
Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement
Collection: The current content of a subscription
ICE Package: A delivery of commands to update a collection, such as the addition of content items
ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information
Sources: InternetWeek; ICE Cookbook version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm
<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe" timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon" filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru" copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC … (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>
Content (domain-specific metadata)
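To make the split between the ICE envelope and the domain content concrete, the sketch below parses a simplified payload modelled on the example above and separates the two kinds of metadata. It is an illustration only: the payload is trimmed, the attribute spellings (such as `ice.version`) are assumed rather than checked against the ICE DTD, and the variable names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified payload modelled on the slide's example (attribute names assumed).
PAYLOAD = """<ice-payload payload-id="ipl-80a56cfe" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" name="Cartoon" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru" pubdate="20010515"/>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>"""

root = ET.fromstring(PAYLOAD)
item = root.find(".//ice-item")

# Attributes of the ICE wrapper: content-independent metadata.
envelope_metadata = dict(item.attrib)

# The first child element belongs to the domain vocabulary: domain-specific metadata.
content = item[0]
domain_metadata = dict(content.attrib)
```

The ICE layer never needs to understand `comic-strip`; it only delivers the item, while a subscriber that knows the domain vocabulary interprets the inner element.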
HP 36
XCM (eXtended Content Management)
A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:
Content Development - Developing static content and managing the process of its subsequent approval, versioning, storage, and retrieval
Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle
Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability
Source httpwwwvignettecomCDASite020971-1-30-1458-1146-174300html
HP 37
XCM: eXtended Content Management
Content Development, Application Content Management, Content Delivery
Content Authoring
Digital Asset Management
Software Configuration Management
Document Process Management
Metadata Management
Recombination
Personalization
Edge Network Delivery
Streaming Media Delivery
Caching
Source: http://www.vignette.com
HP 38
Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain
FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc/
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
…
UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Adress Id: http://gisdasc.kgs.ukans.edu/dasc/
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
…
Kansas State
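One common way to reconcile two models that use different tag names for the same data is to map both onto a shared set of field names. The sketch below does this for the FGDC and UDK tags listed above. The common field names (`keywords`, `title`, and so on) and the `normalize` helper are assumptions introduced for illustration, not part of either standard.

```python
# Tag-name mappings onto a hypothetical common vocabulary.
FGDC_TO_COMMON = {
    "Theme keywords": "keywords",
    "Title": "title",
    "Online linkage": "url",
    "Direct Spatial Reference Method": "spatial_method",
    "Horizontal Coordinate System Definition": "coordinate_system",
}
UDK_TO_COMMON = {
    "Search terms": "keywords",
    "Topic": "title",
    "Adress Id": "url",
    "Measuring Techniques": "spatial_method",
    "Co-ordinate System": "coordinate_system",
}

def normalize(record, mapping):
    """Rename the tags of a record to the common vocabulary, dropping unknown tags."""
    return {mapping[tag]: value for tag, value in record.items() if tag in mapping}

fgdc_record = {
    "Title": "Dakota Aquifer",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}
udk_record = {
    "Topic": "Dakota Aquifer",
    "Measuring Techniques": "Vector",
    "Co-ordinate System": "Universal Transverse Mercator",
}
same = normalize(fgdc_record, FGDC_TO_COMMON) == normalize(udk_record, UDK_TO_COMMON)
```

After normalization the two records compare equal, which is what lets a single query run over both catalogs.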
HP 39
Different views of Metadata
Domain Independent Specifications (RDF)
Frameworks/Infrastructures (XCM)
Metadata
Application Specific: ICE
Media Specific: MPEG7, VoiceXML
Domain Specific: NewsML, FGDC/UDK
HP 40
Creating and Serving Metadata to Power the Life-cycle of Content
Taalee Infrastructure Services; Taalee Content Applications
Where is the content? Whose is it? - Produce/Aggregate, Catalog/Index
What other content is it related to? - Integrate, Syndicate
What is the right content for this user? - Personalize
What is the best way to monetize this interaction? - Interactive Marketing
Broadcast, Wireline, Wireless, Interactive TV
Taalee Semantic MetaBase
HP 41
Taalee's Intelligent Content Process
HP 42
Metadata Creation and Semanticization
• Automatic Content Classification/Categorization
• Metadata Creation/Extraction
Types of metadata created
Semantic Engine and WorldModel are trademarks of Taalee Inc. Metadata Extraction is a patented technology of Taalee Inc.
HP 43
Forms/Types/Ingest of Content
Sources: Web Sites, Content Feeds, and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)
HP 44
Content Handling/Ingest
Infrastructure/Exchange: Feed Handlers, Crawlers, Screen Scrapers, Bots/Software Agents
Centralized, Distributed, Mobile/Migratory
HP 45
Information Extraction for Metadata Creation
Global/Enterprise/Web Repositories
METADATA EXTRACTORS
Digital Maps
Nexis, UPI, AP
Documents
Digital Audios
Data Stores
Digital Videos
Digital Images
HP 46
Extracting a Text Document: Syntactic approach
INCIDENT MANAGEMENT SITUATION REPORT Friday August 1 1997 - 0530 MDT
NATIONAL PREPAREDNESS LEVEL II
CURRENT SITUATION Alaska continues to experience large fire activity Additional fires have been staffed for structure protection
SIMELS Galena District BLM This fire is on the east side of the Innoko Flats between Galena and McGr The fore is active on the southern perimeter which is burning into a continuous stand of black spruce The fire has increased in size but was not mapped due to thick smoke The slopover on the eastern perimeter is 35 contained while protection of the historic cabit continues
CHINIKLIK MOUNTAIN Galena District BLM A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire The fire is contained Major areas of heat have been mopped up The fire is contained Major areas of heat have been mopped-up All crews and overhead will mop-up where the fire burned beyond the meadows No flare-ups occurred today Demobilization is planned for this weekend depending on the results of infrared scanning
LAYOUT
Date => day month int ',' int
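A syntactic layout rule such as the Date pattern above maps naturally onto a regular expression. The sketch below is one possible encoding, not the extractor used in the talk: the pattern, the group names, and `extract_date` are assumptions, and commas are treated as optional so that both punctuated and unpunctuated report headers match.

```python
import re

DAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"
MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")

# weekday, month name, day of month, optional comma, four-digit year
DATE_RE = re.compile(
    r"\b(?P<weekday>" + DAYS + r"),?\s+(?P<month>" + MONTHS + r")"
    r"\s+(?P<day>\d{1,2}),?\s+(?P<year>\d{4})\b"
)

def extract_date(text):
    """Return (weekday, month, day, year) for the first date found, else None."""
    match = DATE_RE.search(text)
    if match is None:
        return None
    return (match.group("weekday"), match.group("month"),
            int(match.group("day")), int(match.group("year")))

header = "INCIDENT MANAGEMENT SITUATION REPORT Friday August 1 1997 - 0530 MDT"
found = extract_date(header)
```

The rule only recognizes the surface form of a date. It says nothing about what the date refers to, which is the gap the later knowledge-based approaches address.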
Traditional Text Categorization
Statistical/AI Techniques
Classify: Place in a taxonomy
Feed
Customer Training Set
Routing/Distribution
Customer Article Feed: Article 4715
Standard Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Classification of Article 4715
Knowledge-base & Statistical/AI Techniques
Classify: Place in a taxonomy
Metadata Catalog
Content Manager
Precise syndication/filtering
Article 4715 Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News
Standard metadata
Semantic metadata
FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
NYSE: Member Companies, Market News, IPOs
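The difference between standard and semantic metadata for Article 4715 can be sketched as a two-stage record: fields copied from the feed, then fields derived by looking up recognized entities in a knowledge base. This is a minimal sketch under stated assumptions: `KNOWLEDGE_BASE`, `enrich`, and the article sentence are hypothetical, and simple substring matching stands in for real entity recognition.

```python
# Hypothetical knowledge base: company name -> known facts.
KNOWLEDGE_BASE = {
    "France Telecom": {"ticker": "FTE", "exchange": "NYSE"},
    "Equant": {"ticker": "ENT", "exchange": "NYSE"},
}

def enrich(standard_metadata, text, kb=KNOWLEDGE_BASE):
    """Add semantic metadata for every known company mentioned in the text."""
    companies = [name for name in kb if name in text]
    semantic = {
        "Company Name": companies,
        "Ticker Symbol": [kb[c]["ticker"] for c in companies],
        "Exchange": sorted({kb[c]["exchange"] for c in companies}),
    }
    return {**standard_metadata, **semantic}

# Standard metadata comes straight from the feed.
standard = {"Feed Source": "iSyndicate", "Posted Date": "11/20/2000"}
# Made-up article text, for illustration only.
article_text = "France Telecom and Equant announced an agreement."
record = enrich(standard, article_text)
```

The ticker symbols and exchange never appear in the article text; they come from the knowledge base, which is what makes precise routing and filtering possible downstream.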
Automated Content Enrichment (ACE)
Taalee Enterprise Customization Suite
Taalee's Categorization & Automatic Metadata Creation
Taalee Training Set
Customer Training Set
ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
Classification of Article 4715
Article Feed: Article 4715
Routing/Distribution
Map to another taxonomy
HP 49
Automatic Categorization & Metadata Tagging (unstructured text: transcript of A/V)
ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION AS OF TONIGHT THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49 IN WASHINGTON STATE THE SENATE RACE REMAINS TOO CLOSE TO CALL IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN IUMBENT THE SENATE WILL BE EVENLY DIVIDED IN MISSOURI REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN WHO DIED IN A CRASH THREE WEEKS AGO GOVERNOR CARNAHANS WIFE IS EXPECTED TO TAKE HIS PLACE IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT HILLARY CLINTON WON THE NEW YORK SENATE SEAT SHE IS THE FIRST FIRST LADY TO RUN MUCH LESS WIN
Video Segment with Associated Text
Segment Description
Semantic Metadata
Auto Categorization
HP 50
Automatic Categorization & Metadata Tagging (Web page)
Video with Editorialized Text on the Web
Auto Categorization
Semantic Metadata
HP 51
Automatic Categorization & Metadata Tagging (Feed)
Text from Bloomberg
Auto Categorization
Semantic Metadata
HP 52
Taalee Extraction and Knowledgebase Enhancement
Extraction Agent
Web Page
Enhanced Metadata Asset
HP 53
Basis for Semantics
A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary
B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)
Knowledge Base
HP 54
Basis for Semantics
C. Reasoning/Inference: (Statistical) (Information Retrieval); Statistical Learning/AI (Bayesian, Neural Networks, HMM, …); Logic Based (Description Logic); Natural Language/Grammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methods: Cluster Analysis
Learning/AI and Collab. Filtering
Reference data: Concept-terms, Dictionary/Thesaurus; by topic/industry/subject/domain
Word or Phrase
Ontologies/Domain Models
KnowledgeBase: by Entities and Relationships
deeper understanding
HP 56
Open Directory Project (ODP): Classification/Taxonomy & Directory
HP 57
Ontology
Standardize meaning, description, representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes: Attributes, Domain Rules, Functional Dependencies
HP 59
An Ontology
Example: Interrelated ontologies
[Figure: interrelated ontologies. Node labels recovered from the diagram:]
LAND USE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL
LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE
WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; shredding, magnetic separation, screening, washing
NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI, linked by "causes" relationships
HP 61
Large Vocabularies, Taxonomies, Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.
Confidential HP
Metadata-enabled Applications
HP 63
Metadata Usage: Impact on Search & Query processing
traditional queries based on keywords
attribute-based queries
content-based queries
HP 64
Oingo.com
Oingo Ontology - ODP based () - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning, used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely-related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:
• Context of the Content
• Semantic Metadata describing entities (e.g., Company, Industry, etc.), and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
"Normal" Content can only be "found" if the user enters a keyword that exists within it
Intelligent Content = +
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:
Context of the Content
Semantic Metadata describing entities (e.g., Company, Industry, etc.), and
Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …
HP 70
Metadata Usage: Keyword, Attribute and Content Based Access
The VisualHarness system at LSDIS/UGA
HP 71
Keyword Search vs Attribute Search with Semantic metadata
Virage Search on football touchdown
Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL httpcbssportsline
Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL httpcbssportsline
Metadata from Typical Cataloging of Football Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31, Pit 24
http://www.nfl.com
Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Raven's lead to 31-24 in the third quarter
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
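The contrast between keyword search and attribute search can be sketched over a small catalog. The code below is an illustration only: the asset records are abridged from the slide, the text of the second asset is shortened, and `keyword_search` and `attribute_search` are hypothetical helpers rather than any product's API.

```python
ASSETS = [
    {
        "title": "Brian Griese Interview Part Four",
        "text": "Brian Griese talks about the first touchdown he ever threw",
        "metadata": {},  # typical cataloging: no structured attributes
    },
    {
        "title": "Baltimore 31 Pit 24",
        "text": "Ismail and Banks hook up for their third long touchdown",
        "metadata": {
            "League": "Professional",
            "Teams": ["Ravens", "Steelers"],
            "Players": ["Quandry Ismail", "Tony Banks"],
            "Event": "Touchdown",
        },
    },
]

def keyword_search(assets, keyword):
    """Titles of assets whose text contains the keyword (case-insensitive)."""
    kw = keyword.lower()
    return [a["title"] for a in assets if kw in a["text"].lower()]

def attribute_search(assets, field, value):
    """Titles of assets whose metadata field equals or contains the value."""
    hits = []
    for asset in assets:
        stored = asset["metadata"].get(field)
        values = stored if isinstance(stored, list) else [stored]
        if value in values:
            hits.append(asset["title"])
    return hits

by_keyword = keyword_search(ASSETS, "touchdown")
by_team = attribute_search(ASSETS, "Teams", "Ravens")
```

The keyword query returns both assets, including the interview that merely mentions a touchdown, while the attribute query on `Teams` returns only the game highlight, even though the word "Ravens" is absent from its text.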
HP 72
Taalee's Semantic Search
Highly customizable, precise and freshest A/V search
Context and Domain Specific Attributes
Uniform Metadata for Content from Multiple Sources; can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do?
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY amp CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content: Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries & Hot Topics
PERSONALIZATION
3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
4 Pink Floyd Collection
Personalized Queries
Set the Controls for the Heart of the Sun…
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
2 My Football Fantasy Team
Gators' Spurrier ready for big game
Tech's Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone…
Edwards explains reasons for leaving BYU more…
more…
more…
more…
1 Election 2000
2 Middle East Peace Conflict
3 Napster Controversy
Video Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Cole's security
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II more…
HOT Topics
more…
more…
HP 80
Metadata Targeting
Semantic/Interactive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web: Extreme Personalization
Realtime Feeds
Interests, Preferences
Time-Shifted Content Aggregator
Web sites and Pages
Content Databases
Personalized Content
Semantic Engine™
Structured Hi-Quality Semantic Metabase
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site
User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and the knowledge of user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions listen set alert on story add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst
HP 87
iTV: Taalee's Extreme Personalization
Content Provider
(DBS, DISH, Wink, AOL-TV)
Semantic Engine™
Meta-Data Tagged Content
Content "Programs"
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-Quality Semantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by personalization application to show only the stocks that user is interested in
This screen is customizable with interactivity features using metadata, such as whether there is a new Conference Call video on CSCO
Part of the screen can be automatically customized to show conference call specific information - including transcript, participation, etc., all of which are relevant metadata
Conference Call itself can have embedded metadata to support personalization and interactivity
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support
Sony
Categorize
Catalog
Integrate
Collection
Processing
Network Content
Affiliate Feeds
Public Sources
Rich Data Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko facing the central business district of Lagos across a lagoon
At least 60 people died in this needless fire senior local official Karimu Alabi said
Fire crews from state-run Nigerian National Petroleum Corp (NNPC) which owns the pipeline were joined by other firemen from construction company Julius Berger in battling the blaze
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak ravaging a cluster of huts and log houses
At about the same time a second fire razed Makoko shantytown where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore
HP 92
Retrieve Scene Description Track
Enhanced Digital Cable
Video
MPEG Decoder
Node = AVO Object
Create Scene Description Tree
GREAT USER EXPERIENCE
Metadata's role in emerging iTV infrastructure
MPEG-2/4/7
MPEG Encoder
Scene Description Tree
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers, and Broadcasters
Enhanced XML Description
"Cisco Systems"
Node
Taalee Semantic Engine
"Cisco Systems"
Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: John Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta
Object Content Information (OCI)
Metadata-rich Value-added Node
HP 93
Intelligent Metadata Creation
Content which does contain the words the user asked for
Extractor Agents
Content which does not contain the words the user asked for, but is about what he asked for
Value-added Metadata
Content the user did not think to ask for, but which he needs to know
Semantic Associations
+ +
Metadata for Intelligent Content
Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example: If a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"
Understanding of the content is needed to create new metadata
Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
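The Roger Clemens example can be sketched as a lookup that attaches facts from a knowledge base to a story, followed by a search that consults both the text and the attached metadata. This is a hedged illustration, not Taalee's Semantic Engine: `KNOWLEDGE_BASE`, `value_added_metadata`, `matches`, and the story sentence are all hypothetical.

```python
# Hypothetical knowledge base of PERSON entities and their associations.
KNOWLEDGE_BASE = {
    "Roger Clemens": {"sport": "Baseball", "team": "New York Yankees"},
}

def value_added_metadata(text, kb=KNOWLEDGE_BASE):
    """Derive team and sport metadata from the people mentioned in the text."""
    metadata = {"Players": [], "Teams": [], "Sport": []}
    for person, facts in kb.items():
        if person in text:
            metadata["Players"].append(person)
            if facts["team"] not in metadata["Teams"]:
                metadata["Teams"].append(facts["team"])
            if facts["sport"] not in metadata["Sport"]:
                metadata["Sport"].append(facts["sport"])
    return metadata

def matches(query, text, metadata):
    """True if the query occurs in the text or in any metadata value."""
    q = query.lower()
    if q in text.lower():
        return True
    return any(q in value.lower() for values in metadata.values() for value in values)

# Made-up story text that never mentions the team.
story = "Roger Clemens struck out ten batters last night."
no_metadata = {"Players": [], "Teams": [], "Sport": []}
enriched = value_added_metadata(story)

found_without = matches("Yankees", story, no_metadata)
found_with = matches("Yankees", story, enriched)
```

Without the derived metadata the search for "Yankees" misses the story; with it, the same query succeeds because the team name was added through the player's association.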
HP 96
Guided Demo for Value Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'.
Click on the first result for Jamal Anderson.
View the metadata. Note that Team name and League name are also included in the metadata.
View the original source HTML page. Verify that the source page contains no mention of Team name or League name. They were Taalee's value-additions to the metadata to facilitate easier search.
HP 99
Example 2 - Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'.
Click on the first result for Gary Sheffield.
View the metadata. Note that Team name and League name are also included in the metadata.
View the original source HTML page. Verify that the source page contains no mention of Team name or League name. They were Taalee's value-additions to the metadata to facilitate easier search.
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Producer Name: Name of content provider that produced the asset
Posted Date: Date of asset posting - extracted automatically
Player Names: Names of players mentioned explicitly in the asset - extracted automatically
Sport: Name of sport
Team Name: Name of team for which the player plays - not mentioned explicitly in the asset - value-added using Taalee's semantic relationships
League Name: Name of league to which the player's team belongs - not mentioned explicitly in the asset - value-added by Taalee's processing based on semantic associations
Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.
The asset is richly, fully described in the many ways users choose to interact.
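The value-added step described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two lookup tables stand in for whatever relationships a knowledge base would hold, and the function name and dictionary layout are hypothetical, not Taalee's actual data model.

```python
# Hypothetical knowledge base: these two dicts stand in for the semantic
# relationships (player -> team -> league); they are illustrative only.
PLAYER_TEAM = {
    "Jamal Anderson": "Atlanta Falcons",
    "Gary Sheffield": "Los Angeles Dodgers",
}
TEAM_LEAGUE = {
    "Atlanta Falcons": "NFL",
    "Los Angeles Dodgers": "National League",
}


def add_value_added_metadata(extracted):
    """Return a copy of the extracted metadata with Teams and League added."""
    enriched = dict(extracted)
    teams = []
    leagues = []
    for player in extracted.get("Players", []):
        team = PLAYER_TEAM.get(player)
        if team is None:
            continue  # unknown player: nothing to add
        if team not in teams:
            teams.append(team)
        league = TEAM_LEAGUE.get(team)
        if league is not None and league not in leagues:
            leagues.append(league)
    enriched["Teams"] = teams
    enriched["League"] = leagues
    return enriched


# Metadata extracted directly from the asset (no team or league mentioned).
asset = {"Produced by": "NFL.com", "Players": ["Jamal Anderson"]}
enriched_asset = add_value_added_metadata(asset)
print(enriched_asset["Teams"], enriched_asset["League"])
```

A search index built over the enriched record would then match a query for the team even though the source page never mentions it.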
HP 101
Intelligent Content via
Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content.
• They do not understand the meaning, context, or relationships of keywords.
For example, a search engine may see that the phrase "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
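A minimal sketch of the idea, assuming the relationships are stored as typed (subject, relationship, object) facts. Only the Commerce One facts come from the slide; the story titles, the relationship spellings, and the function names are hypothetical.

```python
from collections import defaultdict

# Typed relationships written as (subject, relationship, object).
FACTS = [
    ("Commerce One", "IS_A", "COMPANY"),
    ("Commerce One", "PARTICIPATES_IN", "Corporate, Professional & Financial Software"),
    ("Commerce One", "COMPETES_WITH", "Ariba"),
]
# Made-up catalog of stories, keyed by the entity they are about.
STORIES = {
    "Commerce One": ["Hypothetical story about Commerce One"],
    "Ariba": ["Hypothetical story about Ariba"],
}


def build_index(facts):
    """Index facts by (subject, relationship) for direct lookup."""
    index = defaultdict(list)
    for subject, relationship, obj in facts:
        index[(subject, relationship)].append(obj)
    return index


def search(entity, index, stories):
    """Return what was asked for plus stories about the entity's competitors."""
    result = {"asked_for": stories.get(entity, []), "related": {}}
    for competitor in index.get((entity, "COMPETES_WITH"), []):
        result["related"][competitor] = stories.get(competitor, [])
    return result


INDEX = build_index(FACTS)
result = search("Commerce One", INDEX, STORIES)
```

A keyword index alone would return only the first list; the typed COMPETES_WITH fact is what lets the second group of stories be attached automatically.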
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata-centric Content Management Architecture
Sources: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g. Reuters)
Components: Extractor Agent 1; Extractor Agent 2; Extractor Agent 3; Semantic Engine; World Model; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication
Output: XCM-compliant metadata, XML or other format
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults KnowledgeBase for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard
Story on Cisco
Story on Lucent
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for future
Semantic Web Complex Relationships and Knowledge Discovery
Eg InfoQuilt project at LSDIS Lab Univ of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility.
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data.
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
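To make "rules which allow us to draw inferences" concrete, here is a small sketch that applies two RDFS-style rules (subclass transitivity and type inheritance) to a set of triples. It is far simpler than a description logic reasoner, and the class and instance names are illustrative assumptions.

```python
# Triples are (subject, predicate, object); all names below are illustrative.
TRIPLES = {
    ("herbivore", "subClassOf", "animal"),
    ("animal", "subClassOf", "living_thing"),
    ("bambi", "type", "herbivore"),
}


def infer(triples):
    """Apply two RDFS-style rules until no new triple appears."""
    closed = set(triples)
    while True:
        new = set()
        for (a, p1, b) in closed:
            for (c, p2, d) in closed:
                if b != c or p2 != "subClassOf":
                    continue
                if p1 == "subClassOf":
                    new.add((a, "subClassOf", d))  # subClassOf is transitive
                elif p1 == "type":
                    new.add((a, "type", d))  # instances inherit superclasses
        if new <= closed:
            return closed
        closed |= new


inferred = infer(TRIPLES)
```

The facts that the instance is an animal and a living thing are never stated; they follow from the structural triples plus the rules, which is the benefit the slide attributes to a logic layer.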
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture" [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]
A personal definition: Semantic Web: The concept that Web-accessible content can be organized semantically rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-layered Architecture of Semantic Web
Logical Layer: Formal Semantics and Reasoning Support - OIL, DAML-O
Schema Layer: Definition of Vocabulary - RDF Schema
Data Layer: Simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL - Evolving towards Semantic Web
OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Tests May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake <=
    dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30
    AND distance(NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude) < 10000
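The rule above can be written as an executable predicate. The following is a sketch only: the Event class and function names are hypothetical, the distance is assumed to be a great-circle distance in miles (computed with the haversine formula), and the requirement that the earthquake not precede the test is taken from the prose rather than from the rule as written.

```python
import math
from dataclasses import dataclass
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


@dataclass
class Event:
    event_date: date
    latitude: float
    longitude: float


def date_difference(test, quake):
    """Days from the nuclear test to the earthquake (negative if the quake came first)."""
    return (quake.event_date - test.event_date).days


def distance(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles, using the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def could_have_caused(test, quake, max_days=30, max_miles=10000):
    """NuclearTest Causes Earthquake, per the rule on the slide."""
    days = date_difference(test, quake)
    if days < 0 or days >= max_days:
        return False  # assumption: the quake must not precede the test
    miles = distance(test.latitude, test.longitude, quake.latitude, quake.longitude)
    return miles < max_miles


# Made-up events, for illustration only.
test_event = Event(date(1970, 1, 1), 37.0, -116.0)
quake_soon = Event(date(1970, 1, 10), 36.0, -117.0)
quake_late = Event(date(1970, 3, 1), 36.0, -117.0)
quake_before = Event(date(1969, 12, 25), 36.0, -117.0)
```

Both thresholds are parameters so that a narrower window or radius can be tried without touching the rule itself.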
Knowledge Discovery - Example
When was the first recorded nuclear test conducted? 1950
Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example...
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
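The grouping query on this slide can be sketched as follows. The input format (a list of (year, magnitude) pairs), the function name, and the treatment of range boundaries are assumptions: each range is taken as lower-bound inclusive and upper-bound exclusive, so a quake of exactly 9.0 falls in the last bucket. The sample data is made up.

```python
from collections import Counter

# Magnitude ranges from the slide: (label, inclusive lower, exclusive upper).
RANGES = [
    ("5.8-6", 5.8, 6.0),
    ("6-7", 6.0, 7.0),
    ("7-8", 7.0, 8.0),
    ("8-9", 8.0, 9.0),
    (">9", 9.0, float("inf")),
]


def average_per_year(quakes, start_year, end_year):
    """Average yearly earthquake count per magnitude range over the given years."""
    counts = Counter()
    for year, magnitude in quakes:
        if not start_year <= year <= end_year:
            continue
        for label, low, high in RANGES:
            if low <= magnitude < high:
                counts[label] += 1
                break
    n_years = end_year - start_year + 1
    return {label: counts[label] / n_years for label, _, _ in RANGES}


# Made-up sample: (year, magnitude).
sample = [(1900, 5.9), (1900, 6.5), (1901, 6.1), (1901, 7.2), (1899, 8.0), (1901, 5.0)]
averages = average_per_year(sample, 1900, 1901)
```

Running the same function over the years before and after a chosen date gives the per-range comparison the slide draws its conclusion from.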
Knowledge Discovery - Example...
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted, and the test site lay within a radius of 10000 miles from the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 31
Information and Content Exchange (ICE)
Main Goal: efficient and extensible Content Syndication protocol for the Internet, using XML syntax
Authors: Adobe, Kinecta, MS, Sun, Vignette, et al.
Status: latest spec version 1.1, May 2000; submitted to W3C for review
Implementations: Vignette Syndication Server, MS BizTalk, Kinecta Interact, ...
Web Site: http://www.icestandard.org
HP 32
What is the ICE Protocol?
Syndication Protocol for communication between Syndicators and Subscribers
Metadata to define:
roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver
format and method of content exchange (e.g. sequenced packages, pull vs. push model)
HP 33
ICE Applications
ICE vocabulary + domain vocabulary = complete application
ICE establishes and manages the syndication, delivers data, and logs events => content-independent metadata
Industry-specific vocabulary defines the content => domain-specific metadata
Source: http://www.icestandard.org
HP 34
ICE Explained
ICE: Information and Content Exchange protocol.
Syndicator: A content aggregator and distributor.
Subscriber: A content consumer.
Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement.
Collection: The current content of a subscription.
ICE Package: A delivery of commands to update a collection, such as the addition of content items.
ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information.
Sources: InternetWeek; ICE Cookbook version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm
<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe"
             timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321"
                subscription-element="4321" name="Cartoon" filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru" copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC ... (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>
Content (domain-specific metadata)
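As a rough illustration of how a subscriber might separate the two kinds of metadata in such a payload, the sketch below parses a copy of the example with Python's standard XML library. The embedded payload is a reconstruction of the slide with markup punctuation restored (an assumption), the DOCTYPE and image data are omitted, and the function name is hypothetical; it is not taken from any ICE implementation.

```python
import xml.etree.ElementTree as ET

# Reconstructed copy of the slide's payload (punctuation restored by assumption).
PAYLOAD = """\
<ice-payload payload-id="ipl-80a56cfe" timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon"
                filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru"
                     copyright="Taalee Makeups" pubdate="20010515">(encoded image omitted)</comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>
"""


def split_metadata(payload_xml):
    """Return (content-independent ICE attributes, domain-specific attributes)."""
    root = ET.fromstring(payload_xml)
    item = root.find("./ice-response/ice-item-group/ice-item")
    # ICE envelope: attributes of the payload and of the delivered item.
    envelope = dict(root.attrib)
    envelope.update(item.attrib)
    # Domain vocabulary: the first child element carried inside the item.
    domain_element = item[0]
    domain = {"element": domain_element.tag, **domain_element.attrib}
    return envelope, domain


envelope, domain = split_metadata(PAYLOAD)
```

The envelope dictionary holds what ICE itself manages (ids, timestamp, file name, content type), while the second dictionary holds the comic-strip vocabulary that only the domain application understands.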
HP 36
XCM (eXtended Content Management)
A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:
Content Development - Developing static content and managing the process of its subsequent approval, versioning, storage, and retrieval
Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle
Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability
Source httpwwwvignettecomCDASite020971-1-30-1458-1146-174300html
HP 37
XCM: eXtended Content Management
Content Development / Management
Application Content Management
Content Delivery
Content Authoring
Digital Asset Management
Software Configuration Management
Document Process Management
Metadata Management
Recombination
Personalization
Edge Network Delivery
Streaming Media Delivery
Caching
Source: http://www.vignette.com
HP 38
Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain
FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
...
UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Adress Id: http://gisdasc.kgs.ukans.edu/dasc
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
...
Kansas State
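One simple way to bridge the two models is a tag-name crosswalk. The sketch below uses only the five tag pairs visible on the slide; the function name and the policy of passing unmapped tags through unchanged are assumptions, and a real crosswalk would also need to reconcile value formats.

```python
# Tag-name crosswalk read off the slide: FGDC tag -> UDK tag.
FGDC_TO_UDK = {
    "Theme keywords": "Search terms",
    "Title": "Topic",
    "Online linkage": "Adress Id",
    "Direct Spatial Reference Method": "Measuring Techniques",
    "Horizontal Coordinate System Definition": "Co-ordinate System",
}


def fgdc_to_udk(record):
    """Rename FGDC tags to UDK tags; tags without a known mapping are kept as-is."""
    return {FGDC_TO_UDK.get(tag, tag): value for tag, value in record.items()}


fgdc_record = {
    "Theme keywords": ["digital line graph", "hydrography", "transportation"],
    "Title": "Dakota Aquifer",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}
udk_record = fgdc_to_udk(fgdc_record)
```

A pairwise table like this grows quadratically with the number of models, which is one motivation for mapping each model to a shared ontology instead.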
HP 39
Different views of Metadata
Domain Independent Specifications (RDF)
Frameworks/Infrastructures (XCM)
Metadata
Application Specific: ICE
Media Specific: MPEG7, VoiceXML
Domain Specific: NewsML, FGDC/UDK
HP 40
Creating and Serving Metadata to Power the Life-cycle of Content
Taalee Infrastructure Services
Taalee Content Applications
Where is the content? Whose is it?
Produce/Aggregate
Catalog/Index
What other content is it related to?
Integrate/Syndicate
What is the right content for this user?
Personalize
What is the best way to monetize this interaction?
Interactive Marketing
Broadcast, Wireline, Wireless, Interactive TV
Taalee Semantic MetaBase
HP 41
Taalee's Intelligent Content Process
HP 42
Metadata Creation and Semanticization
• Automatic Content Classification/Categorization
• Metadata Creation/Extraction; Types of metadata created
Semantic Engine and WorldModel are trademarks of Taalee Inc. Metadata Extraction is a patented technology of Taalee Inc.
HP 43
Forms/Types/Ingest of Content
Sources: Web Sites, Content Feeds, and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)
HP 44
Content Handling/Ingest
Infrastructure: Exchange, Feed Handlers, Crawlers, Screen Scrapers, Bots, Software Agents
Centralized, Distributed, Mobile/Migratory
HP 45
Information Extraction for Metadata Creation
Global/Enterprise/Web Repositories
METADATA
EXTRACTORS
Digital Maps
Nexis, UPI, AP
Documents
Digital Audios
Data Stores
Digital Videos
Digital Images
HP 46
Extracting a Text Document: Syntactic approach
INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT
NATIONAL PREPAREDNESS LEVEL II
CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.
SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr. The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.
CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.
LAYOUT
Date => day month int ',' int
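The layout rule for the report date can be approximated with a regular expression. This is a sketch under stated assumptions: "day" is read as a weekday name and "month" as a full month name, the commas are treated as optional because the extracted text has lost its punctuation, and the function name is hypothetical.

```python
import re

DAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"
MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")

# Date => day month int ',' int  (commas optional, see the assumptions above)
DATE_RE = re.compile(
    rf"\b(?P<day>{DAYS}),?\s+(?P<month>{MONTHS})\s+(?P<int1>\d{{1,2}}),?\s+(?P<int2>\d{{4}})\b"
)


def extract_date(text):
    """Return the first date matching the layout rule, or None."""
    match = DATE_RE.search(text)
    if match is None:
        return None
    return {
        "day": match.group("day"),
        "month": match.group("month"),
        "day_of_month": int(match.group("int1")),
        "year": int(match.group("int2")),
    }


report_header = "INCIDENT MANAGEMENT SITUATION REPORT Friday August 1 1997 - 0530 MDT"
report_date = extract_date(report_header)
```

A purely syntactic extractor of this kind works only while the source keeps the same layout, which is the limitation the later slides on knowledge-based extraction address.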
Traditional Text Categorization
Statistical/AI Techniques
Classify / Place in a taxonomy
Customer Training Set
Customer Article Feed (Article 4715)
Routing/Distribution
Standard Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Classification of Article 4715
Knowledge-base & Statistical/AI Techniques
Classify / Place in a taxonomy
Metadata Catalog
Content Manager
Precise syndication/filtering
Article 4715 Metadata
Standard metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Semantic metadata
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News
FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
NYSE: Member Companies, Market News, IPOs
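To show why the semantic fields matter for "precise syndication/filtering", here is a small sketch that routes articles by semantic metadata. The record layout and function names are assumptions; article 4715 uses the values from the slide, while the second article is entirely made up for contrast.

```python
ARTICLES = [
    {
        "id": 4715,
        "standard": {"Feed Source": "iSyndicate", "Posted Date": "11/20/2000"},
        "semantic": {
            "Company Name": ["France Telecom", "Equant"],
            "Ticker Symbol": ["FTE", "ENT"],
            "Exchange": ["NYSE"],
            "Topic": ["Company News"],
        },
    },
    {
        "id": 9001,  # hypothetical second article, for contrast
        "standard": {"Feed Source": "iSyndicate", "Posted Date": "11/20/2000"},
        "semantic": {
            "Company Name": ["Example Corp"],
            "Ticker Symbol": ["EXMP"],
            "Exchange": ["NASDAQ"],
            "Topic": ["Earnings"],
        },
    },
]


def matches(article, criteria):
    """True if every requested field contains the requested value."""
    return all(
        value in article["semantic"].get(field, [])
        for field, value in criteria.items()
    )


def route(articles, criteria):
    """Return the ids of the articles that satisfy all criteria."""
    return [a["id"] for a in articles if matches(a, criteria)]


fte_articles = route(ARTICLES, {"Ticker Symbol": "FTE"})
```

Both articles share the same standard metadata, so filtering on feed source or posted date alone could not tell them apart.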
Automated Content Enrichment (ACE)
Taalee Enterprise Customization Suite
Taalee's Categorization & Automatic Metadata Creation
Taalee Training Set
Customer Training Set
Classification of Article 4715
Article Feed (Article 4715)
Routing/Distribution
Map to another taxonomy
HP 49
Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.
Video Segment with Associated Text
Segment Description
Semantic Metadata
Auto Categorization
HP 50
Automatic Categorization & Metadata Tagging (Web page)
Video with Editorialized Text on the Web
Auto Categorization
Semantic Metadata
HP 51
Automatic Categorization & Metadata Tagging (Feed)
Text from Bloomberg
Auto Categorization
Semantic Metadata
HP 52
Taalee Extraction and Knowledgebase Enhancement
Extraction Agent
Web Page Enhanced Metadata Asset
HP 53
Basis for Semantics
A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary
B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g. Golf = golfer, tournament name, golf course, event)
Knowledge Base
HP 54
Basis for Semantics
C. Reasoning/Inference: (Statistical) (Information Retrieval); Statistical Learning/AI (Bayesian, Neural Networks, HMM, ...); Logic Based (Description Logic); Natural Language/Grammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methods/Cluster Analysis
Learning/AI and Collab. Filtering
Reference data/Concept-terms, Dictionary/Thesaurus: by topic/industry/subject/domain
Word or Phrase
Ontologies/Domain Models
KnowledgeBase: by Entities and Relationships
deeper understanding
HP 56
Open Directory Project (ODP): Classification/Taxonomy & Directory
HP 57
Ontology
Standardize meaning, description, and representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes: Attributes, Domain Rules, Functional Dependencies
HP 59
An Ontology
Example Interrelated ontologies
LANDUSE
COMMERCIAL
INDUSTRIAL
RURAL
RESIDENTIAL
AGRICULTURAL
MILITARY
RECREATIONAL
LAND(SITE)
CULTIVATEDAREA
GREENLANDAREA LAND
BANK
ZONING
LANDFILLSITE
WASTEDISPOSAL
RECYCLING
HAZARDOUS
LANDFILLRESOURCE REC
SOLID SEWAGE
shredding
magneticseparation
screening
washing
NATURALDISASTER
EARTHQUAKE
causes
LANDSLIDE
VOLCANO
STORMFLOOD
FIRE
AVALANCHE
TSUNAMI
causes
causes
causes
HP 61
Large Vocabularies, Taxonomies, Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.
Confidential HP
Metadata-enabled Applications
HP 63
Metadata Usage: Impact on Search & Query processing
traditional queries based on keywords
attribute based queries
content-based queries
HP 64
Oingo.com
Oingo Ontology - ODP based (?): the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning, used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely-related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged, and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:
• Context of the Content
• Semantic Metadata describing entities (e.g. Company, Industry, etc.) and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
"Normal" Content can only be "found" if the user enters a keyword that exists within it
Intelligent Content=+
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:
Context of the Content
Semantic Metadata describing entities (e.g. Company, Industry, etc.) and
Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g. Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, ...
HP 70
Metadata Usage: Keyword, Attribute, and Content Based Access
The VisualHarness system at LSDIS/UGA
HP 71
Keyword Search vs Attribute Search with Semantic metadata
Virage: Search on football touchdown
Jimmy Smith Interview, Part Seven: Jimmy Smith explains his philosophy on showboating. URL: httpcbssportsline
Brian Griese Interview, Part Four: Brian Griese talks about the first touchdown he ever threw. URL: httpcbssportsline
Metadata from Typical Cataloging of Football Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31, Pit 24
http://www.nfl.com
Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Qadry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
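The difference between the two kinds of search can be sketched as follows. The record layout and function names are assumptions; the description and metadata values follow the slide. Note that the description never mentions the Steelers, so only the attribute search can find the asset by the opposing team.

```python
ASSET = {
    "description": (
        "Qadry Ismail and Tony Banks hook up for their third long touchdown, "
        "this time on a 76-yarder, to extend the Ravens' lead to 31-24 "
        "in the third quarter"
    ),
    "metadata": {
        "League": "Professional",
        "Teams": ["Ravens", "Steelers"],
        "Score": "Bal 31, Pit 24",
        "Players": ["Qadry Ismail", "Tony Banks"],
        "Event": "Touchdown",
        "Produced by": "NFL.com",
    },
}


def keyword_search(assets, keyword):
    """Syntactic search: the keyword must occur in the description text."""
    return [a for a in assets if keyword.lower() in a["description"].lower()]


def attribute_search(assets, field, value):
    """Attribute search: the value must appear in the named metadata field."""
    hits = []
    for asset in assets:
        stored = asset["metadata"].get(field)
        values = stored if isinstance(stored, list) else [stored]
        if value in values:
            hits.append(asset)
    return hits


by_keyword = keyword_search([ASSET], "Steelers")
by_attribute = attribute_search([ASSET], "Teams", "Steelers")
```

Because the metadata is structured per field, results can also be sorted or filtered on any field, for example Event or Produced by.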
HP 72
Taalee's Semantic Search
Highly customizable, precise, and freshest A/V search
Context and Domain Specific Attributes. Uniform Metadata for Content from Multiple Sources. Can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY amp CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content: Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries & Hot Topics
PERSONALIZATION
Personalized Queries
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
more...
2 My Football Fantasy Team
Gators Spurrier ready for big game
Techs Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone...
Edwards explains reasons for leaving BYU
more...
3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
more...
4 Pink Floyd Collection
Set the Controls for the Heart of the Sun...
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
more...
HOT Topics
1 Election 2000
Video Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
more...
2 Middle East Peace Conflict
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Coles security
more...
3 Napster Controversy
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II
more...
HP 80
Metadata Targeting
Semantic/Interactive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
Realtime Feeds
Interests Preferences
Time-ShiftedContent Aggregator
Web sites and Pages
ContentDatabases Personalized
Content
Semantic EngineTM
Personalized Content
Content
Structured Hi-Quality
Semantic Metabase
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site
User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions listen set alert on story add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g. Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q Analyst
HP 87
iTV: Taalee's Extreme Personalization
Content Provider
(DBS DISH Wink AOL-TV)
Semantic EngineTM
Meta-DataTagged Content
ContentldquoProgramsrdquo
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-QualitySemantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in
This screen is customizable with an interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO
Part of the screen can be automatically customized to show conference call specific information, including transcript, participation, etc., all of which are relevant metadata
The Conference Call itself can have embedded metadata to support personalization and interactivity
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support (Sony)
Categorize
Catalog
Integrate
Collection
Processing
Network Content
Affiliate Feeds
Public Sources
Rich Data Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were
-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
Sixty Die In Nigeria Blast
Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) – Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore
HP 92
Retrieve Scene Description Track
Enhanced Digital Cable
Video
MPEGDecoder
Node = AVO Object
Create Scene Description Tree
GREATUSER
EXPERIENCE
Metadata's role in emerging iTV infrastructure
MPEG-2/4/7
MPEG Encoder
SceneDescriptionTree
License metadata decoder and semantic applications to
device makers
Channel salesthrough Video Server Vendors
Video App Servers and Broadcasters
Enhanced XML
Description
"Cisco Systems"
Node
Taalee Semantic Engine
"Cisco Systems"
Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: John Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta
Object Content Information (OCI)
Metadata-richValue-added Node
HP 93
Intelligent Metadata Creation
Extractor Agents: Content which does contain the words the user asked for
+ Value-added Metadata: Content which does not contain the words the user asked for but is about what he asked for
+ Semantic Associations: Content the user did not think to ask for but which he needs to know
Metadata for Intelligent Content Usage
HP 94
Intelligent Contentvia
Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example: If a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"
Understanding of the content is needed to create new metadata
Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
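The Roger Clemens example can be sketched in a few lines of Python. This is only an illustration of the idea, not Taalee's implementation: the `KNOWLEDGE_BASE` dictionary, the field names, and the `enrich` and `matches` helpers are all hypothetical.

```python
# Hypothetical, hand-written knowledge base: entity -> known facts.
KNOWLEDGE_BASE = {
    "Roger Clemens": {"type": "PERSON", "sport": "Baseball", "team": "New York Yankees"},
}


def enrich(metadata):
    """Return a copy of the metadata with team/sport values implied by known players."""
    enriched = {field: list(values) for field, values in metadata.items()}
    for player in metadata.get("players", []):
        facts = KNOWLEDGE_BASE.get(player, {})
        for fact_name, target_field in (("team", "teams"), ("sport", "sports")):
            value = facts.get(fact_name)
            if value and value not in enriched.setdefault(target_field, []):
                enriched[target_field].append(value)
    return enriched


def matches(metadata, query):
    """Case-insensitive substring match over every metadata value."""
    needle = query.lower()
    return any(needle in value.lower() for values in metadata.values() for value in values)


story = {"players": ["Roger Clemens"]}      # extracted from the story text
enriched_story = enrich(story)              # value-added metadata attached

print(matches(story, "Yankees"))            # False
print(matches(enriched_story, "Yankees"))   # True
```

The story never mentions the team, yet a search for "Yankees" succeeds once the association PERSON plays for TEAM has been applied.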
HP 96
Guided Demo for Value Added Metadata – Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10: Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10: Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player of the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata – Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player of the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 – Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 – Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content – Value-Added Metadata
Rich Media Sports Asset
Producer Name: Name of content provider that produced the asset
Posted Date: Date of asset posting – Extracted automatically
Player Names: Names of players mentioned explicitly in the asset – Extracted automatically
Team Name: Name of team for which player plays – Not mentioned explicitly in asset – Value-added using Taalee's semantic relationships
League Name: Name of league to which the player's team belongs – Not mentioned explicitly in asset – Value-added by Taalee's processing based on semantic associations
Sport: Name of sport
Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset
Some Metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships
The asset is richly, fully described in the many ways the users chose to interact
HP 101
Intelligent Contentvia
Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context or relationships of keywords
For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate Professional & Financial Software INDUSTRY and COMPETES WITH Ariba
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
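One way to picture such associations is as subject-relation-object triples that a lookup can follow. The sketch below is illustrative only: the triple store, the relation names, and the headlines are invented for the example and say nothing about how the Semantic Content Model is actually stored.

```python
from collections import defaultdict

# Hypothetical triples in the spirit of the Commerce One example.
TRIPLES = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Commerce One", "participates_in", "Corporate Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
    ("Ariba", "is_a", "COMPANY"),
]

# Hypothetical headlines keyed by the entity they are about.
NEWS = {
    "Commerce One": ["Commerce One headline (placeholder)"],
    "Ariba": ["Ariba headline (placeholder)"],
}


def build_index(triples):
    """Index triples as subject -> relation -> set of objects."""
    index = defaultdict(lambda: defaultdict(set))
    for subject, relation, obj in triples:
        index[subject][relation].add(obj)
    return index


def related_news(index, entity, news, relation="competes_with"):
    """Return (news asked for, news about entities linked by the relation)."""
    asked_for = list(news.get(entity, []))
    associated = {
        other: list(news.get(other, []))
        for other in sorted(index[entity][relation])
    }
    return asked_for, associated


index = build_index(TRIPLES)
asked_for, associated = related_news(index, "Commerce One", NEWS)
print(asked_for)    # ['Commerce One headline (placeholder)']
print(associated)   # {'Ariba': ['Ariba headline (placeholder)']}
```

A keyword index alone would return only the first list; the second list exists because the COMPETES WITH relation is represented explicitly.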
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (To view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Internal Source 1Research
Internal Source 2
External feedsWeb(eg Reuters)
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults Knowledge Base for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story – passed on to Dashboard
Story onLucent
Story onCisco
XCM-compliant metadata XML or other format
SemanticApplication
ASPEnterprise hosted
Extractor Agent 1
Extractor Agent 2
Extractor Agent 3
Metadata centricContent Management Architecture
SemanticEngine
World Model
TaaleeMetabase
Third-partyContent Mgmt
AndSyndication
HP 105
Semantic Associationssupported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
SEC, EPA Regulations: Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application ExampleFinancial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic SearchPersonalization etc
A vision for future
Semantic Web Complex Relationships and Knowledge Discovery
Eg InfoQuilt project at LSDIS Lab Univ of Georgia
HP 108
Beyond RDF – one proposal (cf. Ora Lassila)
Structural modeling obviously not enough
we need a "logic layer" on top of RDF
some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture" [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]
A personal definition
Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-layered Architecture of Semantic Web
Logical Layer: Formal Semantics and Reasoning Support – OIL, DAML-O
Schema Layer: Definition of Vocabulary – RDF Schema
Data Layer: Simple data model and syntax for metadata – RDF
OIL – as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL – Evolving towards Semantic Web
OIL Mission
OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Test May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake <= dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30 AND distance(NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude) < 10000
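The rule above can be written as an executable predicate. This is a minimal sketch under stated assumptions: the slide does not say how `distance` is computed, so a great-circle (haversine) distance is assumed; miles are assumed because a later slide phrases the same query in miles; and the requirement that the earthquake not precede the test comes from the prose, not from the formula. The `Event` class and the sample values are hypothetical.

```python
import math
from dataclasses import dataclass
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; the unit is an assumption


@dataclass
class Event:
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two points, in miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """True if the quake follows the test within max_days and lies within max_miles."""
    days_after = (quake.event_date - test.event_date).days
    close_in_time = 0 <= days_after < max_days
    close_in_space = distance_miles(
        test.latitude, test.longitude, quake.latitude, quake.longitude
    ) < max_miles
    return close_in_time and close_in_space


# Hypothetical events, for illustration only.
nuclear_test = Event(date(2000, 1, 1), 0.0, 0.0)
quake = Event(date(2000, 1, 15), 0.0, 10.0)
print(may_have_caused(nuclear_test, quake))  # True
```

The thresholds are parameters so that a tighter notion of "nearby region" can be tried without changing the rule itself.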
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example…
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9 and >9 on the Richter scale, per year starting from 1900, find average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
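The grouping query can be sketched as a small aggregation. The records below are made-up toy values, not data from the earthquake sources named earlier, and the treatment of range boundaries (each range includes its lower bound and excludes its upper bound, with the last range labeled "9+") is an assumption the slide leaves open.

```python
from collections import Counter

# (label, inclusive lower bound, exclusive upper bound); boundary handling is assumed.
MAGNITUDE_BINS = [
    ("5.8-6", 5.8, 6.0),
    ("6-7", 6.0, 7.0),
    ("7-8", 7.0, 8.0),
    ("8-9", 8.0, 9.0),
    ("9+", 9.0, float("inf")),
]


def bin_label(magnitude):
    """Return the label of the magnitude range, or None below the lowest range."""
    for label, low, high in MAGNITUDE_BINS:
        if low <= magnitude < high:
            return label
    return None


def average_per_year(quakes, first_year, last_year):
    """Average yearly count per magnitude range over [first_year, last_year]."""
    counts = Counter()
    for year, magnitude in quakes:
        label = bin_label(magnitude)
        if label is not None and first_year <= year <= last_year:
            counts[label] += 1
    n_years = last_year - first_year + 1
    return {label: counts[label] / n_years for label, _, _ in MAGNITUDE_BINS}


# Toy (year, magnitude) records, purely illustrative.
toy_quakes = [(1900, 5.9), (1900, 6.5), (1901, 7.2), (1901, 6.1), (1901, 5.0)]
print(average_per_year(toy_quakes, 1900, 1901))
# {'5.8-6': 0.5, '6-7': 1.0, '7-8': 0.5, '8-9': 0.0, '9+': 0.0}
```

Running the same aggregation over two periods, for example before and after 1945, would give the kind of comparison the slide draws its conclusion from.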
Knowledge Discovery - Example…
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 32
What is the ICE Protocol?
Syndication Protocol for communication between Syndicators and Subscribers
Metadata to define
roles and responsibilities of involved parties: Subscriber vs. Syndicator, Requestor vs. Responder, Sender vs. Receiver
format and method of content exchange (e.g., sequenced packages, pull vs. push model)
HP 33
ICE Applications
ICE vocabulary + domain vocabulary = complete application
ICE establishes and manages the syndication, delivers data, logs events => content-independent metadata
industry-specific vocabulary defines the content => domain-specific metadata
Source: http://www.icestandard.org
HP 34
ICE Explained
ICE: Information and Content Exchange protocol
Syndicator: A content aggregator and distributor
Subscriber: A content consumer
Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement
Collection: The current content of a subscription
ICE Package: A delivery of commands to update a collection, such as the addition of content items
ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs and other management information
Sources: InternetWeek; ICE Cookbook version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm
<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe" timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon" filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru" copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC … (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>
Content (domain-specific
metadata)
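To make the split between content-independent and domain-specific metadata concrete, the sketch below parses a simplified copy of the sample payload with Python's standard `xml.etree.ElementTree`. The payload string is a hand-restored, abridged version of the slide's example (DOCTYPE omitted, image data shortened), and `split_metadata` is a hypothetical helper, not part of any ICE toolkit.

```python
import xml.etree.ElementTree as ET

# Simplified, hand-restored version of the slide's payload (an assumption).
PAYLOAD = """
<ice-payload payload-id="ipl-80a56cfe" timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon"
                filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru"
                     copyright="Taalee Makeups" pubdate="20010515">PdXIWZQ8</comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>
"""


def split_metadata(payload_xml):
    """For each ice-item, return (ICE envelope attributes, domain-specific children)."""
    root = ET.fromstring(payload_xml.strip())
    results = []
    for item in root.iter("ice-item"):
        envelope = dict(item.attrib)                       # content-independent
        domain = [{"element": child.tag, **child.attrib}   # domain-specific
                  for child in item]
        results.append((envelope, domain))
    return results


for envelope, domain in split_metadata(PAYLOAD):
    print("ICE envelope:", envelope["item-id"], envelope["name"], envelope["content-type"])
    print("Domain content:", domain[0]["element"], domain[0]["title"], domain[0]["author"])
```

The `ice-item` attributes would look the same for any kind of syndicated content, while the `comic-strip` element belongs to the publisher's own vocabulary.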
HP 36
XCM (eXtended Content Management)
a framework that allows customers to classify content management offerings according to the business problems they address The segments of XCM are
Content Development - Developing static content and managing the process of its subsequent approval versioning storage and retrieval
Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle
Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability
Source httpwwwvignettecomCDASite020971-1-30-1458-1146-174300html
HP 37
XCM
eXtended Content Management
Content DevelopmentManagement
Content DeliveryApplication ContentManagement
Content AuthoringDigital Asset Management
Software ConfigurationManagement
Document ProcessManagement
Metadata ManagementRecombinationPersonalization
Edge Network Delivery
Streaming Media DeliveryCaching
Source httpwwwvignettecom
HP 38
Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain (Kansas State)

FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc/
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
…

UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Adress Id: http://gisdasc.kgs.ukans.edu/dasc/
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
…
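Because the two models differ only in tag names for this record, a simple crosswalk table is enough to translate between them. The sketch below is illustrative: the mapping is read directly off the slide, while `translate` and the dictionary layout are assumptions, and real FGDC/UDK records are far richer than five flat fields.

```python
# Tag-name crosswalk read off the slide; spellings are kept as shown there.
FGDC_TO_UDK = {
    "Theme keywords": "Search terms",
    "Title": "Topic",
    "Online linkage": "Adress Id",
    "Direct Spatial Reference Method": "Measuring Techniques",
    "Horizontal Coordinate System Definition": "Co-ordinate System",
}
UDK_TO_FGDC = {udk: fgdc for fgdc, udk in FGDC_TO_UDK.items()}


def translate(record, crosswalk):
    """Rename the tags of a record; tags missing from the crosswalk are kept as-is."""
    return {crosswalk.get(tag, tag): value for tag, value in record.items()}


fgdc_record = {
    "Theme keywords": ["digital line graph", "hydrography", "transportation"],
    "Title": "Dakota Aquifer",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}

udk_record = translate(fgdc_record, FGDC_TO_UDK)
print(udk_record["Topic"])                                   # Dakota Aquifer
print(translate(udk_record, UDK_TO_FGDC) == fgdc_record)     # True
```

A flat renaming like this only works while the two models agree on the values; once they disagree on meaning or granularity, a shared ontology is needed rather than a lookup table.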
HP 39
Different views of Metadata
Domain Independent Specifications (RDF)
FrameworksInfrastructures (XCM)
MetadataApplication Specific
ICE
Media Specific
MPEG7 VoiceXML
Domain Specific
NewsML FGDCUDK
HP 40
Creating and Serving Metadata to Power the Life-cycle of Content
Taalee Infrastructure Services Taalee Content Applications
Where is the content
Whose is it
ProduceAggregate
CatalogIndex
What other content is it related to
Integrate Syndicate
What is the right content for this
user
Personalize
What is the best way to
monetize this interaction
Interactive Marketing
BroadcastWirelineWirelessInteractive TV
Taalee Semantic MetaBase
HP 41
Taalee's Intelligent Content Process
HP 42
Metadata Creation and Semanticization
• Automatic Content Classification/Categorization
• Metadata Creation/Extraction: Types of metadata created
HP 43
Forms/Types/Ingest of Content
Sources: Web Sites, Content Feeds, and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)
HP 44
Content Handling/Ingest
Infrastructure/Exchange
Feed Handlers/Crawlers/Screen Scrapers/Bots/Software Agents
Centralized, Distributed, Mobile/Migratory
HP 45
Information Extraction for Metadata Creation
GlobalEnterpriseWeb Repositories
METADATAMETADATA
EXTRACTORSEXTRACTORS
Digital Maps
NexisUPIAP
Documents
Digital Audios
Data Stores
Digital Videos
Digital Images
HP 46
Extracting a Text Document: Syntactic approach
INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT
NATIONAL PREPAREDNESS LEVEL II
CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.
SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr. The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.
CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. The fire is contained. Major areas of heat have been mopped-up. All crews and overhead will mop-up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.
LAYOUT
Date => day month int ',' int
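The layout rule for the date can be approximated with a regular expression. This is a minimal sketch of a syntactic extractor, not the tool used in the tutorial; the pattern, the `extract_date` helper, and the choice to make commas optional (the transcript has lost its punctuation) are assumptions.

```python
import re
from datetime import date

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# Date => day month int ',' int   (commas are treated as optional here)
DATE_RE = re.compile(
    r"\b(?:%s),?\s+(?P<month>%s)\s+(?P<dom>\d{1,2}),?\s+(?P<year>\d{4})\b"
    % ("|".join(DAYS), "|".join(MONTHS))
)


def extract_date(text):
    """Return the first date matching the layout rule, or None if there is none."""
    match = DATE_RE.search(text)
    if match is None:
        return None
    month_number = MONTHS.index(match.group("month")) + 1
    return date(int(match.group("year")), month_number, int(match.group("dom")))


header = "INCIDENT MANAGEMENT SITUATION REPORT Friday, August 1, 1997 - 0530 MDT"
print(extract_date(header))  # 1997-08-01
```

The extractor knows nothing about fires or districts; it only recognizes a surface pattern, which is what separates the syntactic approach from the knowledge-based techniques discussed later.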
Traditional Text Categorization
Statistical/AI Techniques
Classify: Place in a taxonomy
feed
Customer Training
Set
RoutingDistribution
Customer Article Feed
4715
Standard Metadata
Feed Source iSyndicate
Posted Date 11202000
Classification of Article 4715
Knowledge-base & Statistical/AI Techniques
Classify: Place in a taxonomy
MetadataCatalog
Content Manager
Precise syndicationfiltering
Article 4715 Metadata
Standard metadata: Feed Source: iSyndicate; Posted Date: 11/20/2000
Semantic metadata: Company Name: France Telecom, Equant; Ticker Symbol: FTE, ENT; Exchange: NYSE; Topic: Company News
FTECompany AnalysisConference Calls
EarningsStock Analysis
NYSEMember Companies
Market NewsIPOs
Automated Content Enrichment (ACE)
Taalee Enterprise Customization Suite
Taalee's Categorization & Automatic Metadata Creation
Taalee Training Set
Customer Training Set
ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
Classification of Article 4715
Article Feed4715 RoutingDistribution
Map to another taxonomy
HP 49
Automatic Categorization amp Metadata Tagging (unstructured texttranscript of AV)
ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION AS OF TONIGHT THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49 IN WASHINGTON STATE THE SENATE RACE REMAINS TOO CLOSE TO CALL IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT THE SENATE WILL BE EVENLY DIVIDED IN MISSOURI REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN WHO DIED IN A CRASH THREE WEEKS AGO GOVERNOR CARNAHANS WIFE IS EXPECTED TO TAKE HIS PLACE IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT HILLARY CLINTON WON THE NEW YORK SENATE SEAT SHE IS THE FIRST FIRST LADY TO RUN MUCH LESS WIN
Video Segmentwith Associated Text
Segment Description
SemanticMetadata
AutoCategorization
HP 50
Automatic Categorization amp Metadata Tagging (Web page)
Video with Editorialized Text on the Web
Auto Categorization
Semantic Metadata
HP 51
Automatic Categorization amp Metadata Tagging (Feed)
Text from Bloomberg
Auto Categorization
Semantic Metadata
HP 52
Taalee Extraction and Knowledgebase Enhancement
Extraction Agent
Web Page Enhanced Metadata Asset
HP 53
Basis for Semantics
A. Facts/Concepts/Terms/Entities
Dictionary, Thesaurus, Reference Data, Vocabulary
B. Facts with Relationships
Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)
Knowledge Base
HP 54
Basis for Semantics
C. Reasoning/Inference
(Statistical) (Information Retrieval)
Statistical Learning/AI (Bayesian, Neural Networks, HMM, …)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methods/Cluster Analysis
Learning/AI and Collab. Filtering
Reference data/Concept-terms/Dictionary/Thesaurus, by topic/industry/subject/domain
Word or Phrase
Ontologies/Domain Models
Knowledge Base, by Entities and Relationships
deeper understanding
HP 56
Open Directory Project (ODP) ClassificationTaxonomy amp Directory
HP 57
Ontology
Standardize meaning, description, representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes:
Attributes
Domain Rules
Functional Dependencies
HP 59
An Ontology
Example: Interrelated ontologies
[Figure: interrelated ontologies, with some concepts connected by links labeled "causes". Concepts shown:]
LANDUSE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL
LAND (SITE), CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE
WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; shredding, magnetic separation, screening, washing
NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI
HP 61
Large Vocabularies, Taxonomies/Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms
Confidential HP
Metadata enabledApplications
HP 63
Metadata Usage: Impact on Search & Query processing
traditional queries based on keywords
attribute based queries
content-based queries
HP 64
Oingo.com
Oingo Ontology – ODP based (?): the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning, used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely-related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing
• Context of the Content
• Semantic Metadata describing entities (i.e., Company, Industry, etc.) and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
"Normal" Content can only be "found" if the user enters a keyword that exists within it
Intelligent Content=+
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create
Context of the Content
Semantic Metadata describing entities (i.e., Company, Industry, etc.) and
Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …
HP 70
Metadata Usage Keyword Attribute and Content Based Access
The VisualHarness system at LSDISUGA
HP 71
Keyword Search vs Attribute Search with Semantic metadata
Virage Search on "football touchdown"
Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: httpcbssportsline
Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: httpcbssportsline
Metadata from Typical Cataloging of Football Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31, Pit 24
http://www.nfl.com
Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Qadry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
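The difference between the two kinds of search can be shown on a pair of toy assets. Everything in this sketch is hypothetical: the asset records are abridged from the slide, and `keyword_search` and `attribute_search` are simple stand-ins rather than the Virage or Taalee search functions.

```python
# Two toy assets; only the second carries semantic metadata.
ASSETS = [
    {
        "title": "Brian Griese Interview Part Four",
        "text": "Brian Griese talks about the first touchdown he ever threw",
        "metadata": {},
    },
    {
        "title": "Baltimore 31, Pit 24",
        "text": "Ismail and Banks hook up for their third long touchdown",
        "metadata": {
            "League": "Professional",
            "Teams": ["Ravens", "Steelers"],
            "Players": ["Tony Banks"],
            "Event": "Touchdown",
        },
    },
]


def keyword_search(assets, query):
    """Titles of assets whose title or text contains every query word."""
    words = query.lower().split()
    return [a["title"] for a in assets
            if all(w in (a["title"] + " " + a["text"]).lower() for w in words)]


def attribute_search(assets, **criteria):
    """Titles of assets whose metadata matches every field=value criterion."""
    def has(meta, field, wanted):
        value = meta.get(field, [])
        values = value if isinstance(value, list) else [value]
        return wanted in values

    return [a["title"] for a in assets
            if all(has(a["metadata"], f, w) for f, w in criteria.items())]


print(keyword_search(ASSETS, "touchdown"))
# ['Brian Griese Interview Part Four', 'Baltimore 31, Pit 24']
print(attribute_search(ASSETS, Event="Touchdown", Teams="Ravens"))
# ['Baltimore 31, Pit 24']
```

The keyword query cannot tell an interview that mentions a touchdown from footage of one, while the attribute query can ask for the event and the team directly.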
HP 72
Taalee's Semantic Search
Highly customizable, precise and freshest A/V search
Context and Domain Specific Attributes
Uniform Metadata for Content from Multiple Sources; can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY amp CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content
Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries & Hot Topics
PERSONALIZATION
Personalized Queries
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
more…
2 My Football Fantasy Team
Gators Spurrier ready for big game
Techs Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone…
Edwards explains reasons for leaving BYU
more…
3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
more…
4 Pink Floyd Collection
Set the Controls for the Heart of the Sun…
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
more…
HOT Topics
1 Election 2000
Video Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
more…
2 Middle East Peace Conflict
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Coles security
more…
3 Napster Controversy
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II
more…
HP 80
Metadata Targeting
SemanticInteractive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
Realtime Feeds
Interests Preferences
Time-ShiftedContent Aggregator
Web sites and Pages
ContentDatabases Personalized
Content
Semantic EngineTM
Personalized Content
Content
Structured Hi-Quality
Semantic Metabase
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site
User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and the knowledge of user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions listen set alert on story add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints, e.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata such as "Strong Buy" by H&Q Analyst
HP 87
iTV: Taalee's Extreme Personalization
Content Provider
(DBS DISH Wink AOL-TV)
Semantic Engine™
Meta-Data Tagged Content
Content "Programs"
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-Quality Semantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.
This screen is customizable with interactivity features using metadata such as whether there is a new Conference Call video on CSCO.
Part of the screen can be automatically customized to show conference call specific information - including transcript, participation, etc. - all of which are relevant metadata.
The Conference Call itself can have embedded metadata to support personalization and interactivity.
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support
Sony
Categorize
Catalog
Integrate
Collection Processing
Network Content
Affiliate Feeds
Public Sources Rich Data
Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko facing the central business district of Lagos across a lagoon
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC) which owns the pipeline were joined by other firemen from construction company Julius Berger in battling the blaze
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak ravaging a cluster of huts and log houses
At about the same time a second fire razed Makoko shantytown where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(133) - 120600 - ABC
(253) - 120600 - CBS
(516) - 120600 - ABC
(246) - 120600 - FOX
(133) - 120600 - NBC
(533) - 120600
(357) - 120600 - CBS
(427) - 120600 - ABC
(344) - 120600 - FOX
(724) - 120600 - CBS
(133) - 120600 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(133) - 120600 - ABC
(233) - 120600 - CBS
(312) - 120600 - NNS
(032) - 120600 - CBS
(133) - 120600 - CBS
Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore
HP 92
Retrieve Scene Description Track
Enhanced Digital Cable
Video
MPEGDecoder
Node = AVO Object
Create Scene Description Tree
GREAT USER EXPERIENCE
Metadata's role in emerging iTV infrastructure
MPEG-2/4/7
MPEG Encoder
Scene Description Tree
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers and Broadcasters
Enhanced XML
Description
"Cisco Systems"
Node
Taalee Semantic Engine
"Cisco Systems"
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta
Object Content Information (OCI)
Metadata-rich Value-added Node
HP 93
Intelligent Metadata Creation
Content which does contain the words the user asked for
Extractor Agents
Content which does not contain the words the user asked for but is about what he asked for
Value-added Metadata
Content the user did not think to ask for but which he needs to know
Semantic Associations
+ +
Metadata for Intelligent Content Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example, if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".
Understanding of the content is needed to create new metadata.
Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses such Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely.
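A minimal sketch of the idea on this slide, not Taalee's actual implementation: a small knowledge base of entity facts is used to add team and sport metadata that the story text never mentions. The dictionary layout, the field names, and the functions `enrich` and `matches` are assumptions made for illustration; only the Roger Clemens facts come from the slide.

```python
# Hypothetical knowledge base. The facts about Roger Clemens are the ones
# stated on the slide; the structure and key names are assumptions.
KNOWLEDGE_BASE = {
    "Roger Clemens": {"type": "PERSON", "sport": "Baseball", "team": "New York Yankees"},
}


def enrich(extracted):
    """Return a copy of the extracted metadata plus value-added fields."""
    enriched = {key: list(values) for key, values in extracted.items()}
    for player in extracted.get("players", []):
        facts = KNOWLEDGE_BASE.get(player)
        if facts is None:
            continue  # unknown entity: nothing to add
        for field, fact_key in (("teams", "team"), ("sports", "sport")):
            bucket = enriched.setdefault(field, [])
            if facts[fact_key] not in bucket:
                bucket.append(facts[fact_key])
    return enriched


def matches(metadata, query):
    """Case-insensitive substring match over every metadata value."""
    needle = query.lower()
    return any(needle in value.lower()
               for values in metadata.values()
               for value in values)


# Metadata extracted syntactically from a story that only names the player.
story = {"players": ["Roger Clemens"]}
enriched_story = enrich(story)

print(matches(story, "Yankees"))           # keyword-only metadata misses it
print(matches(enriched_story, "Yankees"))  # value-added metadata finds it
```

The lookup is deliberately one hop deep (player to team); a fuller model would follow further associations such as team to league.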
HP 96
Guided Demo for Value-Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value-Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 - Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Posted Date: Date of asset posting - Extracted automatically
League Name: Name of league to which the player's team belongs - Not mentioned explicitly in asset - Value-added by Taalee's processing based on semantic associations
Team Name: Name of team for which player plays - Not mentioned explicitly in asset - Value-added using Taalee's semantic relationships
Producer Name: Name of content provider that produced the asset
Player Names: Names of players mentioned explicitly in the asset - Extracted automatically
Sport: Name of sport
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.
The asset is richly and fully described in the many ways users choose to interact.
Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context or relationships of keywords
For example, a search engine may see that the phrase "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
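A small sketch of how typed associations let a lookup return related content as well as what was asked for. The facts are those named on the slide; the triple layout, the story titles and the function names are hypothetical and only illustrate the idea.

```python
# Associations as (subject, relation, object) triples. The facts come from
# the slide; storing them as triples is an assumption of this sketch.
ASSOCIATIONS = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Ariba", "is_a", "COMPANY"),
    ("Commerce One", "participates_in", "Corporate Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]

# Hypothetical catalog: placeholder story titles keyed by company.
STORIES = {
    "Commerce One": ["Commerce One story"],
    "Ariba": ["Ariba story"],
}


def competitors(company):
    """Companies linked by competes_with, treated as a symmetric relation."""
    found = set()
    for subject, relation, obj in ASSOCIATIONS:
        if relation != "competes_with":
            continue
        if subject == company:
            found.add(obj)
        elif obj == company:
            found.add(subject)
    return sorted(found)


def intelligent_results(company):
    """What the user asked for, plus stories about the company's competitors."""
    related = [title
               for rival in competitors(company)
               for title in STORIES.get(rival, [])]
    return {"asked_for": STORIES.get(company, []), "related": related}


print(intelligent_results("Commerce One"))
```

A keyword index alone would return only the first list; the second list exists only because the competes_with relationship is stored explicitly.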
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Internal Source 1: Research
Internal Source 2
External feeds/Web (e.g., Reuters)
1
2
3
4
Cisco story from PW Source 1 passed on to add semantic associations
Consults Knowledge Base for Cisco's competition
Returns result: Lucent is a competitor of Cisco
Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard
Story on Lucent
Story on Cisco
XCM-compliant metadata, XML or other format
Semantic Application
ASP/Enterprise hosted
Extractor Agent 1
Extractor Agent 2
Extractor Agent 3
Metadata-centric Content Management Architecture
Semantic Engine
World Model
Taalee Metabase
Third-party Content Mgmt and Syndication
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
SEC, EPA
Regulations: Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for the future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic rules which allow us to draw inferences from this data
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]
A personal definition
Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods.
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF+XML
Agent-readable tags
www.daml.org
DAML Example
Source httpwwwzdnetcompcweekstoriesjumps04270243294600html
Three-layered Architecture of Semantic Web
Logical Layer
Formal Semantics and Reasoning Support - OIL, DAML-O
Schema Layer
Definition of Vocabulary - RDF Schema
Data Layer
Simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL - Evolving towards Semantic Web
OIL Mission
OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Tests May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake
<= dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
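The rule above can be read as a predicate over one nuclear test and one earthquake. The sketch below is one possible reading, not the InfoQuilt implementation: it assumes the date difference is in days and must be non-negative (the earthquake follows the test), that the distance threshold is in miles, and that distance is great-circle distance. The sample records are hypothetical.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """True when the quake follows the test within max_days and max_miles."""
    days_after = (quake["eventDate"] - test["eventDate"]).days
    if not 0 <= days_after < max_days:
        return False
    return distance_miles(test["latitude"], test["longitude"],
                          quake["latitude"], quake["longitude"]) < max_miles


# Hypothetical records, using the attribute names from the rule.
test = {"eventDate": date(2000, 1, 1), "latitude": 0.0, "longitude": 0.0}
quake_near = {"eventDate": date(2000, 1, 6), "latitude": 0.0, "longitude": 10.0}
quake_before = {"eventDate": date(1999, 12, 25), "latitude": 0.0, "longitude": 10.0}
quake_far = {"eventDate": date(2000, 1, 6), "latitude": 0.0, "longitude": 180.0}

print(may_have_caused(test, quake_near))
print(may_have_caused(test, quake_before))
print(may_have_caused(test, quake_far))
```

Finding all candidate pairs is then a filter over the cross product of the two sources, which is where the cost of such complex relationships shows up.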
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example...
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9 and >9 on the Richter scale per year, starting from 1900, find the average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
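The grouping query on this slide can be sketched as a count per (year, magnitude range) followed by an average over years. This is only an illustration: the interval boundaries are treated as half-open (an assumption, since the slide does not say which range a magnitude of exactly 6 or 7 falls into), and the sample readings are made up.

```python
from collections import Counter

# Magnitude ranges from the slide, as half-open intervals [low, high).
BINS = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, float("inf"))]


def bin_label(magnitude):
    """Label of the range a magnitude falls into, or None if below 5.8."""
    for low, high in BINS:
        if low <= magnitude < high:
            return f"{low}+" if high == float("inf") else f"{low}-{high}"
    return None


def count_by_year_and_range(quakes):
    """Count earthquakes per (year, magnitude range)."""
    counts = Counter()
    for year, magnitude in quakes:
        label = bin_label(magnitude)
        if label is not None:
            counts[(year, label)] += 1
    return counts


def average_per_year(counts, label, first_year, last_year):
    """Average yearly count for one range over an inclusive span of years."""
    total = sum(n for (year, lab), n in counts.items()
                if lab == label and first_year <= year <= last_year)
    return total / (last_year - first_year + 1)


# Hypothetical (year, magnitude) readings, for illustration only.
quakes = [(1900, 5.9), (1900, 6.5), (1901, 6.2), (1901, 7.1), (1901, 5.0)]
counts = count_by_year_and_range(quakes)
print(counts[(1900, "5.8-6.0")])
print(average_per_year(counts, "6.0-7.0", 1900, 1901))
```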
Knowledge Discovery - Example...
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and within a radius of 10000 miles from the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 33
ICE Applications
ICE vocabulary + domain vocabulary = complete application
ICE
establishes and manages the syndication
delivers data
logs events
=> content-independent metadata
industry-specific vocabulary defines the content => domain-specific metadata
Source: http://www.icestandard.org
HP 34
ICE Explained
ICE: Information and Content Exchange protocol
Syndicator: A content aggregator and distributor
Subscriber: A content consumer
Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement
Collection: The current content of a subscription
ICE Package: A delivery of commands to update a collection, such as the addition of content items
ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs and other management information
Sources: InternetWeek; ICE Cookbook version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm
<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "httpicedtd">
<ice-payload payload-id="ipl-80a56cfe"
    timestamp="05-15-2001T110001" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321"
          subscription-element="4321" name="Cartoon" filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru" copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC ...(ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>
Content (domain-specific metadata)
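To make the split between ICE's content-independent metadata and the domain-specific content concrete, the sketch below parses a trimmed, well-formed variant of the payload above with Python's standard library. The element and attribute names are copied from the slide and have not been checked against the ICE specification; the function `split_metadata` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed variant of the slide's payload (names as shown there, DOCTYPE omitted).
SAMPLE = (
    '<ice-payload payload-id="ipl-80a56cfe">'
    '<ice-response response-id="irp-20010515181600">'
    '<ice-item-group group-id="grp-8610">'
    '<ice-item item-id="4321" name="Cartoon" filename="demo.gif">'
    '<comic-strip title="Looney City" pubdate="20010515">...</comic-strip>'
    '</ice-item>'
    '</ice-item-group>'
    '</ice-response>'
    '</ice-payload>'
)


def split_metadata(payload_xml):
    """Separate ICE delivery attributes from the domain-specific content."""
    root = ET.fromstring(payload_xml)
    items = []
    for item in root.iter("ice-item"):
        # Attributes on ice-item describe the delivery (content-independent).
        delivery = dict(item.attrib)
        # Child elements belong to the domain vocabulary (here, comic strips).
        content = [{"tag": child.tag, **child.attrib} for child in item]
        items.append({"ice": delivery, "content": content})
    return items


items = split_metadata(SAMPLE)
print(items[0]["ice"]["item-id"])
print(items[0]["content"][0]["title"])
```

The ICE layer never needs to understand `comic-strip`; that vocabulary is agreed between syndicator and subscriber.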
HP 36
XCM (eXtended Content Management)
A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:
Content Development - Developing static content and managing the process of its subsequent approval versioning storage and retrieval
Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle
Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability
Source httpwwwvignettecomCDASite020971-1-30-1458-1146-174300html
HP 37
XCM
eXtended Content Management
Content Development/Management
Content Delivery
Application Content Management
Content Authoring
Digital Asset Management
Software Configuration Management
Document Process Management
Metadata Management
Recombination
Personalization
Edge Network Delivery
Streaming Media Delivery
Caching
Source httpwwwvignettecom
HP 38
Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain
FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
...
UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Adress Id: http://gisdasc.kgs.ukans.edu/dasc
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
...
Kansas State
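One common way to reconcile two such models is a crosswalk that maps each model's tag names onto a shared set of field names. The sketch below assumes such a crosswalk; the FGDC and UDK tag names and values come from the slide (with URL punctuation reconstructed), while the common field names and the `normalize` function are invented for illustration.

```python
# Tag names from the slide mapped to hypothetical common field names.
FGDC_TO_COMMON = {
    "Theme keywords": "keywords",
    "Title": "title",
    "Online linkage": "url",
    "Direct Spatial Reference Method": "spatial_representation",
    "Horizontal Coordinate System Definition": "coordinate_system",
}
UDK_TO_COMMON = {
    "Search terms": "keywords",
    "Topic": "title",
    "Adress Id": "url",
    "Measuring Techniques": "spatial_representation",
    "Co-ordinate System": "coordinate_system",
}


def normalize(record, crosswalk):
    """Rename a record's tags to the common field names; drop unmapped tags."""
    return {crosswalk[tag]: value
            for tag, value in record.items()
            if tag in crosswalk}


fgdc_record = {
    "Theme keywords": "digital line graph, hydrography, transportation",
    "Title": "Dakota Aquifer",
    "Online linkage": "http://gisdasc.kgs.ukans.edu/dasc",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}
udk_record = {
    "Search terms": "digital line graph, hydrography, transportation",
    "Topic": "Dakota Aquifer",
    "Adress Id": "http://gisdasc.kgs.ukans.edu/dasc",
    "Measuring Techniques": "Vector",
    "Co-ordinate System": "Universal Transverse Mercator",
}

# Both descriptions of the same data set become identical after normalization.
print(normalize(fgdc_record, FGDC_TO_COMMON) == normalize(udk_record, UDK_TO_COMMON))
```

A pure renaming like this handles only the syntactic mismatch; tags whose meanings differ slightly between models need semantic mapping, which is the argument for ontologies made elsewhere in the deck.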
HP 39
Different views of Metadata
Domain Independent Specifications (RDF)
Frameworks/Infrastructures (XCM)
Metadata
Application Specific: ICE
Media Specific: MPEG7, VoiceXML
Domain Specific: NewsML, FGDC/UDK
HP 40
Creating and Serving Metadata to Power the Life-cycle of Content
Taalee Infrastructure Services / Taalee Content Applications
Where is the content? Whose is it?
Produce/Aggregate
Catalog/Index
What other content is it related to?
Integrate, Syndicate
What is the right content for this user?
Personalize
What is the best way to monetize this interaction?
Interactive Marketing
Broadcast, Wireline, Wireless, Interactive TV
Taalee Semantic MetaBase
HP 41
Taalee's Intelligent Content Process
HP 42
Metadata Creation and Semanticization
• Automatic Content Classification/Categorization
• Metadata Creation/Extraction: types of metadata created
Semantic Engine and WorldModel are trademarks of Taalee Inc. Metadata Extraction is a patented technology of Taalee Inc.
HP 43
Forms/Types/Ingest of Content
Sources: Web Sites, Content Feeds, and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+Media), Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)
HP 44
Content Handling/Ingest
Infrastructure/Exchange: Feed Handlers, Crawlers, Screen Scrapers, Bots, Software Agents
Centralized, Distributed, Mobile/Migratory
HP 45
Information Extraction for Metadata Creation
Global/Enterprise/Web Repositories
METADATA EXTRACTORS
Digital Maps
Nexis, UPI, AP
Documents
Digital Audios
Data Stores
Digital Videos
Digital Images
HP 46
Extracting a Text Document: Syntactic approach
INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT
NATIONAL PREPAREDNESS LEVEL II
CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.
SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats between Galena and McGr. The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained while protection of the historic cabin continues.
CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.
LAYOUT
Date => day month int ',' int
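The layout rule for the date field can be approximated with a regular expression. This is a sketch of the syntactic approach in general, not the extractor used in the talk; the pattern, the optional commas (the flattened slide text has lost its punctuation) and the field names are assumptions.

```python
import re

DAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"
MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")

# day-of-week, month name, integer, optional comma, integer (year).
DATE_PATTERN = re.compile(
    r"(?P<dow>" + DAYS + r"),?\s+"
    r"(?P<month>" + MONTHS + r")\s+"
    r"(?P<day>\d{1,2}),?\s+"
    r"(?P<year>\d{4})"
)


def extract_date(text):
    """Return the first date found as a dict, or None if there is no match."""
    match = DATE_PATTERN.search(text)
    if match is None:
        return None
    return {
        "day_of_week": match["dow"],
        "month": match["month"],
        "day": int(match["day"]),
        "year": int(match["year"]),
    }


header = "INCIDENT MANAGEMENT SITUATION REPORT Friday, August 1, 1997 - 0530 MDT"
print(extract_date(header))
```

Such a pattern works only while the report layout stays fixed, which is the limitation of purely syntactic extraction that the following slides contrast with knowledge-based techniques.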
Traditional Text Categorization
Statistical/AI Techniques
Classify: Place in a taxonomy
feed
Customer Training Set
Routing/Distribution
Customer Article Feed
4715
Standard Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Classification of Article 4715
Knowledge-base & Statistical/AI Techniques
Classify: Place in a taxonomy
Metadata Catalog
Content Manager
Precise syndication/filtering
Article 4715 Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News
Standard metadata
Semantic metadata
FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
NYSE: Member Companies, Market News, IPOs
Automated Content Enrichment (ACE)
Taalee Enterprise Customization Suite
Taalee's Categorization & Automatic Metadata Creation
Taalee Training Set
Customer Training Set
ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
Classification of Article 4715
Article Feed 4715
Routing/Distribution
Map to another taxonomy
HP 49
Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.
Video Segment with Associated Text
Segment Description
Semantic Metadata
Auto Categorization
HP 50
Automatic Categorization & Metadata Tagging (Web page)
Video with Editorialized Text on the Web
Auto Categorization
Semantic Metadata
HP 51
Automatic Categorization & Metadata Tagging (Feed)
Text from Bloomberg
Auto Categorization
Semantic Metadata
HP 52
Taalee Extraction and Knowledgebase Enhancement
Extraction Agent
Web Page Enhanced Metadata Asset
HP 53
Basis for Semantics
A. Facts/Concepts/Terms/Entities
Dictionary, Thesaurus, Reference Data/Vocabulary
B. Facts with Relationships
Taxonomy (Categories), Ontology/Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)
Knowledge Base
HP 54
Basis for Semantics
C. Reasoning/Inference
(Statistical) (Information Retrieval)
Statistical Learning/AI (Bayesian, Neural Networks, HMM...)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methods/Cluster Analysis
Learning/AI and Collab. Filtering
Reference data/Concept-terms/Dictionary/Thesaurus, by topic/industry/subject/domain
Word or Phrase
Ontologies/Domain Models
Knowledge Base, by Entities and Relationships
deeper understanding
HP 56
Open Directory Project (ODP) Classification/Taxonomy & Directory
HP 57
Ontology
Standardize meaning, description, representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes: Attributes, Domain Rules, Functional Dependencies
HP 59
An Ontology
Example Interrelated ontologies
LAND USE
COMMERCIAL
INDUSTRIAL
RURAL
RESIDENTIAL
AGRICULTURAL
MILITARY
RECREATIONAL
LAND (SITE)
CULTIVATED AREA
GREENLAND AREA
LAND BANK
ZONING
LANDFILL SITE
WASTE DISPOSAL
RECYCLING
HAZARDOUS
LANDFILL
RESOURCE REC
SOLID SEWAGE
shredding
magnetic separation
screening
washing
NATURAL DISASTER
EARTHQUAKE
causes
LANDSLIDE
VOLCANO
STORM
FLOOD
FIRE
AVALANCHE
TSUNAMI
causes
causes
causes
HP 61
Large Vocabularies, Taxonomies/Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.
Confidential HP
Metadata-enabled Applications
HP 63
Metadata Usage: Impact on Search & Query processing
traditional queries based on keywords
attribute-based queries
content-based queries
HP 64
Oingo.com
Oingo Ontology - ODP based(?); the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning, used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely-related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:
• Context of the Content
• Semantic Metadata describing entities (e.g., Company, Industry, etc.) and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
"Normal" Content can only be "found" if the user enters a keyword that exists within it
Intelligent Content = +
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:
Context of the Content
Semantic Metadata describing entities (e.g., Company, Industry, etc.) and
Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, ...
HP 70
Metadata Usage: Keyword, Attribute and Content Based Access
The VisualHarness system at LSDIS/UGA
HP 71
Keyword Search vs Attribute Search with Semantic metadata
Virage Search on football touchdown
Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL httpcbssportsline
Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL httpcbssportsline
Metadata from Typical Cataloging of Football Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31, Pit 24
http://www.nfl.com
Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2022000
HP 72
Taalee's Semantic Search
Highly customizable, precise and freshest A/V search
Context and Domain Specific Attributes. Uniform Metadata for Content from Multiple Sources. Can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY amp CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content: Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries & Hot Topics
PERSONALIZATION
Personalized Queries
1. My Stock Portfolio: Microsoft suffers serious hack attack; Cisco Systems Inc; Analyst Safa Rashtchy on Yahoo; PeopleSoft Inc; AT&T Corp; more...
2. My Football Fantasy Team: Gators Spurrier ready for big game; Techs Vick looks to become complete QB; Bucs excited about Hamilton; Jasper Sanks rumbles into the end zone...; Edwards explains reasons for leaving BYU; more...
3. Julia Roberts Collection: Movie Trailer Notting Hill; Trailer - Runaway Bride; Patrick; Movie Trailer Stepmom; Conspiracy Theory; more...
4. Pink Floyd Collection: Set the Controls for the Heart of the Sun...; Wish You Were Here; Round And Around; Keep Talking; The Post War Dream; more...
HOT Topics
1. Election 2000: Video Explaining the electoral map; Race for White House hots up; Seniors Give Gore Florida Edge; more...
2. Middle East Peace Conflict: More die as Israel steps up security; Israel braces for suicide bombs; Pentagon probes Coles security; more...
3. Napster Controversy: The Brain Behind Napster; Napster Lawsuit; Creative Nomad II; more...
HP 80
Metadata Targeting
SemanticInteractive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web: Extreme Personalization
Realtime Feeds
Web sites and Pages
Content Databases
Time-Shifted Content Aggregator
Interests, Preferences
Semantic Engine™
Structured Hi-Quality Semantic Metabase
Personalized Content
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site
User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions listen set alert on story add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst
HP 87
iTV: Taalee's Extreme Personalization
Content Provider (DBS, DISH, Wink, AOL-TV)
Semantic Engine™
Meta-Data Tagged Content
Content "Programs"
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-Quality Semantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in
This screen is customizable with an interactivity feature, using metadata such as whether there is a new Conference Call video on CSCO
Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata
The Conference Call itself can have embedded metadata to support personalization and interactivity
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support
Sony
Categorize
Catalog
Integrate
Collection, Processing
Network Content
Affiliate Feeds
Public Sources, Rich Data
Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(133) - 120600 - ABC
(253) - 120600 - CBS
(516) - 120600 - ABC
(246) - 120600 - FOX
(133) - 120600 - NBC
(533) - 120600
(357) - 120600 - CBS
(427) - 120600 - ABC
(344) - 120600 - FOX
(724) - 120600 - CBS
(133) - 120600 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(133) - 120600 - ABC
(233) - 120600 - CBS
(312) - 120600 - NNS
(032) - 120600 - CBS
(133) - 120600 - CBS
Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore
HP 92
Retrieve Scene Description Track
Enhanced Digital Cable
Video
MPEGDecoder
Node = AVO Object
Create Scene Description Tree
GREATUSER
EXPERIENCE
Metadata's role in emerging iTV infrastructure
MPEG-2/4/7
MPEG Encoder
Scene Description Tree
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers, and Broadcasters
Enhanced XML
Description
"Cisco Systems"
Node
Taalee Semantic Engine
"Cisco Systems"
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta
Object Content Information (OCI)
Metadata-richValue-added Node
HP 93
Intelligent Metadata Creation
Content which does contain the words the user asked for
Extractor Agents
Content which does not contain the words the user asked for but is about what he asked for
Value-added Metadata
Content the user did not think to ask for but which he needs to know
Semantic Associations
+ +
Metadata for Intelligent Content Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"
Understanding of the content is needed to create new metadata
Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses such Semantic Associations (another example: COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely
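The enrichment step described on this slide can be sketched roughly as follows. This is only an illustration, not Taalee's patented extraction technology: the `KNOWLEDGE_BASE` layout, the relation names `plays_for_team` and `plays_sport`, and the `enrich` function are all hypothetical, and the only facts used are the ones stated on the slide.

```python
# Hypothetical, hand-written knowledge base holding only the slide's facts.
KNOWLEDGE_BASE = {
    "Roger Clemens": {
        "type": "PERSON",
        "plays_sport": "Baseball",
        "plays_for_team": "New York Yankees",
    },
}

# Which knowledge-base relation feeds which metadata field (assumed names).
RELATION_TO_FIELD = (
    ("plays_for_team", "Teams"),
    ("plays_sport", "Sport"),
)


def enrich(metadata):
    """Return a copy of `metadata` with value-added fields inferred
    from the players that were extracted explicitly from the asset."""
    enriched = {field: list(values) for field, values in metadata.items()}
    for player in metadata.get("Players", []):
        facts = KNOWLEDGE_BASE.get(player, {})
        for relation, field in RELATION_TO_FIELD:
            value = facts.get(relation)
            if value is not None and value not in enriched.get(field, []):
                enriched.setdefault(field, []).append(value)
    return enriched


def matches(metadata, field, value):
    """Attribute search: does the asset carry `value` in `field`?"""
    return value in metadata.get(field, [])


extracted = {"Players": ["Roger Clemens"]}   # found in the story text
value_added = enrich(extracted)              # adds Teams and Sport
print(matches(extracted, "Teams", "New York Yankees"))    # keyword-only view
print(matches(value_added, "Teams", "New York Yankees"))  # enriched view
```

The point of the sketch is the asymmetry between the two `matches` calls: the story never mentions the team, so only the enriched metadata lets a search on the team reach it.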
HP 96
Guided Demo for Value Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL
Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League
Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 - Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Producer Name: Name of content provider that produced the asset
Posted Date: Date of asset posting - extracted automatically
Player Names: Names of players mentioned explicitly in the asset - extracted automatically
Team Name: Name of team for which player plays - not mentioned explicitly in asset - value-added using Taalee's semantic relationships
League Name: Name of league to which the player's team belongs - not mentioned explicitly in asset - value-added by Taalee's processing based on semantic associations
Sport: Name of sport
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships
The asset is richly and fully described in the many ways the users choose to interact
Legend: "X -> Y" means Taalee uses X to add Y as value-added metadata to the asset
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context, or relationships of keywords
For example, a search engine may see that the phrase "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
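A minimal sketch of how such associations could drive "related news" lookups is given below. The triple layout, the relation names, and the helper functions are assumptions made for illustration; the only facts encoded are the two stated on the slide, and the story titles used in the usage lines are placeholders, not real assets.

```python
# Semantic associations stored as (subject, relation, object) triples.
# Only the two facts named on the slide are encoded here.
ASSOCIATIONS = [
    ("Commerce One", "participates_in",
     "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]


def objects_of(subject, relation):
    """Entities that `subject` points to through `relation`."""
    return [o for s, r, o in ASSOCIATIONS if s == subject and r == relation]


def subjects_of(obj, relation):
    """Entities that point to `obj` through `relation`."""
    return [s for s, r, o in ASSOCIATIONS if o == obj and r == relation]


def competitor_news(company, news_by_company):
    """Collect stories about companies on either side of a
    competes_with association with `company`."""
    competitors = (objects_of(company, "competes_with")
                   + subjects_of(company, "competes_with"))
    results = {}
    for competitor in dict.fromkeys(competitors):  # de-duplicate, keep order
        results[competitor] = news_by_company.get(competitor, [])
    return results


# Placeholder catalog keyed by the COMPANY entity each story is about.
catalog = {
    "Commerce One": ["placeholder story about Commerce One"],
    "Ariba": ["placeholder story about Ariba"],
}
print(competitor_news("Commerce One", catalog))
```

Looking up the relation in both directions reflects the two link groups shown on the example slide (companies that compete against Commerce One, and companies Commerce One competes against); whether the real system stores the relation symmetrically is not stated in the source.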
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata-centric Content Management Architecture
Internal Source 1 (Research)
Internal Source 2
External feeds/Web (e.g., Reuters)
Extractor Agent 1
Extractor Agent 2
Extractor Agent 3
Semantic Engine
World Model
Taalee Metabase
XCM-compliant metadata, XML or other format
Semantic Application (ASP/Enterprise hosted)
Third-party Content Mgmt and Syndication
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults KnowledgeBase for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story; passed on to Dashboard
Story on Lucent
Story on Cisco
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
SEC, EPA
Regulations: Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture" [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]
A personal definition. Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources and to provide interoperability between agents in a semantic manner
Based on RDF+XML
Agent-readable tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-layered Architecture of Semantic Web
Logical Layer
Formal Semantics and Reasoning Support - OIL, DAML-O
Schema Layer
Definition of Vocabulary - RDF Schema
Data Layer
Simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL ndash Evolving towards Semantic Web
OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Test May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake <=
    dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30
    AND distance(NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude) < 10000
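The rule above can be read as an executable predicate. The sketch below is one possible reading, not the InfoQuilt implementation: it assumes `dateDifference` is measured in days with the earthquake on or after the test, that `distance` is a great-circle distance in miles, and that a mean Earth radius of 3958.8 miles is acceptable. The `Event` class and function names are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius (assumption of this sketch)


@dataclass
class Event:
    """A nuclear test or an earthquake, reduced to the rule's attributes."""
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    phi1, phi2 = radians(lat1), radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = radians(lon2 - lon1)
    a = sin(d_phi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(min(1.0, a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """NuclearTest Causes Earthquake, per the rule's two conditions."""
    days_after = (quake.event_date - test.event_date).days
    if not 0 <= days_after < max_days:
        return False
    return distance_miles(test.latitude, test.longitude,
                          quake.latitude, quake.longitude) < max_miles


# Made-up events, for illustration only.
test_event = Event(date(2000, 1, 1), 0.0, 0.0)
quake_event = Event(date(2000, 1, 15), 0.0, 90.0)
print(may_have_caused(test_event, quake_event))
```

Keeping the two thresholds as parameters makes it easy to tighten "nearby region", since the slide's 10000 bound covers most of the globe.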
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example...
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
Knowledge Discovery - Example...
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake
Demo
ResourcesReferences
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/9910-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 34
ICE Explained
ICE: Information and Content Exchange protocol
Syndicator: A content aggregator and distributor
Subscriber: A content consumer
Subscription: An agreement between a subscriber and a syndicator for the delivery of content according to the delivery policy and other parameters in the agreement
Collection: The current content of a subscription
ICE Package: A delivery of commands to update a collection, such as the addition of content items
ICE Payload: The XML document used by ICE to carry protocol information. Examples include requests for packages, catalogs of subscription offers, usage logs, and other management information
Sources: InternetWeek; ICE Cookbook version 1.0; http://www.internetweek.com/ebizapps01/ebiz050701-3.htm
<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe" timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321" subscription-element="4321" name="Cartoon" filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru" copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC ... (ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>
Content (domain-specific metadata)
HP 36
XCM (eXtended Content Management)
a framework that allows customers to classify content management offerings according to the business problems they address The segments of XCM are
Content Development - Developing static content and managing the process of its subsequent approval versioning storage and retrieval
Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle
Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability
Source httpwwwvignettecomCDASite020971-1-30-1458-1146-174300html
HP 37
XCM
eXtended Content Management
Content DevelopmentManagement
Content DeliveryApplication ContentManagement
Content AuthoringDigital Asset Management
Software ConfigurationManagement
Document ProcessManagement
Metadata ManagementRecombinationPersonalization
Edge Network Delivery
Streaming Media DeliveryCaching
Source httpwwwvignettecom
HP 38
Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain
FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
...
UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Adress Id: http://gisdasc.kgs.ukans.edu/dasc
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
...
Kansas State
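Because the two models carry the same values under different tag names, a small tag-mapping table is enough to translate one record into the other. The sketch below pairs the tags in the order they appear on the slide; that pairing, the `UDK_TO_FGDC` table, and the `to_fgdc` function are assumptions for illustration, not part of either standard.

```python
# Tag correspondences read off the slide (the pairing is an assumption).
# "Adress Id" is spelled as it appears on the slide.
UDK_TO_FGDC = {
    "Search terms": "Theme keywords",
    "Topic": "Title",
    "Adress Id": "Online linkage",
    "Measuring Techniques": "Direct Spatial Reference Method",
    "Co-ordinate System": "Horizontal Coordinate System Definition",
}


def to_fgdc(udk_record):
    """Rename UDK tags to their FGDC counterparts; unknown tags pass through."""
    return {UDK_TO_FGDC.get(tag, tag): value
            for tag, value in udk_record.items()}


udk_record = {
    "Search terms": ["digital line graph", "hydrography", "transportation"],
    "Topic": "Dakota Aquifer",
    "Measuring Techniques": "Vector",
    "Co-ordinate System": "Universal Transverse Mercator",
}
print(to_fgdc(udk_record))
```

A flat rename only works when the values themselves are comparable; where the two models use different vocabularies or units for a value, the mapping would need a per-tag conversion function as well.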
HP 39
Different views of Metadata
Domain Independent Specifications (RDF)
Frameworks/Infrastructures (XCM)
Metadata
Application Specific: ICE
Media Specific: MPEG7, VoiceXML
Domain Specific: NewsML, FGDC/UDK
HP 40
Creating and Serving Metadata to Power the Life-cycle of Content
Taalee Infrastructure Services, Taalee Content Applications
Where is the content? Whose is it?
Produce/Aggregate, Catalog/Index
What other content is it related to?
Integrate, Syndicate
What is the right content for this user?
Personalize
What is the best way to monetize this interaction?
Interactive Marketing
Broadcast, Wireline, Wireless, Interactive TV
Taalee Semantic MetaBase
HP 41
Taalee's Intelligent Content Process
HP 42
Metadata Creation and Semanticization
• Automatic Content Classification/Categorization
• Metadata Creation/Extraction: types of metadata created
Semantic Engine and WorldModel are trademarks of Taalee Inc. Metadata Extraction is a patented technology of Taalee Inc.
HP 43
Forms/Types/Ingest of Content
Sources: Web Sites, Content Feeds, and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)
HP 44
Content Handling/Ingest
Infrastructure/Exchange: Feed Handlers, Crawlers, Screen Scrapers, Bots, Software Agents
Centralized, Distributed, Mobile/Migratory
HP 45
Information Extraction for Metadata Creation
Global/Enterprise/Web Repositories
METADATA EXTRACTORS
Digital Maps
Nexis, UPI, AP
Documents
Digital Audios
Data Stores
Digital Videos
Digital Images
Digital Images
HP 46
Extracting a Text Document: Syntactic approach
INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT
NATIONAL PREPAREDNESS LEVEL II
CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.
SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr. The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.
CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. The fire is contained. Major areas of heat have been mopped-up. All crews and overhead will mop-up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.
LAYOUT
Date => day month int ',' int
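A layout rule of this kind maps naturally onto a regular expression. The sketch below is a hedged illustration of the syntactic approach, not the extractor used in the original system: it assumes `day` means a weekday name, `month` a full month name, and that the quoted separator in the rule is a comma (made optional here because the extracted report header has lost its punctuation). The names `DATE_RULE` and `extract_date` are hypothetical.

```python
import re

DAYS = ("Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday")
MONTHS = ("January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December")

# Date => day month int ',' int, with the commas treated as optional.
DATE_RULE = re.compile(
    r"\b(?P<day>" + "|".join(DAYS) + r"),?\s+"
    r"(?P<month>" + "|".join(MONTHS) + r")\s+"
    r"(?P<date>\d{1,2}),?\s+"
    r"(?P<year>\d{4})\b"
)


def extract_date(text):
    """Return the first date matching the layout rule as a dict, or None."""
    match = DATE_RULE.search(text)
    return match.groupdict() if match else None


header = "INCIDENT MANAGEMENT SITUATION REPORT\nFriday, August 1, 1997 - 0530 MDT"
print(extract_date(header))
```

Rules like this are cheap and precise for fixed report layouts, but they recover only what the layout exposes; they say nothing about which fire or district the date belongs to, which is where the semantic techniques in the following slides come in.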
Traditional Text Categorization
Statistical/AI Techniques
Classify: place in a taxonomy
Customer Training Set
Routing/Distribution
Customer Article Feed: 4715
Standard Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Classification of Article 4715
Knowledge-base & Statistical/AI Techniques
Classify: place in a taxonomy
Metadata Catalog
Content Manager
Precise syndication/filtering
Article 4715 Metadata
Standard metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Semantic metadata
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News
FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
NYSE: Member Companies, Market News, IPOs
Automated Content Enrichment (ACE)
Taalee Enterprise Customization Suite
Taalee's Categorization & Automatic Metadata Creation
Taalee Training Set
Customer Training Set
ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
Classification of Article 4715
Article Feed: 4715
Routing/Distribution
Map to another taxonomy
HP 49
Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.
Video Segmentwith Associated Text
Segment Description
SemanticMetadata
AutoCategorization
HP 50
Automatic Categorization & Metadata Tagging (Web page)
Video with Editorialized Text on the Web
Auto Categorization
Semantic Metadata
HP 51
Automatic Categorization & Metadata Tagging (Feed)
Text from Bloomberg
Auto Categorization
Semantic Metadata
HP 52
Taalee Extraction and Knowledgebase Enhancement
Extraction Agent
Web Page Enhanced Metadata Asset
HP 53
Basis for Semantics
A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary
B. Facts with Relationships: Taxonomy (Categories), Ontology/Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)
Knowledge Base
HP 54
Basis for Semantics
C. Reasoning/Inference: (Statistical) (Information Retrieval); Statistical Learning/AI (Bayesian, Neural Networks, HMM, ...); Logic Based (Description Logic); Natural Language/Grammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methods: Cluster Analysis
Learning/AI and Collab. Filtering
Reference data: Concept-terms, Dictionary, Thesaurus; by topic/industry/subject/domain
Word or Phrase
Ontologies/Domain Models
KnowledgeBase: by Entities and Relationships
deeper understanding
HP 56
Open Directory Project (ODP): Classification/Taxonomy & Directory
HP 57
Ontology
Standardize meaning, description, and representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes: Attributes, Domain Rules, Functional Dependencies
HP 59
An Ontology
Example: Interrelated ontologies (diagram; node labels as extracted)
LAND USE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL
LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE
WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; shredding, magnetic separation, screening, washing
NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI
"causes" relationships link several of the natural disaster nodes
HP 61
Large Vocabularies TaxonomiesOntologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms
Confidential HP
Metadata-enabled Applications
HP 63
Metadata Usage: Impact on Search & Query processing
traditional queries based on keywords
attribute based queries
content-based queries
HP 64
Oingo.com
Oingo Ontology - ODP based (?): the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user “think” about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged, and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more “intelligent” through automatic analysis of every individual asset to generate a catalog containing:
• Context of the Content
• Semantic Metadata describing entities (e.g. Company, Industry, etc.) and
• Relationships (semantic associations) among all entities
Based on a “Semantic” or “domain” model describing how the user thinks about the subject matter, supported by a knowledgebase
“Normal” Content can only be “found” if the user enters a keyword that exists within it
Intelligent Content = Content + Metadata and Relationships
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more “intelligent” through automatic analysis of every individual content item to create:
Context of the Content
Semantic Metadata describing entities (e.g. Company, Industry, etc.) and
Relationships (semantic associations) among all entities
Based on a “Semantic” or “domain” model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g. Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …
HP 70
Metadata Usage Keyword Attribute and Content Based Access
The VisualHarness system at LSDIS/UGA
HP 71
Keyword Search vs. Attribute Search with Semantic Metadata
Virage search on “football touchdown”:
Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL httpcbssportsline
Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL httpcbssportsline
Metadata from Typical Cataloging of Football Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31, Pit 24
http://www.nfl.com
Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Qadry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
HP 72
Taalee's Semantic Search
Highly customizable, precise, and freshest A/V search
Context and Domain Specific Attributes
Uniform Metadata for Content from Multiple Sources
Can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do?
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY amp CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content
Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries amp Hot Topics
PERSONALIZATION
3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
4 Pink Floyd Collection
Personalized Queries
Set the Controls for the Heart of the Sun…
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
2 My Football Fantasy Team
Gators Spurrier ready for big game
Techs Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone…
Edwards explains reasons for leaving BYU more…
more…
more…
more…
1 Election 2000
2 Middle East Peace Conflict
3 Napster Controversy
Video Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Coles security
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II more…
HOT Topics
more…
more…
HP 80
Metadata Targeting
SemanticInteractive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
Realtime Feeds
Interests Preferences
Time-ShiftedContent Aggregator
Web sites and Pages
ContentDatabases Personalized
Content
Semantic EngineTM
Personalized Content
Content
Structured Hi-Quality
Semantic Metabase
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site
User's “Wireless Home page” shows the categories for his interests. There is an alert (new content) for his stock and sports categories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions listen set alert on story add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and the analyst name or company. The icons denote additional metadata, such as “Strong Buy” by the H&Q analyst
HP 87
iTV: Taalee's Extreme Personalization
Content Provider
(DBS DISH Wink AOL-TV)
Semantic EngineTM
Meta-DataTagged Content
ContentldquoProgramsrdquo
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-QualitySemantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in
This screen is customizable with an interactivity feature, using metadata such as whether there is a new Conference Call video on CSCO
Part of the screen can be automatically customized to show conference-call-specific information – including transcript, participation, etc., all of which are relevant metadata
The Conference Call itself can have embedded metadata to support personalization and interactivity
HP 89
Metadata in Enterprise Apps
Filter Search ConsolidatePersonalize ArchiveLicensing Syndication
Production SupportProduction SupportSony
Categorize
Catalog
Integrate
CollectionCollection ProcessingProcessing
NetworkContent
AffiliateFeeds
Public Sources Rich Data
Metabase
HP 90
t
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
“At least 60 people died in this needless fire,” senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers
• Greatly enhances newsroom productivity and time-to-market
HP 91
-- Breaking News --Gore Demands That Recount Restart
Gore Says Fla Cant Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) – Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(133) - 120600 - ABC
(233) - 120600 - CBS
(312) - 120600 - NNS
(032) - 120600 - CBS
(133) - 120600 - CBS
Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore
HP 92
Retrieve Scene Description Track
Enhanced Digital Cable
Video
MPEGDecoder
Node = AVO Object
Create Scene Description Tree
GREAT USER EXPERIENCE
Metadata's role in emerging iTV infrastructure
MPEG-2/4/7
MPEG Encoder
Scene Description Tree
License metadata decoder and semantic applications to
device makers
Channel salesthrough Video Server Vendors
Video App Servers and Broadcasters
Enhanced XML
Description
“Cisco Systems” Node
Taalee Semantic Engine
“Cisco Systems”
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta
Object Content Information (OCI)
Metadata-richValue-added Node
HP 93
Intelligent Metadata Creation
Extractor Agents: content which does contain the words the user asked for
+ Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for
+ Semantic Associations: content the user did not think to ask for, but which he needs to know
Metadata for Intelligent Content
Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the “right” keyword
For example: if a story is about “Roger Clemens” but does not contain the words “New York Yankees”, that story cannot and will not be found if the user searches for “New York Yankees” or “Yankees”
Understanding of the content is needed to create new metadata
Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g. COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely
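The Roger Clemens example can be sketched in a few lines of Python. This is only an illustration of the idea, not Taalee's implementation: the `KNOWLEDGE_BASE` dictionary, its relationship names (`plays_for`, `plays_sport`, `league`) and the `enrich` and `matches` helpers are all hypothetical.

```python
# Hypothetical mini knowledge base: entity -> facts about that entity.
KNOWLEDGE_BASE = {
    "Roger Clemens": {"type": "PERSON", "plays_sport": "Baseball",
                      "plays_for": "New York Yankees"},
    "New York Yankees": {"type": "TEAM", "league": "American League"},
}


def _add(metadata, field, value):
    """Append value to a metadata field, skipping duplicates."""
    values = metadata.setdefault(field, [])
    if value not in values:
        values.append(value)


def enrich(metadata):
    """Return a copy of the metadata with value-added fields from the knowledge base."""
    enriched = {field: list(values) for field, values in metadata.items()}
    for player in metadata.get("Players", []):
        facts = KNOWLEDGE_BASE.get(player, {})
        if "plays_sport" in facts:
            _add(enriched, "Sport", facts["plays_sport"])
        team = facts.get("plays_for")
        if team:
            _add(enriched, "Teams", team)
            league = KNOWLEDGE_BASE.get(team, {}).get("league")
            if league:
                _add(enriched, "League", league)
    return enriched


def matches(metadata, query):
    """Keyword-style lookup over every metadata value."""
    needle = query.lower()
    return any(needle in value.lower()
               for values in metadata.values() for value in values)


# Metadata extracted from a story that mentions only the player.
story = {"Players": ["Roger Clemens"]}
print(matches(story, "Yankees"))          # the raw story is not found
print(matches(enrich(story), "Yankees"))  # the enriched story is found
```

The point of the sketch is that the search itself stays a plain lookup; what changes is that the catalog entry now carries metadata the source text never mentioned.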
HP 96
Guided Demo for Value Added Metadata – Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled “Week 3 Top10 Anderson TD Run”) and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked “REAL”)
• View the source HTML page that has the original story and locate this story with the heading “Week 3 top 10 Anderson TD run”
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata – Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled “I want out”) & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked “REAL”)
• View the source HTML page that has the original story and locate this story with the heading “I want out”
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 – Snapshots (“Jamal Anderson”)
Search for ‘Jamal Anderson’ in ‘Football’
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 – Snapshots (“Gary Sheffield”)
Search for ‘Gary Sheffield’ in ‘Baseball’
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content – Value-Added Metadata
Rich Media Sports Asset, described by:
Posted Date: date of asset posting. Extracted automatically
Producer Name: name of the content provider that produced the asset
Player Names: names of players mentioned explicitly in the asset. Extracted automatically
Team Name: name of the team for which the player plays. Not mentioned explicitly in the asset; value-added using Taalee's semantic relationships
League Name: name of the league to which the player's team belongs. Not mentioned explicitly in the asset; value-added by Taalee's processing based on semantic associations
Sport: name of the sport
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships
The asset is richly and fully described in the many ways users choose to interact
Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context, or relationships of keywords
For example, a search engine may see that the phrase “Commerce One” occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
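One way to picture the Commerce One example is as a set of subject-relationship-object triples that a search can follow. The sketch below is an assumption-laden illustration, not the Semantic Content Model itself: the `TRIPLES` list, the `STORIES` catalog, the relationship names, and the choice to treat `competes_with` as symmetric are all hypothetical.

```python
# Illustrative (subject, relationship, object) triples; contents are made up
# to mirror the Commerce One example.
TRIPLES = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Commerce One", "participates_in",
     "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
    ("Ariba", "is_a", "COMPANY"),
]

# A tiny catalog of stories, each tagged with the companies it is about.
STORIES = [
    {"title": "Commerce One story", "companies": ["Commerce One"]},
    {"title": "Ariba story", "companies": ["Ariba"]},
    {"title": "Unrelated story", "companies": ["Some Other Co"]},
]


def related(entity, relationship):
    """Entities linked to `entity` by `relationship`, in either direction."""
    forward = [o for s, r, o in TRIPLES if s == entity and r == relationship]
    backward = [s for s, r, o in TRIPLES if o == entity and r == relationship]
    return forward + backward


def search(entity):
    """Return what was asked for plus stories about competitors."""
    asked_for = [s["title"] for s in STORIES if entity in s["companies"]]
    competitors = related(entity, "competes_with")
    competition = [s["title"] for s in STORIES
                   if any(c in s["companies"] for c in competitors)]
    return {"asked_for": asked_for, "competition": competition}


print(search("Commerce One"))
```

A keyword index alone would return only the first list; the second list exists because the relationship is stored as data that the application can traverse.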
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company ‘Commerce One’
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata-centric Content Management Architecture
Sources: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g. Reuters)
Components: Extractor Agents 1, 2, 3; Semantic Engine; World Model; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication
Output: XCM-compliant metadata (XML or other format)
Flow:
1 Cisco story from PW Source 1 passed on to add semantic associations
2 Consults KnowledgeBase for Cisco's competition
3 Returns result: Lucent is a competitor of Cisco
4 Lucent story from external feeds picked for publishing as “semantically related” to the Cisco story – passed on to Dashboard
Stories shown: Story on Cisco; Story on Lucent
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News: COMPANIES in Same or Related INDUSTRY
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS
Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY
Technology, Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for the future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF – one proposal (cf. Ora Lassila)
Structural modeling is obviously not enough
We need a “logic layer” on top of RDF; some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data
RDF + DL = “Frame System for WWW”
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
“A Web in which machine reasoning will be ubiquitous and devastatingly powerful.” [Berners-Lee]
“A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture.” [Berners-Lee]
“A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities.” [Berners-Lee]
A personal definition:
Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three layered Architecture Of Semantic Web
Logical Layer
Formal Semantics and Reasoning Support – OIL, DAML-O
Schema Layer
Definition of Vocabulary: RDF Schema
Data Layer
Simple data model and syntax for metadata - RDF
OIL ndash as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
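To make the herbivore definition concrete, the following Python sketch parses an equivalent RDF/XML fragment with the standard library and lists the class axioms it states. It is a reading aid under stated assumptions: the namespace URIs are guesses (in particular the `oil` namespace is a placeholder), the fragment is wrapped in an `rdf:RDF` root so that it parses on its own, and `class_axioms` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Namespace URIs are assumptions; the OIL one is a placeholder.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OIL = "http://example.org/oil#"

DOC = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}" xmlns:oil="{OIL}">
  <rdfs:Class rdf:ID="herbivore">
    <rdfs:subClassOf rdf:resource="#animal"/>
    <rdfs:subClassOf>
      <oil:NOT>
        <oil:hasOperand rdf:resource="#carnivore"/>
      </oil:NOT>
    </rdfs:subClassOf>
  </rdfs:Class>
</rdf:RDF>"""


def class_axioms(xml_text):
    """List (class, relation, other class) statements found in the document."""
    root = ET.fromstring(xml_text)
    axioms = []
    for cls in root.findall(f"{{{RDFS}}}Class"):
        name = cls.get(f"{{{RDF}}}ID")
        for sub in cls.findall(f"{{{RDFS}}}subClassOf"):
            # Plain RDF Schema: subclass of a named class.
            target = sub.get(f"{{{RDF}}}resource")
            if target is not None:
                axioms.append((name, "subClassOf", target.lstrip("#")))
            # OIL extension: subclass of the complement of a class.
            for operand in sub.findall(f"{{{OIL}}}NOT/{{{OIL}}}hasOperand"):
                negated = operand.get(f"{{{RDF}}}resource").lstrip("#")
                axioms.append((name, "subClassOf NOT", negated))
    return axioms


for axiom in class_axioms(DOC):
    print(axiom)
```

The first axiom is ordinary RDF Schema; the second is what the OIL layer adds, which is why a plain RDF Schema processor would not be able to interpret it.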
DAML and OIL ndash Evolving towards Semantic Web
OIL Mission
OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Test May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
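The rule above can be written as an executable predicate. The sketch below assumes details the slide does not give: `dateDifference` is taken to be days from test to earthquake, `distance` is taken to be great-circle distance in miles computed with the haversine formula, and events are plain dictionaries. The function names and the Earth-radius constant are choices made for this illustration.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, used by the haversine formula


def distance(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def date_difference(test_date, quake_date):
    """Days from the nuclear test to the earthquake (negative if the quake came first)."""
    return (quake_date - test_date).days


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """NuclearTest Causes Earthquake, using the thresholds from the rule."""
    days = date_difference(test["eventDate"], quake["eventDate"])
    miles = distance(test["latitude"], test["longitude"],
                     quake["latitude"], quake["longitude"])
    # The earthquake must follow the test, within the time and distance limits.
    return 0 <= days < max_days and miles < max_miles


test = {"eventDate": date(2000, 1, 1), "latitude": 0.0, "longitude": 0.0}
quake = {"eventDate": date(2000, 1, 15), "latitude": 0.0, "longitude": 10.0}
print(may_have_caused(test, quake))
```

The `0 <= days` condition encodes the requirement that the earthquake occurred after the test, which the formal rule leaves implicit in its argument order.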
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example…
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale per year, starting from 1900, find the average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
Knowledge Discovery - Example…
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake
Demo
ResourcesReferences
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
<?xml version="1.0"?>
<!DOCTYPE ice-payload SYSTEM "http://ice.dtd">
<ice-payload payload-id="ipl-80a56cfe"
    timestamp="05-15-2001T11:00:01" ice.version="1.0">
  <ice-response response-id="irp-20010515181600">
    <ice-item-group group-id="grp-8610">
      <ice-item item-id="4321"
          subscription-element="4321" name="Cartoon" filename="demo.gif" content-type="application/xml">
        <comic-strip title="Looney City" author="Amito Pateru" copyright="Taalee Makeups" pubdate="20010515">
          PdXIWZQ8IiPLhHrQcrjxAQ8VquFJS8vDC …(ASCII-encoded image)
        </comic-strip>
      </ice-item>
    </ice-item-group>
  </ice-response>
</ice-payload>
Content (domain-specific metadata)
HP 36
XCM (eXtended Content Management)
A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:
Content Development - Developing static content and managing the process of its subsequent approval, versioning, storage, and retrieval
Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle
Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability
Source: httpwwwvignettecomCDASite020971-1-30-1458-1146-174300html
HP 37
XCM
eXtended Content Management
Content Development/Management: Content Authoring, Digital Asset Management, Software Configuration Management, Document Process Management
Application Content Management: Metadata Management, Recombination, Personalization
Content Delivery: Edge Network Delivery, Streaming Media Delivery, Caching
Source: http://www.vignette.com
HP 38
Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain
FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc/
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
…
UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Adress Id: http://gisdasc.kgs.ukans.edu/dasc/
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
…
Kansas State
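The FGDC/UDK comparison amounts to a tag-name mapping between two models that describe the same data. A minimal Python sketch of such a mapping follows; the pairing of tags is read off the slide, while the `FGDC_TO_UDK` table, the `translate` and `invert` helpers, and the policy of passing unmapped tags through unchanged are assumptions made for illustration.

```python
# Tag correspondences as read from the slide (FGDC tag -> UDK tag).
FGDC_TO_UDK = {
    "Theme keywords": "Search terms",
    "Title": "Topic",
    "Online linkage": "Adress Id",
    "Direct Spatial Reference Method": "Measuring Techniques",
    "Horizontal Coordinate System Definition": "Co-ordinate System",
}


def translate(record, tag_map):
    """Rename the tags of a metadata record; unmapped tags pass through unchanged."""
    return {tag_map.get(tag, tag): value for tag, value in record.items()}


def invert(tag_map):
    """Build the reverse mapping (assumes the mapping is one-to-one)."""
    return {target: source for source, target in tag_map.items()}


fgdc_record = {
    "Theme keywords": ["digital line graph", "hydrography", "transportation"],
    "Title": "Dakota Aquifer",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}

udk_record = translate(fgdc_record, FGDC_TO_UDK)
print(udk_record["Topic"])
print(translate(udk_record, invert(FGDC_TO_UDK)) == fgdc_record)
```

A flat renaming like this only works while the two models agree on what each value means; once they differ in granularity or semantics, the mapping has to go through a shared domain model rather than a lookup table.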
HP 39
Different views of Metadata
Domain Independent: Specifications (RDF), Frameworks/Infrastructures (XCM)
Metadata
Application Specific: ICE
Media Specific: MPEG7, VoiceXML
Domain Specific: NewsML, FGDC/UDK
HP 40
Creating and Serving Metadata to Power the Life-cycle of Content
Taalee Infrastructure Services; Taalee Content Applications
Where is the content? Whose is it? → Produce/Aggregate
Catalog/Index
What other content is it related to? → Integrate, Syndicate
What is the right content for this user? → Personalize
What is the best way to monetize this interaction? → Interactive Marketing
Broadcast, Wireline, Wireless, Interactive TV
Taalee Semantic MetaBase
HP 41
Taalee's Intelligent Content Process
HP 42
Metadata Creation and Semanticization
• Automatic Content Classification/Categorization
• Metadata Creation/Extraction: types of metadata created
Semantic Engine and WorldModel are trademarks of Taalee Inc. Metadata Extraction is a patented technology of Taalee Inc.
HP 43
Forms/Types/Ingest of Content
Sources: Web Sites, Content Feeds, and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+Media), Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)
HP 44
Content Handling/Ingest
Infrastructure/Exchange: Feed Handlers, Crawlers, Screen Scrapers, Bots, Software Agents
Centralized, Distributed, Mobile/Migratory
HP 45
Information Extraction for Metadata Creation
Global/Enterprise/Web Repositories
METADATA EXTRACTORS
Digital Maps
NexisUPIAP
Documents
Digital Audios
Data Stores
Digital Videos
Digital Images
HP 46
Extracting a Text Document: Syntactic approach
INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT
NATIONAL PREPAREDNESS LEVEL II
CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.
SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.
CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.
LAYOUT
Date => day month int ',' int
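The layout rule for dates can be approximated with a regular expression. This is a sketch of the syntactic approach under stated assumptions, not the extractor used in the original system: the pattern, the group names, and the decision to make the commas optional are all choices made here.

```python
import re

DAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"
MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")

# Date => day month int ',' int  (commas optional so stripped text still matches)
DATE_RULE = re.compile(
    rf"\b(?P<day>{DAYS}),?\s+(?P<month>{MONTHS})\s+"
    rf"(?P<day_of_month>\d{{1,2}}),?\s+(?P<year>\d{{4}})\b"
)


def extract_date(text):
    """Return the date fields matched by the layout rule, or None if no date is found."""
    match = DATE_RULE.search(text)
    return match.groupdict() if match else None


header = "INCIDENT MANAGEMENT SITUATION REPORT Friday, August 1, 1997 - 0530 MDT"
print(extract_date(header))
```

Rules like this are cheap and precise for documents with a fixed layout, but they recognize form only: the pattern finds the date without knowing what the report is about, which is the gap the semantic techniques later in the deck are meant to fill.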
Traditional Text Categorization
[Figure: Customer Article Feed (Article 4715) → Classify/Place in a taxonomy, using Statistical/AI Techniques and a Customer Training Set → Routing/Distribution]
Standard Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Taalee's Categorization & Automatic Metadata Creation
Automated Content Enrichment (ACE)
Taalee Enterprise Customization Suite
[Figure: Article Feed (Article 4715) → Classify/Place in a taxonomy, using Knowledge-base & Statistical/AI Techniques with a Taalee Training Set and a Customer Training Set → Metadata Catalog, Content Manager → Routing/Distribution, Precise syndication/filtering, Map to another taxonomy]
Article 4715 Metadata
Standard metadata:
Feed Source: iSyndicate
Posted Date: 11/20/2000
Semantic metadata:
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News
Classification of Article 4715
FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
NYSE: Member Companies, Market News, IPOs
HP 49
Automatic Categorization & Metadata Tagging (unstructured text: transcript of A/V)
ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION AS OF TONIGHT THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49 IN WASHINGTON STATE THE SENATE RACE REMAINS TOO CLOSE TO CALL IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT THE SENATE WILL BE EVENLY DIVIDED IN MISSOURI REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN WHO DIED IN A CRASH THREE WEEKS AGO GOVERNOR CARNAHANS WIFE IS EXPECTED TO TAKE HIS PLACE IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT HILLARY CLINTON WON THE NEW YORK SENATE SEAT SHE IS THE FIRST FIRST LADY TO RUN MUCH LESS WIN
Video Segmentwith Associated Text
Segment Description
SemanticMetadata
AutoCategorization
HP 50
Automatic Categorization & Metadata Tagging (Web page)
Video with Editorialized Text on the Web
Auto Categorization
Semantic Metadata
HP 51
Automatic Categorization & Metadata Tagging (Feed)
Text from Bloomberg
Auto Categorization
Semantic Metadata
HP 52
Taalee Extraction and Knowledgebase Enhancement
Extraction Agent
Web Page Enhanced Metadata Asset
HP 53
Basis for Semantics
A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary
B. Facts with Relationships: Taxonomy (Categories), Ontology/Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event), Knowledge Base
HP 54
Basis for Semantics
C. Reasoning/Inference
Statistical (Information Retrieval)
Statistical Learning/AI (Bayesian, Neural Networks, HMM, …)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methods: Cluster Analysis
Learning/AI and Collaborative Filtering
Reference data: Concept-terms, Dictionary, Thesaurus (word or phrase); by topic, industry, subject, domain
Ontologies/Domain Models
KnowledgeBase: by Entities and Relationships
[Arrow across the alternatives: deeper understanding]
HP 56
Open Directory Project (ODP): Classification/Taxonomy & Directory
HP 57
Ontology
Standardize meaning, description, and representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes:
Attributes
Domain Rules
Functional Dependencies
HP 59
An Ontology
Example: Interrelated ontologies
[Diagram of three interrelated ontologies; node labels as extracted, grouping inferred from label order]
LANDUSE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL, LAND (SITE), CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE
WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; processes: shredding, magnetic separation, screening, washing
NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI; four "causes" edges link disaster types
HP 61
Large Vocabularies, Taxonomies, Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.
Metadata-enabled Applications
HP 63
Metadata Usage: Impact on Search & Query Processing
Traditional queries based on keywords
Attribute-based queries
Content-based queries
HP 64
Oingo.com
Oingo Ontology - ODP based; the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning, used to state intent; the basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged, and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:
• Context of the Content
• Semantic Metadata describing entities (e.g., Company, Industry, etc.) and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase.
"Normal" Content can only be "found" if the user enters a keyword that exists within it.
[Diagram: "Normal" Content + related metadata and relationships = Intelligent Content]
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions.
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:
Context of the Content
Semantic Metadata describing entities (e.g., Company, Industry, etc.) and
Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase.
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search.
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …
HP 70
Metadata Usage Keyword Attribute and Content Based Access
The VisualHarness system at LSDIS/UGA
HP 71
Keyword Search vs Attribute Search with Semantic Metadata
Virage search on "football touchdown":
Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL httpcbssportsline
Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL httpcbssportsline
Metadata from Typical Cataloging of Football Assets
Taalee Metadata on Football Assets (Rich Media Reference Page)
Baltimore 31, Pit 24
http://www.nfl.com
Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter.
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Qadry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
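The contrast on this slide can be sketched in a few lines of Python. This is a minimal illustration, not Taalee's or Virage's implementation: the Asset class, both search functions, and the metadata field names are assumptions made for the example, and the asset texts are shortened from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    title: str
    description: str
    # Hypothetical semantic metadata: field name -> value or list of values.
    metadata: dict = field(default_factory=dict)


def keyword_search(assets, *words):
    """Syntactic search: every word must occur in the title or description."""
    hits = []
    for asset in assets:
        text = f"{asset.title} {asset.description}".lower()
        if all(word.lower() in text for word in words):
            hits.append(asset)
    return hits


def attribute_search(assets, **criteria):
    """Attribute search: every field=value pair must match the metadata."""
    hits = []
    for asset in assets:
        for name, wanted in criteria.items():
            value = asset.metadata.get(name, [])
            values = value if isinstance(value, list) else [value]
            if wanted not in values:
                break  # one criterion failed, skip this asset
        else:
            hits.append(asset)
    return hits


ASSETS = [
    Asset("Brian Griese Interview Part Four",
          "Brian Griese talks about the first touchdown he ever threw"),
    Asset("Jimmy Smith Interview Part Seven",
          "Jimmy Smith explains his philosophy on showboating"),
    Asset("Baltimore 31, Pit 24",
          "Ismail and Banks hook up for their third long touchdown",
          {"league": "Professional", "teams": ["Ravens", "Steelers"],
           "players": ["Qadry Ismail", "Tony Banks"], "event": "Touchdown"}),
]

if __name__ == "__main__":
    print([a.title for a in keyword_search(ASSETS, "touchdown")])
    print([a.title for a in
           attribute_search(ASSETS, event="Touchdown", teams="Ravens")])
```

The keyword query matches any text that happens to mention the word, including an interview, while the attribute query returns only the asset whose metadata records a touchdown event involving the Ravens.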
HP 72
Taalee's Semantic Search
Highly customizable, precise, and freshest A/V search
Context and Domain Specific Attributes
Uniform Metadata for Content from Multiple Sources; can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY & CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content
Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you.
Please enter such semantic keywords below
Change Context
Personalized Queries & Hot Topics
PERSONALIZATION
Personalized Queries
1. My Stock Portfolio: Microsoft suffers serious hack attack; Cisco Systems Inc; Analyst Safa Rashtchy on Yahoo; PeopleSoft Inc; AT&T Corp; more…
2. My Football Fantasy Team: Gators' Spurrier ready for big game; Tech's Vick looks to become complete QB; Bucs excited about Hamilton; Jasper Sanks rumbles into the end zone…; Edwards explains reasons for leaving BYU; more…
3. Julia Roberts Collection: Movie Trailer Notting Hill; Trailer - Runaway Bride; Patrick; Movie Trailer Stepmom; Conspiracy Theory; more…
4. Pink Floyd Collection: Set the Controls for the Heart of the Sun…; Wish You Were Here; Round And Around; Keep Talking; The Post War Dream; more…
HOT Topics
1. Election 2000: Video Explaining the electoral map; Race for White House hots up; Seniors Give Gore Florida Edge; more…
2. Middle East Peace Conflict: More die as Israel steps up security; Israel braces for suicide bombs; Pentagon probes Cole's security; more…
3. Napster Controversy: The Brain Behind Napster; Napster Lawsuit; Creative Nomad II; more…
HP 80
Metadata Targeting
Semantic/Interactive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web: Extreme Personalization
[Diagram labels: Realtime Feeds; Web Sites and Pages; Content Databases; Time-Shifted Content Aggregator; Content; Semantic Engine(TM); Structured Hi-Quality Semantic Metabase; Interests, Preferences; Personalized Content]
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
[Screen: menu with MyStocks, News, Sports, Music, MyMedia]
The user has already completed Web-based registration and personalization at Voquette's Enterprise Customer site.
The user's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories.
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
[Screen: menu with MyStocks, News, Sports, Music, MyMedia; My Stocks list: CSCO, NT, IBM, Market]
Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide).
Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu.
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
[Screen: menu with MyStocks, News, Sports, Music, MyMedia; My Stocks list: CSCO, NT, IBM, Market; CSCO submenu: Analyst Call, Conf Call, Earnings]
Different types of recent audio content about Cisco are available.
The user clicks to see a listing of Analyst Calls on Cisco (next slide).
Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
[Screen: menu with MyStocks, News, Sports, Music, MyMedia; My Stocks list: CSCO, NT, IBM, Market; CSCO submenu: Analyst Call, Conf Call, Earnings]
CSCO Analysis: 11/08 ON24 Payne; 11/07 ON24 H&Q; 11/06 CBS Langlesis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and the analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst.
HP 87
iTV: Taalee's Extreme Personalization
[Diagram labels: Content Provider (DBS, DISH, Wink, AOL-TV); Content "Programs"; Meta-Data Tagged Content; Semantic Engine(TM); Structured Hi-Quality Semantic Metabase; Immediate Interests; Preferences; Personalized Content Capsules; Redirects and Programming]
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.
This screen is customizable with interactivity features using metadata, such as whether there is a new Conference Call video on CSCO.
Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata.
The Conference Call itself can have embedded metadata to support personalization and interactivity.
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support
Sony
Categorize
Catalog
Integrate
Collection, Processing
[Source labels: Network Content; Affiliate Feeds; Public Sources; Rich Data]
Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were
-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
Sixty Die In Nigeria Blast
Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
[Clip list: duration, date, source]
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit and one similar to it from Martin County demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore
HP 92
Metadata's role in emerging iTV infrastructure
[Diagram labels: Video; MPEG-2/4/7 MPEG Encoder; Create Scene Description Tree; Scene Description Tree; Node = AVO Object; Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; Taalee Semantic Engine; Enhanced XML Description; "Cisco Systems" Node; Object Content Information (OCI); Metadata-rich Value-added Node; GREAT USER EXPERIENCE]
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers, and Broadcasters
Example node metadata: Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: Jon Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta
HP 93
Intelligent Metadata Creation
Metadata for Intelligent Content
Usage
[Diagram: three parts joined by "+"]
Extractor Agents: content which does contain the words the user asked for
Value-added Metadata: content which does not contain the words the user asked for but is about what he asked for
Semantic Associations: content the user did not think to ask for but which he needs to know
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".
Understanding of the content is needed to create new metadata.
Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely.
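A rough sketch of this kind of enrichment follows. It is an illustration under stated assumptions, not Taalee's patented method: the FACTS triples stand in for a knowledge base, and the relation names, the RELATION_TO_FIELD mapping, and the enrich function are all hypothetical.

```python
# Hypothetical knowledge base of (subject, relation, object) facts.
FACTS = [
    ("Roger Clemens", "plays_sport", "Baseball"),
    ("Roger Clemens", "plays_for", "New York Yankees"),
    ("New York Yankees", "based_in", "New York"),
]

# Which relations become which metadata fields (an assumption for this sketch).
RELATION_TO_FIELD = {"plays_sport": "sport", "plays_for": "team"}


def enrich(metadata):
    """Return a copy of the metadata with value-added fields derived from FACTS.

    Only entities already extracted into the "players" field are looked up;
    the original metadata dict is left unchanged.
    """
    enriched = {name: list(values) for name, values in metadata.items()}
    for player in metadata.get("players", []):
        for subject, relation, obj in FACTS:
            target = RELATION_TO_FIELD.get(relation)
            if subject == player and target is not None:
                values = enriched.setdefault(target, [])
                if obj not in values:
                    values.append(obj)
    return enriched


if __name__ == "__main__":
    extracted = {"players": ["Roger Clemens"]}  # found in the story text
    print(enrich(extracted))
```

With the team added as metadata, a search for "New York Yankees" can reach the Roger Clemens story even though the story text never mentions the team.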
HP 96
Guided Demo for Value Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata.
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search.
HP 99
Example 2 - Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata.
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search.
HP 100
Intelligent Content - Value-Added Metadata
[Diagram: a Rich Media Sports Asset surrounded by its metadata fields]
Posted Date: date of asset posting; extracted automatically
Producer Name: name of content provider that produced the asset
Player Names: names of players mentioned explicitly in the asset; extracted automatically
Team Name: name of team for which the player plays; not mentioned explicitly in the asset; value-added using Taalee's semantic relationships
League Name: name of league to which the player's team belongs; not mentioned explicitly in the asset; value-added by Taalee's processing based on semantic associations
Sport: name of sport
Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.
The asset is richly and fully described in the many ways the users choose to interact.
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context, or relationships of keywords
For example, a search engine may see that the phrase "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
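One way to picture the lookup behind this example is a symmetric "competes with" relation consulted at query time. The sketch below is an assumption-laden illustration: the COMPETES_WITH set, the function names, and the placeholder headlines are invented for the example and do not describe the actual directory.

```python
# Hypothetical symmetric "competes with" relation (unordered company pairs).
COMPETES_WITH = {
    frozenset({"Commerce One", "Ariba"}),
}


def competitors(company):
    """Return the sorted list of companies that compete with `company`."""
    rivals = set()
    for pair in COMPETES_WITH:
        if company in pair:
            rivals |= pair - {company}
    return sorted(rivals)


def related_news(company, stories):
    """Return headlines about competitors of `company`.

    `stories` is a list of dicts with a "headline" string and a "companies"
    list; stories about the company itself are not repeated here.
    """
    rivals = set(competitors(company))
    return [s["headline"] for s in stories if rivals & set(s["companies"])]


# Placeholder stories; the headlines are not real news items.
STORIES = [
    {"headline": "Story A about Commerce One", "companies": ["Commerce One"]},
    {"headline": "Story B about Ariba", "companies": ["Ariba"]},
    {"headline": "Story C about an unrelated firm", "companies": ["Other Co"]},
]

if __name__ == "__main__":
    print(competitors("Commerce One"))
    print(related_news("Commerce One", STORIES))
```

Storing each competing pair once as an unordered set keeps the relation symmetric, so a search for Ariba surfaces Commerce One news in the same way.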
HP 104
Metadata-centric Content Management Architecture
[Diagram labels: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters); Extractor Agent 1; Extractor Agent 2; Extractor Agent 3; Semantic Engine; World Model; Taalee Metabase; XCM-compliant metadata, XML or other format; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication; Story on Cisco; Story on Lucent]
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults KnowledgeBase for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story; passed on to Dashboard
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
[Diagram: associations around a COMPANY]
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY
Technology, Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for the future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility.
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data.
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture" [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]
A personal definition: Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods.
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-layered Architecture of Semantic Web
Logical Layer: Formal Semantics and Reasoning Support - OIL, DAML-O
Schema Layer: Definition of Vocabulary - RDF Schema
Data Layer: Simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL - Evolving towards Semantic Web
OIL Mission
OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Test May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.
NuclearTest Causes Earthquake <=
    dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30
    AND distance(NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude) < 10000
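The rule above can be sketched as an executable predicate. This is an illustration only, not the InfoQuilt implementation: the dict-based event records, the function names, the use of the haversine great-circle formula for distance(), the reading of the 10000 threshold as miles (a later slide states the radius in miles), and the requirement that the earthquake not precede the test are all assumptions.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, used by the haversine formula


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """Evaluate the 'NuclearTest Causes Earthquake' rule for one pair of events.

    Each event is a dict with "event_date", "latitude", and "longitude".
    """
    days = (quake["event_date"] - test["event_date"]).days
    if not 0 <= days < max_days:
        return False  # quake came before the test, or too long after it
    return distance_miles(test["latitude"], test["longitude"],
                          quake["latitude"], quake["longitude"]) < max_miles


if __name__ == "__main__":
    # Hypothetical events, not real observations.
    test = {"event_date": date(2000, 1, 1), "latitude": 10.0, "longitude": 20.0}
    quake = {"event_date": date(2000, 1, 11), "latitude": 12.0, "longitude": 22.0}
    print(may_have_caused(test, quake))
```

Because no two points on Earth are more than about 12,400 miles apart along a great circle, a 10,000-mile threshold excludes only pairs on nearly opposite sides of the globe; the time window does most of the filtering.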
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900.
Increase in number of earthquakes since 1945
Knowledge Discovery - Example…
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, find the average number of earthquakes per year, starting from 1900.
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7.
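The grouping query on this slide can be approximated as follows. This is a sketch under assumptions: the (year, magnitude) tuple format, the function names, and the half-open bin boundaries (a magnitude of exactly 9 falls in the last bin) are choices made for the example, and the sample data is hypothetical.

```python
from collections import Counter

# (lower bound inclusive, upper bound exclusive, label), following the slide.
MAGNITUDE_BINS = [
    (5.8, 6.0, "5.8-6"),
    (6.0, 7.0, "6-7"),
    (7.0, 8.0, "7-8"),
    (8.0, 9.0, "8-9"),
    (9.0, float("inf"), ">9"),
]


def magnitude_bin(magnitude):
    """Return the label of the bin containing `magnitude`, or None if below 5.8."""
    for low, high, label in MAGNITUDE_BINS:
        if low <= magnitude < high:
            return label
    return None


def counts_per_year(quakes):
    """Count earthquakes per (year, magnitude bin); `quakes` holds (year, magnitude)."""
    counts = Counter()
    for year, magnitude in quakes:
        label = magnitude_bin(magnitude)
        if label is not None:
            counts[(year, label)] += 1
    return counts


def average_per_year(quakes, first_year, last_year):
    """Average yearly count for each bin over the inclusive year range."""
    n_years = last_year - first_year + 1
    totals = Counter()
    for (year, label), n in counts_per_year(quakes).items():
        if first_year <= year <= last_year:
            totals[label] += n
    return {label: totals[label] / n_years for _, _, label in MAGNITUDE_BINS}


if __name__ == "__main__":
    sample = [(1900, 5.9), (1900, 6.5), (1901, 6.1), (1901, 7.2), (1901, 5.0)]
    print(average_per_year(sample, 1900, 1901))
```

Comparing these per-bin averages before and after a cutoff year is the kind of check the slide's conclusion rests on.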
Knowledge Discovery - Example…
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10,000 miles from the epicenter of the earthquake.
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/9910-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 36
XCM (eXtended Content Management)
A framework that allows customers to classify content management offerings according to the business problems they address. The segments of XCM are:
Content Development - Developing static content and managing the process of its subsequent approval, versioning, storage, and retrieval
Application Content Management (Vignette) - Deploying content dynamically to a Web site and managing that content throughout its online lifecycle
Content Delivery - Delivering content through multiple channels to minimize customer waiting time and improve Web site stability and scalability
Source: http://www.vignette.com/CDA/Site/0,2097,1-1-30-1458-1146-1743,00.html
HP 37
XCM: eXtended Content Management
[Diagram, segment labels: Content Development/Management; Application Content Management; Content Delivery]
[Diagram, component labels: Content Authoring; Digital Asset Management; Software Configuration Management; Document Process Management; Metadata Management; Recombination; Personalization; Edge Network Delivery; Streaming Media Delivery; Caching]
Source: http://www.vignette.com
HP 38
Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain
FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc/
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
…
UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Adress Id: http://gisdasc.kgs.ukans.edu/dasc/
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
…
[Map label: Kansas State]
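The tag pairs on this slide amount to a crosswalk between the two models. A minimal sketch follows, assuming a flat dict representation of a record; the FGDC_TO_UDK table copies the five tag pairs shown above, while the translate function and its pass-through behaviour for unmapped tags are assumptions for illustration.

```python
# Crosswalk built from the tag pairs shown on the slide (FGDC name -> UDK name).
FGDC_TO_UDK = {
    "Theme keywords": "Search terms",
    "Title": "Topic",
    "Online linkage": "Adress Id",
    "Direct Spatial Reference Method": "Measuring Techniques",
    "Horizontal Coordinate System Definition": "Co-ordinate System",
}
UDK_TO_FGDC = {udk: fgdc for fgdc, udk in FGDC_TO_UDK.items()}


def translate(record, crosswalk):
    """Rename the tags of a flat metadata record using the given crosswalk.

    Tags without a mapping are kept unchanged (a design choice for this sketch).
    """
    return {crosswalk.get(tag, tag): value for tag, value in record.items()}


FGDC_RECORD = {
    "Theme keywords": ["digital line graph", "hydrography", "transportation"],
    "Title": "Dakota Aquifer",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}

if __name__ == "__main__":
    udk_record = translate(FGDC_RECORD, FGDC_TO_UDK)
    print(udk_record)
    print(translate(udk_record, UDK_TO_FGDC) == FGDC_RECORD)
```

A one-to-one table like this only works when both models describe the same data at the same granularity, as in the slide's example; real crosswalks often need value conversions as well as tag renaming.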
HP 39
Different views of Metadata
Domain Independent Specifications (RDF)
Frameworks/Infrastructures (XCM)
Application Specific: ICE
Media Specific: MPEG7, VoiceXML
Domain Specific: NewsML, FGDC/UDK
HP 40
Creating and Serving Metadata to Power the Life-cycle of Content
Taalee Infrastructure Services; Taalee Content Applications
Where is the content? Whose is it? (Produce/Aggregate)
Catalog/Index
What other content is it related to? (Integrate, Syndicate)
What is the right content for this user? (Personalize)
What is the best way to monetize this interaction? (Interactive Marketing)
Broadcast, Wireline, Wireless, Interactive TV
Taalee Semantic MetaBase
HP 41
Taalee's Intelligent Content Process
HP 42
Metadata Creation and Semanticization
• Automatic Content Classification/Categorization
• Metadata Creation/Extraction; types of metadata created
Semantic Engine and WorldModel are trademarks of Taalee Inc. Metadata Extraction is a patented technology of Taalee Inc.
HP 43
Forms/Types/Ingest of Content
Sources: Web Sites, Content Feeds, and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+ Media); Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)
HP 44
Content Handling/Ingest
Infrastructure/Exchange: Feed Handlers, Crawlers, Screen Scrapers, Bots, Software Agents
Centralized, Distributed, Mobile/Migratory
HP 45
Information Extraction for Metadata Creation
Global/Enterprise/Web Repositories
METADATA EXTRACTORS
[Repository types: Digital Maps; Nexis, UPI, AP; Documents; Digital Audios; Data Stores; Digital Videos; Digital Images]
HP 46
Extracting a Text Document: Syntactic approach
INCIDENT MANAGEMENT SITUATION REPORT
Friday August 1, 1997 - 0530 MDT
NATIONAL PREPAREDNESS LEVEL II
CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.
SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr. The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.
CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.
LAYOUT
Date => day month int ',' int
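The layout rule for the date line can be read as a small grammar and tried out with a regular expression. The following is a minimal Python sketch, assuming the rule means a weekday name, a month name, a day number, a comma, and a four-digit year. The names `DATE_RULE` and `extract_date` are illustrative and do not come from the original slides.

```python
import re

DAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"
MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")

# Assumed reading of the slide's rule: Date => day month int ',' int
DATE_RULE = re.compile(
    rf"(?P<day>{DAYS})\s+(?P<month>{MONTHS})\s+"
    rf"(?P<dom>\d{{1,2}}),\s*(?P<year>\d{{4}})"
)


def extract_date(text):
    """Return the first date matching the layout rule, or None if absent."""
    match = DATE_RULE.search(text)
    if match is None:
        return None
    return {
        "day": match["day"],
        "month": match["month"],
        "day_of_month": int(match["dom"]),
        "year": int(match["year"]),
    }


report_header = "INCIDENT MANAGEMENT SITUATION REPORT Friday August 1, 1997 - 0530 MDT"
print(extract_date(report_header))
```

A purely syntactic rule like this finds the date only when the report keeps the expected layout, which is the limitation that motivates the knowledge-based approaches on the following slides.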
Traditional Text Categorization
Statistical/AI Techniques
Classify/Place in a taxonomy
Customer Training Set
Customer Article Feed: Article 4715
Classification of Article 4715
Routing/Distribution
Standard Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Taalee's Categorization & Automatic Metadata Creation
Automated Content Enrichment (ACE)
Taalee Enterprise Customization Suite
Knowledge-base & Statistical/AI Techniques
Classify/Place in a taxonomy
Taalee Training Set
Customer Training Set
Article Feed: Article 4715
Classification of Article 4715
Routing/Distribution
Map to another taxonomy
Metadata Catalog
Content Manager
Precise syndication/filtering
Article 4715 Metadata
Standard metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Semantic metadata
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News
FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
NYSE: Member Companies
Market News: IPOs
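The difference between standard and semantic metadata for Article 4715 can be illustrated with a small lookup against a knowledge base. This is only a sketch under stated assumptions: the knowledge base contents, the sample article text, and the function name `enrich` are hypothetical, and the ticker and exchange values are taken from the slide rather than verified. It does not represent Taalee's actual implementation.

```python
# Hypothetical knowledge base: company name -> facts known about it.
KNOWLEDGE_BASE = {
    "France Telecom": {"ticker": "FTE", "exchange": "NYSE"},
    "Equant": {"ticker": "ENT", "exchange": "NYSE"},
}


def enrich(standard_metadata, article_text, kb=KNOWLEDGE_BASE):
    """Add semantic metadata for every known company named in the text."""
    found = [name for name in kb if name in article_text]
    semantic = {
        "company_name": found,
        "ticker_symbol": [kb[name]["ticker"] for name in found],
        "exchange": sorted({kb[name]["exchange"] for name in found}),
    }
    # Standard metadata is kept as-is; semantic fields are added alongside it.
    return {**standard_metadata, **semantic}


standard = {"feed_source": "iSyndicate", "posted_date": "11/20/2000"}
text = "France Telecom said it will acquire Equant."  # made-up sample text
print(enrich(standard, text))
```

The standard fields come from the feed itself, while the ticker and exchange never appear in the sample text and are supplied by the knowledge base.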
HP 49
Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.
Video Segment with Associated Text
Segment Description
Semantic Metadata
Auto Categorization
HP 50
Automatic Categorization & Metadata Tagging (Web page)
Video with Editorialized Text on the Web
Auto Categorization
Semantic Metadata
HP 51
Automatic Categorization & Metadata Tagging (Feed)
Text from Bloomberg
Auto Categorization
Semantic Metadata
HP 52
Taalee Extraction and Knowledgebase Enhancement
Extraction Agent
Web Page Enhanced Metadata Asset
HP 53
Basis for Semantics
A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary
B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)
Knowledge Base
HP 54
Basis for Semantics
C. Reasoning/Inference
(Statistical) (Information Retrieval)
Statistical Learning/AI (Bayesian, Neural Networks, HMM, ...)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methods: Cluster Analysis
Learning/AI and Collaborative Filtering
Reference data/Concept-terms/Dictionary/Thesaurus: by topic/industry/subject/domain; Word or Phrase
Ontologies/Domain Models
KnowledgeBase: by Entities and Relationships
(deeper understanding)
HP 56
Open Directory Project (ODP): Classification/Taxonomy & Directory
HP 57
Ontology
Standardize meaning, description, representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes: Attributes, Domain Rules, Functional Dependencies
HP 59
An Ontology
Example Interrelated ontologies
LANDUSE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL
LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE
WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE
shredding, magnetic separation, screening, washing
NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI
Relationship labeled in the diagram: causes (four occurrences)
HP 61
Large Vocabularies, Taxonomies/Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.
Confidential HP
Metadata-enabled Applications
HP 63
Metadata Usage: Impact on Search & Query processing
traditional queries based on keywords
attribute based queries
content-based queries
HP 64
Oingo.com
Oingo Ontology - ODP based (?): the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely-related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged, and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:
• Context of the Content
• Semantic Metadata describing entities (e.g., Company, Industry) and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
"Normal" Content can only be "found" if the user enters a keyword that exists within it
Intelligent Content=+
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:
Context of the Content
Semantic Metadata describing entities (e.g., Company, Industry) and
Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, ...
HP 70
Metadata Usage: Keyword, Attribute, and Content Based Access
The VisualHarness system at LSDIS/UGA
HP 71
Keyword Search vs Attribute Search with Semantic metadata
Metadata from Typical Cataloging of Football Assets
Virage: Search on "football touchdown"
Jimmy Smith Interview, Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline...
Brian Griese Interview, Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline...
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31, Pit 24
http://www.nfl.com
Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter.
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Qadry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
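The contrast between keyword search and attribute search can be sketched in a few lines of Python. This is an illustration only, assuming each asset carries a free-text description plus a metadata record. The asset data is a simplified subset of the fields shown on the slide, and the function names `keyword_search` and `attribute_search` are hypothetical.

```python
ASSETS = [
    {
        "title": "Baltimore 31, Pit 24",
        "description": "Ismail and Banks hook up for their third long touchdown, "
                       "a 76-yarder, to extend the Ravens' lead to 31-24.",
        "metadata": {
            "league": "Professional",
            "teams": ["Ravens", "Steelers"],
            "event": "Touchdown",
            "produced_by": "NFL.com",
        },
    },
    {
        "title": "Brian Griese Interview Part Four",
        "description": "Brian Griese talks about the first touchdown he ever threw.",
        "metadata": {},  # typical cataloging: no structured fields at all
    },
]


def keyword_search(assets, keyword):
    """Syntactic search: match the keyword against title and description text."""
    needle = keyword.lower()
    return [a["title"] for a in assets
            if needle in (a["title"] + " " + a["description"]).lower()]


def attribute_search(assets, **criteria):
    """Attribute search: every requested field must hold the wanted value."""
    def matches(meta):
        for field, wanted in criteria.items():
            value = meta.get(field)
            values = value if isinstance(value, list) else [value]
            if wanted not in values:
                return False
        return True
    return [a["title"] for a in assets if matches(a["metadata"])]


print(keyword_search(ASSETS, "touchdown"))    # both assets mention the word
print(keyword_search(ASSETS, "Steelers"))     # word never appears in the text
print(attribute_search(ASSETS, teams="Steelers", event="Touchdown"))
```

The keyword query returns the interview clip alongside the game highlight, while the attribute query isolates the highlight even though the word "Steelers" never occurs in its description.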
HP 72
Taalee's Semantic Search
Highly customizable, precise, and freshest A/V search
Context and Domain Specific Attributes; Uniform Metadata for Content from Multiple Sources; Can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do?
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY & CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content: Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries amp Hot Topics
PERSONALIZATION
3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
4 Pink Floyd Collection
Personalized Queries
Set the Controls for the Heart of the Sun...
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp.
2 My Football Fantasy Team
Gators' Spurrier ready for big game
Tech's Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone...
Edwards explains reasons for leaving BYU
more...
1 Election 2000
2 Middle East Peace Conflict
3 Napster Controversy
Video Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Cole's security
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II
more...
HOT Topics
HP 80
Metadata Targeting
Semantic/Interactive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web: Extreme Personalization
Realtime Feeds
Web sites and Pages
Content Databases
Time-Shifted Content Aggregator
Interests, Preferences
Semantic Engine(TM)
Structured Hi-Quality Semantic Metabase
Personalized Content
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site
User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and the knowledge of user's portfolio. Typically search can be done by typing one word or selecting from a dynamic personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst
HP 87
iTV: Taalee's Extreme Personalization
Content Provider (DBS, DISH, Wink, AOL-TV)
Content "Programs"
Meta-Data Tagged Content
Semantic Engine(TM)
Structured Hi-Quality Semantic Metabase
Immediate Interests, Preferences
Personalized Content Capsules
Redirects and Programming
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by personalization application to show only the stocks that user is interested in
This screen is customizable with interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO
Part of the screen can be automatically customized to show conference call specific information - including transcript, participation, etc. - all of which are relevant metadata
Conference Call itself can have embedded metadata to support personalization and interactivity
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support
Sony
Categorize
Catalog
Integrate
Collection, Processing
Network Content
Affiliate Feeds
Public Sources
Rich Data Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp. (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were
-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore
HP 92
Metadata's role in emerging iTV infrastructure
MPEG-2/4/7
MPEG Encoder
MPEG Decoder
Enhanced Digital Cable Video
Create Scene Description Tree
Retrieve Scene Description Track
Scene Description Tree
Node = AVO Object
Node: "Cisco Systems"
Enhanced XML Description
Taalee Semantic Engine
Object Content Information (OCI)
Metadata-rich Value-added Node
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: Jon Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta
GREAT USER EXPERIENCE
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers, and Broadcasters
HP 93
Metadata for Intelligent Content
Intelligent Metadata Creation
Extractor Agents: Content which does contain the words the user asked for
+ Value-added Metadata: Content which does not contain the words the user asked for, but is about what he asked for
+ Semantic Associations: Content the user did not think to ask for, but which he needs to know
Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".
Understanding of the content is needed to create new metadata.
Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely.
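The Roger Clemens example can be sketched as a lookup that copies known associations into an asset's metadata. This is a minimal illustration, assuming metadata fields hold lists of values. The `ASSOCIATIONS` table and the function name `add_value_added_metadata` are hypothetical stand-ins for a knowledge base, not Taalee's actual data model.

```python
# Hypothetical semantic associations: entity -> facts the knowledge base holds.
ASSOCIATIONS = {
    "Roger Clemens": {"sport": "Baseball", "team": "New York Yankees"},
}


def add_value_added_metadata(extracted):
    """Return a copy of the extracted metadata plus facts implied by its players."""
    enriched = {field: list(values) for field, values in extracted.items()}
    for player in extracted.get("players", []):
        for field, value in ASSOCIATIONS.get(player, {}).items():
            values = enriched.setdefault(field, [])
            if value not in values:
                values.append(value)
    return enriched


# Only the player name was found in the story text itself.
extracted = {"players": ["Roger Clemens"]}
enriched = add_value_added_metadata(extracted)
print(enriched)
print("New York Yankees" in enriched["team"])
```

After enrichment, a query on the team field can reach a story whose text never mentions the team.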
HP 96
Guided Demo for Value Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team = Atlanta Falcons or League = NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team = Los Angeles Dodgers or League = National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 - Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Producer Name: Name of content provider that produced the asset
Posted Date: Date of asset posting - Extracted automatically
Player Names: Name of players mentioned explicitly in the asset - Extracted automatically
Team Name: Name of team for which player plays - Not mentioned explicitly in asset - Value-added using Taalee's semantic relationships
League Name: Name of league to which the player's team belongs - Not mentioned explicitly in asset - Value-added by Taalee's processing based on semantic associations
Sport: Name of sport
Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.
The asset is richly, fully described in the many ways the users chose to interact.
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context, or relationships of keywords
For example, a search engine may see that the term "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata-centric Content Management Architecture
Internal Source 1 (Research)
Internal Source 2
External feeds/Web (e.g., Reuters)
Extractor Agent 1
Extractor Agent 2
Extractor Agent 3
Taalee Metabase
Semantic Engine
World Model
XCM-compliant metadata (XML or other format)
Semantic Application (ASP/Enterprise hosted)
Third-party Content Mgmt. and Syndication
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults KnowledgeBase for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story; passed on to Dashboard
Story on Cisco
Story on Lucent
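The four-step flow in this architecture (take a story, consult the knowledge base for competitors, pick stories about those competitors) can be sketched as follows. This is a simplified illustration under stated assumptions: the knowledge base holds a single "competes with" relation, each story lists the companies it is about, and the names `KNOWLEDGE_BASE` and `related_stories` are hypothetical.

```python
# Hypothetical knowledge base with one relation, using the slide's example fact.
KNOWLEDGE_BASE = {
    "competes_with": {"Cisco": ["Lucent"]},
}


def related_stories(story, candidate_stories, kb=KNOWLEDGE_BASE):
    """Pick candidate stories about competitors of the companies in `story`."""
    competitors = set()
    for company in story["companies"]:
        competitors.update(kb["competes_with"].get(company, []))
    return [s for s in candidate_stories if competitors & set(s["companies"])]


cisco_story = {"headline": "Story on Cisco", "companies": ["Cisco"]}
external_feed = [
    {"headline": "Story on Lucent", "companies": ["Lucent"]},
    {"headline": "Story on IBM", "companies": ["IBM"]},
]

for story in related_stories(cisco_story, external_feed):
    print(story["headline"])
```

The Lucent story is selected although the reader asked only about Cisco, which is the "semantically related" behavior the diagram describes.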
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
SEC, EPA Regulations: Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]
A personal definition: Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable Tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-layered Architecture of Semantic Web
Logical Layer: Formal Semantics and Reasoning Support - OIL, DAML-O
Schema Layer: Definition of Vocabulary - RDF Schema
Data Layer: Simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL - Evolving towards Semantic Web
OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Tests May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.
NuclearTest Causes Earthquake <= dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30 AND distance(NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude) < 10000
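The "Causes" rule can be written out as an executable predicate. The sketch below is one possible reading, not the InfoQuilt implementation. It assumes the date difference is in days and must be non-negative (the earthquake follows the test), and that the distance is a great-circle distance in miles computed with the haversine formula. The record layout and the function names are hypothetical.

```python
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3959.0  # mean Earth radius, an approximation


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(min(1.0, sqrt(a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """Apply the slide's rule: quake within max_days after test and nearby."""
    days_after = (quake["event_date"] - test["event_date"]).days
    nearby = distance_miles(test["latitude"], test["longitude"],
                            quake["latitude"], quake["longitude"]) < max_miles
    return 0 <= days_after < max_days and nearby


# Made-up records for illustration only.
test = {"event_date": date(1970, 1, 1), "latitude": 37.0, "longitude": -116.0}
quake = {"event_date": date(1970, 1, 10), "latitude": 36.0, "longitude": -117.0}
print(may_have_caused(test, quake))
```

Both thresholds are parameters, so the 30-day window and the 10000-mile radius from the slide can be tightened when exploring the data.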
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year, starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example...
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale per year, starting from 1900, find average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
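The grouping query above can be approximated with a short aggregation. This is a sketch on made-up records, assuming each earthquake is a (year, magnitude) pair and that the ranges are half-open intervals, so a magnitude of exactly 9.0 falls into the last bin as a simplification. The names `BINS`, `magnitude_range`, and `average_per_year` are illustrative.

```python
from collections import Counter

# Magnitude ranges from the slide, treated here as half-open [low, high).
BINS = [
    (5.8, 6.0, "5.8-6"),
    (6.0, 7.0, "6-7"),
    (7.0, 8.0, "7-8"),
    (8.0, 9.0, "8-9"),
    (9.0, float("inf"), ">9"),
]


def magnitude_range(magnitude):
    """Return the label of the range containing the magnitude, or None."""
    for low, high, label in BINS:
        if low <= magnitude < high:
            return label
    return None  # below 5.8: not part of the query


def average_per_year(quakes, first_year, last_year):
    """Average yearly earthquake count per magnitude range over the period."""
    counts = Counter()
    for year, magnitude in quakes:
        label = magnitude_range(magnitude)
        if label is not None and first_year <= year <= last_year:
            counts[label] += 1
    n_years = last_year - first_year + 1
    return {label: counts[label] / n_years for _, _, label in BINS}


sample = [(1900, 5.9), (1900, 6.5), (1901, 6.1), (1901, 7.2), (1901, 4.0)]
print(average_per_year(sample, 1900, 1901))
```

Running the same aggregation separately for the years before and after a chosen date would give the comparison the slide draws between magnitude groups.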
Knowledge Discovery - Example...
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted, and the test site was within a radius of 10000 miles of the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 37
XCM
eXtended Content Management
Content Development/Management
Content Delivery/Application Content Management
Content Authoring/Digital Asset Management
Software Configuration Management
Document Process Management
Metadata Management/Recombination/Personalization
Edge Network Delivery
Streaming Media Delivery/Caching
Source: http://www.vignette.com
HP 38
Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain
FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
...
UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Adress Id: http://gisdasc.kgs.ukans.edu/dasc
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
...
Kansas State
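Because the two models use different tag names for the same values, a tag-to-tag mapping is enough to translate the fields shown on this slide. The sketch below assumes a one-to-one correspondence between the five tag pairs listed. The mapping table and the function name `udk_to_fgdc` are illustrative, and real FGDC and UDK records contain many more elements than this.

```python
# Tag correspondence read off the slide (UDK tag -> FGDC tag).
UDK_TO_FGDC = {
    "Search terms": "Theme keywords",
    "Topic": "Title",
    "Adress Id": "Online linkage",
    "Measuring Techniques": "Direct Spatial Reference Method",
    "Co-ordinate System": "Horizontal Coordinate System Definition",
}


def udk_to_fgdc(record):
    """Rename known UDK tags to their FGDC equivalents; keep unknown tags."""
    return {UDK_TO_FGDC.get(tag, tag): value for tag, value in record.items()}


udk_record = {
    "Search terms": ["digital line graph", "hydrography", "transportation"],
    "Topic": "Dakota Aquifer",
    "Measuring Techniques": "Vector",
    "Co-ordinate System": "Universal Transverse Mercator",
}
print(udk_to_fgdc(udk_record))
```

A flat rename works only while the two models agree on values and granularity. Where they do not, a shared domain model or ontology is needed, which is the direction the rest of the workshop takes.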
HP 39
Different views of Metadata
Domain Independent Specifications (RDF)
Frameworks/Infrastructures (XCM)
Metadata
Application Specific: ICE
Media Specific: MPEG7, VoiceXML
Domain Specific: NewsML, FGDC/UDK
HP 40
Creating and Serving Metadata to Power the Life-cycle of Content
Taalee Infrastructure Services, Taalee Content Applications
Where is the content? Whose is it?
Produce/Aggregate
Catalog/Index
What other content is it related to?
Integrate, Syndicate
What is the right content for this user?
Personalize
What is the best way to monetize this interaction?
Interactive Marketing
Broadcast, Wireline, Wireless, Interactive TV
Taalee Semantic MetaBase
HP 41
Taalee's Intelligent Content Process
HP 42
Metadata Creation and Semanticization
bull Automatic Content ClassificationCategorization
bull Metadata CreationExtractionTypes of metadata created
Semantic Engine and WorldModel are trademarks of Taalee IncMetadata Extraction is a patented technology of Taalee Inc
HP 43
FormsTypesIngest of Content
Sources Web Sites Content Feeds and Private RepositoriesTypes Text Graphics Audio Video MultimediaForms Unstructured text Semi-structured text Structured text (+Media) Static or DynamicIngest Feed (push) Web (pull) RepositoryDatabase (usually pull)
HP 44
Content Handling/Ingest
Infrastructure/Exchange: feed handlers, crawlers, screen scrapers, bots, software agents
Centralized, Distributed, Mobile/Migratory
HP 45
Information Extraction for Metadata Creation
Global/Enterprise/Web Repositories
METADATA EXTRACTORS
Digital Maps
Nexis, UPI, AP
Documents
Digital Audios
Data Stores
Digital Videos
Digital Images
HP 46
Extracting a Text Document: Syntactic approach
INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT
NATIONAL PREPAREDNESS LEVEL II
CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.
SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.
CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.
LAYOUT
Date => day month int ',' int
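The layout rule for the report date can be read as a small grammar. As a rough sketch of the syntactic approach, the rule might be approximated with a regular expression; the pattern, the group names, and the `extract_date` helper are assumptions made for illustration, and the commas are optional because extracted text often loses them.

```python
import re

DAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"
MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")

# Date => day month int ',' int   (commas optional in this sketch)
DATE_RULE = re.compile(
    rf"(?P<day>{DAYS}),?\s+(?P<month>{MONTHS})\s+(?P<dom>\d{{1,2}}),?\s+(?P<year>\d{{4}})"
)


def extract_date(text):
    """Return the date fields of the first match, or None if the rule does not fire."""
    match = DATE_RULE.search(text)
    return match.groupdict() if match else None


header = "INCIDENT MANAGEMENT SITUATION REPORT Friday August 1, 1997 - 0530 MDT"
print(extract_date(header))
```

A rule like this depends entirely on the layout of the source; a report that writes the date differently would need its own rule, which is the usual limitation of purely syntactic extraction.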
Traditional Text Categorization
  Customer Article Feed (Article 4715)
  Statistical/AI Techniques, trained on Customer Training Set
  Classify/Place in a taxonomy
  Classification of Article 4715
  Routing/Distribution
  Standard Metadata
    Feed Source: iSyndicate
    Posted Date: 11/20/2000
Taalee's Categorization & Automatic Metadata Creation
  Automated Content Enrichment (ACE)
  Taalee Enterprise Customization Suite
  Article Feed (Article 4715)
  Knowledge-base & Statistical/AI Techniques, trained on Taalee Training Set and Customer Training Set
  Classify/Place in a taxonomy
  Classification of Article 4715
  Metadata Catalog, Content Manager
  Precise syndication/filtering
  Article 4715 Metadata
    Standard metadata
      Feed Source: iSyndicate
      Posted Date: 11/20/2000
    Semantic metadata
      Company Name: France Telecom, Equant
      Ticker Symbol: FTE, ENT
      Exchange: NYSE
      Topic: Company News
  Taxonomy placement
    FTE: Company Analysis/Conference Calls, Earnings/Stock Analysis
    ENT: Company Analysis/Conference Calls, Earnings/Stock Analysis
    NYSE: Member Companies, Market News/IPOs
  Routing/Distribution
  Map to another taxonomy
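The difference between standard and semantic metadata for Article 4715 can be sketched as a lookup against a knowledge base. This is not Taalee's actual method, which the slides do not describe; the `KNOWLEDGE_BASE` structure, the `enrich` function, the field names, and the sample article text are all assumptions for illustration, using only the values shown on the slide.

```python
# Hypothetical knowledge base keyed by company name; values echo the slide.
KNOWLEDGE_BASE = {
    "France Telecom": {"ticker": "FTE", "exchange": "NYSE"},
    "Equant": {"ticker": "ENT", "exchange": "NYSE"},
}


def enrich(article):
    """Return the article's standard metadata plus semantic fields from the knowledge base."""
    enriched = dict(article["standard_metadata"])
    # Naive entity spotting: a company counts as mentioned if its name occurs in the text.
    companies = [name for name in KNOWLEDGE_BASE if name in article["text"]]
    enriched["company_name"] = companies
    enriched["ticker_symbol"] = [KNOWLEDGE_BASE[c]["ticker"] for c in companies]
    enriched["exchange"] = sorted({KNOWLEDGE_BASE[c]["exchange"] for c in companies})
    return enriched


article = {
    "id": 4715,
    "standard_metadata": {"feed_source": "iSyndicate", "posted_date": "11/20/2000"},
    # Made-up placeholder text; the slide does not show the article body.
    "text": "Sample story mentioning France Telecom and Equant.",
}
print(enrich(article))
```

Substring matching stands in here for whatever entity recognition a real system would use; it is enough to show how ticker and exchange can be attached without appearing in the text.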
HP 49
Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.
Video Segment with Associated Text
Segment Description
Semantic Metadata
Auto Categorization
HP 50
Automatic Categorization & Metadata Tagging (Web page)
Video with Editorialized Text on the Web
Auto Categorization
Semantic Metadata
HP 51
Automatic Categorization & Metadata Tagging (Feed)
Text from Bloomberg
Auto Categorization
Semantic Metadata
HP 52
Taalee Extraction and Knowledgebase Enhancement
Extraction Agent
Web Page Enhanced Metadata Asset
HP 53
Basis for Semantics
A. Facts/Concepts/Terms/Entities: dictionary, thesaurus, reference data, vocabulary
B. Facts with Relationships: taxonomy (categories), ontology, domain modeling (e.g., Golf = golfer, tournament name, golf course, event)
Knowledge Base
HP 54
Basis for Semantics
C. Reasoning/Inference
  (Statistical) (Information Retrieval)
  Statistical Learning/AI (Bayesian, Neural Networks, HMM, ...)
  Logic Based (Description Logic)
  Natural Language/Grammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methods: Cluster Analysis
Learning/AI and Collaborative Filtering
Reference data, Concept-terms, Dictionary/Thesaurus: by topic, industry, subject, domain (word or phrase)
Ontologies/Domain Models, KnowledgeBase: by entities and relationships (deeper understanding)
HP 56
Open Directory Project (ODP): Classification/Taxonomy & Directory
HP 57
Ontology
Standardize meaning, description, and representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes:
  Attributes
  Domain Rules
  Functional Dependencies
HP 59
An Ontology
Example: Interrelated ontologies
LANDUSE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL
LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE
WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE
  Processes: shredding, magnetic separation, screening, washing
NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI
  Relationship between disaster types: causes (drawn four times in the diagram)
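An ontology of this kind combines a category hierarchy with named relationships between concepts. The following minimal sketch shows one possible in-memory form. The terms are taken from the slide, but the diagram's edges did not survive extraction, so the specific `causes` pairs below are illustrative assumptions, as are the `ancestors` and `downstream` helpers.

```python
# is-a edges use terms from the slide; only a few are listed for brevity.
IS_A = {
    "EARTHQUAKE": "NATURAL DISASTER",
    "LANDSLIDE": "NATURAL DISASTER",
    "TSUNAMI": "NATURAL DISASTER",
    "FIRE": "NATURAL DISASTER",
    "RESIDENTIAL": "LANDUSE",
    "AGRICULTURAL": "LANDUSE",
}

# Assumed endpoints for the "causes" relationship (not recoverable from the slide).
CAUSES = {
    "EARTHQUAKE": ["LANDSLIDE", "TSUNAMI"],
    "VOLCANO": ["FIRE"],
}


def ancestors(term):
    """Walk is-a edges upward and return every broader category."""
    found = []
    while term in IS_A:
        term = IS_A[term]
        found.append(term)
    return found


def downstream(term, seen=None):
    """Return every event reachable from term through 'causes' edges."""
    seen = set() if seen is None else seen
    for effect in CAUSES.get(term, []):
        if effect not in seen:
            seen.add(effect)
            downstream(effect, seen)
    return seen


print(ancestors("TSUNAMI"))
print(sorted(downstream("EARTHQUAKE")))
```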
HP 61
Large Vocabularies, Taxonomies, Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.
Confidential HP
Metadata-enabled Applications
HP 63
Metadata Usage: Impact on Search & Query processing
traditional queries based on keywords
attribute based queries
content-based queries
HP 64
Oingo.com
Oingo Ontology (ODP based) - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning, used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely-related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:
• Context of the Content
• Semantic Metadata describing entities (e.g., Company, Industry) and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
"Normal" Content can only be "found" if the user enters a keyword that exists within it
Intelligent Content = "Normal" Content + related metadata and relationships
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:
  Context of the Content
  Semantic Metadata describing entities (e.g., Company, Industry) and
  Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, ...
HP 70
Metadata Usage Keyword Attribute and Content Based Access
The VisualHarness system at LSDIS/UGA
HP 71
Keyword Search vs Attribute Search with Semantic metadata
Virage Search on football touchdown
Jimmy Smith Interview Part Seven
Jimmy Smith explains his philosophy on showboating. URL: httpcbssportsline
Brian Griese Interview Part Four
Brian Griese talks about the first touchdown he ever threw. URL: httpcbssportsline
Metadata from Typical Cataloging of Football Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31, Pit 24
http://www.nfl.com
Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter.
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Qadry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
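The contrast between keyword search and attribute search can be made concrete with a toy catalog. The sketch below is illustrative only: the two assets echo the slide, while the record layout and the `keyword_search` and `attribute_search` functions are assumed names, not an API of Virage or Taalee.

```python
assets = [
    {
        "title": "Brian Griese Interview Part Four",
        "description": "Brian Griese talks about the first touchdown he ever threw",
        "metadata": {},  # typical cataloging: free text only
    },
    {
        "title": "Baltimore 31 Pit 24",
        "description": "Ismail and Banks hook up for their third long touchdown",
        "metadata": {
            "league": "Professional",
            "teams": ["Ravens", "Steelers"],
            "event": "Touchdown",
        },
    },
]


def keyword_search(word):
    """Match any asset whose title or description contains the word."""
    word = word.lower()
    return [a["title"] for a in assets
            if word in (a["title"] + " " + a["description"]).lower()]


def attribute_search(**criteria):
    """Match only assets whose structured metadata satisfies every criterion."""
    def matches(meta):
        for key, wanted in criteria.items():
            value = meta.get(key)
            values = value if isinstance(value, list) else [value]
            if wanted not in values:
                return False
        return True
    return [a["title"] for a in assets if matches(a["metadata"])]


print(keyword_search("touchdown"))           # both assets, including the interview
print(attribute_search(event="Touchdown"))   # only the actual touchdown play
```

The keyword query returns the interview because the word occurs in its description; the attribute query returns only the asset whose event really is a touchdown.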
HP 72
Taalee's Semantic Search
Highly customizable, precise and freshest A/V search
Context and Domain Specific Attributes
Uniform Metadata for Content from Multiple Sources; can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY & CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content
Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries & Hot Topics
PERSONALIZATION
Personalized Queries
1. My Stock Portfolio
  Microsoft suffers serious hack attack
  Cisco Systems Inc
  Analyst Safa Rashtchy on Yahoo
  PeopleSoft Inc
  AT&T Corp
  more...
2. My Football Fantasy Team
  Gators' Spurrier ready for big game
  Tech's Vick looks to become complete QB
  Bucs excited about Hamilton
  Jasper Sanks rumbles into the end zone...
  Edwards explains reasons for leaving BYU
  more...
3. Julia Roberts Collection
  Movie Trailer: Notting Hill
  Trailer - Runaway Bride
  Patrick
  Movie Trailer: Stepmom
  Conspiracy Theory
  more...
4. Pink Floyd Collection
  Set the Controls for the Heart of the Sun...
  Wish You Were Here
  Round And Around
  Keep Talking
  The Post War Dream
  more...
HOT Topics
1. Election 2000
  Video: Explaining the electoral map
  Race for White House hots up
  Seniors Give Gore Florida Edge
  more...
2. Middle East Peace Conflict
  More die as Israel steps up security
  Israel braces for suicide bombs
  Pentagon probes Cole's security
  more...
3. Napster Controversy
  The Brain Behind Napster
  Napster Lawsuit
  Creative Nomad II
  more...
HP 80
Metadata Targeting
Semantic/Interactive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
Realtime Feeds
Web sites and Pages
Content Databases
Time-Shifted Content Aggregator
Interests, Preferences
Semantic Engine(TM)
Structured Hi-Quality Semantic Metabase
Personalized Content
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site.
User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories.
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide).
Search at the bottom is a semantic search that understands the financial domain and the knowledge of user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu.
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst.
HP 87
iTV: Taalee's Extreme Personalization
Content Provider (DBS, DISH, Wink, AOL-TV)
Semantic Engine(TM)
Metadata-Tagged Content
Content "Programs"
Immediate Interests, Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-Quality Semantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.
This screen is customizable with interactivity features, using metadata such as whether there is a new Conference Call video on CSCO.
Part of the screen can be automatically customized to show conference call specific information, including transcript, participation, etc., all of which are relevant metadata.
Conference Call itself can have embedded metadata to support personalization and interactivity.
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support (Sony)
Categorize
Catalog
Integrate
Collection, Processing
Network Content
Affiliate Feeds
Public Sources
Rich Data Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were
-- Breaking News for 11/30/2000 --
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
Customize Page: Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday, the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore
HP 92
Metadata's role in emerging iTV infrastructure
Video
MPEG Encoder (MPEG-2/4/7)
Create Scene Description Tree
Scene Description Tree; Node = AVO Object
Enhanced Digital Cable
MPEG Decoder
Retrieve Scene Description Track
GREAT USER EXPERIENCE
Taalee Semantic Engine
Node "Cisco Systems"
Enhanced XML Description
Metadata-rich, Value-added Node
Object Content Information (OCI)
  Produced by: Fox Sports
  Creation Date: 12/05/2000
  League: NFL
  Teams: Seattle Seahawks, Atlanta Falcons
  Players: Jon Kitna
  Coaches: Mike Holmgren, Dan Reeves
  Location: Atlanta
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers and Broadcasters
HP 93
Intelligent Metadata Creation
Metadata for Intelligent Content
Extractor Agents: Content which does contain the words the user asked for
+ Value-added Metadata: Content which does not contain the words the user asked for, but is about what he asked for
+ Semantic Associations: Content the user did not think to ask for, but which he needs to know
Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example: If a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"
Understanding of the content is needed to create new metadata
Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
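The Roger Clemens example can be sketched as a walk over relationships in a knowledge base. The code below is a simplified illustration under stated assumptions: `PLAYS_FOR`, `TEAM_INFO`, `add_value`, and `search` are hypothetical names, and the slides do not say how Taalee's engine actually represents these associations.

```python
# Hypothetical knowledge base; the facts are the ones stated on the slides.
PLAYS_FOR = {
    "Roger Clemens": "New York Yankees",
    "Jamal Anderson": "Atlanta Falcons",
}
TEAM_INFO = {
    "New York Yankees": {"sport": "Baseball"},
    "Atlanta Falcons": {"sport": "Football", "league": "NFL"},
}


def add_value(metadata):
    """Return a copy of the metadata with team, sport, and league inferred from players."""
    enriched = {field: list(values) for field, values in metadata.items()}
    for player in metadata.get("players", []):
        team = PLAYS_FOR.get(player)
        if team is None:
            continue  # unknown player: nothing to infer
        inferred = {"teams": team, **TEAM_INFO.get(team, {})}
        for field, value in inferred.items():
            enriched.setdefault(field, [])
            if value not in enriched[field]:
                enriched[field].append(value)
    return enriched


def search(catalog, term):
    """Return titles whose metadata contains the term as a field value."""
    return [title for title, meta in catalog.items()
            if any(term in values for values in meta.values())]


story = {"players": ["Roger Clemens"]}
print(search({"Clemens story": story}, "New York Yankees"))             # not found
print(search({"Clemens story": add_value(story)}, "New York Yankees"))  # found
```

The story is matched on "New York Yankees" only after enrichment, even though those words never appear in its extracted metadata.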
HP 96
Guided Demo for Value Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
  Produced by: NFL.com
  Posted Date: 9/20/2000
  League: NFL
  Teams: Atlanta Falcons
  Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player of Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
  Produced by: ESPN
  Posted Date: 3/03/2001
  League: National League
  Teams: Los Angeles Dodgers
  Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player of Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 - Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Posted Date: Date of asset posting - Extracted automatically
Producer Name: Name of content provider that produced the asset
Player Names: Name of players mentioned explicitly in the asset - Extracted automatically
Team Name: Name of team for which player plays - Not mentioned explicitly in asset - Value-added using Taalee's semantic relationships
League Name: Name of league to which the player's team belongs - Not mentioned explicitly in asset - Value-added by Taalee's processing based on semantic associations
Sport: Name of sport
Some Metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships
The asset is richly, fully described in the many ways the users chose to interact
Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context or relationships of keywords
For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (To view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata-centric Content Management Architecture
Internal Source 1 (Research)
Internal Source 2
External feeds/Web (e.g., Reuters)
Extractor Agent 1
Extractor Agent 2
Extractor Agent 3
Semantic Engine
World Model
Taalee Metabase
XCM-compliant metadata, XML or other format
Semantic Application (ASP/Enterprise hosted)
Third-party Content Mgmt and Syndication
Story on Cisco
Story on Lucent
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults KnowledgeBase for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story; passed on to Dashboard
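The four numbered steps amount to a lookup of a semantic association followed by a filter over incoming stories. A rough sketch follows; the `COMPETES_WITH` table and the `related_stories` function are hypothetical, the competitor pairs are the ones named on the slides, and the IBM item is invented filler to show a non-match.

```python
# Competitor pairs named on the slides; the table structure is an assumption.
COMPETES_WITH = {
    "Cisco": ["Lucent"],
    "Commerce One": ["Ariba"],
}


def related_stories(story, external_feed):
    """Steps 2-4: look up the story's competitors, then pick matching feed items."""
    competitors = COMPETES_WITH.get(story["company"], [])
    return [item for item in external_feed if item["company"] in competitors]


# Step 1: a story about Cisco arrives from an internal source.
cisco_story = {"company": "Cisco", "headline": "Story on Cisco"}
feed = [
    {"company": "Lucent", "headline": "Story on Lucent"},
    {"company": "IBM", "headline": "Story on IBM"},  # invented filler item
]

for item in related_stories(cisco_story, feed):
    print("Semantically related:", item["headline"])
```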
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News: COMPANIES in Same or Related INDUSTRY
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS
Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic SearchPersonalization etc
A vision for future
Semantic Web Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling obviously not enough
  we need a "logic layer" on top of RDF
  some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture" [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]
A personal definition: Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)
a proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF+XML
Agent readable Tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-layered Architecture of Semantic Web
Logical Layer
Formal Semantics and Reasoning Support - OIL, DAML-O
Schema Layer
Definition of Vocabulary - RDF Schema
Data Layer
Simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL - Evolving towards Semantic Web
OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Test May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake <=
  dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
  AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
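The rule above can be evaluated directly once `dateDifference` and `distance` are given concrete meanings. In the sketch below both are assumptions: the date difference is taken in days, the distance is a great-circle (haversine) distance in miles, and an extra check that the earthquake does not precede the test reflects the "some time after" wording. The sample events are invented purely to exercise the rule.

```python
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; miles assumed as the unit


def distance(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points (haversine formula)."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(min(1.0, sqrt(a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """NuclearTest Causes Earthquake <= date difference < 30 AND distance < 10000."""
    days = (quake["eventDate"] - test["eventDate"]).days
    if not 0 <= days < max_days:
        return False
    return distance(test["latitude"], test["longitude"],
                    quake["latitude"], quake["longitude"]) < max_miles


# Invented sample events, not real observations.
test = {"eventDate": date(2000, 1, 1), "latitude": 10.0, "longitude": 20.0}
quake_soon = {"eventDate": date(2000, 1, 15), "latitude": 12.0, "longitude": 22.0}
quake_late = {"eventDate": date(2000, 3, 1), "latitude": 12.0, "longitude": 22.0}

print(may_have_caused(test, quake_soon))
print(may_have_caused(test, quake_late))
```

A rule of this form only flags candidate pairs for further study; satisfying it shows proximity in time and space, not causation.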
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year, starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example...
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9 and >9 on the Richter scale, per year starting from 1900, find average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
Knowledge Discovery - Example...
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 38
Multiple heterogeneous metadata models with different tag names for the same data in the same GIS domain (Kansas State)
FGDC Metadata Model
Theme keywords: digital line graph, hydrography, transportation
Title: Dakota Aquifer
Online linkage: http://gisdasc.kgs.ukans.edu/dasc/
Direct Spatial Reference Method: Vector
Horizontal Coordinate System Definition: Universal Transverse Mercator
…
UDK Metadata Model
Search terms: digital line graph, hydrography, transportation
Topic: Dakota Aquifer
Address Id: http://gisdasc.kgs.ukans.edu/dasc/
Measuring Techniques: Vector
Co-ordinate System: Universal Transverse Mercator
…
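One way to reconcile the two models is a crosswalk that renames each model's tags to a shared vocabulary. The sketch below is only an illustration: the tag names are taken from the slide (with "Address Id" spelled out), while the shared names such as `keywords` and `coordinate_system` and the `normalize` helper are assumptions made for the example, not part of FGDC or UDK.

```python
# Hypothetical crosswalks from each model's tag names (as shown on the slide)
# to a shared vocabulary invented for this example.
FGDC_TO_COMMON = {
    "Theme keywords": "keywords",
    "Title": "title",
    "Online linkage": "url",
    "Direct Spatial Reference Method": "spatial_representation",
    "Horizontal Coordinate System Definition": "coordinate_system",
}
UDK_TO_COMMON = {
    "Search terms": "keywords",
    "Topic": "title",
    "Address Id": "url",
    "Measuring Techniques": "spatial_representation",
    "Co-ordinate System": "coordinate_system",
}


def normalize(record, crosswalk):
    """Rename the tags of one record to the shared vocabulary.

    Tags missing from the crosswalk are kept under their original name.
    """
    return {crosswalk.get(tag, tag): value for tag, value in record.items()}


# The same Dakota Aquifer data set, described once per model.
fgdc_record = {
    "Theme keywords": ["digital line graph", "hydrography", "transportation"],
    "Title": "Dakota Aquifer",
    "Direct Spatial Reference Method": "Vector",
    "Horizontal Coordinate System Definition": "Universal Transverse Mercator",
}
udk_record = {
    "Search terms": ["digital line graph", "hydrography", "transportation"],
    "Topic": "Dakota Aquifer",
    "Measuring Techniques": "Vector",
    "Co-ordinate System": "Universal Transverse Mercator",
}

common_fgdc = normalize(fgdc_record, FGDC_TO_COMMON)
common_udk = normalize(udk_record, UDK_TO_COMMON)
```

After normalization the two records compare equal, so a query written against the shared names works for both sources.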
HP 39
Different views of Metadata
Domain Independent Specifications (RDF)
Frameworks/Infrastructures (XCM)
Metadata
Application Specific: ICE
Media Specific: MPEG7, VoiceXML
Domain Specific: NewsML, FGDC/UDK
HP 40
Creating and Serving Metadata to Power the Life-cycle of Content
Taalee Infrastructure Services, Taalee Content Applications
Where is the content? Whose is it?
Produce/Aggregate
Catalog/Index
What other content is it related to?
Integrate/Syndicate
What is the right content for this user?
Personalize
What is the best way to monetize this interaction?
Interactive Marketing
Broadcast, Wireline, Wireless, Interactive TV
Taalee Semantic MetaBase
HP 41
Taalee's Intelligent Content Process
HP 42
Metadata Creation and Semanticization
• Automatic Content Classification/Categorization
• Metadata Creation/Extraction
Types of metadata created
Semantic Engine and WorldModel are trademarks of Taalee Inc. Metadata Extraction is a patented technology of Taalee Inc.
HP 43
Forms/Types/Ingest of Content
Sources: Web Sites, Content Feeds and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)
HP 44
Content Handling/Ingest
Infrastructure/Exchange
Feed Handlers, Crawlers, Screen Scrapers, Bots, Software Agents
Centralized, Distributed, Mobile/Migratory
HP 45
Information Extraction for Metadata Creation
Global/Enterprise/Web Repositories
METADATA EXTRACTORS
Digital Maps
Nexis/UPI/AP
Documents
Digital Audios
Data Stores
Digital Videos
Digital Images
HP 46
Extracting a Text Document: Syntactic approach
INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT
NATIONAL PREPAREDNESS LEVEL II
CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.
SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr… The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.
CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.
LAYOUT
Date => day month int ',' int
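The layout rule above (a day name, a month name, an integer, a comma, an integer) can be sketched as a regular expression. This is a minimal illustration rather than the extractor used in the talk; the pattern, the `extract_date` helper, and the decision to make the commas optional are all assumptions.

```python
import re
from datetime import date

MONTHS = {
    name: number
    for number, name in enumerate(
        ["January", "February", "March", "April", "May", "June", "July",
         "August", "September", "October", "November", "December"],
        start=1,
    )
}
DAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"

# Date => day month int ',' int  (commas are optional here, an assumption,
# because extracted text often loses its punctuation)
DATE_PATTERN = re.compile(
    r"(?P<day>" + DAYS + r"),?\s+"
    r"(?P<month>" + "|".join(MONTHS) + r")\s+"
    r"(?P<dom>\d{1,2}),?\s+"
    r"(?P<year>\d{4})"
)


def extract_date(text):
    """Return the first date matching the layout rule, or None if there is none."""
    match = DATE_PATTERN.search(text)
    if match is None:
        return None
    return date(int(match.group("year")),
                MONTHS[match.group("month")],
                int(match.group("dom")))


header = "INCIDENT MANAGEMENT SITUATION REPORT Friday, August 1, 1997 - 0530 MDT"
report_date = extract_date(header)
```

A purely syntactic rule like this yields content-independent metadata (the report date) without any understanding of what the report is about.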
Traditional Text Categorization
Customer Article Feed (Article 4715)
Statistical/AI Techniques
Customer Training Set
Classify/Place in a taxonomy
Routing/Distribution
Standard Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Classification of Article 4715
Taalee's Categorization & Automatic Metadata Creation
Automated Content Enrichment (ACE)
Taalee Enterprise Customization Suite
Article Feed (Article 4715)
Knowledge-base & Statistical/AI Techniques
Taalee Training Set, Customer Training Set
Classify/Place in a taxonomy
Metadata Catalog
Content Manager
Precise syndication/filtering
Routing/Distribution
Map to another taxonomy
Article 4715 Metadata
Standard metadata: Feed Source: iSyndicate; Posted Date: 11/20/2000
Semantic metadata: Company Name: France Telecom, Equant; Ticker Symbol: FTE, ENT; Exchange: NYSE; Topic: Company News
Classification of Article 4715
FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
NYSE: Member Companies, Market News, IPOs
HP 49
Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.
Video Segment with Associated Text
Segment Description
Semantic Metadata
Auto Categorization
HP 50
Automatic Categorization & Metadata Tagging (Web page)
Video with Editorialized Text on the Web
Auto Categorization
Semantic Metadata
HP 51
Automatic Categorization & Metadata Tagging (Feed)
Text from Bloomberg
Auto Categorization
Semantic Metadata
HP 52
Taalee Extraction and Knowledgebase Enhancement
Extraction Agent
Web Page Enhanced Metadata Asset
HP 53
Basis for Semantics
A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary
B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g. Golf = golfer, tournament name, golf course, event)
Knowledge Base
HP 54
Basis for Semantics
C. Reasoning/Inference
(Statistical) (Information Retrieval)
Statistical Learning/AI (Bayesian, Neural Networks, HMM, …)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methods/Cluster Analysis
Learning/AI and Collab. Filtering
Reference data/Concept-terms/Dictionary/Thesaurus: by topic/industry/subject/domain (Word or Phrase)
Ontologies/Domain Models/KnowledgeBase: by Entities and Relationships
deeper understanding
HP 56
Open Directory Project (ODP): Classification/Taxonomy & Directory
HP 57
Ontology
Standardize meaning, description, representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes: Attributes, Domain Rules, Functional Dependencies
HP 59
An Ontology
Example: Interrelated ontologies
[Figure: interrelated ontologies. Node labels, land use: LANDUSE, COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL, LAND (SITE), CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE. Node labels, waste disposal: WASTE DISPOSAL, RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE, shredding, magnetic separation, screening, washing. Node labels, natural disaster: NATURAL DISASTER, EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI. Several nodes are linked by "causes" relationships.]
HP 61
Large Vocabularies, Taxonomies/Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.
Metadata-enabled Applications
HP 63
Metadata Usage: Impact on Search & Query processing
traditional queries based on keywords
attribute based queries
content-based queries
HP 64
Oingo.com
Oingo Ontology (ODP based?): the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek: the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense: the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua: the language of meaning used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely-related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:
• Context of the Content
• Semantic Metadata describing entities (i.e. Company, Industry, etc.) and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
"Normal" Content can only be "found" if the user enters a keyword that exists within it
Intelligent Content=+
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:
Context of the Content
Semantic Metadata describing entities (i.e. Company, Industry, etc.) and
Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g. Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …
HP 70
Metadata Usage Keyword Attribute and Content Based Access
The VisualHarness system at LSDIS/UGA
HP 71
Keyword Search vs Attribute Search with Semantic metadata
Virage Search on "football touchdown"
Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline…
Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline…
Metadata from Typical Cataloging of Football Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31, Pit 24
http://www.nfl.com
Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
HP 72
Taalee's Semantic Search
Highly customizable, precise and freshest A/V search
Context and Domain Specific Attributes; Uniform Metadata for Content from Multiple Sources; Can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY amp CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content: Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries & Hot Topics
PERSONALIZATION
Personalized Queries
1 My Stock Portfolio: Microsoft suffers serious hack attack; Cisco Systems Inc; Analyst Safa Rashtchy on Yahoo; PeopleSoft Inc; AT&T Corp; more…
2 My Football Fantasy Team: Gators' Spurrier ready for big game; Tech's Vick looks to become complete QB; Bucs excited about Hamilton; Jasper Sanks rumbles into the end zone…; Edwards explains reasons for leaving BYU; more…
3 Julia Roberts Collection: Movie Trailer Notting Hill; Trailer - Runaway Bride; Patrick; Movie Trailer Stepmom; Conspiracy Theory; more…
4 Pink Floyd Collection: Set the Controls for the Heart of the Sun…; Wish You Were Here; Round And Around; Keep Talking; The Post War Dream; more…
HOT Topics
1 Election 2000: Video Explaining the electoral map; Race for White House hots up; Seniors Give Gore Florida Edge; more…
2 Middle East Peace Conflict: More die as Israel steps up security; Israel braces for suicide bombs; Pentagon probes Cole's security; more…
3 Napster Controversy: The Brain Behind Napster; Napster Lawsuit; Creative Nomad II; more…
HP 80
Metadata Targeting
Semantic/Interactive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
[Figure labels: Realtime Feeds; Web sites and Pages; Content Databases; Interests, Preferences; Time-Shifted Content Aggregator; Semantic Engine™; Structured Hi-Quality Semantic Metabase; Personalized Content]
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site.
User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories.
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide).
Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu.
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available.
The user clicks to see a listing of Analyst Calls on Cisco (next slide).
Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst.
HP 87
iTV: Taalee's Extreme Personalization
Content Provider (DBS, DISH, Wink, AOL-TV)
Semantic Engine™
Meta-Data Tagged Content
Content "Programs"
Immediate Interests, Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-Quality Semantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.
This screen is customizable with an interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO.
Part of the screen can be automatically customized to show conference call specific information, including transcript, participation, etc., all of which are relevant metadata.
Conference Call itself can have embedded metadata to support personalization and interactivity.
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support
Sony
Categorize
Catalog
Integrate
Collection, Processing
Network Content
Affiliate Feeds
Public Sources, Rich Data
Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp. (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday, the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore
HP 92
Metadata's role in emerging iTV infrastructure
[Figure labels: Enhanced Digital Cable; Video; MPEG-2/4/7; MPEG Encoder; MPEG Decoder; Create Scene Description Tree; Retrieve Scene Description Track; Scene Description Tree; Node = AVO Object; Object Content Information (OCI); Enhanced XML Description; Taalee Semantic Engine; "Cisco Systems" Node; Metadata-rich Value-added Node; GREAT USER EXPERIENCE]
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers and Broadcasters
Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: John Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta
HP 93
Metadata for Intelligent Content
Intelligent Metadata Creation
Extractor Agents: Content which does contain the words the user asked for
+ Value-added Metadata: Content which does not contain the words the user asked for, but is about what he asked for
+ Semantic Associations: Content the user did not think to ask for, but which he needs to know
Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example: If a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"
Understanding of the content is needed to create new metadata
Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
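The Roger Clemens example can be sketched as a lookup against a small knowledge base. This is a toy illustration under stated assumptions, not Taalee's implementation: the `KNOWLEDGE_BASE` layout, the relationship names `plays_for` and `plays_sport`, and the helpers `enrich` and `matches` are all invented for the example.

```python
# Toy knowledge base: (entity, relationship) -> related entity.
# The facts mirror the slide's example; the structure is hypothetical.
KNOWLEDGE_BASE = {
    ("Roger Clemens", "plays_for"): "New York Yankees",
    ("Roger Clemens", "plays_sport"): "Baseball",
}


def enrich(metadata):
    """Return a copy of the metadata with team and sport added for known players."""
    enriched = {field: list(values) for field, values in metadata.items()}
    for player in metadata.get("players", []):
        for relationship, field in (("plays_for", "teams"), ("plays_sport", "sport")):
            value = KNOWLEDGE_BASE.get((player, relationship))
            if value and value not in enriched.setdefault(field, []):
                enriched[field].append(value)
    return enriched


def matches(metadata, query):
    """Case-insensitive keyword match against every metadata value."""
    needle = query.lower()
    return any(needle in value.lower()
               for values in metadata.values()
               for value in values)


# Metadata extracted directly from a story that only mentions the player.
extracted = {"players": ["Roger Clemens"]}
value_added = enrich(extracted)
```

With only the extracted metadata, a search for "Yankees" misses the story; after enrichment the same search finds it, which is the effect the slide describes.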
HP 96
Guided Demo for Value Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player of the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player of the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 - Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Producer Name: Name of content provider that produced the asset
Posted Date: Date of asset posting - Extracted automatically
Player Names: Name of players mentioned explicitly in the asset - Extracted automatically
Team Name: Name of team for which player plays - Not mentioned explicitly in asset - Value-added using Taalee's semantic relationships
League Name: Name of league to which the player's team belongs - Not mentioned explicitly in asset - Value-added by Taalee's processing based on semantic associations
Sport: Name of sport
Some Metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships
The asset is richly, fully described in the many ways the users chose to interact
Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context or relationships of keywords
For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
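The Commerce One example can be illustrated with typed entities and typed relationships. The sketch below is an assumption-laden toy, not the Semantic Content Model itself: the triple layout, the relationship names, the helper functions and the three catalog stories are hypothetical, and only the Commerce One and Ariba facts come from the slide.

```python
# Entity types and relationships from the slide's example.
ENTITY_TYPES = {
    "Commerce One": "COMPANY",
    "Ariba": "COMPANY",
    "Corporate, Professional & Financial Software": "INDUSTRY",
}
RELATIONSHIPS = [
    ("Commerce One", "participates_in", "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]


def related(entity, relationship):
    """Return every entity linked to `entity` by the given relationship."""
    return [obj for subj, rel, obj in RELATIONSHIPS
            if subj == entity and rel == relationship]


def associated_stories(entity, stories):
    """Titles of stories about the entity's competitors (content not asked for)."""
    competitors = set(related(entity, "competes_with"))
    return [story["title"] for story in stories
            if competitors & set(story["entities"])]


# Hypothetical catalog entries, used only to exercise the lookup.
stories = [
    {"title": "Story A", "entities": ["Commerce One"]},
    {"title": "Story B", "entities": ["Ariba"]},
    {"title": "Story C", "entities": ["Cisco"]},
]

competitor_news = associated_stories("Commerce One", stories)
```

A keyword index would return only "Story A" for a Commerce One query; following the `competes_with` relationship also surfaces "Story B".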
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (To view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata-centric Content Management Architecture
Sources: Internal Source 1 (Research), Internal Source 2, External feeds/Web (e.g. Reuters)
Components: Extractor Agent 1, Extractor Agent 2, Extractor Agent 3, Semantic Engine, World Model, Taalee Metabase, Semantic Application (ASP/Enterprise hosted), Third-party Content Mgmt and Syndication
XCM-compliant metadata, XML or other format
1 Cisco story from PW Source 1 passed on to add semantic associations
2 Consults KnowledgeBase for Cisco's competition
3 Returns result: Lucent is a competitor of Cisco
4 Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard
Story on Lucent
Story on Cisco
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News: COMPANIES in Same or Related INDUSTRY
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS
Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY
Technology/Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search/Personalization etc.
A vision for the future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g. InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]
A personal definition: Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources and to provide interoperability between agents in a semantic manner
Based on RDF+XML
Agent readable Tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-layered Architecture of Semantic Web
Logical Layer: Formal Semantics and Reasoning Support - OIL, DAML-O
Schema Layer: Definition of Vocabulary - RDF Schema
Data Layer: Simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL - Evolving towards Semantic Web
OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Tests May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake <= dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30 AND distance(NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude) < 10000
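The rule above can be sketched as a predicate over two events. This is an illustrative reading, not InfoQuilt code: the `Event` class, the haversine great-circle distance in miles, and the extra check that the earthquake does not precede the test are assumptions made for the example, and the sample events are made up.

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


@dataclass
class Event:
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees


def distance_miles(a, b):
    """Great-circle (haversine) distance between two events, in miles."""
    lat1, lon1, lat2, lon2 = map(radians, (a.latitude, a.longitude,
                                           b.latitude, b.longitude))
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(h))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """NuclearTest Causes Earthquake <= date difference < 30 AND distance < 10000.

    Assumption: the earthquake must not precede the test (the slide's prose
    says "some time after"; the formula itself only bounds the difference).
    """
    days = (quake.event_date - test.event_date).days
    return 0 <= days < max_days and distance_miles(test, quake) < max_miles


# Made-up events, used only to exercise the rule.
test_event = Event(date(1998, 5, 11), 27.0, 71.7)
quake_soon_after = Event(date(1998, 5, 30), 37.1, 70.1)
quake_before = Event(date(1998, 5, 1), 37.1, 70.1)
quake_much_later = Event(date(1998, 7, 1), 37.1, 70.1)
```

With a threshold of 10000 miles the distance condition excludes only events on nearly opposite sides of the globe, so in practice the time window does most of the filtering.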
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example…
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9 and >9 on the Richter scale, find the average number of earthquakes per year, starting from 1900
Number of earthquakes with magnitude > 7 is almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
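The per-range averaging described above can be sketched as follows. Several assumptions are made for the example: the lower bound "58" in the extracted text is read as 5.8, ranges are treated as half-open so the last one starts at 9, and the helper names and the four sample records are hypothetical rather than real catalog data.

```python
from collections import Counter

# Magnitude ranges from the slide, as half-open intervals [low, high).
BUCKETS = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, float("inf"))]


def bucket_of(magnitude):
    """Return the range containing the magnitude, or None if below 5.8."""
    for low, high in BUCKETS:
        if low <= magnitude < high:
            return (low, high)
    return None


def average_per_year(quakes, first_year, last_year):
    """Average yearly count per magnitude range over [first_year, last_year].

    `quakes` is an iterable of (year, magnitude) pairs.
    """
    counts = Counter()
    for year, magnitude in quakes:
        bucket = bucket_of(magnitude)
        if bucket is not None and first_year <= year <= last_year:
            counts[bucket] += 1
    years = last_year - first_year + 1
    return {bucket: counts[bucket] / years for bucket in BUCKETS}


# Made-up records, used only to show the shape of the computation.
sample = [(1900, 5.9), (1900, 6.5), (1901, 6.2), (1901, 7.1)]
averages = average_per_year(sample, 1900, 1901)
```

Comparing these averages for the years before and after testing began is the comparison the slide draws its conclusion from.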
Knowledge Discovery - Example…
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and within a radius of 10000 miles of the epicenter of the earthquake
Demo
HP 39
Different views of Metadata
Domain Independent Specifications (RDF)
Frameworks/Infrastructures (XCM)
Metadata
Application Specific
ICE
Media Specific
MPEG7, VoiceXML
Domain Specific
NewsML, FGDC, UDK
HP 40
Creating and Serving Metadata to Power the Life-cycle of Content
Taalee Infrastructure Services
Taalee Content Applications
Where is the content?
Whose is it?
Produce/Aggregate
Catalog/Index
What other content is it related to?
Integrate, Syndicate
What is the right content for this user?
Personalize
What is the best way to monetize this interaction?
Interactive Marketing
Broadcast, Wireline, Wireless, Interactive TV
Taalee Semantic MetaBase
HP 41
Taalee's Intelligent Content Process
HP 42
Metadata Creation and Semanticization
• Automatic Content Classification/Categorization
• Metadata Creation/Extraction
Types of metadata created
Semantic Engine and WorldModel are trademarks of Taalee Inc. Metadata Extraction is a patented technology of Taalee Inc.
HP 43
Forms/Types/Ingest of Content
Sources: Web Sites, Content Feeds, and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)
HP 44
Content Handling/Ingest
Infrastructure/Exchange
Feed Handlers, Crawlers, Screen Scrapers, Bots, Software Agents
Centralized, Distributed, Mobile/Migratory
HP 45
Information Extraction for Metadata Creation
Global/Enterprise/Web Repositories
METADATA EXTRACTORS
Digital Maps
Nexis, UPI, AP
Documents
Digital Audios
Data Stores
Digital Videos
Digital Images
HP 46
Extracting a Text Document: Syntactic approach
INCIDENT MANAGEMENT SITUATION REPORT
Friday August 1, 1997 - 0530 MDT
NATIONAL PREPAREDNESS LEVEL II
CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.
SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr… The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.
CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.
LAYOUT
Date => day month int ',' int
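A syntactic layout rule of this kind maps naturally onto a regular expression. The sketch below is one possible reading of `Date => day month int ',' int`; the pattern name, the group names, and making the comma optional are assumptions for illustration, not the extractor used in the original system.

```python
import re

DAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"
MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")

# Date => day month int ',' int  (the comma is treated as optional here, an assumption)
DATE_RULE = re.compile(
    rf"(?P<day>{DAYS})\s+(?P<month>{MONTHS})\s+(?P<dom>\d{{1,2}})\s*,?\s*(?P<year>\d{{4}})"
)


def extract_date(text):
    """Return the matched date fields as a dict, or None when the rule does not fire."""
    match = DATE_RULE.search(text)
    return match.groupdict() if match else None
```

The rule only looks at surface form: it would fire on any weekday-month-number sequence, which is the limitation that motivates the knowledge-based approaches on the following slides.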
Traditional Text Categorization
Statistical/AI Techniques
Classify/Place in a taxonomy
feed
Customer Training Set
Routing/Distribution
Customer Article Feed: Article 4715
Standard Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Classification of Article 4715
Knowledge-base & Statistical/AI Techniques
Classify/Place in a taxonomy
Metadata Catalog
Content Manager
Precise syndication/filtering
Article 4715 Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News
Standard metadata
Semantic metadata
FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
NYSE: Member Companies, Market News, IPOs
Automated Content Enrichment (ACE)
Taalee Enterprise Customization Suite
Taalee's Categorization & Automatic Metadata Creation
Taalee Training Set
Customer Training Set
ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
Classification of Article 4715
Article Feed 4715: Routing/Distribution
Map to another taxonomy
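As a rough illustration of the "Statistical/AI Techniques" box that places an article in a taxonomy from a training set, here is a minimal multinomial naive Bayes classifier with add-one smoothing. Naive Bayes is swapped in as one common statistical technique; the slides do not say which method Taalee used, and the class name, tokenizer, and any training sentences are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenizer; deliberately simplistic."""
    return text.lower().split()


class NaiveBayesCategorizer:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word frequencies
        self.doc_counts = Counter()              # category -> number of training documents
        self.vocab = set()

    def train(self, text, category):
        words = tokenize(text)
        self.word_counts[category].update(words)
        self.doc_counts[category] += 1
        self.vocab.update(words)

    def classify(self, text):
        """Return the category with the highest log-probability, or None if untrained."""
        total_docs = sum(self.doc_counts.values())
        best, best_score = None, -math.inf
        for category in self.doc_counts:
            counts = self.word_counts[category]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(self.doc_counts[category] / total_docs)
            for word in tokenize(text):
                score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best, best_score = category, score
        return best
```

A customer training set would be fed through `train` once per labeled article; `classify` then picks the taxonomy node for each incoming feed item. Mapping to another taxonomy, as on the slide, would be a separate lookup from the predicted category.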
HP 49
Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.
Video Segment with Associated Text
Segment Description
Semantic Metadata
Auto Categorization
HP 50
Automatic Categorization & Metadata Tagging (Web page)
Video with Editorialized Text on the Web
Auto Categorization
Semantic Metadata
HP 51
Automatic Categorization & Metadata Tagging (Feed)
Text From Bloomberg
Auto Categorization
Semantic Metadata
HP 52
Taalee Extraction and Knowledgebase Enhancement
Extraction Agent
Web Page Enhanced Metadata Asset
HP 53
Basis for Semantics
A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary
B. Facts with Relationships: Taxonomy (Categories), Ontology/Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)
Knowledge Base
HP 54
Basis for Semantics
C. Reasoning/Inference
(Statistical) (Information Retrieval)
Statistical Learning/AI (Bayesian, Neural Networks, HMM…)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methods: Cluster Analysis
Learning/AI and Collab. Filtering
Reference data: Concept-terms, Dictionary, Thesaurus; by topic/industry/subject/domain
Word or Phrase
Ontologies/Domain Models
KnowledgeBase: by Entities and Relationships
deeper understanding
HP 56
Open Directory Project (ODP): Classification/Taxonomy & Directory
HP 57
Ontology
Standardize meaning, description, representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes
Attributes
Domain Rules
Functional Dependencies
HP 59
An Ontology
Example: Interrelated ontologies
[Figure: three interrelated ontologies; labels recovered from the diagram]
LANDUSE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL, LAND (SITE), CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE
WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; shredding, magnetic separation, screening, washing
NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI; linked by "causes" relationships
HP 61
Large Vocabularies, Taxonomies/Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.
Confidential HP
Metadata-enabled Applications
HP 63
Metadata Usage: Impact on Search & Query processing
traditional queries based on keywords
attribute based queries
content-based queries
HP 64
Oingo.com
Oingo Ontology - ODP based(?) - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely-related high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged, and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing
• Context of the Content
• Semantic Metadata describing entities (e.g., Company, Industry, etc.) and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
"Normal" Content can only be "found" if the user enters a keyword that exists within it
Intelligent Content=+
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create
Context of the Content
Semantic Metadata describing entities (e.g., Company, Industry, etc.) and
Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata amp Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …
HP 70
Metadata Usage Keyword Attribute and Content Based Access
The VisualHarness system at LSDIS/UGA
HP 71
Keyword Search vs Attribute Search with Semantic metadata
Virage Search on football touchdown
Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: httpcbssportsline
Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: httpcbssportsline
Metadata from Typical Cataloging of Football Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31, Pit 24
http://www.nfl.com
Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
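The contrast between the two kinds of search can be sketched with the football asset above. This is a toy illustration: the dictionary layout and both function names are assumptions, and the single record simply restates the metadata shown on the slide.

```python
ASSETS = [
    {
        "title": "Baltimore 31, Pit 24",
        "description": ("Quandry Ismail and Tony Banks hook up for their third long "
                        "touchdown, this time on a 76-yarder, to extend the Ravens' "
                        "lead to 31-24 in the third quarter"),
        # Attribute metadata as listed on the slide (field layout is illustrative)
        "metadata": {
            "League": ["Professional"],
            "Teams": ["Ravens", "Steelers"],
            "Players": ["Quandry Ismail", "Tony Banks"],
            "Event": ["Touchdown"],
            "Produced by": ["NFL.com"],
        },
    },
]


def keyword_search(assets, *words):
    """Return assets whose title or description contains every word."""
    hits = []
    for asset in assets:
        text = f"{asset['title']} {asset['description']}".lower()
        if all(word.lower() in text for word in words):
            hits.append(asset)
    return hits


def attribute_search(assets, **criteria):
    """Return assets whose metadata matches every attribute=value pair."""
    hits = []
    for asset in assets:
        meta = asset["metadata"]
        if all(value in meta.get(field, []) for field, value in criteria.items()):
            hits.append(asset)
    return hits
```

A keyword search for "Steelers" misses this asset because the word never appears in its text, while `attribute_search(ASSETS, Teams="Steelers")` finds it through the metadata.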
HP 72
Taalee's Semantic Search
Highly customizable, precise and freshest A/V search
Context and Domain Specific Attributes
Uniform Metadata for Content from Multiple Sources; can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY amp CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content
Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries amp Hot Topics
PERSONALIZATION
3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
4 Pink Floyd Collection
Personalized Queries
Set the Controls for the Heart of the Sun…
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
2 My Football Fantasy Team
Gators Spurrier ready for big game
Techs Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone…
Edwards explains reasons for leaving BYU
more…
more…
more…
more…
1 Election 2000
2 Middle East Peace Conflict
3 Napster Controversy
Video Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Cole's security
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II
more…
HOT Topics
more…
more…
HP 80
Metadata Targeting
SemanticInteractive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
Realtime Feeds
Interests Preferences
Time-Shifted Content Aggregator
Web sites and Pages
Content Databases
Personalized Content
Semantic Engine™
Personalized Content
Content
Structured Hi-Quality Semantic Metabase
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site
User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst
HP 87
iTV: Taalee's Extreme Personalization
Content Provider
(DBS, DISH, Wink, AOL-TV)
Semantic Engine™
Meta-Data Tagged Content
Content "Programs"
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-Quality Semantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by personalization application to show only the stocks that user is interested in
This screen is customizable with interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO
Part of the screen can be automatically customized to show conference call specific information, including transcript, participation, etc., all of which are relevant metadata
Conference Call itself can have embedded metadata to support personalization and interactivity
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support
Sony
Categorize
Catalog
Integrate
Collection
Processing
Network Content
Affiliate Feeds
Public Sources Rich Data
Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore
HP 92
Metadata's role in emerging iTV infrastructure
Enhanced Digital Cable
Video
MPEG-2/4/7
MPEG Encoder
MPEG Decoder
Create Scene Description Tree
Retrieve Scene Description Track
Scene Description Tree
Node = AVO Object
"Cisco Systems" Node
Enhanced XML Description
Taalee Semantic Engine
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta
Object Content Information (OCI)
Metadata-rich Value-added Node
GREAT USER EXPERIENCE
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers, and Broadcasters
HP 93
Intelligent Metadata Creation
Content which does contain the words the user asked for
Extractor Agents
+
Content which does not contain the words the user asked for, but is about what he asked for
Value-added Metadata
+
Content the user did not think to ask for, but which he needs to know
Semantic Associations
Metadata for Intelligent Content
Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example: If a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"
Understanding of the content is needed to create new metadata
Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
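The Roger Clemens example can be sketched as a lookup against a small set of relationship triples. Everything here is illustrative: the triple format, the relationship names, the field mapping, and the sample story text are assumptions, and the real knowledgebase and Semantic Engine are not described at this level in the slides.

```python
# (subject, relationship, object) triples; a hypothetical stand-in for a knowledgebase
KNOWLEDGE_BASE = [
    ("Roger Clemens", "is_a", "PERSON"),
    ("Roger Clemens", "plays_sport", "Baseball"),
    ("Roger Clemens", "plays_for", "New York Yankees"),
]

# Which relationship fills which metadata field (an illustrative mapping)
FIELD_FOR_RELATIONSHIP = {"plays_sport": "Sport", "plays_for": "Team"}


def enrich(text):
    """Add metadata for every known entity mentioned in the text."""
    metadata = {}
    for subject, relationship, obj in KNOWLEDGE_BASE:
        if subject in text:
            metadata.setdefault("Player", set()).add(subject)
            field = FIELD_FOR_RELATIONSHIP.get(relationship)
            if field:
                metadata.setdefault(field, set()).add(obj)
    return metadata


def matches(query, text, metadata):
    """True if the query occurs in the text or in any metadata value."""
    if query in text:
        return True
    return any(query in value for values in metadata.values() for value in values)
```

With the enriched metadata, a search for "Yankees" matches a story that only mentions Roger Clemens by name; without it, the same search finds nothing.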
HP 96
Guided Demo for Value Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player of Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player of Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 - Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Posted Date: Date of asset posting - Extracted automatically
League Name: Name of league to which the player's team belongs - Not mentioned explicitly in asset - Value-added by Taalee's processing based on semantic associations
Team Name: Name of team for which player plays - Not mentioned explicitly in asset - Value-added using Taalee's semantic relationships
Producer Name: Name of content provider that produced the asset
Player Names: Name of players mentioned explicitly in the asset - Extracted automatically
Sport: Name of sport
Some Metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships
The asset is richly, fully described in the many ways the users chose to interact
Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context, or relationships of keywords
For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
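One way to picture how associations widen a result set is the sketch below. The association table restates the Commerce One facts from the slide; the story records, their headlines, and the function name are hypothetical, and this is not a description of Taalee's Semantic Content Model.

```python
# Associations keyed by (entity, relationship); facts restated from the slide
ASSOCIATIONS = {
    ("Commerce One", "is_a"): ["COMPANY"],
    ("Commerce One", "participates_in"): ["Corporate, Professional & Financial Software"],
    ("Commerce One", "competes_with"): ["Ariba"],
}

# Hypothetical catalog entries; headlines are placeholders, not real stories
STORIES = [
    {"headline": "Placeholder story about Commerce One", "companies": ["Commerce One"]},
    {"headline": "Placeholder story about Ariba", "companies": ["Ariba"]},
    {"headline": "Placeholder story about Cisco", "companies": ["Cisco"]},
]


def related_stories(company, stories):
    """Return (stories about the company, stories about its competitors)."""
    asked_for = [s for s in stories if company in s["companies"]]
    competitors = ASSOCIATIONS.get((company, "competes_with"), [])
    need_to_know = [s for s in stories
                    if any(c in s["companies"] for c in competitors)]
    return asked_for, need_to_know
```

The first list is what a keyword index could already return; the second exists only because the COMPETES WITH association is stored alongside the content.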
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (To view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata centric Content Management Architecture
Internal Source 1: Research
Internal Source 2
External feeds/Web (e.g., Reuters)
Extractor Agent 1
Extractor Agent 2
Extractor Agent 3
Semantic Engine
World Model
Taalee Metabase
XCM-compliant metadata, XML or other format
Semantic Application (ASP/Enterprise hosted)
Third-party Content Mgmt and Syndication
Story on Cisco
Story on Lucent
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults KnowledgeBase for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS
COMPANIES in Same or Related INDUSTRY
SEC, EPA Regulations: Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic SearchPersonalization etc
A vision for future
Semantic Web Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling obviously not enough
We need a "logic layer" on top of RDF
Some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic rules which allow us to draw inference from this data
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]
A personal definition
Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF+XML
Agent readable Tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-layered Architecture of Semantic Web
Logical Layer: Formal Semantics and Reasoning Support - OIL, DAML-O
Schema Layer: Definition of Vocabulary - RDF Schema
Data Layer: Simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL - Evolving towards Semantic Web
OIL Mission
OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
HP 40
Creating and Serving Metadata to Power the Life-cycle of Content
Taalee Infrastructure Services / Taalee Content Applications
Where is the content?
Whose is it?
Produce / Aggregate
Catalog / Index
What other content is it related to?
Integrate / Syndicate
What is the right content for this user?
Personalize
What is the best way to monetize this interaction?
Interactive Marketing
Broadcast / Wireline / Wireless / Interactive TV
Taalee Semantic MetaBase
HP 41
Taalee's Intelligent Content Process
HP 42
Metadata Creation and Semanticization
• Automatic Content Classification/Categorization
• Metadata Creation/Extraction
• Types of metadata created
Semantic Engine and WorldModel are trademarks of Taalee Inc. Metadata Extraction is a patented technology of Taalee Inc.
HP 43
Forms/Types/Ingest of Content
Sources: Web Sites, Content Feeds, and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)
HP 44
Content Handling/Ingest
Infrastructure/Exchange
Feed Handlers
Crawlers/Screen Scrapers/Bots/Software Agents
Centralized, Distributed, Mobile/Migratory
HP 45
Information Extraction for Metadata Creation
Global/Enterprise/Web Repositories
METADATA EXTRACTORS
Digital Maps
NexisUPIAP
Documents
Digital Audios
Data Stores
Digital Videos
Digital Images
HP 46
Extracting a Text Document: Syntactic approach
INCIDENT MANAGEMENT SITUATION REPORT
Friday August 1, 1997 - 0530 MDT
NATIONAL PREPAREDNESS LEVEL II
CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.
SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr. The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.
CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. The fire is contained. Major areas of heat have been mopped-up. All crews and overhead will mop-up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.
LAYOUT
Date => day month int ',' int
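The layout rule above (a day name, a month name, an integer, a comma, an integer) maps naturally onto a regular expression. A minimal Python sketch, assuming English day and month names; the names `DATE_RULE` and `extract_date` are illustrative, and the commas are treated as optional so the pattern also matches text whose punctuation was lost.

```python
import re
from datetime import date

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# Date => day month int ',' int  (commas optional in this sketch)
DATE_RULE = re.compile(
    r"(?P<day>" + "|".join(DAYS) + r"),?\s+"
    r"(?P<month>" + "|".join(MONTHS) + r")\s+"
    r"(?P<dom>\d{1,2}),?\s+"
    r"(?P<year>\d{4})"
)


def extract_date(text):
    """Return the first date matching the layout rule, or None if there is none."""
    match = DATE_RULE.search(text)
    if match is None:
        return None
    month_number = MONTHS.index(match["month"]) + 1
    return date(int(match["year"]), month_number, int(match["dom"]))
```

A purely syntactic extractor like this finds the report date without knowing anything about fires or districts, which is the limitation the semantic approaches later in the deck address.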
Traditional Text Categorization
Statistical/AI Techniques
Classify: Place in a taxonomy
Customer Training Set
Routing/Distribution
Customer Article Feed: Article 4715
Standard Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Classification of Article 4715
Knowledge-base & Statistical/AI Techniques
Classify: Place in a taxonomy
Metadata Catalog
Content Manager
Precise syndication/filtering
Article 4715 Metadata
Standard metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Semantic metadata
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News
FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
NYSE: Member Companies, Market News, IPOs
Automated Content Enrichment (ACE)
Taalee Enterprise Customization Suite
Taalee's Categorization & Automatic Metadata Creation
Taalee Training Set
Customer Training Set
ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
Classification of Article 4715
Article Feed: Article 4715
Routing/Distribution
Map to another taxonomy
HP 49
Automatic Categorization & Metadata Tagging (unstructured text: transcript of A/V)
ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.
Video Segment with Associated Text
Segment Description
Semantic Metadata
Auto Categorization
HP 50
Automatic Categorization & Metadata Tagging (Web page)
Video with Editorialized Text on the Web
Auto Categorization
Semantic Metadata
HP 51
Automatic Categorization & Metadata Tagging (Feed)
Text from Bloomberg
Auto Categorization
Semantic Metadata
HP 52
Taalee Extraction and Knowledgebase Enhancement
Extraction Agent
Web Page Enhanced Metadata Asset
HP 53
Basis for Semantics
A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary
B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)
Knowledge Base
HP 54
Basis for Semantics
C. Reasoning/Inference
(Statistical) Information Retrieval
Statistical Learning/AI (Bayesian, Neural Networks, HMM, …)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methods: Cluster Analysis, Learning/AI, and Collab. Filtering
Reference data: Concept-terms, Dictionary, Thesaurus; by topic/industry/subject/domain; Word or Phrase
Ontologies/Domain Models, KnowledgeBase: by Entities and Relationships
(deeper understanding)
HP 56
Open Directory Project (ODP): Classification/Taxonomy & Directory
HP 57
Ontology
Standardize meaning, description, representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes: Attributes, Domain Rules, Functional Dependencies
HP 59
An Ontology
Example: Interrelated ontologies
LANDUSE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL
LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE
WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE (shredding, magnetic separation, screening, washing)
NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI (linked by "causes" relationships)
HP 61
Large Vocabularies, Taxonomies, Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.
Metadata-enabled Applications
HP 63
Metadata Usage: Impact on Search & Query Processing
traditional queries based on keywords
attribute-based queries
content-based queries
HP 64
Oingo.com
Oingo Ontology - ODP based(?); the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning, used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely-related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged, and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:
• Context of the Content
• Semantic Metadata describing entities (i.e., Company, Industry, etc.) and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
"Normal" Content can only be "found" if the user enters a keyword that exists within it
Intelligent Content=+
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:
Context of the Content
Semantic Metadata describing entities (i.e., Company, Industry, etc.) and
Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …
HP 70
Metadata Usage Keyword Attribute and Content Based Access
The VisualHarness system at LSDIS/UGA
HP 71
Keyword Search vs Attribute Search with Semantic metadata
Virage Search on "football touchdown"
Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: httpcbssportsline
Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: httpcbssportsline
Metadata from Typical Cataloging of Football Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31 Pit 24
http://www.nfl.com
Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter
League: Professional
Teams: Ravens, Steelers
Score: Bal 31 Pit 24
Players: Qadry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
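The contrast between keyword search and attribute search over semantic metadata can be sketched in a few lines of Python. This is an illustration only: the asset records are abbreviated from the slide, and the `ASSETS` structure and both function names are assumptions rather than Taalee's or Virage's interfaces.

```python
ASSETS = [
    {
        "title": "Brian Griese Interview Part Four",
        "text": "Brian Griese talks about the first touchdown he ever threw",
        "metadata": {},  # typical cataloging: no structured fields
    },
    {
        "title": "Baltimore 31 Pit 24",
        "text": "Qadry Ismail and Tony Banks hook up for their third long touchdown",
        "metadata": {
            "League": "Professional",
            "Teams": ["Ravens", "Steelers"],
            "Players": ["Qadry Ismail", "Tony Banks"],
            "Event": "Touchdown",
        },
    },
]


def keyword_search(assets, query):
    """Return assets whose free text contains every query word (case-insensitive)."""
    words = query.lower().split()
    return [a for a in assets if all(w in a["text"].lower() for w in words)]


def attribute_search(assets, **criteria):
    """Return assets whose metadata satisfies every field=value criterion."""
    def matches(asset):
        for field, wanted in criteria.items():
            value = asset["metadata"].get(field)
            values = value if isinstance(value, list) else [value]
            if wanted not in values:
                return False
        return True
    return [a for a in assets if matches(a)]
```

Here `keyword_search(ASSETS, "Steelers touchdown")` finds nothing because the word "Steelers" never appears in the description, while `attribute_search(ASSETS, Teams="Steelers", Event="Touchdown")` finds the Baltimore clip through its metadata.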
HP 72
Taalee's Semantic Search
Highly customizable, precise, and freshest A/V search
Context and Domain Specific Attributes
Uniform Metadata for Content from Multiple Sources; can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY & CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content
Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries & Hot Topics
PERSONALIZATION
3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
4 Pink Floyd Collection
Personalized Queries
Set the Controls for the Heart of the Sun…
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
2 My Football Fantasy Team
Gators' Spurrier ready for big game
Tech's Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone…
Edwards explains reasons for leaving BYU more…
more…
more…
more…
1 Election 2000
2 Middle East Peace Conflict
3 Napster Controversy
Video Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Cole's security
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II more…
HOT Topics
more…
more…
HP 80
Metadata Targeting
SemanticInteractive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
Realtime Feeds
Interests, Preferences
Time-Shifted Content Aggregator
Web sites and Pages
Content Databases
Semantic Engine(TM)
Personalized Content
Structured Hi-Quality Semantic Metabase
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site
User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst
HP 87
iTV: Taalee's Extreme Personalization
Content Provider
(DBS, DISH, Wink, AOL-TV)
Semantic Engine(TM)
Meta-Data Tagged Content
Content "Programs"
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-Quality Semantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in
This screen is customizable with interactivity features using metadata, such as whether there is a new Conference Call video on CSCO
Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata
Conference Call itself can have embedded metadata to support personalization and interactivity
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support (Sony)
Categorize
Catalog
Integrate
Collection, Processing
Network Content
Affiliate Feeds
Public Sources
Rich Data Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore
HP 92
Metadata's role in emerging iTV infrastructure
Retrieve Scene Description Track
Enhanced Digital Cable
Video
MPEG Decoder
Node = AVO Object
Create Scene Description Tree
GREAT USER EXPERIENCE
MPEG-2/4/7
MPEG Encoder
Scene Description Tree
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers, and Broadcasters
Enhanced XML Description
"Cisco Systems"
Node
Taalee Semantic Engine
Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: John Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta
Object Content Information (OCI)
Metadata-rich Value-added Node
HP 93
Intelligent Metadata Creation
Content which does contain the words the user asked for (Extractor Agents)
+ Content which does not contain the words the user asked for, but is about what he asked for (Value-added Metadata)
+ Content the user did not think to ask for, but which he needs to know (Semantic Associations)
Metadata for Intelligent Content
Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example: If a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"
Understanding of the content is needed to create new metadata
Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
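The enrichment step described above can be illustrated with a toy knowledge base. This Python sketch is hypothetical: the `KNOWLEDGE_BASE` dictionary, its field names, and the `enrich` function are assumptions for illustration and do not describe Taalee's actual engine, which the slides do not specify.

```python
# Toy knowledge base built from the slide's example (structure is an assumption).
KNOWLEDGE_BASE = {
    "Roger Clemens": {"Type": "PERSON", "Sport": "Baseball", "Team": "New York Yankees"},
}


def enrich(metadata):
    """Return a copy of the metadata with Team and Sport added for known players."""
    enriched = {field: list(values) for field, values in metadata.items()}
    for player in metadata.get("Players", []):
        facts = KNOWLEDGE_BASE.get(player, {})
        for field in ("Team", "Sport"):
            value = facts.get(field)
            if value and value not in enriched.setdefault(field, []):
                enriched[field].append(value)
    return enriched
```

A story tagged only with `{"Players": ["Roger Clemens"]}` comes back with `Team` and `Sport` filled in, so a search for "New York Yankees" over the metadata can find it even though the words never appear in the story.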
HP 96
Guided Demo for Value Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player of the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player of the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 - Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Posted Date: Date of asset posting - Extracted automatically
League Name: Name of league to which the player's team belongs - Not mentioned explicitly in asset - Value-added by Taalee's processing based on semantic associations
Team Name: Name of team for which player plays - Not mentioned explicitly in asset - Value-added using Taalee's semantic relationships
Producer Name: Name of content provider that produced the asset
Player Names: Name of players mentioned explicitly in the asset - Extracted automatically
Sport: Name of sport
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships
The asset is richly, fully described in the many ways the users chose to interact
Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context, or relationships of keywords
For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate Professional & Financial Software INDUSTRY and COMPETES WITH Ariba
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
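The Commerce One example can be written down as a handful of subject-relation-object triples. The Python sketch below is an illustration under stated assumptions: the relation names, the `RELATIONS` list, and both functions are hypothetical, and the story headlines in the usage note are invented placeholders.

```python
# Semantic associations from the slide, as (subject, relation, object) triples.
RELATIONS = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Commerce One", "participates_in", "Corporate Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]


def related(entity, relation):
    """Return every object linked to the entity by the given relation."""
    return [obj for subj, rel, obj in RELATIONS if subj == entity and rel == relation]


def associated_stories(entity, stories):
    """Split (headline, company) stories into those asked for and those about competitors."""
    competitors = set(related(entity, "competes_with"))
    asked_for = [headline for headline, company in stories if company == entity]
    also_relevant = [headline for headline, company in stories if company in competitors]
    return asked_for, also_relevant
```

Given placeholder stories such as `[("Story A", "Commerce One"), ("Story B", "Ariba")]`, a request for Commerce One returns Story A as what was asked for and Story B as associated content about a competitor.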
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata-centric Content Management Architecture
Internal Source 1 (Research)
Internal Source 2
External feeds/Web (e.g., Reuters)
Extractor Agent 1
Extractor Agent 2
Extractor Agent 3
Semantic Engine
World Model
Taalee Metabase
Semantic Application (ASP/Enterprise hosted)
Third-party Content Mgmt and Syndication
XCM-compliant metadata, XML, or other format
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults KnowledgeBase for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard
Story on Lucent
Story on Cisco
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic rules which allow us to draw inference from this data
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]
A personal definition: Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three layered Architecture Of Semantic Web
Logical Layer
Formal Semantics and Reasoning Support - OIL, DAML-O
Schema Layer
Definition of Vocabulary RDF Schema
Data Layer
Simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL - Evolving towards Semantic Web
OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery Knowledge Discovery --ExampleExample
Earthquake Sources(USGS NEIC)
Nuclear Test Sources(Oklahoma Observatory etc)
Nuclear Test May Cause Earthquakes
Is it really true
Complex RelationshipsComplex Relationships
A nuclear test could have caused an earthquakeif the earthquake occurred some time after thenuclear test was conducted and in a nearby region
NuclearTest Causes Earthquakelt= dateDifference( NuclearTesteventDate
EarthquakeeventDate ) lt 30AND distance( NuclearTestlatitude
NuclearTestlongitudeEarthquakelatitudeEarthquakelongitude ) lt 10000
Knowledge Discovery Knowledge Discovery --ExampleExample
When was the first recorded nuclear test conducted
1950Find the total number of earthquakes with a magnitude58 or higher on the Richter scale per year starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery Knowledge Discovery --ExamplehellipExamplehellip
For each group of earthquakes with magnitudes in the ranges58-6 6-7 7-8 8-9 and gt9 on the Richter scale per yearstarting from 1900 find average number of earthquakes
Number of earthquakes with magnitude gt 7 almost constant So nuclear tests probably only cause earthquakes with magnitude lt 7
KnowledgeKnowledge DiscoveryDiscovery --ExampleExamplehelliphellip
Find pairs of nuclear tests and earthquakes such that the earthequakeoccurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 41
Taalee's Intelligent Content Process
HP 42
Metadata Creation and Semanticization
• Automatic Content Classification/Categorization
• Metadata Creation/Extraction
Types of metadata created
Semantic Engine and WorldModel are trademarks of Taalee Inc. Metadata Extraction is a patented technology of Taalee Inc.
HP 43
Forms/Types/Ingest of Content
Sources: Web Sites, Content Feeds, and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+ Media); Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)
HP 44
Content Handling/Ingest
Infrastructure/Exchange
Feed Handlers/Crawlers/Screen Scrapers/Bots/Software Agents
Centralized, Distributed, Mobile/Migratory
HP 45
Information Extraction for Metadata Creation
Global/Enterprise/Web Repositories
METADATA EXTRACTORS
Digital Maps
Nexis/UPI/AP
Documents
Digital Audios
Data Stores
Digital Videos
Digital Images
HP 46
Extracting a Text Document: Syntactic approach
INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT
NATIONAL PREPAREDNESS LEVEL II
CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.
SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.
CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.
LAYOUT
Date => day month int ',' int
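A syntactic layout rule of this kind (a Date is a day name, a month name, an integer, a comma, and an integer) maps naturally onto a regular expression. The sketch below is one possible rendering, not the extractor used in the workshop; the pattern, the `extract_date` name, and the decision to make the commas optional are assumptions.

```python
import re
from typing import Optional

DAY = r"(?:Mon|Tues|Wednes|Thurs|Fri|Satur|Sun)day"
MONTH = (r"(?:January|February|March|April|May|June|July|August|"
         r"September|October|November|December)")

# day month int ',' int -- commas are optional so that text whose
# punctuation was lost in conversion still matches.
DATE_RULE = re.compile(
    rf"\b(?P<day>{DAY}),?\s+(?P<month>{MONTH})\s+"
    rf"(?P<day_of_month>\d{{1,2}}),?\s+(?P<year>\d{{4}})\b"
)


def extract_date(text: str) -> Optional[dict]:
    """Return the first date matching the layout rule, or None if there is none."""
    match = DATE_RULE.search(text)
    if match is None:
        return None
    return {
        "day": match["day"],
        "month": match["month"],
        "day_of_month": int(match["day_of_month"]),
        "year": int(match["year"]),
    }
```

Run against the situation report header, `extract_date` picks out the report date as structured metadata without any understanding of the rest of the text, which is the defining trait of the syntactic approach.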
Traditional Text Categorization
Statistical/AI Techniques
Classify/Place in a taxonomy
Feed
Customer Training Set
Routing/Distribution
Customer Article Feed
4715
Standard Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Classification of Article 4715
Knowledge-base & Statistical/AI Techniques
Classify/Place in a taxonomy
Metadata Catalog
Content Manager
Precise syndication/filtering
Article 4715 Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News
Standard metadata
Semantic metadata
FTE
Company Analysis
Conference Calls
Earnings
Stock Analysis
NYSE
Member Companies
Market News
IPOs
Automated Content Enrichment (ACE)
Taalee Enterprise Customization Suite
Taalee's Categorization & Automatic Metadata Creation
Taalee Training Set
Customer Training Set
ENT
Company Analysis
Conference Calls
Earnings
Stock Analysis
Classification of Article 4715
Article Feed 4715
Routing/Distribution
Map to another taxonomy
HP 49
Automatic Categorization & Metadata Tagging (unstructured text: transcript of A/V)
ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.
Video Segment with Associated Text
Segment Description
Semantic Metadata
Auto Categorization
HP 50
Automatic Categorization & Metadata Tagging (Web page)
Video with Editorialized Text on the Web
Auto Categorization
Semantic Metadata
HP 51
Automatic Categorization & Metadata Tagging (Feed)
Text from Bloomberg
Auto Categorization
Semantic Metadata
HP 52
Taalee Extraction and Knowledgebase Enhancement
Extraction Agent
Web Page Enhanced Metadata Asset
HP 53
Basis for Semantics
A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary
B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)
Knowledge Base
HP 54
Basis for Semantics
C. Reasoning/Inference
(Statistical) (Information Retrieval)
Statistical Learning/AI (Bayesian, Neural Networks, HMM, …)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methods/Cluster Analysis
Learning/AI and Collab. Filtering
Reference data/Concept-terms/Dictionary/Thesaurus
By topic/industry/subject/domain
Word or Phrase
Ontologies/Domain Models
Knowledge Base
By Entities and Relationships
deeper understanding
HP 56
Open Directory Project (ODP): Classification/Taxonomy & Directory
HP 57
Ontology
Standardize meaning, description, and representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes:
Attributes
Domain Rules
Functional Dependencies
HP 59
An Ontology
Example: Interrelated ontologies
[Figure: interrelated ontologies. Node labels: LAND USE, COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL; LAND (SITE), CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE; WASTE DISPOSAL, RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE, shredding, magnetic separation, screening, washing; NATURAL DISASTER, EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI. Several nodes are linked by "causes" relationships.]
HP 61
Large Vocabularies, Taxonomies/Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.
Confidential HP
Metadata-enabled Applications
HP 63
Metadata Usage: Impact on Search & Query processing
traditional queries based on keywords
attribute based queries
content-based queries
HP 64
Oingo.com
Oingo Ontology - ODP based(?): the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely-related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged, and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:
• Context of the Content
• Semantic Metadata describing entities (i.e., Company, Industry, etc.) and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
"Normal" Content can only be "found" if the user enters a keyword that exists within it
Intelligent Content=+
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:
Context of the Content
Semantic Metadata describing entities (i.e., Company, Industry, etc.) and
Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …
HP 70
Metadata Usage: Keyword, Attribute, and Content Based Access
The VisualHarness system at LSDIS/UGA
HP 71
Keyword Search vs. Attribute Search with Semantic metadata
Virage: Search on "football touchdown"
Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: httpcbssportsline
Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: httpcbssportsline
Metadata from Typical Cataloging of Football Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31, Pit 24
http://www.nfl.com
Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Raven's lead to 31-24 in the third quarter.
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
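The difference between the two kinds of search can be illustrated with a small in-memory catalog. This is a hedged sketch rather than either vendor's implementation: the `ASSETS` structure and both function names are hypothetical, and the sample records only reuse the titles and field values shown on the slide.

```python
# Hypothetical catalog: each asset has free text plus (possibly empty) semantic metadata.
ASSETS = [
    {
        "title": "Brian Griese Interview Part Four",
        "text": "Brian Griese talks about the first touchdown he ever threw",
        "metadata": {},
    },
    {
        "title": "Baltimore 31, Pit 24",
        "text": "Ismail and Banks hook up for their third long touchdown",
        "metadata": {
            "League": "Professional",
            "Teams": ["Ravens", "Steelers"],
            "Players": ["Quandry Ismail", "Tony Banks"],
            "Event": "Touchdown",
            "Produced by": "NFL.com",
        },
    },
]


def keyword_search(assets, query):
    """Return assets whose title or text contains every query word (syntactic match)."""
    words = query.lower().split()
    return [
        a for a in assets
        if all(w in (a["title"] + " " + a["text"]).lower() for w in words)
    ]


def attribute_search(assets, **criteria):
    """Return assets whose metadata matches every field=value criterion."""
    def matches(metadata, field, wanted):
        value = metadata.get(field)
        if isinstance(value, (list, tuple, set)):
            return wanted in value
        return value == wanted

    return [
        a for a in assets
        if all(matches(a["metadata"], f, w) for f, w in criteria.items())
    ]
```

A keyword query for "touchdown" returns both assets, including the interview, while `attribute_search(ASSETS, Event="Touchdown", Teams="Steelers")` returns only the game highlight, because the match is made on typed fields rather than on words that happen to occur in the text.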
HP 72
Taalee's Semantic Search
Highly customizable, precise, and freshest A/V search
Context and Domain Specific Attributes
Uniform Metadata for Content from Multiple Sources; can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do?
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY & CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content
Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries amp Hot Topics
PERSONALIZATION
3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
4 Pink Floyd Collection
Personalized Queries
Set the Controls for the Heart of the Sun…
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
2 My Football Fantasy Team
Gators' Spurrier ready for big game
Tech's Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone…
Edwards explains reasons for leaving BYU
more…
1 Election 2000
2 Middle East Peace Conflict
3 Napster Controversy
Video Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Cole's security
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II
more…
HOT Topics
HP 80
Metadata Targeting
Semantic/Interactive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web: Extreme Personalization
Realtime Feeds
Interests, Preferences
Time-Shifted Content Aggregator
Web sites and Pages
Content Databases
Personalized Content
Semantic Engine™
Content
Structured Hi-Quality Semantic Metabase
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site.
User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories.
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide).
Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu.
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available.
The user clicks to see a listing of Analyst Calls on Cisco (next slide).
Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst.
HP 87
iTV: Taalee's Extreme Personalization
Content Provider
(DBS, DISH, Wink, AOL-TV)
Semantic Engine™
Meta-Data Tagged Content
Content "Programs"
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-Quality Semantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.
This screen is customizable with an interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO.
Part of the screen can be automatically customized to show conference call specific information, including transcript, participation, etc., all of which are relevant metadata.
The Conference Call itself can have embedded metadata to support personalization and interactivity.
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support
Sony
Categorize
Catalog
Integrate
Collection
Processing
Network Content
Affiliate Feeds
Public Sources
Rich Data Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday, the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore
HP 92
Metadata's role in emerging iTV infrastructure
Enhanced Digital Cable
Video
MPEG-2/4/7
MPEG Encoder
MPEG Decoder
Create Scene Description Tree
Retrieve Scene Description Track
Scene Description Tree
Node = AVO Object
Enhanced XML Description
"Cisco Systems" Node
Taalee Semantic Engine
Object Content Information (OCI)
Metadata-rich Value-added Node
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta
GREAT USER EXPERIENCE
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers, and Broadcasters
HP 93
Intelligent Metadata Creation
Extractor Agents: Content which does contain the words the user asked for
+
Value-added Metadata: Content which does not contain the words the user asked for, but is about what he asked for
+
Semantic Associations: Content the user did not think to ask for, but which he needs to know
Metadata for Intelligent Content
Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example: If a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".
Understanding of the content is needed to create new metadata
Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely.
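The idea of value-added metadata can be sketched as a lookup against a knowledge base of entity facts. The code below is an illustration only and makes no claim about how Taalee's Semantic Engine works: `KNOWLEDGE_BASE`, `VALUE_ADDED_FIELDS`, and `add_value_added_metadata` are hypothetical names, and the facts are limited to those stated in the slides.

```python
# Hypothetical knowledge base of entity facts (facts taken from the slides).
KNOWLEDGE_BASE = {
    "Roger Clemens": {"Sport": "Baseball", "Teams": "New York Yankees"},
    "Jamal Anderson": {"League": "NFL", "Teams": "Atlanta Falcons"},
}

# Fields that may be inferred from a player and added to an asset's metadata.
VALUE_ADDED_FIELDS = ("Sport", "League", "Teams")


def add_value_added_metadata(metadata, knowledge_base=KNOWLEDGE_BASE):
    """Return a copy of metadata enriched with facts about the players it names.

    metadata maps field names to lists of values, e.g. {"Players": [...]}.
    The input is not modified.
    """
    enriched = {field: list(values) for field, values in metadata.items()}
    for player in metadata.get("Players", []):
        facts = knowledge_base.get(player, {})
        for field in VALUE_ADDED_FIELDS:
            if field in facts and facts[field] not in enriched.setdefault(field, []):
                enriched[field].append(facts[field])
    return enriched
```

With the extracted metadata `{"Players": ["Roger Clemens"]}`, the enriched record also carries `Teams: New York Yankees`, so a search on the team can find a story that never mentions it.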
HP 96
Guided Demo for Value Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 - Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Posted Date: date of asset posting. Extracted automatically.
League Name: name of league to which the player's team belongs. Not mentioned explicitly in asset; value-added by Taalee's processing based on semantic associations.
Team Name: name of team for which player plays. Not mentioned explicitly in asset; value-added using Taalee's semantic relationships.
Producer Name: name of content provider that produced the asset.
Player Names: names of players mentioned explicitly in the asset. Extracted automatically.
Sport: name of sport.
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.
The asset is richly, fully described in the many ways the users choose to interact.
Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset.
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context, or relationships of keywords
For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
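Semantic associations of this kind can be sketched as subject-relation-object triples that an application follows to find related content. This is a minimal illustration, not Taalee's Semantic Content Model: the `TRIPLES` list, the relation names, and both functions are hypothetical, and treating "competes with" as symmetric is an assumption. The facts themselves come from the slides.

```python
# Hypothetical triple store: (subject, relation, object).
TRIPLES = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Commerce One", "participates_in", "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
    ("Cisco", "competes_with", "Lucent"),
]


def associated(entity, relation, triples=TRIPLES):
    """Return entities linked to entity by relation."""
    found = [o for s, r, o in triples if s == entity and r == relation]
    if relation == "competes_with":  # assumption: competition is symmetric
        found += [s for s, r, o in triples if o == entity and r == relation]
    return found


def semantically_related(entity, stories, triples=TRIPLES):
    """Return titles of stories about competitors of entity.

    stories is a list of dicts with "title" and "entities" keys.
    """
    competitors = set(associated(entity, "competes_with", triples))
    return [s["title"] for s in stories if competitors & set(s["entities"])]
```

Given a story tagged with Lucent and another tagged with Cisco, `semantically_related("Cisco", stories)` returns only the Lucent story: content the user did not ask for but is likely to need.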
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata-centric Content Management Architecture
Internal Source 1, Research
Internal Source 2
External feeds/Web (e.g., Reuters)
Extractor Agent 1
Extractor Agent 2
Extractor Agent 3
Semantic Engine
World Model
Taalee Metabase
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults Knowledge Base for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story; passed on to Dashboard
Story on Lucent
Story on Cisco
XCM-compliant metadata, XML or other format
Semantic Application
ASP/Enterprise hosted
Third-party Content Mgmt and Syndication
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
SEC, EPA
Regulations: Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search/Personalization etc.
A vision for future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]
A personal definition
Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-layered Architecture of the Semantic Web
Logical Layer
Formal Semantics and Reasoning Support - OIL, DAML-O
Schema Layer
Definition of Vocabulary: RDF Schema
Data Layer
Simple data model and syntax for metadata - RDF
OIL ndash as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL - Evolving towards Semantic Web
OIL Mission
OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
HP 42
Metadata Creation and Semanticization
• Automatic Content Classification/Categorization
• Metadata Creation/Extraction: types of metadata created
Semantic Engine and WorldModel are trademarks of Taalee Inc. Metadata Extraction is a patented technology of Taalee Inc.
HP 43
Forms/Types/Ingest of Content
Sources: Web Sites, Content Feeds, and Private Repositories
Types: Text, Graphics, Audio, Video, Multimedia
Forms: Unstructured text, Semi-structured text, Structured text (+Media); Static or Dynamic
Ingest: Feed (push), Web (pull), Repository/Database (usually pull)
HP 44
Content Handling/Ingest
Infrastructure/Exchange: Feed Handlers, Crawlers, Screen Scrapers, Bots/Software Agents
Centralized, Distributed, Mobile/Migratory
HP 45
Information Extraction for Metadata Creation
Global/Enterprise/Web Repositories
METADATA EXTRACTORS
Digital Maps
Nexis, UPI, AP
Documents
Digital Audios
Data Stores
Digital Videos
Digital Images
HP 46
Extracting a Text Document: Syntactic approach
INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT
NATIONAL PREPAREDNESS LEVEL II
CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.
SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.
CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.
LAYOUT
Date => day month int ',' int
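The layout rule above reads like a small grammar: a day name, a month name, an integer, a separator, and another integer. The following is a minimal Python sketch of that idea, not the extractor used in the talk. The function name `parse_report_date`, the regular expression, and the choice to make the commas optional are all assumptions made for illustration.

```python
import re
from datetime import date

# Month and weekday vocabularies assumed by this sketch.
MONTHS = {name: number for number, name in enumerate(
    ["January", "February", "March", "April", "May", "June", "July",
     "August", "September", "October", "November", "December"], start=1)}
DAYS = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"
MONTH_ALT = "|".join(MONTHS)

# Date => day month int ',' int  (commas treated as optional here)
DATE_RULE = re.compile(
    r"(?P<day>" + DAYS + r"),?\s+(?P<month>" + MONTH_ALT + r")\s+"
    r"(?P<dom>\d{1,2}),?\s+(?P<year>\d{4})"
)

def parse_report_date(text):
    """Return the first date matching the layout rule, or None."""
    match = DATE_RULE.search(text)
    if match is None:
        return None
    return date(int(match["year"]), MONTHS[match["month"]], int(match["dom"]))

header = "INCIDENT MANAGEMENT SITUATION REPORT Friday, August 1, 1997 - 0530 MDT"
print(parse_report_date(header))
```

A purely syntactic rule like this recovers the report date but says nothing about what the report is about, which is the limitation the later slides address.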
Traditional Text Categorization
[Figure labels: Customer Article Feed (article 4715); Statistical/AI Techniques; Customer Training Set; Classify/Place in a taxonomy; Routing/Distribution]
Article 4715, standard metadata:
Feed Source: iSyndicate
Posted Date: 11/20/2000
Classification of Article 4715
[Figure labels: Knowledge-base & Statistical/AI Techniques; Classify/Place in a taxonomy; Metadata Catalog; Content Manager; Precise syndication/filtering]
Article 4715 metadata:
Standard metadata: Feed Source: iSyndicate; Posted Date: 11/20/2000
Semantic metadata: Company Name: France Telecom, Equant; Ticker Symbol: FTE, ENT; Exchange: NYSE; Topic: Company News
Taxonomy nodes shown: FTE (Company Analysis, Conference Calls, Earnings, Stock Analysis); NYSE (Member Companies); Market News (IPOs)
Automated Content Enrichment (ACE)
Taalee Enterprise Customization Suite
Taalee's Categorization & Automatic Metadata Creation
[Figure labels: Article Feed (article 4715); Taalee Training Set; Customer Training Set; Classification of Article 4715; taxonomy node ENT (Company Analysis, Conference Calls, Earnings, Stock Analysis); Routing/Distribution; Map to another taxonomy]
HP 49
Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.
Video Segment with Associated Text
Segment Description
Semantic Metadata
Auto Categorization
HP 50
Automatic Categorization & Metadata Tagging (Web page)
Video with Editorialized Text on the Web
Auto Categorization
Semantic Metadata
HP 51
Automatic Categorization & Metadata Tagging (Feed)
Text from Bloomberg
Auto Categorization
Semantic Metadata
HP 52
Taalee Extraction and Knowledgebase Enhancement
Extraction Agent
Web Page Enhanced Metadata Asset
HP 53
Basis for Semantics
A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary
B. Facts with Relationships: Taxonomy (Categories), Ontology/Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)
Knowledge Base
HP 54
Basis for Semantics
C. Reasoning/Inference: Statistical (Information Retrieval); Statistical Learning/AI (Bayesian, Neural Networks, HMM, …); Logic Based (Description Logic); Natural Language/Grammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methods/Cluster Analysis
Learning/AI and Collaborative Filtering
Reference data/Concept-terms/Dictionary/Thesaurus: by topic/industry/subject/domain; word or phrase
Ontologies/Domain Models
Knowledge Base: by Entities and Relationships
deeper understanding
HP 56
Open Directory Project (ODP): Classification/Taxonomy & Directory
HP 57
Ontology
Standardize meaning, description, and representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes: Attributes, Domain Rules, Functional Dependencies
HP 59
An Ontology
Example Interrelated ontologies
[Figure: three interrelated ontologies; labels recovered from the diagram]
LAND USE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL; LAND (SITE), CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING
WASTE DISPOSAL: LANDFILL SITE, RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; processes: shredding, magnetic separation, screening, washing
NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI; four "causes" links connect disaster types
HP 61
Large Vocabularies, Taxonomies/Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary, used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.
Confidential HP
Metadata-enabled Applications
HP 63
Metadata Usage: Impact on Search & Query Processing
• traditional queries based on keywords
• attribute-based queries
• content-based queries
HP 64
Oingo.com
Oingo Ontology - ODP based(?); the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning used to state intent; the basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
• Precisely what the user asked for
• Closely related, high-value information beyond what was requested
• Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged, and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:
• Context of the Content
• Semantic Metadata describing entities (e.g., Company, Industry), and
• Relationships (semantic associations) among all entities
Based on a "semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
"Normal" content can only be "found" if the user enters a keyword that exists within it
[Figure: content + metadata and relationships = Intelligent Content]
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:
• Context of the Content
• Semantic Metadata describing entities (e.g., Company, Industry), and
• Relationships (semantic associations) among all entities
Based on a "semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …
HP 70
Metadata Usage Keyword Attribute and Content Based Access
The VisualHarness system at LSDIS/UGA
HP 71
Keyword Search vs. Attribute Search with Semantic Metadata
Virage search on "football touchdown":
Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL httpcbssportsline
Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL httpcbssportsline
Metadata from Typical Cataloging of Football Assets
Taalee Metadata on Football Assets
Rich Media Reference Page: Baltimore 31, Pit 24 (http://www.nfl.com)
Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Qadry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
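The contrast on this slide can be sketched in a few lines of Python. This is an illustrative toy, not Taalee's or Virage's implementation: the `ASSETS` list reuses the titles and fields shown above, while the dictionary layout and the function names `keyword_search` and `attribute_search` are assumptions.

```python
# Toy catalog built from the examples on the slide; structure is assumed.
ASSETS = [
    {"title": "Jimmy Smith Interview Part Seven",
     "text": "Jimmy Smith explains his philosophy on showboating",
     "metadata": {}},
    {"title": "Brian Griese Interview Part Four",
     "text": "Brian Griese talks about the first touchdown he ever threw",
     "metadata": {}},
    {"title": "Baltimore 31 Pit 24",
     "text": "Ismail and Tony Banks hook up for their third long touchdown",
     "metadata": {"League": "Professional",
                  "Teams": ["Ravens", "Steelers"],
                  "Players": ["Tony Banks"],
                  "Event": "Touchdown"}},
]

def keyword_search(assets, query):
    """Return titles whose title or text contains every query word."""
    words = query.lower().split()
    hits = []
    for asset in assets:
        haystack = (asset["title"] + " " + asset["text"]).lower()
        if all(word in haystack for word in words):
            hits.append(asset["title"])
    return hits

def attribute_search(assets, **criteria):
    """Return titles whose metadata matches every field=value criterion."""
    hits = []
    for asset in assets:
        matched = True
        for field, wanted in criteria.items():
            value = asset["metadata"].get(field)
            values = value if isinstance(value, list) else [value]
            if wanted not in values:
                matched = False
                break
        if matched:
            hits.append(asset["title"])
    return hits

print(keyword_search(ASSETS, "touchdown"))
print(attribute_search(ASSETS, Event="Touchdown", Teams="Ravens"))
```

The keyword query also returns an interview that merely mentions a touchdown, whereas the attribute query returns only the asset catalogued as a touchdown event involving the Ravens.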
HP 72
Taalee's Semantic Search
Highly customizable, precise, and freshest A/V search
Context and Domain Specific Attributes; Uniform Metadata for Content from Multiple Sources; can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY & CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content
Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries & Hot Topics
PERSONALIZATION
Personalized Queries
1. My Stock Portfolio: Microsoft suffers serious hack attack; Cisco Systems Inc; Analyst Safa Rashtchy on Yahoo; PeopleSoft Inc; AT&T Corp; more…
2. My Football Fantasy Team: Gators' Spurrier ready for big game; Tech's Vick looks to become complete QB; Bucs excited about Hamilton; Jasper Sanks rumbles into the end zone…; Edwards explains reasons for leaving BYU; more…
3. Julia Roberts Collection: Movie Trailer: Notting Hill; Trailer - Runaway Bride; Patrick; Movie Trailer: Stepmom; Conspiracy Theory; more…
4. Pink Floyd Collection: Set the Controls for the Heart of the Sun…; Wish You Were Here; Round And Around; Keep Talking; The Post War Dream; more…
HOT Topics
1. Election 2000: Video: Explaining the electoral map; Race for White House hots up; Seniors Give Gore Florida Edge; more…
2. Middle East Peace Conflict: More die as Israel steps up security; Israel braces for suicide bombs; Pentagon probes Cole's security; more…
3. Napster Controversy: The Brain Behind Napster; Napster Lawsuit; Creative Nomad II; more…
HP 80
Metadata Targeting
Semantic/Interactive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
[Figure labels: Realtime Feeds; Web sites and Pages; Content Databases; Time-Shifted Content Aggregator; Semantic Engine™; Structured Hi-Quality Content; Semantic Metabase; Interests, Preferences; Personalized Content]
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
[Screen: wireless home page with the categories MyStocks, News, Sports, Music, MyMedia]
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site
User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
[Screen: categories MyStocks, News, Sports, Music, MyMedia; My Stocks list: CSCO, NT, IBM, Market]
Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
[Screen: My Stocks list (CSCO, NT, IBM, Market); CSCO content types: Analyst Call, Conf Call, Earnings]
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
[Screen: CSCO Analysis listing: 11/08 ON24 Payne; 11/07 ON24 H&Q; 11/06 CBS Langlesis]
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and the analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst
HP 87
iTV: Taalee's Extreme Personalization
[Figure labels: Content Provider (DBS, DISH, Wink, AOL-TV); Semantic Engine™; Meta-Data Tagged Content; Content "Programs"; Immediate Interests; Preferences; Personalized Content Capsules; Redirects and Programming; Structured Hi-Quality Semantic Metabase]
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in
This screen is customizable with an interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO
Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata
The Conference Call itself can have embedded metadata to support personalization and interactivity
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support (Sony)
[Figure labels: Collection; Processing: Categorize, Catalog, Integrate; sources: Network Content, Affiliate Feeds, Public Sources; Rich Data Metabase]
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were
-- Breaking News for 11/30/2000 --
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
Sixty Die In Nigeria Blast
Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers
• Greatly enhances newsroom productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
Clips (duration - date - source): (1:33) - 12/06/00 - ABC; (2:53) - 12/06/00 - CBS; (5:16) - 12/06/00 - ABC; (2:46) - 12/06/00 - FOX; (1:33) - 12/06/00 - NBC; (5:33) - 12/06/00; (3:57) - 12/06/00 - CBS; (4:27) - 12/06/00 - ABC; (3:44) - 12/06/00 - FOX; (7:24) - 12/06/00 - CBS; (1:33) - 12/06/00 - CBS; (1:33) - 12/06/00 - ABC; (2:33) - 12/06/00 - CBS; (3:12) - 12/06/00 - NNS; (0:32) - 12/06/00 - CBS; (1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
Description: Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore
HP 92
Metadata's role in emerging iTV infrastructure
[Figure labels: Video; MPEG Encoder; MPEG-2/4/7; Create Scene Description Tree; Scene Description Tree; Node = AVO Object; Retrieve Scene Description Track; MPEG Decoder; Enhanced Digital Cable; Taalee Semantic Engine; Enhanced XML Description; Node "Cisco Systems"; Object Content Information (OCI); Metadata-rich Value-added Node; GREAT USER EXPERIENCE]
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers, and Broadcasters
Example metadata: Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: Jon Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta
HP 93
Intelligent Metadata Creation
Extractor Agents: content which does contain the words the user asked for
+ Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for
+ Semantic Associations: content the user did not think to ask for, but which he needs to know
Metadata for Intelligent Content
Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"
Understanding of the content is needed to create new metadata
Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses such Semantic Associations (another example: COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
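The enrichment step described here can be illustrated with a small Python sketch. It is only a toy under stated assumptions: the `KNOWLEDGE_BASE` dictionary, its field names, and the function `add_value_added_metadata` are hypothetical, and the entity extraction is a naive substring match rather than Taalee's patented extraction.

```python
# Hypothetical knowledge base; the facts come from the slides,
# the structure and field names are assumptions of this sketch.
KNOWLEDGE_BASE = {
    "Roger Clemens": {"Sport": "Baseball", "Team": "New York Yankees"},
    "Jamal Anderson": {"Sport": "Football", "Team": "Atlanta Falcons",
                       "League": "NFL"},
}

def add_value_added_metadata(text, knowledge_base):
    """Extract known players from text, then add associated metadata."""
    # Explicit metadata: names that literally occur in the content.
    metadata = {"Players": [name for name in knowledge_base if name in text]}
    # Value-added metadata: facts associated with those names.
    for player in metadata["Players"]:
        for field, value in knowledge_base[player].items():
            values = metadata.setdefault(field, [])
            if value not in values:
                values.append(value)
    return metadata

story = "Roger Clemens struck out ten batters last night."
print(add_value_added_metadata(story, KNOWLEDGE_BASE))
```

With the added `Team` field, a search for "New York Yankees" can reach a story that never mentions the team by name.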
HP 96
Guided Demo for Value Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team = Atlanta Falcons or League = NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team = Los Angeles Dodgers or League = National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 - Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content ndash Value-Added Metadata
[Figure: Rich Media Sports Asset and its metadata fields]
Posted Date: date of asset posting - extracted automatically
Producer Name: name of content provider that produced the asset
Player Names: names of players mentioned explicitly in the asset - extracted automatically
Team Name: name of team for which player plays - not mentioned explicitly in asset - value-added using Taalee's semantic relationships
League Name: name of league to which the player's team belongs - not mentioned explicitly in asset - value-added by Taalee's processing based on semantic associations
Sport: name of sport
Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships
The asset is richly and fully described in the many ways the users choose to interact
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context, or relationships of keywords
For example, a search engine may see that the phrase "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
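As a rough illustration of how a "competes with" association could widen a result set, consider the Python sketch below. Everything structural in it is assumed: the `KNOWLEDGE_BASE` layout, the placeholder `CATALOG` entries, and the function names are hypothetical and say nothing about how the Taalee Semantic Engine is actually built. Only the Commerce One facts come from the slide.

```python
# Hypothetical knowledge base; only the Commerce One facts are from the slide.
KNOWLEDGE_BASE = {
    "Commerce One": {
        "type": "COMPANY",
        "industry": "Corporate, Professional & Financial Software",
        "competes_with": ["Ariba"],
    },
    "Ariba": {"type": "COMPANY", "competes_with": ["Commerce One"]},
}

# Made-up catalog entries, for illustration only.
CATALOG = [
    {"headline": "Placeholder story A", "companies": ["Commerce One"]},
    {"headline": "Placeholder story B", "companies": ["Ariba"]},
    {"headline": "Placeholder story C", "companies": ["Some Other Co"]},
]

def stories_about(company, catalog):
    """Headlines of catalog entries tagged with the given company."""
    return [s["headline"] for s in catalog if company in s["companies"]]

def search_with_associations(company, knowledge_base, catalog):
    """Return what was asked for plus stories about known competitors."""
    result = {"asked_for": stories_about(company, catalog), "related": {}}
    rivals = knowledge_base.get(company, {}).get("competes_with", [])
    for rival in rivals:
        result["related"][rival] = stories_about(rival, catalog)
    return result

print(search_with_associations("Commerce One", KNOWLEDGE_BASE, CATALOG))
```

A keyword index alone would return only the first headline. The association lookup is what surfaces the competitor story without the user asking for it.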
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata-centric Content Management Architecture
[Figure labels: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters); Extractor Agents 1, 2, 3; Semantic Engine; World Model; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); XCM-compliant metadata, XML, or other format; Third-party Content Mgmt and Syndication; Story on Cisco; Story on Lucent]
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults Knowledge Base for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
SEC/EPA Regulations: Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic collation of semantically related digital media information from multiple sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for the future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture" [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]
A personal definition: Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources and to provide interoperability between agents in a semantic manner
Based on RDF + XML; agent-readable tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-layered Architecture of the Semantic Web
Logical Layer: Formal Semantics and Reasoning Support - OIL, DAML-O
Schema Layer: Definition of Vocabulary - RDF Schema
Data Layer: Simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL - Evolving towards Semantic Web
OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Test May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake <= dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30 AND distance(NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude) < 10000
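The rule above can be tried out as a small Python predicate. This is a sketch under several assumptions: distances are great-circle miles computed with the haversine formula, the date difference is counted in days, and the earthquake must not precede the test (the slide's prose says "some time after"). The names `distance_miles` and `may_have_caused` and the sample records are hypothetical; they are not InfoQuilt's API or data.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, assumed for this sketch

def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, via the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp guards against rounding slightly above 1 for antipodal points.
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))

def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """NuclearTest 'Causes' Earthquake, per the rule on the slide."""
    days = (quake["eventDate"] - test["eventDate"]).days
    if not 0 <= days < max_days:
        return False
    miles = distance_miles(test["latitude"], test["longitude"],
                           quake["latitude"], quake["longitude"])
    return miles < max_miles

# Made-up sample records, for illustration only.
test = {"eventDate": date(1970, 1, 1), "latitude": 37.0, "longitude": -116.0}
quake = {"eventDate": date(1970, 1, 10), "latitude": 36.0, "longitude": -117.0}
print(may_have_caused(test, quake))
```

Since no two points on Earth are more than roughly 12,400 miles apart, a 10000-mile threshold excludes only near-antipodal pairs; the thresholds are kept as parameters so tighter values can be explored.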
Knowledge Discovery - Example
When was the first recorded nuclear test conducted? 1950
Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year, starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example…
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
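The grouping query above amounts to bucketing earthquakes by magnitude range and averaging the yearly counts. A minimal Python sketch follows; the half-open bucket boundaries (so that the last bucket is 9 and above), the function names, and the sample data are assumptions made for illustration and do not come from the USGS or NEIC sources named in the talk.

```python
from collections import Counter

# Magnitude buckets from the slide, treated as half-open intervals [low, high).
RANGES = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, float("inf"))]

def magnitude_range(magnitude):
    """Return the bucket containing the magnitude, or None if below 5.8."""
    for low, high in RANGES:
        if low <= magnitude < high:
            return (low, high)
    return None

def average_per_year(quakes, first_year, last_year):
    """Average number of earthquakes per year for each magnitude bucket."""
    totals = Counter()
    for year, magnitude in quakes:
        bucket = magnitude_range(magnitude)
        if bucket is not None and first_year <= year <= last_year:
            totals[bucket] += 1
    n_years = last_year - first_year + 1
    return {bucket: totals[bucket] / n_years for bucket in RANGES}

# Made-up (year, magnitude) records, for illustration only.
sample = [(1900, 6.1), (1900, 6.5), (1901, 7.2), (1901, 5.9), (1901, 4.0)]
for bucket, average in average_per_year(sample, 1900, 1901).items():
    print(bucket, average)
```

Running the same aggregation separately over the years before and after the first recorded test would show which buckets changed, which is the comparison the slide draws.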
Knowledge Discovery - Example…
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/9910-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NewsML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
Semantic Web: www.semanticweb.org
VoiceXML: www.voicexml.org
MPEG-7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 43
FormsTypesIngest of Content
Sources Web Sites Content Feeds and Private RepositoriesTypes Text Graphics Audio Video MultimediaForms Unstructured text Semi-structured text Structured text (+Media) Static or DynamicIngest Feed (push) Web (pull) RepositoryDatabase (usually pull)
HP 44
Content Handling/Ingest
Infrastructure/Exchange: Feed Handlers, Crawlers, Screen Scrapers, Bots, Software Agents
Centralized, Distributed, Mobile/Migratory
HP 45
Information Extraction for Metadata Creation
Global/Enterprise/Web Repositories
METADATA EXTRACTORS
Digital Maps
Nexis, UPI, AP
Documents
Digital Audios
Data Stores
Digital Videos
Digital Images
HP 46
Extracting a Text Document: Syntactic approach
INCIDENT MANAGEMENT SITUATION REPORT
Friday August 1, 1997 - 0530 MDT
NATIONAL PREPAREDNESS LEVEL II
CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.
SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr. The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.
CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.
LAYOUT
Date => day month int ',' int
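The layout rule above can be read as a small pattern over the report header. The following is a minimal sketch of that idea, assuming the date always appears as weekday, month name, day of month, optional comma, and four-digit year; the regular expression and the function name `extract_report_date` are illustrative and are not taken from the original system.

```python
import re
from datetime import date

# Month names mapped to month numbers (January -> 1, ..., December -> 12).
MONTHS = {name: number for number, name in enumerate(
    ["January", "February", "March", "April", "May", "June", "July",
     "August", "September", "October", "November", "December"], start=1)}

DAY_NAMES = "Monday|Tuesday|Wednesday|Thursday|Friday|Saturday|Sunday"
MONTH_NAMES = "|".join(MONTHS)

# Date => day month int ',' int  (the comma is treated as optional here)
DATE_RE = re.compile(
    rf"(?P<day>{DAY_NAMES})\s+(?P<month>{MONTH_NAMES})\s+"
    r"(?P<dom>\d{1,2}),?\s+(?P<year>\d{4})"
)


def extract_report_date(text):
    """Return the first date matching the layout rule, or None if absent."""
    match = DATE_RE.search(text)
    if match is None:
        return None
    return date(int(match["year"]), MONTHS[match["month"]], int(match["dom"]))


header = "INCIDENT MANAGEMENT SITUATION REPORT Friday August 1, 1997 - 0530 MDT"
report_date = extract_report_date(header)
```

A purely syntactic rule like this finds the date field but says nothing about what the report is about, which is the limitation the later slides on semantic metadata address.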
Traditional Text Categorization
Statistical/AI Techniques: Classify/Place in a taxonomy
Customer Training Set
Customer Article Feed (Article 4715)
Classification of Article 4715
Routing/Distribution
Standard Metadata: Feed Source: iSyndicate; Posted Date: 11/20/2000
Taalee's Categorization & Automatic Metadata Creation
Automated Content Enrichment (ACE)
Taalee Enterprise Customization Suite
Knowledge-base & Statistical/AI Techniques: Classify/Place in a taxonomy
Taalee Training Set
Customer Training Set
Article Feed (Article 4715)
Classification of Article 4715
Routing/Distribution
Map to another taxonomy
Metadata Catalog
Content Manager
Precise syndication/filtering
Article 4715 Metadata
Standard metadata: Feed Source: iSyndicate; Posted Date: 11/20/2000
Semantic metadata: Company Name: France Telecom, Equant; Ticker Symbol: FTE, ENT; Exchange: NYSE; Topic: Company News
Taxonomy nodes for FTE and ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
Taxonomy nodes for NYSE: Member Companies, Market News, IPOs
HP 49
Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.
Video Segment with Associated Text
Segment Description
Semantic Metadata
Auto Categorization
HP 50
Automatic Categorization & Metadata Tagging (Web page)
Video with Editorialized Text on the Web
Auto Categorization
Semantic Metadata
HP 51
Automatic Categorization & Metadata Tagging (Feed)
Text from Bloomberg
Auto Categorization
Semantic Metadata
HP 52
Taalee Extraction and Knowledgebase Enhancement
Extraction Agent
Web Page Enhanced Metadata Asset
HP 53
Basis for Semantics
A. Facts: Concepts/Terms/Entities; Dictionary, Thesaurus, Reference Data, Vocabulary
B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)
Knowledge Base
HP 54
Basis for Semantics
C. Reasoning/Inference
(Statistical) (Information Retrieval)
Statistical Learning/AI (Bayesian, Neural Networks, HMM, ...)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methods/Cluster Analysis
Learning/AI and Collaborative Filtering
Reference data/Concept-terms/Dictionary/Thesaurus, by topic/industry/subject/domain
Word or Phrase
Ontologies/Domain Models
KnowledgeBase, by Entities and Relationships
deeper understanding
HP 56
Open Directory Project (ODP): Classification/Taxonomy & Directory
HP 57
Ontology
Standardize meaning, description, and representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes: Attributes, Domain Rules, Functional Dependencies
HP 59
An Ontology
Example: Interrelated ontologies
LANDUSE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL
LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE
WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE (shredding, magnetic separation, screening, washing)
NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI (linked by "causes" relationships)
HP 61
Large Vocabularies, Taxonomies/Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.
Confidential HP
Metadata-enabled Applications
HP 63
Metadata Usage: Impact on Search & Query processing
traditional queries based on keywords
attribute based queries
content-based queries
HP 64
Oingo.com
Oingo Ontology - ODP based (?) - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning, used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely-related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged, and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:
• Context of the Content
• Semantic Metadata describing entities (i.e., Company, Industry, etc.), and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
"Normal" Content can only be "found" if the user enters a keyword that exists within it
Intelligent Content=+
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:
Context of the Content
Semantic Metadata describing entities (i.e., Company, Industry, etc.), and
Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, ...
HP 70
Metadata Usage Keyword Attribute and Content Based Access
The VisualHarness system at LSDIS/UGA
HP 71
Keyword Search vs. Attribute Search with Semantic Metadata
Virage search on "football touchdown":
Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline
Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline
Metadata from Typical Cataloging of Football Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31, Pit 24
http://www.nfl.com
Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter.
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
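The contrast on this slide can be sketched in a few lines. This is an illustration only, assuming an asset is stored as a free-text description plus a dictionary of metadata fields; the data layout and the function names `keyword_search` and `attribute_search` are hypothetical and do not describe Virage's or Taalee's actual interfaces.

```python
# One asset from the slide: description text plus semantic metadata fields.
ASSETS = [
    {
        "description": (
            "Quandry Ismail and Tony Banks hook up for their third long "
            "touchdown, this time on a 76-yarder, to extend the Ravens' "
            "lead to 31-24 in the third quarter."
        ),
        "metadata": {
            "League": "Professional",
            "Teams": ["Ravens", "Steelers"],
            "Players": ["Quandry Ismail", "Tony Banks"],
            "Event": "Touchdown",
            "Produced by": "NFL.com",
        },
    },
]


def keyword_search(assets, keyword):
    """Return assets whose description contains the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [a for a in assets if needle in a["description"].lower()]


def attribute_search(assets, **criteria):
    """Return assets whose metadata matches every field=value criterion."""
    def field_matches(asset, field, wanted):
        value = asset["metadata"].get(field)
        if isinstance(value, list):
            return wanted in value
        return value == wanted

    return [a for a in assets
            if all(field_matches(a, f, w) for f, w in criteria.items())]


by_keyword = keyword_search(ASSETS, "Steelers")
by_attribute = attribute_search(ASSETS, Teams="Steelers", Event="Touchdown")
```

The description never mentions the Steelers, so the keyword search comes back empty, while the attribute search on the Teams field finds the asset.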
HP 72
Taalee's Semantic Search
Highly customizable, precise, and freshest A/V search
Context and Domain Specific Attributes; Uniform Metadata for Content from Multiple Sources; Can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY amp CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content
Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries amp Hot Topics
PERSONALIZATION
3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
4 Pink Floyd Collection
Personalized Queries
Set the Controls for the Heart of the Sun…
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
2 My Football Fantasy Team
Gators' Spurrier ready for big game
Tech's Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone…
Edwards explains reasons for leaving BYU more…
more…
more…
more…
1 Election 2000
2 Middle East Peace Conflict
3 Napster Controversy
Video Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Cole's security
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II more…
HOT Topics
more…
more…
HP 80
Metadata Targeting
Semantic/Interactive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
Realtime Feeds
Interests Preferences
Time-Shifted Content Aggregator
Web sites and Pages
Content Databases
Personalized Content
Semantic Engine(TM)
Personalized Content
Content
Structured Hi-Quality Semantic Metabase
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site
User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and the knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions listen set alert on story add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst
HP 87
iTV: Taalee's Extreme Personalization
Content Provider
(DBS, DISH, Wink, AOL-TV)
Semantic Engine(TM)
Meta-Data Tagged Content
Content "Programs"
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-Quality Semantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in
This screen is customizable with an interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO
Part of the screen can be automatically customized to show conference call specific information, including transcript, participation, etc., all of which are relevant metadata
The Conference Call itself can have embedded metadata to support personalization and interactivity
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support
Sony
Categorize
Catalog
Integrate
Collection
Processing
Network Content
Affiliate Feeds
Public Sources Rich Data
Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp. (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore
HP 92
Retrieve Scene Description Track
Enhanced Digital Cable
Video
MPEG Decoder
Node = AVO Object
Create Scene Description Tree
GREAT USER EXPERIENCE
Metadata's role in emerging iTV infrastructure
MPEG-2/4/7
MPEG Encoder
Scene Description Tree
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers, and Broadcasters
Enhanced XML Description
"Cisco Systems"
Node
Taalee Semantic Engine
"Cisco Systems"
Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: John Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta
Object Content Information (OCI)
Metadata-richValue-added Node
HP 93
Intelligent Metadata Creation
Content which does contain the words the user asked for
Extractor Agents
Content which does not contain the words the user asked for, but is about what he asked for
Value-added Metadata
Content the user did not think to ask for, but which he needs to know
Semantic Associations
+ +
Metadata for Intelligent Content
Usage
HP 94
Intelligent Content via
Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"
Understanding of the content is needed to create new metadata
Taalee understands that Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses such Semantic Associations (another example: COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely
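The Roger Clemens example can be sketched as a lookup against a small knowledge base. This is only an illustration of the idea, assuming facts are stored as (entity, relation) pairs; the relation names, the dictionary layout, and the function `add_value_added_metadata` are hypothetical and are not Taalee's actual data model.

```python
# Hypothetical knowledge base: (entity, relation) -> related value.
KNOWLEDGE_BASE = {
    ("Roger Clemens", "is_a"): "PERSON",
    ("Roger Clemens", "plays_sport"): "Baseball",
    ("Roger Clemens", "plays_for"): "New York Yankees",
}

# Which relation fills which metadata field (an assumption for this sketch).
RELATION_TO_FIELD = (("plays_for", "teams"), ("plays_sport", "sports"))


def add_value_added_metadata(metadata, kb=KNOWLEDGE_BASE):
    """Return a copy of the metadata with fields inferred from known players."""
    enriched = {field: list(values) for field, values in metadata.items()}
    for player in metadata.get("players", []):
        for relation, field in RELATION_TO_FIELD:
            value = kb.get((player, relation))
            if value is not None and value not in enriched.setdefault(field, []):
                enriched[field].append(value)
    return enriched


def matches(metadata, field, value):
    """True if the given metadata field lists the value."""
    return value in metadata.get(field, [])


# The story text mentions only the player, so only "players" was extracted.
story = {"players": ["Roger Clemens"]}
enriched_story = add_value_added_metadata(story)
```

A search on teams for "New York Yankees" misses the original `story` but matches `enriched_story`, which is the effect the slide describes.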
HP 96
Guided Demo for Value-Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value-Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 - Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Producer Name: Name of content provider that produced the asset
Posted Date: Date of asset posting - extracted automatically
Player Names: Names of players mentioned explicitly in the asset - extracted automatically
Sport: Name of sport
Team Name: Name of team for which player plays - not mentioned explicitly in asset - value-added using Taalee's semantic relationships
League Name: Name of league to which the player's team belongs - not mentioned explicitly in asset - value-added by Taalee's processing based on semantic associations
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships
The asset is richly and fully described in the many ways the users choose to interact
Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset
HP 101
Intelligent Content via
Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context, or relationships of keywords
For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
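One way to picture the Commerce One example is as a set of subject-relation-object triples that a search can traverse. The sketch below assumes that representation; the relation names, the story records, and the function `find_stories` are illustrative and not a description of the Semantic Content Model itself. The competes_with relation is treated as one-directional here for simplicity.

```python
from collections import defaultdict

# Facts from the slide, written as (subject, relation, object) triples.
TRIPLES = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Commerce One", "participates_in",
     "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]


def build_index(triples):
    """Index triples by (subject, relation) for direct lookup."""
    index = defaultdict(set)
    for subject, relation, obj in triples:
        index[(subject, relation)].add(obj)
    return index


def find_stories(entity, stories, index):
    """Split results into what was asked for and semantically related news."""
    competitors = index.get((entity, "competes_with"), set())
    asked_for = [s["title"] for s in stories if entity in s["companies"]]
    related = [s["title"] for s in stories
               if entity not in s["companies"]
               and competitors.intersection(s["companies"])]
    return {"asked_for": asked_for, "related": related}


# Hypothetical story records, each tagged with the companies it is about.
STORIES = [
    {"title": "Story on Commerce One", "companies": ["Commerce One"]},
    {"title": "Story on Ariba", "companies": ["Ariba"]},
    {"title": "Story on an unrelated company", "companies": ["Other Co"]},
]

results = find_stories("Commerce One", STORIES, build_index(TRIPLES))
```

A keyword index alone would return only the first list; the second list exists because the competes_with fact links Commerce One to Ariba.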
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata-centric Content Management Architecture
Internal Source 1 (Research)
Internal Source 2
External feeds/Web (e.g., Reuters)
Extractor Agent 1
Extractor Agent 2
Extractor Agent 3
Semantic Engine
World Model
Taalee Metabase
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults KnowledgeBase for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard
Story on Lucent
Story on Cisco
XCM-compliant metadata, XML or other format
Semantic Application (ASP/Enterprise hosted)
Third-party Content Mgmt and Syndication
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News: COMPANIES in Same or Related INDUSTRY
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS
Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic SearchPersonalization etc
A vision for future
Semantic Web Complex Relationships and Knowledge Discovery
Eg InfoQuilt project at LSDIS Lab Univ of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]
A personal definition: Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF+XML
Agent-readable Tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-layered Architecture of the Semantic Web
Logical Layer: Formal Semantics and Reasoning Support - OIL, DAML-O
Schema Layer: Definition of Vocabulary - RDF Schema
Data Layer: Simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL - Evolving towards Semantic Web
OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Tests May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake <= dateDifference( NuclearTest.eventDate,
Earthquake.eventDate ) < 30 AND distance( NuclearTest.latitude,
NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
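The rule above can be sketched as a predicate over one nuclear test and one earthquake. This is an illustration under stated assumptions: dateDifference is taken to be in days and distance in miles (as the later example slide suggests), the earthquake must not precede the test, and great-circle distance is computed with the haversine formula, which the slide does not specify. The record layout and function names are hypothetical.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, an approximation


def date_difference(test_date, quake_date):
    """Days from the nuclear test to the earthquake (negative if earlier)."""
    return (quake_date - test_date).days


def distance(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))


def could_have_caused(test, quake, max_days=30, max_miles=10000):
    """Apply the 'NuclearTest Causes Earthquake' rule to one pair of events."""
    days = date_difference(test["eventDate"], quake["eventDate"])
    miles = distance(test["latitude"], test["longitude"],
                     quake["latitude"], quake["longitude"])
    return 0 <= days < max_days and miles < max_miles


# Made-up events, used only to exercise the rule.
test_event = {"eventDate": date(1970, 1, 1), "latitude": 37.0, "longitude": -116.0}
nearby_quake = {"eventDate": date(1970, 1, 11), "latitude": 36.0, "longitude": -117.0}
late_quake = {"eventDate": date(1970, 3, 1), "latitude": 36.0, "longitude": -117.0}
```

Here `could_have_caused(test_event, nearby_quake)` holds, while `late_quake` fails the 30-day condition.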
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900.
Increase in the number of earthquakes since 1945
Knowledge Discovery - Example...
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, find the average number of earthquakes per year, starting from 1900.
The number of earthquakes with magnitude > 7 is almost constant, so nuclear tests probably only cause earthquakes with magnitude < 7.
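The per-year counts and averages described above can be sketched as a simple grouping step. This is a minimal illustration, assuming each earthquake is a (year, magnitude) pair and that the magnitude ranges are half-open intervals (a choice the slide leaves unspecified); the sample records are made up and the function names are hypothetical.

```python
from collections import Counter

# Magnitude ranges from the slide, as half-open intervals [low, high).
BINS = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, float("inf"))]


def magnitude_bin(magnitude):
    """Return the (low, high) range containing the magnitude, or None."""
    for low, high in BINS:
        if low <= magnitude < high:
            return (low, high)
    return None


def counts_per_year(quakes, start_year=1900):
    """Count earthquakes per (year, magnitude range) from start_year on."""
    counts = Counter()
    for year, magnitude in quakes:
        bin_ = magnitude_bin(magnitude)
        if year >= start_year and bin_ is not None:
            counts[(year, bin_)] += 1
    return counts


def average_per_year(counts, bin_, first_year, last_year):
    """Average yearly count for one magnitude range over a span of years."""
    total = sum(counts[(year, bin_)] for year in range(first_year, last_year + 1))
    return total / (last_year - first_year + 1)


# Made-up sample records: (year, magnitude).
sample = [(1899, 7.5), (1900, 5.9), (1900, 6.5), (1901, 6.2), (1901, 5.0), (1902, 9.1)]
sample_counts = counts_per_year(sample)
```

Comparing `average_per_year` for each range before and after 1945 is the kind of check the slide uses to argue that only the lower magnitude ranges show an increase.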
Knowledge Discovery - Example...
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and within a radius of 10000 miles from the epicenter of the earthquake.
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 44
Content HandlingIngest
InfrastructureExchangeFeed HandlersCrawlersScreen ScrapersBotsSoftware Agents
Centralized Distributed MobileMigratory
HP 45
Information Extraction for Metadata Creation
GlobalEnterpriseWeb Repositories
METADATAMETADATA
EXTRACTORSEXTRACTORS
Digital Maps
NexisUPIAP
Documents
Digital Audios
Data Stores
Digital Videos
Digital Images
HP 46
Extracting a Text Document: Syntactic approach
INCIDENT MANAGEMENT SITUATION REPORT
Friday, August 1, 1997 - 0530 MDT
NATIONAL PREPAREDNESS LEVEL II
CURRENT SITUATION: Alaska continues to experience large fire activity. Additional fires have been staffed for structure protection.
SIMELS, Galena District, BLM. This fire is on the east side of the Innoko Flats, between Galena and McGr. The fire is active on the southern perimeter, which is burning into a continuous stand of black spruce. The fire has increased in size, but was not mapped due to thick smoke. The slopover on the eastern perimeter is 35% contained, while protection of the historic cabin continues.
CHINIKLIK MOUNTAIN, Galena District, BLM. A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire. The fire is contained. Major areas of heat have been mopped up. All crews and overhead will mop up where the fire burned beyond the meadows. No flare-ups occurred today. Demobilization is planned for this weekend, depending on the results of infrared scanning.
LAYOUT
Date => day month int ',' int
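The layout rule for the report's date line can be read as a small grammar. Below is a minimal Python sketch of that idea, assuming the rule means day-of-week, month name, day number, an optional comma, and a four-digit year. The names DATE_RULE and extract_report_date are illustrative only and do not come from the original system.

```python
import re
from datetime import date

MONTHS = {
    name: number
    for number, name in enumerate(
        ["January", "February", "March", "April", "May", "June", "July",
         "August", "September", "October", "November", "December"],
        start=1,
    )
}
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

# Date => day month int ',' int   (commas are treated as optional: an assumption)
DATE_RULE = re.compile(
    r"(?P<day>" + "|".join(DAYS) + r"),?\s+"
    r"(?P<month>" + "|".join(MONTHS) + r")\s+"
    r"(?P<dom>\d{1,2}),?\s+"
    r"(?P<year>\d{4})"
)


def extract_report_date(text):
    """Return the first date matching the layout rule, or None if there is none."""
    match = DATE_RULE.search(text)
    if match is None:
        return None
    return date(int(match["year"]), MONTHS[match["month"]], int(match["dom"]))
```

A purely syntactic rule like this finds the date without knowing anything about fires or districts, which is the contrast the later slides draw with knowledge-based extraction.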
Traditional Text Categorization
[Diagram labels: Customer Article Feed (article 4715); Customer Training Set; Statistical/AI Techniques; Classify/Place in a taxonomy; Routing/Distribution]
Standard Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Taalee's Categorization & Automatic Metadata Creation
Automated Content Enrichment (ACE)
Taalee Enterprise Customization Suite
[Diagram labels: Article Feed (article 4715); Taalee Training Set; Customer Training Set; Knowledge-base & Statistical/AI Techniques; Classify/Place in a taxonomy; Map to another taxonomy; Metadata Catalog; Content Manager; Routing/Distribution; Precise syndication/filtering]
Article 4715 Metadata
Standard metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Semantic metadata
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News
Classification of Article 4715
FTE, ENT: Company Analysis; Conference Calls; Earnings; Stock Analysis
NYSE: Member Companies; Market News; IPOs
HP 49
Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION AS OF TONIGHT THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49 IN WASHINGTON STATE THE SENATE RACE REMAINS TOO CLOSE TO CALL IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT THE SENATE WILL BE EVENLY DIVIDED IN MISSOURI REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN WHO DIED IN A CRASH THREE WEEKS AGO GOVERNOR CARNAHANS WIFE IS EXPECTED TO TAKE HIS PLACE IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT HILLARY CLINTON WON THE NEW YORK SENATE SEAT SHE IS THE FIRST FIRST LADY TO RUN MUCH LESS WIN
Video Segment with Associated Text
Segment Description
Semantic Metadata
Auto Categorization
HP 50
Automatic Categorization & Metadata Tagging (Web page)
Video with Editorialized Text on the Web
Auto Categorization
Semantic Metadata
HP 51
Automatic Categorization & Metadata Tagging (Feed)
Text from Bloomberg
Auto Categorization
Semantic Metadata
HP 52
Taalee Extraction and Knowledgebase Enhancement
Extraction Agent
Web Page Enhanced Metadata Asset
HP 53
Basis for Semantics
A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary
B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)
Knowledge Base
HP 54
Basis for Semantics
C. Reasoning/Inference
(Statistical) (Information Retrieval)
Statistical Learning/AI (Bayesian, Neural Networks, HMM, …)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methods/Cluster Analysis
Learning/AI and Collab. Filtering
Reference data/Concept-terms/Dictionary/Thesaurus, by topic/industry/subject/domain
Word or Phrase
Ontologies/Domain Models
Knowledge Base, by Entities and Relationships
deeper understanding
HP 56
Open Directory Project (ODP): Classification/Taxonomy & Directory
HP 57
Ontology
Standardize meaning, description, and representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes:
Attributes
Domain Rules
Functional Dependencies
HP 59
An Ontology
Example: Interrelated ontologies
[Diagram: three interrelated ontologies.
LANDUSE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL; LAND (SITE), CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE.
WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; shredding, magnetic separation, screening, washing.
NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI, with "causes" relationships between some disaster types.]
HP 61
Large Vocabularies, Taxonomies, Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.
Confidential HP
Metadata-enabled Applications
HP 63
Metadata Usage: Impact on Search & Query processing
traditional queries based on keywords
attribute-based queries
content-based queries
HP 64
Oingo.com
Oingo Ontology - ODP based(?): the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning, used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely-related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged, and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:
• Context of the Content
• Semantic Metadata describing entities (e.g., Company, Industry), and
• Relationships (semantic associations) among all entities
Based on a "semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase.
"Normal" content can only be "found" if the user enters a keyword that exists within it.
[Diagram: "Intelligent Content" shown as a sum of graphical elements]
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions.
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:
• Context of the Content
• Semantic Metadata describing entities (e.g., Company, Industry), and
• Relationships (semantic associations) among all entities
Based on a "semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase.
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search.
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …
HP 70
Metadata Usage: Keyword, Attribute, and Content-Based Access
The VisualHarness system at LSDIS/UGA
HP 71
Keyword Search vs Attribute Search with Semantic metadata
Virage Search on "football touchdown":
Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline
Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline
Metadata from Typical Cataloging of Football Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31, Pit 24
http://www.nfl.com
Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter.
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Qadry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
HP 72
Taalee's Semantic Search
Highly customizable, precise, and freshest A/V search
Context and Domain Specific Attributes; Uniform Metadata for Content from Multiple Sources; Can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do?
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY & CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content: Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries & Hot Topics
PERSONALIZATION
Personalized Queries
1. My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
more…
2. My Football Fantasy Team
Gators' Spurrier ready for big game
Tech's Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone…
Edwards explains reasons for leaving BYU
more…
3. Julia Roberts Collection
Movie Trailer: Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer: Stepmom
Conspiracy Theory
more…
4. Pink Floyd Collection
Set the Controls for the Heart of the Sun…
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
more…
HOT Topics
1. Election 2000
Video: Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
more…
2. Middle East Peace Conflict
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Cole's security
more…
3. Napster Controversy
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II
more…
HP 80
Metadata Targeting
Semantic/Interactive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web: Extreme Personalization
[Diagram labels: Realtime Feeds; Web sites and Pages; Content Databases; Time-Shifted Content Aggregator; Semantic Engine™; Structured Hi-Quality Semantic Metabase; Interests; Preferences; Content; Personalized Content]
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site.
User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories.
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide).
Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu.
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available.
The user clicks to see a listing of Analyst Calls on Cisco (next slide).
Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst.
HP 87
iTV: Taalee's Extreme Personalization
Content Provider
(DBS, DISH, Wink, AOL-TV)
Semantic Engine™
Meta-Data Tagged Content
Content "Programs"
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-Quality Semantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.
This screen is customizable with interactivity features using metadata, such as whether there is a new Conference Call video on CSCO.
Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata.
The Conference Call itself can have embedded metadata to support personalization and interactivity.
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support
Sony
Categorize
Catalog
Integrate
Collection
Processing
Network Content
Affiliate Feeds
Public Sources Rich Data
Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were
-- Breaking News for 11/30/2000 --
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
Customize Page: Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday, the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore
HP 92
Metadata's role in emerging iTV infrastructure
[Diagram labels: Video; MPEG Encoder; MPEG-2/4/7; Create Scene Description Tree; Scene Description Tree; Node = AVO Object; Retrieve Scene Description Track; MPEG Decoder; Enhanced Digital Cable; Taalee Semantic Engine; Enhanced XML Description; Node "Cisco Systems"; Object Content Information (OCI); Metadata-rich Value-added Node; GREAT USER EXPERIENCE]
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers, and Broadcasters
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta
HP 93
Intelligent Metadata Creation
Extractor Agents: Content which does contain the words the user asked for
+
Value-added Metadata: Content which does not contain the words the user asked for, but is about what he asked for
+
Semantic Associations: Content the user did not think to ask for, but which he needs to know
Metadata for Intelligent Content
Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".
Understanding of the content is needed to create new metadata.
Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely.
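The enrichment step described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionaries PLAYS_FOR and TEAM_FACTS stand in for a knowledge base, the function names are hypothetical, and the facts are limited to the player/team examples that appear in these slides. It is not Taalee's implementation.

```python
# Toy knowledge base (hypothetical structure); facts mirror the slide examples.
PLAYS_FOR = {
    "Roger Clemens": "New York Yankees",
    "Jamal Anderson": "Atlanta Falcons",
    "Gary Sheffield": "Los Angeles Dodgers",
}
TEAM_FACTS = {
    "New York Yankees": {"Sport": "Baseball"},
    "Atlanta Falcons": {"League": "NFL"},
    "Los Angeles Dodgers": {"League": "National League"},
}


def _append_unique(record, field, value):
    """Add value to record[field] unless it is already listed."""
    values = record.setdefault(field, [])
    if value not in values:
        values.append(value)


def add_value_added_metadata(metadata):
    """Return a copy of metadata with team-level fields inferred from players."""
    enriched = {field: list(values) for field, values in metadata.items()}
    for player in metadata.get("Players", []):
        team = PLAYS_FOR.get(player)
        if team is None:
            continue  # unknown entity: nothing can be inferred
        _append_unique(enriched, "Teams", team)
        for field, value in TEAM_FACTS.get(team, {}).items():
            _append_unique(enriched, field, value)
    return enriched


def matches(metadata, field, value):
    """Attribute search: does the asset carry this value in this field?"""
    return value in metadata.get(field, [])
```

With this sketch, an asset whose extracted metadata lists only the player "Jamal Anderson" becomes findable under Teams = "Atlanta Falcons" after enrichment, even though the team name never occurs in the source text.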
HP 96
Guided Demo for Value Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team = Atlanta Falcons or League = NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team = Los Angeles Dodgers or League = National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 - Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Posted Date: Date of asset posting - extracted automatically
Producer Name: Name of content provider that produced the asset
Player Names: Names of players mentioned explicitly in the asset - extracted automatically
Team Name: Name of team for which player plays - not mentioned explicitly in asset - value-added using Taalee's semantic relationships
League Name: Name of league to which the player's team belongs - not mentioned explicitly in asset - value-added by Taalee's processing based on semantic associations
Sport: Name of sport
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.
The asset is richly and fully described in the many ways users choose to interact.
Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context, or relationships of keywords
For example, a search engine may see that the phrase "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
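One simple way to picture semantic associations is as typed relationships between entities that can be followed at query time. The Python sketch below assumes a tiny list of (subject, relation, object) triples built from the companies named in these slides; the relation names, the SYMMETRIC set, and both functions are hypothetical and only illustrate the idea.

```python
# Hypothetical typed relationships; entities are taken from the slide examples.
TRIPLES = [
    ("Commerce One", "participates_in", "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
    ("Cisco", "competes_with", "Lucent"),
]
# Relations assumed to hold in both directions.
SYMMETRIC = {"competes_with"}


def associated(entity, relation):
    """Return, sorted, the entities linked to `entity` by `relation`."""
    found = set()
    for subject, rel, obj in TRIPLES:
        if rel != relation:
            continue
        if subject == entity:
            found.add(obj)
        elif rel in SYMMETRIC and obj == entity:
            found.add(subject)
    return sorted(found)


def related_stories(entity, stories):
    """Pick stories about competitors of `entity`.

    `stories` is a list of (title, companies mentioned) pairs.
    """
    competitors = set(associated(entity, "competes_with"))
    return [title for title, companies in stories if competitors & set(companies)]
```

A keyword index would return only stories that mention the queried company; following the competes_with relationship is what surfaces the competitor's story that the user did not ask for.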
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata-centric Content Management Architecture
[Diagram labels: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters); Extractor Agent 1; Extractor Agent 2; Extractor Agent 3; Semantic Engine; World Model; Taalee Metabase; XCM-compliant metadata, XML or other format; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication; Story on Cisco; Story on Lucent]
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults Knowledge Base for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
SEC, EPA Regulations: Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for the future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling is obviously not enough:
• we need a "logic layer" on top of RDF
• some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture" [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]
A personal definition: Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)?
• a proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
• Based on RDF + XML
• Agent-readable Tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-layered Architecture of Semantic Web
Logical Layer: Formal Semantics and Reasoning Support - OIL, DAML-O
Schema Layer: Definition of Vocabulary - RDF Schema
Data Layer: Simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL - Evolving towards Semantic Web
OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Tests May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.
NuclearTest Causes Earthquake <=
    dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30
    AND distance(NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude) < 10000
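The "Causes" rule can be written as an executable predicate. The following Python sketch is one possible reading, not the InfoQuilt implementation: it assumes dateDifference is measured in days with the earthquake on or after the test, and that distance is the great-circle (haversine) distance in miles. The Event class and all function names are illustrative.

```python
import math
from dataclasses import dataclass
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


@dataclass(frozen=True)
class Event:
    name: str
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees


def distance_miles(a, b):
    """Great-circle distance between two events (haversine formula)."""
    lat1, lon1, lat2, lon2 = map(
        math.radians, (a.latitude, a.longitude, b.latitude, b.longitude)
    )
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))


def may_have_caused(test, quake, max_days=30, max_miles=10000.0):
    """Rule body: quake follows the test within max_days and max_miles."""
    days = (quake.event_date - test.event_date).days
    return 0 <= days < max_days and distance_miles(test, quake) < max_miles


def candidate_pairs(tests, quakes):
    """All (test, quake) pairs satisfying the rule."""
    return [(t, q) for t in tests for q in quakes if may_have_caused(t, q)]
```

Both thresholds are parameters so that narrower windows can be explored. Note that a 10,000-mile radius covers most of the globe, since no two points on Earth are more than about 12,450 miles apart along a great circle.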
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900.
Increase in number of earthquakes since 1945
Knowledge Discovery - Example…
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes.
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7.
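The grouped query above amounts to binning earthquakes by magnitude and averaging the counts over a span of years. Here is a minimal Python sketch, assuming each earthquake is given as a (year, magnitude) pair and that the ranges are half-open, so a magnitude of exactly 6.0 falls in "6-7" and exactly 9.0 falls in the top bin. Both the boundary handling and the function names are assumptions.

```python
import math

# Half-open magnitude ranges [low, high); boundary handling is an assumption.
MAGNITUDE_BINS = [
    ("5.8-6", 5.8, 6.0),
    ("6-7", 6.0, 7.0),
    ("7-8", 7.0, 8.0),
    ("8-9", 8.0, 9.0),
    (">9", 9.0, math.inf),
]


def magnitude_bin(magnitude):
    """Return the label of the range containing magnitude, or None if below 5.8."""
    for label, low, high in MAGNITUDE_BINS:
        if low <= magnitude < high:
            return label
    return None


def average_per_year(quakes, first_year, last_year):
    """Mean number of earthquakes per year in each magnitude range.

    quakes: iterable of (year, magnitude) pairs; years outside
    [first_year, last_year] are ignored.
    """
    totals = {label: 0 for label, _, _ in MAGNITUDE_BINS}
    for year, magnitude in quakes:
        label = magnitude_bin(magnitude)
        if label is not None and first_year <= year <= last_year:
            totals[label] += 1
    years = last_year - first_year + 1
    return {label: count / years for label, count in totals.items()}
```

Running this separately for the years before and after a chosen cutoff would give the kind of per-range comparison the slide describes.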
Knowledge Discovery - Example…
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and within a radius of 10,000 miles from the epicenter of the earthquake.
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/9910-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998.
HP 45
Information Extraction for Metadata Creation
GlobalEnterpriseWeb Repositories
METADATAMETADATA
EXTRACTORSEXTRACTORS
Digital Maps
NexisUPIAP
Documents
Digital Audios
Data Stores
Digital Videos
Digital Images
HP 46
Extracting a Text DocumentExtracting a Text DocumentSyntactic approachSyntactic approach
INCIDENT MANAGEMENT SITUATION REPORTFriday August 1 1997 - 0530 MDT
NATIONAL PREPAREDNESS LEVEL II
CURRENT SITUATION Alaska continues to experience large fire activity Additional fires have beenstaffed for structure protection
SIMELS Galena District BLM This fire is on the east side of the Innoko Flats between Galena and McGrThe fore is active on the southern perimeter which is burning into a continuous stand of black spruce Thefire has increased in size but was not mapped due to thick smoke The slopover on the eastern perimeter is35 contained while protection of the historic cabit continues
CHINIKLIK MOUNTAIN Galena District BLM A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire The fire is contained Major areas of heat have been mopped up The fire iscontained Major areas of heat have been mopped-up All crews and overhead will mop-up where the fireburned beyond the meadows No flare-ups occurred today Demobilization is planned for this weekenddepending on the results of infrared scanning
LAYOUT
Date =gt day month int lsquorsquo int
Traditional Text Categorization
[Figure: a Customer Article Feed (article 4715) and a Customer Training Set feed Statistical/AI Techniques, which classify the article (place it in a taxonomy). The resulting Classification of Article 4715 drives Routing/Distribution.]
Standard Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Taalee's Categorization & Automatic Metadata Creation
Automated Content Enrichment (ACE), Taalee Enterprise Customization Suite
[Figure: the Article Feed (article 4715), a Taalee Training Set, and a Customer Training Set feed Knowledge-base & Statistical/AI Techniques, which classify the article (place it in a taxonomy) and can map it to another taxonomy. The Classification of Article 4715 goes to the Metadata Catalog and Content Manager for precise syndication/filtering and Routing/Distribution.]
Article 4715 Metadata
Standard metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Semantic metadata
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News
Classification of Article 4715
FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
NYSE: Member Companies, Market News, IPOs
HP 49
Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION AS OF TONIGHT THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49 IN WASHINGTON STATE THE SENATE RACE REMAINS TOO CLOSE TO CALL IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT THE SENATE WILL BE EVENLY DIVIDED IN MISSOURI REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN WHO DIED IN A CRASH THREE WEEKS AGO GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT HILLARY CLINTON WON THE NEW YORK SENATE SEAT SHE IS THE FIRST FIRST LADY TO RUN MUCH LESS WIN
Video Segment with Associated Text
Segment Description
Semantic Metadata
Auto Categorization
HP 50
Automatic Categorization & Metadata Tagging (Web page)
Video with Editorialized Text on the Web
Auto Categorization
Semantic Metadata
HP 51
Automatic Categorization & Metadata Tagging (Feed)
Text from Bloomberg
Auto Categorization
Semantic Metadata
HP 52
Taalee Extraction and Knowledgebase Enhancement
Extraction Agent
Web Page Enhanced Metadata Asset
HP 53
Basis for Semantics
A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary
B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)
Knowledge Base
HP 54
Basis for Semantics
C. Reasoning/Inference
(Statistical) (Information Retrieval)
Statistical Learning/AI (Bayesian, Neural Networks, HMM, …)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methods: Cluster Analysis
Learning/AI and Collab. Filtering
Reference data/Concept-terms/Dictionary/Thesaurus: by topic/industry/subject/domain; word or phrase
Ontologies/Domain Models
KnowledgeBase: by Entities and Relationships
(the alternatives are ordered toward deeper understanding)
HP 56
Open Directory Project (ODP): Classification/Taxonomy & Directory
HP 57
Ontology
Standardize meaning, description, representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes: Attributes, Domain Rules, Functional Dependencies
HP 59
An Ontology
Example: Interrelated ontologies
[Figure: node labels from the interrelated ontologies, in the order they appear]
LAND USE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL
LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE
WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; processes: shredding, magnetic separation, screening, washing
NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI; several of these nodes are linked by "causes" relationships
HP 61
Large Vocabularies, Taxonomies, Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms
Confidential HP
Metadata-enabled Applications
HP 63
Metadata Usage: Impact on Search & Query processing
traditional queries based on keywords
attribute based queries
content-based queries
HP 64
Oingo.com
Oingo Ontology - ODP based(?): the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely-related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged, and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing
• Context of the Content
• Semantic Metadata describing entities (i.e., Company, Industry, etc.) and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
"Normal" Content can only be "found" if the user enters a keyword that exists within it
Intelligent Content=+
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create
Context of the Content
Semantic Metadata describing entities (i.e., Company, Industry, etc.) and
Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …
HP 70
Metadata Usage Keyword Attribute and Content Based Access
The VisualHarness system at LSDIS/UGA
HP 71
Keyword Search vs Attribute Search with Semantic metadata
Virage Search on "football touchdown":
Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL httpcbssportsline
Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL httpcbssportsline
Metadata from Typical Cataloging of Football Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31 Pit 24
http://www.nfl.com
Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Raven's lead to 31-24 in the third quarter
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2022000
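The contrast on this slide can be sketched in a few lines: keyword search looks only at the words in the title and description, while attribute search matches against named metadata fields. The sketch below is illustrative only. The asset records reuse the titles and fields shown above, the two interview assets are assumed to carry no semantic metadata, and keyword_search and attribute_search are hypothetical names, not Virage or Taalee APIs.

```python
ASSETS = [
    {"title": "Jimmy Smith Interview Part Seven",
     "description": "Jimmy Smith explains his philosophy on showboating",
     "metadata": {}},
    {"title": "Brian Griese Interview Part Four",
     "description": "Brian Griese talks about the first touchdown he ever threw",
     "metadata": {}},
    {"title": "Baltimore 31 Pit 24",
     "description": "Quandry Ismail and Tony Banks hook up for their third long touchdown",
     "metadata": {"League": "Professional",
                  "Teams": ["Ravens", "Steelers"],
                  "Players": ["Quandry Ismail", "Tony Banks"],
                  "Event": "Touchdown",
                  "Produced by": "NFL.com"}},
]


def keyword_search(assets, query):
    """Return titles whose title or description contains any query word."""
    words = query.lower().split()
    hits = []
    for asset in assets:
        text = (asset["title"] + " " + asset["description"]).lower()
        if any(word in text for word in words):
            hits.append(asset["title"])
    return hits


def _matches(value, wanted):
    # A field may hold one value or a list of values.
    if isinstance(value, list):
        return wanted in value
    return value == wanted


def attribute_search(assets, criteria):
    """Return titles whose metadata satisfies every field=value criterion."""
    hits = []
    for asset in assets:
        meta = asset["metadata"]
        if all(_matches(meta.get(field), wanted)
               for field, wanted in criteria.items()):
            hits.append(asset["title"])
    return hits


keyword_hits = keyword_search(ASSETS, "football touchdown")
attribute_hits = attribute_search(ASSETS, {"Event": "Touchdown", "Teams": "Ravens"})
```

The keyword query returns anything that happens to mention "touchdown", while the attribute query pins down the event and the team.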
HP 72
Taalee's Semantic Search
Highly customizable, precise, and freshest A/V search
Context and Domain Specific Attributes
Uniform Metadata for Content from Multiple Sources; can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY & CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content
Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries & Hot Topics
PERSONALIZATION
Personalized Queries
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
more…
2 My Football Fantasy Team
Gators' Spurrier ready for big game
Tech's Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone…
Edwards explains reasons for leaving BYU
more…
3 Julia Roberts Collection
Movie Trailer: Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer: Stepmom
Conspiracy Theory
more…
4 Pink Floyd Collection
Set the Controls for the Heart of the Sun…
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
more…
HOT Topics
1 Election 2000
Video: Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
more…
2 Middle East Peace Conflict
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Cole's security
more…
3 Napster Controversy
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II
more…
HP 80
Metadata Targeting
Semantic/Interactive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web: Extreme Personalization
[Figure labels: Realtime Feeds; Web sites and Pages; Content Databases; Time-Shifted Content Aggregator; Interests and Preferences; Semantic Engine(TM); Structured Hi-Quality Semantic Metabase; Personalized Content]
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site
User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and the knowledge of user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions listen set alert on story add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst
HP 87
iTV: Taalee's Extreme Personalization
[Figure labels: Content Provider (DBS, DISH, Wink, AOL-TV); Content "Programs"; Meta-Data Tagged Content; Semantic Engine(TM); Structured Hi-Quality Semantic Metabase; Immediate Interests and Preferences; Personalized Content Capsules; Redirects and Programming]
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by personalization application to show only the stocks that user is interested in
This screen is customizable with interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO
Part of the screen can be automatically customized to show conference call specific information - including transcript, participation, etc. - all of which are relevant metadata
Conference Call itself can have embedded metadata to support personalization and interactivity
HP 89
Metadata in Enterprise Apps
[Figure labels: Production Support; Sony; Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication; Collection and Processing: Categorize, Catalog, Integrate; Network Content; Affiliate Feeds; Public Sources; Rich Data Metabase]
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fisherman in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were
-- Breaking News for 11/30/2000 --
Customize Page Settings | Content | Layout | Color
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --Gore Demands That Recount Restart
Gore Says Fla Cant Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore
HP 92
Metadata's role in emerging iTV infrastructure
[Figure labels: Video; MPEG Encoder (MPEG-2/4/7); Enhanced Digital Cable; MPEG Decoder; Create Scene Description Tree; Retrieve Scene Description Track; Scene Description Tree; Node = AVO Object; "Cisco Systems" Node; Taalee Semantic Engine; Enhanced XML Description; Metadata-rich Value-added Node; GREAT USER EXPERIENCE]
Object Content Information (OCI)
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers, and Broadcasters
HP 93
Intelligent Metadata Creation
Extractor Agents: Content which does contain the words the user asked for
+ Value-added Metadata: Content which does not contain the words the user asked for, but is about what he asked for
+ Semantic Associations: Content the user did not think to ask for, but which he needs to know
Metadata for Intelligent Content
Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example: If a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"
Understanding of the content is needed to create new metadata
Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
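The idea of value-added metadata can be illustrated with a small lookup against a knowledge base. This is only a sketch under stated assumptions: the knowledge base is a plain dictionary holding the single Roger Clemens fact from the slide, the story title is made up, and add_value_added_metadata and find_assets are hypothetical names rather than Taalee's actual interfaces.

```python
# Toy knowledge base: entity -> facts that can be attached to an asset.
KNOWLEDGE_BASE = {
    "Roger Clemens": {"Sport": "Baseball", "Teams": "New York Yankees"},
}


def add_value_added_metadata(extracted, knowledge_base=KNOWLEDGE_BASE):
    """Return a copy of the extracted metadata plus facts implied by its players."""
    enriched = {field: list(values) for field, values in extracted.items()}
    for player in extracted.get("Players", []):
        for field, value in knowledge_base.get(player, {}).items():
            bucket = enriched.setdefault(field, [])
            if value not in bucket:
                bucket.append(value)
    return enriched


def find_assets(assets, field, value):
    """Return titles of assets whose metadata field contains the value."""
    return [a["title"] for a in assets if value in a["metadata"].get(field, [])]


# The story text only names the player, so only "Players" is extracted.
story = {"title": "Clemens story (hypothetical)",
         "metadata": {"Players": ["Roger Clemens"]}}
before = find_assets([story], "Teams", "New York Yankees")

story["metadata"] = add_value_added_metadata(story["metadata"])
after = find_assets([story], "Teams", "New York Yankees")
```

Before enrichment a search on the team finds nothing; afterwards the same search returns the story, even though the team name never appears in the content.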
HP 96
Guided Demo for Value Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player of Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player of Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 - Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Posted Date: Date of asset posting - Extracted automatically
League Name: Name of league to which the player's team belongs - Not mentioned explicitly in asset - Value-added by Taalee's processing based on semantic associations
Team Name: Name of team for which player plays - Not mentioned explicitly in asset - Value-added using Taalee's semantic relationships
Producer Name: Name of content provider that produced the asset
Player Names: Name of players mentioned explicitly in the asset - Extracted automatically
Sport: Name of sport
Some Metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships
The asset is richly, fully described in the many ways the users chose to interact
Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context, or relationships of keywords
For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate Professional & Financial Software INDUSTRY and COMPETES WITH Ariba
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
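One way to picture semantic associations is as a small set of subject-relation-object triples that an application can traverse. The sketch below is an assumption-laden illustration, not the Semantic Content Model itself: the triples restate the Commerce One example from the slide, the relation names and story headlines are invented for the example, and "competes_with" is assumed to be symmetric.

```python
ASSOCIATIONS = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Commerce One", "participates_in", "Corporate Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]

# Relations that hold in both directions (an assumption of this sketch).
SYMMETRIC = {"competes_with"}


def related_entities(entity, relation, associations=ASSOCIATIONS):
    """Return entities linked to `entity` by `relation`."""
    found = []
    for subject, rel, obj in associations:
        if rel != relation:
            continue
        if subject == entity and obj not in found:
            found.append(obj)
        elif rel in SYMMETRIC and obj == entity and subject not in found:
            found.append(subject)
    return found


def semantically_related_stories(entity, stories, associations=ASSOCIATIONS):
    """Return headlines about competitors of `entity`, which the user did not ask for."""
    competitors = related_entities(entity, "competes_with", associations)
    return [s["headline"] for s in stories
            if any(c in s["entities"] for c in competitors)]


# Hypothetical catalog entries, each tagged with the entities it mentions.
STORIES = [
    {"headline": "Story on Commerce One", "entities": ["Commerce One"]},
    {"headline": "Story on Ariba", "entities": ["Ariba"]},
    {"headline": "Unrelated story", "entities": ["Some Other Company"]},
]

related = semantically_related_stories("Commerce One", STORIES)
```

A keyword index alone could return the first story for a "Commerce One" query; the association is what brings in the Ariba story.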
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (To view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata centric Content Management Architecture
[Figure labels: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters); Extractor Agent 1; Extractor Agent 2; Extractor Agent 3; Semantic Engine; World Model; Taalee Metabase; XCM-compliant metadata, XML or other format; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication; Story on Cisco; Story on Lucent]
1 Cisco story from PW Source 1 passed on to add semantic associations
2 Consults KnowledgeBase for Cisco's competition
3 Returns result: Lucent is a competitor of Cisco
4 Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
SEC/EPA Regulations: Impacting INDUSTRY or Filed By COMPANY
Technology/Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for future
Semantic Web Complex Relationships and Knowledge Discovery
Eg InfoQuilt project at LSDIS Lab Univ of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling obviously not enough
we need a "logic layer" on top of RDF
some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture" [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]
A personal definition
Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)?
a proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF+XML
Agent readable Tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three layered Architecture Of Semantic Web
Logical Layer: Formal Semantics and Reasoning Support - OIL, DAML-O
Schema Layer: Definition of Vocabulary - RDF Schema
Data Layer: Simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
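To make the herbivore definition above concrete, the sketch below parses an equivalent RDF/XML fragment with Python's standard library and lists what the class is declared to be a subclass of and what it is declared not to be. Several things here are assumptions: the fragment is wrapped in an rdf:RDF root so it parses on its own, the oil namespace URI is a placeholder, the rdf:type line is omitted for brevity, and read_class_axioms is an illustrative helper, not part of any OIL toolkit.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OIL = "http://example.org/oil#"  # placeholder namespace URI (assumption)

DOC = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}" xmlns:oil="{OIL}">
  <rdfs:Class rdf:ID="herbivore">
    <rdfs:subClassOf rdf:resource="#animal"/>
    <rdfs:subClassOf>
      <oil:NOT>
        <oil:hasOperand rdf:resource="#carnivore"/>
      </oil:NOT>
    </rdfs:subClassOf>
  </rdfs:Class>
</rdf:RDF>
"""


def read_class_axioms(xml_text):
    """Map each class ID to its named superclasses and its negated classes."""
    root = ET.fromstring(xml_text)
    axioms = {}
    for cls in root.findall(f"{{{RDFS}}}Class"):
        name = cls.get(f"{{{RDF}}}ID")
        supers, negated = [], []
        for sub in cls.findall(f"{{{RDFS}}}subClassOf"):
            target = sub.get(f"{{{RDF}}}resource")
            if target is not None:
                supers.append(target.lstrip("#"))
            # A subClassOf wrapping oil:NOT says "is not a <operand>".
            for operand in sub.findall(f"{{{OIL}}}NOT/{{{OIL}}}hasOperand"):
                negated.append(operand.get(f"{{{RDF}}}resource").lstrip("#"))
        axioms[name] = {"subclass_of": supers, "not_a": negated}
    return axioms


axioms = read_class_axioms(DOC)
```

Read this way, the plain RDF Schema part says a herbivore is an animal, and the OIL extension adds the logical statement that it is not a carnivore.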
DAML and OIL - Evolving towards Semantic Web
OIL Mission
OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Test May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake <= dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30 AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
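The "Causes" rule above is a conjunction of a time constraint and a distance constraint, which can be sketched directly. The sketch makes several assumptions that are not in the slides: the date difference is counted in days and must be non-negative (the earthquake follows the test), distance is great-circle distance in miles computed with the haversine formula, and the Event type, function names, and sample coordinates are purely illustrative.

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


@dataclass
class Event:
    event_date: date
    latitude: float
    longitude: float


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """Apply the rule: quake within max_days after the test and within max_miles."""
    days = (quake.event_date - test.event_date).days
    near = distance_miles(test.latitude, test.longitude,
                          quake.latitude, quake.longitude) < max_miles
    return 0 <= days < max_days and near


# Made-up events, for illustration only.
nuclear_test = Event(date(2000, 1, 1), 10.0, 20.0)
quake_after = Event(date(2000, 1, 15), 12.0, 21.0)
quake_before = Event(date(1999, 12, 20), 12.0, 21.0)
quake_late = Event(date(2000, 3, 1), 12.0, 21.0)
```

Such a rule only flags candidate pairs; whether the relationship is real is exactly the question the following slides explore.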
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example…
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale per year starting from 1900, find average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
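The grouping query above can be sketched as a binning step followed by an average over years. This is a minimal illustration under assumptions: each earthquake is a (year, magnitude) pair, bin edges are half-open so that a magnitude of exactly 6 falls in the 6-7 range, magnitudes of 9 and above go in the last bin, and the sample data is made up.

```python
from collections import Counter

# (low, high, label): low <= magnitude < high
BINS = [
    (5.8, 6.0, "5.8-6"),
    (6.0, 7.0, "6-7"),
    (7.0, 8.0, "7-8"),
    (8.0, 9.0, "8-9"),
    (9.0, float("inf"), ">9"),
]


def magnitude_bin(magnitude):
    """Return the label of the range containing the magnitude, or None."""
    for low, high, label in BINS:
        if low <= magnitude < high:
            return label
    return None


def average_per_year(quakes, first_year, last_year):
    """Average number of earthquakes per year in each magnitude range."""
    n_years = last_year - first_year + 1
    totals = Counter()
    for year, magnitude in quakes:
        label = magnitude_bin(magnitude)
        if label is not None and first_year <= year <= last_year:
            totals[label] += 1
    return {label: totals[label] / n_years for _, _, label in BINS}


# Made-up records, for illustration only.
sample = [(1900, 5.9), (1900, 6.5), (1901, 6.2), (1901, 7.1), (1901, 5.0)]
averages = average_per_year(sample, 1900, 1901)
```

Running the same computation separately for the years before and after 1945 is one way to check the slide's observation about which magnitude ranges change.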
Knowledge Discovery - Example…
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 46
Extracting a Text DocumentExtracting a Text DocumentSyntactic approachSyntactic approach
INCIDENT MANAGEMENT SITUATION REPORTFriday August 1 1997 - 0530 MDT
NATIONAL PREPAREDNESS LEVEL II
CURRENT SITUATION Alaska continues to experience large fire activity Additional fires have beenstaffed for structure protection
SIMELS Galena District BLM This fire is on the east side of the Innoko Flats between Galena and McGrThe fore is active on the southern perimeter which is burning into a continuous stand of black spruce Thefire has increased in size but was not mapped due to thick smoke The slopover on the eastern perimeter is35 contained while protection of the historic cabit continues
CHINIKLIK MOUNTAIN Galena District BLM A Type II Incident Management Team (Wehking) is assigned to the Chiniklik fire The fire is contained Major areas of heat have been mopped up The fire iscontained Major areas of heat have been mopped-up All crews and overhead will mop-up where the fireburned beyond the meadows No flare-ups occurred today Demobilization is planned for this weekenddepending on the results of infrared scanning
LAYOUT
Date =gt day month int lsquorsquo int
Traditional TextCategorization
StatisticalAI Techniques
Classify Place ina taxonomy
feed
Customer Training
Set
RoutingDistribution
Customer Article Feed
4715
Standard Metadata
Feed Source iSyndicate
Posted Date 11202000
Classification of Article 4715
Knowledge-base amp StatisticalAI Techniques
ClassifyPlace ina taxonomy
MetadataCatalog
Content Manager
Precise syndicationfiltering
fd
Article 4715 MetadataFeed Source iSyndicatePosted Date 11202000 Company Name France Telecom
EquantTicker Symbol FTE ENTExchange NYSETopic Company News
Standard metadata
Semantic metadata
FTECompany AnalysisConference Calls
EarningsStock Analysis
NYSEMember Companies
Market NewsIPOs
Automated Content Enrichment (ACE)
Taalee Enterprise Customization Suite
Taaleersquos Categorization amp Automatic Metadata Creation
Taalee Training
Set
Customer Training
Set ee ENTCompany AnalysisConference Calls
EarningsStock Analysis
Classification of Article 4715
Article Feed4715 RoutingDistribution
Map to another taxonomy
HP 49
Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.
Video Segment with Associated Text
Segment Description
Semantic Metadata
Auto Categorization
HP 50
Automatic Categorization & Metadata Tagging (Web page)
Video with Editorialized Text on the Web
Auto Categorization
Semantic Metadata
HP 51
Automatic Categorization & Metadata Tagging (Feed)
Text from Bloomberg
Auto Categorization
Semantic Metadata
HP 52
Taalee Extraction and Knowledgebase Enhancement
Extraction Agent
Web Page Enhanced Metadata Asset
HP 53
Basis for Semantics
A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary
B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)
Knowledge Base
HP 54
Basis for Semantics
C. Reasoning/Inference: (Statistical) (Information Retrieval); Statistical Learning/AI (Bayesian, Neural Networks, HMM, …); Logic Based (Description Logic); Natural Language/Grammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methods, Cluster Analysis
Learning/AI and Collaborative Filtering
Reference data, Concept-terms, Dictionary, Thesaurus (by topic/industry/subject/domain)
Word or Phrase
Ontologies, Domain Models
Knowledge Base (by Entities and Relationships)
deeper understanding
HP 56
Open Directory Project (ODP): Classification/Taxonomy & Directory
HP 57
Ontology
Standardize meaning, description, representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes: Attributes, Domain Rules, Functional Dependencies
HP 59
An Ontology
Example: Interrelated ontologies
[Figure: four interrelated ontologies]
LAND USE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL
LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE
WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; processes: shredding, magnetic separation, screening, washing
NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI; four "causes" links connect disaster types
HP 61
Large Vocabularies, Taxonomies, Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.
Confidential HP
Metadata-enabled Applications
HP 63
Metadata Usage: Impact on Search & Query Processing
traditional queries based on keywords
attribute based queries
content-based queries
HP 64
Oingo.com
Oingo Ontology - ODP based (?): the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning, used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely-related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:
• Context of the Content
• Semantic Metadata describing entities (e.g., Company, Industry) and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
"Normal" Content can only be "found" if the user enters a keyword that exists within it
Intelligent Content=+
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:
Context of the Content
Semantic Metadata describing entities (e.g., Company, Industry) and
Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …
HP 70
Metadata Usage: Keyword, Attribute and Content Based Access
The VisualHarness system at LSDIS/UGA
HP 71
Keyword Search vs. Attribute Search with Semantic Metadata
Virage search on "football touchdown":
Jimmy Smith Interview, Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline
Brian Griese Interview, Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline
Metadata from Typical Cataloging of Football Assets
Taalee Metadata on Football Assets
Rich Media Reference Page: Baltimore 31, Pit 24 (http://www.nfl.com)
Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter.
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Qadry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
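The difference between the two kinds of search on this slide can be sketched in a few lines. This is an illustrative sketch, not Taalee's or Virage's implementation: the functions `keyword_search` and `attribute_search` and the `ASSETS` list are hypothetical, and the single asset reuses a subset of the fields shown above.

```python
ASSETS = [
    {
        "title": "Baltimore 31, Pit 24",
        "text": "hook up for their third long touchdown, this time on a 76-yarder",
        "metadata": {
            "League": ["Professional"],
            "Teams": ["Ravens", "Steelers"],
            "Score": ["Bal 31, Pit 24"],
            "Event": ["Touchdown"],
        },
    },
]


def keyword_search(assets, query):
    """Return titles of assets whose text contains every query word."""
    words = query.lower().split()
    return [a["title"] for a in assets
            if all(word in a["text"].lower() for word in words)]


def attribute_search(assets, **criteria):
    """Return titles of assets whose metadata holds each requested value."""
    return [a["title"] for a in assets
            if all(value in a["metadata"].get(field_name, [])
                   for field_name, value in criteria.items())]
```

A keyword query for "football touchdown" misses the asset because the word "football" never appears in its text, while `attribute_search(ASSETS, Teams="Steelers", Event="Touchdown")` finds it through the catalogued fields.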
HP 72
Taalee's Semantic Search
Highly customizable, precise and freshest A/V search
Context and Domain Specific Attributes
Uniform Metadata for Content from Multiple Sources; can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY amp CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content: Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries & Hot Topics
PERSONALIZATION
Personalized Queries
1. My Stock Portfolio: Microsoft suffers serious hack attack; Cisco Systems Inc; Analyst Safa Rashtchy on Yahoo; PeopleSoft Inc; AT&T Corp; more…
2. My Football Fantasy Team: Gators' Spurrier ready for big game; Tech's Vick looks to become complete QB; Bucs excited about Hamilton; Jasper Sanks rumbles into the end zone…; Edwards explains reasons for leaving BYU; more…
3. Julia Roberts Collection: Movie Trailer Notting Hill; Trailer - Runaway Bride; Patrick; Movie Trailer Stepmom; Conspiracy Theory; more…
4. Pink Floyd Collection: Set the Controls for the Heart of the Sun…; Wish You Were Here; Round And Around; Keep Talking; The Post War Dream; more…
HOT Topics
1. Election 2000: Video Explaining the electoral map; Race for White House hots up; Seniors Give Gore Florida Edge; more…
2. Middle East Peace Conflict: More die as Israel steps up security; Israel braces for suicide bombs; Pentagon probes Cole's security; more…
3. Napster Controversy: The Brain Behind Napster; Napster Lawsuit; Creative Nomad II; more…
HP 80
Metadata Targeting
Semantic/Interactive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
[Diagram components]
Realtime Feeds
Web sites and Pages
Content Databases
Interests, Preferences
Time-Shifted Content Aggregator
Semantic Engine™
Structured Hi-Quality Semantic Metabase
Personalized Content
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
Screen menu: MyStocks, News, Sports, Music, MyMedia, $
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site.
User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories.
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
Screen menu: MyStocks, News, Sports, Music, MyMedia, $
My Stocks: CSCO, NT, IBM, Market
Clicking on MyStocks brings down user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide).
Search at the bottom is a semantic search that understands the financial domain and the knowledge of user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu.
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
Screen menu: MyStocks, News, Sports, Music, MyMedia, $
My Stocks: CSCO, NT, IBM, Market
CSCO: Analyst Call, Conf Call, Earnings
Different types of recent audio content about Cisco are available.
The user clicks to see a listing of Analyst Calls on Cisco (next slide).
Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
Screen menu: MyStocks, News, Sports, Music, MyMedia, $
My Stocks: CSCO, NT, IBM, Market
CSCO: Analyst Call, Conf Call, Earnings
CSCO Analysis: 11/08 ON24 Payne; 11/07 ON24 H&Q; 11/06 CBS Langlesis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst.
HP 87
iTV: Taalee's Extreme Personalization
Content Provider (DBS, DISH, Wink, AOL-TV)
Semantic Engine™
Meta-Data Tagged Content
Content "Programs"
Immediate Interests, Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-Quality Semantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.
This screen is customizable with interactivity features using metadata, such as whether there is a new Conference Call video on CSCO.
Part of the screen can be automatically customized to show conference call specific information, including transcript, participation, etc., all of which are relevant metadata.
The Conference Call itself can have embedded metadata to support personalization and interactivity.
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support (Sony)
Categorize, Catalog, Integrate
Collection, Processing
Network Content, Affiliate Feeds, Public Sources, Rich Data
Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were
-- Breaking News for 11/30/2000 --
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
Customize Page: Settings | Content | Layout | Color
Video: Sixty Die In Nigeria Blast
Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W. Bush
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction, a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description: Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore
HP 92
Metadata's role in emerging iTV infrastructure
[Diagram components]
Video; MPEG-2/4/7; MPEG Encoder; MPEG Decoder; Enhanced Digital Cable
Create Scene Description Tree; Retrieve Scene Description Track; Scene Description Tree
Node = AVO Object; Object Content Information (OCI); Node "Cisco Systems"
Taalee Semantic Engine; Enhanced XML Description; Metadata-rich Value-added Node "Cisco Systems"
GREAT USER EXPERIENCE
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers and Broadcasters
Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: John Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta
HP 93
Intelligent Metadata Creation
Extractor Agents: Content which does contain the words the user asked for
+ Value-added Metadata: Content which does not contain the words the user asked for, but is about what he asked for
+ Semantic Associations: Content the user did not think to ask for, but which he needs to know
Metadata for Intelligent Content: Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"
Understanding of the content is needed to create new metadata
Taalee understands Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
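The Roger Clemens example can be sketched as a lookup against a small knowledge base. This is a minimal illustration under stated assumptions, not Taalee's implementation: the `KNOWLEDGE_BASE` structure, the relation names `plays_for` and `plays_sport`, and the function `enrich` are all hypothetical.

```python
# Hypothetical knowledge base: (entity, relation) -> related entity.
KNOWLEDGE_BASE = {
    ("Roger Clemens", "plays_sport"): "Baseball",
    ("Roger Clemens", "plays_for"): "New York Yankees",
}

# Which metadata field each relation fills in (an assumption for this sketch).
RELATION_TO_FIELD = {"plays_for": "Teams", "plays_sport": "Sport"}


def enrich(metadata):
    """Return a copy of `metadata` with team and sport inferred from players."""
    enriched = {key: list(values) for key, values in metadata.items()}
    for player in metadata.get("Players", []):
        for relation, field_name in RELATION_TO_FIELD.items():
            value = KNOWLEDGE_BASE.get((player, relation))
            if value and value not in enriched.setdefault(field_name, []):
                enriched[field_name].append(value)
    return enriched
```

A story whose extracted metadata is only `{"Players": ["Roger Clemens"]}` gains `Teams` and `Sport` entries after `enrich`, so a search on the team field can reach it even though the text never names the team.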
HP 96
Guided Demo for Value Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 - Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Posted Date: Date of asset posting - extracted automatically
Producer Name: Name of content provider that produced the asset
Player Names: Names of players mentioned explicitly in the asset - extracted automatically
Team Name: Name of team for which player plays - not mentioned explicitly in asset - value-added using Taalee's semantic relationships
League Name: Name of league to which the player's team belongs - not mentioned explicitly in asset - value-added by Taalee's processing based on semantic associations
Sport: Name of sport
Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships
The asset is richly, fully described in the many ways the users choose to interact
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context or relationships of keywords
For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
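The Commerce One example can be sketched as a tiny store of typed relationships that is consulted when assembling results. This is an assumption-laden illustration, not the Semantic Content Model itself: the class `SemanticModel`, the relation names, and `related_news` are hypothetical, and the punctuation of the industry name is a guess.

```python
from collections import defaultdict


class SemanticModel:
    """Stores (subject, relation) -> set of objects."""

    def __init__(self):
        self._edges = defaultdict(set)

    def add(self, subject, relation, obj, symmetric=False):
        self._edges[(subject, relation)].add(obj)
        if symmetric:
            self._edges[(obj, relation)].add(subject)

    def related(self, subject, relation):
        return sorted(self._edges.get((subject, relation), set()))


model = SemanticModel()
model.add("Commerce One", "participates_in",
          "Corporate, Professional & Financial Software")
model.add("Commerce One", "competes_with", "Ariba", symmetric=True)


def related_news(semantic_model, company, news_by_company):
    """Return news on the company followed by news on its competitors."""
    result = list(news_by_company.get(company, []))
    for rival in semantic_model.related(company, "competes_with"):
        result.extend(news_by_company.get(rival, []))
    return result
```

A keyword index alone would return only the first part of that list; the competitor stories come from following the `competes_with` relationship.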
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata-centric Content Management Architecture
Sources: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters)
Extractor Agent 1, Extractor Agent 2, Extractor Agent 3
Semantic Engine, World Model, Taalee Metabase
XCM-compliant metadata, XML or other format
Semantic Application (ASP/Enterprise hosted)
Third-party Content Mgmt and Syndication
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults Knowledge Base for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story, passed on to Dashboard
Story on Cisco, Story on Lucent
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What you asked for + What you need to know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with competing PRODUCTS; COMPANIES in same or related INDUSTRY
Regulations (SEC, EPA): impacting INDUSTRY or filed by COMPANY
Technology, Products: important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture" [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]
A personal definition: Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)
a proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable Tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-layered Architecture of Semantic Web
Logical Layer: Formal Semantics and Reasoning Support - OIL, DAML-O
Schema Layer: Definition of Vocabulary - RDF Schema
Data Layer: Simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL ndash Evolving towards Semantic Web
OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Test May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake
<= dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30
AND distance(NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude) < 10000
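The rule above can be sketched as an executable predicate. This is a minimal sketch, not the InfoQuilt implementation. It assumes that dateDifference is measured in days, that the earthquake must not precede the test, that distance is a great-circle distance in miles (computed here with the haversine formula), and that events are plain dicts with the attribute names used in the rule.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an assumption of this sketch


def distance(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, a)))


def date_difference(test_date, quake_date):
    """Days from the nuclear test to the earthquake (negative if earlier)."""
    return (quake_date - test_date).days


def could_have_caused(test, quake, max_days=30, max_miles=10000):
    """Apply the 'NuclearTest Causes Earthquake' rule to one pair of events."""
    days = date_difference(test["eventDate"], quake["eventDate"])
    miles = distance(test["latitude"], test["longitude"],
                     quake["latitude"], quake["longitude"])
    return 0 <= days < max_days and miles < max_miles
```

Evaluating the predicate over every (test, earthquake) pair drawn from the two kinds of sources yields the candidate relationships that the discovery queries then examine.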
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example…
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9 and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
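The per-year, per-magnitude-range query above can be sketched with a counter. This is an illustration only: the function names are hypothetical, the input is assumed to be (year, magnitude) pairs, and the treatment of range boundaries (each range includes its lower bound, and the top range starts at 9) is an assumption, since the slide does not say.

```python
from collections import Counter

# Magnitude ranges from the slide; each includes its lower bound (assumption).
BUCKETS = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, float("inf"))]


def bucket_of(magnitude):
    """Return the (low, high) range containing the magnitude, or None."""
    for low, high in BUCKETS:
        if low <= magnitude < high:
            return (low, high)
    return None


def counts_per_year(quakes, start_year=1900):
    """Count earthquakes per (year, magnitude range) from start_year on."""
    counts = Counter()
    for year, magnitude in quakes:
        bucket = bucket_of(magnitude)
        if year >= start_year and bucket is not None:
            counts[(year, bucket)] += 1
    return counts


def average_per_year(counts, bucket, first_year, last_year):
    """Average yearly count for one magnitude range over a span of years."""
    total = sum(counts[(year, bucket)]
                for year in range(first_year, last_year + 1))
    return total / (last_year - first_year + 1)
```

Comparing `average_per_year` for a range before and after a chosen year is one way to check a claim such as the one on the slide about magnitudes above and below 7.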
Knowledge Discovery - Example…
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10,000 miles from the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
Traditional TextCategorization
StatisticalAI Techniques
Classify Place ina taxonomy
feed
Customer Training
Set
RoutingDistribution
Customer Article Feed
4715
Standard Metadata
Feed Source iSyndicate
Posted Date 11202000
Classification of Article 4715
Knowledge-base amp StatisticalAI Techniques
ClassifyPlace ina taxonomy
MetadataCatalog
Content Manager
Precise syndicationfiltering
fd
Article 4715 MetadataFeed Source iSyndicatePosted Date 11202000 Company Name France Telecom
EquantTicker Symbol FTE ENTExchange NYSETopic Company News
Standard metadata
Semantic metadata
FTECompany AnalysisConference Calls
EarningsStock Analysis
NYSEMember Companies
Market NewsIPOs
Automated Content Enrichment (ACE)
Taalee Enterprise Customization Suite
Taaleersquos Categorization amp Automatic Metadata Creation
Taalee Training
Set
Customer Training
Set ee ENTCompany AnalysisConference Calls
EarningsStock Analysis
Classification of Article 4715
Article Feed4715 RoutingDistribution
Map to another taxonomy
HP 49
Automatic Categorization amp Metadata Tagging (unstructured texttranscript of AV)
ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION AS OF TONIGHT THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49 IN WASHINGTON STATE THE SENATE RACE REMAINS TOO CLOSE TO CALL IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN IUMBENT THE SENATE WILL BE EVENLY DIVIDED IN MISSOURI REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN WHO DIED IN A CRASH THREE WEEKS AGO GOVERNOR CARNAHANS WIFE IS EXPECTED TO TAKE HIS PLACE IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT HILLARY CLINTON WON THE NEW YORK SENATE SEAT SHE IS THE FIRST FIRST LADY TO RUN MUCH LESS WIN
Video Segmentwith Associated Text
Segment Description
SemanticMetadata
AutoCategorization
HP 50
Automatic Categorization amp Metadata Tagging (Web page)
Video withEditorialized Text on the Web
AutoCategorization
AutoCategorization
Semantic MetadataSemantic Metadata
HP 51
Automatic Categorization amp Metadata Tagging (Feed)
TextFromBllomberg
AutoCategorization
AutoCategorization
Semantic MetadataSemantic Metadata
HP 52
Taalee Extraction and Knowledgebase Enhancement
Extraction Agent
Web Page Enhanced Metadata Asset
HP 53
Basis for Semantics
A FactsConceptsTermsEntitiesDictionary Thesaurus Reference DataVocabulary
B Facts with RelationshipsTaxonomy(Categories) OntologyDomain Modeling (eg Golf = golfer tournament name golf course event)
Knowledge Base
HP 54
Basis for Semantics
C ReasoningInference(Statistical)(Information Retrieval)Statistical LearningAI (Bayesian Neural Networks HMMhellip)Logic Based (Description Logic)Natural LanguageGrammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methodsCluster Analysis
LearningAI and Collab Filtering
Reference dataConcept-termsDictionaryThesaurusBy topicindustrysubjectdomain
Word or Phrase
OntologiesDomain Models
KnowledgeBaseBy Entities and Relationships
deeperunderstanding
HP 56
Open Directory Project (ODP) ClassificationTaxonomy amp Directory
HP 57
Ontology
Standardize meaning description representation of involved attributes Capture the semantics involved via domain characteristicsAllow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includesAttributesDomain RulesFunctional Dependencies
HP 59
An Ontology
Example Interrelated ontologies
LANDUSE
COMERCIAL
INDUSTRIALRURAL
RESIDENTIAL
AGRICULTURAL
MILITARYRECREATIONAL
LAND(SITE)
CULTIVATEDAREA
GREENLANDAREA LAND
BANK
ZONING
LANDFILLSITE
WASTEDISPOSAL
RECYCLING
HAZARDOUS
LANDFILLRESOURCE REC
SOLID SEWAGE
shredding
magneticseparation
screening
washing
NATURALDISASTER
EARTHQUAKE
causes
LANDSLIDE
VOLCANO
STORMFLOOD
FIRE
AVALANCHE
TSUNAMI
causes
causes
causes
HP 61
Large Vocabularies TaxonomiesOntologies
WordNetThe Medical Subject Headings (MeSH) NLMscontrolled vocabulary used for indexing articles for cataloging books and other holdings and for searching MeSH-indexed databases including MEDLINE MeSHterminology provides a consistent way to retrieve information that may use different terminology for the same concepts Year 2000 MeSH includes more than 19000 main headings 110000 Supplementary Concept Records (formerly Supplementary Chemical Records) and an entry vocabulary of over 300000 terms
Confidential HP
Metadata-enabled Applications
HP 63
Metadata Usage: Impact on Search & Query Processing
traditional queries based on keywords
attribute-based queries
content-based queries
HP 64
Oingo.com
Oingo Ontology (ODP based) - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning used to state intent; the basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely-related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged, and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:
• Context of the Content
• Semantic Metadata describing entities (e.g., Company, Industry), and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
"Normal" Content can only be "found" if the user enters a keyword that exists within it
Intelligent Content = "Normal" Content + related metadata and relationships
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:
Context of the Content
Semantic Metadata describing entities (e.g., Company, Industry), and
Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, ...
HP 70
Metadata Usage: Keyword, Attribute, and Content Based Access
The VisualHarness system at LSDIS/UGA
HP 71
Keyword Search vs. Attribute Search with Semantic Metadata
Virage Search on "football touchdown"
Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline
Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline
Metadata from Typical Cataloging of Football Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31, Pit 24
http://www.nfl.com
Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
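The contrast between keyword search and attribute search on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Asset` class, the two search functions, and the lowercase field names are hypothetical and are not Taalee's or Virage's actual interfaces; the sample records reuse the titles and values shown above.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    title: str
    description: str
    metadata: dict = field(default_factory=dict)  # attribute name -> value or list of values


def keyword_search(assets, *words):
    """Return assets whose title or description contains every word (case-insensitive)."""
    hits = []
    for asset in assets:
        text = f"{asset.title} {asset.description}".lower()
        if all(word.lower() in text for word in words):
            hits.append(asset)
    return hits


def attribute_search(assets, **criteria):
    """Return assets whose metadata matches every attribute=value pair."""
    def matches(asset):
        for name, wanted in criteria.items():
            value = asset.metadata.get(name)
            values = value if isinstance(value, (list, tuple)) else [value]
            if wanted not in values:
                return False
        return True
    return [asset for asset in assets if matches(asset)]


# Sample assets modeled on the slide; the first two carry no semantic metadata.
ASSETS = [
    Asset("Jimmy Smith Interview Part Seven",
          "Jimmy Smith explains his philosophy on showboating"),
    Asset("Brian Griese Interview Part Four",
          "Brian Griese talks about the first touchdown he ever threw"),
    Asset("Baltimore 31 Pit 24",
          "Quandry Ismail and Tony Banks hook up for their third long touchdown",
          {"league": "Professional", "teams": ["Ravens", "Steelers"],
           "players": ["Quandry Ismail", "Tony Banks"], "event": "Touchdown"}),
]

# The word "Steelers" never appears in any title or description,
# so only the attribute search can find the game highlight.
print([a.title for a in keyword_search(ASSETS, "Steelers")])
print([a.title for a in attribute_search(ASSETS, teams="Steelers", event="Touchdown")])
```

The keyword search returns nothing for "Steelers", while the attribute search finds the highlight through its Teams and Event fields, which is the point the slide makes.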
HP 72
Taalee's Semantic Search
Highly customizable, precise, and freshest A/V search
Context and Domain Specific Attributes
Uniform Metadata for Content from Multiple Sources
Can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY amp CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content: Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries amp Hot Topics
PERSONALIZATION
3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
4 Pink Floyd Collection
Personalized Queries
Set the Controls for the Heart of the Sun...
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
2 My Football Fantasy Team
Gators' Spurrier ready for big game
Tech's Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone...
Edwards explains reasons for leaving BYU more...
more...
more...
more...
1 Election 2000
2 Middle East Peace Conflict
3 Napster Controversy
Video Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Cole's security
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II more...
HOT Topics
more...
more...
HP 80
Metadata Targeting
Semantic/Interactive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
[Figure labels: Realtime Feeds; Web sites and Pages; Content Databases; Time-Shifted Content Aggregator; Interests, Preferences; Semantic Engine(TM); Structured Hi-Quality Semantic Metabase; Personalized Content]
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site
User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and the knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., an Analyst Call listing focuses on the source and the analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst
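One way to picture semantic filtering for a small screen is a per-type display profile that names the few metadata fields worth showing. The sketch below is an assumption-laden illustration, not the product's logic: `DISPLAY_FIELDS`, `filter_for_screen`, and `listing` are hypothetical names, and the year and the extra fields in the sample records are invented for the example.

```python
from datetime import date

# Hypothetical display profiles: which metadata fields to show for each content type.
DISPLAY_FIELDS = {
    "Analyst Call": ("date", "source", "analyst"),
    "Earnings": ("date", "source"),
}


def filter_for_screen(item, max_fields=3):
    """Keep only the metadata fields the profile names for this content type."""
    wanted = DISPLAY_FIELDS.get(item["type"], ("date",))[:max_fields]
    return {name: item[name] for name in wanted if name in item}


def listing(items, content_type):
    """Items of one type, newest first, reduced to their display fields."""
    chosen = [item for item in items if item["type"] == content_type]
    chosen.sort(key=lambda item: item["date"], reverse=True)
    return [filter_for_screen(item) for item in chosen]


# Sample records modeled on the slide; the year and extra fields are assumptions.
CALLS = [
    {"type": "Analyst Call", "date": date(2000, 11, 6), "source": "CBS",
     "analyst": "Langlesis", "ticker": "CSCO", "duration_sec": 180},
    {"type": "Analyst Call", "date": date(2000, 11, 8), "source": "ON24",
     "analyst": "Payne", "ticker": "CSCO", "duration_sec": 240},
    {"type": "Analyst Call", "date": date(2000, 11, 7), "source": "ON24",
     "analyst": "H&Q", "ticker": "CSCO", "rating": "Strong Buy"},
]

for row in listing(CALLS, "Analyst Call"):
    print(row["date"].strftime("%m/%d"), row["source"], row["analyst"])
```

Fields such as the rating stay in the full record, so an icon or a detail view can still surface them without crowding the list.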
HP 87
iTV: Taalee's Extreme Personalization
[Figure labels: Content Provider (DBS, DISH, Wink, AOL-TV); Semantic Engine(TM); Meta-Data Tagged Content; Content "Programs"; Immediate Interests, Preferences; Personalized Content Capsules; Redirects and Programming; Structured Hi-Quality Semantic Metabase]
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in
This screen is customizable with an interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO
Part of the screen can be automatically customized to show conference call specific information, including transcript, participation, etc., all of which are relevant metadata
The Conference Call itself can have embedded metadata to support personalization and interactivity
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support, Sony
Categorize
Catalog
Integrate
Collection, Processing
Network Content
Affiliate Feeds
Public Sources
Rich Data Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp. (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore
HP 92
Retrieve Scene Description Track
Enhanced Digital Cable
Video
MPEGDecoder
Node = AVO Object
Create Scene Description Tree
GREAT USER EXPERIENCE
Metadata's role in emerging iTV infrastructure
MPEG-2/4/7
MPEG Encoder
Scene Description Tree
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers, and Broadcasters
Enhanced XML Description
"Cisco Systems" Node
Taalee Semantic Engine
"Cisco Systems"
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta
Object Content Information (OCI)
Metadata-rich Value-added Node
HP 93
Intelligent Metadata Creation
Content which does contain the words the user asked for
Extractor Agents
Content which does not contain the words the user asked for, but is about what he asked for
Value-added Metadata
Content the user did not think to ask for, but which he needs to know
Semantic Associations
Metadata for Intelligent Content
Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"
Understanding of the content is needed to create new metadata
Taalee understands that Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses Semantic Associations of this kind (compare: COMPANY participates in INDUSTRY) to add missing metadata that describes content more completely
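The idea of value-added metadata can be sketched as a lookup against a small knowledge base of facts. Everything named here is an assumption made for illustration: the triple layout, the relation names, `RELATION_TO_FIELD`, and `enrich` are hypothetical and do not describe Taalee's patented extraction technology.

```python
# Toy knowledge base of (subject, relation, object) facts taken from the slide's example.
KNOWLEDGE_BASE = [
    ("Roger Clemens", "is_a", "PERSON"),
    ("Roger Clemens", "plays_sport", "Baseball"),
    ("Roger Clemens", "plays_for", "New York Yankees"),
]

# Hypothetical mapping from a relation to the metadata field it can fill in.
RELATION_TO_FIELD = {"plays_sport": "sport", "plays_for": "team"}


def enrich(metadata, kb=KNOWLEDGE_BASE):
    """Return a copy of the metadata with fields added from facts about its players."""
    enriched = {name: list(values) for name, values in metadata.items()}
    for player in metadata.get("players", []):
        for subject, relation, obj in kb:
            target = RELATION_TO_FIELD.get(relation)
            if subject == player and target is not None:
                values = enriched.setdefault(target, [])
                if obj not in values:
                    values.append(obj)
    return enriched


# Metadata extracted directly from a story that never mentions the team.
extracted = {"players": ["Roger Clemens"]}
enriched = enrich(extracted)
print(enriched)
print("New York Yankees" in enriched.get("team", []))
```

After enrichment, a search on the team field for "New York Yankees" can match the story even though those words never occur in its text.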
HP 96
Guided Demo for Value Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL
Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League
Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 - Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Producer Name: Name of content provider that produced the asset
Posted Date: Date of asset posting - extracted automatically
Player Names: Names of players mentioned explicitly in the asset - extracted automatically
Team Name: Name of team for which the player plays - not mentioned explicitly in asset - value-added using Taalee's semantic relationships
League Name: Name of league to which the player's team belongs - not mentioned explicitly in asset - value-added by Taalee's processing based on semantic associations
Sport: Name of sport
Legend: an arrow from X to Y means Taalee uses X to add Y as value-added metadata to the asset
Some Metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships
The asset is richly and fully described in the many ways the users choose to interact
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context, or relationships of keywords
For example, a search engine may see that the phrase "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
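A rough sketch of how semantic associations can widen a result set: facts about a company are stored as relations, and the "competes with" relation is followed to pull in stories the user did not ask for. The `KnowledgeBase` class, the relation names, and the story headlines are hypothetical and serve only as an illustration; the facts about Commerce One and Ariba come from the slide.

```python
from collections import defaultdict


class KnowledgeBase:
    """Minimal store of directed (subject, relation) -> objects associations."""

    def __init__(self):
        self._facts = defaultdict(set)

    def add(self, subject, relation, obj):
        self._facts[(subject, relation)].add(obj)

    def related(self, subject, relation):
        return sorted(self._facts.get((subject, relation), ()))


kb = KnowledgeBase()
kb.add("Commerce One", "is_a", "COMPANY")
kb.add("Commerce One", "participates_in", "Corporate, Professional & Financial Software")
kb.add("Commerce One", "competes_with", "Ariba")

# Hypothetical catalog entries, each tagged with the companies it is about.
STORIES = [
    {"headline": "Story on Commerce One", "companies": ["Commerce One"]},
    {"headline": "Story on Ariba", "companies": ["Ariba"]},
    {"headline": "Story on Cisco", "companies": ["Cisco"]},
]


def related_stories(company, kb, stories):
    """Return (stories asked for, stories reached through 'competes_with')."""
    asked = [s for s in stories if company in s["companies"]]
    competitors = kb.related(company, "competes_with")
    associated = [s for s in stories
                  if any(c in s["companies"] for c in competitors)]
    return asked, associated


asked, associated = related_stories("Commerce One", kb, STORIES)
print([s["headline"] for s in asked])
print([s["headline"] for s in associated])
```

A keyword index alone would return only the first list; the second list exists because the relationship is stored as data rather than inferred from shared words.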
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata-centric Content Management Architecture
Sources: Internal Source 1 (Research), Internal Source 2, External feeds/Web (e.g., Reuters)
Extractor Agent 1, Extractor Agent 2, Extractor Agent 3
Semantic Engine, World Model, Taalee Metabase
XCM-compliant metadata, XML, or other format
Semantic Application (ASP/Enterprise hosted)
Third-party Content Mgmt and Syndication
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults KnowledgeBase for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story, passed on to Dashboard
Story on Cisco, Story on Lucent
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What You Need to Know
COMPANY
Related Stock News
Industry News
Competition
COMPANIES in INDUSTRY with Competing PRODUCTS
COMPANIES in Same or Related INDUSTRY
SEC, EPA
Regulations
Impacting INDUSTRY or Filed By COMPANY
Technology Products
Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling is obviously not enough
we need a "logic layer" on top of RDF
some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]
A personal definition
Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-layered Architecture of Semantic Web
Logical Layer
Formal Semantics and Reasoning Support - OIL, DAML-O
Schema Layer
Definition of Vocabulary - RDF Schema
Data Layer
Simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL - Evolving towards Semantic Web
OIL Mission
OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Test May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
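The rule above can be written out as a small Python predicate. This is a sketch under stated assumptions, not InfoQuilt's implementation: `Event`, `date_difference`, `distance`, and `may_have_caused` are names chosen for the example, the distance is taken to be great-circle distance in miles (computed with the haversine formula), and the check that the earthquake does not precede the test is added from the prose statement of the rule.

```python
import math
from dataclasses import dataclass
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


@dataclass
class Event:
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees


def date_difference(test, quake):
    """Days from the nuclear test to the earthquake (negative if the quake came first)."""
    return (quake.event_date - test.event_date).days


def distance(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles via the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """Apply the rule: quake within max_days after the test and within max_miles of it."""
    days = date_difference(test, quake)
    miles = distance(test.latitude, test.longitude, quake.latitude, quake.longitude)
    return 0 <= days < max_days and miles < max_miles


def candidate_pairs(tests, quakes):
    """All (test, quake) pairs that satisfy the rule."""
    return [(t, q) for t in tests for q in quakes if may_have_caused(t, q)]
```

The thresholds are parameters so that tighter definitions of "some time after" and "nearby" can be tried without rewriting the rule.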
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year, starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example...
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale per year, starting from 1900, find the average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
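The two aggregate queries described here (yearly totals, and yearly averages per magnitude range) can be sketched as follows. The function names, the input layout of `(year, magnitude)` pairs, and the choice to treat each range as half-open are assumptions made for the illustration; no real earthquake data is included.

```python
from collections import Counter

# Magnitude ranges from the slide; treating each as half-open [low, high) is an assumption.
BANDS = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, float("inf"))]


def band_of(magnitude):
    """Return the (low, high) range containing the magnitude, or None if below 5.8."""
    for low, high in BANDS:
        if low <= magnitude < high:
            return (low, high)
    return None


def yearly_totals(quakes, since=1900, min_magnitude=5.8):
    """Count earthquakes of at least min_magnitude per year, starting from `since`."""
    return Counter(year for year, magnitude in quakes
                   if year >= since and magnitude >= min_magnitude)


def average_per_year_by_band(quakes, first_year, last_year):
    """Average yearly earthquake count in each range over [first_year, last_year]."""
    n_years = last_year - first_year + 1
    totals = Counter()
    for year, magnitude in quakes:
        band = band_of(magnitude)
        if band is not None and first_year <= year <= last_year:
            totals[band] += 1
    return {band: totals[band] / n_years for band in BANDS}
```

Comparing `average_per_year_by_band` over the years before and after a chosen cutoff, range by range, is one way to reproduce the kind of observation the slide draws about magnitudes above and below 7.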
Knowledge Discovery - Example...
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/9910-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
Knowledge-base & Statistical/AI Techniques
Classify/Place in a taxonomy
Metadata Catalog
Content Manager
Precise syndication/filtering
Article 4715 Metadata
Feed Source: iSyndicate
Posted Date: 11/20/2000
Company Name: France Telecom, Equant
Ticker Symbol: FTE, ENT
Exchange: NYSE
Topic: Company News
Standard metadata
Semantic metadata
FTE: Company Analysis, Conference Calls, Earnings, Stock Analysis
NYSE: Member Companies, Market News, IPOs
Automated Content Enrichment (ACE)
Taalee Enterprise Customization Suite
Taalee's Categorization & Automatic Metadata Creation
Taalee Training Set
Customer Training Set
ENT: Company Analysis, Conference Calls, Earnings, Stock Analysis
Classification of Article 4715
Article Feed 4715: Routing/Distribution
Map to another taxonomy
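The step from standard metadata to taxonomy placement, and then to a customer's own taxonomy, can be sketched as two small functions. The field names, `classify`, `map_taxonomy`, and the example mapping are hypothetical; the category lists and the Article 4715 values are the ones shown on this slide.

```python
# Category lists shown on the slide for a ticker node and an exchange node.
TICKER_CATEGORIES = ["Company Analysis", "Conference Calls", "Earnings", "Stock Analysis"]
EXCHANGE_CATEGORIES = ["Member Companies", "Market News", "IPOs"]

# Standard metadata for the sample article, using assumed field names.
ARTICLE_4715 = {
    "feed_source": "iSyndicate",
    "posted_date": "11/20/2000",
    "company_names": ["France Telecom", "Equant"],
    "ticker_symbols": ["FTE", "ENT"],
    "exchange": "NYSE",
    "topic": "Company News",
}


def classify(standard):
    """Place an article under every taxonomy node implied by its tickers and exchange."""
    placements = {ticker: list(TICKER_CATEGORIES)
                  for ticker in standard.get("ticker_symbols", [])}
    if standard.get("exchange"):
        placements[standard["exchange"]] = list(EXCHANGE_CATEGORIES)
    return placements


def map_taxonomy(placements, mapping):
    """Rename categories into another taxonomy; unmapped categories pass through."""
    return {node: [mapping.get(category, category) for category in categories]
            for node, categories in placements.items()}


placements = classify(ARTICLE_4715)
# Hypothetical customer taxonomy that uses a different label for one category.
customer_view = map_taxonomy(placements, {"IPOs": "New Listings"})
print(sorted(placements))
print(customer_view["NYSE"])
```

Keeping the mapping as data means the same classified article can be routed to several customers, each with its own category labels.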
HP 49
Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN.
Video Segment with Associated Text
Segment Description
Semantic Metadata
Auto Categorization
HP 50
Automatic Categorization & Metadata Tagging (Web page)
Video with Editorialized Text on the Web
Auto Categorization
Semantic Metadata
HP 51
Automatic Categorization & Metadata Tagging (Feed)
Text from Bloomberg
Auto Categorization
Semantic Metadata
HP 52
Taalee Extraction and Knowledgebase Enhancement
Extraction Agent
Web Page Enhanced Metadata Asset
HP 53
Basis for Semantics
A FactsConceptsTermsEntitiesDictionary Thesaurus Reference DataVocabulary
B Facts with RelationshipsTaxonomy(Categories) OntologyDomain Modeling (eg Golf = golfer tournament name golf course event)
Knowledge Base
HP 54
Basis for Semantics
C ReasoningInference(Statistical)(Information Retrieval)Statistical LearningAI (Bayesian Neural Networks HMMhellip)Logic Based (Description Logic)Natural LanguageGrammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methodsCluster Analysis
LearningAI and Collab Filtering
Reference dataConcept-termsDictionaryThesaurusBy topicindustrysubjectdomain
Word or Phrase
OntologiesDomain Models
KnowledgeBaseBy Entities and Relationships
deeperunderstanding
HP 56
Open Directory Project (ODP) ClassificationTaxonomy amp Directory
HP 57
Ontology
Standardize meaning description representation of involved attributes Capture the semantics involved via domain characteristicsAllow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includesAttributesDomain RulesFunctional Dependencies
HP 59
An Ontology
Example Interrelated ontologies
LANDUSE
COMERCIAL
INDUSTRIALRURAL
RESIDENTIAL
AGRICULTURAL
MILITARYRECREATIONAL
LAND(SITE)
CULTIVATEDAREA
GREENLANDAREA LAND
BANK
ZONING
LANDFILLSITE
WASTEDISPOSAL
RECYCLING
HAZARDOUS
LANDFILLRESOURCE REC
SOLID SEWAGE
shredding
magneticseparation
screening
washing
NATURALDISASTER
EARTHQUAKE
causes
LANDSLIDE
VOLCANO
STORMFLOOD
FIRE
AVALANCHE
TSUNAMI
causes
causes
causes
HP 61
Large Vocabularies TaxonomiesOntologies
WordNetThe Medical Subject Headings (MeSH) NLMscontrolled vocabulary used for indexing articles for cataloging books and other holdings and for searching MeSH-indexed databases including MEDLINE MeSHterminology provides a consistent way to retrieve information that may use different terminology for the same concepts Year 2000 MeSH includes more than 19000 main headings 110000 Supplementary Concept Records (formerly Supplementary Chemical Records) and an entry vocabulary of over 300000 terms
Confidential HP
Metadata enabledApplications
HP 63
Metadata Usage Metadata Usage Impact on Search amp Query processing Impact on Search amp Query processing
traditional queries based on keywordsattribute based queriescontent-based queries
HP 64
Oingocom
Oingo Ontology ndash ODP based() the database of millions of concepts and relationships that powers Oingossemantic technologyOingo Seek - the database of millions of concepts and relationships that powers Oingos semantic technologyOingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and contextOingo Lingua - the language of meaning used to state intent The basis for intelligent interactionAssets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely-related high-value information beyond what
was requested
Ability to explore any dimension around the immediate
point of interest Intelligent content helps the user
ldquothinkrdquo about and fulfill their information needs with less effort
Intelligent content can bemore effectively managed packaged and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more ldquointelligentrdquo through automatic analysis of every
individual asset to generate a catalog containing
bull Context of the Content
bull Semantic Metadata describing entities (ie Company Industry etc) and
bull Relationships (semantic associations) among all entities
Based on a ldquoSemanticrdquo or ldquodomainrdquo model describing how the user thinks about the subject matter supported by a knowledgebase
ldquoNormalrdquo Content can only be ldquofoundrdquo if the user enters a keyword that exists within it
Intelligent Content=+
Adding related metadata and relationshipsdramatically increases the ability to
automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more ldquointelligentrdquo through automatic analysis of every individual content item to create
Context of the ContentSemantic Metadata describing entities (ie Company Industryetc) andRelationships (semantic associations) among all entities
Based on a ldquoSemanticrdquo or ldquodomainrdquo model describing how the user thinks about the subject matter supported by a knowledgebase
HP 69
Metadata amp Search
Metadata can improve search significantly but metadata enables much more than searchAlternatives for improving search clustering link and other analysis (eg Googlersquos Link Flux analysis) classification as context ontologies metadata knowledgebases hellip
HP 70
Metadata Usage: Keyword, Attribute, and Content Based Access
The VisualHarness system at LSDIS/UGA
HP 71
Keyword Search vs Attribute Search with Semantic Metadata
Virage: search on "football touchdown"
Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: httpcbssportsline
Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: httpcbssportsline
Metadata from Typical Cataloging of Football Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31, Pit 24
http://www.nfl.com
Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Qadry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
HP 72
Taalee's Semantic Search
Highly customizable, precise, and freshest A/V search
Context and Domain Specific Attributes; Uniform Metadata for Content from Multiple Sources; Can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY & CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content: Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries & Hot Topics
PERSONALIZATION
Personalized Queries
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
more...
2 My Football Fantasy Team
Gators' Spurrier ready for big game
Tech's Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone...
Edwards explains reasons for leaving BYU
more...
3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
more...
4 Pink Floyd Collection
Set the Controls for the Heart of the Sun...
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
more...
HOT Topics
1 Election 2000
Video: Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
more...
2 Middle East Peace Conflict
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Cole's security
more...
3 Napster Controversy
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II
more...
HP 80
Metadata Targeting
Semantic/Interactive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web: Extreme Personalization
Realtime Feeds
Interests, Preferences
Time-Shifted Content Aggregator
Web sites and Pages
Content Databases
Personalized Content
Semantic Engine(TM)
Personalized Content
Content
Structured Hi-Quality Semantic Metabase
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site
User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata such as "Strong Buy" by H&Q Analyst
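The semantic filtering described above can be sketched as a lookup of per-content-type field priorities. This is only an illustration under stated assumptions, not Taalee's implementation: the priority table, the field names and the `max_fields` screen limit are all hypothetical.

```python
# Hypothetical priority table: which metadata fields matter most per content type.
FIELD_PRIORITY = {
    "Analyst Call": ["date", "source", "analyst", "rating"],
    "Earnings": ["date", "company", "quarter", "source"],
}


def filter_metadata(asset, max_fields):
    """Return at most max_fields (name, value) pairs, most important first."""
    # Unknown content types fall back to alphabetical field order.
    priority = FIELD_PRIORITY.get(asset["type"], sorted(asset["metadata"]))
    chosen = [(name, asset["metadata"][name])
              for name in priority if name in asset["metadata"]]
    return chosen[:max_fields]


# Illustrative asset modelled on the listing above; "ticker" is an assumed field.
asset = {
    "type": "Analyst Call",
    "metadata": {"date": "11/07", "source": "ON24", "analyst": "H&Q",
                 "rating": "Strong Buy", "ticker": "CSCO"},
}
print(filter_metadata(asset, 3))
```

With a three-field budget the small screen shows date, source and analyst; a larger budget would also surface the rating.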
HP 87
iTV: Taalee's Extreme Personalization
Content Provider
(DBS, DISH, Wink, AOL-TV)
Semantic Engine(TM)
Metadata-Tagged Content
Content "Programs"
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-Quality Semantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in
This screen is customizable with an interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO
Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata
The Conference Call itself can have embedded metadata to support personalization and interactivity
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support
Sony
Categorize
Catalog
Integrate
Collection
Processing
Network Content
Affiliate Feeds
Public Sources
Rich Data
Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were
-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
Sixty Die In Nigeria Blast
Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction, a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore
HP 92
Metadata's role in emerging iTV infrastructure
Enhanced Digital Cable
Video
MPEG-2/4/7
MPEG Encoder
MPEG Decoder
Create Scene Description Tree
Retrieve Scene Description Track
Scene Description Tree
Node = AVO Object
Object Content Information (OCI)
Enhanced XML Description
Node: "Cisco Systems"
Taalee Semantic Engine
Metadata-rich Value-added Node: "Cisco Systems"
Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: John Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta
GREAT USER EXPERIENCE
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers and Broadcasters
HP 93
Intelligent Metadata Creation
Content which does contain the words the user asked for
Extractor Agents
+
Content which does not contain the words the user asked for but is about what he asked for
Value-added Metadata
+
Content the user did not think to ask for but which he needs to know
Semantic Associations
Metadata for Intelligent Content
Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"
Understanding of the content is needed to create new metadata
Taalee understands Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
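The Roger Clemens example above can be sketched in a few lines. This is a toy illustration, not Taalee's engine: the `KNOWLEDGE_BASE` dictionary, the relation names (`plays_sport`, `plays_for`) and the helper functions are all assumptions made for the sketch.

```python
# Toy knowledge base of semantic associations: (entity, relation) -> value.
# The relation names are hypothetical labels, not Taalee's vocabulary.
KNOWLEDGE_BASE = {
    ("Roger Clemens", "is_a"): "PERSON",
    ("Roger Clemens", "plays_sport"): "Baseball",
    ("Roger Clemens", "plays_for"): "New York Yankees",
}

# Which relation fills which metadata field.
RELATION_TO_FIELD = {"plays_sport": "Sport", "plays_for": "Teams"}


def add_value_added_metadata(metadata):
    """Return a copy of metadata with fields inferred from the knowledge base."""
    enriched = {field: list(values) for field, values in metadata.items()}
    for player in metadata.get("Players", []):
        for relation, field in RELATION_TO_FIELD.items():
            value = KNOWLEDGE_BASE.get((player, relation))
            if value and value not in enriched.setdefault(field, []):
                enriched[field].append(value)
    return enriched


def matches(metadata, term):
    """Case-insensitive search over all metadata values."""
    return any(term.lower() in value.lower()
               for values in metadata.values() for value in values)


story = {"Players": ["Roger Clemens"]}          # extracted from the asset itself
enriched = add_value_added_metadata(story)
print(matches(story, "Yankees"), matches(enriched, "Yankees"))
```

A search for "Yankees" misses the original metadata but finds the enriched record, which is the effect the slide describes.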
HP 96
Guided Demo for Value Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 - Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Producer Name: name of content provider that produced the asset
Posted Date: date of asset posting - extracted automatically
Player Names: names of players mentioned explicitly in the asset - extracted automatically
Team Name: name of team for which player plays - not mentioned explicitly in asset - value-added using Taalee's semantic relationships
League Name: name of league to which the player's team belongs - not mentioned explicitly in asset - value-added by Taalee's processing based on semantic associations
Sport: name of sport
Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships
The asset is richly and fully described in the many ways the users choose to interact
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context, or relationships of keywords
For example, a search engine may see that the phrase "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
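The Commerce One example above can be sketched as a small set of (subject, relation, object) associations plus a lookup that returns both the requested content and content about competitors. This is a hypothetical sketch: the relation names, the `SYMMETRIC` set and the placeholder catalog titles are assumptions, not Taalee's data model.

```python
# Semantic associations from the slide, stored as (subject, relation, object).
ASSOCIATIONS = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Commerce One", "participates_in",
     "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]
SYMMETRIC = {"competes_with"}  # assumed: competition holds in both directions

# Placeholder catalog; the titles are illustrative, not real stories.
CATALOG = [
    {"title": "Story on Commerce One", "companies": ["Commerce One"]},
    {"title": "Story on Ariba", "companies": ["Ariba"]},
    {"title": "Story on Cisco", "companies": ["Cisco"]},
]


def related(entity, relation):
    """Entities linked to `entity` by `relation`, honouring symmetric relations."""
    found = []
    for subject, rel, obj in ASSOCIATIONS:
        if rel != relation:
            continue
        if subject == entity:
            found.append(obj)
        elif obj == entity and rel in SYMMETRIC:
            found.append(subject)
    return found


def associated_content(company, catalog):
    """What the user asked for, plus news on the company's competitors."""
    competitors = related(company, "competes_with")
    return {
        "asked_for": [a["title"] for a in catalog if company in a["companies"]],
        "competition": [a["title"] for a in catalog
                        if any(c in a["companies"] for c in competitors)],
    }


print(associated_content("Commerce One", CATALOG))
```

A keyword index alone could only fill the `asked_for` list; the `competition` list exists because the association is stored explicitly.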
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata-centric Content Management Architecture
Internal Source 1: Research
Internal Source 2
External feeds/Web (e.g., Reuters)
Extractor Agent 1
Extractor Agent 2
Extractor Agent 3
Semantic Engine
World Model
Taalee Metabase
Semantic Application
ASP/Enterprise hosted
Third-party Content Mgmt and Syndication
XCM-compliant metadata, XML or other format
1 Cisco story from PW Source 1 passed on to add semantic associations
2 Consults KnowledgeBase for Cisco's competition
3 Returns result: Lucent is a competitor of Cisco
4 Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard
Story on Lucent
Story on Cisco
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY
Technology, Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for the future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]
A personal definition
Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-layered Architecture of Semantic Web
Logical Layer
Formal Semantics and Reasoning Support - OIL, DAML-O
Schema Layer
Definition of Vocabulary - RDF Schema
Data Layer
Simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL - Evolving towards Semantic Web
OIL Mission
OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Test May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
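The rule above can be evaluated directly once the two helper functions are pinned down. The sketch below is one possible reading, under explicit assumptions: `dateDifference` is taken as days from test to earthquake (and must be non-negative, since the earthquake has to follow the test), `distance` is great-circle distance in miles via the haversine formula, and the `Event` class and sample coordinates are invented for illustration.

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


@dataclass
class Event:
    """Hypothetical record shared by nuclear tests and earthquakes."""
    event_date: date
    latitude: float
    longitude: float


def date_difference(test, quake):
    """Days from the nuclear test to the earthquake (negative if quake came first)."""
    return (quake.event_date - test.event_date).days


def distance(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles (haversine formula)."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """NuclearTest Causes Earthquake, with the thresholds from the rule."""
    days = date_difference(test, quake)
    miles = distance(test.latitude, test.longitude,
                     quake.latitude, quake.longitude)
    return 0 <= days < max_days and miles < max_miles


# Made-up events, only to exercise the rule.
test = Event(date(2000, 1, 1), 0.0, 0.0)
quake_after = Event(date(2000, 1, 10), 0.0, 1.0)
quake_before = Event(date(1999, 12, 25), 0.0, 1.0)
print(may_have_caused(test, quake_after), may_have_caused(test, quake_before))
```

Both thresholds are parameters, so a narrower definition of "nearby region" can be tried without touching the rule itself.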
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example...
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale per year starting from 1900, find the average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
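The grouping query above amounts to binning earthquakes by magnitude range and averaging the yearly counts. The following is a minimal sketch with invented sample data; the `(year, magnitude)` tuple format and the handling of bin boundaries (each range includes its lower edge, so the last bin is in effect 9 and above) are assumptions.

```python
from bisect import bisect_right
from collections import Counter

# Lower edges of the magnitude ranges named in the query.
EDGES = [5.8, 6.0, 7.0, 8.0, 9.0]
LABELS = ["5.8-6", "6-7", "7-8", "8-9", ">9"]


def magnitude_bin(magnitude):
    """Label of the range containing magnitude, or None if below 5.8."""
    index = bisect_right(EDGES, magnitude) - 1
    return LABELS[index] if index >= 0 else None


def average_per_year(quakes, first_year, last_year):
    """Average yearly earthquake count per magnitude range.

    quakes is an iterable of (year, magnitude) pairs; years with no
    earthquakes in a range count as zero.
    """
    counts = Counter()
    for year, magnitude in quakes:
        label = magnitude_bin(magnitude)
        if label is not None and first_year <= year <= last_year:
            counts[label] += 1
    n_years = last_year - first_year + 1
    return {label: counts[label] / n_years for label in LABELS}


# Invented records, not real seismic data.
sample = [(1900, 5.9), (1900, 6.5), (1901, 6.1), (1901, 7.2), (1901, 4.0)]
print(average_per_year(sample, 1900, 1901))
```

Running the same function over the years before and after a chosen cut-off would give the comparison the slide draws between magnitude ranges.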
Knowledge Discovery - Example...
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 49
Automatic Categorization & Metadata Tagging (unstructured text/transcript of A/V)
ABSOLUTE CONTROL OF THE SENATE IS STILL IN QUESTION. AS OF TONIGHT, THE REPUBLICANS HAVE 50 SENATE SEATS AND THE DEMOCRATS 49. IN WASHINGTON STATE, THE SENATE RACE REMAINS TOO CLOSE TO CALL. IF THE DEMOCRATIC CHALLENGER UNSEATS THE REPUBLICAN INCUMBENT, THE SENATE WILL BE EVENLY DIVIDED. IN MISSOURI, REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN, WHO DIED IN A CRASH THREE WEEKS AGO. GOVERNOR CARNAHAN'S WIFE IS EXPECTED TO TAKE HIS PLACE. IN THE HIGHEST PROFILE SENATE EVENT OF THE NIGHT, HILLARY CLINTON WON THE NEW YORK SENATE SEAT. SHE IS THE FIRST FIRST LADY TO RUN, MUCH LESS WIN
Video Segment with Associated Text
Segment Description
Semantic Metadata
Auto Categorization
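A very small sketch of the tagging step shown on this slide: entities are spotted against a reference dictionary and the category is picked by counting category-specific terms. This is a simplified, dictionary-based stand-in chosen for illustration; the slides do not specify Taalee's actual extraction method, and the dictionaries and field names below are assumptions.

```python
import re

# Hypothetical reference data: known entity names and the metadata field they fill.
ENTITY_DICTIONARY = {
    "john ashcroft": "People",
    "mel carnahan": "People",
    "hillary clinton": "People",
    "missouri": "Location",
    "new york": "Location",
}

# Hypothetical category vocabularies.
CATEGORY_TERMS = {
    "Politics": {"senate", "republican", "democratic", "governor"},
    "Sports": {"touchdown", "quarterback"},
}


def tag(text):
    """Return (category, metadata) for a piece of transcript text."""
    lowered = text.lower()
    metadata = {}
    for name, field in ENTITY_DICTIONARY.items():
        if name in lowered:
            metadata.setdefault(field, []).append(name.title())
    words = set(re.findall(r"[a-z]+", lowered))
    scores = {category: len(terms & words)
              for category, terms in CATEGORY_TERMS.items()}
    best = max(scores, key=scores.get)
    return (best if scores[best] > 0 else None), metadata


snippet = ("IN MISSOURI REPUBLICAN SENATOR JOHN ASHCROFT SAYS HE WILL NOT "
           "CHALLENGE HIS LOSS TO GOVERNOR MEL CARNAHAN")
category, metadata = tag(snippet)
print(category, metadata)
```

Plain substring matching is fragile (it would also match names inside longer words), which is one reason the slides go on to discuss ontologies and knowledge bases as a basis for extraction.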
HP 50
Automatic Categorization & Metadata Tagging (Web page)
Video with Editorialized Text on the Web
Auto Categorization
Semantic Metadata
HP 51
Automatic Categorization & Metadata Tagging (Feed)
Text From Bloomberg
Auto Categorization
Semantic Metadata
HP 52
Taalee Extraction and Knowledgebase Enhancement
Extraction Agent
Web Page Enhanced Metadata Asset
HP 53
Basis for Semantics
A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary
B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)
Knowledge Base
HP 54
Basis for Semantics
C. Reasoning/Inference
(Statistical) (Information Retrieval)
Statistical Learning/AI (Bayesian, Neural Networks, HMM, ...)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methods: Cluster Analysis
Learning/AI and Collab. Filtering
Reference data/Concept-terms/Dictionary/Thesaurus: by topic/industry/subject/domain
Word or Phrase
Ontologies/Domain Models
KnowledgeBase: by Entities and Relationships
deeper understanding
HP 56
Open Directory Project (ODP): Classification/Taxonomy & Directory
HP 57
Ontology
Standardize meaning, description, and representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes:
Attributes
Domain Rules
Functional Dependencies
HP 59
An Ontology
Example Interrelated ontologies
LANDUSE
COMMERCIAL
INDUSTRIAL
RURAL
RESIDENTIAL
AGRICULTURAL
MILITARY
RECREATIONAL
LAND (SITE)
CULTIVATED AREA
GREENLAND AREA
LAND BANK
ZONING
LANDFILL SITE
WASTE DISPOSAL
RECYCLING
HAZARDOUS
LANDFILL
RESOURCE REC
SOLID
SEWAGE
shredding
magnetic separation
screening
washing
NATURAL DISASTER
EARTHQUAKE
causes
LANDSLIDE
VOLCANO
STORM
FLOOD
FIRE
AVALANCHE
TSUNAMI
causes
causes
causes
HP 61
Large Vocabularies, Taxonomies, Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary, used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms
Confidential HP
Metadata-enabled Applications
HP 63
Metadata Usage: Impact on Search & Query processing
traditional queries based on keywords
attribute based queries
content-based queries
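The three query styles listed above can be contrasted on a toy catalog. This is an illustrative sketch only: the asset records, the three-number `features` vectors standing in for content descriptors, and the function names are all assumptions.

```python
import math

# Toy catalog: free text, attribute metadata, and an assumed content feature
# vector (e.g., something derived from the media itself).
ASSETS = [
    {"id": 1, "text": "Ismail and Banks hook up for a long touchdown",
     "metadata": {"League": "NFL", "Event": "Touchdown"},
     "features": [0.9, 0.1, 0.0]},
    {"id": 2, "text": "Griese talks about the first touchdown he ever threw",
     "metadata": {"League": "NFL", "Event": "Interview"},
     "features": [0.1, 0.8, 0.1]},
]


def keyword_query(assets, word):
    """Traditional query: the word must literally appear in the text."""
    return [a["id"] for a in assets if word.lower() in a["text"].lower()]


def attribute_query(assets, **attributes):
    """Attribute query: every named metadata field must match exactly."""
    return [a["id"] for a in assets
            if all(a["metadata"].get(k) == v for k, v in attributes.items())]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def content_query(assets, example, top_k=1):
    """Content-based query: rank assets by similarity to an example vector."""
    ranked = sorted(assets, key=lambda a: cosine(a["features"], example),
                    reverse=True)
    return [a["id"] for a in ranked[:top_k]]


print(keyword_query(ASSETS, "touchdown"))          # both assets mention the word
print(attribute_query(ASSETS, Event="Touchdown"))  # only the actual touchdown clip
print(content_query(ASSETS, [1.0, 0.0, 0.0]))      # nearest by content features
```

The keyword query returns the interview as well as the play, while the attribute query isolates the asset whose Event really is a touchdown.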
HP 64
Oingo.com
Oingo Ontology - ODP based(?): the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning used to state intent; the basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely-related high-value information beyond what
was requested
Ability to explore any dimension around the immediate
point of interest Intelligent content helps the user
ldquothinkrdquo about and fulfill their information needs with less effort
Intelligent content can bemore effectively managed packaged and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more ldquointelligentrdquo through automatic analysis of every
individual asset to generate a catalog containing
bull Context of the Content
bull Semantic Metadata describing entities (ie Company Industry etc) and
bull Relationships (semantic associations) among all entities
Based on a ldquoSemanticrdquo or ldquodomainrdquo model describing how the user thinks about the subject matter supported by a knowledgebase
ldquoNormalrdquo Content can only be ldquofoundrdquo if the user enters a keyword that exists within it
Intelligent Content=+
Adding related metadata and relationshipsdramatically increases the ability to
automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more ldquointelligentrdquo through automatic analysis of every individual content item to create
Context of the ContentSemantic Metadata describing entities (ie Company Industryetc) andRelationships (semantic associations) among all entities
Based on a ldquoSemanticrdquo or ldquodomainrdquo model describing how the user thinks about the subject matter supported by a knowledgebase
HP 69
Metadata amp Search
Metadata can improve search significantly but metadata enables much more than searchAlternatives for improving search clustering link and other analysis (eg Googlersquos Link Flux analysis) classification as context ontologies metadata knowledgebases hellip
HP 70
Metadata Usage Keyword Attribute and Content Based Access
The VisualHarness system at LSDISUGA
HP 71
Keyword Search vs Attribute Search with Semantic metadata
Virage Search on football touchdown
Jimmy Smith Interview Part SevenJimmy Smith explains his philosophy on showboating URL httpcbssportsline
Brian Griese Interview Part FourBrian Griese talks about the first touchdown he ever threw URL httpcbssportsline
Metadata from Typical Cataloging of Football
Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31 Pit 24
httpwwwnflcom
Quandry Ismail and Tony Banks hook up for their third long touchdown this time on a 76-yarder to extend the Ravenrsquos lead to 31-24 in the third quarter
ProfessionalRavens SteelersBal 31 Pit 24Quandry Ismail Tony BanksTouchdownNFLcom2022000
LeagueTeamsScore
PlayersEvent
Produced byPosted date
HP 72
Taaleersquos Semantic Search
Highly customizable precise and freshest AV search
Context and Domain Specific Attributes Uniform Metadata for Content from Multiple Sources Can be sorted by any field
Delightful relevant informationexceptional targeting opportunity
HP 73
Cre
atin
g a
Web
of
rela
ted
info
rmat
ion
Wha
t can
a c
onte
xt d
o
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY amp CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh contentPersonalization andTargetinginteractive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries amp Hot Topics
PERSONALIZATION
3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
4 Pink Floyd Collection
Personalized Queries
Set the Controls for the Heart of the Sunhellip
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
ATampT Corp
2 My Football Fantasy Team
Gators Spurrier ready for big game
Techs Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zonehellip
Edwards explains reasons for leaving BYU morehellip
morehellip
morehellip
morehellip
1 Election 2000
2 Middle East Peace Conflict
3 Napster Controversy
Video Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Coles security
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II morehellip
HOT Topics
morehellip
morehellip
HP 80
Metadata Targeting
SemanticInteractive Targeting
Buy Al Pacino VideosBuy Russell Crowe VideosBuy Christopher Plummer VideosBuy Diane Venora VideosBuy Philip Baker Hall VideosBuy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
Realtime Feeds
Interests Preferences
Time-Shifted Content Aggregator
Web sites and Pages
Content Databases
Personalized Content
Semantic Engine™
Structured Hi-Quality Semantic Metabase
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site
User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and the knowledge of the user's portfolio. Typically search can be done by typing one word or selecting from a dynamic personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata such as "Strong Buy" by H&Q Analyst
HP 87
iTV: Taalee's Extreme Personalization
Content Provider
(DBS, DISH, Wink, AOL-TV)
Semantic Engine™
Meta-Data Tagged Content
Content "Programs"
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-Quality Semantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in
This screen is customizable with an interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO
Part of the screen can be automatically customized to show conference call specific information - including transcript, participation, etc., all of which are relevant metadata
Conference Call itself can have embedded metadata to support personalization and interactivity
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support
Sony
Categorize
Catalog
Integrate
Collection
Processing
Network Content
Affiliate Feeds
Public Sources
Rich Data
Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore
HP 92
Metadata's role in emerging iTV infrastructure
Enhanced Digital Cable
Video
MPEG Encoder (MPEG-2/4/7)
MPEG Decoder
Create Scene Description Tree
Retrieve Scene Description Track
Scene Description Tree (Node = AVO Object)
Object Content Information (OCI)
Enhanced XML Description
Node: "Cisco Systems"
Taalee Semantic Engine
Metadata-rich Value-added Node
Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: John Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta
GREAT USER EXPERIENCE
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers and Broadcasters
HP 93
Intelligent Metadata Creation
Extractor Agents: Content which does contain the words the user asked for
+
Value-added Metadata: Content which does not contain the words the user asked for, but is about what he asked for
+
Semantic Associations: Content the user did not think to ask for, but which he needs to know
Metadata for Intelligent Content
Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example: If a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"
Understanding of the content is needed to create new metadata
Taalee understands that Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses such Semantic Associations (another example: COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely
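The enrichment step described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `KNOWLEDGE_BASE` dictionary, the field names, and the `enrich` and `matches` helpers are hypothetical and are not Taalee's WorldModel or Semantic Engine.

```python
# Hypothetical knowledge base (an assumption for illustration only):
# entity name -> facts known about that entity.
KNOWLEDGE_BASE = {
    "Roger Clemens": {
        "type": "PERSON",
        "sport": "Baseball",
        "team": "New York Yankees",
    },
}


def enrich(metadata, knowledge_base):
    """Return a copy of `metadata` with team and sport inferred from players."""
    enriched = {field: list(values) for field, values in metadata.items()}
    for player in metadata.get("players", []):
        facts = knowledge_base.get(player)
        if facts is None:
            continue  # unknown entity: nothing can be added
        for field, fact_key in (("teams", "team"), ("sports", "sport")):
            bucket = enriched.setdefault(field, [])
            if facts[fact_key] not in bucket:
                bucket.append(facts[fact_key])
    return enriched


def matches(metadata, query):
    """Case-insensitive substring match of `query` against all metadata values."""
    query = query.lower()
    return any(
        query in value.lower()
        for values in metadata.values()
        for value in values
    )


# The story mentions only the player, as in the slide's example.
story = {"players": ["Roger Clemens"]}
enriched_story = enrich(story, KNOWLEDGE_BASE)
```

With only the extracted metadata, a search for "Yankees" misses the story; after enrichment the same search finds it, which is the point the slide makes.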
HP 96
Guided Demo for Value Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player of the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player of the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 - Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Producer Name: Name of content provider that produced the asset
Posted Date: Date of asset posting - Extracted automatically
Player Names: Name of players mentioned explicitly in the asset - Extracted automatically
Team Name: Name of team for which player plays - Not mentioned explicitly in asset - Value-added using Taalee's semantic relationships
League Name: Name of league to which the player's team belongs - Not mentioned explicitly in asset - Value-added by Taalee's processing based on semantic associations
Sport: Name of sport
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships
The asset is richly and fully described in the many ways the users choose to interact
Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context or relationships of keywords
For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
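One way to picture such associations is as typed relationships between entities that an application can follow from the entity the user asked about. The sketch below is a hypothetical illustration: the `AssociationStore` class, the relation names, and the placeholder headlines are assumptions, not Taalee's Semantic Content Model.

```python
from collections import defaultdict


class AssociationStore:
    """Minimal store of typed relationships between named entities."""

    def __init__(self):
        self._edges = defaultdict(set)

    def add(self, subject, relation, obj, symmetric=False):
        """Record `subject -relation-> obj` (and the reverse if symmetric)."""
        self._edges[(subject, relation)].add(obj)
        if symmetric:
            self._edges[(obj, relation)].add(subject)

    def related(self, subject, relation):
        """Return entities linked to `subject` by `relation`, sorted by name."""
        return sorted(self._edges.get((subject, relation), set()))


def related_news(company, news_by_company, store):
    """Collect news about the competitors of `company`."""
    results = []
    for competitor in store.related(company, "competes_with"):
        results.extend(news_by_company.get(competitor, []))
    return results


store = AssociationStore()
store.add("Commerce One", "participates_in",
          "Corporate, Professional & Financial Software")
store.add("Commerce One", "competes_with", "Ariba", symmetric=True)

# Placeholder headlines, invented purely for the example.
news = {
    "Commerce One": ["(placeholder) Commerce One headline"],
    "Ariba": ["(placeholder) Ariba headline"],
}
competitor_news = related_news("Commerce One", news, store)
```

A keyword index would return only the Commerce One item; following the `competes_with` relationship is what surfaces the Ariba item the user did not ask for.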
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (To view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata-centric Content Management Architecture
Sources: Internal Source 1 (Research), Internal Source 2, External feeds/Web (e.g., Reuters)
Components: Extractor Agents 1-3, Semantic Engine, World Model, Taalee Metabase, Semantic Application (ASP/Enterprise hosted), Third-party Content Mgmt and Syndication
Exchange format: XCM-compliant metadata, XML or other format
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults Knowledge Base for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard
Output: Story on Cisco, Story on Lucent
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for the future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]
A personal definition: Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF+XML
Agent readable Tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three layered Architecture Of Semantic Web
Logical Layer
Formal Semantics and Reasoning Support - OIL, DAML-O
Schema Layer
Definition of Vocabulary - RDF Schema
Data Layer
Simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL - Evolving towards Semantic Web
OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Test May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake <=
    dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30
    AND distance(NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude) < 10000
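The rule above can be expressed as a small predicate. The following Python sketch makes several assumptions that the slide does not state: the date difference is measured in days, the distance is a great-circle distance in miles computed with the haversine formula, and the earthquake must not precede the test. The `Event` type and the function names are hypothetical.

```python
import math
from dataclasses import dataclass
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, assumed for the haversine formula


@dataclass(frozen=True)
class Event:
    """A nuclear test or an earthquake, reduced to the fields the rule uses."""
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees


def distance_miles(a, b):
    """Great-circle distance between two events (haversine formula)."""
    lat1, lon1, lat2, lon2 = map(
        math.radians, (a.latitude, a.longitude, b.latitude, b.longitude)
    )
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, h)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """True if the quake follows the test within the time and distance limits."""
    days_after = (quake.event_date - test.event_date).days
    return 0 <= days_after < max_days and distance_miles(test, quake) < max_miles


# Invented sample events, purely for illustration.
test_event = Event(date(2000, 1, 1), 0.0, 0.0)
nearby_quake = Event(date(2000, 1, 10), 1.0, 1.0)
late_quake = Event(date(2000, 3, 1), 1.0, 1.0)
antipodal_quake = Event(date(2000, 1, 5), 0.0, 180.0)
```

Note that the rule only selects candidate pairs; it expresses a possible relationship to be investigated, not evidence of causation.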
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example…
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9 and >9 on the Richter scale per year starting from 1900, find average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
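The grouping query on this slide can be sketched as follows. The band boundaries are an assumption (half-open intervals, with magnitude 9.0 and above counted in the ">9" band), and the input format, a list of `(year, magnitude)` pairs, is hypothetical; the sample data is invented and is not the USGS/NEIC data used in the demo.

```python
from collections import Counter

# (low, high, label): low is inclusive, high is exclusive (an assumption).
BANDS = [
    (5.8, 6.0, "5.8-6"),
    (6.0, 7.0, "6-7"),
    (7.0, 8.0, "7-8"),
    (8.0, 9.0, "8-9"),
]
LABELS = [label for _, _, label in BANDS] + [">9"]


def band_for(magnitude):
    """Return the band label for a magnitude, or None if it is below 5.8."""
    if magnitude < 5.8:
        return None
    for low, high, label in BANDS:
        if low <= magnitude < high:
            return label
    return ">9"


def average_per_year(quakes, first_year, last_year):
    """Average yearly earthquake count per magnitude band over a year range."""
    counts = Counter()
    for year, magnitude in quakes:
        label = band_for(magnitude)
        if label is not None and first_year <= year <= last_year:
            counts[label] += 1
    years = last_year - first_year + 1
    return {label: counts[label] / years for label in LABELS}


# Invented sample data: (year, magnitude).
sample = [(1900, 5.9), (1900, 6.5), (1901, 6.2), (1901, 7.1), (1901, 5.0)]
averages = average_per_year(sample, 1900, 1901)
```

Running the same function over the periods before and after a chosen year would give the per-band comparison the slide describes.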
Knowledge Discovery - Example…
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NewsML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
Semantic Web: www.semanticweb.org
VoiceXML: www.voicexml.org
MPEG-7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 50
Automatic Categorization & Metadata Tagging (Web page)
Video with Editorialized Text on the Web
Auto Categorization
Semantic Metadata
HP 51
Automatic Categorization & Metadata Tagging (Feed)
Text from Bloomberg
Auto Categorization
Semantic Metadata
HP 52
Taalee Extraction and Knowledgebase Enhancement
Extraction Agent
Web Page Enhanced Metadata Asset
HP 53
Basis for Semantics
A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary
B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)
Knowledge Base
HP 54
Basis for Semantics
C. Reasoning/Inference
(Statistical) (Information Retrieval)
Statistical Learning/AI (Bayesian, Neural Networks, HMM…)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methods/Cluster Analysis
Learning/AI and Collab Filtering
Reference data/Concept-terms/Dictionary/Thesaurus: By topic/industry/subject/domain
Word or Phrase
Ontologies/Domain Models
KnowledgeBase: By Entities and Relationships
deeper understanding
HP 56
Open Directory Project (ODP): Classification/Taxonomy & Directory
HP 57
Ontology
Standardize meaning, description, representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes: Attributes, Domain Rules, Functional Dependencies
HP 59
An Ontology
Example Interrelated ontologies
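[Figure: interrelated ontologies for land use, waste disposal, and natural disasters]
Land use nodes: LANDUSE, COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL, LAND (SITE), CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE
Waste disposal nodes: WASTE DISPOSAL, RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE, shredding, magnetic separation, screening, washing
Natural disaster nodes: NATURAL DISASTER, EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI, connected by "causes" relationships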
HP 61
Large Vocabularies, Taxonomies, Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms
Confidential HP
Metadata-enabled Applications
HP 63
Metadata Usage: Impact on Search & Query processing
traditional queries based on keywords
attribute based queries
content-based queries
HP 64
Oingo.com
Oingo Ontology - ODP based(?): the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely-related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing
• Context of the Content
• Semantic Metadata describing entities (i.e., Company, Industry, etc.) and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
"Normal" Content can only be "found" if the user enters a keyword that exists within it
Intelligent Content = Content + Semantic Metadata and Relationships
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create
Context of the Content
Semantic Metadata describing entities (i.e., Company, Industry, etc.) and
Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …
HP 70
Metadata Usage: Keyword, Attribute and Content Based Access
The VisualHarness system at LSDIS/UGA
HP 71
Keyword Search vs Attribute Search with Semantic metadata
Virage Search on football touchdown
Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline
Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline
Metadata from Typical Cataloging of Football Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31, Pit 24
http://www.nfl.com
Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
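The difference between the two kinds of search on this slide can be sketched with the assets shown. In this hypothetical Python illustration, the asset dictionaries, field names, and both search functions are assumptions made for the example; they are not the Virage or Taalee interfaces.

```python
# Two assets from the slide, with invented dictionary layout.
assets = [
    {
        "title": "Brian Griese Interview Part Four",
        "description": "Brian Griese talks about the first touchdown he ever threw",
        "metadata": {},  # typical cataloging: no structured attributes
    },
    {
        "title": "Baltimore 31, Pit 24",
        "description": ("Quandry Ismail and Tony Banks hook up for their "
                        "third long touchdown"),
        "metadata": {
            "league": "Professional",
            "teams": ["Ravens", "Steelers"],
            "players": ["Quandry Ismail", "Tony Banks"],
            "event": "Touchdown",
        },
    },
]


def keyword_search(items, *words):
    """Return titles whose title or description contains every word."""
    wanted = [word.lower() for word in words]
    return [
        item["title"]
        for item in items
        if all(word in (item["title"] + " " + item["description"]).lower()
               for word in wanted)
    ]


def attribute_search(items, **criteria):
    """Return titles whose structured metadata satisfies every criterion."""
    def satisfies(item, field, value):
        stored = item["metadata"].get(field)
        if isinstance(stored, list):
            return value in stored
        return stored == value

    return [
        item["title"]
        for item in items
        if all(satisfies(item, field, value) for field, value in criteria.items())
    ]


keyword_hits = keyword_search(assets, "touchdown")
attribute_hits = attribute_search(assets, event="Touchdown", teams="Ravens")
```

The keyword query returns both assets, including an interview that merely mentions a touchdown, while the attribute query returns only the asset whose event really is a touchdown in a Ravens game.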
HP 72
Taaleersquos Semantic Search
Highly customizable, precise and freshest A/V search
Context and Domain Specific Attributes
Uniform Metadata for Content from Multiple Sources; can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY amp CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh contentPersonalization andTargetinginteractive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries amp Hot Topics
PERSONALIZATION
3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
4 Pink Floyd Collection
Personalized Queries
Set the Controls for the Heart of the Sunhellip
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
ATampT Corp
2 My Football Fantasy Team
Gators Spurrier ready for big game
Techs Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zonehellip
Edwards explains reasons for leaving BYU morehellip
morehellip
morehellip
morehellip
1 Election 2000
2 Middle East Peace Conflict
3 Napster Controversy
Video Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Coles security
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II morehellip
HOT Topics
morehellip
morehellip
HP 80
Metadata Targeting
SemanticInteractive Targeting
Buy Al Pacino VideosBuy Russell Crowe VideosBuy Christopher Plummer VideosBuy Diane Venora VideosBuy Philip Baker Hall VideosBuy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
Realtime Feeds
Interests Preferences
Time-ShiftedContent Aggregator
Web sites and Pages
ContentDatabases Personalized
Content
Semantic EngineTM
Personalized Content
Content
Structured Hi-Quality
Semantic Metabase
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed WebBased registration and personalization at VoquettersquosEnterprise Customer site
Userrsquos ldquoWireless Home pagerdquo shows the categories for his interests There is an alert (new content) for his stock and sportscategories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down userrsquos Personal Portfolio list The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and the knowledge of userrsquos portfolio Typically search can be done by typing one word or selecting from a dynamic personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions listen set alert on story add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
1108 ON24 Payne1107 ON24 HampQ 1106 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listingsorted by date Semantic filtering uses just the right metadata to meet screen and other constrains Eg Analyst Call focuses onthe source and analyst name or company The icon denote additional metadata such as ldquoStrong Buyrdquo by HampQ Analyst
HP 87
iTV Taaleersquos Extreme Personalization
Content Provider
(DBS DISH Wink AOL-TV)
Semantic EngineTM
Meta-DataTagged Content
ContentldquoProgramsrdquo
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-QualitySemantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that isused by personalization application to show only the stocksthat user is interested in
This screen is customizablewith interactivity featureusing metadata such as whetherthere is a new ConferenceCall video on CSCO
Part of the screen can beautomatically customized to show conference call specific informationndash including transcriptparticipation etc all of which arerelevant metadata
Conference Call itself can have embedded metadata to support personalization andinteractivity
HP 89
Metadata in Enterprise Apps
Filter Search ConsolidatePersonalize ArchiveLicensing Syndication
Production SupportProduction SupportSony
Categorize
Catalog
Integrate
CollectionCollection ProcessingProcessing
NetworkContent
AffiliateFeeds
Public Sources Rich Data
Metabase
HP 90
t
A leaking gasoline pipeline burst into flames Thursday killing more than 60 people near Nigerias commercial capital of Lagos Many of the dead were fisherman in wooden canoes engulfed in the inferno
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko facing the central business district of Lagos across a lagoon
At least 60 people died in this needless fire senior local official Karimu Alabi said
Fire crews from state-run Nigerian National Petroleum Corp (NNPC) which owns the pipeline were joined by other firemen from construction company Julius Berger in battling the blaze
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak ravaging a cluster of huts and log houses
At about the same time a second fire razed Makoko shantytown where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in thewooden huts for days Many victims of the Ebute-Oke fire were
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can’t Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
-- Breaking News for 11/30/2000 --
Customize Page Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee’s semantic metadata enables powerful access to content used by the Enterprise’s customers
• Greatly enhances newsroom productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can’t Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) – Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore’s appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore
HP 92
Metadata’s role in emerging iTV infrastructure
[Diagram labels: Video; MPEG Encoder (MPEG-2/4/7); Create Scene Description Tree; Scene Description Tree (Node = AVO Object); Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; Taalee Semantic Engine; Enhanced XML Description; Node “Cisco Systems”; Metadata-rich Value-added Node; Object Content Information (OCI); GREAT USER EXPERIENCE]
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers and Broadcasters
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta
HP 93
Intelligent Metadata Creation
Extractor Agents: content which does contain the words the user asked for
+ Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for
+ Semantic Associations: content the user did not think to ask for, but which he needs to know
Metadata for Intelligent Content Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content:
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the “right” keyword
For example, if a story is about “Roger Clemens” but does not contain the words “New York Yankees”, that story cannot and will not be found if the user searches for “New York Yankees” or “Yankees”.
Understanding of the content is needed to create new metadata.
Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses such Semantic Associations (another example: COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely.
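The enrichment step described on this slide can be sketched roughly as follows. This is only an illustration under stated assumptions: the `KNOWLEDGE_BASE` dictionary, the `enrich` and `search` functions, the field names and the sample story title are all hypothetical and are not Taalee's actual data model or API.

```python
# Hypothetical knowledge base: entity -> semantic associations (assumed structure).
KNOWLEDGE_BASE = {
    "Roger Clemens": {"type": "PERSON", "sport": "Baseball", "team": "New York Yankees"},
    "Jamal Anderson": {"type": "PERSON", "sport": "Football",
                       "team": "Atlanta Falcons", "league": "NFL"},
}


def enrich(metadata):
    """Return a copy of the metadata with values implied by the listed players."""
    enriched = {key: list(values) for key, values in metadata.items()}
    for player in metadata.get("players", []):
        facts = KNOWLEDGE_BASE.get(player, {})
        for relation, field in (("team", "teams"), ("league", "leagues"), ("sport", "sports")):
            value = facts.get(relation)
            # Only add a field when the knowledge base actually knows the value.
            if value and value not in enriched.setdefault(field, []):
                enriched[field].append(value)
    return enriched


def search(assets, field, value):
    """Attribute search: titles of assets whose metadata field contains the value."""
    return [a["title"] for a in assets if value in a["metadata"].get(field, [])]


# Made-up story whose extracted metadata mentions only the player.
story = {"title": "Clemens strikes out ten", "metadata": {"players": ["Roger Clemens"]}}
enriched_story = {"title": story["title"], "metadata": enrich(story["metadata"])}
```

Searching the original record for the team finds nothing, while the enriched record matches: `search([enriched_story], "teams", "New York Yankees")` returns the story even though the team name never appeared in the extracted metadata.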
HP 96
Guided Demo for Value-Added Metadata – Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled “Week 3 Top10 Anderson TD Run”) and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL
Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked “REAL”)
• View the source HTML page that has the original story and locate this story with the heading “Week 3 top 10 Anderson TD run”
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset’s existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value-Added Metadata – Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled “I want out”) & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League
Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked “REAL”)
• View the source HTML page that has the original story and locate this story with the heading “I want out”
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset’s existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 – Snapshots (“Jamal Anderson”)
Search for ‘Jamal Anderson’ in ‘Football’
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee’s value-additions to the metadata to facilitate easier search
HP 99
Example 2 – Snapshots (“Gary Sheffield”)
Search for ‘Gary Sheffield’ in ‘Baseball’
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee’s value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content – Value-Added Metadata
[Diagram: a Rich Media Sports Asset surrounded by its metadata fields]
Producer Name: name of content provider that produced the asset
Posted Date: date of asset posting – extracted automatically
Player Names: names of players mentioned explicitly in the asset – extracted automatically
Team Name: name of team for which the player plays – not mentioned explicitly in the asset – value-added using Taalee’s semantic relationships
League Name: name of league to which the player’s team belongs – not mentioned explicitly in the asset – value-added by Taalee’s processing based on semantic associations
Sport: name of sport
Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships
The asset is richly and fully described in the many ways users choose to interact
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context or relationships of keywords
For example, a search engine may see that the phrase “Commerce One” occurs, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee’s Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company ‘Commerce One’
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One’s competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata-centric Content Management Architecture
[Diagram labels: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters); Extractor Agent 1; Extractor Agent 2; Extractor Agent 3; Taalee Metabase; Semantic Engine; World Model; Semantic Application (ASP/Enterprise hosted); XCM-compliant metadata, XML or other format; Third-party Content Mgmt and Syndication]
1. Cisco story from PW Source 1 is passed on to add semantic associations
2. Consults the KnowledgeBase for Cisco’s competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds is picked for publishing as “semantically related” to the Cisco story and passed on to the Dashboard
Story on Cisco
Story on Lucent
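The four-step flow on this slide (a Cisco story arrives, the knowledge base is consulted for Cisco's competition, Lucent is returned, and a Lucent story from the external feed is picked as semantically related) might be sketched as below. The `ASSOCIATIONS` table, the `related_stories` function and the record layout are assumptions made for illustration only; the competitor pairs themselves are the ones named in the slides.

```python
# Hypothetical association table: (subject, relation) -> related entities.
ASSOCIATIONS = {
    ("Cisco", "competes_with"): ["Lucent"],
    ("Commerce One", "competes_with"): ["Ariba"],
}


def related_stories(story, feed, relation="competes_with"):
    """Return feed stories about entities linked to the story's company by the relation."""
    linked = set(ASSOCIATIONS.get((story["company"], relation), []))
    return [item for item in feed if item["company"] in linked]


# Made-up incoming story and external feed.
cisco_story = {"company": "Cisco", "headline": "Story on Cisco"}
external_feed = [
    {"company": "Lucent", "headline": "Story on Lucent"},
    {"company": "IBM", "headline": "Story on IBM"},
]

# Only the Lucent story is selected for the dashboard.
picked = related_stories(cisco_story, external_feed)
```

The point of the sketch is that the Lucent story is selected through the relationship stored in the knowledge base, not because it shares any keyword with the Cisco story.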
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for the future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF – one proposal (cf. Ora Lassila)
Structural modeling is obviously not enough: we need a “logic layer” on top of RDF; some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic rules which allow us to draw inferences from this data
RDF + DL = “Frame System for WWW”
Source: www.ontoknowledge.org/oil
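To make the idea of "logic rules over RDF-style data" concrete, here is a small sketch that stores facts as (subject, predicate, object) triples and applies two simple rules by forward chaining. It is deliberately much weaker than a description logic, and it is not an RDF library: the `infer` function, the rule set, and the `deer` and `bambi` facts are assumptions added for illustration.

```python
# Facts as (subject, predicate, object) triples, in the spirit of RDF.
triples = {
    ("herbivore", "subClassOf", "animal"),
    ("deer", "subClassOf", "herbivore"),  # hypothetical extra class
    ("bambi", "type", "deer"),            # hypothetical instance
}


def infer(facts):
    """Apply two rules repeatedly until no new triple is derived."""
    facts = set(facts)
    while True:
        new = set()
        for (a, p, b) in facts:
            for (c, q, d) in facts:
                if b != c:
                    continue
                # Rule 1: subClassOf is transitive.
                if p == "subClassOf" and q == "subClassOf":
                    new.add((a, "subClassOf", d))
                # Rule 2: an instance of a class is also an instance of its superclasses.
                if p == "type" and q == "subClassOf":
                    new.add((a, "type", d))
        if new <= facts:
            return facts
        facts |= new


closure = infer(triples)
```

After inference, `closure` also contains facts that were never stated, such as `("bambi", "type", "animal")`, which is the kind of conclusion a logic layer on top of plain structural data is meant to provide.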
HP 109
Semantic Web – next step in Web evolution
“A Web in which machine reasoning will be ubiquitous and devastatingly powerful.” [Berners-Lee]
“A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture.” [Berners-Lee]
“A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities.” [Berners-Lee]
A personal definition: Semantic Web is the concept that Web-accessible content can be organized semantically rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-layered Architecture of the Semantic Web
Logical Layer
Formal Semantics and Reasoning Support – OIL, DAML-O
Schema Layer
Definition of Vocabulary – RDF Schema
Data Layer
Simple data model and syntax for metadata – RDF
OIL – as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL – Evolving towards Semantic Web
OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Tests May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake <=
    dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30
    AND distance(NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude) < 10000
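The rule above can be sketched as an executable check. Several things here are assumptions rather than part of the slides: the distance is computed with the haversine great-circle formula and measured in miles, the date difference is required to be non-negative (the earthquake follows the test), and the function names, record layout and sample events are made up.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius used by the haversine formula


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """True if the quake follows the test within max_days and lies within max_miles."""
    days = (quake["event_date"] - test["event_date"]).days
    miles = distance_miles(test["latitude"], test["longitude"],
                           quake["latitude"], quake["longitude"])
    return 0 <= days < max_days and miles < max_miles


# Made-up events, chosen only to exercise the rule.
test_event = {"event_date": date(2000, 1, 1), "latitude": 0.0, "longitude": 0.0}
quake_after = {"event_date": date(2000, 1, 15), "latitude": 0.0, "longitude": 90.0}
quake_before = {"event_date": date(1999, 12, 25), "latitude": 0.0, "longitude": 90.0}
quake_far = {"event_date": date(2000, 1, 15), "latitude": 0.0, "longitude": 180.0}
```

With these sample events only `quake_after` satisfies the rule. Note that a 10000-mile radius covers most of the globe, so the distance condition excludes only events near the antipode of the test site.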
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example…
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9 and > 9 on the Richter scale, per year starting from 1900, find the average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
Knowledge Discovery - Example…
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/9910-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 51
Automatic Categorization & Metadata Tagging (Feed)
Text from Bloomberg
Auto Categorization
Semantic Metadata
HP 52
Taalee Extraction and Knowledgebase Enhancement
Extraction Agent
Web Page Enhanced Metadata Asset
HP 53
Basis for Semantics
A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary
B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)
Knowledge Base
HP 54
Basis for Semantics
C. Reasoning/Inference
(Statistical) (Information Retrieval)
Statistical Learning/AI (Bayesian, Neural Networks, HMM…)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methods/Cluster Analysis
Learning/AI and Collab. Filtering
Reference data/Concept-terms/Dictionary/Thesaurus: by topic/industry/subject/domain
Word or Phrase
Ontologies/Domain Models
KnowledgeBase: by Entities and Relationships
deeper understanding
HP 56
Open Directory Project (ODP): Classification/Taxonomy & Directory
HP 57
Ontology
Standardize meaning, description, representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes: Attributes, Domain Rules, Functional Dependencies
HP 59
An Ontology
Example: Interrelated ontologies
[Figure: four interrelated ontologies. LAND USE: commercial, industrial, rural, residential, agricultural, military, recreational. LAND (SITE): cultivated area, greenland area, land bank, zoning, landfill site. WASTE DISPOSAL: recycling, hazardous, landfill, resource rec, solid, sewage; processes: shredding, magnetic separation, screening, washing. NATURAL DISASTER: earthquake, landslide, volcano, storm, flood, fire, avalanche, tsunami, connected by “causes” relationships.]
HP 61
Large Vocabularies, Taxonomies/Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM’s controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.
Confidential HP
Metadata-enabled Applications
HP 63
Metadata Usage: Impact on Search & Query processing
traditional queries based on keywords
attribute-based queries
content-based queries
HP 64
Oingo.com
Oingo Ontology – ODP based (?): the database of millions of concepts and relationships that powers Oingo’s semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo’s semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely-related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user “think” about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more “intelligent” through automatic analysis of every individual asset to generate a catalog containing:
• Context of the Content
• Semantic Metadata describing entities (e.g., Company, Industry, etc.) and
• Relationships (semantic associations) among all entities
Based on a “Semantic” or “domain” model describing how the user thinks about the subject matter, supported by a knowledgebase
“Normal” Content can only be “found” if the user enters a keyword that exists within it
Intelligent Content = “Normal” Content + related metadata and relationships
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more “intelligent” through automatic analysis of every individual content item to create:
Context of the Content
Semantic Metadata describing entities (e.g., Company, Industry, etc.) and
Relationships (semantic associations) among all entities
Based on a “Semantic” or “domain” model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google’s Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …
HP 70
Metadata Usage: Keyword, Attribute and Content Based Access
The VisualHarness system at LSDIS/UGA
HP 71
Keyword Search vs. Attribute Search with Semantic Metadata
Virage search on “football touchdown”:
Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline
Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline
Metadata from Typical Cataloging of Football Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31, Pit 24
http://www.nfl.com
Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens’ lead to 31-24 in the third quarter
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Qadry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
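The contrast on this slide between keyword search over free text and attribute search over structured metadata can be illustrated with a short sketch. The two functions and the record layout are hypothetical; the sample values are a shortened paraphrase of the football asset shown on the slide.

```python
# One asset with a free-text description and structured metadata (assumed layout).
asset = {
    "description": "Ismail and Banks hook up for their third long touchdown",
    "metadata": {
        "League": "Professional",
        "Teams": ["Ravens", "Steelers"],
        "Players": ["Tony Banks"],
        "Event": "Touchdown",
    },
}


def keyword_search(assets, query):
    """Syntactic search: every query word must appear in the description text."""
    words = query.lower().split()
    return [a for a in assets
            if all(w in a["description"].lower().split() for w in words)]


def attribute_search(assets, **criteria):
    """Attribute search: every criterion must match a metadata field (scalar or list)."""
    def matches(candidate):
        for field, wanted in criteria.items():
            value = candidate["metadata"].get(field)
            values = value if isinstance(value, list) else [value]
            if wanted not in values:
                return False
        return True
    return [a for a in assets if matches(a)]
```

A keyword query for "Ravens touchdown" misses the asset because the word "Ravens" does not occur in the description, whereas `attribute_search([asset], Teams="Ravens", Event="Touchdown")` finds it through the metadata fields.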
HP 72
Taalee’s Semantic Search
Highly customizable, precise and freshest A/V search
Context and Domain Specific Attributes; Uniform Metadata for Content from Multiple Sources; can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do?
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY & CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content
Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries amp Hot Topics
PERSONALIZATION
3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
4 Pink Floyd Collection
Personalized Queries
Set the Controls for the Heart of the Sun…
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
2 My Football Fantasy Team
Gators’ Spurrier ready for big game
Tech’s Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone…
Edwards explains reasons for leaving BYU more…
more…
more…
more…
1 Election 2000
2 Middle East Peace Conflict
3 Napster Controversy
Video Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Cole’s security
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II more…
HOT Topics
more…
more…
HP 80
Metadata Targeting
Semantic/Interactive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
[Diagram labels: Realtime Feeds; Web sites and Pages; Content Databases; Time-Shifted Content Aggregator; Semantic Engine™; Structured Hi-Quality Semantic Metabase; Interests, Preferences; Personalized Content]
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette’s Enterprise Customer site
User’s “Wireless Home page” shows the categories for his interests. There is an alert (new content) for his stock and sports categories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down the user’s Personal Portfolio list. The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user’s portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as “Strong Buy” by H&Q Analyst
HP 87
iTV – Taalee’s Extreme Personalization
Content Provider (DBS, DISH, Wink, AOL-TV)
Semantic Engine™
Meta-Data Tagged Content
Content “Programs”
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-Quality Semantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that isused by personalization application to show only the stocksthat user is interested in
This screen is customizablewith interactivity featureusing metadata such as whetherthere is a new ConferenceCall video on CSCO
Part of the screen can beautomatically customized to show conference call specific informationndash including transcriptparticipation etc all of which arerelevant metadata
Conference Call itself can have embedded metadata to support personalization andinteractivity
HP 89
Metadata in Enterprise Apps
Filter Search ConsolidatePersonalize ArchiveLicensing Syndication
Production SupportProduction SupportSony
Categorize
Catalog
Integrate
CollectionCollection ProcessingProcessing
NetworkContent
AffiliateFeeds
Public Sources Rich Data
Metabase
HP 90
t
A leaking gasoline pipeline burst into flames Thursday killing more than 60 people near Nigerias commercial capital of Lagos Many of the dead were fisherman in wooden canoes engulfed in the inferno
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko facing the central business district of Lagos across a lagoon
At least 60 people died in this needless fire senior local official Karimu Alabi said
Fire crews from state-run Nigerian National Petroleum Corp (NNPC) which owns the pipeline were joined by other firemen from construction company Julius Berger in battling the blaze
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak ravaging a cluster of huts and log houses
At about the same time a second fire razed Makoko shantytown where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in thewooden huts for days Many victims of the Ebute-Oke fire were
Gore Demands That Recount Restart (940 PM)Gore Says Fla Cant Name Electors (450 PM)Bush Meets Colin Powell at Ranch (122 PM)
Market Tumbles on Earnings Warning (927 AM)Barak Outlines His Peace Plan (630 AM)
-- Breaking News for 11302000 --Customize Page Settings | Content | Layout | Color
Sixty Die In Nigeria BlastProduced by Euronews Posted Date 11302000 Event Election 2000 Location Tallahassee Florida USAPeople Al Gore George W Bush
Video
bull Value-add for production broadcast amp syndication
bull Taaleersquos semantic metadata enables powerful access to content used by Enterprisersquos customers
bull Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --Gore Demands That Recount Restart
Gore Says Fla Cant Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(133) ndash 120600 - ABC
(253) - 120600 - CBS
(516) - 120600 - ABC
(246) - 120600 - FOX
(133) - 120600 - NBC
(533) - 120600
(357) - 120600 - CBS
(427) - 120600 - ABC
(344) - 120600 - FOX
(724) - 120600 - CBS
(133) - 120600 - CBS
TALLAHASSEE Florida (CNN) ndashThough the two presidential candidates have until noon Wednesday to file briefs in Al Gores appeal to the Florida Supreme Court the outcome of two trials set on the same day in Leon County Florida may offer Gore his best hope for the presidency Democrats in Seminole County are seeking to have 15000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5000 votes statewide Lawyers for the plaintiff Harry Jacobs claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2126 incomplete absentee ballot applications sent in by GOP voters while refusing to allow Democratic workers to do the same thing for Democratic voters
The GOP says that suit and one similar to it from Martin County demonstrates Democratic Party politics at its most desperate Gore is not a party to either of those lawsuits On Tuesday the judge in the
(133) - 120600 - ABC
(233) - 120600 - CBS
(312) - 120600 - NNS
(032) - 120600 - CBS
(133) - 120600 - CBS
DescriptionProduced by CNNPosted Date 12072000 Reporter David Lewis Event Election 2000 Location Tallahassee Florida USAPeople Al Gore
HP 92
Retrieve Scene Description Track
Enhanced Digital Cable
Video
MPEGDecoder
Node = AVO Object
Create Scene Description Tree
GREATUSER
EXPERIENCE
Metadatarsquos role in emerging iTV infrastructure
MPEG-247MPEG
Encoder
SceneDescriptionTree
License metadata decoder and semantic applications to
device makers
Channel salesthrough Video Server Vendors
Video App Servers and Broadcasters
Enhanced XML
Description
ldquoCisco Systemsrdquo
Node
TaaleeSemanticEngine
ldquoCisco Systemsrdquo
Produced by Fox Sports Creation Date 12052000 League NFLTeams Seattle Seahawks
Atlanta Falcons Players John KitnaCoaches Mike Holmgren
Dan ReevesLocation Atlanta
Object Content Information (OCI)
Metadata-richValue-added Node
HP 93
Intelligent Metadata Creation
Content which doescontain the wordsthe user asked for
Extractor Agents
Content which does not contain the words
the user asked for but is about what he asked
for
Value-added Metadata
Content the user did not think to ask for but
which he needs to know
Semantic Associations
+ +
Metadata for Intelligent ContentMetadata for Intelligent Content
Usage
HP 94
Intelligent Contentvia
Value-Added Metadata
HP 95
Value-added MetadataTraditional methods rely solely on (syntactic) indexing of keywords to enable
users to access content
bull If a keyword is not in the content it cannot be found
bull The burden is on the user to think of and ask for the ldquorightrdquo keyword
For example If a story is about ldquoRoger Clemensrdquo but does not contain the
words ldquoNew York Yankeesrdquo that story cannot and will not be found if the user
searches for ldquoNew York Yankeesrdquo or ldquoYankeesrdquo
Understanding of the content is needed to create new metadata
Taalee understands Roger Clemens is a PERSON who Plays a SPORT called
Baseball for a TEAM from New York called the Yankees Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY)
to add missing metadata to describe content more completely
HP 96
Guided Demo for Value Added Metadata – Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player of the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata – Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player of the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 – Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 – Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content – Value-Added Metadata
Rich Media Sports Asset
Producer Name: name of content provider that produced the asset
Posted Date: date of asset posting – extracted automatically
Player Names: names of players mentioned explicitly in the asset – extracted automatically
Sport: name of sport
Team Name: name of team for which player plays – not mentioned explicitly in asset – value-added using Taalee's semantic relationships
League Name: name of league to which the player's team belongs – not mentioned explicitly in asset – value-added by Taalee's processing based on semantic associations
Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.
The asset is richly, fully described in the many ways the users chose to interact
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context or relationships of keywords
For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
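A minimal sketch of how typed associations can turn one query into "what you asked for" plus "what you need to know". The `SemanticModel` class, the relation names, and the two sample stories are hypothetical stand-ins introduced for illustration; only the Commerce One / Ariba relationship comes from the slide.

```python
from collections import defaultdict


class SemanticModel:
    """Tiny store of typed entities and named relationships (illustrative only)."""

    def __init__(self):
        self.types = {}
        self.relations = defaultdict(set)  # (subject, relation) -> set of objects

    def add_entity(self, name, entity_type):
        self.types[name] = entity_type

    def relate(self, subject, relation, obj, symmetric=False):
        self.relations[(subject, relation)].add(obj)
        if symmetric:
            self.relations[(obj, relation)].add(subject)

    def related(self, subject, relation):
        return sorted(self.relations.get((subject, relation), ()))


model = SemanticModel()
model.add_entity("Commerce One", "COMPANY")
model.add_entity("Ariba", "COMPANY")
model.relate("Commerce One", "COMPETES_WITH", "Ariba", symmetric=True)

# Hypothetical catalogued assets, each tagged with the companies it is about.
STORIES = [
    {"title": "Story on Commerce One", "companies": ["Commerce One"]},
    {"title": "Story on Ariba", "companies": ["Ariba"]},
]


def search_with_associations(company):
    """Return (stories asked for, stories about the company's competitors)."""
    asked_for = [s["title"] for s in STORIES if company in s["companies"]]
    competitors = model.related(company, "COMPETES_WITH")
    need_to_know = [
        s["title"] for s in STORIES
        if any(c in s["companies"] for c in competitors)
    ]
    return asked_for, need_to_know
```

A keyword index alone would return only the first list; the second list exists because the relationship is stored as data and can be traversed at query time.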
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata centric Content Management Architecture
[Diagram labels: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters); Extractor Agent 1; Extractor Agent 2; Extractor Agent 3; Semantic Engine; World Model; Taalee Metabase; Knowledge Base; XCM-compliant metadata, XML or other format; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication; Story on Cisco; Story on Lucent]
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults Knowledge Base for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story – passed on to Dashboard
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS
COMPANIES in Same or Related INDUSTRY
Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF – one proposal (cf. Ora Lassila)
Structural modeling obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]
A personal definition: Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF+XML
Agent readable Tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-layered Architecture of Semantic Web
Logical Layer: Formal Semantics and Reasoning Support – OIL, DAML-O
Schema Layer: Definition of Vocabulary – RDF Schema
Data Layer: Simple data model and syntax for metadata – RDF
OIL – as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL – Evolving towards Semantic Web
OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Tests May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
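The rule above can be read as a predicate over a (nuclear test, earthquake) pair. The sketch below is one possible reading, not the InfoQuilt implementation: it assumes `distance` is a great-circle distance in miles, computed here with the haversine formula, and it adds a `0 <=` check so that the earthquake must follow the test, as the prose states. The record layout (dictionaries with `eventDate`, `latitude`, `longitude`) is hypothetical.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def distance(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in miles between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def date_difference(test_date, quake_date):
    """Days from the nuclear test to the earthquake (negative if the quake came first)."""
    return (quake_date - test_date).days


def could_have_caused(test, quake, max_days=30, max_miles=10000):
    """Evaluate the 'NuclearTest Causes Earthquake' rule for one pair of records."""
    days = date_difference(test["eventDate"], quake["eventDate"])
    miles = distance(test["latitude"], test["longitude"],
                     quake["latitude"], quake["longitude"])
    # Assumption: the earthquake must not precede the test.
    return 0 <= days < max_days and miles < max_miles


# Made-up records, for illustration only.
test_event = {"eventDate": date(2000, 1, 1), "latitude": 0.0, "longitude": 0.0}
quake_after = {"eventDate": date(2000, 1, 10), "latitude": 0.0, "longitude": 1.0}
quake_before = {"eventDate": date(1999, 12, 25), "latitude": 0.0, "longitude": 1.0}
quake_late = {"eventDate": date(2000, 2, 10), "latitude": 0.0, "longitude": 1.0}
```

Applying `could_have_caused` to every pair drawn from the two sources yields the candidate cause/effect pairs that the later slides go on to examine.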
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year, starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example…
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
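The grouping query above can be sketched as follows. The bucket boundaries are an assumption (upper bounds are treated as exclusive, except that a magnitude of exactly 9 falls in "8-9"), "average" is read as the mean yearly count over the years present in the data, and the sample records are invented purely to exercise the code.

```python
from collections import Counter


def magnitude_range(magnitude):
    """Map a magnitude to one of the slide's ranges, or None if below 5.8."""
    if magnitude < 5.8:
        return None
    if magnitude > 9.0:
        return ">9"
    for upper, label in ((6.0, "5.8-6"), (7.0, "6-7"), (8.0, "7-8")):
        if magnitude < upper:
            return label
    return "8-9"


def counts_per_year(quakes, start_year=1900):
    """Count earthquakes per (year, magnitude range) from (year, magnitude) pairs."""
    counts = Counter()
    for year, magnitude in quakes:
        label = magnitude_range(magnitude)
        if year >= start_year and label is not None:
            counts[(year, label)] += 1
    return counts


def average_per_range(quakes, start_year=1900):
    """Mean number of earthquakes per year for each magnitude range."""
    counts = counts_per_year(quakes, start_year)
    years = {year for year, _ in counts}
    totals = Counter()
    for (_, label), n in counts.items():
        totals[label] += n
    return {label: total / len(years) for label, total in totals.items()}


# Invented (year, magnitude) records, not real seismic data.
sample = [(1950, 5.9), (1950, 6.5), (1951, 6.2), (1899, 7.0), (1951, 4.0), (1951, 9.1)]
```

Comparing these per-range averages before and after the first recorded test is the kind of check the slide uses to argue that only the lower-magnitude ranges change.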
Knowledge Discovery - Example…
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 52
Taalee Extraction and Knowledgebase Enhancement
Extraction Agent
Web Page Enhanced Metadata Asset
HP 53
Basis for Semantics
A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary
B. Facts with Relationships: Taxonomy (Categories), Ontology/Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)
Knowledge Base
HP 54
Basis for Semantics
C. Reasoning/Inference
(Statistical) (Information Retrieval)
Statistical Learning/AI (Bayesian, Neural Networks, HMM, …)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methods/Cluster Analysis
Learning/AI and Collab. Filtering
Reference data/Concept-terms/Dictionary/Thesaurus: by topic/industry/subject/domain (Word or Phrase)
Ontologies/Domain Models
KnowledgeBase: by Entities and Relationships
(deeper understanding)
HP 56
Open Directory Project (ODP): Classification/Taxonomy & Directory
HP 57
Ontology
Standardize meaning, description, representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes: Attributes, Domain Rules, Functional Dependencies
HP 59
An Ontology
Example: Interrelated ontologies
[Figure labels, grouped by ontology]
LAND USE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL
LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE
WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; shredding, magnetic separation, screening, washing
NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI (several linked by "causes" relationships)
HP 61
Large Vocabularies, Taxonomies, Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.
Confidential HP
Metadata enabled Applications
HP 63
Metadata Usage: Impact on Search & Query processing
traditional queries based on keywords
attribute based queries
content-based queries
HP 64
Oingo.com
Oingo Ontology – ODP based (?): the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning, used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely-related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:
• Context of the Content
• Semantic Metadata describing entities (i.e., Company, Industry, etc.) and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
"Normal" Content can only be "found" if the user enters a keyword that exists within it
Intelligent Content=+
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:
Context of the Content
Semantic Metadata describing entities (i.e., Company, Industry, etc.) and
Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …
HP 70
Metadata Usage Keyword Attribute and Content Based Access
The VisualHarness system at LSDIS/UGA
HP 71
Keyword Search vs Attribute Search with Semantic metadata
Metadata from Typical Cataloging of Football Assets
Virage: Search on "football touchdown"
Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline
Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31, Pit 24
http://www.nfl.com
Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder to extend the Ravens' lead to 31-24 in the third quarter
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
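The contrast between the two search styles can be sketched as below. This is an illustration only: the asset records are abbreviated from the slide, the `Event: Interview` value on the second asset is a hypothetical tag added for the example, and neither function reflects how Virage or Taalee actually implement search.

```python
# Two assets abbreviated from the slide; the second asset's metadata is hypothetical.
ASSETS = [
    {
        "title": "Baltimore 31, Pit 24",
        "description": "Quandry Ismail and Tony Banks hook up for their third long touchdown",
        "metadata": {
            "League": "Professional",
            "Teams": ["Ravens", "Steelers"],
            "Players": ["Quandry Ismail", "Tony Banks"],
            "Event": "Touchdown",
        },
    },
    {
        "title": "Brian Griese Interview Part Four",
        "description": "Brian Griese talks about the first touchdown he ever threw",
        "metadata": {"Players": ["Brian Griese"], "Event": "Interview"},
    },
]


def keyword_search(word):
    """Syntactic search: match the word anywhere in the free-text description."""
    word = word.lower()
    return [a["title"] for a in ASSETS if word in a["description"].lower()]


def attribute_search(**criteria):
    """Attribute search: every requested field must hold the requested value."""
    def has(asset, field, wanted):
        value = asset["metadata"].get(field)
        return wanted in value if isinstance(value, list) else value == wanted

    return [
        a["title"] for a in ASSETS
        if all(has(a, field, wanted) for field, wanted in criteria.items())
    ]
```

Here `keyword_search("touchdown")` returns both assets, including the interview, while `attribute_search(Event="Touchdown")` returns only the game highlight, which is the precision gain the slide attributes to domain-specific attributes.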
HP 72
Taalee's Semantic Search
Highly customizable, precise and freshest A/V search
Context and Domain Specific Attributes
Uniform Metadata for Content from Multiple Sources; can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY amp CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content: Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
PERSONALIZATION: Personalized Queries & Hot Topics
Personalized Queries
1. My Stock Portfolio: Microsoft suffers serious hack attack; Cisco Systems Inc; Analyst Safa Rashtchy on Yahoo; PeopleSoft Inc; AT&T Corp; more…
2. My Football Fantasy Team: Gators' Spurrier ready for big game; Tech's Vick looks to become complete QB; Bucs excited about Hamilton; Jasper Sanks rumbles into the end zone…; Edwards explains reasons for leaving BYU; more…
3. Julia Roberts Collection: Movie Trailer Notting Hill; Trailer - Runaway Bride; Patrick; Movie Trailer Stepmom; Conspiracy Theory; more…
4. Pink Floyd Collection: Set the Controls for the Heart of the Sun…; Wish You Were Here; Round And Around; Keep Talking; The Post War Dream; more…
HOT Topics
1. Election 2000: Video Explaining the electoral map; Race for White House hots up; Seniors Give Gore Florida Edge; more…
2. Middle East Peace Conflict: More die as Israel steps up security; Israel braces for suicide bombs; Pentagon probes Cole's security; more…
3. Napster Controversy: The Brain Behind Napster; Napster Lawsuit; Creative Nomad II; more…
HP 80
Metadata Targeting
SemanticInteractive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
[Diagram labels: Realtime Feeds; Web sites and Pages; Content Databases; Time-Shifted Content Aggregator; Semantic Engine™; Structured Hi-Quality Semantic Metabase; Interests, Preferences; Personalized Content]
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site
User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and the knowledge of user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst
HP 87
iTV: Taalee's Extreme Personalization
[Diagram labels: Content Provider (DBS, DISH, Wink, AOL-TV); Content "Programs"; Meta-Data Tagged Content; Semantic Engine™; Structured Hi-Quality Semantic Metabase; Immediate Interests; Preferences; Personalized Content Capsules; Redirects and Programming]
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by personalization application to show only the stocks that user is interested in
This screen is customizable with interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO
Part of the screen can be automatically customized to show conference call specific information – including transcript, participation, etc., all of which are relevant metadata
Conference Call itself can have embedded metadata to support personalization and interactivity
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support
Sony
Categorize
Catalog
Integrate
Collection, Processing
NetworkContent
AffiliateFeeds
Public Sources Rich Data
Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fisherman in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were
-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
Sixty Die In Nigeria Blast
Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) – 12/06/00 – ABC
(2:53) – 12/06/00 – CBS
(5:16) – 12/06/00 – ABC
(2:46) – 12/06/00 – FOX
(1:33) – 12/06/00 – NBC
(5:33) – 12/06/00
(3:57) – 12/06/00 – CBS
(4:27) – 12/06/00 – ABC
(3:44) – 12/06/00 – FOX
(7:24) – 12/06/00 – CBS
(1:33) – 12/06/00 – CBS
TALLAHASSEE, Florida (CNN) – Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(1:33) – 12/06/00 – ABC
(2:33) – 12/06/00 – CBS
(3:12) – 12/06/00 – NNS
(0:32) – 12/06/00 – CBS
(1:33) – 12/06/00 – CBS
Description: Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore
HP 92
Retrieve Scene Description Track
Enhanced Digital Cable
Video
MPEGDecoder
Node = AVO Object
Create Scene Description Tree
GREATUSER
EXPERIENCE
Metadatarsquos role in emerging iTV infrastructure
MPEG-247MPEG
Encoder
SceneDescriptionTree
License metadata decoder and semantic applications to
device makers
Channel salesthrough Video Server Vendors
Video App Servers and Broadcasters
Enhanced XML
Description
ldquoCisco Systemsrdquo
Node
TaaleeSemanticEngine
ldquoCisco Systemsrdquo
Produced by Fox Sports Creation Date 12052000 League NFLTeams Seattle Seahawks
Atlanta Falcons Players John KitnaCoaches Mike Holmgren
Dan ReevesLocation Atlanta
Object Content Information (OCI)
Metadata-richValue-added Node
HP 93
Intelligent Metadata Creation
Content which doescontain the wordsthe user asked for
Extractor Agents
Content which does not contain the words
the user asked for but is about what he asked
for
Value-added Metadata
Content the user did not think to ask for but
which he needs to know
Semantic Associations
+ +
Metadata for Intelligent ContentMetadata for Intelligent Content
Usage
HP 94
Intelligent Contentvia
Value-Added Metadata
HP 95
Value-added MetadataTraditional methods rely solely on (syntactic) indexing of keywords to enable
users to access content
bull If a keyword is not in the content it cannot be found
bull The burden is on the user to think of and ask for the ldquorightrdquo keyword
For example If a story is about ldquoRoger Clemensrdquo but does not contain the
words ldquoNew York Yankeesrdquo that story cannot and will not be found if the user
searches for ldquoNew York Yankeesrdquo or ldquoYankeesrdquo
Understanding of the content is needed to create new metadata
Taalee understands Roger Clemens is a PERSON who Plays a SPORT called
Baseball for a TEAM from New York called the Yankees Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY)
to add missing metadata to describe content more completely
HP 96
Guided Demo for Value Added Metadata ndashExample one
bull Go to httpwwwmediaanywherecomFootballhtml amp search for Player = Jamal Anderson
bull Click on the first result (titled ldquoWeek 3 Top10 Anderson TD Runrdquo) and view the metadata
on the following RMR page
bull Here is what you see
Produced by NFLcom Posted Date 9202000 League NFL
Teams Atlanta Falcons Players Jamal Anderson
bull Now click on the button to play the asset (button marked ldquoREALrdquo)
bull View the source HTML page that has the original story and locate this story with the
heading ldquoWeek 3 top 10 Anderson TD runrdquo
bull Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user
searching for Atlanta Falcons will find this story on Jamal Anderson who is a player of
Atlanta Falcons team
bull Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata ndashExample Two
bull Go to httpwwwmediaanywherecomBaseballhtml amp search for Player = Gary Sheffield
bull Click on the first result (titled ldquoI want outrdquo) amp view the metadata on the following RMR page
bull Here is what you see
Produced by ESPN Posted Date 3032001 League National League
Teams Los Angeles Dodgers Players Gary Sheffield
bull Now click on the button to play the asset (button marked ldquoREALrdquo)
bull View the source HTML page that has the original story and locate this story with the
heading ldquoI want outrdquo
bull Verify that Team=Los Angeles Dodgers or League=National League was not present in
the source content
bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user
searching for Los Angeles Dodgers will find this story on Gary Sheffield who is a player of
Los Angeles Dodgers team
N t th t th h i d di t i ill t b bl t d thi
HP 98
Example 1 ndash Snapshots (ldquoJamal Andersonrdquo)
Search for lsquoJamal Andersonrsquo in lsquoFootballrsquo
Click on first result for Jamal Anderson
View metadata Note that Team name and League name are also included
in the metadata
View the original source HTML page Verify that
the source page contains no mention of Team nameand League name They
were Taaleersquos value-additions to the metadata to facilitate easier search
HP 99
Example 2 ndash Snapshots (ldquoGary Sheffieldrdquo)
Click on first result for Gary Sheffield
View metadata Note that Team name and League name are also included
in the metadata
Search for lsquoGary Sheffieldrsquo in lsquoBaseballrsquo
View the original source HTML page Verify that
the source page contains no mention of Team nameand League name They
were Taaleersquos value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content ndash Value-Added Metadata
Posted Date
Posted Date
Date of asset posting ndashExtracted automatically
League Name
Name of league to which the payerrsquos team belongs ndash Not mentioned explicitly in asset ndash Value-added by Taaleersquos processing based on semantic associations
Name of team for which player plays ndash Not mentioned explicitly in assetndash Value-added using Taaleersquos semantic relationships
Team NameTeam Name
Producer Name
Producer Name
Rich MediaSports AssetRich Media
Sports Asset
Name of content provider that produced the asset
Some Metadata are obtained explicitly from the asset Others (not present in the asset) are added
by Taalee using its semantic relationships
The asset is richly fully described in the many ways the users chose to interact
Player NamesPlayer Names
SportSportName of
sport
LegendX Y meansTaalee uses X to add Yas value-added metadatato the asset
Name of players mentioned explicitly in the asset ndash Extracted automatically
HP 101
Intelligent Contentvia
Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content.
• They do not understand the meaning, context or relationships of keywords.
For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
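The idea of following a relationship such as COMPETES WITH to surface associated content can be sketched in a few lines. This is only an illustration under stated assumptions: the triple store, the relationship names, the asset titles and the `search` function are all hypothetical and are not Taalee's actual model or API.

```python
from collections import defaultdict

# Hypothetical knowledge base of (subject, relationship, object) triples.
TRIPLES = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Ariba", "is_a", "COMPANY"),
    ("Commerce One", "participates_in",
     "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]

# Hypothetical catalog: asset title -> entities recognized in the asset.
ASSETS = {
    "Commerce One quarterly results": {"Commerce One"},
    "Ariba signs new marketplace deal": {"Ariba"},
    "Weather update": set(),
}


def build_index(triples):
    """Index triples as subject -> relationship -> set of objects."""
    index = defaultdict(lambda: defaultdict(set))
    for subject, relationship, obj in triples:
        index[subject][relationship].add(obj)
    return index


def related_entities(index, entity, relationship):
    """Return the entities linked to `entity` by `relationship`."""
    if entity not in index:
        return set()
    return set(index[entity].get(relationship, set()))


def search(index, assets, entity):
    """Return what was asked for plus news on the entity's competitors."""
    direct = sorted(t for t, ents in assets.items() if entity in ents)
    competitors = related_entities(index, entity, "competes_with")
    associated = sorted(t for t, ents in assets.items() if ents & competitors)
    return {"asked_for": direct, "related": associated}


result = search(build_index(TRIPLES), ASSETS, "Commerce One")
```

In this sketch the relationship is stored in one direction only; a fuller model would presumably treat "competes with" as symmetric.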
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (To view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata-centric Content Management Architecture
Sources: Internal Source 1 (Research), Internal Source 2, External feeds/Web (e.g., Reuters)
Components: Extractor Agents 1-3, Semantic Engine, World Model, Taalee Metabase, Semantic Application (ASP/Enterprise hosted), Third-party Content Mgmt and Syndication
XCM-compliant metadata, XML or other format
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults Knowledge Base for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard
Story on Cisco; Story on Lucent
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What You Need to Know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with competing PRODUCTS; COMPANIES in same or related INDUSTRY
SEC, EPA Regulations: impacting INDUSTRY or filed by COMPANY
Technology, Products: important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic collation of semantically related digital media information from multiple sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for the future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]
A personal definition: Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods.
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-layered Architecture of the Semantic Web
Logical Layer: Formal Semantics and Reasoning Support - OIL, DAML-O
Schema Layer: Definition of Vocabulary - RDF Schema
Data Layer: Simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL - Evolving towards Semantic Web
OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Test May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.
NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
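The rule above can be written as an executable predicate. The following is a minimal sketch, not the InfoQuilt implementation: the `Event` class, the great-circle (haversine) distance in miles, and the extra condition that the earthquake must not precede the test are assumptions made for illustration.

```python
import math
from dataclasses import dataclass
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; units assumed to be miles


@dataclass
class Event:
    """Hypothetical record shared by nuclear tests and earthquakes."""
    event_date: date
    latitude: float
    longitude: float


def date_difference(test, quake):
    """Days from the nuclear test to the earthquake (negative if earlier)."""
    return (quake.event_date - test.event_date).days


def distance(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles using the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """Apply the rule: close in time (after the test) and close in space."""
    days = date_difference(test, quake)
    miles = distance(test.latitude, test.longitude,
                     quake.latitude, quake.longitude)
    # The lower bound on days is an assumption taken from the prose
    # ("occurred some time after"); the rule itself only states "< 30".
    return 0 <= days < max_days and miles < max_miles


# Made-up sample data for illustration only.
test_event = Event(date(1998, 5, 11), 27.1, 71.8)
quake_soon = Event(date(1998, 5, 21), 36.0, 70.0)
quake_late = Event(date(1998, 6, 25), 36.0, 70.0)
quake_before = Event(date(1998, 5, 1), 36.0, 70.0)
```

Because half the Earth's circumference is roughly 12,400 miles, a threshold of 10000 excludes only very distant pairs; a real study would likely tune both thresholds.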
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example...
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9 and >9 on the Richter scale per year starting from 1900, find average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
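The grouping query above can be sketched as a small aggregation. This is an illustrative sketch only: the input format (a list of `(year, magnitude)` pairs), the function names, and the half-open treatment of the range boundaries are assumptions, and the sample data is made up.

```python
from collections import Counter

# Magnitude ranges from the slide, treated as half-open [low, high).
# The slide says ">9" for the last range; including 9.0 itself is an assumption.
BINS = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, float("inf"))]


def magnitude_bin(magnitude):
    """Return the (low, high) range containing `magnitude`, or None."""
    for low, high in BINS:
        if low <= magnitude < high:
            return (low, high)
    return None


def counts_per_year(quakes):
    """Count earthquakes per (year, magnitude range)."""
    counts = Counter()
    for year, magnitude in quakes:
        bin_ = magnitude_bin(magnitude)
        if bin_ is not None:
            counts[(year, bin_)] += 1
    return counts


def average_per_year(quakes, first_year, last_year):
    """Average yearly count for each magnitude range over the period."""
    n_years = last_year - first_year + 1
    totals = Counter()
    for (year, bin_), n in counts_per_year(quakes).items():
        if first_year <= year <= last_year:
            totals[bin_] += n
    return {bin_: totals[bin_] / n_years for bin_ in BINS}


# Made-up sample: (year, magnitude)
sample = [(1900, 5.9), (1900, 6.5), (1901, 6.2), (1901, 7.4), (1901, 5.0)]
averages = average_per_year(sample, 1900, 1901)
```

Comparing these averages before and after a chosen year is one way to examine the "increase since 1945" observation per magnitude range.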
Knowledge Discovery - Example...
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 53
Basis for Semantics
A. Facts/Concepts/Terms/Entities: Dictionary, Thesaurus, Reference Data, Vocabulary
B. Facts with Relationships: Taxonomy (Categories), Ontology, Domain Modeling (e.g., Golf = golfer, tournament name, golf course, event)
Knowledge Base
HP 54
Basis for Semantics
C. Reasoning/Inference: (Statistical) (Information Retrieval); Statistical Learning/AI (Bayesian, Neural Networks, HMM, ...); Logic Based (Description Logic); Natural Language/Grammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methods: Cluster Analysis
Learning/AI and Collab. Filtering
Reference data: Concept-terms, Dictionary/Thesaurus; by topic/industry/subject/domain
Word or Phrase
Ontologies/Domain Models
Knowledge Base: by Entities and Relationships
(deeper understanding)
HP 56
Open Directory Project (ODP): Classification/Taxonomy & Directory
HP 57
Ontology
Standardize meaning, description, representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes: Attributes, Domain Rules, Functional Dependencies
HP 59
An Ontology
Example: Interrelated ontologies
[Figure: interrelated ontologies. LAND USE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL. LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE. WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; processes: shredding, magnetic separation, screening, washing. NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI, linked by "causes" relationships.]
HP 61
Large Vocabularies, Taxonomies, Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.
Confidential HP
Metadata-enabled Applications
HP 63
Metadata Usage: Impact on Search & Query Processing
traditional queries based on keywords
attribute-based queries
content-based queries
HP 64
Oingo.com
Oingo Ontology - ODP based (?) - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning, used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely-related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:
• Context of the Content
• Semantic Metadata describing entities (i.e., Company, Industry, etc.) and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
"Normal" Content can only be "found" if the user enters a keyword that exists within it
Intelligent Content=+
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:
• Context of the Content
• Semantic Metadata describing entities (i.e., Company, Industry, etc.) and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata amp Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, ...
HP 70
Metadata Usage Keyword Attribute and Content Based Access
The VisualHarness system at LSDIS/UGA
HP 71
Keyword Search vs Attribute Search with Semantic Metadata
Virage: search on "football touchdown"
Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline...
Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline...
Metadata from Typical Cataloging of Football Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31, Pit 24
http://www.nfl.com
Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter.
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
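The contrast between keyword search and attribute search over structured metadata can be illustrated with a short sketch. The data layout and both functions are hypothetical and only mirror the slide's example; the "Event: Interview" value for the second asset is an assumption added for illustration.

```python
# Hypothetical catalog; the first asset's metadata follows the slide,
# the second asset's Event value is assumed for the sake of the example.
ASSETS = [
    {
        "title": "Baltimore 31, Pit 24",
        "text": ("Quandry Ismail and Tony Banks hook up for their "
                 "third long touchdown"),
        "metadata": {
            "League": "Professional",
            "Teams": ["Ravens", "Steelers"],
            "Players": ["Quandry Ismail", "Tony Banks"],
            "Event": "Touchdown",
        },
    },
    {
        "title": "Brian Griese Interview Part Four",
        "text": "Brian Griese talks about the first touchdown he ever threw",
        "metadata": {"Players": ["Brian Griese"], "Event": "Interview"},
    },
]


def keyword_search(assets, keyword):
    """Syntactic search: match the keyword anywhere in title or text."""
    needle = keyword.lower()
    return [a["title"] for a in assets
            if needle in (a["title"] + " " + a["text"]).lower()]


def attribute_search(assets, **criteria):
    """Attribute search: every given field must contain the wanted value."""
    def matches(metadata):
        for field, wanted in criteria.items():
            value = metadata.get(field)
            values = value if isinstance(value, list) else [value]
            if wanted not in values:
                return False
        return True

    return [a["title"] for a in assets if matches(a["metadata"])]


by_keyword = keyword_search(ASSETS, "touchdown")
by_attribute = attribute_search(ASSETS, Event="Touchdown", Teams="Ravens")
```

Here the keyword query returns both assets, including the interview that merely mentions a touchdown, while the attribute query returns only the asset whose event actually is a touchdown.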
HP 72
Taaleersquos Semantic Search
Highly customizable, precise and freshest A/V search
Context and Domain Specific Attributes; Uniform Metadata for Content from Multiple Sources; Can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do?
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY amp CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content
Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries & Hot Topics
PERSONALIZATION
Personalized Queries
1. My Stock Portfolio: Microsoft suffers serious hack attack; Cisco Systems Inc; Analyst Safa Rashtchy on Yahoo; PeopleSoft Inc; AT&T Corp; more...
2. My Football Fantasy Team: Gators' Spurrier ready for big game; Tech's Vick looks to become complete QB; Bucs excited about Hamilton; Jasper Sanks rumbles into the end zone...; Edwards explains reasons for leaving BYU; more...
3. Julia Roberts Collection: Movie Trailer Notting Hill; Trailer - Runaway Bride; Patrick; Movie Trailer Stepmom; Conspiracy Theory; more...
4. Pink Floyd Collection: Set the Controls for the Heart of the Sun...; Wish You Were Here; Round And Around; Keep Talking; The Post War Dream; more...
HOT Topics
1. Election 2000: Video Explaining the electoral map; Race for White House hots up; Seniors Give Gore Florida Edge; more...
2. Middle East Peace Conflict: More die as Israel steps up security; Israel braces for suicide bombs; Pentagon probes Cole's security; more...
3. Napster Controversy: The Brain Behind Napster; Napster Lawsuit; Creative Nomad II; more...
HP 80
Metadata Targeting
Semantic/Interactive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
[Figure labels: Realtime Feeds; Web sites and Pages; Content Databases; Time-Shifted Content Aggregator; Semantic Engine™; Structured Hi-Quality Semantic Metabase; Interests, Preferences; Personalized Content]
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
[Screen: MyStocks | News | Sports | Music | MyMedia]
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site.
User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories.
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
[Screen: MyStocks | News | Sports | Music | MyMedia; My Stocks: CSCO, NT, IBM, Market]
Clicking on MyStocks brings down user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide).
Search at the bottom is a semantic search that understands the financial domain and the knowledge of user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu.
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
[Screen: MyStocks | News | Sports | Music | MyMedia; My Stocks: CSCO, NT, IBM, Market; CSCO: Analyst Call, Conf Call, Earnings]
Different types of recent audio content about Cisco are available.
The user clicks to see a listing of Analyst Calls on Cisco (next slide).
Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
[Screen: MyStocks | News | Sports | Music | MyMedia; My Stocks: CSCO, NT, IBM, Market; CSCO: Analyst Call, Conf Call, Earnings]
CSCO Analysis: 11/08 ON24 Payne; 11/07 ON24 H&Q; 11/06 CBS Langlesis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst.
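One way to picture "semantic filtering" for a small screen is a per-content-type priority list of metadata fields, trimmed to a character budget. The sketch below is only an assumption about how such filtering could work; the priority table, field names and `filter_for_screen` function are hypothetical.

```python
# Hypothetical priority of metadata fields for each content type.
PRIORITY = {
    "Analyst Call": ["date", "source", "analyst"],
    "Earnings": ["date", "company", "source"],
}


def filter_for_screen(item, content_type, max_chars):
    """Keep the highest-priority fields that fit within max_chars."""
    parts = []
    for field in PRIORITY.get(content_type, []):
        value = item.get(field)
        if value is None:
            continue
        candidate = " ".join(parts + [str(value)])
        if len(candidate) > max_chars:
            break  # the next field would not fit on the screen line
        parts.append(str(value))
    return " ".join(parts)


# Sample item modeled on the listing above; extra fields are never shown
# in the list view because they are not in the priority table.
call = {
    "date": "11/07",
    "source": "ON24",
    "analyst": "H&Q",
    "rating": "Strong Buy",
    "transcript": "full text of the call",
}
wide_line = filter_for_screen(call, "Analyst Call", 20)
narrow_line = filter_for_screen(call, "Analyst Call", 10)
```

Fields outside the priority list, such as the rating, could then be surfaced through icons rather than text, as the slide describes.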
HP 87
iTV: Taalee's Extreme Personalization
[Figure labels: Content Provider (DBS, DISH, Wink, AOL-TV); Content "Programs"; Semantic Engine™; Meta-Data Tagged Content; Structured Hi-Quality Semantic Metabase; Immediate Interests, Preferences; Personalized Content Capsules; Redirects and Programming]
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by personalization application to show only the stocks that user is interested in.
This screen is customizable with interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO.
Part of the screen can be automatically customized to show conference call specific information - including transcript, participation, etc., all of which are relevant metadata.
Conference Call itself can have embedded metadata to support personalization and interactivity.
HP 89
Metadata in Enterprise Apps
[Figure labels: Network Content; Affiliate Feeds; Public Sources; Rich Data; Collection; Processing (Categorize, Catalog, Integrate); Metabase; Production Support (Sony); Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication]
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were
-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
Sixty Die In Nigeria Blast
Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
Clips (duration - date - source):
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
Description: Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore
HP 92
Metadata's role in emerging iTV infrastructure
[Figure labels: Video; MPEG-2/4/7 MPEG Encoder; Create Scene Description Tree; Scene Description Tree; Node = AVO Object; Taalee Semantic Engine; Enhanced XML Description; Node "Cisco Systems"; Metadata-rich Value-added Node; Object Content Information (OCI); Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; GREAT USER EXPERIENCE]
Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: John Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers and Broadcasters
HP 93
Intelligent Metadata Creation
Extractor Agents: content which does contain the words the user asked for
+ Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for
+ Semantic Associations: content the user did not think to ask for, but which he needs to know
Metadata for Intelligent Content Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example: If a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".
Understanding of the content is needed to create new metadata.
Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely.
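The enrichment step described above can be sketched as a lookup against a small knowledge base. This is an illustrative sketch under stated assumptions, not Taalee's extraction technology: the knowledge base layout, the naive substring matching and the `enrich` function are hypothetical.

```python
# Hypothetical knowledge base: entity -> facts known about it.
KNOWLEDGE_BASE = {
    "Roger Clemens": {"Sport": "Baseball", "Team": "New York Yankees"},
    "Jamal Anderson": {
        "Sport": "Football",
        "Team": "Atlanta Falcons",
        "League": "NFL",
    },
}


def extract_players(text, knowledge_base):
    """Naive extraction: known player names that occur in the text."""
    return sorted(name for name in knowledge_base if name in text)


def enrich(text, knowledge_base):
    """Combine extracted metadata with value-added metadata."""
    metadata = {"Players": extract_players(text, knowledge_base)}
    for player in metadata["Players"]:
        for field, value in knowledge_base[player].items():
            values = metadata.setdefault(field, [])
            if value not in values:
                values.append(value)  # value-added: not present in the text
    return metadata


def matches(metadata, term):
    """True if the search term equals any metadata value."""
    return any(term in values for values in metadata.values())


story = "Roger Clemens struck out ten batters last night."
story_metadata = enrich(story, KNOWLEDGE_BASE)
```

With this metadata attached, a search for "New York Yankees" matches the story even though the phrase never occurs in its text.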
HP 96
Guided Demo for Value Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player of Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player of Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata.
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search.
HP 99
Example 2 ndash Snapshots (ldquoGary Sheffieldrdquo)
Click on first result for Gary Sheffield
View metadata Note that Team name and League name are also included
in the metadata
Search for lsquoGary Sheffieldrsquo in lsquoBaseballrsquo
View the original source HTML page Verify that
the source page contains no mention of Team nameand League name They
were Taaleersquos value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content ndash Value-Added Metadata
Posted Date
Posted Date
Date of asset posting ndashExtracted automatically
League Name
Name of league to which the payerrsquos team belongs ndash Not mentioned explicitly in asset ndash Value-added by Taaleersquos processing based on semantic associations
Name of team for which player plays ndash Not mentioned explicitly in assetndash Value-added using Taaleersquos semantic relationships
Team NameTeam Name
Producer Name
Producer Name
Rich MediaSports AssetRich Media
Sports Asset
Name of content provider that produced the asset
Some Metadata are obtained explicitly from the asset Others (not present in the asset) are added
by Taalee using its semantic relationships
The asset is richly fully described in the many ways the users chose to interact
Player NamesPlayer Names
SportSportName of
sport
LegendX Y meansTaalee uses X to add Yas value-added metadatato the asset
Name of players mentioned explicitly in the asset ndash Extracted automatically
HP 101
Intelligent Contentvia
Semantic Associations
HP 102
Semantic Associations
bull Traditional search engines rely solely on (syntactic) keywords to find content
bull They do not understand the meaning context or relationships of keywords
For example a search engine may see that the word ldquoCommerce Onerdquo occurs
but it does not know that Commerce One is a COMPANY which Participates in
the Corporate Professional amp Financial Software INDUSTRY and COMPETESWITH Ariba
As a result search engines cannot go beyond returning a list (or directory view)
of what the user has asked for Their ability to provide associated information is
extremely limited static and difficult to scale Taaleersquos Semantic Content Model
goes beyond indexing keywords and classifying assets toUnderstand and Associate all content it catalogs
HP 103
Example (test on httpdirectorymediaanywherecom)
Search for company lsquoCommerce Onersquo
Links to news on companies that compete against
Commerce One
Links to news on companies Commerce One competes
against(To view news on Ariba click
on the link for Ariba)
Crucial news on Commerce Onersquos
competitors (Ariba) can be accessed easily and
automatically
HP 104
Internal Source 1Research
Internal Source 2
External feedsWeb(eg Reuters)
1
2
3
4
Cisco story from PW Source 1passed on to addsemanticassociations
ConsultsKnowledgeBasefor Ciscorsquoscompetition
Returns resultLucent is a competitor of Cisco
Lucent story from external
feeds picked for publishing as ldquosemantically
relatedrdquo to Ciscostory ndash passed
on to Dashboard
Story onLucent
Story onCisco
XCM-compliant metadata XML or other format
SemanticApplication
ASPEnterprise hosted
Extractor Agent 1
Extractor Agent 2
Extractor Agent 3
Metadata centricContent Management Architecture
SemanticEngine
World Model
TaaleeMetabase
Third-partyContent Mgmt
AndSyndication
HP 105
Semantic Associationssupported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANYCOMPANYRelated Stock News
Related Stock News
IndustryNews
IndustryNews
CompetitionCompetitionCOMPANIES inINDUSTRY with Competing PRODUCTSCOMPANIES in Same or
Related INDUSTRY
SECEPAEPA
RegulationsRegulationsImpacting INDUSTRY or Filed By COMPANY
Technology Products
Technology ProductsImportant to INDUSTRY or COMPANY
HP 106
Semantic Web Application ExampleFinancial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic SearchPersonalization etc
A vision for future
Semantic Web Complex Relationships and Knowledge Discovery
Eg InfoQuilt project at LSDIS Lab Univ of Georgia
HP 108
Beyond RDF ndash one proposal (cf Ora Lassila)
Structural modeling obviously not enoughwe need a ldquologic layerrdquo on top of RDFsome type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful particularly if we have logicrules which allow us to draw inference from this data
RDF + DL = ldquoFrame System for WWWrdquo
Source wwwontoknowledgeorgoil
HP 109
Semantic Web - next step in Web evolution
ldquoA Web in which machine reasoning will be ubiquitous and devastatingly powerfulrdquo [Berners-Lee]
ldquoA place where the whim of a human being and the reasoning of a machine coexist in an ideal powerful mixturerdquo [Berners-Lee]
ldquoA semantic Web would permit more accurate and efficient Web searches which are among the most important Web-based activitiesrdquo [Berners-Lee]
A personal definitionSemantic Web The concept that Web-accessible
content can be organized semantically rather than though syntactic and structural methods
What is DAML (DARPA Agent Markup Language)
a proposal to create technologies that will enable software agents to dynamically identify and understand information sources and to provide interoperability between agents in a semantic mannerBased on RDF+XMLAgent readable Tags
wwwdamlorg
DAML Example
Sou
rce
http
w
ww
zdn
etc
omp
cwee
kst
orie
sju
mps
04
270
2432
946
00h
tml
Three layered Architecture Of Semantic Web
Logical Layer
Formal Semantics and Reasoning Support ndash OIL DAML-O
Schema Layer
Definition of Vocabulary RDF Schema
Data Layer
Simple data model and syntax for metadata - RDF
OIL ndash as RDF Extension
ltrdfsClass rdfID=rdquoherbivorerdquogtltrdftype
rdfresource=rdquohttpwwwontoknowledgeorgDefinedClassrdquogtltrdfssubClassOf rdfresource=rdquoanimalrdquogtltrdfssubClassOfgt
ltoilNOTgtltoilhasOperand rdfresource=rdquocarnivorerdquogt
ltoilNOTgtltrdfssubClassOfgt
ltrdfsClassgt
DAML and OIL ndash Evolving towards Semantic Web
OIL MissionOIL is a Web-based representation and inference layer for ontologies which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery Knowledge Discovery --ExampleExample
Earthquake Sources(USGS NEIC)
Nuclear Test Sources(Oklahoma Observatory etc)
Nuclear Test May Cause Earthquakes
Is it really true
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake <=
  dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
  AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
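The rule above can be sketched as an ordinary predicate. This is only an illustration under stated assumptions: the `Event` record, the function names, the use of great-circle (haversine) distance, and miles as the unit are choices made for this example; the slide does not say how `dateDifference` and `distance` are computed. The sketch also assumes the earthquake must not precede the test.

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; miles are an assumed unit


@dataclass
class Event:
    """Hypothetical record; field names mirror the attributes used in the rule."""
    event_date: date
    latitude: float
    longitude: float


def distance_miles(a: Event, b: Event) -> float:
    """Great-circle (haversine) distance between two events, in miles."""
    lat1, lon1, lat2, lon2 = map(radians, (a.latitude, a.longitude, b.latitude, b.longitude))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    h = min(1.0, h)  # guard against rounding slightly above 1 for near-antipodal points
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(h))


def may_have_caused(test: Event, quake: Event, max_days: int = 30, max_miles: float = 10000) -> bool:
    """True if the quake follows the test by fewer than max_days and is within max_miles."""
    days_after = (quake.event_date - test.event_date).days
    return 0 <= days_after < max_days and distance_miles(test, quake) < max_miles


def candidate_pairs(tests, quakes):
    """All (test, quake) pairs that satisfy the rule."""
    return [(t, q) for t in tests for q in quakes if may_have_caused(t, q)]
```

A pair returned by `candidate_pairs` is only a candidate for the "causes" relationship; the rule expresses temporal and spatial proximity, not proof of causation.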
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example…
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and > 9 on the Richter scale, per year starting from 1900, find the average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
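The grouping query above could be approximated as follows. This is a minimal sketch, assuming earthquakes arrive as `(year, magnitude)` pairs and that each range includes its lower bound and excludes its upper bound (so a magnitude of exactly 9 falls in the last range); neither detail is given on the slide.

```python
from collections import Counter

# Magnitude ranges from the slide; the last range is open-ended.
RANGES = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, float("inf"))]


def magnitude_range(magnitude):
    """Return the (low, high) range containing the magnitude, or None if below 5.8."""
    for low, high in RANGES:
        if low <= magnitude < high:
            return (low, high)
    return None


def average_per_year(quakes, first_year, last_year):
    """quakes: iterable of (year, magnitude). Returns {range: average count per year}."""
    counts = Counter()
    for year, magnitude in quakes:
        bucket = magnitude_range(magnitude)
        if bucket is not None and first_year <= year <= last_year:
            counts[bucket] += 1
    n_years = last_year - first_year + 1
    return {r: counts[r] / n_years for r in RANGES}
```

Calling `average_per_year` once for the years before the first test and once for the years after would give the comparison the slide draws for each magnitude range.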
Knowledge Discovery - Example…
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and the test site lies within a radius of 10000 miles of the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 54
Basis for Semantics
Reasoning/Inference
(Statistical) (Information Retrieval)
Statistical Learning/AI (Bayesian, Neural Networks, HMM, …)
Logic Based (Description Logic)
Natural Language/Grammar (part of speech)
HP 55
Alternatives for Metadata Extraction
Statistical methods: Cluster Analysis
Learning/AI and Collab. Filtering
Reference data: Concept-terms, Dictionary/Thesaurus, By topic/industry/subject/domain
Word or Phrase
Ontologies/Domain Models
KnowledgeBase: By Entities and Relationships
deeper understanding
HP 56
Open Directory Project (ODP): Classification/Taxonomy & Directory
HP 57
Ontology
Standardize meaning, description, representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes:
Attributes
Domain Rules
Functional Dependencies
HP 59
An Ontology
Example: Interrelated ontologies
[Figure: diagram of interrelated ontologies. Recoverable labels:
LAND USE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL
LAND (SITE): CULTIVATED AREA, GREEN LAND AREA, LAND BANK, ZONING, LANDFILL SITE
WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; processes: shredding, magnetic separation, screening, washing
NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI, connected by several "causes" relationships]
HP 61
Large Vocabularies, Taxonomies, Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM’s controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms
Confidential HP
Metadata-enabled Applications
HP 63
Metadata Usage: Impact on Search & Query Processing
traditional queries based on keywords
attribute-based queries
content-based queries
HP 64
Oingo.com
Oingo Ontology – ODP based (?): the database of millions of concepts and relationships that powers Oingo’s semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo’s semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user “think” about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more “intelligent” through automatic analysis of every individual asset to generate a catalog containing
• Context of the Content
• Semantic Metadata describing entities (e.g., Company, Industry, etc.) and
• Relationships (semantic associations) among all entities
Based on a “Semantic” or “domain” model describing how the user thinks about the subject matter, supported by a knowledgebase
“Normal” Content can only be “found” if the user enters a keyword that exists within it
Intelligent Content = +
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more “intelligent” through automatic analysis of every individual content item to create
Context of the Content
Semantic Metadata describing entities (e.g., Company, Industry, etc.) and
Relationships (semantic associations) among all entities
Based on a “Semantic” or “domain” model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google’s Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …
HP 70
Metadata Usage Keyword Attribute and Content Based Access
The VisualHarness system at LSDIS/UGA
HP 71
Keyword Search vs Attribute Search with Semantic Metadata
Virage Search on “football touchdown”
Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline
Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline
Metadata from Typical Cataloging of Football Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31 Pit 24
http://www.nfl.com
Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens’ lead to 31-24 in the third quarter
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Qadry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
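The contrast between the two kinds of search can be sketched in a few lines. This is a toy illustration, not Virage's or Taalee's implementation: the asset records are abbreviated from the slide, the dictionary layout and both function names are assumptions made for this example.

```python
# Two assets abbreviated from the slide; the layout is hypothetical.
ASSETS = [
    {
        "title": "Brian Griese Interview Part Four",
        "description": "Brian Griese talks about the first touchdown he ever threw",
        "metadata": {},  # typical cataloging: no structured attributes
    },
    {
        "title": "Baltimore 31 Pit 24",
        "description": "Third long touchdown, a 76-yarder, extends the lead to 31-24",
        "metadata": {
            "League": "Professional",
            "Teams": ["Ravens", "Steelers"],
            "Score": "Bal 31, Pit 24",
            "Event": "Touchdown",
            "Produced by": "NFL.com",
        },
    },
]


def keyword_search(assets, query):
    """Return assets whose title or description contains every query word."""
    words = query.lower().split()
    hits = []
    for asset in assets:
        text = f"{asset['title']} {asset['description']}".lower()
        if all(word in text for word in words):
            hits.append(asset)
    return hits


def attribute_search(assets, **criteria):
    """Return assets whose metadata matches every field=value criterion."""
    def matches(asset):
        for field, wanted in criteria.items():
            value = asset["metadata"].get(field)
            values = value if isinstance(value, list) else [value]
            if wanted not in values:
                return False
        return True

    return [asset for asset in assets if matches(asset)]


print([a["title"] for a in keyword_search(ASSETS, "steelers")])
print([a["title"] for a in attribute_search(ASSETS, Teams="Steelers", Event="Touchdown")])
```

The keyword query for "steelers" finds nothing because the word never appears in the text, while the attribute query finds the game clip through its Teams field.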
HP 72
Taalee’s Semantic Search
Highly customizable, precise and freshest A/V search
Context and Domain Specific Attributes
Uniform Metadata for Content from Multiple Sources; can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY amp CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content
Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries & Hot Topics
PERSONALIZATION
Personalized Queries
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
more…
2 My Football Fantasy Team
Gators’ Spurrier ready for big game
Tech’s Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone…
Edwards explains reasons for leaving BYU
more…
3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
more…
4 Pink Floyd Collection
Set the Controls for the Heart of the Sun…
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
more…
HOT Topics
1 Election 2000
Video: Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
more…
2 Middle East Peace Conflict
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Cole’s security
more…
3 Napster Controversy
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II
more…
HP 80
Metadata Targeting
Semantic/Interactive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
[Figure: architecture diagram. Recoverable labels: Realtime Feeds; Web sites and Pages; Content Databases; Time-Shifted Content Aggregator; Semantic Engine™; Structured Hi-Quality Semantic Metabase; Interests, Preferences; Personalized Content]
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette’s Enterprise Customer site
User’s “Wireless Home page” shows the categories for his interests. There is an alert (new content) for his stock and sports categories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down the user’s Personal Portfolio list. The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user’s portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and the analyst name or company. The icons denote additional metadata, such as “Strong Buy” by the H&Q analyst
HP 87
iTV: Taalee’s Extreme Personalization
Content Provider (DBS, DISH, Wink, AOL-TV)
Semantic Engine™
Meta-Data Tagged Content
Content “Programs”
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-Quality Semantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in
This screen is customizable with interactivity features using metadata, such as whether there is a new Conference Call video on CSCO
Part of the screen can be automatically customized to show conference call specific information – including transcript, participation, etc. – all of which are relevant metadata
Conference Call itself can have embedded metadata to support personalization and interactivity
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support
Sony
Categorize
Catalog
Integrate
Collection
Processing
Network Content
Affiliate Feeds
Public Sources
Rich Data
Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria’s commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
“At least 60 people died in this needless fire,” senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp. (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were
-- Breaking News for 11/30/2000 --
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can’t Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
Customize Page: Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee’s semantic metadata enables powerful access to content used by Enterprise’s customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can’t Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) – 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) – Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore’s appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit and one similar to it from Martin County demonstrate Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore
HP 92
Metadata’s role in emerging iTV infrastructure
[Figure: pipeline diagram. Recoverable labels: Video; MPEG-2/4/7; MPEG Encoder; MPEG Decoder; Enhanced Digital Cable; Create Scene Description Tree; Retrieve Scene Description Track; Scene Description Tree; Node = AVO Object; Object Content Information (OCI); Enhanced XML Description; Taalee Semantic Engine; Node “Cisco Systems”; Metadata-rich Value-added Node; GREAT USER EXPERIENCE]
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers and Broadcasters
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta
HP 93
Intelligent Metadata Creation
Extractor Agents: Content which does contain the words the user asked for
+ Value-added Metadata: Content which does not contain the words the user asked for, but is about what he asked for
+ Semantic Associations: Content the user did not think to ask for, but which he needs to know
Metadata for Intelligent Content
Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the “right” keyword
For example: if a story is about “Roger Clemens” but does not contain the words “New York Yankees”, that story cannot and will not be found if the user searches for “New York Yankees” or “Yankees”
Understanding of the content is needed to create new metadata
Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
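The enrichment step described above can be sketched as a lookup against a small knowledge base. This is a simplified illustration only: the `KNOWLEDGE_BASE` layout, the relation names `plays_for` and `member_of`, and the `enrich` function are assumptions made for this example, not Taalee's actual data model. The facts themselves are taken from the slides.

```python
# Hypothetical knowledge base of (subject, relation) -> object facts.
KNOWLEDGE_BASE = {
    ("Roger Clemens", "plays_for"): "New York Yankees",
    ("Jamal Anderson", "plays_for"): "Atlanta Falcons",
    ("Gary Sheffield", "plays_for"): "Los Angeles Dodgers",
    ("Atlanta Falcons", "member_of"): "NFL",
    ("Los Angeles Dodgers", "member_of"): "National League",
}


def enrich(metadata):
    """Return a copy of the metadata with Teams and League inferred from Players."""
    enriched = {key: list(values) for key, values in metadata.items()}
    teams = enriched.setdefault("Teams", [])
    leagues = enriched.setdefault("League", [])
    # Player -> team: value-added metadata not present in the asset itself
    for player in enriched.get("Players", []):
        team = KNOWLEDGE_BASE.get((player, "plays_for"))
        if team and team not in teams:
            teams.append(team)
    # Team -> league: a second hop through the knowledge base
    for team in teams:
        league = KNOWLEDGE_BASE.get((team, "member_of"))
        if league and league not in leagues:
            leagues.append(league)
    return enriched


extracted = {"Players": ["Jamal Anderson"], "Produced by": ["NFL.com"]}
print(enrich(extracted))
```

With the added Teams and League fields, a search for "Atlanta Falcons" can reach a story that only mentions Jamal Anderson by name.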
HP 96
Guided Demo for Value-Added Metadata – Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled “Week 3 Top10 Anderson TD Run”) and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked “REAL”)
• View the source HTML page that has the original story and locate this story with the heading “Week 3 top 10 Anderson TD run”
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset’s existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value-Added Metadata – Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled “I want out”) & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked “REAL”)
• View the source HTML page that has the original story and locate this story with the heading “I want out”
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset’s existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 – Snapshots (“Jamal Anderson”)
Search for ‘Jamal Anderson’ in ‘Football’
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee’s value-additions to the metadata to facilitate easier search
HP 99
Example 2 – Snapshots (“Gary Sheffield”)
Search for ‘Gary Sheffield’ in ‘Baseball’
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee’s value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content – Value-Added Metadata
Rich Media Sports Asset
Producer Name: Name of content provider that produced the asset
Posted Date: Date of asset posting – Extracted automatically
Player Names: Names of players mentioned explicitly in the asset – Extracted automatically
Team Name: Name of team for which player plays – Not mentioned explicitly in asset – Value-added using Taalee’s semantic relationships
League Name: Name of league to which the player’s team belongs – Not mentioned explicitly in asset – Value-added by Taalee’s processing based on semantic associations
Sport: Name of sport
Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships
The asset is richly and fully described in the many ways the users choose to interact
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context or relationships of keywords
For example, a search engine may see that the word “Commerce One” occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee’s Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
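One way to picture these associations is as subject-relation-object triples that an application can traverse. The sketch below is an assumption-laden illustration: the triple store, the relation names, the story records, and both helper functions are invented for this example, and `competes_with` is treated as symmetric so that news is linked in both directions.

```python
# Associations from the slide, stored as (subject, relation, object) triples.
ASSOCIATIONS = {
    ("Commerce One", "is_a", "COMPANY"),
    ("Commerce One", "participates_in", "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
}

# Hypothetical catalog entries; only the company tags matter here.
STORIES = [
    {"headline": "Story on Ariba", "companies": ["Ariba"]},
    {"headline": "Story on Commerce One", "companies": ["Commerce One"]},
    {"headline": "Unrelated story", "companies": ["Acme"]},
]


def competitors(company):
    """Companies linked to the given one by competes_with, in either direction."""
    found = set()
    for subject, relation, obj in ASSOCIATIONS:
        if relation != "competes_with":
            continue
        if subject == company:
            found.add(obj)
        elif obj == company:
            found.add(subject)
    return found


def related_stories(company, stories):
    """Stories about competitors of the company, which the user did not ask for."""
    rivals = competitors(company)
    return [story for story in stories if rivals & set(story["companies"])]


print([s["headline"] for s in related_stories("Commerce One", STORIES)])
```

A keyword index alone could return only the Commerce One story; following the association is what surfaces the Ariba story as related content.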
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company ‘Commerce One’
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One’s competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata-centric Content Management Architecture
Sources: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters)
Components: Extractor Agent 1, Extractor Agent 2, Extractor Agent 3; Semantic Engine; World Model; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication
Output: XCM-compliant metadata, XML or other format
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults KnowledgeBase for Cisco’s competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as “semantically related” to Cisco story – passed on to Dashboard
Story on Lucent
Story on Cisco
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
SEC, EPA Regulations: Impacting INDUSTRY or Filed By COMPANY
Technology, Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF – one proposal (cf. Ora Lassila)
Structural modeling obviously not enough
we need a “logic layer” on top of RDF
some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data
RDF + DL = “Frame System for WWW”
Source: www.ontoknowledge.org/oil
HP 55
Alternatives for Metadata Extraction
Statistical methodsCluster Analysis
LearningAI and Collab Filtering
Reference dataConcept-termsDictionaryThesaurusBy topicindustrysubjectdomain
Word or Phrase
OntologiesDomain Models
KnowledgeBaseBy Entities and Relationships
deeperunderstanding
HP 56
Open Directory Project (ODP) ClassificationTaxonomy amp Directory
HP 57
Ontology
Standardize meaning description representation of involved attributes Capture the semantics involved via domain characteristicsAllow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includesAttributesDomain RulesFunctional Dependencies
HP 59
An Ontology
Example Interrelated ontologies
LANDUSE
COMERCIAL
INDUSTRIALRURAL
RESIDENTIAL
AGRICULTURAL
MILITARYRECREATIONAL
LAND(SITE)
CULTIVATEDAREA
GREENLANDAREA LAND
BANK
ZONING
LANDFILLSITE
WASTEDISPOSAL
RECYCLING
HAZARDOUS
LANDFILLRESOURCE REC
SOLID SEWAGE
shredding
magneticseparation
screening
washing
NATURALDISASTER
EARTHQUAKE
causes
LANDSLIDE
VOLCANO
STORMFLOOD
FIRE
AVALANCHE
TSUNAMI
causes
causes
causes
HP 61
Large Vocabularies TaxonomiesOntologies
WordNetThe Medical Subject Headings (MeSH) NLMscontrolled vocabulary used for indexing articles for cataloging books and other holdings and for searching MeSH-indexed databases including MEDLINE MeSHterminology provides a consistent way to retrieve information that may use different terminology for the same concepts Year 2000 MeSH includes more than 19000 main headings 110000 Supplementary Concept Records (formerly Supplementary Chemical Records) and an entry vocabulary of over 300000 terms
Confidential HP
Metadata enabledApplications
HP 63
Metadata Usage Metadata Usage Impact on Search amp Query processing Impact on Search amp Query processing
traditional queries based on keywordsattribute based queriescontent-based queries
HP 64
Oingocom
Oingo Ontology ndash ODP based() the database of millions of concepts and relationships that powers Oingossemantic technologyOingo Seek - the database of millions of concepts and relationships that powers Oingos semantic technologyOingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and contextOingo Lingua - the language of meaning used to state intent The basis for intelligent interactionAssets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely-related high-value information beyond what
was requested
Ability to explore any dimension around the immediate
point of interest Intelligent content helps the user
ldquothinkrdquo about and fulfill their information needs with less effort
Intelligent content can bemore effectively managed packaged and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more “intelligent” through automatic analysis of every individual asset to generate a catalog containing:
• Context of the Content
• Semantic Metadata describing entities (i.e., Company, Industry, etc.), and
• Relationships (semantic associations) among all entities
Based on a “Semantic” or “domain” model describing how the user thinks about the subject matter, supported by a knowledgebase
“Normal” Content can only be “found” if the user enters a keyword that exists within it
Intelligent Content=+
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more “intelligent” through automatic analysis of every individual content item to create:
Context of the Content
Semantic Metadata describing entities (i.e., Company, Industry, etc.), and
Relationships (semantic associations) among all entities
Based on a “Semantic” or “domain” model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google’s Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …
HP 70
Metadata Usage: Keyword, Attribute and Content Based Access
The VisualHarness system at LSDIS/UGA
HP 71
Keyword Search vs. Attribute Search with Semantic Metadata
Virage: search on “football touchdown”
Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline
Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline
Metadata from Typical Cataloging of Football Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31, Pit 24
http://www.nfl.com
Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens’ lead to 31-24 in the third quarter
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Qadry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
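The difference between the two kinds of search on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the in-memory catalog, the shortened story text, and the helpers `keyword_search` and `attribute_search` are hypothetical and are not Virage or Taalee APIs. The field names mirror the metadata labels above.

```python
# Hypothetical in-memory catalog; field names follow the slide's labels.
ASSETS = [
    {
        "title": "Jimmy Smith Interview Part Seven",
        "text": "Jimmy Smith explains his philosophy on showboating",
        "metadata": {},  # typical cataloging: no structured attributes
    },
    {
        "title": "Baltimore 31, Pit 24",
        "text": "Ismail and Banks hook up for their third long touchdown",
        "metadata": {
            "League": "Professional",
            "Teams": ["Ravens", "Steelers"],
            "Players": ["Qadry Ismail", "Tony Banks"],
            "Event": "Touchdown",
        },
    },
]


def keyword_search(assets, *words):
    """Return titles of assets whose title or text contains every word."""
    hits = []
    for asset in assets:
        haystack = (asset["title"] + " " + asset["text"]).lower()
        if all(word.lower() in haystack for word in words):
            hits.append(asset["title"])
    return hits


def attribute_search(assets, **criteria):
    """Return titles of assets whose metadata matches every attribute."""
    hits = []
    for asset in assets:
        meta = asset["metadata"]
        matched = True
        for field, wanted in criteria.items():
            value = meta.get(field)
            values = value if isinstance(value, list) else [value]
            if wanted not in values:
                matched = False
                break
        if matched:
            hits.append(asset["title"])
    return hits


# The word "Ravens" never appears in the story text, so keywords miss it,
# while the structured Teams attribute finds it.
print(keyword_search(ASSETS, "Ravens"))
print(attribute_search(ASSETS, Teams="Ravens", Event="Touchdown"))
```

The attribute query succeeds where the keyword query fails because it matches against cataloged fields rather than against the words that happen to occur in the asset.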
HP 72
Taalee’s Semantic Search
Highly customizable, precise and freshest A/V search
Context and Domain Specific Attributes
Uniform Metadata for Content from Multiple Sources; can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do?
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY & CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic applications for highly relevant and fresh content:
Personalization and
Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries & Hot Topics
PERSONALIZATION
Personalized Queries
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
more…
2 My Football Fantasy Team
Gators’ Spurrier ready for big game
Tech’s Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone…
Edwards explains reasons for leaving BYU
more…
3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
more…
4 Pink Floyd Collection
Set the Controls for the Heart of the Sun…
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
more…
HOT Topics
1 Election 2000
Video: Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
more…
2 Middle East Peace Conflict
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Cole’s security
more…
3 Napster Controversy
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II
more…
HP 80
Metadata Targeting
Semantic/Interactive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web: Extreme Personalization
[Diagram labels: Realtime Feeds; Web sites and Pages; Content Databases; Time-Shifted Content Aggregator; Semantic Engine™; Structured Hi-Quality Semantic Metabase; Interests, Preferences; Personalized Content]
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette’s Enterprise Customer site
User’s “Wireless Home page” shows the categories for his interests. There is an alert (new content) for his stock and sports categories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down the user’s Personal Portfolio list. The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user’s portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as “Strong Buy” by H&Q Analyst
HP 87
iTV: Taalee’s Extreme Personalization
Content Provider
(DBS, DISH, Wink, AOL-TV)
Semantic Engine™
Meta-Data Tagged Content
Content “Programs”
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-Quality Semantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in
This screen is customizable with an interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO
Part of the screen can be automatically customized to show conference call specific information – including transcript, participation, etc., all of which are relevant metadata
Conference Call itself can have embedded metadata to support personalization and interactivity
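A rough sketch of how a personalization application might use such embedded metadata follows. Everything in it is an assumption made for illustration: the segment records, their field names (`ticker`, `type`, `is_new`), and the `personalize` helper are hypothetical; only the portfolio symbols come from the preceding slides.

```python
# Hypothetical metadata attached to segments of a broadcast.
SEGMENTS = [
    {"title": "CSCO conference call", "ticker": "CSCO",
     "type": "Conference Call", "is_new": True},
    {"title": "IBM earnings", "ticker": "IBM",
     "type": "Earnings", "is_new": False},
    {"title": "MSFT analyst call", "ticker": "MSFT",
     "type": "Analyst Call", "is_new": True},
]


def personalize(segments, portfolio):
    """Keep segments about the user's stocks; flag new conference calls."""
    shown = [s for s in segments if s["ticker"] in portfolio]
    alerts = [s["title"] for s in shown
              if s["is_new"] and s["type"] == "Conference Call"]
    return shown, alerts


# Portfolio symbols as listed on the "My Stocks" screens.
shown, alerts = personalize(SEGMENTS, portfolio={"CSCO", "NT", "IBM"})
print([s["title"] for s in shown])
print(alerts)
```

The filter never inspects the audio or video itself; it works entirely from the metadata carried with each segment.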
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support
Sony
Categorize
Catalog
Integrate
Collection
Processing
Network Content
Affiliate Feeds
Public Sources Rich Data
Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria’s commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
“At least 60 people died in this needless fire,” senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were
-- Breaking News for 11/30/2000 --
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can’t Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
Customize Page: Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee’s semantic metadata enables powerful access to content used by Enterprise’s customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can’t Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) – 12/06/00 – ABC
(2:53) – 12/06/00 – CBS
(5:16) – 12/06/00 – ABC
(2:46) – 12/06/00 – FOX
(1:33) – 12/06/00 – NBC
(5:33) – 12/06/00
(3:57) – 12/06/00 – CBS
(4:27) – 12/06/00 – ABC
(3:44) – 12/06/00 – FOX
(7:24) – 12/06/00 – CBS
(1:33) – 12/06/00 – CBS
TALLAHASSEE, Florida (CNN) – Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore’s appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(1:33) – 12/06/00 – ABC
(2:33) – 12/06/00 – CBS
(3:12) – 12/06/00 – NNS
(0:32) – 12/06/00 – CBS
(1:33) – 12/06/00 – CBS
Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore
HP 92
Metadata’s role in emerging iTV infrastructure
[Diagram labels: Video; MPEG-2/4/7; MPEG Encoder; Create Scene Description Tree; Scene Description Tree; Node = AVO Object; Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; Taalee Semantic Engine; “Cisco Systems” Node; Enhanced XML Description; Object Content Information (OCI); Metadata-rich Value-added Node; GREAT USER EXPERIENCE]
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers and Broadcasters
HP 93
Metadata for Intelligent Content
Intelligent Metadata Creation
Extractor Agents: content which does contain the words the user asked for
+ Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for
+ Semantic Associations: content the user did not think to ask for, but which he needs to know
Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the “right” keyword
For example: if a story is about “Roger Clemens” but does not contain the words “New York Yankees”, that story cannot and will not be found if the user searches for “New York Yankees” or “Yankees”
Understanding of the content is needed to create new metadata
Taalee understands Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
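The idea can be sketched as a lookup against a small knowledge base of semantic associations. The sketch below is an assumption-laden illustration, not Taalee's implementation: the `KNOWLEDGE_BASE` layout, the relation names, and `add_value_added_metadata` are all hypothetical, and only the Roger Clemens facts come from the slide.

```python
# Hypothetical knowledge base of semantic associations, stored as
# (subject, relation) -> object. The facts are the slide's own example.
KNOWLEDGE_BASE = {
    ("Roger Clemens", "is_a"): "PERSON",
    ("Roger Clemens", "plays_sport"): "Baseball",
    ("Roger Clemens", "plays_for"): "New York Yankees",
}

# Which relation fills which metadata field (an illustrative mapping).
RELATION_TO_FIELD = {"plays_sport": "Sport", "plays_for": "Team"}


def add_value_added_metadata(metadata, knowledge_base=KNOWLEDGE_BASE):
    """Return a copy of `metadata` with fields inferred for each player."""
    enriched = {key: list(values) for key, values in metadata.items()}
    for player in metadata.get("Players", []):
        for relation, field in RELATION_TO_FIELD.items():
            value = knowledge_base.get((player, relation))
            if value is not None and value not in enriched.setdefault(field, []):
                enriched[field].append(value)
    return enriched


# Metadata extracted from a story that never mentions the Yankees.
extracted = {"Players": ["Roger Clemens"]}
enriched = add_value_added_metadata(extracted)
print(enriched)
```

Once the `Team` field is present, a search for “New York Yankees” can match the story even though those words never occur in it.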
HP 96
Guided Demo for Value Added Metadata – Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled “Week 3 Top10 Anderson TD Run”) and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked “REAL”)
• View the source HTML page that has the original story and locate this story with the heading “Week 3 top 10 Anderson TD run”
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset’s existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata – Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled “I want out”) & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked “REAL”)
• View the source HTML page that has the original story and locate this story with the heading “I want out”
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset’s existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 – Snapshots (“Jamal Anderson”)
Search for ‘Jamal Anderson’ in ‘Football’
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee’s value-additions to the metadata to facilitate easier search
HP 99
Example 2 – Snapshots (“Gary Sheffield”)
Search for ‘Gary Sheffield’ in ‘Baseball’
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee’s value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content – Value-Added Metadata
Rich Media Sports Asset
Producer Name: name of content provider that produced the asset
Posted Date: date of asset posting – extracted automatically
Player Names: names of players mentioned explicitly in the asset – extracted automatically
Team Name: name of team for which player plays – not mentioned explicitly in asset – value-added using Taalee’s semantic relationships
League Name: name of league to which the player’s team belongs – not mentioned explicitly in asset – value-added by Taalee’s processing based on semantic associations
Sport: name of sport
Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships
The asset is richly and fully described in the many ways the users choose to interact
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context or relationships of keywords
For example, a search engine may see that the word “Commerce One” occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee’s Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
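As a minimal sketch of what “associate” could mean in practice, the example below stores typed relationships and follows the COMPETES WITH link to pull in related stories. The data structures, relation names, placeholder headlines, and both helper functions are hypothetical, chosen only to mirror the Commerce One example; they do not describe the actual Semantic Content Model.

```python
# Hypothetical typed relationships, taken from the slide's example.
RELATIONSHIPS = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Commerce One", "participates_in",
     "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]

# Relations that hold in both directions (an assumption of this sketch).
SYMMETRIC = {"competes_with"}

# Placeholder headlines keyed by the company each story is about.
STORIES = {
    "Commerce One": ["Story on Commerce One"],
    "Ariba": ["Story on Ariba"],
}


def related_entities(entity, relation, relationships=RELATIONSHIPS):
    """Entities linked to `entity` by `relation`."""
    found = []
    for subject, rel, obj in relationships:
        if rel != relation:
            continue
        if subject == entity:
            found.append(obj)
        elif obj == entity and rel in SYMMETRIC:
            found.append(subject)
    return found


def search_with_associations(company):
    """Return what was asked for plus news on the company's competitors."""
    result = {"asked_for": STORIES.get(company, []), "competitors": {}}
    for rival in related_entities(company, "competes_with"):
        result["competitors"][rival] = STORIES.get(rival, [])
    return result


print(search_with_associations("Commerce One"))
```

A keyword index alone could return only the `asked_for` part; the `competitors` part exists because the relationship is stored explicitly.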
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company ‘Commerce One’
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One’s competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata-centric Content Management Architecture
Sources: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters)
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults KnowledgeBase for Cisco’s competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as “semantically related” to Cisco story – passed on to Dashboard
[Diagram labels: Story on Cisco; Story on Lucent; Extractor Agent 1; Extractor Agent 2; Extractor Agent 3; Semantic Engine; World Model; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); XCM-compliant metadata, XML or other format; Third-party Content Mgmt and Syndication]
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for the future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF – one proposal (cf. Ora Lassila)
Structural modeling obviously not enough: we need a “logic layer” on top of RDF; some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data
RDF + DL = “Frame System for WWW”
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
“A Web in which machine reasoning will be ubiquitous and devastatingly powerful.” [Berners-Lee]
“A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture.” [Berners-Lee]
“A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities.” [Berners-Lee]
A personal definition
Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-layered Architecture of the Semantic Web
Logical Layer: Formal Semantics and Reasoning Support – OIL, DAML-O
Schema Layer: Definition of Vocabulary – RDF Schema
Data Layer: Simple data model and syntax for metadata – RDF
OIL – as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL – Evolving towards Semantic Web
OIL Mission
OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery – Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Test May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
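The rule above can be written as an ordinary predicate. The sketch below is one possible reading, with several assumptions made explicit: `dateDifference` is taken to be measured in days, `distance` is taken to be great-circle distance in miles (a later slide gives the radius in miles), and a non-negative difference is required because the prose says the earthquake must occur after the test. The function names and record layout are hypothetical.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; miles are an assumption


def date_difference(earlier, later):
    """Days from `earlier` to `later` (negative if `later` comes first)."""
    return (later - earlier).days


def distance(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two points, in miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to guard against rounding slightly above 1.0.
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """True if the quake follows the test closely in time and space."""
    days = date_difference(test["eventDate"], quake["eventDate"])
    miles = distance(test["latitude"], test["longitude"],
                     quake["latitude"], quake["longitude"])
    return 0 <= days < max_days and miles < max_miles


# Made-up events, used only to exercise the rule.
test_event = {"eventDate": date(2000, 1, 1), "latitude": 0.0, "longitude": 0.0}
quake_event = {"eventDate": date(2000, 1, 11), "latitude": 1.0, "longitude": 1.0}
print(may_have_caused(test_event, quake_event))
```

Note that the rule only marks candidate pairs; a match says that the two events are close in time and space, not that one caused the other.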
Knowledge Discovery – Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year, starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery – Example…
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9 and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
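The grouping query can be sketched as a small aggregation. This is an illustration only: the `(year, magnitude)` record layout and the helper names are hypothetical, the sample records are made up, and the ranges are treated as half-open intervals because the slide's ranges share their boundary values.

```python
from collections import Counter

# Magnitude ranges from the slide, treated as [low, high) intervals.
RANGES = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, float("inf"))]


def magnitude_range(magnitude):
    """Return the range containing `magnitude`, or None if below 5.8."""
    for low, high in RANGES:
        if low <= magnitude < high:
            return (low, high)
    return None


def average_per_year(quakes, first_year, last_year):
    """Average yearly count of earthquakes in each magnitude range."""
    counts = Counter()
    for year, magnitude in quakes:
        bucket = magnitude_range(magnitude)
        if bucket is not None and first_year <= year <= last_year:
            counts[bucket] += 1
    n_years = last_year - first_year + 1
    return {bucket: counts[bucket] / n_years for bucket in RANGES}


# Made-up (year, magnitude) records, used only to exercise the function.
sample = [(1900, 5.9), (1900, 6.5), (1901, 6.1), (1901, 7.2), (1901, 5.0)]
print(average_per_year(sample, 1900, 1901))
```

Running the same aggregation separately over the years before and after a chosen cutoff would give the kind of comparison the slide describes.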
Knowledge Discovery – Example…
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 56
Open Directory Project (ODP): Classification/Taxonomy & Directory
HP 57
Ontology
Standardize meaning, description, representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes:
Attributes
Domain Rules
Functional Dependencies
HP 59
An Ontology
Example: Interrelated ontologies
[Figure labels, in reading order; the connecting lines are not recoverable from the extraction]
LANDUSE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL
LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE
WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE, shredding, magnetic separation, screening, washing
NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI
Relationship label (appears four times): causes
HP 61
Large Vocabularies TaxonomiesOntologies
WordNetThe Medical Subject Headings (MeSH) NLMscontrolled vocabulary used for indexing articles for cataloging books and other holdings and for searching MeSH-indexed databases including MEDLINE MeSHterminology provides a consistent way to retrieve information that may use different terminology for the same concepts Year 2000 MeSH includes more than 19000 main headings 110000 Supplementary Concept Records (formerly Supplementary Chemical Records) and an entry vocabulary of over 300000 terms
Confidential HP
Metadata enabledApplications
HP 63
Metadata Usage Metadata Usage Impact on Search amp Query processing Impact on Search amp Query processing
traditional queries based on keywordsattribute based queriescontent-based queries
HP 64
Oingocom
Oingo Ontology ndash ODP based() the database of millions of concepts and relationships that powers Oingossemantic technologyOingo Seek - the database of millions of concepts and relationships that powers Oingos semantic technologyOingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and contextOingo Lingua - the language of meaning used to state intent The basis for intelligent interactionAssets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely-related high-value information beyond what
was requested
Ability to explore any dimension around the immediate
point of interest Intelligent content helps the user
ldquothinkrdquo about and fulfill their information needs with less effort
Intelligent content can bemore effectively managed packaged and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more ldquointelligentrdquo through automatic analysis of every
individual asset to generate a catalog containing
bull Context of the Content
bull Semantic Metadata describing entities (ie Company Industry etc) and
bull Relationships (semantic associations) among all entities
Based on a ldquoSemanticrdquo or ldquodomainrdquo model describing how the user thinks about the subject matter supported by a knowledgebase
ldquoNormalrdquo Content can only be ldquofoundrdquo if the user enters a keyword that exists within it
Intelligent Content=+
Adding related metadata and relationshipsdramatically increases the ability to
automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more ldquointelligentrdquo through automatic analysis of every individual content item to create
Context of the ContentSemantic Metadata describing entities (ie Company Industryetc) andRelationships (semantic associations) among all entities
Based on a ldquoSemanticrdquo or ldquodomainrdquo model describing how the user thinks about the subject matter supported by a knowledgebase
HP 69
Metadata amp Search
Metadata can improve search significantly but metadata enables much more than searchAlternatives for improving search clustering link and other analysis (eg Googlersquos Link Flux analysis) classification as context ontologies metadata knowledgebases hellip
HP 70
Metadata Usage Keyword Attribute and Content Based Access
The VisualHarness system at LSDISUGA
HP 71
Keyword Search vs Attribute Search with Semantic metadata
Virage Search on football touchdown
Jimmy Smith Interview Part SevenJimmy Smith explains his philosophy on showboating URL httpcbssportsline
Brian Griese Interview Part FourBrian Griese talks about the first touchdown he ever threw URL httpcbssportsline
Metadata from Typical Cataloging of Football
Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31 Pit 24
httpwwwnflcom
Quandry Ismail and Tony Banks hook up for their third long touchdown this time on a 76-yarder to extend the Ravenrsquos lead to 31-24 in the third quarter
ProfessionalRavens SteelersBal 31 Pit 24Quandry Ismail Tony BanksTouchdownNFLcom2022000
LeagueTeamsScore
PlayersEvent
Produced byPosted date
HP 72
Taaleersquos Semantic Search
Highly customizable precise and freshest AV search
Context and Domain Specific Attributes Uniform Metadata for Content from Multiple Sources Can be sorted by any field
Delightful relevant informationexceptional targeting opportunity
HP 73
Cre
atin
g a
Web
of
rela
ted
info
rmat
ion
Wha
t can
a c
onte
xt d
o
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY amp CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh contentPersonalization andTargetinginteractive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries amp Hot Topics
PERSONALIZATION
3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
4 Pink Floyd Collection
Personalized Queries
Set the Controls for the Heart of the Sunhellip
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
ATampT Corp
2 My Football Fantasy Team
Gators Spurrier ready for big game
Techs Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zonehellip
Edwards explains reasons for leaving BYU morehellip
morehellip
morehellip
morehellip
1 Election 2000
2 Middle East Peace Conflict
3 Napster Controversy
Video Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Coles security
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II morehellip
HOT Topics
morehellip
morehellip
HP 80
Metadata Targeting
SemanticInteractive Targeting
Buy Al Pacino VideosBuy Russell Crowe VideosBuy Christopher Plummer VideosBuy Diane Venora VideosBuy Philip Baker Hall VideosBuy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
Realtime Feeds
Interests Preferences
Time-ShiftedContent Aggregator
Web sites and Pages
ContentDatabases Personalized
Content
Semantic EngineTM
Personalized Content
Content
Structured Hi-Quality
Semantic Metabase
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed WebBased registration and personalization at VoquettersquosEnterprise Customer site
Userrsquos ldquoWireless Home pagerdquo shows the categories for his interests There is an alert (new content) for his stock and sportscategories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide).
Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu.
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available.
The user clicks to see a listing of Analyst Calls on Cisco (next slide).
Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., an Analyst Call listing focuses on the source and the analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst.
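The semantic filtering described above can be sketched as picking, per content type, the highest-priority metadata fields that still fit the screen. This is only an illustrative sketch: the `FIELD_PRIORITY` table, the `DEFAULT_PRIORITY` fallback, the field names, and the character budget are assumptions, not Taalee's actual implementation.

```python
# Hypothetical display priority per content type. The slide only states that an
# Analyst Call listing focuses on the source and the analyst name or company.
FIELD_PRIORITY = {
    "Analyst Call": ["date", "source", "analyst"],
}
DEFAULT_PRIORITY = ["date", "title"]  # assumed fallback for other types


def render_line(item, max_chars):
    """Join the highest-priority fields that fit within max_chars."""
    parts = []
    for field in FIELD_PRIORITY.get(item["type"], DEFAULT_PRIORITY):
        value = item.get(field)
        if not value:
            continue
        candidate = " ".join(parts + [value])
        if len(candidate) > max_chars:
            break  # the next field would overflow the small screen
        parts.append(value)
    return " ".join(parts)


call = {
    "type": "Analyst Call",
    "date": "11/08",
    "source": "ON24",
    "analyst": "Payne",
    "rating": "Strong Buy",  # extra metadata, shown as an icon rather than text
}
print(render_line(call, 20))
print(render_line(call, 10))
```

A wider screen budget simply lets more of the prioritized fields through; fields outside the priority list (such as the rating) are never rendered as text.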
HP 87
iTV: Taalee's Extreme Personalization
[Diagram labels: Content Provider (DBS, DISH, Wink, AOL-TV); Content "Programs"; Meta-Data Tagged Content; Semantic Engine™; Structured Hi-Quality Semantic Metabase; Immediate Interests; Preferences; Personalized Content Capsules; Redirects and Programming]
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.
This screen is customizable with an interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO.
Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata.
The Conference Call itself can have embedded metadata to support personalization and interactivity.
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support (Sony)
Categorize
Catalog
Integrate
Collection, Processing
[Diagram labels: Network Content; Affiliate Feeds; Public Sources; Rich Data; Metabase]
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(133) - 120600 - ABC
(253) - 120600 - CBS
(516) - 120600 - ABC
(246) - 120600 - FOX
(133) - 120600 - NBC
(533) - 120600
(357) - 120600 - CBS
(427) - 120600 - ABC
(344) - 120600 - FOX
(724) - 120600 - CBS
(133) - 120600 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(133) - 120600 - ABC
(233) - 120600 - CBS
(312) - 120600 - NNS
(032) - 120600 - CBS
(133) - 120600 - CBS
Description
Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore
HP 92
Metadata's role in emerging iTV infrastructure
[Diagram labels: Video; MPEG Encoder; MPEG-2/4/7; Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; Create Scene Description Tree; Scene Description Tree; Node = AVO Object; Node "Cisco Systems"; Taalee Semantic Engine; Enhanced XML Description; Object Content Information (OCI); Metadata-rich Value-added Node; GREAT USER EXPERIENCE]
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers, and Broadcasters
Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: John Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta
HP 93
Intelligent Metadata Creation
Extractor Agents: content which does contain the words the user asked for
+ Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for
+ Semantic Associations: content the user did not think to ask for, but which he needs to know
Metadata for Intelligent Content
Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".
Understanding of the content is needed to create new metadata.
Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely.
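As a rough illustration of the idea (not Taalee's patented method), value-added metadata can be thought of as a lookup against a knowledge base of entity relationships. The `KNOWLEDGE_BASE` and `TEAM_LEAGUE` tables, the field names, and the `enrich` and `search` helpers below are all hypothetical.

```python
# Hypothetical knowledge base: PERSON -> the SPORT played and the TEAM played for.
KNOWLEDGE_BASE = {
    "Roger Clemens": {"plays_sport": "Baseball", "plays_for": "New York Yankees"},
    "Jamal Anderson": {"plays_sport": "Football", "plays_for": "Atlanta Falcons"},
}
# Hypothetical TEAM -> LEAGUE association.
TEAM_LEAGUE = {"Atlanta Falcons": "NFL"}


def _add(metadata, field, value):
    """Append value to a multi-valued field without creating duplicates."""
    values = metadata.setdefault(field, [])
    if value not in values:
        values.append(value)


def enrich(extracted):
    """Return a copy of the extracted metadata plus value-added fields."""
    enriched = {field: list(values) for field, values in extracted.items()}
    for player in extracted.get("players", []):
        facts = KNOWLEDGE_BASE.get(player)
        if facts is None:
            continue  # unknown entity: nothing can be added
        _add(enriched, "sports", facts["plays_sport"])
        _add(enriched, "teams", facts["plays_for"])
        league = TEAM_LEAGUE.get(facts["plays_for"])
        if league is not None:
            _add(enriched, "leagues", league)
    return enriched


def search(assets, field, value):
    """Attribute search over (title, metadata) pairs."""
    return [title for title, meta in assets if value in meta.get(field, [])]


# The story text mentions only the player, so only "players" is extracted.
extracted = {"players": ["Roger Clemens"]}
assets = [("Clemens story", enrich(extracted))]
print(search(assets, "teams", "New York Yankees"))
```

With the enriched metadata, a search for the team finds the story even though the team name never appears in the content.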
HP 96
Guided Demo for Value Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 - Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Producer Name: name of content provider that produced the asset
Posted Date: date of asset posting - extracted automatically
Player Names: names of players mentioned explicitly in the asset - extracted automatically
Team Name: name of team for which player plays - not mentioned explicitly in asset - value-added using Taalee's semantic relationships
League Name: name of league to which the player's team belongs - not mentioned explicitly in asset - value-added by Taalee's processing based on semantic associations
Sport: name of sport
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.
The asset is richly and fully described in the many ways the users choose to interact.
Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context, or relationships of keywords
For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata centric Content Management Architecture
Sources: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters)
Components: Extractor Agent 1; Extractor Agent 2; Extractor Agent 3; Semantic Engine; World Model; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication
Output: XCM-compliant metadata, XML, or other format
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults KnowledgeBase for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard
Story on Lucent
Story on Cisco
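The Cisco/Lucent flow on this slide can be sketched as a knowledge-base lookup followed by a filter over incoming feeds. The sketch below is only an assumption-laden illustration: the `COMPETES_WITH` table, the story dictionaries, and the `related_stories` helper are hypothetical names, not part of the Taalee Semantic Engine.

```python
# Hypothetical knowledge base of COMPANY competes-with COMPANY associations,
# using the two examples named in the slides.
COMPETES_WITH = {
    "Cisco": ["Lucent"],
    "Commerce One": ["Ariba"],
}


def competitors_of(companies):
    """Consult the knowledge base for the competition of each company."""
    found = set()
    for company in companies:
        found.update(COMPETES_WITH.get(company, []))
    return found


def related_stories(story, feed):
    """Pick feed stories about competitors of the companies in `story`."""
    rivals = competitors_of(story["companies"])
    return [s["title"] for s in feed if rivals & set(s["companies"])]


internal_story = {"title": "Story on Cisco", "companies": ["Cisco"]}
external_feed = [
    {"title": "Story on Lucent", "companies": ["Lucent"]},
    {"title": "Story on Ariba", "companies": ["Ariba"]},
]
# The Lucent story is surfaced as "semantically related" to the Cisco story.
print(related_stories(internal_story, external_feed))
```

The user asked only for Cisco; the association lookup is what brings in the Lucent story.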
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for the future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility.
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic rules which allow us to draw inferences from this data.
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]
A personal definition: Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods.
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-layered Architecture of Semantic Web
Logical Layer
Formal Semantics and Reasoning Support - OIL, DAML-O
Schema Layer
Definition of Vocabulary - RDF Schema
Data Layer
Simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL - Evolving towards Semantic Web
OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Test May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.
NuclearTest Causes Earthquake <= dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30
AND distance(NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude) < 10000
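The relationship above can be written as an executable predicate. The following is a minimal sketch, not the InfoQuilt implementation: the `Event` class, the function names, and the haversine great-circle distance are assumptions, and the rule is read as "earthquake within 30 days after the test and within 10000 miles of it".

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


@dataclass
class Event:
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two points, in miles."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """NuclearTest Causes Earthquake, per the rule on the slide.

    The lower bound (earthquake not before the test) is an assumption taken
    from the prose "some time after the nuclear test was conducted".
    """
    days = (quake.event_date - test.event_date).days
    near = distance_miles(test.latitude, test.longitude,
                          quake.latitude, quake.longitude) < max_miles
    return 0 <= days < max_days and near


# Made-up sample events, for illustration only.
test = Event(date(1970, 1, 1), 37.0, -116.0)
quake = Event(date(1970, 1, 10), 36.0, -117.0)
print(may_have_caused(test, quake))
```

Keeping the thresholds as parameters makes it easy to explore how sensitive the apparent relationship is to the chosen time window and radius.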
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year, starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example…
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale per year, starting from 1900, find the average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
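The grouping query above can be sketched as a simple binning and counting step. Everything in this sketch is illustrative: the half-open bin boundaries, the function names, and the sample records are assumptions (the sample data is made up, not USGS or NEIC data).

```python
from collections import Counter

# Magnitude ranges from the slide, treated as half-open [low, high) bins
# (an assumption; the slide does not say which bin a boundary value falls in).
BINS = [
    ("5.8-6", 5.8, 6.0),
    ("6-7", 6.0, 7.0),
    ("7-8", 7.0, 8.0),
    ("8-9", 8.0, 9.0),
    (">9", 9.0, float("inf")),
]


def bin_label(magnitude):
    """Return the label of the range containing magnitude, or None if below 5.8."""
    for label, low, high in BINS:
        if low <= magnitude < high:
            return label
    return None


def counts_per_year(quakes, start_year=1900):
    """Count earthquakes per (year, magnitude range) from start_year onward."""
    counts = Counter()
    for year, magnitude in quakes:
        label = bin_label(magnitude)
        if year >= start_year and label is not None:
            counts[(year, label)] += 1
    return counts


def average_per_year(counts, label, first_year, last_year):
    """Average yearly count for one magnitude range over an inclusive period."""
    total = sum(counts[(year, label)] for year in range(first_year, last_year + 1))
    return total / (last_year - first_year + 1)


# Made-up (year, magnitude) records, for illustration only.
sample = [(1950, 6.1), (1950, 6.5), (1951, 7.2), (1899, 8.0), (1951, 5.0)]
counts = counts_per_year(sample)
print(average_per_year(counts, "6-7", 1950, 1951))
```

Comparing these averages for periods before and after a chosen year is one way to reproduce the kind of observation made on the slide.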
Knowledge Discovery - Example…
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake
Demo
ResourcesReferences
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NewsML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
Semantic Web: www.semanticweb.org
VoiceXML: www.voicexml.org
MPEG-7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 57
Ontology
Standardize meaning, description, representation of involved attributes
Capture the semantics involved via domain characteristics
Allow knowledge sharing and reuse (Ontological Commitment)
HP 58
Ontology
Description includes:
Attributes
Domain Rules
Functional Dependencies
HP 59
An Ontology
Example: Interrelated ontologies
[Diagram labels, grouped by ontology]
LAND USE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL
LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE
WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; shredding, magnetic separation, screening, washing
NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI (linked by "causes" relationships)
HP 61
Large Vocabularies/Taxonomies/Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.
Confidential HP
Metadata-enabled Applications
HP 63
Metadata Usage: Impact on Search & Query Processing
traditional queries based on keywords
attribute-based queries
content-based queries
HP 64
Oingo.com
Oingo Ontology - ODP based - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning used to state intent; the basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely-related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged, and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:
• Context of the Content
• Semantic Metadata describing entities (i.e., Company, Industry, etc.) and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
"Normal" Content can only be "found" if the user enters a keyword that exists within it
Intelligent Content=+
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:
Context of the Content
Semantic Metadata describing entities (i.e., Company, Industry, etc.) and
Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …
HP 70
Metadata Usage Keyword Attribute and Content Based Access
The VisualHarness system at LSDISUGA
HP 71
Keyword Search vs. Attribute Search with Semantic Metadata
Virage Search on "football touchdown"
Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL httpcbssportsline
Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL httpcbssportsline
Metadata from Typical Cataloging of Football Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31, Pit 24
http://www.nfl.com
Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
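The contrast between the two kinds of search can be sketched as follows. This is an illustrative sketch only: the asset records reuse titles and fields from the slide, but the empty metadata on the interview assets, the field names, and both search functions are assumptions rather than the behavior of Virage or Taalee.

```python
ASSETS = [
    {"title": "Jimmy Smith Interview Part Seven",
     "description": "Jimmy Smith explains his philosophy on showboating",
     "metadata": {}},  # assumed: typical cataloging adds no domain attributes
    {"title": "Brian Griese Interview Part Four",
     "description": "Brian Griese talks about the first touchdown he ever threw",
     "metadata": {}},
    {"title": "Baltimore 31 Pit 24",
     "description": "Quandry Ismail and Tony Banks hook up for their third long touchdown",
     "metadata": {"league": "Professional",
                  "teams": ["Ravens", "Steelers"],
                  "players": ["Quandry Ismail", "Tony Banks"],
                  "event": "Touchdown"}},
]


def keyword_search(assets, word):
    """Syntactic search: the word must literally occur in the title or description."""
    word = word.lower()
    return [a["title"] for a in assets
            if word in (a["title"] + " " + a["description"]).lower()]


def attribute_search(assets, **criteria):
    """Attribute search: every (field, value) criterion must match the metadata."""
    def matches(meta):
        for field, wanted in criteria.items():
            value = meta.get(field, [])
            values = value if isinstance(value, list) else [value]
            if wanted not in values:
                return False
        return True
    return [a["title"] for a in assets if matches(a["metadata"])]


print(keyword_search(ASSETS, "touchdown"))
print(attribute_search(ASSETS, event="Touchdown", teams="Ravens"))
```

The keyword query also returns an interview that merely mentions a touchdown, while the attribute query returns only the asset whose event actually is a touchdown by the requested team.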
HP 72
Taalee's Semantic Search
Highly customizable, precise, and freshest A/V search
Context and Domain Specific Attributes
Uniform Metadata for Content from Multiple Sources; can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY amp CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content
Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries amp Hot Topics
PERSONALIZATION
3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
4 Pink Floyd Collection
Personalized Queries
Set the Controls for the Heart of the Sun…
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
2 My Football Fantasy Team
Gators Spurrier ready for big game
Techs Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone…
Edwards explains reasons for leaving BYU more…
more…
more…
more…
1 Election 2000
2 Middle East Peace Conflict
3 Napster Controversy
Video Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Coles security
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II more…
HOT Topics
more…
more…
HP 80
Metadata Targeting
SemanticInteractive Targeting
Buy Al Pacino VideosBuy Russell Crowe VideosBuy Christopher Plummer VideosBuy Diane Venora VideosBuy Philip Baker Hall VideosBuy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
Realtime Feeds
Interests Preferences
Time-ShiftedContent Aggregator
Web sites and Pages
ContentDatabases Personalized
Content
Semantic EngineTM
Personalized Content
Content
Structured Hi-Quality
Semantic Metabase
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed WebBased registration and personalization at VoquettersquosEnterprise Customer site
Userrsquos ldquoWireless Home pagerdquo shows the categories for his interests There is an alert (new content) for his stock and sportscategories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down userrsquos Personal Portfolio list The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and the knowledge of userrsquos portfolio Typically search can be done by typing one word or selecting from a dynamic personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions listen set alert on story add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
1108 ON24 Payne1107 ON24 HampQ 1106 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listingsorted by date Semantic filtering uses just the right metadata to meet screen and other constrains Eg Analyst Call focuses onthe source and analyst name or company The icon denote additional metadata such as ldquoStrong Buyrdquo by HampQ Analyst
HP 87
iTV Taaleersquos Extreme Personalization
Content Provider
(DBS DISH Wink AOL-TV)
Semantic EngineTM
Meta-DataTagged Content
ContentldquoProgramsrdquo
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-QualitySemantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that isused by personalization application to show only the stocksthat user is interested in
This screen is customizablewith interactivity featureusing metadata such as whetherthere is a new ConferenceCall video on CSCO
Part of the screen can beautomatically customized to show conference call specific informationndash including transcriptparticipation etc all of which arerelevant metadata
Conference Call itself can have embedded metadata to support personalization andinteractivity
HP 89
Metadata in Enterprise Apps
Filter Search ConsolidatePersonalize ArchiveLicensing Syndication
Production SupportProduction SupportSony
Categorize
Catalog
Integrate
CollectionCollection ProcessingProcessing
NetworkContent
AffiliateFeeds
Public Sources Rich Data
Metabase
HP 90
t
A leaking gasoline pipeline burst into flames Thursday killing more than 60 people near Nigerias commercial capital of Lagos Many of the dead were fisherman in wooden canoes engulfed in the inferno
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko facing the central business district of Lagos across a lagoon
At least 60 people died in this needless fire senior local official Karimu Alabi said
Fire crews from state-run Nigerian National Petroleum Corp (NNPC) which owns the pipeline were joined by other firemen from construction company Julius Berger in battling the blaze
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak ravaging a cluster of huts and log houses
At about the same time a second fire razed Makoko shantytown where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in thewooden huts for days Many victims of the Ebute-Oke fire were
Gore Demands That Recount Restart (940 PM)Gore Says Fla Cant Name Electors (450 PM)Bush Meets Colin Powell at Ranch (122 PM)
Market Tumbles on Earnings Warning (927 AM)Barak Outlines His Peace Plan (630 AM)
-- Breaking News for 11302000 --Customize Page Settings | Content | Layout | Color
Sixty Die In Nigeria BlastProduced by Euronews Posted Date 11302000 Event Election 2000 Location Tallahassee Florida USAPeople Al Gore George W Bush
Video
bull Value-add for production broadcast amp syndication
bull Taaleersquos semantic metadata enables powerful access to content used by Enterprisersquos customers
bull Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --Gore Demands That Recount Restart
Gore Says Fla Cant Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(133) ndash 120600 - ABC
(253) - 120600 - CBS
(516) - 120600 - ABC
(246) - 120600 - FOX
(133) - 120600 - NBC
(533) - 120600
(357) - 120600 - CBS
(427) - 120600 - ABC
(344) - 120600 - FOX
(724) - 120600 - CBS
(133) - 120600 - CBS
TALLAHASSEE, Florida (CNN) – Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(133) - 120600 - ABC
(233) - 120600 - CBS
(312) - 120600 - NNS
(032) - 120600 - CBS
(133) - 120600 - CBS
Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore
HP 92
Metadata's role in emerging iTV infrastructure
Diagram labels: MPEG-2/4/7, MPEG Encoder, Create Scene Description Tree, Scene Description Tree, Node = AVO Object, Taalee Semantic Engine, Enhanced XML Description, Node ("Cisco Systems"), Metadata-rich Value-added Node, Enhanced Digital Cable, Video, MPEG Decoder, Retrieve Scene Description Track, GREAT USER EXPERIENCE
Object Content Information (OCI):
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers and Broadcasters
HP 93
Intelligent Metadata Creation
Extractor Agents: Content which does contain the words the user asked for
+
Value-added Metadata: Content which does not contain the words the user asked for, but is about what he asked for
+
Semantic Associations: Content the user did not think to ask for, but which he needs to know
Metadata for Intelligent Content Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content:
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".
Understanding of the content is needed to create new metadata.
Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g. COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely.
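The Roger Clemens case can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Taalee's implementation: the knowledge base contents, the field names (players, teams, sports), the story title and the helper functions are all hypothetical.

```python
# Hypothetical knowledge base: entity name -> facts known about it.
KNOWLEDGE_BASE = {
    "Roger Clemens": {"type": "PERSON", "sport": "Baseball", "team": "New York Yankees"},
}


def add_value_added_metadata(extracted, knowledge_base):
    """Return a copy of the extracted metadata with team and sport added from the knowledge base."""
    enriched = {key: list(values) for key, values in extracted.items()}
    for player in extracted.get("players", []):
        facts = knowledge_base.get(player)
        if facts is None:
            continue  # unknown entity: nothing to add
        for field, source in (("teams", "team"), ("sports", "sport")):
            values = enriched.setdefault(field, [])
            if facts[source] not in values:
                values.append(facts[source])
    return enriched


def search(assets, field, value):
    """Attribute search: titles of assets whose metadata field contains the value."""
    return [a["title"] for a in assets if value in a["metadata"].get(field, [])]


# A story that mentions the player but never the team (title is made up).
story = {"title": "Clemens strikes out ten", "metadata": {"players": ["Roger Clemens"]}}

before = search([story], "teams", "New York Yankees")
story["metadata"] = add_value_added_metadata(story["metadata"], KNOWLEDGE_BASE)
after = search([story], "teams", "New York Yankees")
```

Before enrichment the search for the team finds nothing; afterwards the same search finds the story, because the team name was added from the knowledge base rather than from the story text.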
HP 96
Guided Demo for Value Added Metadata – Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com   Posted Date: 9/20/2000   League: NFL
Teams: Atlanta Falcons   Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player of the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata – Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN   Posted Date: 3/03/2001   League: National League
Teams: Los Angeles Dodgers   Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player of the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 – Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 – Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content – Value-Added Metadata
Rich Media Sports Asset
Producer Name: Name of content provider that produced the asset
Posted Date: Date of asset posting – Extracted automatically
Player Names: Name of players mentioned explicitly in the asset – Extracted automatically
Team Name: Name of team for which player plays – Not mentioned explicitly in asset – Value-added using Taalee's semantic relationships
League Name: Name of league to which the player's team belongs – Not mentioned explicitly in asset – Value-added by Taalee's processing based on semantic associations
Sport: Name of sport
Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.
The asset is richly and fully described in the many ways the users choose to interact.
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context or relationships of keywords
For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
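One way to picture typed associations is as (subject, relation, object) facts that can be followed from the entity the user asked about. The sketch below is a minimal illustration, assuming a hand-written fact list; the relation names, headlines and helper functions are hypothetical and do not describe Taalee's Semantic Content Model.

```python
from collections import defaultdict

# Hypothetical facts, modelled on the Commerce One example.
TRIPLES = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Commerce One", "participates_in", "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
    ("Ariba", "is_a", "COMPANY"),
]


def build_index(triples):
    """Index facts as subject -> relation -> set of objects."""
    index = defaultdict(lambda: defaultdict(set))
    for subject, relation, obj in triples:
        index[subject][relation].add(obj)
    return index


def related_entities(index, entity, relation):
    """Entities linked to `entity` by `relation` (empty list if none are known)."""
    return sorted(index.get(entity, {}).get(relation, ()))


def associated_stories(index, stories, entity, relation="competes_with"):
    """Headlines about entities associated with `entity`, not about `entity` itself."""
    targets = set(related_entities(index, entity, relation))
    return [s["headline"] for s in stories if targets & set(s["companies"])]


# Made-up catalog entries for illustration.
stories = [
    {"headline": "Commerce One quarterly results", "companies": ["Commerce One"]},
    {"headline": "Ariba announces new marketplace", "companies": ["Ariba"]},
]

index = build_index(TRIPLES)
competitor_news = associated_stories(index, stories, "Commerce One")
```

A keyword match on "Commerce One" would return only the first story; following the competes_with association also surfaces the Ariba story, which the user did not ask for by name.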
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata-centric Content Management Architecture
Sources: Internal Source 1 (Research), Internal Source 2, External feeds/Web (e.g. Reuters)
Components: Extractor Agent 1, Extractor Agent 2, Extractor Agent 3, Semantic Engine, World Model, Taalee Metabase, Semantic Application (ASP/Enterprise hosted), Third-party Content Mgmt and Syndication
XCM-compliant metadata, XML or other format
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults KnowledgeBase for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story – passed on to Dashboard
Story on Cisco
Story on Lucent
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example
Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search/Personalization etc.
A vision for the future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g. InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF – one proposal (cf. Ora Lassila)
Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]
A personal definition: Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF+XML
Agent-readable tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-layered Architecture of Semantic Web
Logical Layer: Formal Semantics and Reasoning Support – OIL, DAML-O
Schema Layer: Definition of Vocabulary – RDF Schema
Data Layer: Simple data model and syntax for metadata – RDF
OIL – as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL ndash Evolving towards Semantic Web
OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory etc.)
Nuclear Test May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
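The rule can be written as an ordinary predicate. The sketch below is one possible reading, not the InfoQuilt implementation: it assumes dateDifference is measured in days, that the earthquake must not precede the test, that distance is a great-circle (haversine) distance in miles, and that events are plain dictionaries with hypothetical keys.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; the unit (miles) is an assumption


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def could_have_caused(test, quake, max_days=30, max_miles=10000):
    """True if the quake happened within max_days after the test and within max_miles of it."""
    days_after = (quake["event_date"] - test["event_date"]).days
    if not 0 <= days_after < max_days:
        return False
    separation = distance_miles(
        test["latitude"], test["longitude"], quake["latitude"], quake["longitude"]
    )
    return separation < max_miles


# Made-up events for illustration only.
nuclear_test = {"event_date": date(1970, 1, 1), "latitude": 37.0, "longitude": -116.0}
quake_soon = {"event_date": date(1970, 1, 15), "latitude": 36.0, "longitude": -117.0}
quake_late = {"event_date": date(1970, 3, 1), "latitude": 36.0, "longitude": -117.0}
quake_before = {"event_date": date(1969, 12, 25), "latitude": 36.0, "longitude": -117.0}
```

Applying could_have_caused to every (test, earthquake) pair from the two sources would yield the candidate pairs that the relationship describes.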
Knowledge Discovery - Example
When was the first recorded nuclear test conducted? 1950
Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example...
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale per year starting from 1900, find average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
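The per-range averages can be computed with a simple bucketing pass. This is a sketch with made-up records; the boundary handling (lower bound inclusive, upper bound exclusive, 9.0 counted in the top range) and the function names are assumptions, since the slide does not say how range edges are treated.

```python
from collections import Counter

# (lower bound inclusive, upper bound exclusive, label); edge handling is assumed.
MAGNITUDE_RANGES = [
    (5.8, 6.0, "5.8-6"),
    (6.0, 7.0, "6-7"),
    (7.0, 8.0, "7-8"),
    (8.0, 9.0, "8-9"),
    (9.0, float("inf"), ">9"),
]


def magnitude_range(magnitude):
    """Label of the range containing the magnitude, or None if below 5.8."""
    for low, high, label in MAGNITUDE_RANGES:
        if low <= magnitude < high:
            return label
    return None


def average_per_year(quakes, first_year, last_year):
    """Average yearly count per magnitude range for (year, magnitude) records."""
    counts = Counter()
    for year, magnitude in quakes:
        label = magnitude_range(magnitude)
        if label is not None and first_year <= year <= last_year:
            counts[label] += 1
    n_years = last_year - first_year + 1
    return {label: counts[label] / n_years for _, _, label in MAGNITUDE_RANGES}


# Made-up (year, magnitude) records, not real catalog data.
quakes = [(1900, 5.9), (1900, 6.5), (1901, 6.1), (1901, 7.2), (1899, 8.0), (1901, 5.0)]
averages = average_per_year(quakes, 1900, 1901)
```

Running the same function over the periods before and after a chosen year would give the comparison the slide draws between magnitude ranges.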
Knowledge Discovery - Example...
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/9910-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 58
Ontology
Description includes: Attributes, Domain Rules, Functional Dependencies
HP 59
An Ontology
Example: Interrelated ontologies
LANDUSE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL
LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE
WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE
shredding, magnetic separation, screening, washing
NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI
causes (edge label, shown four times in the diagram)
HP 61
Large Vocabularies, Taxonomies, Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.
Confidential HP
Metadata-enabled Applications
HP 63
Metadata Usage: Impact on Search & Query Processing
traditional queries based on keywords
attribute based queries
content-based queries
HP 64
Oingo.com
Oingo Ontology – ODP based (?): the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely-related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:
• Context of the Content
• Semantic Metadata describing entities (e.g. Company, Industry, etc.) and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
"Normal" Content can only be "found" if the user enters a keyword that exists within it
Intelligent Content=+
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:
Context of the Content
Semantic Metadata describing entities (e.g. Company, Industry, etc.) and
Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g. Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, ...
HP 70
Metadata Usage Keyword Attribute and Content Based Access
The VisualHarness system at LSDISUGA
HP 71
Keyword Search vs. Attribute Search with Semantic Metadata
Virage Search on "football touchdown"
Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL httpcbssportsline
Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL httpcbssportsline
Metadata from Typical Cataloging of Football Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31 Pit 24
http://www.nfl.com
Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
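The difference between the two kinds of search can be sketched with a single catalog record. This is an illustration only: the record layout, the field names and both search functions are hypothetical, and the values are taken from the Rich Media Reference Page above.

```python
def keyword_search(assets, query):
    """Titles of assets whose free-text description contains every query word."""
    words = query.lower().split()
    return [a["title"] for a in assets if all(w in a["description"].lower() for w in words)]


def attribute_search(assets, **criteria):
    """Titles of assets whose structured metadata matches every field=value criterion."""
    def matches(metadata):
        for field, wanted in criteria.items():
            value = metadata.get(field)
            values = value if isinstance(value, list) else [value]
            if wanted not in values:
                return False
        return True

    return [a["title"] for a in assets if matches(a["metadata"])]


assets = [
    {
        "title": "Baltimore 31 Pit 24",
        "description": (
            "Quandry Ismail and Tony Banks hook up for their third long touchdown, "
            "this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter"
        ),
        "metadata": {
            "league": "Professional",
            "teams": ["Ravens", "Steelers"],
            "players": ["Quandry Ismail", "Tony Banks"],
            "event": "Touchdown",
            "produced_by": "NFL.com",
        },
    }
]

by_keyword = keyword_search(assets, "Steelers")
by_attribute = attribute_search(assets, teams="Steelers", event="Touchdown")
```

The description never mentions the Steelers, so the keyword search misses the clip, while the attribute search finds it through the Teams field.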
HP 72
Taalee's Semantic Search
Highly customizable, precise and freshest A/V search
Context and Domain Specific Attributes
Uniform Metadata for Content from Multiple Sources. Can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY & CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content
Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries amp Hot Topics
PERSONALIZATION
3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
4 Pink Floyd Collection
Personalized Queries
Set the Controls for the Heart of the Sun...
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
2 My Football Fantasy Team
Gators Spurrier ready for big game
Techs Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone...
Edwards explains reasons for leaving BYU
more...
more...
more...
more...
1 Election 2000
2 Middle East Peace Conflict
3 Napster Controversy
Video Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Coles security
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II
more...
HOT Topics
more...
more...
HP 80
Metadata Targeting
Semantic/Interactive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
Realtime Feeds
Interests, Preferences
Time-Shifted Content Aggregator
Web sites and Pages
Content Databases
Personalized Content
Semantic Engine™
Personalized Content
Content
Structured Hi-Quality Semantic Metabase
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site
User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and the knowledge of user's portfolio. Typically search can be done by typing one word or selecting from a dynamic personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions listen set alert on story add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g. Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata such as "Strong Buy" by H&Q Analyst
HP 87
iTV: Taalee's Extreme Personalization
Content Provider (DBS, DISH, Wink, AOL-TV)
Semantic Engine™
Meta-Data Tagged Content
Content "Programs"
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-Quality Semantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in
This screen is customizable with an interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO
Part of the screen can be automatically customized to show conference call specific information – including transcript, participation, etc., all of which are relevant metadata
The Conference Call itself can have embedded metadata to support personalization and interactivity
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support
Sony
Categorize
Catalog
Integrate
Collection
Processing
Network Content
Affiliate Feeds
Public Sources
Rich Data Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in thewooden huts for days Many victims of the Ebute-Oke fire were
Gore Demands That Recount Restart (940 PM)Gore Says Fla Cant Name Electors (450 PM)Bush Meets Colin Powell at Ranch (122 PM)
Market Tumbles on Earnings Warning (927 AM)Barak Outlines His Peace Plan (630 AM)
-- Breaking News for 11302000 --Customize Page Settings | Content | Layout | Color
Sixty Die In Nigeria BlastProduced by Euronews Posted Date 11302000 Event Election 2000 Location Tallahassee Florida USAPeople Al Gore George W Bush
Video
bull Value-add for production broadcast amp syndication
bull Taaleersquos semantic metadata enables powerful access to content used by Enterprisersquos customers
bull Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --Gore Demands That Recount Restart
Gore Says Fla Cant Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(133) ndash 120600 - ABC
(253) - 120600 - CBS
(516) - 120600 - ABC
(246) - 120600 - FOX
(133) - 120600 - NBC
(533) - 120600
(357) - 120600 - CBS
(427) - 120600 - ABC
(344) - 120600 - FOX
(724) - 120600 - CBS
(133) - 120600 - CBS
TALLAHASSEE Florida (CNN) ndashThough the two presidential candidates have until noon Wednesday to file briefs in Al Gores appeal to the Florida Supreme Court the outcome of two trials set on the same day in Leon County Florida may offer Gore his best hope for the presidency Democrats in Seminole County are seeking to have 15000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5000 votes statewide Lawyers for the plaintiff Harry Jacobs claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2126 incomplete absentee ballot applications sent in by GOP voters while refusing to allow Democratic workers to do the same thing for Democratic voters
The GOP says that suit and one similar to it from Martin County demonstrates Democratic Party politics at its most desperate Gore is not a party to either of those lawsuits On Tuesday the judge in the
(133) - 120600 - ABC
(233) - 120600 - CBS
(312) - 120600 - NNS
(032) - 120600 - CBS
(133) - 120600 - CBS
DescriptionProduced by CNNPosted Date 12072000 Reporter David Lewis Event Election 2000 Location Tallahassee Florida USAPeople Al Gore
HP 92
Retrieve Scene Description Track
Enhanced Digital Cable
Video
MPEGDecoder
Node = AVO Object
Create Scene Description Tree
GREATUSER
EXPERIENCE
Metadatarsquos role in emerging iTV infrastructure
MPEG-247MPEG
Encoder
SceneDescriptionTree
License metadata decoder and semantic applications to
device makers
Channel salesthrough Video Server Vendors
Video App Servers and Broadcasters
Enhanced XML
Description
ldquoCisco Systemsrdquo
Node
TaaleeSemanticEngine
ldquoCisco Systemsrdquo
Produced by Fox Sports Creation Date 12052000 League NFLTeams Seattle Seahawks
Atlanta Falcons Players John KitnaCoaches Mike Holmgren
Dan ReevesLocation Atlanta
Object Content Information (OCI)
Metadata-richValue-added Node
HP 93
Intelligent Metadata Creation
Content which doescontain the wordsthe user asked for
Extractor Agents
Content which does not contain the words
the user asked for but is about what he asked
for
Value-added Metadata
Content the user did not think to ask for but
which he needs to know
Semantic Associations
+ +
Metadata for Intelligent ContentMetadata for Intelligent Content
Usage
HP 94
Intelligent Contentvia
Value-Added Metadata
HP 95
Value-added MetadataTraditional methods rely solely on (syntactic) indexing of keywords to enable
users to access content
bull If a keyword is not in the content it cannot be found
bull The burden is on the user to think of and ask for the ldquorightrdquo keyword
For example If a story is about ldquoRoger Clemensrdquo but does not contain the
words ldquoNew York Yankeesrdquo that story cannot and will not be found if the user
searches for ldquoNew York Yankeesrdquo or ldquoYankeesrdquo
Understanding of the content is needed to create new metadata
Taalee understands Roger Clemens is a PERSON who Plays a SPORT called
Baseball for a TEAM from New York called the Yankees Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY)
to add missing metadata to describe content more completely
HP 96
Guided Demo for Value Added Metadata ndashExample one
bull Go to httpwwwmediaanywherecomFootballhtml amp search for Player = Jamal Anderson
bull Click on the first result (titled ldquoWeek 3 Top10 Anderson TD Runrdquo) and view the metadata
on the following RMR page
bull Here is what you see
Produced by NFLcom Posted Date 9202000 League NFL
Teams Atlanta Falcons Players Jamal Anderson
bull Now click on the button to play the asset (button marked ldquoREALrdquo)
bull View the source HTML page that has the original story and locate this story with the
heading ldquoWeek 3 top 10 Anderson TD runrdquo
bull Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user
searching for Atlanta Falcons will find this story on Jamal Anderson who is a player of
Atlanta Falcons team
bull Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata ndashExample Two
bull Go to httpwwwmediaanywherecomBaseballhtml amp search for Player = Gary Sheffield
bull Click on the first result (titled ldquoI want outrdquo) amp view the metadata on the following RMR page
bull Here is what you see
Produced by ESPN Posted Date 3032001 League National League
Teams Los Angeles Dodgers Players Gary Sheffield
bull Now click on the button to play the asset (button marked ldquoREALrdquo)
bull View the source HTML page that has the original story and locate this story with the
heading ldquoI want outrdquo
bull Verify that Team=Los Angeles Dodgers or League=National League was not present in
the source content
bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user
searching for Los Angeles Dodgers will find this story on Gary Sheffield who is a player of
Los Angeles Dodgers team
N t th t th h i d di t i ill t b bl t d thi
HP 98
Example 1 ndash Snapshots (ldquoJamal Andersonrdquo)
Search for lsquoJamal Andersonrsquo in lsquoFootballrsquo
Click on first result for Jamal Anderson
View metadata Note that Team name and League name are also included
in the metadata
View the original source HTML page Verify that
the source page contains no mention of Team nameand League name They
were Taaleersquos value-additions to the metadata to facilitate easier search
HP 99
Example 2 ndash Snapshots (ldquoGary Sheffieldrdquo)
Click on first result for Gary Sheffield
View metadata Note that Team name and League name are also included
in the metadata
Search for lsquoGary Sheffieldrsquo in lsquoBaseballrsquo
View the original source HTML page Verify that
the source page contains no mention of Team nameand League name They
were Taaleersquos value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Posted Date: date of asset posting - extracted automatically
Producer Name: name of content provider that produced the asset
Player Names: names of players mentioned explicitly in the asset - extracted automatically
Sport: name of sport
Team Name: name of team for which player plays - not mentioned explicitly in asset - value-added using Taalee's semantic relationships
League Name: name of league to which the player's team belongs - not mentioned explicitly in asset - value-added by Taalee's processing based on semantic associations
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships
The asset is richly, fully described in the many ways the users choose to interact
Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context or relationships of keywords
For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
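The Commerce One example can be sketched as a small store of typed relationships. This is only an illustration of the idea, not Taalee's Semantic Content Model: the triple layout, the relation names (`is_a`, `participates_in`, `competes_with`) and the helper functions are assumptions made for this sketch, and competition is assumed to be symmetric.

```python
from collections import defaultdict

# Illustrative (subject, relation, object) triples built from the slide's example.
# The store itself is a hypothetical stand-in for a semantic knowledge base.
TRIPLES = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Ariba", "is_a", "COMPANY"),
    ("Commerce One", "participates_in", "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]


def build_index(triples):
    """Index triples as entity -> relation -> set of related entities."""
    index = defaultdict(lambda: defaultdict(set))
    for subject, relation, obj in triples:
        index[subject][relation].add(obj)
        if relation == "competes_with":
            # Assumption: if A competes with B, then B competes with A.
            index[obj][relation].add(subject)
    return index


def associated_entities(index, entity, relation):
    """Return the entities linked to `entity` by `relation`, sorted by name."""
    return sorted(index.get(entity, {}).get(relation, set()))


index = build_index(TRIPLES)
competitors = associated_entities(index, "Commerce One", "competes_with")
```

A search for "Commerce One" can then also offer links to news about every entity in `competitors`, which a keyword index alone has no way to produce.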
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (To view news on Ariba click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata centric Content Management Architecture
Sources: Internal Source 1 (Research), Internal Source 2, External feeds/Web (e.g. Reuters)
Components: Extractor Agents 1-3, Semantic Engine, World Model, Taalee Metabase, Semantic Application (ASP/Enterprise hosted), Third-party Content Mgmt and Syndication
Metadata format: XCM-compliant metadata, XML or other format
Stories shown: Story on Cisco, Story on Lucent
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults KnowledgeBase for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
SEC/EPA Regulations: Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search/Personalization etc.
A vision for the future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g. InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful" [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture" [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities" [Berners-Lee]
A personal definition: Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent readable Tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three layered Architecture Of Semantic Web
Logical Layer: Formal Semantics and Reasoning Support - OIL, DAML-O
Schema Layer: Definition of Vocabulary - RDF Schema
Data Layer: Simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
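The herbivore definition above reads as "an animal that is NOT a carnivore". A minimal sketch of how such a class expression could be evaluated follows. The tuple encoding and the `satisfies` helper are assumptions made for illustration and are not part of OIL; the check is also closed-world (anything not asserted is treated as false), which is simpler than the open-world reasoning a description logic engine performs.

```python
# Class expressions as nested tuples: a class name, ("NOT", expr) or ("AND", expr, ...).
# This encoding is hypothetical; OIL itself is serialized as RDF.
HERBIVORE = ("AND", "animal", ("NOT", "carnivore"))


def satisfies(asserted_classes, expr):
    """Closed-world check: does an individual with these classes satisfy expr?"""
    if isinstance(expr, str):
        return expr in asserted_classes
    operator, *operands = expr
    if operator == "NOT":
        return not satisfies(asserted_classes, operands[0])
    if operator == "AND":
        return all(satisfies(asserted_classes, operand) for operand in operands)
    raise ValueError(f"unknown operator: {operator}")


cow_is_herbivore = satisfies({"animal"}, HERBIVORE)
lion_is_herbivore = satisfies({"animal", "carnivore"}, HERBIVORE)
```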
DAML and OIL - Evolving towards Semantic Web
OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Test May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
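The rule above can be sketched as an executable predicate. The record layout (dictionaries with `eventDate`, `latitude`, `longitude`), the use of days and miles as units, the haversine formula for `distance`, and the sample records are all assumptions made for this sketch; the slides do not say how InfoQuilt implements these functions.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """True if the quake follows the test within max_days and lies within max_miles."""
    days_after = (quake["eventDate"] - test["eventDate"]).days
    if not 0 <= days_after < max_days:
        return False
    separation = distance_miles(test["latitude"], test["longitude"],
                                quake["latitude"], quake["longitude"])
    return separation < max_miles


# Made-up records, for illustration only.
nuclear_test = {"eventDate": date(2000, 1, 1), "latitude": 0.0, "longitude": 0.0}
earthquake = {"eventDate": date(2000, 1, 15), "latitude": 10.0, "longitude": 10.0}
related = may_have_caused(nuclear_test, earthquake)
```

Note that with a 10000 mile radius the distance condition excludes only points near the antipode, so in practice the time window does most of the filtering.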
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example...
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9 and >9 on the Richter scale, per year starting from 1900, find average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
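The grouping query above can be sketched as follows. The input format (a list of `(year, magnitude)` pairs), the half-open bin boundaries (so the top bin is 9 and above rather than strictly above 9), and the sample values are assumptions made for illustration; no real earthquake data is used.

```python
from collections import Counter

# Magnitude ranges from the slide, as half-open intervals [low, high).
MAGNITUDE_BINS = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, float("inf"))]


def bin_label(magnitude):
    """Return the label of the range containing magnitude, or None if below 5.8."""
    for low, high in MAGNITUDE_BINS:
        if low <= magnitude < high:
            return f">={low}" if high == float("inf") else f"{low}-{high}"
    return None


def average_per_year(quakes, first_year, last_year):
    """Average yearly earthquake count per magnitude range over the given years."""
    counts = Counter()
    for year, magnitude in quakes:
        label = bin_label(magnitude)
        if label is not None and first_year <= year <= last_year:
            counts[label] += 1
    n_years = last_year - first_year + 1
    return {label: count / n_years for label, count in counts.items()}


# Made-up sample: (year, magnitude) pairs.
sample = [(1900, 5.9), (1900, 6.5), (1901, 6.2), (1901, 9.1), (1901, 4.0)]
averages = average_per_year(sample, 1900, 1901)
```

Running the same function over the years before and after 1945 would give the two sets of averages that the slide compares.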
Knowledge Discovery - Example...
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 59
An Ontology
Example: Interrelated ontologies
[Figure: interrelated ontologies; node labels recovered from the diagram follow]
LAND USE: COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL
LAND (SITE): CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING, LANDFILL SITE
WASTE DISPOSAL: RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE; shredding, magnetic separation, screening, washing
NATURAL DISASTER: EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI, with "causes" relationships between them
HP 61
Large Vocabularies, Taxonomies/Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms
Confidential HP
Metadata enabled Applications
HP 63
Metadata Usage: Impact on Search & Query processing
traditional queries based on keywords
attribute based queries
content-based queries
HP 64
Oingo.com
Oingo Ontology - ODP based(?); the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely-related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:
• Context of the Content
• Semantic Metadata describing entities (e.g. Company, Industry, etc.) and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
"Normal" Content can only be "found" if the user enters a keyword that exists within it
Intelligent Content=+
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:
Context of the Content
Semantic Metadata describing entities (e.g. Company, Industry, etc.) and
Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g. Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, ...
HP 70
Metadata Usage: Keyword, Attribute and Content Based Access
The VisualHarness system at LSDIS/UGA
HP 71
Keyword Search vs Attribute Search with Semantic metadata
Metadata from Typical Cataloging of Football Assets
Virage Search on "football touchdown":
Jimmy Smith Interview Part Seven - Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline
Brian Griese Interview Part Four - Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline
Taalee Metadata on Football Assets
Rich Media Reference Page: Baltimore 31 Pit 24 (http://www.nfl.com)
Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter
League: Professional
Teams: Ravens, Steelers
Score: Bal 31 Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
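The difference between the two kinds of cataloging can be sketched with a toy catalog. The asset records, the shortened description text and both search functions are assumptions made for illustration; the field names follow the metadata labels shown on the slide.

```python
# Two illustrative assets: one with free text only, one with semantic metadata.
ASSETS = [
    {
        "title": "Jimmy Smith Interview Part Seven",
        "description": "Jimmy Smith explains his philosophy on showboating",
        "metadata": {},  # typical cataloging: free text only
    },
    {
        "title": "Baltimore 31 Pit 24",
        "description": "Third long touchdown, a 76-yarder, extends the lead to 31-24",
        "metadata": {
            "League": "Professional",
            "Teams": ["Ravens", "Steelers"],
            "Event": "Touchdown",
        },
    },
]


def keyword_search(assets, keyword):
    """Syntactic search: the keyword must literally occur in the text."""
    needle = keyword.lower()
    return [asset["title"] for asset in assets
            if needle in (asset["title"] + " " + asset["description"]).lower()]


def attribute_search(assets, **criteria):
    """Attribute search: every field=value criterion must match the metadata."""
    def field_matches(metadata, field, wanted):
        value = metadata.get(field)
        if isinstance(value, list):
            return wanted in value
        return value == wanted

    return [asset["title"] for asset in assets
            if all(field_matches(asset["metadata"], field, wanted)
                   for field, wanted in criteria.items())]


by_keyword = keyword_search(ASSETS, "Steelers")
by_attribute = attribute_search(ASSETS, Teams="Steelers", Event="Touchdown")
```

The keyword search misses the game clip because the word "Steelers" never occurs in its text, while the attribute search finds it through the Teams field.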
HP 72
Taalee's Semantic Search
Highly customizable, precise and freshest A/V search
Context and Domain Specific Attributes. Uniform Metadata for Content from Multiple Sources. Can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY & CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content
Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries & Hot Topics
PERSONALIZATION
Personalized Queries
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
more...
2 My Football Fantasy Team
Gators' Spurrier ready for big game
Tech's Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone...
Edwards explains reasons for leaving BYU
more...
3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
more...
4 Pink Floyd Collection
Set the Controls for the Heart of the Sun...
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
more...
HOT Topics
1 Election 2000
Video: Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
more...
2 Middle East Peace Conflict
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Cole's security
more...
3 Napster Controversy
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II
more...
HP 80
Metadata Targeting
Semantic/Interactive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web: Extreme Personalization
Inputs: Realtime Feeds, Time-Shifted Content Aggregator, Web sites and Pages, Content Databases
User profile: Interests, Preferences
Semantic Engine(TM) with Structured Hi-Quality Semantic Metabase
Output: Personalized Content
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site
User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and the knowledge of user's portfolio. Typically search can be done by typing one word or selecting from a dynamic personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata such as "Strong Buy" by H&Q Analyst
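Semantic filtering for a small screen can be sketched as choosing which metadata fields to display for each content type. The priority table, the field names and the character budget are assumptions made for illustration; the slide only states that an Analyst Call listing focuses on the source and the analyst name or company.

```python
# Hypothetical per-type display priorities, most important field first.
FIELD_PRIORITY = {
    "Analyst Call": ["date", "source", "analyst"],
    "Earnings": ["date", "company", "period"],
}
DEFAULT_PRIORITY = ["date", "title"]


def filter_for_screen(item, max_chars):
    """Build a one-line listing from the highest-priority fields that fit."""
    fields = FIELD_PRIORITY.get(item["type"], DEFAULT_PRIORITY)
    parts = []
    for field in fields:
        value = item.get(field)
        if value is None:
            continue
        candidate = " ".join(parts + [value])
        if len(candidate) > max_chars:
            break  # the screen is full: drop this and all lower-priority fields
        parts.append(value)
    return " ".join(parts)


call = {"type": "Analyst Call", "date": "11/07", "source": "ON24",
        "analyst": "H&Q", "rating": "Strong Buy"}
wide_listing = filter_for_screen(call, 16)
narrow_listing = filter_for_screen(call, 10)
```

Fields that do not fit, such as the rating here, can still be surfaced through an icon, as the slide describes.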
HP 87
iTV: Taalee's Extreme Personalization
Content Provider (DBS, DISH, Wink, AOL-TV): Content "Programs"
User profile: Immediate Interests, Preferences
Semantic Engine(TM) with Structured Hi-Quality Semantic Metabase: Meta-Data Tagged Content
Output: Personalized Content Capsules, Redirects and Programming
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by personalization application to show only the stocks that user is interested in
This screen is customizable with interactivity feature using metadata such as whether there is a new Conference Call video on CSCO
Part of the screen can be automatically customized to show conference call specific information - including transcript, participation, etc., all of which are relevant metadata
Conference Call itself can have embedded metadata to support personalization and interactivity
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support (Sony)
Collection, Processing: Categorize, Catalog, Integrate
Sources: Network Content, Affiliate Feeds, Public Sources
Rich Data Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were
-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
Sixty Die In Nigeria Blast
Produced by: Euronews   Posted Date: 11/30/2000   Event: Election 2000   Location: Tallahassee, Florida, USA   People: Al Gore, George W Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN   Posted Date: 12/07/2000   Reporter: David Lewis   Event: Election 2000   Location: Tallahassee, Florida, USA   People: Al Gore
HP 92
Metadata's role in emerging iTV infrastructure
Enhanced Digital Cable: Video, MPEG-2/4/7, MPEG Encoder, MPEG Decoder
Create Scene Description Tree; Retrieve Scene Description Track; Scene Description Tree; Node = AVO Object
Taalee Semantic Engine: "Cisco Systems" Node; Enhanced XML Description; Object Content Information (OCI); Metadata-rich Value-added Node
Produced by: Fox Sports   Creation Date: 12/05/2000   League: NFL   Teams: Seattle Seahawks, Atlanta Falcons   Players: John Kitna   Coaches: Mike Holmgren, Dan Reeves   Location: Atlanta
GREAT USER EXPERIENCE
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers and Broadcasters
HP 93
Metadata for Intelligent Content
Intelligent Metadata Creation
Extractor Agents: content which does contain the words the user asked for
+ Value-added Metadata: content which does not contain the words the user asked for but is about what he asked for
+ Semantic Associations: content the user did not think to ask for but which he needs to know
Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"
Understanding of the content is needed to create new metadata
Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
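The Roger Clemens example can be sketched as a lookup against a small knowledge base. Everything in the code is an assumption made for illustration (the dictionary layout, the `plays_for` relation name, the helper functions and the sample story with its ESPN producer); it shows the idea of value-added metadata, not Taalee's patented extraction technology.

```python
# Hypothetical knowledge base of semantic associations.
PLAYER_FACTS = {
    "Roger Clemens": {"plays_for": "New York Yankees", "sport": "Baseball"},
    "Jamal Anderson": {"plays_for": "Atlanta Falcons", "sport": "Football"},
}
TEAM_LEAGUE = {"New York Yankees": "American League", "Atlanta Falcons": "NFL"}


def add_value_added_metadata(extracted):
    """Return a copy of the extracted metadata enriched with team, league and sport."""
    enriched = {field: list(values) for field, values in extracted.items()}
    for player in extracted.get("Players", []):
        facts = PLAYER_FACTS.get(player)
        if facts is None:
            continue  # unknown entity: nothing to add
        team = facts["plays_for"]
        additions = (("Teams", team),
                     ("League", TEAM_LEAGUE.get(team)),
                     ("Sport", facts["sport"]))
        for field, value in additions:
            if value is not None and value not in enriched.setdefault(field, []):
                enriched[field].append(value)
    return enriched


def matches(metadata, query):
    """True if the query equals any value of any metadata field."""
    return any(query in values for values in metadata.values())


# Metadata extracted from the asset itself: the team is never mentioned.
story = {"Players": ["Roger Clemens"], "Produced by": ["ESPN"]}
enriched_story = add_value_added_metadata(story)
```

A search for "New York Yankees" fails against `story` but succeeds against `enriched_story`, because the team was added from the knowledge base rather than from the content.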
HP 96
Guided Demo for Value Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com   Posted Date: 9/20/2000   League: NFL
Teams: Atlanta Falcons   Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player of the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata ndashExample Two
bull Go to httpwwwmediaanywherecomBaseballhtml amp search for Player = Gary Sheffield
bull Click on the first result (titled ldquoI want outrdquo) amp view the metadata on the following RMR page
bull Here is what you see
Produced by ESPN Posted Date 3032001 League National League
Teams Los Angeles Dodgers Players Gary Sheffield
bull Now click on the button to play the asset (button marked ldquoREALrdquo)
bull View the source HTML page that has the original story and locate this story with the
heading ldquoI want outrdquo
bull Verify that Team=Los Angeles Dodgers or League=National League was not present in
the source content
bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user
searching for Los Angeles Dodgers will find this story on Gary Sheffield who is a player of
Los Angeles Dodgers team
N t th t th h i d di t i ill t b bl t d thi
HP 98
Example 1 ndash Snapshots (ldquoJamal Andersonrdquo)
Search for lsquoJamal Andersonrsquo in lsquoFootballrsquo
Click on first result for Jamal Anderson
View metadata Note that Team name and League name are also included
in the metadata
View the original source HTML page Verify that
the source page contains no mention of Team nameand League name They
were Taaleersquos value-additions to the metadata to facilitate easier search
HP 99
Example 2 ndash Snapshots (ldquoGary Sheffieldrdquo)
Click on first result for Gary Sheffield
View metadata Note that Team name and League name are also included
in the metadata
Search for lsquoGary Sheffieldrsquo in lsquoBaseballrsquo
View the original source HTML page Verify that
the source page contains no mention of Team nameand League name They
were Taaleersquos value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content ndash Value-Added Metadata
Posted Date
Posted Date
Date of asset posting ndashExtracted automatically
League Name
Name of league to which the payerrsquos team belongs ndash Not mentioned explicitly in asset ndash Value-added by Taaleersquos processing based on semantic associations
Name of team for which player plays ndash Not mentioned explicitly in assetndash Value-added using Taaleersquos semantic relationships
Team NameTeam Name
Producer Name
Producer Name
Rich MediaSports AssetRich Media
Sports Asset
Name of content provider that produced the asset
Some Metadata are obtained explicitly from the asset Others (not present in the asset) are added
by Taalee using its semantic relationships
The asset is richly fully described in the many ways the users chose to interact
Player NamesPlayer Names
SportSportName of
sport
LegendX Y meansTaalee uses X to add Yas value-added metadatato the asset
Name of players mentioned explicitly in the asset ndash Extracted automatically
HP 101
Intelligent Contentvia
Semantic Associations
HP 102
Semantic Associations
bull Traditional search engines rely solely on (syntactic) keywords to find content
bull They do not understand the meaning context or relationships of keywords
For example a search engine may see that the word ldquoCommerce Onerdquo occurs
but it does not know that Commerce One is a COMPANY which Participates in
the Corporate Professional amp Financial Software INDUSTRY and COMPETESWITH Ariba
As a result search engines cannot go beyond returning a list (or directory view)
of what the user has asked for Their ability to provide associated information is
extremely limited static and difficult to scale Taaleersquos Semantic Content Model
goes beyond indexing keywords and classifying assets toUnderstand and Associate all content it catalogs
HP 103
Example (test on httpdirectorymediaanywherecom)
Search for company lsquoCommerce Onersquo
Links to news on companies that compete against
Commerce One
Links to news on companies Commerce One competes
against(To view news on Ariba click
on the link for Ariba)
Crucial news on Commerce Onersquos
competitors (Ariba) can be accessed easily and
automatically
HP 104
Internal Source 1Research
Internal Source 2
External feedsWeb(eg Reuters)
1
2
3
4
Cisco story from PW Source 1passed on to addsemanticassociations
ConsultsKnowledgeBasefor Ciscorsquoscompetition
Returns resultLucent is a competitor of Cisco
Lucent story from external
feeds picked for publishing as ldquosemantically
relatedrdquo to Ciscostory ndash passed
on to Dashboard
Story onLucent
Story onCisco
XCM-compliant metadata XML or other format
SemanticApplication
ASPEnterprise hosted
Extractor Agent 1
Extractor Agent 2
Extractor Agent 3
Metadata centricContent Management Architecture
SemanticEngine
World Model
TaaleeMetabase
Third-partyContent Mgmt
AndSyndication
HP 105
Semantic Associationssupported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANYCOMPANYRelated Stock News
Related Stock News
IndustryNews
IndustryNews
CompetitionCompetitionCOMPANIES inINDUSTRY with Competing PRODUCTSCOMPANIES in Same or
Related INDUSTRY
SECEPAEPA
RegulationsRegulationsImpacting INDUSTRY or Filed By COMPANY
Technology Products
Technology ProductsImportant to INDUSTRY or COMPANY
HP 106
Semantic Web Application ExampleFinancial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic SearchPersonalization etc
A vision for future
Semantic Web Complex Relationships and Knowledge Discovery
Eg InfoQuilt project at LSDIS Lab Univ of Georgia
HP 108
Beyond RDF ndash one proposal (cf Ora Lassila)
Structural modeling obviously not enoughwe need a ldquologic layerrdquo on top of RDFsome type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful particularly if we have logicrules which allow us to draw inference from this data
RDF + DL = ldquoFrame System for WWWrdquo
Source wwwontoknowledgeorgoil
HP 109
Semantic Web - next step in Web evolution
ldquoA Web in which machine reasoning will be ubiquitous and devastatingly powerfulrdquo [Berners-Lee]
ldquoA place where the whim of a human being and the reasoning of a machine coexist in an ideal powerful mixturerdquo [Berners-Lee]
ldquoA semantic Web would permit more accurate and efficient Web searches which are among the most important Web-based activitiesrdquo [Berners-Lee]
A personal definitionSemantic Web The concept that Web-accessible
content can be organized semantically rather than though syntactic and structural methods
What is DAML (DARPA Agent Markup Language)
a proposal to create technologies that will enable software agents to dynamically identify and understand information sources and to provide interoperability between agents in a semantic mannerBased on RDF+XMLAgent readable Tags
wwwdamlorg
DAML Example
Sou
rce
http
w
ww
zdn
etc
omp
cwee
kst
orie
sju
mps
04
270
2432
946
00h
tml
Three layered Architecture Of Semantic Web
Logical Layer
Formal Semantics and Reasoning Support ndash OIL DAML-O
Schema Layer
Definition of Vocabulary RDF Schema
Data Layer
Simple data model and syntax for metadata - RDF
OIL ndash as RDF Extension
ltrdfsClass rdfID=rdquoherbivorerdquogtltrdftype
rdfresource=rdquohttpwwwontoknowledgeorgDefinedClassrdquogtltrdfssubClassOf rdfresource=rdquoanimalrdquogtltrdfssubClassOfgt
ltoilNOTgtltoilhasOperand rdfresource=rdquocarnivorerdquogt
ltoilNOTgtltrdfssubClassOfgt
ltrdfsClassgt
DAML and OIL ndash Evolving towards Semantic Web
OIL MissionOIL is a Web-based representation and inference layer for ontologies which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Test May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.
NuclearTest Causes Earthquake <=
  dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
  AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
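The rule above can be sketched as a small predicate. This is an illustration only, assuming that dateDifference is measured in days, that distance is a great-circle distance in miles (computed here with the haversine formula), and that the earthquake must not precede the test; the `Event` class and function names are hypothetical, not InfoQuilt's API.

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


@dataclass
class Event:
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees


def date_difference(test, quake):
    """Days from the nuclear test to the earthquake (negative if the quake came first)."""
    return (quake.event_date - test.event_date).days


def distance(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles (haversine formula)."""
    p1, p2 = radians(lat1), radians(lat2)
    dphi = p2 - p1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(p1) * cos(p2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(min(1.0, a)))  # clamp guards rounding error


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """True if the quake followed the test within max_days and within max_miles."""
    days = date_difference(test, quake)
    miles = distance(test.latitude, test.longitude, quake.latitude, quake.longitude)
    return 0 <= days < max_days and miles < max_miles
```

Keeping the two thresholds as parameters makes it easy to tighten "nearby region" from the slide's 10000 to something more selective.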
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900.
Increase in number of earthquakes since 1945
Knowledge Discovery - Example…
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale per year, starting from 1900, find the average number of earthquakes.
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7.
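One possible reading of the grouping query above, as a sketch: count earthquakes per magnitude range and divide by the number of years considered. The input format (a list of `(year, magnitude)` pairs), the half-open range boundaries and the function names are assumptions for illustration.

```python
from collections import Counter

# (label, low, high): a magnitude m falls in a range when low <= m < high.
# Boundary values going to the upper range is an assumption.
RANGES = [
    ("5.8-6", 5.8, 6.0),
    ("6-7", 6.0, 7.0),
    ("7-8", 7.0, 8.0),
    ("8-9", 8.0, 9.0),
    (">9", 9.0, float("inf")),
]


def magnitude_range(magnitude):
    """Return the label of the range containing magnitude, or None if below 5.8."""
    for label, low, high in RANGES:
        if low <= magnitude < high:
            return label
    return None


def average_per_year(quakes, first_year, last_year):
    """Average yearly count per magnitude range for (year, magnitude) records."""
    counts = Counter()
    for year, magnitude in quakes:
        label = magnitude_range(magnitude)
        if label is not None and first_year <= year <= last_year:
            counts[label] += 1
    n_years = last_year - first_year + 1
    return {label: counts[label] / n_years for label, _, _ in RANGES}
```

Running it once for the years before 1950 and once for the years after would give the comparison the slide draws.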
Knowledge Discovery - Example…
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and within a radius of 10000 miles from the epicenter of the earthquake.
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
Example: Interrelated ontologies
[Diagram of interrelated ontologies. Node labels: LANDUSE, COMMERCIAL, INDUSTRIAL, RURAL, RESIDENTIAL, AGRICULTURAL, MILITARY, RECREATIONAL; LAND (SITE), CULTIVATED AREA, GREENLAND AREA, LAND BANK, ZONING; LANDFILL SITE, WASTE DISPOSAL, RECYCLING, HAZARDOUS, LANDFILL, RESOURCE REC, SOLID, SEWAGE, shredding, magnetic separation, screening, washing; NATURAL DISASTER, EARTHQUAKE, LANDSLIDE, VOLCANO, STORM, FLOOD, FIRE, AVALANCHE, TSUNAMI. Several edges are labeled "causes".]
HP 61
Large Vocabularies, Taxonomies, Ontologies
WordNet
The Medical Subject Headings (MeSH): NLM's controlled vocabulary used for indexing articles, for cataloging books and other holdings, and for searching MeSH-indexed databases, including MEDLINE. MeSH terminology provides a consistent way to retrieve information that may use different terminology for the same concepts. Year 2000 MeSH includes more than 19,000 main headings, 110,000 Supplementary Concept Records (formerly Supplementary Chemical Records), and an entry vocabulary of over 300,000 terms.
Confidential HP
Metadata-enabled Applications
HP 63
Metadata Usage: Impact on Search & Query Processing
Traditional queries based on keywords
Attribute-based queries
Content-based queries
HP 64
Oingo.com
Oingo Ontology - ODP based(?): the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning, used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged, and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:
• Context of the Content
• Semantic Metadata describing entities (e.g., Company, Industry, etc.) and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
"Normal" Content can only be "found" if the user enters a keyword that exists within it
Intelligent Content=+
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:
Context of the Content
Semantic Metadata describing entities (e.g., Company, Industry, etc.) and
Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …
HP 70
Metadata Usage Keyword Attribute and Content Based Access
The VisualHarness system at LSDIS/UGA
HP 71
Keyword Search vs Attribute Search with Semantic Metadata
Virage search on "football touchdown":
Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline
Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline
Metadata from Typical Cataloging of Football Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31, Pit 24
http://www.nfl.com
Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter.
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
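To make the contrast on this slide concrete, here is a minimal sketch of the two access styles over a tiny in-memory catalog. The asset records reuse the slide's examples; the record layout and the functions `keyword_search` and `attribute_search` are assumptions for illustration, not Taalee's or Virage's actual interfaces.

```python
ASSETS = [
    {
        "title": "Brian Griese Interview Part Four",
        "description": "Brian Griese talks about the first touchdown he ever threw",
        "metadata": {},  # typical cataloging: no structured attributes
    },
    {
        "title": "Baltimore 31 Pit 24",
        "description": (
            "Quandry Ismail and Tony Banks hook up for their third long "
            "touchdown to extend the Ravens' lead to 31-24 in the third quarter"
        ),
        "metadata": {
            "League": ["Professional"],
            "Teams": ["Ravens", "Steelers"],
            "Players": ["Quandry Ismail", "Tony Banks"],
            "Event": ["Touchdown"],
            "Produced by": ["NFL.com"],
        },
    },
]


def keyword_search(assets, *words):
    """Titles of assets whose title or description contains every word."""
    hits = []
    for asset in assets:
        text = f"{asset['title']} {asset['description']}".lower()
        if all(word.lower() in text for word in words):
            hits.append(asset["title"])
    return hits


def attribute_search(assets, criteria):
    """Titles of assets whose metadata holds every requested attribute value."""
    hits = []
    for asset in assets:
        metadata = asset["metadata"]
        if all(value in metadata.get(attr, []) for attr, value in criteria.items()):
            hits.append(asset["title"])
    return hits
```

A keyword search for "touchdown" returns both assets, including the interview; the attribute query `{"Event": "Touchdown", "Teams": "Steelers"}` returns only the game highlight, even though "Steelers" never appears in its text.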
HP 72
Taalee's Semantic Search
Highly customizable, precise, and freshest A/V search
Context and Domain Specific Attributes
Uniform Metadata for Content from Multiple Sources; can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY & CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content: Personalization and Targeting, interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries & Hot Topics
PERSONALIZATION
3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
4 Pink Floyd Collection
Personalized Queries
Set the Controls for the Heart of the Sun…
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
2 My Football Fantasy Team
Gators' Spurrier ready for big game
Tech's Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone…
Edwards explains reasons for leaving BYU more…
more…
more…
more…
1 Election 2000
2 Middle East Peace Conflict
3 Napster Controversy
Video Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Cole's security
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II more…
HOT Topics
more…
more…
HP 80
Metadata Targeting
Semantic/Interactive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
[Diagram labels: Realtime Feeds; Web sites and Pages; Content Databases; Time-Shifted Content Aggregator; Semantic Engine™; Structured Hi-Quality Semantic Metabase; Interests, Preferences; Personalized Content]
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site
User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide).
Search at the bottom is a semantic search that understands the financial domain and the knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu.
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions listen set alert on story add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and the analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst.
HP 87
iTV: Taalee's Extreme Personalization
Content Provider
(DBS, DISH, Wink, AOL-TV)
Semantic Engine™
Meta-Data Tagged Content
Content "Programs"
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-Quality Semantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in
This screen is customizable with interactivity features using metadata, such as whether there is a new Conference Call video on CSCO
Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata
The Conference Call itself can have embedded metadata to support personalization and interactivity
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support
Sony
Categorize
Catalog
Integrate
Collection
Processing
Network Content
Affiliate Feeds
Public Sources Rich Data
Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
-- Breaking News for 11/30/2000 --
Sixty Die In Nigeria Blast
Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore
HP 92
Retrieve Scene Description Track
Enhanced Digital Cable
Video
MPEG Decoder
Node = AVO Object
Create Scene Description Tree
GREAT USER EXPERIENCE
Metadata's role in emerging iTV infrastructure
MPEG-2/4/7
MPEG Encoder
Scene Description Tree
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers, and Broadcasters
Enhanced XML Description
"Cisco Systems"
Node
Taalee Semantic Engine
"Cisco Systems"
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta
Object Content Information (OCI)
Metadata-richValue-added Node
HP 93
Intelligent Metadata Creation
Content which does contain the words the user asked for
Extractor Agents
Content which does not contain the words the user asked for, but is about what he asked for
Value-added Metadata
Content the user did not think to ask for, but which he needs to know
Semantic Associations
+ +
Metadata for Intelligent Content
Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".
Understanding of the content is needed to create new metadata.
Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely.
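The idea of value-added metadata can be sketched in a few lines: entities extracted from the asset are looked up in a knowledge base, and the associated facts are attached as extra metadata. The `KNOWLEDGE_BASE` dictionary, its attribute names and the functions `enrich` and `matches` are hypothetical stand-ins for illustration, not Taalee's actual engine.

```python
# Hypothetical knowledge base: facts associated with known entities.
KNOWLEDGE_BASE = {
    "Roger Clemens": {"Sport": "Baseball", "Team": "New York Yankees"},
    "Jamal Anderson": {"Team": "Atlanta Falcons", "League": "NFL"},
}


def enrich(metadata):
    """Return a copy of metadata with facts about its players added."""
    enriched = {attr: list(values) for attr, values in metadata.items()}
    for player in metadata.get("Players", []):
        for attr, value in KNOWLEDGE_BASE.get(player, {}).items():
            values = enriched.setdefault(attr, [])
            if value not in values:
                values.append(value)
    return enriched


def matches(metadata, query):
    """True if the query text occurs in any metadata value."""
    q = query.lower()
    return any(q in value.lower() for values in metadata.values() for value in values)
```

With only extracted metadata, a story tagged `{"Players": ["Roger Clemens"]}` does not match a search for "Yankees"; after `enrich`, it does.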
HP 96
Guided Demo for Value-Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value-Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 - Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Posted Date: Date of asset posting - extracted automatically
Producer Name: Name of content provider that produced the asset
Player Names: Names of players mentioned explicitly in the asset - extracted automatically
Team Name: Name of team for which player plays - not mentioned explicitly in asset - value-added using Taalee's semantic relationships
League Name: Name of league to which the player's team belongs - not mentioned explicitly in asset - value-added by Taalee's processing based on semantic associations
Sport: Name of sport
Legend: "X → Y" means Taalee uses X to add Y as value-added metadata to the asset
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.
The asset is richly, fully described in the many ways the users chose to interact
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context, or relationships of keywords
For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
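A small sketch of how such associations could drive retrieval: relationships are stored as (subject, relation, object) triples, and a query for one company also pulls in stories about its competitors. The triple store, the relation names, the story records and the choice to treat `competes_with` as symmetric are all assumptions for illustration, not Taalee's Semantic Content Model.

```python
# Hypothetical associations, written as (subject, relation, object) triples.
TRIPLES = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Commerce One", "participates_in", "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
    ("Cisco", "competes_with", "Lucent"),
]

# Hypothetical catalog of stories with the entities each one mentions.
STORIES = [
    {"title": "Story on Cisco", "entities": ["Cisco"]},
    {"title": "Story on Lucent", "entities": ["Lucent"]},
    {"title": "Story on Ariba", "entities": ["Ariba"]},
]


def related(entity, relation):
    """Entities linked to entity by relation, in either direction (assumed symmetric)."""
    forward = {o for s, r, o in TRIPLES if s == entity and r == relation}
    backward = {s for s, r, o in TRIPLES if o == entity and r == relation}
    return forward | backward


def stories_for(entity):
    """Return (stories about the entity, stories about its competitors)."""
    asked_for = [s["title"] for s in STORIES if entity in s["entities"]]
    competitors = related(entity, "competes_with")
    need_to_know = [s["title"] for s in STORIES if competitors & set(s["entities"])]
    return asked_for, need_to_know
```

The first list is what the user asked for; the second is the associated content a keyword index alone would not surface.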
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Internal Source 1 (Research)
Internal Source 2
External feeds/Web (e.g., Reuters)
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults KnowledgeBase for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard
Story on Lucent
Story on Cisco
XCM-compliant metadata, XML or other format
Semantic Application
ASP/Enterprise hosted
Extractor Agent 1
Extractor Agent 2
Extractor Agent 3
Metadata-centric Content Management Architecture
Semantic Engine
World Model
Taalee Metabase
Third-party Content Mgmt and Syndication
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What You Need to Know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
SEC, EPA
Regulations: Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic collation of semantically related digital media information from multiple sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling is obviously not enough; we need a "logic layer" on top of RDF. Some type of description logic is a possibility.
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data.
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 61
Large Vocabularies TaxonomiesOntologies
WordNetThe Medical Subject Headings (MeSH) NLMscontrolled vocabulary used for indexing articles for cataloging books and other holdings and for searching MeSH-indexed databases including MEDLINE MeSHterminology provides a consistent way to retrieve information that may use different terminology for the same concepts Year 2000 MeSH includes more than 19000 main headings 110000 Supplementary Concept Records (formerly Supplementary Chemical Records) and an entry vocabulary of over 300000 terms
Confidential HP
Metadata enabledApplications
HP 63
Metadata Usage Metadata Usage Impact on Search amp Query processing Impact on Search amp Query processing
traditional queries based on keywordsattribute based queriescontent-based queries
HP 64
Oingocom
Oingo Ontology ndash ODP based() the database of millions of concepts and relationships that powers Oingossemantic technologyOingo Seek - the database of millions of concepts and relationships that powers Oingos semantic technologyOingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and contextOingo Lingua - the language of meaning used to state intent The basis for intelligent interactionAssets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely-related high-value information beyond what
was requested
Ability to explore any dimension around the immediate
point of interest Intelligent content helps the user
ldquothinkrdquo about and fulfill their information needs with less effort
Intelligent content can bemore effectively managed packaged and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more ldquointelligentrdquo through automatic analysis of every
individual asset to generate a catalog containing
bull Context of the Content
bull Semantic Metadata describing entities (ie Company Industry etc) and
bull Relationships (semantic associations) among all entities
Based on a ldquoSemanticrdquo or ldquodomainrdquo model describing how the user thinks about the subject matter supported by a knowledgebase
ldquoNormalrdquo Content can only be ldquofoundrdquo if the user enters a keyword that exists within it
Intelligent Content=+
Adding related metadata and relationshipsdramatically increases the ability to
automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more ldquointelligentrdquo through automatic analysis of every individual content item to create
Context of the ContentSemantic Metadata describing entities (ie Company Industryetc) andRelationships (semantic associations) among all entities
Based on a ldquoSemanticrdquo or ldquodomainrdquo model describing how the user thinks about the subject matter supported by a knowledgebase
HP 69
Metadata amp Search
Metadata can improve search significantly but metadata enables much more than searchAlternatives for improving search clustering link and other analysis (eg Googlersquos Link Flux analysis) classification as context ontologies metadata knowledgebases hellip
HP 70
Metadata Usage Keyword Attribute and Content Based Access
The VisualHarness system at LSDISUGA
HP 71
Keyword Search vs. Attribute Search with Semantic Metadata
Virage search on football touchdown:
Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline
Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline
Metadata from Typical Cataloging of Football Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31, Pit 24
http://www.nfl.com
Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens’ lead to 31-24 in the third quarter
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
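The contrast on this slide can be sketched in a few lines of Python. This is only an illustration under assumptions: the dictionary layout, the field names, and both function names are hypothetical, and only the asset values come from the slide. It shows that a keyword search for a team name misses the asset because the word never appears in the text, while an attribute search over the structured metadata finds it.

```python
# Hypothetical catalog entry; values are from the slide, the layout is assumed.
ASSETS = [
    {
        "title": "Baltimore 31 Pit 24",
        "description": (
            "Quandry Ismail and Tony Banks hook up for their third long "
            "touchdown, this time on a 76-yarder"
        ),
        "metadata": {
            "League": "Professional",
            "Teams": ["Ravens", "Steelers"],
            "Players": ["Quandry Ismail", "Tony Banks"],
            "Event": "Touchdown",
        },
    },
]


def keyword_search(assets, keyword):
    """Return assets whose title or description literally contains the keyword."""
    needle = keyword.lower()
    return [
        asset for asset in assets
        if needle in asset["title"].lower() or needle in asset["description"].lower()
    ]


def attribute_search(assets, **criteria):
    """Return assets whose structured metadata matches every given attribute."""
    def matches(asset):
        for field, wanted in criteria.items():
            value = asset["metadata"].get(field)
            values = value if isinstance(value, list) else [value]
            if wanted not in values:
                return False
        return True

    return [asset for asset in assets if matches(asset)]


keyword_hits = keyword_search(ASSETS, "Steelers")
attribute_hits = attribute_search(ASSETS, Teams="Steelers", Event="Touchdown")
print(len(keyword_hits), len(attribute_hits))
```

Because the attributes are uniform across sources, the same `attribute_search` call could also be used to sort or filter by any field, which is the point made on the following slide.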
HP 72
Taalee’s Semantic Search
Highly customizable, precise, and freshest A/V search
Context and Domain Specific Attributes; Uniform Metadata for Content from Multiple Sources; Can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY & CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content: Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries & Hot Topics
PERSONALIZATION
3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
4 Pink Floyd Collection
Personalized Queries
Set the Controls for the Heart of the Sun…
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
2 My Football Fantasy Team
Gators' Spurrier ready for big game
Tech's Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone…
Edwards explains reasons for leaving BYU more…
more…
more…
more…
1 Election 2000
2 Middle East Peace Conflict
3 Napster Controversy
Video Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Cole's security
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II more…
HOT Topics
more…
more…
HP 80
Metadata Targeting
Semantic/Interactive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web: Extreme Personalization
Realtime Feeds
Interests, Preferences
Time-Shifted Content Aggregator
Web sites and Pages
Content Databases
Semantic Engine™
Personalized Content
Structured Hi-Quality Semantic Metabase
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette’s Enterprise Customer site
User’s “Wireless Home page” shows the categories for his interests. There is an alert (new content) for his stock and sports categories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down user’s Personal Portfolio list. The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and the knowledge of user’s portfolio. Typically search can be done by typing one word or selecting from a dynamic personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata such as “Strong Buy” by H&Q Analyst
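One way to picture semantic filtering for a small screen is a per-content-type priority list of metadata fields, cut off when the line no longer fits. The sketch below is an assumption, not Taalee's implementation: `FIELD_PRIORITY`, `DEFAULT_FIELDS`, `filter_for_screen`, and the `rating` field are hypothetical names, and only the three listing rows come from the slide.

```python
# Assumed priority order of metadata fields per content type. The slide only
# states that an Analyst Call listing focuses on the source and the analyst
# name or company; everything else here is illustrative.
FIELD_PRIORITY = {
    "Analyst Call": ["date", "source", "analyst"],
}
DEFAULT_FIELDS = ["date", "title"]


def filter_for_screen(item, content_type, max_chars):
    """Join the highest-priority fields that fit within max_chars."""
    parts = []
    for field in FIELD_PRIORITY.get(content_type, DEFAULT_FIELDS):
        value = item.get(field)
        if not value:
            continue
        candidate = " ".join(parts + [value])
        if len(candidate) > max_chars:
            break
        parts.append(value)
    return " ".join(parts)


# Listing values from the slide; "rating" stands in for the extra metadata
# that the slide says is shown as an icon rather than as text.
calls = [
    {"date": "11/08", "source": "ON24", "analyst": "Payne"},
    {"date": "11/07", "source": "ON24", "analyst": "H&Q", "rating": "Strong Buy"},
    {"date": "11/06", "source": "CBS", "analyst": "Langlesis"},
]

for call in calls:
    print(filter_for_screen(call, "Analyst Call", max_chars=20))
```

A narrower `max_chars` drops the lowest-priority field first, so the same metadata record can serve a phone screen and a wider display.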
HP 87
iTV: Taalee’s Extreme Personalization
Content Provider
(DBS, DISH, Wink, AOL-TV)
Semantic Engine™
Meta-Data Tagged Content
Content “Programs”
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-Quality Semantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in
This screen is customizable with interactivity features, using metadata such as whether there is a new Conference Call video on CSCO
Part of the screen can be automatically customized to show conference call specific information, including transcript, participation, etc., all of which are relevant metadata
Conference Call itself can have embedded metadata to support personalization and interactivity
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support
Sony
Categorize
Catalog
Integrate
Collection
Processing
Network Content
Affiliate Feeds
Public Sources Rich Data
Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
“At least 60 people died in this needless fire,” senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee’s semantic metadata enables powerful access to content used by Enterprise’s customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) – Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore
HP 92
Metadata’s role in emerging iTV infrastructure
[Diagram labels: Video; MPEG Encoder (MPEG-2/4/7); Create Scene Description Tree; Scene Description Tree; Node = AVO Object; Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; Taalee Semantic Engine; Enhanced XML Description; “Cisco Systems” Node; Object Content Information (OCI); Metadata-rich Value-added Node; GREAT USER EXPERIENCE]
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers, and Broadcasters
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta
HP 93
Intelligent Metadata Creation
Extractor Agents: Content which does contain the words the user asked for
+ Value-added Metadata: Content which does not contain the words the user asked for, but is about what he asked for
+ Semantic Associations: Content the user did not think to ask for, but which he needs to know
Metadata for Intelligent Content
Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the “right” keyword
For example: If a story is about “Roger Clemens” but does not contain the words “New York Yankees”, that story cannot and will not be found if the user searches for “New York Yankees” or “Yankees”
Understanding of the content is needed to create new metadata
Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
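The Roger Clemens example can be illustrated with a toy knowledge base of associations. This is a minimal sketch under assumptions, not Taalee's patented method: the triple layout, the relation names, `RELATION_TO_FIELD`, and the sample story sentence are all made up for illustration; only the facts about Roger Clemens come from the slide.

```python
# Toy knowledge base of semantic associations as (subject, relation, object).
# The facts are from the slide; the triple layout and relation names are assumed.
ASSOCIATIONS = [
    ("Roger Clemens", "is_a", "PERSON"),
    ("Roger Clemens", "plays_sport", "Baseball"),
    ("Roger Clemens", "plays_for", "New York Yankees"),
]

# Assumed mapping from relation to the metadata field it fills.
RELATION_TO_FIELD = {"plays_sport": "Sport", "plays_for": "Team"}


def extract_players(text, known_people):
    """Stand-in for an extractor agent: find known names mentioned in the text."""
    return [person for person in known_people if person in text]


def enrich(text):
    """Extract explicit metadata, then add value-added metadata from associations."""
    people = sorted(
        {s for s, relation, o in ASSOCIATIONS if relation == "is_a" and o == "PERSON"}
    )
    metadata = {"Players": extract_players(text, people)}
    for player in metadata["Players"]:
        for subject, relation, obj in ASSOCIATIONS:
            field = RELATION_TO_FIELD.get(relation)
            if subject == player and field:
                values = metadata.setdefault(field, [])
                if obj not in values:
                    values.append(obj)
    return metadata


# Made-up story text that never mentions the team or the sport.
story = "Roger Clemens struck out ten batters last night."
metadata = enrich(story)
print(metadata)
```

With the team attached as metadata, a search for “New York Yankees” can return the story even though the phrase is absent from its text.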
HP 96
Guided Demo for Value Added Metadata – Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled “Week 3 Top10 Anderson TD Run”) and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked “REAL”)
• View the source HTML page that has the original story and locate this story with the heading “Week 3 top 10 Anderson TD run”
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset’s existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player of Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata – Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled “I want out”) & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked “REAL”)
• View the source HTML page that has the original story and locate this story with the heading “I want out”
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset’s existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player of Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 – Snapshots (“Jamal Anderson”)
Search for ‘Jamal Anderson’ in ‘Football’
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee’s value-additions to the metadata to facilitate easier search
HP 99
Example 2 – Snapshots (“Gary Sheffield”)
Search for ‘Gary Sheffield’ in ‘Baseball’
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee’s value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content – Value-Added Metadata
Rich Media Sports Asset
Posted Date: Date of asset posting – Extracted automatically
Producer Name: Name of content provider that produced the asset
Player Names: Names of players mentioned explicitly in the asset – Extracted automatically
Team Name: Name of team for which player plays – Not mentioned explicitly in asset – Value-added using Taalee’s semantic relationships
League Name: Name of league to which the player’s team belongs – Not mentioned explicitly in asset – Value-added by Taalee’s processing based on semantic associations
Sport: Name of sport
Some Metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships
The asset is richly, fully described in the many ways the users chose to interact
Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context, or relationships of keywords
For example, a search engine may see that the word “Commerce One” occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee’s Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
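A small sketch can show how a COMPETES WITH association turns a plain company search into “what you asked for” plus related content. Everything structural here is an assumption for illustration: the triple layout, the function names, the placeholder catalog entries, and the choice to treat `competes_with` as symmetric. Only the Commerce One facts come from the slide.

```python
# Associations named on the slide, stored as (subject, relation, object).
ASSOCIATIONS = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Commerce One", "participates_in", "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]

# Placeholder catalog; headlines are invented stand-ins for cataloged assets.
CATALOG = [
    {"headline": "Placeholder story about Commerce One", "Company": "Commerce One"},
    {"headline": "Placeholder story about Ariba", "Company": "Ariba"},
    {"headline": "Placeholder story about an unrelated company", "Company": "Example Corp"},
]


def related_entities(entity, relation):
    """Entities linked to `entity` by `relation`, treating the relation as symmetric."""
    related = set()
    for subject, rel, obj in ASSOCIATIONS:
        if rel != relation:
            continue
        if subject == entity:
            related.add(obj)
        elif obj == entity:
            related.add(subject)
    return sorted(related)


def search_with_associations(company):
    """Return (assets about the company, assets about its competitors)."""
    asked_for = [asset for asset in CATALOG if asset["Company"] == company]
    competitors = related_entities(company, "competes_with")
    associated = [asset for asset in CATALOG if asset["Company"] in competitors]
    return asked_for, associated


asked_for, associated = search_with_associations("Commerce One")
print([asset["headline"] for asset in asked_for])
print([asset["headline"] for asset in associated])
```

Adding a new association to the knowledge base changes the related results without re-indexing any content, which is one reading of why the slide calls keyword-only approaches static and hard to scale.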
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company ‘Commerce One’
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (To view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One’s competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata centric Content Management Architecture
Sources: Internal Source 1 Research; Internal Source 2; External feeds/Web (e.g., Reuters)
Components: Extractor Agents 1-3; Semantic Engine; World Model; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication
Interchange: XCM-compliant metadata, XML or other format
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults KnowledgeBase for Cisco’s competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as “semantically related” to Cisco story – passed on to Dashboard
Output: Story on Cisco; Story on Lucent
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News
Competition
COMPANIES in INDUSTRY with Competing PRODUCTS
COMPANIES in Same or Related INDUSTRY
SEC, EPA
Regulations
Impacting INDUSTRY or Filed By COMPANY
Technology Products
Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example
Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for future
Semantic Web Complex Relationships and Knowledge Discovery
Eg InfoQuilt project at LSDIS Lab Univ of Georgia
HP 108
Beyond RDF – one proposal (cf. Ora Lassila)
Structural modeling obviously not enough: we need a “logic layer” on top of RDF; some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data
RDF + DL = “Frame System for WWW”
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
“A Web in which machine reasoning will be ubiquitous and devastatingly powerful.” [Berners-Lee]
“A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture.” [Berners-Lee]
“A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities.” [Berners-Lee]
A personal definition
Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF+XML
Agent readable Tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three layered Architecture Of Semantic Web
Logical Layer
Formal Semantics and Reasoning Support – OIL, DAML-O
Schema Layer
Definition of Vocabulary – RDF Schema
Data Layer
Simple data model and syntax for metadata – RDF
OIL – as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL – Evolving towards Semantic Web
OIL Mission
OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Test May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
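The “Causes” rule can be read as an ordinary predicate over pairs of events. The following is a sketch under stated assumptions rather than the InfoQuilt implementation: the `Event` class and all function names are hypothetical, the distance is assumed to be a great-circle distance in miles (the rule gives no unit), the earthquake is required to come on or after the test (taken from the prose, not the rule), and the sample events are invented.

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; miles are an assumed unit


@dataclass
class Event:
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees


def date_difference(test, quake):
    """Days from the nuclear test to the earthquake (negative if the quake came first)."""
    return (quake.event_date - test.event_date).days


def distance(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles, using the haversine formula."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


def could_have_caused(test, quake, max_days=30, max_miles=10000):
    """The slide's rule: quake within max_days after the test and within max_miles."""
    days = date_difference(test, quake)
    near = distance(test.latitude, test.longitude,
                    quake.latitude, quake.longitude) < max_miles
    return 0 <= days < max_days and near


def find_pairs(tests, quakes):
    """All (test, quake) pairs that satisfy the rule."""
    return [(t, q) for t in tests for q in quakes if could_have_caused(t, q)]


# Made-up events, for illustration only.
test = Event(date(2000, 1, 1), 0.0, 0.0)
quakes = [
    Event(date(2000, 1, 15), 0.0, 10.0),   # 14 days later, nearby
    Event(date(2000, 3, 1), 0.0, 10.0),    # 60 days later
    Event(date(1999, 12, 20), 0.0, 10.0),  # before the test
]
pairs = find_pairs([test], quakes)
print(len(pairs))
```

The thresholds are parameters so that the 30-day and 10000 figures from the rule can be tightened when exploring the data.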
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example…
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale per year starting from 1900, find average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
Knowledge Discovery - Example…
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
Ability to explore any dimension around the immediate
point of interest Intelligent content helps the user
ldquothinkrdquo about and fulfill their information needs with less effort
Intelligent content can bemore effectively managed packaged and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more ldquointelligentrdquo through automatic analysis of every
individual asset to generate a catalog containing
bull Context of the Content
bull Semantic Metadata describing entities (ie Company Industry etc) and
bull Relationships (semantic associations) among all entities
Based on a ldquoSemanticrdquo or ldquodomainrdquo model describing how the user thinks about the subject matter supported by a knowledgebase
ldquoNormalrdquo Content can only be ldquofoundrdquo if the user enters a keyword that exists within it
Intelligent Content=+
Adding related metadata and relationshipsdramatically increases the ability to
automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more ldquointelligentrdquo through automatic analysis of every individual content item to create
Context of the ContentSemantic Metadata describing entities (ie Company Industryetc) andRelationships (semantic associations) among all entities
Based on a ldquoSemanticrdquo or ldquodomainrdquo model describing how the user thinks about the subject matter supported by a knowledgebase
HP 69
Metadata amp Search
Metadata can improve search significantly but metadata enables much more than searchAlternatives for improving search clustering link and other analysis (eg Googlersquos Link Flux analysis) classification as context ontologies metadata knowledgebases hellip
HP 70
Metadata Usage Keyword Attribute and Content Based Access
The VisualHarness system at LSDISUGA
HP 71
Keyword Search vs Attribute Search with Semantic metadata
Virage Search on football touchdown
Jimmy Smith Interview Part SevenJimmy Smith explains his philosophy on showboating URL httpcbssportsline
Brian Griese Interview Part FourBrian Griese talks about the first touchdown he ever threw URL httpcbssportsline
Metadata from Typical Cataloging of Football
Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31 Pit 24
httpwwwnflcom
Quandry Ismail and Tony Banks hook up for their third long touchdown this time on a 76-yarder to extend the Ravenrsquos lead to 31-24 in the third quarter
ProfessionalRavens SteelersBal 31 Pit 24Quandry Ismail Tony BanksTouchdownNFLcom2022000
LeagueTeamsScore
PlayersEvent
Produced byPosted date
HP 72
Taaleersquos Semantic Search
Highly customizable precise and freshest AV search
Context and Domain Specific Attributes Uniform Metadata for Content from Multiple Sources Can be sorted by any field
Delightful relevant informationexceptional targeting opportunity
HP 73
Cre
atin
g a
Web
of
rela
ted
info
rmat
ion
Wha
t can
a c
onte
xt d
o
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY amp CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh contentPersonalization andTargetinginteractive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries amp Hot Topics
PERSONALIZATION
3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
4 Pink Floyd Collection
Personalized Queries
Set the Controls for the Heart of the Sunhellip
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
ATampT Corp
2 My Football Fantasy Team
Gators Spurrier ready for big game
Techs Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zonehellip
Edwards explains reasons for leaving BYU morehellip
morehellip
morehellip
morehellip
1 Election 2000
2 Middle East Peace Conflict
3 Napster Controversy
Video Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Coles security
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II morehellip
HOT Topics
morehellip
morehellip
HP 80
Metadata Targeting
SemanticInteractive Targeting
Buy Al Pacino VideosBuy Russell Crowe VideosBuy Christopher Plummer VideosBuy Diane Venora VideosBuy Philip Baker Hall VideosBuy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
Realtime Feeds
Interests Preferences
Time-ShiftedContent Aggregator
Web sites and Pages
ContentDatabases Personalized
Content
Semantic EngineTM
Personalized Content
Content
Structured Hi-Quality
Semantic Metabase
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed WebBased registration and personalization at VoquettersquosEnterprise Customer site
Userrsquos ldquoWireless Home pagerdquo shows the categories for his interests There is an alert (new content) for his stock and sportscategories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down userrsquos Personal Portfolio list The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and the knowledge of userrsquos portfolio Typically search can be done by typing one word or selecting from a dynamic personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions listen set alert on story add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
1108 ON24 Payne1107 ON24 HampQ 1106 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listingsorted by date Semantic filtering uses just the right metadata to meet screen and other constrains Eg Analyst Call focuses onthe source and analyst name or company The icon denote additional metadata such as ldquoStrong Buyrdquo by HampQ Analyst
HP 87
iTV Taaleersquos Extreme Personalization
Content Provider
(DBS DISH Wink AOL-TV)
Semantic EngineTM
Meta-DataTagged Content
ContentldquoProgramsrdquo
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-QualitySemantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that isused by personalization application to show only the stocksthat user is interested in
This screen is customizablewith interactivity featureusing metadata such as whetherthere is a new ConferenceCall video on CSCO
Part of the screen can beautomatically customized to show conference call specific informationndash including transcriptparticipation etc all of which arerelevant metadata
Conference Call itself can have embedded metadata to support personalization andinteractivity
HP 89
Metadata in Enterprise Apps
Filter Search ConsolidatePersonalize ArchiveLicensing Syndication
Production SupportProduction SupportSony
Categorize
Catalog
Integrate
CollectionCollection ProcessingProcessing
NetworkContent
AffiliateFeeds
Public Sources Rich Data
Metabase
HP 90
t
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp. (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
-- Breaking News for 11/30/2000 --
Customize Page Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers
• Greatly enhances newsroom productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore
HP 92
Retrieve Scene Description Track
Enhanced Digital Cable
Video
MPEG Decoder
Node = AVO Object
Create Scene Description Tree
GREAT USER EXPERIENCE
Metadata's role in emerging iTV infrastructure
MPEG-2/4/7
MPEG Encoder
Scene Description Tree
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers, and Broadcasters
Enhanced XML
Description
"Cisco Systems"
Node
Taalee Semantic Engine
"Cisco Systems"
Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: John Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta
Object Content Information (OCI)
Metadata-rich Value-added Node
HP 93
Intelligent Metadata Creation
Content which does contain the words the user asked for
Extractor Agents
Content which does not contain the words the user asked for, but is about what he asked for
Value-added Metadata
Content the user did not think to ask for, but which he needs to know
Semantic Associations
+ +
Metadata for Intelligent Content
Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".
Understanding of the content is needed to create new metadata.
Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely.
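The idea on this slide can be sketched in a few lines of Python. This is only an illustration, not Taalee's implementation: the `KNOWLEDGE_BASE` layout, the field names, and the `enrich` and `matches` helpers are all assumptions made for the example. It shows a story that mentions only a player becoming findable by team name once associated facts are added as metadata.

```python
# Hypothetical toy knowledge base. The entities come from the slides;
# the dictionary layout is an assumption made for this sketch.
KNOWLEDGE_BASE = {
    "Roger Clemens": {"sport": "Baseball", "team": "New York Yankees"},
    "Jamal Anderson": {"sport": "Football", "team": "Atlanta Falcons", "league": "NFL"},
}


def enrich(metadata):
    """Return a copy of `metadata` with team/sport/league values implied by its players."""
    enriched = {field: list(values) for field, values in metadata.items()}
    for player in metadata.get("players", []):
        facts = KNOWLEDGE_BASE.get(player, {})
        for field in ("team", "sport", "league"):
            value = facts.get(field)
            if value is None:
                continue  # the knowledge base has nothing to add for this field
            bucket = enriched.setdefault(field + "s", [])
            if value not in bucket:
                bucket.append(value)
    return enriched


def matches(metadata, query):
    """Case-insensitive substring match of `query` against every metadata value."""
    needle = query.lower()
    return any(needle in value.lower() for values in metadata.values() for value in values)


# Metadata extracted directly from a story that never mentions the team.
story = {"players": ["Roger Clemens"]}
enriched_story = enrich(story)

print(matches(story, "Yankees"))           # keyword-only view of the asset
print(matches(enriched_story, "Yankees"))  # view with value-added metadata
```

Keeping the extracted metadata and the value-added metadata in the same record means a single lookup serves both kinds of query; a real system would also need to record where each value came from.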
HP 96
Guided Demo for Value-Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that neither Team=Atlanta Falcons nor League=NFL was present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value-Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that neither Team=Los Angeles Dodgers nor League=National League was present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 - Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Posted Date: date of asset posting - extracted automatically
League Name: name of league to which the player's team belongs - not mentioned explicitly in asset - value-added by Taalee's processing based on semantic associations
Team Name: name of team for which player plays - not mentioned explicitly in asset - value-added using Taalee's semantic relationships
Producer Name: name of content provider that produced the asset
Player Names: names of players mentioned explicitly in the asset - extracted automatically
Sport: name of sport
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.
The asset is richly and fully described in the many ways users choose to interact.
Legend: an arrow from X to Y means Taalee uses X to add Y as value-added metadata to the asset
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context, or relationships of keywords
For example, a search engine may see that the words "Commerce One" occur, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
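A minimal sketch of how such associations could be stored and used to surface related stories follows. The `AssociationStore` class, the relation names, and the story records are hypothetical and chosen only for illustration; the company relationships are the ones named on the slides.

```python
from collections import defaultdict


class AssociationStore:
    """Tiny in-memory store of (subject, relation) -> objects facts. Illustrative only."""

    def __init__(self):
        self._edges = defaultdict(set)

    def add(self, subject, relation, obj, symmetric=False):
        self._edges[(subject, relation)].add(obj)
        if symmetric:
            # e.g. "competes_with" holds in both directions
            self._edges[(obj, relation)].add(subject)

    def related(self, subject, relation):
        return sorted(self._edges.get((subject, relation), set()))


store = AssociationStore()
store.add("Commerce One", "participates_in", "Corporate, Professional & Financial Software")
store.add("Commerce One", "competes_with", "Ariba", symmetric=True)
store.add("Cisco", "competes_with", "Lucent", symmetric=True)

# Hypothetical catalog entries: each story is tagged with the companies it is about.
stories = [
    {"title": "Story on Cisco", "companies": ["Cisco"]},
    {"title": "Story on Lucent", "companies": ["Lucent"]},
    {"title": "Story on Ariba", "companies": ["Ariba"]},
]


def semantically_related(company, catalog):
    """Titles of stories about competitors of `company`, i.e. content not asked for directly."""
    competitors = set(store.related(company, "competes_with"))
    return [story["title"] for story in catalog if competitors & set(story["companies"])]


print(semantically_related("Commerce One", stories))
```

A keyword index alone would return nothing about Ariba for a "Commerce One" query; the extra hop through the association is what produces the related story.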
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Internal Source 1: Research
Internal Source 2
External feeds/Web (e.g., Reuters)
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults Knowledge Base for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard
Story on Lucent
Story on Cisco
XCM-compliant metadata, XML or other format
Semantic Application
ASP/Enterprise hosted
Extractor Agent 1
Extractor Agent 2
Extractor Agent 3
Metadata-centric Content Management Architecture
Semantic Engine
World Model
Taalee Metabase
Third-party Content Mgmt and Syndication
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
SEC, EPA
Regulations: Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic rules which allow us to draw inferences from this data
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]
A personal definition
Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF+XML
Agent-readable tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-layered Architecture of the Semantic Web
Logical Layer
Formal Semantics and Reasoning Support - OIL, DAML-O
Schema Layer
Definition of Vocabulary - RDF Schema
Data Layer
Simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL - Evolving towards Semantic Web
OIL Mission
OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Test May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
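The rule above can be sketched as an executable predicate. The following is an illustration under stated assumptions rather than the InfoQuilt implementation: the `Event` type and function names are hypothetical, dates are compared in days, distance is computed with the haversine great-circle formula in miles, and the requirement that the earthquake not precede the test is taken from the prose rather than from the rule itself.

```python
import math
from dataclasses import dataclass
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; unit choice (miles) is an assumption


@dataclass(frozen=True)
class Event:
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees


def date_difference(earlier, later):
    """Number of days from `earlier` to `later` (negative if `later` comes first)."""
    return (later - earlier).days


def distance(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points, via the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """True if the quake follows the test within `max_days` and lies within `max_miles`."""
    days = date_difference(test.event_date, quake.event_date)
    miles = distance(test.latitude, test.longitude, quake.latitude, quake.longitude)
    return 0 <= days < max_days and miles < max_miles


# Hypothetical events, for illustration only (not real observations).
test_event = Event(date(2000, 1, 1), 0.0, 0.0)
quake_soon_after = Event(date(2000, 1, 15), 1.0, 1.0)
print(may_have_caused(test_event, quake_soon_after))
```

Note that a 10000-mile threshold covers most of the globe (the farthest two points can be is roughly 12400 miles), so in practice the date condition does most of the filtering.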
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example…
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale per year, starting from 1900, find the average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
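The grouping query above can be sketched as follows. The magnitude ranges are the ones listed on the slide; everything else is an assumption for illustration: ranges are treated as half-open (a magnitude of exactly 6 falls in "6-7", and the last bucket starts at 9), the input is a list of hypothetical `(year, magnitude)` pairs, and the average is taken over an explicitly supplied span of years.

```python
from collections import Counter

# (lower bound inclusive, upper bound exclusive, label); boundary handling is an assumption.
BUCKETS = [
    (5.8, 6.0, "5.8-6"),
    (6.0, 7.0, "6-7"),
    (7.0, 8.0, "7-8"),
    (8.0, 9.0, "8-9"),
    (9.0, float("inf"), ">9"),
]


def bucket_for(magnitude):
    """Label of the magnitude range containing `magnitude`, or None if below 5.8."""
    for low, high, label in BUCKETS:
        if low <= magnitude < high:
            return label
    return None


def average_per_year(quakes, start_year, end_year):
    """Average yearly count of earthquakes per magnitude range over [start_year, end_year]."""
    n_years = end_year - start_year + 1
    counts = Counter()
    for year, magnitude in quakes:
        label = bucket_for(magnitude)
        if label is not None and start_year <= year <= end_year:
            counts[label] += 1
    return {label: counts[label] / n_years for _, _, label in BUCKETS}


# Hypothetical (year, magnitude) records, for illustration only.
sample = [(1900, 5.9), (1900, 6.5), (1901, 6.1), (1901, 7.2), (1899, 8.0), (1901, 5.0)]
print(average_per_year(sample, 1900, 1901))
```

Running the same function over the years before and after a cutoff (the slide uses 1945) gives the per-range comparison the slide describes.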
Knowledge Discovery - Example…
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw-Hill, ISBN 0-07-057735-8, 1998
HP 63
Metadata Usage: Impact on Search & Query Processing
traditional queries based on keywords
attribute-based queries
content-based queries
HP 64
Oingo.com
Oingo Ontology (ODP based) - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
Oingo Lingua - the language of meaning, used to state intent. The basis for intelligent interaction
Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged, and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:
• Context of the Content
• Semantic Metadata describing entities (e.g., Company, Industry, etc.) and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
"Normal" Content can only be "found" if the user enters a keyword that exists within it
Intelligent Content=+
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:
Context of the Content
Semantic Metadata describing entities (e.g., Company, Industry, etc.) and
Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata amp Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …
HP 70
Metadata Usage: Keyword, Attribute, and Content Based Access
The VisualHarness system at LSDIS/UGA
HP 71
Keyword Search vs. Attribute Search with Semantic Metadata
Virage Search on football touchdown
Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline
Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline
Metadata from Typical Cataloging of Football Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31 Pit 24
http://www.nfl.com
Qadry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Qadry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
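The contrast between the two result sets can be sketched in Python. This is an illustrative toy, not the Virage or Taalee search code: the record layout and both search functions are assumptions, and the metadata attached to the interview asset is hypothetical. A keyword search for "touchdown" matches any asset whose text mentions the word, while an attribute search on `Event` returns only assets catalogued as touchdown plays.

```python
# Assets modelled loosely on the slide; the second record's metadata is hypothetical.
ASSETS = [
    {
        "title": "Baltimore 31 Pit 24",
        "description": "Third long touchdown, a 76-yarder, extends the lead to 31-24.",
        "metadata": {
            "League": ["Professional"],
            "Teams": ["Ravens", "Steelers"],
            "Players": ["Tony Banks"],
            "Event": ["Touchdown"],
        },
    },
    {
        "title": "Brian Griese Interview Part Four",
        "description": "Brian Griese talks about the first touchdown he ever threw.",
        "metadata": {"Players": ["Brian Griese"], "Event": ["Interview"]},
    },
]


def keyword_search(assets, word):
    """Titles of assets whose title or description contains `word` (case-insensitive)."""
    needle = word.lower()
    return [a["title"] for a in assets if needle in (a["title"] + " " + a["description"]).lower()]


def attribute_search(assets, **criteria):
    """Titles of assets whose metadata matches every field=value criterion exactly."""
    def satisfies(asset):
        return all(
            any(wanted.lower() == value.lower() for value in asset["metadata"].get(field, []))
            for field, wanted in criteria.items()
        )
    return [a["title"] for a in assets if satisfies(a)]


print(keyword_search(ASSETS, "touchdown"))          # matches the interview as well
print(attribute_search(ASSETS, Event="Touchdown"))  # only the actual touchdown play
```

Because attribute criteria are combined with AND, adding a field such as `Teams="Ravens"` narrows the results further without any change to the catalog.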
HP 72
Taalee's Semantic Search
Highly customizable, precise, and freshest A/V search
Context and Domain Specific Attributes
Uniform Metadata for Content from Multiple Sources
Can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY amp CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content: Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries amp Hot Topics
PERSONALIZATION
3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
4 Pink Floyd Collection
Personalized Queries
Set the Controls for the Heart of the Sun…
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
1. My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc.
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc.
AT&T Corp.
2. My Football Fantasy Team
Gators' Spurrier ready for big game
Tech's Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone…
Edwards explains reasons for leaving BYU
more…
more…
more…
more…
1. Election 2000
2. Middle East Peace Conflict
3. Napster Controversy
Video: Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Cole's security
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II
more…
HOT Topics
more…
more…
HP 80
Metadata Targeting
SemanticInteractive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
Realtime Feeds
Interests Preferences
Time-Shifted Content Aggregator
Web sites and Pages
Content Databases
Personalized Content
Semantic Engine™
Personalized Content
Content
Structured Hi-Quality
Semantic Metabase
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site
User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst.
HP 87
iTV Taaleersquos Extreme Personalization
Content Provider
(DBS DISH Wink AOL-TV)
Semantic EngineTM
Meta-DataTagged Content
ContentldquoProgramsrdquo
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-QualitySemantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that isused by personalization application to show only the stocksthat user is interested in
This screen is customizablewith interactivity featureusing metadata such as whetherthere is a new ConferenceCall video on CSCO
Part of the screen can beautomatically customized to show conference call specific informationndash including transcriptparticipation etc all of which arerelevant metadata
Conference Call itself can have embedded metadata to support personalization andinteractivity
HP 89
Metadata in Enterprise Apps
Filter Search ConsolidatePersonalize ArchiveLicensing Syndication
Production SupportProduction SupportSony
Categorize
Catalog
Integrate
CollectionCollection ProcessingProcessing
NetworkContent
AffiliateFeeds
Public Sources Rich Data
Metabase
HP 90
t
A leaking gasoline pipeline burst into flames Thursday killing more than 60 people near Nigerias commercial capital of Lagos Many of the dead were fisherman in wooden canoes engulfed in the inferno
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko facing the central business district of Lagos across a lagoon
At least 60 people died in this needless fire senior local official Karimu Alabi said
Fire crews from state-run Nigerian National Petroleum Corp (NNPC) which owns the pipeline were joined by other firemen from construction company Julius Berger in battling the blaze
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak ravaging a cluster of huts and log houses
At about the same time a second fire razed Makoko shantytown where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in thewooden huts for days Many victims of the Ebute-Oke fire were
Gore Demands That Recount Restart (940 PM)Gore Says Fla Cant Name Electors (450 PM)Bush Meets Colin Powell at Ranch (122 PM)
Market Tumbles on Earnings Warning (927 AM)Barak Outlines His Peace Plan (630 AM)
-- Breaking News for 11302000 --Customize Page Settings | Content | Layout | Color
Sixty Die In Nigeria BlastProduced by Euronews Posted Date 11302000 Event Election 2000 Location Tallahassee Florida USAPeople Al Gore George W Bush
Video
bull Value-add for production broadcast amp syndication
bull Taaleersquos semantic metadata enables powerful access to content used by Enterprisersquos customers
bull Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --Gore Demands That Recount Restart
Gore Says Fla Cant Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(133) ndash 120600 - ABC
(253) - 120600 - CBS
(516) - 120600 - ABC
(246) - 120600 - FOX
(133) - 120600 - NBC
(533) - 120600
(357) - 120600 - CBS
(427) - 120600 - ABC
(344) - 120600 - FOX
(724) - 120600 - CBS
(133) - 120600 - CBS
TALLAHASSEE Florida (CNN) ndashThough the two presidential candidates have until noon Wednesday to file briefs in Al Gores appeal to the Florida Supreme Court the outcome of two trials set on the same day in Leon County Florida may offer Gore his best hope for the presidency Democrats in Seminole County are seeking to have 15000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5000 votes statewide Lawyers for the plaintiff Harry Jacobs claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2126 incomplete absentee ballot applications sent in by GOP voters while refusing to allow Democratic workers to do the same thing for Democratic voters
The GOP says that suit and one similar to it from Martin County demonstrates Democratic Party politics at its most desperate Gore is not a party to either of those lawsuits On Tuesday the judge in the
(133) - 120600 - ABC
(233) - 120600 - CBS
(312) - 120600 - NNS
(032) - 120600 - CBS
(133) - 120600 - CBS
DescriptionProduced by CNNPosted Date 12072000 Reporter David Lewis Event Election 2000 Location Tallahassee Florida USAPeople Al Gore
HP 92
Retrieve Scene Description Track
Enhanced Digital Cable
Video
MPEGDecoder
Node = AVO Object
Create Scene Description Tree
GREATUSER
EXPERIENCE
Metadatarsquos role in emerging iTV infrastructure
MPEG-247MPEG
Encoder
SceneDescriptionTree
License metadata decoder and semantic applications to
device makers
Channel salesthrough Video Server Vendors
Video App Servers and Broadcasters
Enhanced XML
Description
ldquoCisco Systemsrdquo
Node
TaaleeSemanticEngine
ldquoCisco Systemsrdquo
Produced by Fox Sports Creation Date 12052000 League NFLTeams Seattle Seahawks
Atlanta Falcons Players John KitnaCoaches Mike Holmgren
Dan ReevesLocation Atlanta
Object Content Information (OCI)
Metadata-richValue-added Node
HP 93
Intelligent Metadata Creation
Content which doescontain the wordsthe user asked for
Extractor Agents
Content which does not contain the words
the user asked for but is about what he asked
for
Value-added Metadata
Content the user did not think to ask for but
which he needs to know
Semantic Associations
+ +
Metadata for Intelligent ContentMetadata for Intelligent Content
Usage
HP 94
Intelligent Contentvia
Value-Added Metadata
HP 95
Value-added MetadataTraditional methods rely solely on (syntactic) indexing of keywords to enable
users to access content
bull If a keyword is not in the content it cannot be found
bull The burden is on the user to think of and ask for the ldquorightrdquo keyword
For example If a story is about ldquoRoger Clemensrdquo but does not contain the
words ldquoNew York Yankeesrdquo that story cannot and will not be found if the user
searches for ldquoNew York Yankeesrdquo or ldquoYankeesrdquo
Understanding of the content is needed to create new metadata
Taalee understands Roger Clemens is a PERSON who Plays a SPORT called
Baseball for a TEAM from New York called the Yankees Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY)
to add missing metadata to describe content more completely
HP 96
Guided Demo for Value Added Metadata ndashExample one
bull Go to httpwwwmediaanywherecomFootballhtml amp search for Player = Jamal Anderson
bull Click on the first result (titled ldquoWeek 3 Top10 Anderson TD Runrdquo) and view the metadata
on the following RMR page
bull Here is what you see
Produced by NFLcom Posted Date 9202000 League NFL
Teams Atlanta Falcons Players Jamal Anderson
bull Now click on the button to play the asset (button marked ldquoREALrdquo)
bull View the source HTML page that has the original story and locate this story with the
heading ldquoWeek 3 top 10 Anderson TD runrdquo
bull Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user
searching for Atlanta Falcons will find this story on Jamal Anderson who is a player of
Atlanta Falcons team
bull Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value-Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html and search for Player = Gary Sheffield.
• Click on the first result (titled "I want out") and view the metadata on the following RMR (Rich Media Reference) page.
• Here is what you see:
    Produced by: ESPN    Posted Date: 3/03/2001    League: National League
    Teams: Los Angeles Dodgers    Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL").
• View the source HTML page that has the original story and locate this story under the heading "I want out".
• Verify that neither Team = Los Angeles Dodgers nor League = National League is present in the source content.
• Taalee attached this value-added metadata to the asset's existing metadata, so a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who plays for the Los Angeles Dodgers.
• Note that other search engines and directories will not be able to do this.
HP 98
Example 1 - Snapshots ("Jamal Anderson")
• Search for 'Jamal Anderson' in 'Football'.
• Click on the first result for Jamal Anderson.
• View the metadata. Note that Team name and League name are also included in the metadata.
• View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search.
HP 99
Example 2 - Snapshots ("Gary Sheffield")
• Search for 'Gary Sheffield' in 'Baseball'.
• Click on the first result for Gary Sheffield.
• View the metadata. Note that Team name and League name are also included in the metadata.
• View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search.
HP 100
Intelligent Content - Value-Added Metadata
Metadata attached to a Rich Media Sports Asset:
• Posted Date: date of asset posting - extracted automatically
• Producer Name: name of the content provider that produced the asset
• Player Names: names of players mentioned explicitly in the asset - extracted automatically
• Sport: name of sport
• Team Name: name of the team for which the player plays - not mentioned explicitly in the asset - value-added using Taalee's semantic relationships
• League Name: name of the league to which the player's team belongs - not mentioned explicitly in the asset - value-added by Taalee's processing based on semantic associations
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.
The asset is richly and fully described in the many ways users choose to interact with it.
Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset.
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content.
• They do not understand the meaning, context, or relationships of keywords.
For example, a search engine may see that the phrase "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
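A small Python sketch may help make the idea of a semantic association concrete. It assumes a triple store and a toy catalog; the `TRIPLES` and `CATALOG` structures, the relation names, the story titles, and both functions are hypothetical and stand in for whatever the Semantic Content Model actually uses. Only the Commerce One facts come from the slide.

```python
# Hypothetical (subject, relation, object) triples; facts taken from the slide.
TRIPLES = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Commerce One", "competes_with", "Ariba"),
]

# Toy catalog: each asset lists the entities found in its metadata.
# Titles are placeholders, not real stories.
CATALOG = [
    {"title": "Story A", "entities": ["Commerce One"]},
    {"title": "Story B", "entities": ["Ariba"]},
    {"title": "Story C", "entities": ["Some Other Company"]},
]


def related(entity, relation):
    """Objects linked to `entity` by `relation`."""
    return [o for s, r, o in TRIPLES if s == entity and r == relation]


def search(entity):
    """Return (what the user asked for, associated content on competitors)."""
    direct = [a["title"] for a in CATALOG if entity in a["entities"]]
    competitors = related(entity, "competes_with")
    associated = [
        a["title"]
        for a in CATALOG
        if any(c in a["entities"] for c in competitors)
    ]
    return direct, associated


direct, associated = search("Commerce One")
print(direct)      # ['Story A']
print(associated)  # ['Story B']
```

A keyword index alone would return only the first list; the second list exists only because the relationship between the two companies is stored explicitly.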
HP 103
Example (test on http://directory.mediaanywhere.com)
• Search for company 'Commerce One'.
• Links to news on companies that compete against Commerce One.
• Links to news on companies Commerce One competes against. (To view news on Ariba, click on the link for Ariba.)
• Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically.
HP 104
Metadata-centric Content Management Architecture
Sources: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters)
Components: Extractor Agents 1-3; Semantic Engine; World Model; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication
Output: XCM-compliant metadata (XML or other format)
Example flow (Story on Cisco, Story on Lucent):
1. Cisco story from PW Source 1 is passed on to add semantic associations.
2. The KnowledgeBase is consulted for Cisco's competition.
3. Result returned: Lucent is a competitor of Cisco.
4. A Lucent story from external feeds is picked for publishing as "semantically related" to the Cisco story and passed on to the Dashboard.
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked For + What You Need to Know
Associations around a COMPANY:
• Related Stock News
• Industry News
• Competition: COMPANIES in INDUSTRY with competing PRODUCTS; COMPANIES in the same or related INDUSTRY
• Regulations (SEC, EPA): impacting INDUSTRY or filed by COMPANY
• Technology Products: important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
• Automatic collation of semantically related digital media information from multiple sources
• Research inferred automatically
• Semantically related news not specifically asked for
• Semantic search, personalization, etc.
A vision for the future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
• Structural modeling is obviously not enough
    - we need a "logic layer" on top of RDF
    - some type of description logic is a possibility
• Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data
• RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]
A personal definition - Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods.
What is DAML (DARPA Agent Markup Language)?
• A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
• Based on RDF + XML
• Agent-readable tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-Layered Architecture of the Semantic Web
• Logical Layer: formal semantics and reasoning support - OIL, DAML-O
• Schema Layer: definition of vocabulary - RDF Schema
• Data Layer: simple data model and syntax for metadata - RDF
OIL - as RDF Extension
    <rdfs:Class rdf:ID="herbivore">
      <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
      <rdfs:subClassOf rdf:resource="#animal"/>
      <rdfs:subClassOf>
        <oil:NOT>
          <oil:hasOperand rdf:resource="#carnivore"/>
        </oil:NOT>
      </rdfs:subClassOf>
    </rdfs:Class>
DAML and OIL - Evolving towards Semantic Web
OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.
Knowledge Discovery - Example
• Earthquake Sources (USGS, NEIC)
• Nuclear Test Sources (Oklahoma Observatory, etc.)
"Nuclear tests may cause earthquakes." Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.
    NuclearTest Causes Earthquake
      <= dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
         AND distance( NuclearTest.latitude, NuclearTest.longitude,
                       Earthquake.latitude, Earthquake.longitude ) < 10000
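The rule above can be written out as a small Python predicate. This is a sketch under stated assumptions rather than the InfoQuilt implementation: the `Event` class and function names are hypothetical, the units are taken to be days and miles, distance is computed with the haversine great-circle formula, and the check that the earthquake does not precede the test is added here because the prose says "some time after".

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; miles are an assumed unit


@dataclass(frozen=True)
class Event:
    """A nuclear test or an earthquake (hypothetical record layout)."""
    event_date: date
    latitude: float
    longitude: float


def date_difference(test, quake):
    """Days from the test to the earthquake (negative if the quake came first)."""
    return (quake.event_date - test.event_date).days


def distance(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in miles between two points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(min(1.0, sqrt(a)))


def could_have_caused(test, quake, max_days=30, max_miles=10000):
    """NuclearTest Causes Earthquake, per the rule on the slide."""
    days = date_difference(test, quake)
    miles = distance(test.latitude, test.longitude, quake.latitude, quake.longitude)
    # `0 <= days` is an assumption: the quake must not precede the test.
    return 0 <= days < max_days and miles < max_miles
```

The thresholds are parameters so that the same predicate can be reused to explore other definitions of "some time after" and "nearby".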
Knowledge Discovery - Example
• When was the first recorded nuclear test conducted? 1950
• Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900.
Increase in number of earthquakes since 1945
Knowledge Discovery - Example…
• For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and > 9 on the Richter scale per year, starting from 1900, find the average number of earthquakes.
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7.
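The grouping query can be sketched in Python as follows. The function names and the input layout (a list of `(year, magnitude)` pairs) are hypothetical, and because the slide's ranges share their endpoints, the sketch assumes half-open intervals, so a magnitude of exactly 7 falls in the 7-8 group and exactly 9 in the top group.

```python
from collections import Counter

# Magnitude ranges from the slide, treated as half-open [low, high) intervals.
RANGES = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, float("inf"))]


def magnitude_range(magnitude):
    """Return the (low, high) range containing `magnitude`, or None if below 5.8."""
    for low, high in RANGES:
        if low <= magnitude < high:
            return (low, high)
    return None


def average_per_year(quakes, first_year, last_year):
    """Average number of earthquakes per year in each magnitude range.

    `quakes` is an iterable of (year, magnitude) pairs.
    """
    counts = Counter()
    for year, magnitude in quakes:
        bucket = magnitude_range(magnitude)
        if bucket is not None and first_year <= year <= last_year:
            counts[bucket] += 1
    n_years = last_year - first_year + 1
    return {bucket: counts[bucket] / n_years for bucket in RANGES}
```

Running the same function over the years before and after a chosen cutoff gives the per-range comparison that the slide's conclusion rests on.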
Knowledge Discovery - Example…
• Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and the test took place within a radius of 10000 miles of the epicenter of the earthquake.
Demo
Resources/References
• RDF: www.w3.org/TR/REC-rdf-syntax
• ICE: www.icestandard.org
• Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
• XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
• DAML: www.daml.org
• NEWSML: newsshowcase.reuters.com
• PRISM: www.prismstandard.org/techdev/prismspec1.asp
• XCM: www.vignette.com
• OIL: www.ontoknowledge.org/oil
• SEMANTICWEB: www.semanticweb.org
• VOICEXML: www.voicexml.org
• MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
• Taalee: www.taalee.com
• Oingo: www.oingo.com
• Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas (Eds.), McGraw-Hill, ISBN 0-07-057735-8, 1998
HP 64
Oingo.com
• Oingo Ontology - ODP based (?) - the database of millions of concepts and relationships that powers Oingo's semantic technology
• Oingo Seek - the database of millions of concepts and relationships that powers Oingo's semantic technology
• Oingo Sense - the knowledge extraction tool that uncovers the essential meaning of information by sensing concepts and context
• Oingo Lingua - the language of meaning, used to state intent. The basis for intelligent interaction
• Assets catalogued are Web sites or Web pages
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
• Precisely what the user asked for
• Closely related, high-value information beyond what was requested
• Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort.
Intelligent content can be more effectively managed, packaged, and distributed.
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing:
• Context of the Content
• Semantic Metadata describing entities (i.e., Company, Industry, etc.), and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase.
"Normal" Content can only be "found" if the user enters a keyword that exists within it.
Intelligent Content=+
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions.
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:
• Context of the Content
• Semantic Metadata describing entities (i.e., Company, Industry, etc.), and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase.
HP 69
Metadata & Search
• Metadata can improve search significantly, but metadata enables much more than search.
• Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …
HP 70
Metadata Usage: Keyword, Attribute, and Content-Based Access
The VisualHarness system at LSDIS/UGA
HP 71
Keyword Search vs. Attribute Search with Semantic Metadata
Metadata from Typical Cataloging of Football Assets (Virage search on "football touchdown"):
• Jimmy Smith Interview, Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline
• Brian Griese Interview, Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline
Taalee Metadata on Football Assets (Rich Media Reference Page):
Baltimore 31, Pit 24 (http://www.nfl.com)
Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Raven's lead to 31-24 in the third quarter.
    League:       Professional
    Teams:        Ravens, Steelers
    Score:        Bal 31, Pit 24
    Players:      Quandry Ismail, Tony Banks
    Event:        Touchdown
    Produced by:  NFL.com
    Posted date:  2/02/2000
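The difference between the two kinds of search can be illustrated with a short Python sketch. The record layout and both function names are hypothetical; the description and metadata values are the ones shown for the Baltimore-Pittsburgh asset.

```python
# One asset, with the free-text description and the structured metadata
# shown on the slide (record layout is an assumption for illustration).
ASSET = {
    "description": (
        "Quandry Ismail and Tony Banks hook up for their third long touchdown, "
        "this time on a 76-yarder, to extend the Raven's lead to 31-24 "
        "in the third quarter"
    ),
    "metadata": {
        "League": ["Professional"],
        "Teams": ["Ravens", "Steelers"],
        "Players": ["Quandry Ismail", "Tony Banks"],
        "Event": ["Touchdown"],
    },
}


def keyword_search(assets, *words):
    """Assets whose description contains every word (case-insensitive)."""
    return [
        a for a in assets
        if all(w.lower() in a["description"].lower() for w in words)
    ]


def attribute_search(assets, **criteria):
    """Assets whose metadata holds the requested value in each named field."""
    return [
        a for a in assets
        if all(value in a["metadata"].get(field, [])
               for field, value in criteria.items())
    ]


print(len(keyword_search([ASSET], "Steelers")))                      # 0
print(len(attribute_search([ASSET], Teams="Steelers", Event="Touchdown")))  # 1
```

The description never mentions the Steelers, so the keyword search misses the asset, while the attribute search finds it through the Teams field and can also constrain the Event field at the same time.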
HP 72
Taalee's Semantic Search
• Highly customizable, precise, and freshest A/V search
• Context- and domain-specific attributes
• Uniform metadata for content from multiple sources; can be sorted by any field
• Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do?
Taalee Directory - Georgia Bulldogs
System recognizes ENTITY & CATEGORY
Taalee Directory - Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic applications for highly relevant and fresh content: personalization and targeting/interactive marketing
Please contact Taalee for live demonstrations.
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you.
Please enter such semantic keywords below. [Change Context]
PERSONALIZATION: Personalized Queries & Hot Topics
Personalized Queries:
1. My Stock Portfolio: Microsoft suffers serious hack attack; Cisco Systems Inc.; Analyst Safa Rashtchy on Yahoo; PeopleSoft Inc.; AT&T Corp.
2. My Football Fantasy Team: Gators' Spurrier ready for big game; Tech's Vick looks to become complete QB; Bucs excited about Hamilton; Jasper Sanks rumbles into the end zone…; Edwards explains reasons for leaving BYU
3. Julia Roberts Collection: Movie Trailer: Notting Hill; Trailer - Runaway Bride; Patrick; Movie Trailer: Stepmom; Conspiracy Theory
4. Pink Floyd Collection: Set the Controls for the Heart of the Sun…; Wish You Were Here; Round And Around; Keep Talking; The Post War Dream
HOT Topics:
1. Election 2000: Video: Explaining the electoral map; Race for White House hots up; Seniors Give Gore Florida Edge
2. Middle East Peace Conflict: More die as Israel steps up security; Israel braces for suicide bombs; Pentagon probes Cole's security
3. Napster Controversy: The Brain Behind Napster; Napster Lawsuit; Creative Nomad II
HP 80
Metadata Targeting
Semantic/Interactive Targeting
• Buy Al Pacino Videos
• Buy Russell Crowe Videos
• Buy Christopher Plummer Videos
• Buy Diane Venora Videos
• Buy Philip Baker Hall Videos
• Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web: Extreme Personalization
Diagram labels: Realtime Feeds; Web Sites and Pages; Content Databases; Time-Shifted Content Aggregator; Semantic Engine(TM); Structured Hi-Quality Semantic Metabase; Interests, Preferences; Personalized Content
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
[Wireless screen: MyStocks, News, Sports, Music, MyMedia]
The user has already completed Web-based registration and personalization at Voquette's Enterprise Customer site.
The user's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories.
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
[Wireless screen: MyStocks, News, Sports, Music, MyMedia; My Stocks: CSCO, NT, IBM, Market]
Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide).
Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu.
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
[Wireless screen: MyStocks, News, Sports, Music, MyMedia; My Stocks: CSCO, NT, IBM, Market; CSCO: Analyst Call, Conf Call, Earnings]
Different types of recent audio content about Cisco are available.
The user clicks to see a listing of Analyst Calls on Cisco (next slide).
Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
[Wireless screen: MyStocks, News, Sports, Music, MyMedia; My Stocks: CSCO, NT, IBM, Market; CSCO: Analyst Call, Conf Call, Earnings]
CSCO Analysis: 11/08 ON24 Payne; 11/07 ON24 H&Q; 11/06 CBS Langlesis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst.
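One way to picture this kind of semantic filtering is a per-content-type choice of which metadata fields to render on a small screen. The sketch below is an assumption-laden illustration, not the product's logic: `DISPLAY_FIELDS`, the record layout, and the `listing` function are hypothetical, and the dates are assumed to fall in the same year since the slide shows only month and day.

```python
# Hypothetical: metadata fields worth showing per content type on a small screen.
DISPLAY_FIELDS = {"Analyst Call": ("date", "source", "analyst")}

# Entries from the slide; "rating" is extra metadata that is not displayed inline.
CALLS = [
    {"date": "11/06", "source": "CBS", "analyst": "Langlesis"},
    {"date": "11/08", "source": "ON24", "analyst": "Payne"},
    {"date": "11/07", "source": "ON24", "analyst": "H&Q", "rating": "Strong Buy"},
]


def month_day(item):
    """Sort key: (month, day) parsed from an MM/DD string (same year assumed)."""
    month, day = item["date"].split("/")
    return int(month), int(day)


def listing(items, content_type, max_chars=20):
    """Render items newest first, showing only the fields chosen for the type."""
    fields = DISPLAY_FIELDS[content_type]
    newest_first = sorted(items, key=month_day, reverse=True)
    return [" ".join(item[f] for f in fields)[:max_chars] for item in newest_first]


for line in listing(CALLS, "Analyst Call"):
    print(line)
```

Keeping the field selection in a table keyed by content type means a Conf Call or Earnings listing could use a different set of fields without changing the rendering code.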
HP 87
iTV: Taalee's Extreme Personalization
Diagram labels: Content Provider (DBS, DISH, Wink, AOL-TV); Content "Programs"; Semantic Engine(TM); Metadata-Tagged Content; Structured Hi-Quality Semantic Metabase; Immediate Interests; Preferences; Personalized Content Capsules; Redirects and Programming
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
• This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.
• This screen is customizable with an interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO.
• Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata.
• The Conference Call itself can have embedded metadata to support personalization and interactivity.
HP 89
Metadata in Enterprise Apps
Inputs: Network Content; Affiliate Feeds; Public Sources
Stages: Collection; Processing (Categorize, Catalog, Integrate); Rich Data Metabase
Uses: Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support (Sony)
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp. (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were
-- Breaking News for 11/30/2000 --
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
Customize Page: Settings | Content | Layout | Color
Video: Sixty Die In Nigeria Blast
Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W. Bush
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction, a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore
HP 92
Metadata's role in emerging iTV infrastructure
Diagram labels: Video; MPEG Encoder (MPEG-2/4/7); Create Scene Description Tree; Scene Description Tree (Node = AVO Object); Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; Taalee Semantic Engine; Enhanced XML Description; Object Content Information (OCI); Metadata-rich Value-added Node; GREAT USER EXPERIENCE
Example node: "Cisco Systems"
Example metadata: Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: John Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta
• License metadata decoder and semantic applications to device makers
• Channel sales through Video Server Vendors, Video App Servers, and Broadcasters
HP 93
Metadata for Intelligent Content
Intelligent Metadata Creation and Usage:
• Extractor Agents: content which does contain the words the user asked for
• + Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for
• + Semantic Associations: content the user did not think to ask for, but which he needs to know
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.
• If a keyword is not in the content, it cannot be found.
• The burden is on the user to think of and ask for the "right" keyword.
For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".
Understanding of the content is needed to create new metadata.
Taalee understands that Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses such Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely.
HP 96
Guided Demo for Value Added Metadata ndashExample one
bull Go to httpwwwmediaanywherecomFootballhtml amp search for Player = Jamal Anderson
bull Click on the first result (titled ldquoWeek 3 Top10 Anderson TD Runrdquo) and view the metadata
on the following RMR page
bull Here is what you see
Produced by NFLcom Posted Date 9202000 League NFL
Teams Atlanta Falcons Players Jamal Anderson
bull Now click on the button to play the asset (button marked ldquoREALrdquo)
bull View the source HTML page that has the original story and locate this story with the
heading ldquoWeek 3 top 10 Anderson TD runrdquo
bull Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user
searching for Atlanta Falcons will find this story on Jamal Anderson who is a player of
Atlanta Falcons team
bull Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata ndashExample Two
bull Go to httpwwwmediaanywherecomBaseballhtml amp search for Player = Gary Sheffield
bull Click on the first result (titled ldquoI want outrdquo) amp view the metadata on the following RMR page
bull Here is what you see
Produced by ESPN Posted Date 3032001 League National League
Teams Los Angeles Dodgers Players Gary Sheffield
bull Now click on the button to play the asset (button marked ldquoREALrdquo)
bull View the source HTML page that has the original story and locate this story with the
heading ldquoI want outrdquo
bull Verify that Team=Los Angeles Dodgers or League=National League was not present in
the source content
bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user
searching for Los Angeles Dodgers will find this story on Gary Sheffield who is a player of
Los Angeles Dodgers team
N t th t th h i d di t i ill t b bl t d thi
HP 98
Example 1 ndash Snapshots (ldquoJamal Andersonrdquo)
Search for lsquoJamal Andersonrsquo in lsquoFootballrsquo
Click on first result for Jamal Anderson
View metadata Note that Team name and League name are also included
in the metadata
View the original source HTML page Verify that
the source page contains no mention of Team nameand League name They
were Taaleersquos value-additions to the metadata to facilitate easier search
HP 99
Example 2 ndash Snapshots (ldquoGary Sheffieldrdquo)
Click on first result for Gary Sheffield
View metadata Note that Team name and League name are also included
in the metadata
Search for lsquoGary Sheffieldrsquo in lsquoBaseballrsquo
View the original source HTML page Verify that
the source page contains no mention of Team nameand League name They
were Taaleersquos value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content ndash Value-Added Metadata
Posted Date
Posted Date
Date of asset posting ndashExtracted automatically
League Name
Name of league to which the payerrsquos team belongs ndash Not mentioned explicitly in asset ndash Value-added by Taaleersquos processing based on semantic associations
Name of team for which player plays ndash Not mentioned explicitly in assetndash Value-added using Taaleersquos semantic relationships
Team NameTeam Name
Producer Name
Producer Name
Rich MediaSports AssetRich Media
Sports Asset
Name of content provider that produced the asset
Some Metadata are obtained explicitly from the asset Others (not present in the asset) are added
by Taalee using its semantic relationships
The asset is richly fully described in the many ways the users chose to interact
Player NamesPlayer Names
SportSportName of
sport
LegendX Y meansTaalee uses X to add Yas value-added metadatato the asset
Name of players mentioned explicitly in the asset ndash Extracted automatically
HP 101
Intelligent Contentvia
Semantic Associations
HP 102
Semantic Associations
bull Traditional search engines rely solely on (syntactic) keywords to find content
bull They do not understand the meaning context or relationships of keywords
For example a search engine may see that the word ldquoCommerce Onerdquo occurs
but it does not know that Commerce One is a COMPANY which Participates in
the Corporate Professional amp Financial Software INDUSTRY and COMPETESWITH Ariba
As a result search engines cannot go beyond returning a list (or directory view)
of what the user has asked for Their ability to provide associated information is
extremely limited static and difficult to scale Taaleersquos Semantic Content Model
goes beyond indexing keywords and classifying assets toUnderstand and Associate all content it catalogs
HP 103
Example (test on httpdirectorymediaanywherecom)
Search for company lsquoCommerce Onersquo
Links to news on companies that compete against
Commerce One
Links to news on companies Commerce One competes
against(To view news on Ariba click
on the link for Ariba)
Crucial news on Commerce Onersquos
competitors (Ariba) can be accessed easily and
automatically
HP 104
Internal Source 1Research
Internal Source 2
External feedsWeb(eg Reuters)
1
2
3
4
Cisco story from PW Source 1passed on to addsemanticassociations
ConsultsKnowledgeBasefor Ciscorsquoscompetition
Returns resultLucent is a competitor of Cisco
Lucent story from external
feeds picked for publishing as ldquosemantically
relatedrdquo to Ciscostory ndash passed
on to Dashboard
Story onLucent
Story onCisco
XCM-compliant metadata XML or other format
SemanticApplication
ASPEnterprise hosted
Extractor Agent 1
Extractor Agent 2
Extractor Agent 3
Metadata centricContent Management Architecture
SemanticEngine
World Model
TaaleeMetabase
Third-partyContent Mgmt
AndSyndication
HP 105
Semantic Associationssupported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANYCOMPANYRelated Stock News
Related Stock News
IndustryNews
IndustryNews
CompetitionCompetitionCOMPANIES inINDUSTRY with Competing PRODUCTSCOMPANIES in Same or
Related INDUSTRY
SECEPAEPA
RegulationsRegulationsImpacting INDUSTRY or Filed By COMPANY
Technology Products
Technology ProductsImportant to INDUSTRY or COMPANY
HP 106
Semantic Web Application ExampleFinancial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic SearchPersonalization etc
A vision for future
Semantic Web Complex Relationships and Knowledge Discovery
Eg InfoQuilt project at LSDIS Lab Univ of Georgia
HP 108
Beyond RDF ndash one proposal (cf Ora Lassila)
Structural modeling obviously not enoughwe need a ldquologic layerrdquo on top of RDFsome type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful particularly if we have logicrules which allow us to draw inference from this data
RDF + DL = ldquoFrame System for WWWrdquo
Source wwwontoknowledgeorgoil
HP 109
Semantic Web - next step in Web evolution
ldquoA Web in which machine reasoning will be ubiquitous and devastatingly powerfulrdquo [Berners-Lee]
ldquoA place where the whim of a human being and the reasoning of a machine coexist in an ideal powerful mixturerdquo [Berners-Lee]
ldquoA semantic Web would permit more accurate and efficient Web searches which are among the most important Web-based activitiesrdquo [Berners-Lee]
A personal definitionSemantic Web The concept that Web-accessible
content can be organized semantically rather than though syntactic and structural methods
What is DAML (DARPA Agent Markup Language)
a proposal to create technologies that will enable software agents to dynamically identify and understand information sources and to provide interoperability between agents in a semantic mannerBased on RDF+XMLAgent readable Tags
wwwdamlorg
DAML Example
Sou
rce
http
w
ww
zdn
etc
omp
cwee
kst
orie
sju
mps
04
270
2432
946
00h
tml
Three layered Architecture Of Semantic Web
Logical Layer
Formal Semantics and Reasoning Support ndash OIL DAML-O
Schema Layer
Definition of Vocabulary RDF Schema
Data Layer
Simple data model and syntax for metadata - RDF
OIL ndash as RDF Extension
ltrdfsClass rdfID=rdquoherbivorerdquogtltrdftype
rdfresource=rdquohttpwwwontoknowledgeorgDefinedClassrdquogtltrdfssubClassOf rdfresource=rdquoanimalrdquogtltrdfssubClassOfgt
ltoilNOTgtltoilhasOperand rdfresource=rdquocarnivorerdquogt
ltoilNOTgtltrdfssubClassOfgt
ltrdfsClassgt
DAML and OIL ndash Evolving towards Semantic Web
OIL MissionOIL is a Web-based representation and inference layer for ontologies which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Test May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake
  <= dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
     AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
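The rule above can be sketched as an ordinary predicate over two event records. The sketch assumes that dateDifference is measured in days, that the earthquake must not precede the test, and that distance is a great-circle distance in miles (the slides mention miles but do not say how distance is computed). The `Event` type, the haversine formula, and the sample events are assumptions made for illustration, not part of InfoQuilt.

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an assumption of this sketch


@dataclass(frozen=True)
class Event:
    name: str
    event_date: date
    latitude: float
    longitude: float


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two points, in miles."""
    p1, p2 = radians(lat1), radians(lat2)
    dphi = p2 - p1
    dlmb = radians(lon2 - lon1)
    h = sin(dphi / 2) ** 2 + cos(p1) * cos(p2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(h))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """NuclearTest Causes Earthquake: close enough in time and in space."""
    days = (quake.event_date - test.event_date).days
    if not 0 <= days < max_days:
        return False
    return distance_miles(test.latitude, test.longitude,
                          quake.latitude, quake.longitude) < max_miles


def candidate_pairs(tests, quakes):
    """All (test, earthquake) name pairs that satisfy the rule."""
    return [(t.name, q.name) for t in tests for q in quakes
            if may_have_caused(t, q)]


# Made-up sample events, for illustration only.
tests = [Event("test-A", date(1970, 1, 1), 37.0, -116.0)]
quakes = [
    Event("quake-1", date(1970, 1, 10), 36.0, -117.5),   # 9 days later, nearby
    Event("quake-2", date(1970, 3, 1), 36.0, -117.5),    # too late
    Event("quake-3", date(1969, 12, 25), 36.0, -117.5),  # before the test
]
pairs = candidate_pairs(tests, quakes)
```

Keeping the thresholds as parameters makes it easy to tighten the rule; a 10,000-mile radius covers most of the globe, so the date condition does most of the filtering here.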
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year, starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example…
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale per year, starting from 1900, find the average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
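The grouping query above could be sketched as follows. The half-open bin boundaries (for example, a magnitude of exactly 6.0 falls in "6-7") and the sample records are assumptions of this sketch; the slides do not say how boundaries are handled or what the underlying data look like.

```python
from collections import Counter

# (low, high, label): low is inclusive, high is exclusive. Boundary handling is assumed.
BINS = [
    (5.8, 6.0, "5.8-6"),
    (6.0, 7.0, "6-7"),
    (7.0, 8.0, "7-8"),
    (8.0, 9.0, "8-9"),
    (9.0, float("inf"), ">9"),
]


def magnitude_range(magnitude):
    """Return the label of the range containing magnitude, or None if below 5.8."""
    for low, high, label in BINS:
        if low <= magnitude < high:
            return label
    return None


def average_per_year(quakes, first_year, last_year):
    """quakes: iterable of (year, magnitude). Returns the mean yearly count per range."""
    counts = Counter()
    for year, magnitude in quakes:
        label = magnitude_range(magnitude)
        if label is not None and first_year <= year <= last_year:
            counts[label] += 1
    years = last_year - first_year + 1
    return {label: counts[label] / years for _, _, label in BINS}


# Made-up records, for illustration only.
sample = [(1900, 5.9), (1900, 6.5), (1901, 6.1), (1901, 7.2), (1901, 5.0)]
averages = average_per_year(sample, 1900, 1901)
```

Running the same function over the years before and after a chosen date would give the two sets of averages that the slide compares.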
Knowledge Discovery - Example…
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and within a radius of 10000 miles from the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 65
Use of Categories for Search
After 3 or 4 clicks
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely-related, high-value information beyond what was requested
Ability to explore any dimension around the immediate point of interest
Intelligent content helps the user "think" about and fulfill their information needs with less effort
Intelligent content can be more effectively managed, packaged and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more "intelligent" through automatic analysis of every individual asset to generate a catalog containing
• Context of the Content
• Semantic Metadata describing entities (i.e., Company, Industry, etc.) and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
"Normal" Content can only be "found" if the user enters a keyword that exists within it
Intelligent Content=+
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create
Context of the Content
Semantic Metadata describing entities (i.e., Company, Industry, etc.) and
Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …
HP 70
Metadata Usage: Keyword, Attribute and Content Based Access
The VisualHarness system at LSDIS/UGA
HP 71
Keyword Search vs Attribute Search with Semantic metadata
Virage Search on football touchdown
Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline
Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline
Metadata from Typical Cataloging of Football Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31, Pit 24
http://www.nfl.com
Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
HP 72
Taalee's Semantic Search
Highly customizable, precise and freshest A/V search
Context and Domain Specific Attributes. Uniform Metadata for Content from Multiple Sources. Can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
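The difference between keyword search and attribute search can be sketched with a few asset records carrying the fields shown on the football slide (League, Teams, Players, Event, Produced by). The function `attribute_search`, the record layout, and the "Interview" event values are hypothetical; this is only one plausible way to query uniform metadata and is not Taalee's implementation.

```python
# Asset records with uniform metadata fields. Values for the two interview
# assets are assumptions made for illustration.
ASSETS = [
    {"title": "Baltimore 31, Pit 24", "league": "Professional",
     "teams": ["Ravens", "Steelers"],
     "players": ["Quandry Ismail", "Tony Banks"],
     "event": "Touchdown", "produced_by": "NFL.com"},
    {"title": "Jimmy Smith Interview Part Seven",
     "players": ["Jimmy Smith"], "event": "Interview"},
    {"title": "Brian Griese Interview Part Four",
     "players": ["Brian Griese"], "event": "Interview"},
]


def attribute_search(assets, sort_by="title", **criteria):
    """Return assets whose metadata match every field=value criterion.

    A list-valued field matches if it contains the wanted value.
    Results are sorted by the chosen field.
    """
    def matches(asset):
        for field, wanted in criteria.items():
            value = asset.get(field)
            values = value if isinstance(value, list) else [value]
            if wanted not in values:
                return False
        return True

    found = (a for a in assets if matches(a))
    return sorted(found, key=lambda a: str(a.get(sort_by, "")))


touchdowns = attribute_search(ASSETS, event="Touchdown")
interviews = attribute_search(ASSETS, event="Interview", sort_by="title")
```

A keyword search for "touchdown" would also return the Griese interview, because the word appears in its description; the attribute query returns only assets whose Event actually is a touchdown.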
HP 73
Creating a Web of related information
What can a context do
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY & CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content
Personalization and Targeting / interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries amp Hot Topics
PERSONALIZATION
3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
4 Pink Floyd Collection
Personalized Queries
Set the Controls for the Heart of the Sun…
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
2 My Football Fantasy Team
Gators Spurrier ready for big game
Techs Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone…
Edwards explains reasons for leaving BYU more…
more…
more…
more…
1 Election 2000
2 Middle East Peace Conflict
3 Napster Controversy
Video Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Coles security
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II more…
HOT Topics
more…
more…
HP 80
Metadata Targeting
Semantic/Interactive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
Realtime Feeds
Interests Preferences
Time-ShiftedContent Aggregator
Web sites and Pages
ContentDatabases Personalized
Content
Semantic EngineTM
Personalized Content
Content
Structured Hi-Quality
Semantic Metabase
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site
User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata such as "Strong Buy" by H&Q Analyst
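One way to picture semantic filtering under a screen constraint is a per-content-type priority list of metadata fields, from which as many fields are shown as fit the available width. The priority table, the field names and the character budget below are hypothetical; the slides do not describe how the filtering is actually done.

```python
# Hypothetical priority order of metadata fields for each content type.
FIELD_PRIORITY = {
    "analyst_call": ["date", "source", "analyst", "rating"],
    "earnings": ["date", "company", "quarter"],
}


def filter_for_screen(item, content_type, max_chars):
    """Keep the highest-priority fields that fit within max_chars characters."""
    shown = []
    used = 0
    for field in FIELD_PRIORITY.get(content_type, []):
        value = item.get(field)
        if value is None:
            continue
        cost = len(value) + (1 if shown else 0)  # one space between fields
        if used + cost > max_chars:
            break
        shown.append(value)
        used += cost
    return " ".join(shown)


# Sample item modeled on the listing in the slide; "rating" and "ticker" are assumed fields.
call = {"date": "11/07", "source": "ON24", "analyst": "H&Q",
        "rating": "Strong Buy", "ticker": "CSCO"}
narrow = filter_for_screen(call, "analyst_call", 14)
wide = filter_for_screen(call, "analyst_call", 40)
```

On a narrow wireless screen only the date, source and analyst fit, while a wider display also shows the rating.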
HP 87
iTV: Taalee's Extreme Personalization
Content Provider
(DBS DISH Wink AOL-TV)
Semantic EngineTM
Meta-DataTagged Content
ContentldquoProgramsrdquo
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-QualitySemantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in
This screen is customizable with an interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO
Part of the screen can be automatically customized to show conference call specific information - including transcript, participation, etc. - all of which are relevant metadata
The Conference Call itself can have embedded metadata to support personalization and interactivity
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support
Sony
Categorize
Catalog
Integrate
Collection
Processing
NetworkContent
AffiliateFeeds
Public Sources Rich Data
Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon
"At least 60 people died in this needless fire," senior local official Karimu Alabi said
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters while refusing to allow Democratic workers to do the same thing for Democratic voters
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore
HP 92
Retrieve Scene Description Track
Enhanced Digital Cable
Video
MPEGDecoder
Node = AVO Object
Create Scene Description Tree
GREATUSER
EXPERIENCE
Metadata's role in emerging iTV infrastructure
MPEG-2/4/7
MPEG Encoder
Scene Description Tree
License metadata decoder and semantic applications to
device makers
Channel salesthrough Video Server Vendors
Video App Servers and Broadcasters
Enhanced XML
Description
ldquoCisco Systemsrdquo
Node
TaaleeSemanticEngine
ldquoCisco Systemsrdquo
Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL
Teams: Seattle Seahawks, Atlanta Falcons; Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta
Object Content Information (OCI)
Metadata-richValue-added Node
HP 93
Intelligent Metadata Creation
Extractor Agents: content which does contain the words the user asked for
+ Value-added Metadata: content which does not contain the words the user asked for but is about what he asked for
+ Semantic Associations: content the user did not think to ask for but which he needs to know
Metadata for Intelligent Content
Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"
Understanding of the content is needed to create new metadata
Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
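The Roger Clemens example can be sketched as a lookup in a small knowledge base: entities extracted from the story are expanded with related entities, and the combined set becomes the searchable metadata. The knowledge-base layout, the relationship names and the `enrich` function are hypothetical stand-ins for illustration, not the actual Taalee WorldModel or Semantic Engine.

```python
# Hypothetical knowledge base: entity -> list of (relationship, target entity).
KNOWLEDGE_BASE = {
    "Roger Clemens": [
        ("is_a", "PERSON"),
        ("plays_sport", "Baseball"),
        ("plays_for", "New York Yankees"),
    ],
    "New York Yankees": [
        ("is_a", "TEAM"),
        ("also_known_as", "Yankees"),
    ],
}

# Relationships whose targets are worth adding as metadata (an assumption).
FOLLOW = {"plays_sport", "plays_for", "also_known_as"}


def enrich(extracted_entities):
    """Return the extracted entities plus value-added metadata from the knowledge base."""
    metadata = set(extracted_entities)
    frontier = list(extracted_entities)
    while frontier:
        entity = frontier.pop()
        for relation, target in KNOWLEDGE_BASE.get(entity, []):
            if relation in FOLLOW and target not in metadata:
                metadata.add(target)
                frontier.append(target)
    return metadata


# The story text mentions only the player.
story_metadata = enrich(["Roger Clemens"])
found_by_team_search = "Yankees" in story_metadata
```

With the added metadata, a search for "Yankees" or "New York Yankees" reaches the story even though neither phrase occurs in its text.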
HP 96
Guided Demo for Value Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL
Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League
Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 - Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Posted Date: Date of asset posting - Extracted automatically
League Name: Name of league to which the player's team belongs - Not mentioned explicitly in asset - Value-added by Taalee's processing based on semantic associations
Team Name: Name of team for which player plays - Not mentioned explicitly in asset - Value-added using Taalee's semantic relationships
Producer Name: Name of content provider that produced the asset
Player Names: Names of players mentioned explicitly in the asset - Extracted automatically
Sport: Name of sport
Some Metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships
The asset is richly and fully described in the many ways users choose to interact
Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context or relationships of keywords
For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
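A minimal sketch of how such associations could drive retrieval: relationships are stored as (subject, relation, object) triples, and a search for a company returns both the stories asked for and stories about its competitors. The triple layout, the relation names, the symmetric treatment of "competes_with", and the sample headlines are all assumptions made for illustration.

```python
# Semantic associations as (subject, relation, object) triples, taken from the
# Commerce One example; the relation names are hypothetical.
ASSOCIATIONS = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Commerce One", "participates_in", "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]

# Made-up catalog entries, for illustration only.
STORIES = [
    {"headline": "Commerce One story", "about": "Commerce One"},
    {"headline": "Ariba story", "about": "Ariba"},
    {"headline": "Unrelated story", "about": "Cisco"},
]


def competitors(company):
    """Companies linked to company by competes_with, in either direction."""
    found = set()
    for subject, relation, obj in ASSOCIATIONS:
        if relation != "competes_with":
            continue
        if subject == company:
            found.add(obj)
        elif obj == company:
            found.add(subject)
    return found


def search_with_associations(company):
    """Return what was asked for plus semantically related stories."""
    rivals = competitors(company)
    return {
        "asked_for": [s["headline"] for s in STORIES if s["about"] == company],
        "related": [s["headline"] for s in STORIES if s["about"] in rivals],
    }


result = search_with_associations("Commerce One")
```

The "related" list is what a purely keyword-based engine cannot produce, since the Ariba story never has to mention Commerce One.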
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (To view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Internal Source 1 Research
Internal Source 2
External feeds/Web (e.g., Reuters)
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults KnowledgeBase for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard
Story on Lucent
Story on Cisco
XCM-compliant metadata, XML or other format
Semantic Application
ASP/Enterprise hosted
Extractor Agent 1
Extractor Agent 2
Extractor Agent 3
Metadata-centric Content Management Architecture
Semantic Engine
World Model
Taalee Metabase
Third-party Content Mgmt and Syndication
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
SEC, EPA
Regulations: Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for the future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 66
Metadata is the basis of making Content Intelligent
Precisely what the user asked for
Closely-related high-value information beyond what
was requested
Ability to explore any dimension around the immediate
point of interest Intelligent content helps the user
ldquothinkrdquo about and fulfill their information needs with less effort
Intelligent content can bemore effectively managed packaged and distributed
HP 67
Metadata and Intelligent Content
Taalee makes content more ldquointelligentrdquo through automatic analysis of every
individual asset to generate a catalog containing
bull Context of the Content
bull Semantic Metadata describing entities (ie Company Industry etc) and
bull Relationships (semantic associations) among all entities
Based on a ldquoSemanticrdquo or ldquodomainrdquo model describing how the user thinks about the subject matter supported by a knowledgebase
ldquoNormalrdquo Content can only be ldquofoundrdquo if the user enters a keyword that exists within it
Intelligent Content=+
Adding related metadata and relationshipsdramatically increases the ability to
automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more ldquointelligentrdquo through automatic analysis of every individual content item to create
Context of the ContentSemantic Metadata describing entities (ie Company Industryetc) andRelationships (semantic associations) among all entities
Based on a ldquoSemanticrdquo or ldquodomainrdquo model describing how the user thinks about the subject matter supported by a knowledgebase
HP 69
Metadata amp Search
Metadata can improve search significantly but metadata enables much more than searchAlternatives for improving search clustering link and other analysis (eg Googlersquos Link Flux analysis) classification as context ontologies metadata knowledgebases hellip
HP 70
Metadata Usage Keyword Attribute and Content Based Access
The VisualHarness system at LSDISUGA
HP 71
Keyword Search vs Attribute Search with Semantic metadata
Virage Search on football touchdown
Jimmy Smith Interview Part SevenJimmy Smith explains his philosophy on showboating URL httpcbssportsline
Brian Griese Interview Part FourBrian Griese talks about the first touchdown he ever threw URL httpcbssportsline
Metadata from Typical Cataloging of Football
Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31 Pit 24
httpwwwnflcom
Quandry Ismail and Tony Banks hook up for their third long touchdown this time on a 76-yarder to extend the Ravenrsquos lead to 31-24 in the third quarter
ProfessionalRavens SteelersBal 31 Pit 24Quandry Ismail Tony BanksTouchdownNFLcom2022000
LeagueTeamsScore
PlayersEvent
Produced byPosted date
HP 72
Taaleersquos Semantic Search
Highly customizable precise and freshest AV search
Context and Domain Specific Attributes Uniform Metadata for Content from Multiple Sources Can be sorted by any field
Delightful relevant informationexceptional targeting opportunity
HP 73
Cre
atin
g a
Web
of
rela
ted
info
rmat
ion
Wha
t can
a c
onte
xt d
o
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY amp CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh contentPersonalization andTargetinginteractive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries amp Hot Topics
PERSONALIZATION
3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
4 Pink Floyd Collection
Personalized Queries
Set the Controls for the Heart of the Sunhellip
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
ATampT Corp
2 My Football Fantasy Team
Gators Spurrier ready for big game
Techs Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zonehellip
Edwards explains reasons for leaving BYU morehellip
morehellip
morehellip
morehellip
1 Election 2000
2 Middle East Peace Conflict
3 Napster Controversy
Video Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Coles security
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II morehellip
HOT Topics
morehellip
morehellip
HP 80
Metadata Targeting
SemanticInteractive Targeting
Buy Al Pacino VideosBuy Russell Crowe VideosBuy Christopher Plummer VideosBuy Diane Venora VideosBuy Philip Baker Hall VideosBuy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
Realtime Feeds
Interests Preferences
Time-Shifted Content Aggregator
Web sites and Pages
Content Databases
Personalized Content
Semantic Engine™
Personalized Content
Content
Structured Hi-Quality Semantic Metabase
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
The user has already completed Web-based registration and personalization at Voquette's Enterprise Customer site.
The user's “Wireless Home page” shows the categories for his interests. There is an alert (new content) for his stock and sports categories.
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide).
Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu.
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available.
The user clicks to see a listing of Analyst Calls on Cisco (next slide).
Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata such as “Strong Buy” by the H&Q analyst.
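The semantic filtering described on this slide can be pictured as choosing, per content type, which metadata fields fit on a small screen. The following is a minimal sketch under stated assumptions, not Taalee's implementation: the `DISPLAY_FIELDS` table, the `listing_line` function, the field names, and the character budget are all hypothetical, and reading the listing's "1108" as the date 11/08 is also an assumption.

```python
# Hypothetical field priorities per content type. Only the Analyst Call entry
# (source, then analyst name or company) follows the slide; the rest is assumed.
DISPLAY_FIELDS = {
    "Analyst Call": ["date", "source", "analyst"],
    "Conf Call": ["date", "source"],
}


def listing_line(item, max_chars=24):
    """Build one listing line, adding fields in priority order while they fit."""
    parts = []
    for field in DISPLAY_FIELDS.get(item["type"], ["date"]):
        value = item.get(field)
        if not value:
            continue
        if len(" ".join(parts + [value])) > max_chars:
            break  # screen constraint reached: drop lower-priority fields
        parts.append(value)
    return " ".join(parts)


# Sample records mirroring the listing on the slide (date format assumed).
calls = [
    {"type": "Analyst Call", "date": "11/06", "source": "CBS", "analyst": "Langlesis"},
    {"type": "Analyst Call", "date": "11/08", "source": "ON24", "analyst": "Payne"},
    {"type": "Analyst Call", "date": "11/07", "source": "ON24", "analyst": "H&Q",
     "rating": "Strong Buy"},  # extra metadata, shown as an icon rather than text
]

# Newest first, as in the slide's date-sorted listing.
listing = [listing_line(c) for c in sorted(calls, key=lambda c: c["date"], reverse=True)]
```

A narrower device would simply pass a smaller `max_chars`, so the same metadata record yields a shorter line without changing the stored metadata.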
HP 87
iTV: Taalee's Extreme Personalization
Content Provider
(DBS, DISH, Wink, AOL-TV)
Semantic Engine™
Meta-Data Tagged Content
Content “Programs”
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-Quality Semantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.
This screen is customizable with an interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO.
Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata.
The Conference Call itself can have embedded metadata to support personalization and interactivity.
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support
Sony
Categorize
Catalog
Integrate
Collection
Processing
Network Content
Affiliate Feeds
Public Sources Rich Data
Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
“At least 60 people died in this needless fire,” senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were
-- Breaking News for 11/30/2000 --
Customize Page Settings | Content | Layout | Color
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by the enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) – Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore
HP 92
Retrieve Scene Description Track
Enhanced Digital Cable
Video
MPEGDecoder
Node = AVO Object
Create Scene Description Tree
GREAT USER EXPERIENCE
Metadata's role in emerging iTV infrastructure
MPEG-2/4/7
MPEG Encoder
Scene Description Tree
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers and Broadcasters
Enhanced XML
Description
“Cisco Systems”
Node
Taalee Semantic Engine
“Cisco Systems”
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta
Object Content Information (OCI)
Metadata-rich Value-added Node
HP 93
Intelligent Metadata Creation
Extractor Agents: content which does contain the words the user asked for
+
Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for
+
Semantic Associations: content the user did not think to ask for, but which he needs to know
Metadata for Intelligent Content
Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content:
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the “right” keyword
For example, if a story is about “Roger Clemens” but does not contain the words “New York Yankees”, that story cannot and will not be found if the user searches for “New York Yankees” or “Yankees”.
Understanding of the content is needed to create new metadata.
Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses such Semantic Associations (another example: COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely.
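The idea of value-added metadata can be illustrated with a small knowledge base that is followed from the entities found in a story to the entities they are associated with. This is only a sketch under stated assumptions, not Taalee's patented extraction method: `KNOWLEDGE_BASE`, `FIELD_FOR_RELATION`, `enrich`, and `matches` are hypothetical names, and the relation labels are invented for illustration. The facts themselves (Roger Clemens, Jamal Anderson, Atlanta Falcons, NFL) come from the slides.

```python
# Hypothetical knowledge base: entity -> list of (relation, target) pairs.
KNOWLEDGE_BASE = {
    "Roger Clemens": [("plays_sport", "Baseball"), ("plays_for", "New York Yankees")],
    "Jamal Anderson": [("plays_for", "Atlanta Falcons")],
    "Atlanta Falcons": [("member_of", "NFL")],
}

# Assumed mapping from a relation to the metadata field it fills in.
FIELD_FOR_RELATION = {"plays_sport": "Sport", "plays_for": "Teams", "member_of": "League"}


def enrich(metadata):
    """Return a copy of the metadata with value-added fields from the knowledge base."""
    enriched = {field: list(values) for field, values in metadata.items()}
    pending = list(metadata.get("Players", []))
    seen = set()
    while pending:
        entity = pending.pop()
        if entity in seen:
            continue
        seen.add(entity)
        for relation, target in KNOWLEDGE_BASE.get(entity, []):
            values = enriched.setdefault(FIELD_FOR_RELATION[relation], [])
            if target not in values:
                values.append(target)
            pending.append(target)  # follow chains such as player -> team -> league
    return enriched


def matches(metadata, term):
    """True if any metadata field contains the search term exactly."""
    return any(term in values for values in metadata.values())


# Metadata extracted directly from the asset: no team or league is mentioned.
story = {"Players": ["Jamal Anderson"], "Produced by": ["NFL.com"]}
enriched_story = enrich(story)
```

With only the extracted metadata, a search for "Atlanta Falcons" misses the story; after enrichment the same search finds it, because the team and league were added from the knowledge base rather than from the story text.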
HP 96
Guided Demo for Value-Added Metadata – Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled “Week 3 Top10 Anderson TD Run”) and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked “REAL”)
• View the source HTML page that has the original story and locate this story with the heading “Week 3 top 10 Anderson TD run”
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value-Added Metadata – Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled “I want out”) & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked “REAL”)
• View the source HTML page that has the original story and locate this story with the heading “I want out”
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 – Snapshots (“Jamal Anderson”)
Search for ‘Jamal Anderson’ in ‘Football’
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 – Snapshots (“Gary Sheffield”)
Search for ‘Gary Sheffield’ in ‘Baseball’
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content – Value-Added Metadata
Rich Media Sports Asset
Posted Date: date of asset posting – extracted automatically
Producer Name: name of content provider that produced the asset
Player Names: names of players mentioned explicitly in the asset – extracted automatically
Team Name: name of team for which player plays – not mentioned explicitly in asset – value-added using Taalee's semantic relationships
League Name: name of league to which the player's team belongs – not mentioned explicitly in asset – value-added by Taalee's processing based on semantic associations
Sport: name of sport
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.
The asset is richly and fully described in the many ways users choose to interact.
Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context or relationships of keywords
For example, a search engine may see that the words “Commerce One” occur, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
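One way to picture semantic associations is as typed links between entities that a search can follow to return related content alongside what was asked for. The sketch below is illustrative only and makes several assumptions: the `ASSOCIATIONS` triples, the `STORIES` catalog, and the `competitors` and `search` functions are hypothetical, and treating COMPETES_WITH as symmetric is a modeling choice made here, not something the slides state.

```python
# Hypothetical typed associations: (subject, RELATION, object).
ASSOCIATIONS = [
    ("Commerce One", "PARTICIPATES_IN", "Corporate, Professional & Financial Software"),
    ("Commerce One", "COMPETES_WITH", "Ariba"),
    ("Cisco", "COMPETES_WITH", "Lucent"),
]

# Hypothetical catalog: each story lists the companies named in its metadata.
STORIES = [
    {"title": "Story on Commerce One", "companies": ["Commerce One"]},
    {"title": "Story on Ariba", "companies": ["Ariba"]},
    {"title": "Story on Lucent", "companies": ["Lucent"]},
]


def competitors(company):
    """Companies linked to `company` by COMPETES_WITH, in either direction."""
    found = set()
    for subject, relation, obj in ASSOCIATIONS:
        if relation != "COMPETES_WITH":
            continue
        if subject == company:
            found.add(obj)
        elif obj == company:
            found.add(subject)
    return found


def search(company):
    """Split results into what was asked for and what is semantically related."""
    related_companies = competitors(company)
    asked = [s["title"] for s in STORIES if company in s["companies"]]
    related = [s["title"] for s in STORIES
               if related_companies.intersection(s["companies"])]
    return {"asked_for": asked, "related": related}


result = search("Commerce One")
```

A keyword index would return only the "asked_for" list; the "related" list exists only because the competitor relationship is stored explicitly and followed at query time.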
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company ‘Commerce One’
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Internal Source 1 (Research)
Internal Source 2
External feeds/Web (e.g., Reuters)
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults Knowledge Base for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as “semantically related” to Cisco story – passed on to Dashboard
Story on Lucent
Story on Cisco
XCM-compliant metadata, XML or other format
Semantic Application
ASP/Enterprise hosted
Extractor Agent 1
Extractor Agent 2
Extractor Agent 3
Metadata-centric Content Management Architecture
Semantic Engine
World Model
Taalee Metabase
Third-party Content Mgmt and Syndication
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What you asked for + What you need to know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with competing PRODUCTS; COMPANIES in same or related INDUSTRY
Regulations (SEC, EPA): impacting INDUSTRY or filed by COMPANY
Technology Products: important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic collation of semantically related digital media information from multiple sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for the future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF – one proposal (cf. Ora Lassila)
Structural modeling is obviously not enough: we need a “logic layer” on top of RDF; some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data
RDF + DL = “Frame System for WWW”
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
“A Web in which machine reasoning will be ubiquitous and devastatingly powerful.” [Berners-Lee]
“A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture.” [Berners-Lee]
“A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities.” [Berners-Lee]
A personal definition
Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-Layered Architecture of the Semantic Web
Logical Layer: formal semantics and reasoning support – OIL, DAML-O
Schema Layer: definition of vocabulary – RDF Schema
Data Layer: simple data model and syntax for metadata – RDF
OIL – as RDF Extension
    <rdfs:Class rdf:ID="herbivore">
      <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
      <rdfs:subClassOf rdf:resource="#animal"/>
      <rdfs:subClassOf>
        <oil:NOT>
          <oil:hasOperand rdf:resource="#carnivore"/>
        </oil:NOT>
      </rdfs:subClassOf>
    </rdfs:Class>
DAML and OIL – Evolving towards Semantic Web
OIL Mission
OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Test May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake
  <= dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
     AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
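The rule above can be read as a predicate over one nuclear-test record and one earthquake record. The following is a minimal sketch, not the InfoQuilt implementation: it assumes the date difference is measured in days and must be non-negative (the earthquake comes after the test), that distance is in miles as in the later query slide, and that distance is great-circle distance computed with the haversine formula. The function names and the sample records are hypothetical.

```python
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; miles assumed from the query slide


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


def could_have_caused(test, quake, max_days=30, max_miles=10000):
    """Apply the 'NuclearTest Causes Earthquake' rule to one pair of records."""
    days_after = (quake["eventDate"] - test["eventDate"]).days
    if not 0 <= days_after < max_days:
        return False  # earthquake came before the test, or too long after it
    return distance_miles(test["latitude"], test["longitude"],
                          quake["latitude"], quake["longitude"]) < max_miles


# Made-up records for illustration only; these are not real events.
test = {"eventDate": date(2000, 1, 1), "latitude": 0.0, "longitude": 0.0}
quake_soon = {"eventDate": date(2000, 1, 11), "latitude": 0.0, "longitude": 10.0}
quake_late = {"eventDate": date(2000, 3, 1), "latitude": 0.0, "longitude": 10.0}
quake_before = {"eventDate": date(1999, 12, 25), "latitude": 0.0, "longitude": 10.0}

candidates = [q for q in (quake_soon, quake_late, quake_before)
              if could_have_caused(test, q)]
```

Note that a 10000-mile threshold covers most of the globe, so in practice the time window does most of the filtering; a tighter radius would be passed through `max_miles`.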
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example…
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale per year, starting from 1900, find the average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
Knowledge Discovery - Example…
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 67
Metadata and Intelligent Content
Taalee makes content more “intelligent” through automatic analysis of every individual asset to generate a catalog containing:
• Context of the Content
• Semantic Metadata describing entities (i.e., Company, Industry, etc.) and
• Relationships (semantic associations) among all entities
Based on a “Semantic” or “domain” model describing how the user thinks about the subject matter, supported by a knowledgebase
“Normal” Content can only be “found” if the user enters a keyword that exists within it
Intelligent Content=+
Adding related metadata and relationships dramatically increases the ability to automatically access needed content via multiple dimensions
HP 68
More than metadata
Taalee makes content more “intelligent” through automatic analysis of every individual content item to create:
Context of the Content
Semantic Metadata describing entities (i.e., Company, Industry, etc.) and
Relationships (semantic associations) among all entities
Based on a “Semantic” or “domain” model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …
HP 70
Metadata Usage: Keyword, Attribute and Content Based Access
The VisualHarness system at LSDIS/UGA
HP 71
Keyword Search vs Attribute Search with Semantic Metadata
Virage search on “football touchdown”:
Jimmy Smith Interview Part Seven. Jimmy Smith explains his philosophy on showboating. URL httpcbssportsline
Brian Griese Interview Part Four. Brian Griese talks about the first touchdown he ever threw. URL httpcbssportsline
Metadata from Typical Cataloging of Football Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31 Pit 24
http://www.nfl.com
Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
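The contrast between the two result sets above can be sketched as two search functions over the same small catalog. This is an illustration under stated assumptions, not the Virage or Taalee implementation: the `ASSETS` structure and the `keyword_search` and `attribute_search` functions are hypothetical, while the titles, descriptions, and metadata values are taken from the slide.

```python
# Catalog built from the slide: two interviews with no semantic metadata,
# and one game highlight carrying attribute metadata.
ASSETS = [
    {
        "title": "Jimmy Smith Interview Part Seven",
        "description": "Jimmy Smith explains his philosophy on showboating",
        "metadata": {},
    },
    {
        "title": "Brian Griese Interview Part Four",
        "description": "Brian Griese talks about the first touchdown he ever threw",
        "metadata": {},
    },
    {
        "title": "Baltimore 31 Pit 24",
        "description": "Quandry Ismail and Tony Banks hook up for their third long touchdown",
        "metadata": {
            "League": ["Professional"],
            "Teams": ["Ravens", "Steelers"],
            "Players": ["Quandry Ismail", "Tony Banks"],
            "Event": ["Touchdown"],
            "Produced by": ["NFL.com"],
        },
    },
]


def keyword_search(assets, word):
    """Syntactic search: match the word anywhere in the title or description."""
    word = word.lower()
    return [a["title"] for a in assets
            if word in (a["title"] + " " + a["description"]).lower()]


def attribute_search(assets, criteria):
    """Attribute search: every (field, value) pair must be present in the metadata."""
    return [a["title"] for a in assets
            if all(value in a["metadata"].get(field, [])
                   for field, value in criteria.items())]


keyword_hits = keyword_search(ASSETS, "touchdown")
attribute_hits = attribute_search(ASSETS, {"Event": "Touchdown", "Teams": "Ravens"})
```

The keyword search also returns an interview that merely mentions a touchdown, whereas the attribute search returns only the asset whose Event field is a touchdown and whose Teams field includes the Ravens.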
HP 72
Taalee's Semantic Search
Highly customizable, precise and freshest A/V search
Context and Domain Specific Attributes. Uniform Metadata for Content from Multiple Sources. Can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY amp CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh contentPersonalization andTargetinginteractive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries amp Hot Topics
PERSONALIZATION
3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
4 Pink Floyd Collection
Personalized Queries
Set the Controls for the Heart of the Sunhellip
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
ATampT Corp
2 My Football Fantasy Team
Gators Spurrier ready for big game
Techs Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zonehellip
Edwards explains reasons for leaving BYU morehellip
morehellip
morehellip
morehellip
1 Election 2000
2 Middle East Peace Conflict
3 Napster Controversy
Video Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Coles security
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II morehellip
HOT Topics
morehellip
morehellip
HP 80
Metadata Targeting
SemanticInteractive Targeting
Buy Al Pacino VideosBuy Russell Crowe VideosBuy Christopher Plummer VideosBuy Diane Venora VideosBuy Philip Baker Hall VideosBuy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
Realtime Feeds
Interests Preferences
Time-ShiftedContent Aggregator
Web sites and Pages
ContentDatabases Personalized
Content
Semantic EngineTM
Personalized Content
Content
Structured Hi-Quality
Semantic Metabase
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed WebBased registration and personalization at VoquettersquosEnterprise Customer site
Userrsquos ldquoWireless Home pagerdquo shows the categories for his interests There is an alert (new content) for his stock and sportscategories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down userrsquos Personal Portfolio list The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and the knowledge of userrsquos portfolio Typically search can be done by typing one word or selecting from a dynamic personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions listen set alert on story add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
1108 ON24 Payne1107 ON24 HampQ 1106 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listingsorted by date Semantic filtering uses just the right metadata to meet screen and other constrains Eg Analyst Call focuses onthe source and analyst name or company The icon denote additional metadata such as ldquoStrong Buyrdquo by HampQ Analyst
HP 87
iTV Taaleersquos Extreme Personalization
Content Provider
(DBS DISH Wink AOL-TV)
Semantic EngineTM
Meta-DataTagged Content
ContentldquoProgramsrdquo
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-QualitySemantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that isused by personalization application to show only the stocksthat user is interested in
This screen is customizablewith interactivity featureusing metadata such as whetherthere is a new ConferenceCall video on CSCO
Part of the screen can beautomatically customized to show conference call specific informationndash including transcriptparticipation etc all of which arerelevant metadata
Conference Call itself can have embedded metadata to support personalization andinteractivity
HP 89
Metadata in Enterprise Apps
Filter Search ConsolidatePersonalize ArchiveLicensing Syndication
Production SupportProduction SupportSony
Categorize
Catalog
Integrate
CollectionCollection ProcessingProcessing
NetworkContent
AffiliateFeeds
Public Sources Rich Data
Metabase
HP 90
t
A leaking gasoline pipeline burst into flames Thursday killing more than 60 people near Nigerias commercial capital of Lagos Many of the dead were fisherman in wooden canoes engulfed in the inferno
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko facing the central business district of Lagos across a lagoon
At least 60 people died in this needless fire senior local official Karimu Alabi said
Fire crews from state-run Nigerian National Petroleum Corp (NNPC) which owns the pipeline were joined by other firemen from construction company Julius Berger in battling the blaze
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak ravaging a cluster of huts and log houses
At about the same time a second fire razed Makoko shantytown where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in thewooden huts for days Many victims of the Ebute-Oke fire were
Gore Demands That Recount Restart (940 PM)Gore Says Fla Cant Name Electors (450 PM)Bush Meets Colin Powell at Ranch (122 PM)
Market Tumbles on Earnings Warning (927 AM)Barak Outlines His Peace Plan (630 AM)
-- Breaking News for 11302000 --Customize Page Settings | Content | Layout | Color
Sixty Die In Nigeria BlastProduced by Euronews Posted Date 11302000 Event Election 2000 Location Tallahassee Florida USAPeople Al Gore George W Bush
Video
bull Value-add for production broadcast amp syndication
bull Taaleersquos semantic metadata enables powerful access to content used by Enterprisersquos customers
bull Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --Gore Demands That Recount Restart
Gore Says Fla Cant Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(133) ndash 120600 - ABC
(253) - 120600 - CBS
(516) - 120600 - ABC
(246) - 120600 - FOX
(133) - 120600 - NBC
(533) - 120600
(357) - 120600 - CBS
(427) - 120600 - ABC
(344) - 120600 - FOX
(724) - 120600 - CBS
(133) - 120600 - CBS
TALLAHASSEE Florida (CNN) ndashThough the two presidential candidates have until noon Wednesday to file briefs in Al Gores appeal to the Florida Supreme Court the outcome of two trials set on the same day in Leon County Florida may offer Gore his best hope for the presidency Democrats in Seminole County are seeking to have 15000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5000 votes statewide Lawyers for the plaintiff Harry Jacobs claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2126 incomplete absentee ballot applications sent in by GOP voters while refusing to allow Democratic workers to do the same thing for Democratic voters
The GOP says that suit and one similar to it from Martin County demonstrates Democratic Party politics at its most desperate Gore is not a party to either of those lawsuits On Tuesday the judge in the
(133) - 120600 - ABC
(233) - 120600 - CBS
(312) - 120600 - NNS
(032) - 120600 - CBS
(133) - 120600 - CBS
DescriptionProduced by CNNPosted Date 12072000 Reporter David Lewis Event Election 2000 Location Tallahassee Florida USAPeople Al Gore
HP 92
Retrieve Scene Description Track
Enhanced Digital Cable
Video
MPEGDecoder
Node = AVO Object
Create Scene Description Tree
GREATUSER
EXPERIENCE
Metadatarsquos role in emerging iTV infrastructure
MPEG-247MPEG
Encoder
SceneDescriptionTree
License metadata decoder and semantic applications to
device makers
Channel salesthrough Video Server Vendors
Video App Servers and Broadcasters
Enhanced XML
Description
ldquoCisco Systemsrdquo
Node
TaaleeSemanticEngine
ldquoCisco Systemsrdquo
Produced by Fox Sports Creation Date 12052000 League NFLTeams Seattle Seahawks
Atlanta Falcons Players John KitnaCoaches Mike Holmgren
Dan ReevesLocation Atlanta
Object Content Information (OCI)
Metadata-richValue-added Node
HP 93
Intelligent Metadata Creation
Content which doescontain the wordsthe user asked for
Extractor Agents
Content which does not contain the words
the user asked for but is about what he asked
for
Value-added Metadata
Content the user did not think to ask for but
which he needs to know
Semantic Associations
+ +
Metadata for Intelligent ContentMetadata for Intelligent Content
Usage
HP 94
Intelligent Contentvia
Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.
• If a keyword is not in the content, it cannot be found.
• The burden is on the user to think of and ask for the "right" keyword.
For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".
Understanding of the content is needed to create new metadata.
Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses such Semantic Associations (another example: COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely.
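The enrichment step described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary-based knowledge base, the field names (`players`, `teams`, `sports`) and the functions `add_value_added_metadata` and `matches` are hypothetical and are not Taalee's actual data model or API. Only the Roger Clemens facts come from the slide.

```python
# Hypothetical knowledge base: entity name -> known facts about it.
# The structure and key names are assumptions made for this sketch.
KNOWLEDGE_BASE = {
    "Roger Clemens": {
        "type": "PERSON",
        "plays_sport": "Baseball",
        "plays_for_team": "New York Yankees",
    },
}


def add_value_added_metadata(metadata, knowledge_base):
    """Return a copy of metadata with team and sport inferred from players."""
    enriched = {key: list(values) for key, values in metadata.items()}
    for player in metadata.get("players", []):
        facts = knowledge_base.get(player)
        if facts is None:
            continue  # unknown entity: nothing can be inferred
        for field, relation in (("teams", "plays_for_team"),
                                ("sports", "plays_sport")):
            value = facts.get(relation)
            if value and value not in enriched.setdefault(field, []):
                enriched[field].append(value)
    return enriched


def matches(metadata, query):
    """True if the query occurs (case-insensitively) in any metadata value."""
    q = query.lower()
    return any(q in value.lower()
               for values in metadata.values()
               for value in values)


# Metadata extracted from a story that mentions only the player.
story = {"players": ["Roger Clemens"]}
enriched_story = add_value_added_metadata(story, KNOWLEDGE_BASE)
```

With only the extracted metadata, a search for "Yankees" misses the story; after enrichment the same search finds it, which is the behavior the slide describes.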
HP 96
Guided Demo for Value Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player of the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player of the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots ("Jamal Anderson")
• Search for 'Jamal Anderson' in 'Football'
• Click on first result for Jamal Anderson
• View metadata. Note that Team name and League name are also included in the metadata
• View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 - Snapshots ("Gary Sheffield")
• Search for 'Gary Sheffield' in 'Baseball'
• Click on first result for Gary Sheffield
• View metadata. Note that Team name and League name are also included in the metadata
• View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content - Value-Added Metadata
[Figure: a Rich Media Sports Asset surrounded by its metadata fields]
• Producer Name: name of content provider that produced the asset
• Posted Date: date of asset posting - extracted automatically
• Player Names: names of players mentioned explicitly in the asset - extracted automatically
• Team Name: name of team for which the player plays - not mentioned explicitly in asset - value-added using Taalee's semantic relationships
• League Name: name of league to which the player's team belongs - not mentioned explicitly in asset - value-added by Taalee's processing based on semantic associations
• Sport: name of sport
Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.
The asset is richly and fully described in the many ways users choose to interact.
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context or relationships of keywords
For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
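One way to picture such associations is as (subject, relationship, object) triples that an application can traverse. The sketch below is a minimal illustration, not Taalee's Semantic Content Model: the triple layout, the relationship names, the story records and the functions `related` and `associated_stories` are all assumptions introduced here. Only the Commerce One facts come from the slide.

```python
# Hypothetical triples; the relationship names are assumptions for this sketch.
TRIPLES = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Commerce One", "participates_in",
     "Corporate Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]

# Hypothetical cataloged stories, each tagged with the companies it is about.
STORIES = [
    {"title": "Story on Ariba", "companies": ["Ariba"]},
    {"title": "Story on Commerce One", "companies": ["Commerce One"]},
]


def related(entity, relationship, triples, symmetric=False):
    """Return entities linked to `entity` by `relationship`.

    With symmetric=True the relationship is also followed backwards,
    which suits relations such as "competes_with".
    """
    found = [o for s, r, o in triples if s == entity and r == relationship]
    if symmetric:
        found += [s for s, r, o in triples if o == entity and r == relationship]
    return found


def associated_stories(entity, stories, triples):
    """Titles of stories about competitors of `entity` (not asked for directly)."""
    competitors = set(related(entity, "competes_with", triples, symmetric=True))
    return [s["title"] for s in stories if competitors & set(s["companies"])]
```

A keyword index would return only the Commerce One story for a Commerce One query; following the `competes_with` association additionally surfaces the Ariba story.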
HP 103
Example (test on http://directory.mediaanywhere.com)
• Search for company 'Commerce One'
• Links to news on companies that compete against Commerce One
• Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
• Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata-centric Content Management Architecture
[Figure: architecture diagram. Labels: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g. Reuters); Extractor Agent 1; Extractor Agent 2; Extractor Agent 3; Semantic Engine; World Model; Taalee Metabase; XCM-compliant metadata, XML or other format; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication; Story on Cisco; Story on Lucent]
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults KnowledgeBase for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
[Figure: COMPANY at the center, linked to the following]
• Related Stock News
• Industry News
• Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
• SEC, EPA Regulations: Impacting INDUSTRY or Filed By COMPANY
• Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example
Financial Advisor Research Dashboard
• Automatic Collation of semantically related digital media information from Multiple Sources
• Research Inferred Automatically
• Semantically Related News Not Specifically Asked For
• Semantic Search, Personalization, etc.
A vision for future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g. InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
• Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility
• Exposing a wide variety of data sources as RDF is useful, particularly if we have logic rules which allow us to draw inferences from this data
• RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]
A personal definition
Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)?
• A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
• Based on RDF + XML
• Agent-readable tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-layered Architecture of the Semantic Web
• Logical Layer: Formal Semantics and Reasoning Support - OIL, DAML-O
• Schema Layer: Definition of Vocabulary - RDF Schema
• Data Layer: Simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL - Evolving towards Semantic Web
OIL Mission
OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Tests May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.
NuclearTest Causes Earthquake
  <= dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
     AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
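The rule above can be read as an executable predicate. The following Python sketch is one possible reading, not InfoQuilt's implementation. It assumes the thresholds are 30 days and 10000 miles, that `distance` is a great-circle (haversine) distance, and that "some time after" means the earthquake must not precede the test. The record layout and the function `may_have_caused` are hypothetical, as are the sample records.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; miles are an assumed unit


def distance(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in miles; inputs in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp against rounding so asin never sees a value above 1.
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))


def date_difference(test_date, quake_date):
    """Days from the nuclear test to the earthquake (negative if earlier)."""
    return (quake_date - test_date).days


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """Evaluate the 'NuclearTest Causes Earthquake' rule for one pair."""
    days = date_difference(test["eventDate"], quake["eventDate"])
    if not 0 <= days < max_days:  # assumption: quake must follow the test
        return False
    return distance(test["latitude"], test["longitude"],
                    quake["latitude"], quake["longitude"]) < max_miles


# Hypothetical records, invented purely to exercise the rule.
test = {"eventDate": date(1970, 1, 1), "latitude": 37.0, "longitude": -116.0}
quake_near = {"eventDate": date(1970, 1, 15), "latitude": 36.0, "longitude": -117.0}
quake_late = {"eventDate": date(1970, 3, 1), "latitude": 36.0, "longitude": -117.0}
quake_before = {"eventDate": date(1969, 12, 25), "latitude": 36.0, "longitude": -117.0}
```

Note that a 10000-mile threshold covers most of the globe, so with these values the date window does nearly all of the filtering.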
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year, starting from 1900.
Increase in number of earthquakes since 1945
Knowledge Discovery - Example...
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9 and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes.
The number of earthquakes with magnitude > 7 is almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7.
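The grouping query on this slide can be sketched as follows. This is an illustrative reading only: the half-open interpretation of the ranges (a magnitude of exactly 6 falls into 6-7, and exactly 9 into the top range), the `(year, magnitude)` record layout and the function names are assumptions, and the sample records are invented, not taken from USGS or NEIC data.

```python
from collections import Counter

# Magnitude ranges from the slide, read as half-open intervals [low, high).
RANGES = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, float("inf"))]


def magnitude_range(magnitude):
    """Return the (low, high) range containing magnitude, or None if below 5.8."""
    for low, high in RANGES:
        if low <= magnitude < high:
            return (low, high)
    return None


def average_per_year(quakes, first_year, last_year):
    """Average number of earthquakes per year for each magnitude range.

    `quakes` is an iterable of (year, magnitude) pairs.
    """
    counts = Counter()
    for year, magnitude in quakes:
        bucket = magnitude_range(magnitude)
        if bucket is not None and first_year <= year <= last_year:
            counts[bucket] += 1
    years = last_year - first_year + 1
    return {bucket: counts[bucket] / years for bucket in RANGES}


# Hypothetical records, invented for illustration only.
sample = [(1900, 5.9), (1900, 6.5), (1901, 6.2), (1901, 7.1), (1901, 5.0)]
averages = average_per_year(sample, 1900, 1901)
```

Comparing such averages for the years before and after testing began is the kind of evidence the slide uses for its tentative conclusion.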
Knowledge Discovery - Example...
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and within a radius of 10000 miles of the epicenter of the earthquake.
Demo
Resources/References
• RDF: www.w3.org/TR/REC-rdf-syntax
• ICE: www.icestandard.org
• Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
• XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
• DAML: www.daml.org
• NEWSML: newsshowcase.reuters.com
• PRISM: www.prismstandard.org/techdev/prismspec1.asp
• XCM: www.vignette.com
• OIL: www.ontoknowledge.org/oil
• SEMANTICWEB: www.semanticweb.org
• VOICEXML: www.voicexml.org
• MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
• Taalee: www.taalee.com
• Oingo: www.oingo.com
• Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 68
More than metadata
Taalee makes content more "intelligent" through automatic analysis of every individual content item to create:
• Context of the Content
• Semantic Metadata describing entities (i.e. Company, Industry, etc.), and
• Relationships (semantic associations) among all entities
Based on a "Semantic" or "domain" model describing how the user thinks about the subject matter, supported by a knowledgebase
HP 69
Metadata & Search
• Metadata can improve search significantly, but metadata enables much more than search
• Alternatives for improving search: clustering, link and other analysis (e.g. Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, ...
HP 70
Metadata Usage: Keyword, Attribute and Content Based Access
The VisualHarness system at LSDIS/UGA
HP 71
Keyword Search vs Attribute Search with Semantic metadata
Metadata from Typical Cataloging of Football Assets
Virage: Search on football touchdown
• Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline
• Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31, Pit 24
http://www.nfl.com
Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder to extend the Ravens' lead to 31-24 in the third quarter.
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
HP 72
Taalee's Semantic Search
Highly customizable, precise and freshest A/V search
• Context and Domain Specific Attributes
• Uniform Metadata for Content from Multiple Sources
• Can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do
Taalee Directory: Georgia Bulldogs
System recognizes ENTITY & CATEGORY
Taalee Directory: Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content: Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
PERSONALIZATION
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries & Hot Topics
Personalized Queries
1. My Stock Portfolio: Microsoft suffers serious hack attack; Cisco Systems Inc; Analyst Safa Rashtchy on Yahoo; PeopleSoft Inc; AT&T Corp; more...
2. My Football Fantasy Team: Gators Spurrier ready for big game; Techs Vick looks to become complete QB; Bucs excited about Hamilton; Jasper Sanks rumbles into the end zone...; Edwards explains reasons for leaving BYU; more...
3. Julia Roberts Collection: Movie Trailer Notting Hill; Trailer - Runaway Bride; Patrick; Movie Trailer Stepmom; Conspiracy Theory; more...
4. Pink Floyd Collection: Set the Controls for the Heart of the Sun...; Wish You Were Here; Round And Around; Keep Talking; The Post War Dream; more...
HOT Topics
1. Election 2000: Video Explaining the electoral map; Race for White House hots up; Seniors Give Gore Florida Edge; more...
2. Middle East Peace Conflict: More die as Israel steps up security; Israel braces for suicide bombs; Pentagon probes Coles security; more...
3. Napster Controversy: The Brain Behind Napster; Napster Lawsuit; Creative Nomad II; more...
HP 80
Metadata Targeting
Semantic/Interactive Targeting
• Buy Al Pacino Videos
• Buy Russell Crowe Videos
• Buy Christopher Plummer Videos
• Buy Diane Venora Videos
• Buy Philip Baker Hall Videos
• Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web: Extreme Personalization
[Figure: personalization architecture. Labels: Realtime Feeds; Web sites and Pages; Content Databases; Time-Shifted Content Aggregator; Content; Semantic Engine(TM); Structured Hi-Quality Semantic Metabase; Interests, Preferences; Personalized Content]
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
[Screen menu: MyStocks | News | Sports | Music | MyMedia | $]
• User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site
• User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
[Screen menu: MyStocks | News | Sports | Music | MyMedia | $]
[Screen: My Stocks - CSCO, NT, IBM, Market]
• Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)
• Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
[Screen menu: MyStocks | News | Sports | Music | MyMedia | $]
[Screen: My Stocks - CSCO, NT, IBM, Market]
[Screen: CSCO - Analyst Call, Conf Call, Earnings]
• Different types of recent audio content about Cisco are available
• The user clicks to see a listing of Analyst Calls on Cisco (next slide)
• Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
[Screen menu: MyStocks | News | Sports | Music | MyMedia | $]
[Screen: My Stocks - CSCO, NT, IBM, Market]
[Screen: CSCO - Analyst Call, Conf Call, Earnings]
[Screen: CSCO Analysis - 11/08 ON24 Payne; 11/07 ON24 H&Q; 11/06 CBS Langlesis]
• Clicking on the link for Cisco Analyst Calls displays a listing sorted by date
• Semantic filtering uses just the right metadata to meet screen and other constraints. E.g. Analyst Call focuses on the source and analyst name or company
• The icons denote additional metadata such as "Strong Buy" by H&Q Analyst
HP 87
iTV: Taalee's Extreme Personalization
[Figure: iTV personalization architecture. Labels: Content Provider (DBS, DISH, Wink, AOL-TV); Content "Programs"; Meta-Data Tagged Content; Semantic Engine(TM); Structured Hi-Quality Semantic Metabase; Immediate Interests; Preferences; Personalized Content Capsules; Redirects and Programming]
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
• This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in
• This screen is customizable, with an interactivity feature using metadata such as whether there is a new Conference Call video on CSCO
• Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata
• The Conference Call itself can have embedded metadata to support personalization and interactivity
HP 89
Metadata in Enterprise Apps
[Figure: enterprise content pipeline. Labels: Network Content; Affiliate Feeds; Public Sources; Rich Data; Collection; Processing (Categorize, Catalog, Integrate); Metabase; Production Support; Sony; Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication]
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were
-- Breaking News for 11/30/2000 --
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
Customize Page: Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE Florida (CNN) ndashThough the two presidential candidates have until noon Wednesday to file briefs in Al Gores appeal to the Florida Supreme Court the outcome of two trials set on the same day in Leon County Florida may offer Gore his best hope for the presidency Democrats in Seminole County are seeking to have 15000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5000 votes statewide Lawyers for the plaintiff Harry Jacobs claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2126 incomplete absentee ballot applications sent in by GOP voters while refusing to allow Democratic workers to do the same thing for Democratic voters
The GOP says that suit and one similar to it from Martin County demonstrates Democratic Party politics at its most desperate Gore is not a party to either of those lawsuits On Tuesday the judge in the
(133) - 120600 - ABC
(233) - 120600 - CBS
(312) - 120600 - NNS
(032) - 120600 - CBS
(133) - 120600 - CBS
DescriptionProduced by CNNPosted Date 12072000 Reporter David Lewis Event Election 2000 Location Tallahassee Florida USAPeople Al Gore
HP 92
Retrieve Scene Description Track
Enhanced Digital Cable
Video
MPEGDecoder
Node = AVO Object
Create Scene Description Tree
GREATUSER
EXPERIENCE
Metadatarsquos role in emerging iTV infrastructure
MPEG-247MPEG
Encoder
SceneDescriptionTree
License metadata decoder and semantic applications to
device makers
Channel salesthrough Video Server Vendors
Video App Servers and Broadcasters
Enhanced XML
Description
ldquoCisco Systemsrdquo
Node
TaaleeSemanticEngine
ldquoCisco Systemsrdquo
Produced by Fox Sports Creation Date 12052000 League NFLTeams Seattle Seahawks
Atlanta Falcons Players John KitnaCoaches Mike Holmgren
Dan ReevesLocation Atlanta
Object Content Information (OCI)
Metadata-richValue-added Node
HP 93
Intelligent Metadata Creation
Content which doescontain the wordsthe user asked for
Extractor Agents
Content which does not contain the words
the user asked for but is about what he asked
for
Value-added Metadata
Content the user did not think to ask for but
which he needs to know
Semantic Associations
+ +
Metadata for Intelligent ContentMetadata for Intelligent Content
Usage
HP 94
Intelligent Contentvia
Value-Added Metadata
HP 95
Value-added MetadataTraditional methods rely solely on (syntactic) indexing of keywords to enable
users to access content
bull If a keyword is not in the content it cannot be found
bull The burden is on the user to think of and ask for the ldquorightrdquo keyword
For example If a story is about ldquoRoger Clemensrdquo but does not contain the
words ldquoNew York Yankeesrdquo that story cannot and will not be found if the user
searches for ldquoNew York Yankeesrdquo or ldquoYankeesrdquo
Understanding of the content is needed to create new metadata
Taalee understands Roger Clemens is a PERSON who Plays a SPORT called
Baseball for a TEAM from New York called the Yankees Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY)
to add missing metadata to describe content more completely
HP 96
Guided Demo for Value Added Metadata ndashExample one
bull Go to httpwwwmediaanywherecomFootballhtml amp search for Player = Jamal Anderson
bull Click on the first result (titled ldquoWeek 3 Top10 Anderson TD Runrdquo) and view the metadata
on the following RMR page
bull Here is what you see
Produced by NFLcom Posted Date 9202000 League NFL
Teams Atlanta Falcons Players Jamal Anderson
bull Now click on the button to play the asset (button marked ldquoREALrdquo)
bull View the source HTML page that has the original story and locate this story with the
heading ldquoWeek 3 top 10 Anderson TD runrdquo
bull Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user
searching for Atlanta Falcons will find this story on Jamal Anderson who is a player of
Atlanta Falcons team
bull Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata ndashExample Two
bull Go to httpwwwmediaanywherecomBaseballhtml amp search for Player = Gary Sheffield
bull Click on the first result (titled ldquoI want outrdquo) amp view the metadata on the following RMR page
bull Here is what you see
Produced by ESPN Posted Date 3032001 League National League
Teams Los Angeles Dodgers Players Gary Sheffield
bull Now click on the button to play the asset (button marked ldquoREALrdquo)
bull View the source HTML page that has the original story and locate this story with the
heading ldquoI want outrdquo
bull Verify that Team=Los Angeles Dodgers or League=National League was not present in
the source content
bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user
searching for Los Angeles Dodgers will find this story on Gary Sheffield who is a player of
Los Angeles Dodgers team
N t th t th h i d di t i ill t b bl t d thi
HP 98
Example 1 ndash Snapshots (ldquoJamal Andersonrdquo)
Search for lsquoJamal Andersonrsquo in lsquoFootballrsquo
Click on first result for Jamal Anderson
View metadata Note that Team name and League name are also included
in the metadata
View the original source HTML page Verify that
the source page contains no mention of Team nameand League name They
were Taaleersquos value-additions to the metadata to facilitate easier search
HP 99
Example 2 ndash Snapshots (ldquoGary Sheffieldrdquo)
Click on first result for Gary Sheffield
View metadata Note that Team name and League name are also included
in the metadata
Search for lsquoGary Sheffieldrsquo in lsquoBaseballrsquo
View the original source HTML page Verify that
the source page contains no mention of Team nameand League name They
were Taaleersquos value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content ndash Value-Added Metadata
Posted Date
Posted Date
Date of asset posting ndashExtracted automatically
League Name
Name of league to which the payerrsquos team belongs ndash Not mentioned explicitly in asset ndash Value-added by Taaleersquos processing based on semantic associations
Name of team for which player plays ndash Not mentioned explicitly in assetndash Value-added using Taaleersquos semantic relationships
Team NameTeam Name
Producer Name
Producer Name
Rich MediaSports AssetRich Media
Sports Asset
Name of content provider that produced the asset
Some Metadata are obtained explicitly from the asset Others (not present in the asset) are added
by Taalee using its semantic relationships
The asset is richly fully described in the many ways the users chose to interact
Player NamesPlayer Names
SportSportName of
sport
LegendX Y meansTaalee uses X to add Yas value-added metadatato the asset
Name of players mentioned explicitly in the asset ndash Extracted automatically
HP 101
Intelligent Contentvia
Semantic Associations
HP 102
Semantic Associations
bull Traditional search engines rely solely on (syntactic) keywords to find content
bull They do not understand the meaning context or relationships of keywords
For example a search engine may see that the word ldquoCommerce Onerdquo occurs
but it does not know that Commerce One is a COMPANY which Participates in
the Corporate Professional amp Financial Software INDUSTRY and COMPETESWITH Ariba
As a result search engines cannot go beyond returning a list (or directory view)
of what the user has asked for Their ability to provide associated information is
extremely limited static and difficult to scale Taaleersquos Semantic Content Model
goes beyond indexing keywords and classifying assets toUnderstand and Associate all content it catalogs
HP 103
Example (test on httpdirectorymediaanywherecom)
Search for company lsquoCommerce Onersquo
Links to news on companies that compete against
Commerce One
Links to news on companies Commerce One competes
against(To view news on Ariba click
on the link for Ariba)
Crucial news on Commerce Onersquos
competitors (Ariba) can be accessed easily and
automatically
HP 104
Internal Source 1Research
Internal Source 2
External feedsWeb(eg Reuters)
1
2
3
4
Cisco story from PW Source 1passed on to addsemanticassociations
ConsultsKnowledgeBasefor Ciscorsquoscompetition
Returns resultLucent is a competitor of Cisco
Lucent story from external
feeds picked for publishing as ldquosemantically
relatedrdquo to Ciscostory ndash passed
on to Dashboard
Story onLucent
Story onCisco
XCM-compliant metadata XML or other format
SemanticApplication
ASPEnterprise hosted
Extractor Agent 1
Extractor Agent 2
Extractor Agent 3
Metadata centricContent Management Architecture
SemanticEngine
World Model
TaaleeMetabase
Third-partyContent Mgmt
AndSyndication
HP 105
Semantic Associationssupported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANYCOMPANYRelated Stock News
Related Stock News
IndustryNews
IndustryNews
CompetitionCompetitionCOMPANIES inINDUSTRY with Competing PRODUCTSCOMPANIES in Same or
Related INDUSTRY
SECEPAEPA
RegulationsRegulationsImpacting INDUSTRY or Filed By COMPANY
Technology Products
Technology ProductsImportant to INDUSTRY or COMPANY
HP 106
Semantic Web Application ExampleFinancial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic SearchPersonalization etc
A vision for future
Semantic Web Complex Relationships and Knowledge Discovery
Eg InfoQuilt project at LSDIS Lab Univ of Georgia
HP 108
Beyond RDF ndash one proposal (cf Ora Lassila)
Structural modeling obviously not enoughwe need a ldquologic layerrdquo on top of RDFsome type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful particularly if we have logicrules which allow us to draw inference from this data
RDF + DL = ldquoFrame System for WWWrdquo
Source wwwontoknowledgeorgoil
HP 109
Semantic Web - next step in Web evolution
ldquoA Web in which machine reasoning will be ubiquitous and devastatingly powerfulrdquo [Berners-Lee]
ldquoA place where the whim of a human being and the reasoning of a machine coexist in an ideal powerful mixturerdquo [Berners-Lee]
ldquoA semantic Web would permit more accurate and efficient Web searches which are among the most important Web-based activitiesrdquo [Berners-Lee]
A personal definitionSemantic Web The concept that Web-accessible
content can be organized semantically rather than though syntactic and structural methods
What is DAML (DARPA Agent Markup Language)
a proposal to create technologies that will enable software agents to dynamically identify and understand information sources and to provide interoperability between agents in a semantic mannerBased on RDF+XMLAgent readable Tags
wwwdamlorg
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-Layered Architecture of the Semantic Web
Logical Layer: Formal Semantics and Reasoning Support – OIL, DAML-O
Schema Layer: Definition of Vocabulary – RDF Schema
Data Layer: Simple data model and syntax for metadata – RDF
OIL – as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
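To make the structure of the herbivore definition concrete, here is a minimal sketch that reads a similar fragment with Python's standard xml.etree.ElementTree and pulls out the named superclass and the negated class. The wrapping rdf:RDF element, the oil namespace URI, and the "#" fragment references are assumptions added so the snippet parses on its own; the rdf:type line is left out for brevity.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OIL = "http://example.org/oil#"  # placeholder namespace (assumption)

DOC = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}" xmlns:oil="{OIL}">
  <rdfs:Class rdf:ID="herbivore">
    <rdfs:subClassOf rdf:resource="#animal"/>
    <rdfs:subClassOf>
      <oil:NOT>
        <oil:hasOperand rdf:resource="#carnivore"/>
      </oil:NOT>
    </rdfs:subClassOf>
  </rdfs:Class>
</rdf:RDF>
"""


def describe(doc):
    """Return (class id, named superclasses, negated classes)."""
    root = ET.fromstring(doc)
    cls = root.find(f"{{{RDFS}}}Class")
    class_id = cls.get(f"{{{RDF}}}ID")
    supers, negated = [], []
    for sub in cls.findall(f"{{{RDFS}}}subClassOf"):
        # A plain subClassOf points at a named class via rdf:resource.
        target = sub.get(f"{{{RDF}}}resource")
        if target is not None:
            supers.append(target.lstrip("#"))
        # A subClassOf wrapping oil:NOT names the class being excluded.
        for operand in sub.iter(f"{{{OIL}}}hasOperand"):
            negated.append(operand.get(f"{{{RDF}}}resource").lstrip("#"))
    return class_id, supers, negated


print(describe(DOC))
```

Read this way, the definition says: a herbivore is an animal, and a herbivore is not a carnivore.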
DAML and OIL – Evolving towards Semantic Web
OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Tests May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake <=
  dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
  AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
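The rule above can be sketched as an ordinary predicate. The version below is only an illustration: the Event class, the haversine distance, the choice of miles as the unit, and the sample events are all assumptions, and the check that the earthquake does not precede the test is added from the prose statement of the rule rather than from the formula itself.

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; miles is an assumed unit


@dataclass
class Event:
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees


def date_difference(test_date, quake_date):
    """Days from the nuclear test to the earthquake (negative if the quake came first)."""
    return (quake_date - test_date).days


def distance(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in miles between two points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(min(1.0, sqrt(a)))


def causes(test, quake, max_days=30, max_miles=10000):
    """Rule body: date difference < 30 AND distance < 10000."""
    days = date_difference(test.event_date, quake.event_date)
    miles = distance(test.latitude, test.longitude, quake.latitude, quake.longitude)
    # "0 <=" encodes "some time after the test"; the formula leaves this implicit.
    return 0 <= days < max_days and miles < max_miles


# Made-up events, for illustration only.
test = Event(date(1970, 1, 1), 37.0, -116.0)
quakes = [
    Event(date(1970, 1, 15), 36.0, -117.0),   # 14 days later, nearby
    Event(date(1970, 3, 1), 36.0, -117.0),    # too late
    Event(date(1969, 12, 20), 36.0, -117.0),  # before the test
    Event(date(1970, 1, 10), -30.0, 60.0),    # well over 10,000 miles away
]
pairs = [(test, q) for q in quakes if causes(test, q)]
print(len(pairs))
```

Filtering every (test, earthquake) combination through such a predicate is one simple way to answer the "find pairs" style of query; a real system would evaluate it against the underlying data sources instead of in-memory lists.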
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example…
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale per year starting from 1900, find the average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
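The grouping query can be sketched as a small aggregation. In the sketch below the function names, the (year, magnitude) record layout, and the sample records are assumptions for illustration; the ranges are treated as half-open intervals, a boundary convention the slide does not specify.

```python
from collections import Counter

# Magnitude ranges from the query, as half-open intervals [low, high).
RANGES = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, float("inf"))]


def bucket(magnitude):
    """Return the range a magnitude falls into, or None if it is below 5.8."""
    for low, high in RANGES:
        if low <= magnitude < high:
            return (low, high)
    return None


def average_per_year(quakes, first_year, last_year):
    """quakes: iterable of (year, magnitude). Returns {range: average count per year}."""
    counts = Counter()
    for year, magnitude in quakes:
        b = bucket(magnitude)
        if b is not None and first_year <= year <= last_year:
            counts[b] += 1
    n_years = last_year - first_year + 1
    return {r: counts[r] / n_years for r in RANGES}


# Made-up records, for illustration only.
sample = [(1900, 5.9), (1900, 6.5), (1901, 6.1), (1901, 7.2), (1901, 5.0)]
averages = average_per_year(sample, 1900, 1901)
for r in RANGES:
    print(r, averages[r])
```

Running the same aggregation over two periods, before and after testing began, gives the per-range comparison that the conclusion on the slide relies on.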
Knowledge Discovery - Example…
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 69
Metadata & Search
Metadata can improve search significantly, but metadata enables much more than search
Alternatives for improving search: clustering, link and other analysis (e.g., Google's Link Flux analysis), classification as context, ontologies, metadata, knowledgebases, …
HP 70
Metadata Usage: Keyword, Attribute and Content Based Access
The VisualHarness system at LSDIS/UGA
HP 71
Keyword Search vs. Attribute Search with Semantic Metadata
Virage search on "football touchdown"
Jimmy Smith Interview Part Seven: Jimmy Smith explains his philosophy on showboating. URL: http://cbs.sportsline
Brian Griese Interview Part Four: Brian Griese talks about the first touchdown he ever threw. URL: http://cbs.sportsline
Metadata from Typical Cataloging of Football Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31, Pit 24
http://www.nfl.com
Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder to extend the Ravens' lead to 31-24 in the third quarter
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
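The contrast between the two kinds of search can be sketched in a few lines. The record layout and function names below are assumptions for illustration, and the empty metadata on the second asset is a simplification of "typical cataloging"; the field names and values come from the Rich Media Reference Page above.

```python
# Two assets: one with domain-specific metadata, one with only title and text.
assets = [
    {
        "title": "Baltimore 31, Pit 24",
        "text": "Quandry Ismail and Tony Banks hook up for their third long touchdown",
        "metadata": {
            "League": "Professional",
            "Teams": ["Ravens", "Steelers"],
            "Players": ["Quandry Ismail", "Tony Banks"],
            "Event": "Touchdown",
        },
    },
    {
        "title": "Brian Griese Interview Part Four",
        "text": "Brian Griese talks about the first touchdown he ever threw",
        "metadata": {},  # assumed: no domain attributes were cataloged
    },
]


def keyword_search(items, *words):
    """Return titles of items whose title or text contains every word."""
    hits = []
    for item in items:
        haystack = (item["title"] + " " + item["text"]).lower()
        if all(w.lower() in haystack for w in words):
            hits.append(item["title"])
    return hits


def attribute_search(items, **criteria):
    """Return titles of items whose metadata matches every field=value pair."""
    hits = []
    for item in items:
        meta = item["metadata"]
        matched = True
        for field, wanted in criteria.items():
            value = meta.get(field)
            values = value if isinstance(value, list) else [value]
            if wanted not in values:
                matched = False
        if matched:
            hits.append(item["title"])
    return hits


print(keyword_search(assets, "touchdown"))
print(attribute_search(assets, Event="Touchdown", Teams="Steelers"))
```

The keyword query returns the interview as well as the game highlight because both mention the word; the attribute query returns only the asset that actually is a touchdown event involving the named team.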
HP 72
Taalee's Semantic Search
Highly customizable, precise and freshest A/V search
Context and Domain Specific Attributes
Uniform Metadata for Content from Multiple Sources; can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do?
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY & CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content: Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries & Hot Topics
PERSONALIZATION
Personalized Queries
1 My Stock Portfolio
  Microsoft suffers serious hack attack
  Cisco Systems Inc
  Analyst Safa Rashtchy on Yahoo
  PeopleSoft Inc
  AT&T Corp
2 My Football Fantasy Team
  Gators' Spurrier ready for big game
  Tech's Vick looks to become complete QB
  Bucs excited about Hamilton
  Jasper Sanks rumbles into the end zone…
  Edwards explains reasons for leaving BYU
3 Julia Roberts Collection
  Movie Trailer: Notting Hill
  Trailer - Runaway Bride
  Patrick
  Movie Trailer: Stepmom
  Conspiracy Theory
4 Pink Floyd Collection
  Set the Controls for the Heart of the Sun…
  Wish You Were Here
  Round And Around
  Keep Talking
  The Post War Dream
HOT Topics
1 Election 2000
  Video: Explaining the electoral map
  Race for White House hots up
  Seniors Give Gore Florida Edge
2 Middle East Peace Conflict
  More die as Israel steps up security
  Israel braces for suicide bombs
  Pentagon probes Cole's security
3 Napster Controversy
  The Brain Behind Napster
  Napster Lawsuit
  Creative Nomad II
HP 80
Metadata Targeting
Semantic/Interactive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web: Extreme Personalization
Realtime Feeds
Interests, Preferences
Time-Shifted Content Aggregator
Web Sites and Pages
Content Databases
Content
Semantic Engine™
Personalized Content
Structured Hi-Quality Semantic Metabase
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site.
User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories.
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide).
Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu.
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available.
The user clicks to see a listing of Analyst Calls on Cisco (next slide).
Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst.
HP 87
iTV: Taalee's Extreme Personalization
Content Provider (DBS, DISH, Wink, AOL-TV)
Semantic Engine™
Meta-Data Tagged Content
Content "Programs"
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-Quality Semantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.
This screen is customizable with interactivity features using metadata, such as whether there is a new Conference Call video on CSCO.
Part of the screen can be automatically customized to show conference call specific information, including transcript, participation, etc., all of which are relevant metadata.
The Conference Call itself can have embedded metadata to support personalization and interactivity.
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support
Sony
Categorize
Catalog
Integrate
Collection, Processing
Network Content
Affiliate Feeds
Public Sources
Rich Data Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp. (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
-- Breaking News for 11/30/2000 --
Customize Page Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) – Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit and one similar to it from Martin County demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday, the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore
HP 92
Metadata's role in emerging iTV infrastructure
Retrieve Scene Description Track
Enhanced Digital Cable
Video
MPEG Decoder
Node = AVO Object
Create Scene Description Tree
GREAT USER EXPERIENCE
MPEG-2/4/7
MPEG Encoder
Scene Description Tree
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers and Broadcasters
Enhanced XML Description
"Cisco Systems" Node
Taalee Semantic Engine
"Cisco Systems"
Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: John Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta
Object Content Information (OCI)
Metadata-rich Value-added Node
HP 93
Metadata for Intelligent Content
Intelligent Metadata Creation
Extractor Agents: Content which does contain the words the user asked for
+ Value-added Metadata: Content which does not contain the words the user asked for, but is about what he asked for
+ Semantic Associations: Content the user did not think to ask for, but which he needs to know
Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example: If a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"
Understanding of the content is needed to create new metadata
Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
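A minimal sketch of the enrichment step described above: metadata extracted from the asset is combined with associations held in a knowledge base, so fields that never appear in the text still become searchable. The dictionary layout and the enrich function are assumptions for illustration and say nothing about how Taalee's engine is actually built.

```python
# Hypothetical knowledge base of semantic associations, keyed by player name.
KNOWLEDGE_BASE = {
    "Roger Clemens": {"Sport": "Baseball", "Team": "New York Yankees"},
    "Jamal Anderson": {"Sport": "Football", "Team": "Atlanta Falcons", "League": "NFL"},
}


def enrich(extracted):
    """Return a copy of the extracted metadata plus value-added fields."""
    enriched = {field: list(values) for field, values in extracted.items()}
    for player in extracted.get("Players", []):
        for field, value in KNOWLEDGE_BASE.get(player, {}).items():
            values = enriched.setdefault(field, [])
            if value not in values:
                values.append(value)
    return enriched


def matches(metadata, term):
    """True if any metadata value equals the search term."""
    return any(term in values for values in metadata.values())


# The story text mentions only the player, so only "Players" is extracted.
story = {"Players": ["Roger Clemens"]}
enriched_story = enrich(story)

print(matches(story, "New York Yankees"))
print(matches(enriched_story, "New York Yankees"))
```

Before enrichment a search for the team misses the story; after enrichment the same search finds it, which is the behavior the Roger Clemens example calls for.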
HP 96
Guided Demo for Value Added Metadata – Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata – Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 – Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 – Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content – Value-Added Metadata
Rich Media Sports Asset
Posted Date: Date of asset posting – Extracted automatically
Producer Name: Name of content provider that produced the asset
Player Names: Names of players mentioned explicitly in the asset – Extracted automatically
Team Name: Name of team for which player plays – Not mentioned explicitly in asset – Value-added using Taalee's semantic relationships
League Name: Name of league to which the player's team belongs – Not mentioned explicitly in asset – Value-added by Taalee's processing based on semantic associations
Sport: Name of sport
Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships
The asset is richly and fully described in the many ways the users choose to interact
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context, or relationships of keywords
For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
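The Commerce One example can be sketched as a lookup over typed associations: a search returns what was asked for plus stories about entities linked through a "competes with" relationship. The triple layout, relation names, story records, and the choice to treat competition as symmetric are all assumptions for illustration.

```python
# Hypothetical fragment of a world model: (subject, relation, object).
ASSOCIATIONS = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Ariba", "is_a", "COMPANY"),
    ("Commerce One", "participates_in", "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]

# Made-up catalog entries, each tagged with the entity it is about.
STORIES = [
    {"title": "Story on Commerce One", "about": "Commerce One"},
    {"title": "Story on Ariba", "about": "Ariba"},
    {"title": "Story on an unrelated firm", "about": "Other Co"},
]


def competitors(company):
    """Entities linked by competes_with, treated as symmetric (an assumption)."""
    found = set()
    for subject, relation, obj in ASSOCIATIONS:
        if relation != "competes_with":
            continue
        if subject == company:
            found.add(obj)
        if obj == company:
            found.add(subject)
    return found


def search(company):
    """Return what was asked for plus semantically related stories."""
    rivals = competitors(company)
    return {
        "asked_for": [s["title"] for s in STORIES if s["about"] == company],
        "semantically_related": [s["title"] for s in STORIES if s["about"] in rivals],
    }


print(search("Commerce One"))
```

A keyword index alone would return only the first list; the second list exists because the catalog knows how the entities relate, not because the stories share words.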
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (To view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata-centric Content Management Architecture
Internal Source 1 (Research)
Internal Source 2
External feeds/Web (e.g., Reuters)
Extractor Agent 1
Extractor Agent 2
Extractor Agent 3
Semantic Engine
World Model
Taalee Metabase
Semantic Application (ASP/Enterprise hosted)
Third-party Content Mgmt and Syndication
XCM-compliant metadata, XML or other format
Story on Cisco
Story on Lucent
1 Cisco story from PW Source 1 passed on to add semantic associations
2 Consults Knowledge Base for Cisco's competition
3 Returns result: Lucent is a competitor of Cisco
4 Lucent story from external feeds picked for publishing as "semantically related" to Cisco story – passed on to Dashboard
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What You Need to Know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
SEC and EPA Regulations: Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY
HP 70
Metadata Usage Keyword Attribute and Content Based Access
The VisualHarness system at LSDISUGA
HP 71
Keyword Search vs Attribute Search with Semantic metadata
Virage Search on football touchdown
Jimmy Smith Interview Part SevenJimmy Smith explains his philosophy on showboating URL httpcbssportsline
Brian Griese Interview Part FourBrian Griese talks about the first touchdown he ever threw URL httpcbssportsline
Metadata from Typical Cataloging of Football
Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31 Pit 24
httpwwwnflcom
Quandry Ismail and Tony Banks hook up for their third long touchdown this time on a 76-yarder to extend the Ravenrsquos lead to 31-24 in the third quarter
ProfessionalRavens SteelersBal 31 Pit 24Quandry Ismail Tony BanksTouchdownNFLcom2022000
LeagueTeamsScore
PlayersEvent
Produced byPosted date
HP 72
Taaleersquos Semantic Search
Highly customizable precise and freshest AV search
Context and Domain Specific Attributes Uniform Metadata for Content from Multiple Sources Can be sorted by any field
Delightful relevant informationexceptional targeting opportunity
HP 73
Cre
atin
g a
Web
of
rela
ted
info
rmat
ion
Wha
t can
a c
onte
xt d
o
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY amp CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh contentPersonalization andTargetinginteractive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries amp Hot Topics
PERSONALIZATION
3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
4 Pink Floyd Collection
Personalized Queries
Set the Controls for the Heart of the Sunhellip
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
ATampT Corp
2 My Football Fantasy Team
Gators Spurrier ready for big game
Techs Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zonehellip
Edwards explains reasons for leaving BYU morehellip
morehellip
morehellip
morehellip
1 Election 2000
2 Middle East Peace Conflict
3 Napster Controversy
Video Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Coles security
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II morehellip
HOT Topics
morehellip
morehellip
HP 80
Metadata Targeting
SemanticInteractive Targeting
Buy Al Pacino VideosBuy Russell Crowe VideosBuy Christopher Plummer VideosBuy Diane Venora VideosBuy Philip Baker Hall VideosBuy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
Realtime Feeds
Interests Preferences
Time-ShiftedContent Aggregator
Web sites and Pages
ContentDatabases Personalized
Content
Semantic EngineTM
Personalized Content
Content
Structured Hi-Quality
Semantic Metabase
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed WebBased registration and personalization at VoquettersquosEnterprise Customer site
Userrsquos ldquoWireless Home pagerdquo shows the categories for his interests There is an alert (new content) for his stock and sportscategories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down userrsquos Personal Portfolio list The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and the knowledge of userrsquos portfolio Typically search can be done by typing one word or selecting from a dynamic personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions listen set alert on story add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
1108 ON24 Payne1107 ON24 HampQ 1106 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listingsorted by date Semantic filtering uses just the right metadata to meet screen and other constrains Eg Analyst Call focuses onthe source and analyst name or company The icon denote additional metadata such as ldquoStrong Buyrdquo by HampQ Analyst
HP 87
iTV Taaleersquos Extreme Personalization
Content Provider
(DBS DISH Wink AOL-TV)
Semantic EngineTM
Meta-DataTagged Content
ContentldquoProgramsrdquo
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-QualitySemantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.
This screen is customizable with interactivity features, using metadata such as whether there is a new Conference Call video on CSCO.
Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata.
The Conference Call itself can have embedded metadata to support personalization and interactivity.
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support
Sony
Categorize
Catalog
Integrate
Collection
Processing
Network Content
Affiliate Feeds
Public Sources, Rich Data
Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by the enterprise's customers
• Greatly enhances newsroom productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore
HP 92
Metadata's role in emerging iTV infrastructure
Retrieve Scene Description Track
Enhanced Digital Cable
Video
MPEG Decoder
Node = AVO Object
Create Scene Description Tree
GREAT USER EXPERIENCE
MPEG-2/4/7
MPEG Encoder
Scene Description Tree
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers, and Broadcasters
Enhanced XML Description
"Cisco Systems"
Node
Taalee Semantic Engine
"Cisco Systems"
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta
Object Content Information (OCI)
Metadata-rich Value-added Node
HP 93
Intelligent Metadata Creation
Extractor Agents: content which does contain the words the user asked for
+
Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for
+
Semantic Associations: content the user did not think to ask for, but which he needs to know
Metadata for Intelligent Content Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.
• If a keyword is not in the content, the content cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".
Understanding of the content is needed to create new metadata.
Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (another example: COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely.
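The enrichment step described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `KNOWLEDGE_BASE` dictionary, the `enrich` and `matches` helpers, and the field names are hypothetical and are not Taalee's actual data model or API.

```python
from typing import Dict, List

# Hypothetical knowledge base: entity name -> semantic associations.
# The facts mirror the examples on the slides; the structure is an assumption.
KNOWLEDGE_BASE: Dict[str, Dict[str, str]] = {
    "Roger Clemens": {"sport": "Baseball", "team": "New York Yankees"},
    "Jamal Anderson": {"sport": "Football", "team": "Atlanta Falcons", "league": "NFL"},
}


def enrich(metadata: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Return a copy of the metadata with associated values added for each known player."""
    enriched = {field: list(values) for field, values in metadata.items()}
    for player in metadata.get("players", []):
        for relation, value in KNOWLEDGE_BASE.get(player, {}).items():
            bucket = enriched.setdefault(relation, [])
            if value not in bucket:
                bucket.append(value)
    return enriched


def matches(metadata: Dict[str, List[str]], query: str) -> bool:
    """Case-insensitive substring match of the query against every metadata value."""
    needle = query.lower()
    return any(needle in value.lower() for values in metadata.values() for value in values)


# Metadata extracted directly from a story that never mentions the team.
story = {"players": ["Roger Clemens"]}
print(matches(story, "Yankees"))          # search on the extracted metadata only
print(matches(enrich(story), "Yankees"))  # search after value-added enrichment
```

The extracted metadata alone does not match a "Yankees" query, while the enriched copy does, which is the effect the slide attributes to value-added metadata.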
HP 96
Guided Demo for Value-Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
  Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team = Atlanta Falcons or League = NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value-Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
  Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team = Los Angeles Dodgers or League = National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'.
Click on the first result for Jamal Anderson.
View the metadata. Note that Team name and League name are also included in the metadata.
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search.
HP 99
Example 2 - Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'.
Click on the first result for Gary Sheffield.
View the metadata. Note that Team name and League name are also included in the metadata.
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search.
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Posted Date: date of asset posting - extracted automatically
Producer Name: name of content provider that produced the asset
Player Names: names of players mentioned explicitly in the asset - extracted automatically
Team Name: name of team for which player plays - not mentioned explicitly in asset - value-added using Taalee's semantic relationships
League Name: name of league to which the player's team belongs - not mentioned explicitly in asset - value-added by Taalee's processing based on semantic associations
Sport: name of sport
Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.
The asset is richly and fully described in the many ways the users choose to interact.
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context, or relationships of keywords
For example, a search engine may see that the phrase "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Internal Source 1 (Research)
Internal Source 2
External feeds/Web (e.g., Reuters)
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults Knowledge Base for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard
Story on Lucent
Story on Cisco
XCM-compliant metadata, XML or other format
Semantic Application
ASP/Enterprise hosted
Extractor Agent 1
Extractor Agent 2
Extractor Agent 3
Metadata-centric Content Management Architecture
Semantic Engine
World Model
Taalee Metabase
Third-party Content Mgmt and Syndication
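The four numbered steps in this architecture (take a story, consult the knowledge base for the company's competitors, get the result, pick related stories from other feeds) can be sketched as follows. The `Story` class, the `COMPETITORS` table, and `related_stories` are hypothetical names introduced for illustration; the competitor facts are the ones stated on the slides, and nothing here reflects the real Semantic Engine interface.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class Story:
    headline: str
    company: str  # company the story is about (extracted metadata)
    source: str   # feed the story came from


# Hypothetical knowledge base of COMPETES WITH associations (facts from the slides).
COMPETITORS: Dict[str, Set[str]] = {
    "Cisco": {"Lucent"},
    "Commerce One": {"Ariba"},
}


def related_stories(story: Story, feed: List[Story]) -> List[Story]:
    """Steps 2-4: look up the story's competitors, then pick feed stories about them."""
    rivals = COMPETITORS.get(story.company, set())
    return [candidate for candidate in feed if candidate.company in rivals]


cisco_story = Story("Story on Cisco", "Cisco", "Internal Source 1")
external_feed = [
    Story("Story on Lucent", "Lucent", "External feed"),
    Story("Story on IBM", "IBM", "External feed"),
]

for item in related_stories(cisco_story, external_feed):
    print(f"Semantically related to {cisco_story.company}: {item.headline} ({item.source})")
```

A production system would presumably resolve company names to entities before the lookup; the sketch assumes the extractor agents have already done that.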
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic collation of semantically related digital media information from multiple sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for the future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
• Structural modeling is obviously not enough
• We need a "logic layer" on top of RDF
• Some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data.
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]
A personal definition
Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods.
What is DAML (DARPA Agent Markup Language)?
• A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
• Based on RDF + XML
• Agent-readable tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-Layered Architecture of the Semantic Web
Logical Layer
Formal Semantics and Reasoning Support - OIL, DAML-O
Schema Layer
Definition of Vocabulary - RDF Schema
Data Layer
Simple data model and syntax for metadata - RDF
OIL - as RDF Extension
    <rdfs:Class rdf:ID="herbivore">
      <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
      <rdfs:subClassOf rdf:resource="#animal"/>
      <rdfs:subClassOf>
        <oil:NOT>
          <oil:hasOperand rdf:resource="#carnivore"/>
        </oil:NOT>
      </rdfs:subClassOf>
    </rdfs:Class>
DAML and OIL - Evolving towards Semantic Web
OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear tests may cause earthquakes. Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region:
NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude,
                  Earthquake.latitude, Earthquake.longitude ) < 10000
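As a rough sketch, the "Causes" rule above can be written as an ordinary predicate. Several things here are assumptions rather than part of the slide: the `Event` class and function names are hypothetical, `distance` is taken to be great-circle distance in miles (computed with the haversine formula), and the date difference is required to be non-negative so that the earthquake follows the test, as the prose states.

```python
import math
from dataclasses import dataclass
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; units are an assumption


@dataclass(frozen=True)
class Event:
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees


def distance_miles(a: Event, b: Event) -> float:
    """Great-circle distance between two events using the haversine formula."""
    lat1, lat2 = math.radians(a.latitude), math.radians(b.latitude)
    dlat = lat2 - lat1
    dlon = math.radians(b.longitude - a.longitude)
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(h)))


def may_have_caused(test: Event, quake: Event,
                    max_days: int = 30, max_miles: float = 10000.0) -> bool:
    """Rule from the slide: quake within max_days after the test and within max_miles of it."""
    days_after = (quake.event_date - test.event_date).days
    return 0 <= days_after < max_days and distance_miles(test, quake) < max_miles


test_event = Event(date(1970, 1, 1), 37.0, -116.0)
quake_event = Event(date(1970, 1, 11), 36.0, -117.0)
print(may_have_caused(test_event, quake_event))
```

The thresholds are kept as parameters so the same predicate can be evaluated with a tighter radius or time window.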
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example...
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale per year, starting from 1900, find the average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
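The per-year grouping by magnitude range could be computed along these lines. This is a minimal sketch: the input format (pairs of year and magnitude), the function names, and the treatment of range boundaries (half-open intervals, with 9.0 and above placed in the last group) are assumptions, and the sample records are made up purely to exercise the code.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple

# Magnitude ranges named on the slide; boundaries treated as [low, high).
BINS = [(5.8, 6.0, "5.8-6"), (6.0, 7.0, "6-7"), (7.0, 8.0, "7-8"), (8.0, 9.0, "8-9")]


def magnitude_bin(magnitude: float) -> Optional[str]:
    """Return the label of the range a magnitude falls in, or None if below 5.8."""
    if magnitude < 5.8:
        return None
    for low, high, label in BINS:
        if low <= magnitude < high:
            return label
    return ">9"  # assumption: 9.0 and above fall in the top group


def counts_per_year(quakes: Iterable[Tuple[int, float]], start_year: int = 1900) -> Counter:
    """Count earthquakes per (year, magnitude range) from start_year onward."""
    counts: Counter = Counter()
    for year, magnitude in quakes:
        label = magnitude_bin(magnitude)
        if year >= start_year and label is not None:
            counts[(year, label)] += 1
    return counts


# Made-up sample records (year, magnitude), for illustration only.
sample = [(1950, 6.1), (1950, 6.5), (1950, 7.2), (1899, 8.0), (1960, 5.0), (1960, 9.5)]
for (year, label), count in sorted(counts_per_year(sample).items()):
    print(year, label, count)
```

Averaging the counts for one range over a span of years then reduces to summing the matching entries and dividing by the number of years in the span.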
Knowledge Discovery - Example...
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted, and the test was conducted within a radius of 10000 miles from the epicenter of the earthquake
Demo
ResourcesReferences
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 71
Keyword Search vs. Attribute Search with Semantic Metadata
Virage search on: football touchdown
Jimmy Smith Interview Part Seven
Jimmy Smith explains his philosophy on showboating. URL: httpcbssportsline
Brian Griese Interview Part Four
Brian Griese talks about the first touchdown he ever threw. URL: httpcbssportsline
Metadata from Typical Cataloging of Football Assets
Taalee Metadata on Football Assets
Rich Media Reference Page
Baltimore 31, Pit 24
http://www.nfl.com
Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder, to extend the Ravens' lead to 31-24 in the third quarter.
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Quandry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
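The contrast between the two kinds of search can be illustrated with a small sketch. The asset records reuse the titles and descriptions shown on this slide; the dictionary layout, the lowercase field names, the `keyword_search` and `attribute_search` helpers, and the "Interview" event value for the Griese clip are assumptions added for illustration.

```python
from typing import Any, Dict, List

ASSETS: List[Dict[str, Any]] = [
    {
        "title": "Brian Griese Interview Part Four",
        "description": "Brian Griese talks about the first touchdown he ever threw",
        # Assumed metadata; the slide does not show attributes for this clip.
        "metadata": {"event": "Interview", "players": ["Brian Griese"]},
    },
    {
        "title": "Baltimore 31 Pit 24",
        "description": "Quandry Ismail and Tony Banks hook up for their third long touchdown",
        "metadata": {
            "event": "Touchdown",
            "teams": ["Ravens", "Steelers"],
            "players": ["Quandry Ismail", "Tony Banks"],
        },
    },
]


def keyword_search(assets: List[Dict[str, Any]], keyword: str) -> List[str]:
    """Syntactic search: return titles whose title or description contains the keyword."""
    needle = keyword.lower()
    return [a["title"] for a in assets
            if needle in (a["title"] + " " + a["description"]).lower()]


def attribute_search(assets: List[Dict[str, Any]], **criteria: str) -> List[str]:
    """Attribute search: every named metadata field must equal or contain the wanted value."""
    def has(asset: Dict[str, Any], field: str, wanted: str) -> bool:
        value = asset["metadata"].get(field)
        return wanted in value if isinstance(value, list) else value == wanted

    return [a["title"] for a in assets
            if all(has(a, field, wanted) for field, wanted in criteria.items())]


print(keyword_search(ASSETS, "touchdown"))          # matches any mention of the word
print(attribute_search(ASSETS, event="Touchdown"))  # matches only actual touchdown clips
```

The keyword query returns the interview as well as the game clip because both mention the word, while the attribute query returns only the asset whose event is a touchdown.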
HP 72
Taalee's Semantic Search
Highly customizable, precise, and freshest A/V search
Context- and domain-specific attributes; uniform metadata for content from multiple sources; can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY amp CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content: Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries amp Hot Topics
PERSONALIZATION
3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
4 Pink Floyd Collection
Personalized Queries
Set the Controls for the Heart of the Sun...
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
AT&T Corp
2 My Football Fantasy Team
Gators' Spurrier ready for big game
Tech's Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zone...
Edwards explains reasons for leaving BYU
more...
more...
more...
more...
1 Election 2000
2 Middle East Peace Conflict
3 Napster Controversy
Video Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Cole's security
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II
more...
HOT Topics
more...
more...
HP 80
Metadata Targeting
Semantic/Interactive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web: Extreme Personalization
Realtime Feeds
Interests, Preferences
Time-Shifted Content Aggregator
Web sites and Pages
Content Databases
Personalized Content
Semantic Engine(TM)
Personalized Content
Content
Structured Hi-Quality Semantic Metabase
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
The user has already completed Web-based registration and personalization at Voquette's Enterprise Customer site.
The user's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories.
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide).
Search at the bottom is a semantic search that understands the financial domain and knows the user's portfolio. Typically, a search can be done by typing one word or selecting from a dynamic, personalized menu.
My Stocks
CSCO
NT
IBM
Market
HP 104
Internal Source 1Research
Internal Source 2
External feedsWeb(eg Reuters)
1
2
3
4
Cisco story from PW Source 1passed on to addsemanticassociations
ConsultsKnowledgeBasefor Ciscorsquoscompetition
Returns resultLucent is a competitor of Cisco
Lucent story from external
feeds picked for publishing as ldquosemantically
relatedrdquo to Ciscostory ndash passed
on to Dashboard
Story onLucent
Story onCisco
XCM-compliant metadata XML or other format
SemanticApplication
ASPEnterprise hosted
Extractor Agent 1
Extractor Agent 2
Extractor Agent 3
Metadata centricContent Management Architecture
SemanticEngine
World Model
TaaleeMetabase
Third-partyContent Mgmt
AndSyndication
HP 105
Semantic Associationssupported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANYCOMPANYRelated Stock News
Related Stock News
IndustryNews
IndustryNews
CompetitionCompetitionCOMPANIES inINDUSTRY with Competing PRODUCTSCOMPANIES in Same or
Related INDUSTRY
SECEPAEPA
RegulationsRegulationsImpacting INDUSTRY or Filed By COMPANY
Technology Products
Technology ProductsImportant to INDUSTRY or COMPANY
HP 106
Semantic Web Application ExampleFinancial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic SearchPersonalization etc
A vision for future
Semantic Web Complex Relationships and Knowledge Discovery
Eg InfoQuilt project at LSDIS Lab Univ of Georgia
HP 108
Beyond RDF ndash one proposal (cf Ora Lassila)
Structural modeling obviously not enoughwe need a ldquologic layerrdquo on top of RDFsome type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful particularly if we have logicrules which allow us to draw inference from this data
RDF + DL = ldquoFrame System for WWWrdquo
Source wwwontoknowledgeorgoil
HP 109
Semantic Web - next step in Web evolution
ldquoA Web in which machine reasoning will be ubiquitous and devastatingly powerfulrdquo [Berners-Lee]
ldquoA place where the whim of a human being and the reasoning of a machine coexist in an ideal powerful mixturerdquo [Berners-Lee]
ldquoA semantic Web would permit more accurate and efficient Web searches which are among the most important Web-based activitiesrdquo [Berners-Lee]
A personal definitionSemantic Web The concept that Web-accessible
content can be organized semantically rather than though syntactic and structural methods
What is DAML (DARPA Agent Markup Language)
a proposal to create technologies that will enable software agents to dynamically identify and understand information sources and to provide interoperability between agents in a semantic mannerBased on RDF+XMLAgent readable Tags
wwwdamlorg
DAML Example
Sou
rce
http
w
ww
zdn
etc
omp
cwee
kst
orie
sju
mps
04
270
2432
946
00h
tml
Three layered Architecture Of Semantic Web
Logical Layer
Formal Semantics and Reasoning Support ndash OIL DAML-O
Schema Layer
Definition of Vocabulary RDF Schema
Data Layer
Simple data model and syntax for metadata - RDF
OIL ndash as RDF Extension
ltrdfsClass rdfID=rdquoherbivorerdquogtltrdftype
rdfresource=rdquohttpwwwontoknowledgeorgDefinedClassrdquogtltrdfssubClassOf rdfresource=rdquoanimalrdquogtltrdfssubClassOfgt
ltoilNOTgtltoilhasOperand rdfresource=rdquocarnivorerdquogt
ltoilNOTgtltrdfssubClassOfgt
ltrdfsClassgt
DAML and OIL ndash Evolving towards Semantic Web
OIL MissionOIL is a Web-based representation and inference layer for ontologies which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery Knowledge Discovery --ExampleExample
Earthquake Sources(USGS NEIC)
Nuclear Test Sources(Oklahoma Observatory etc)
Nuclear Test May Cause Earthquakes
Is it really true
Complex RelationshipsComplex Relationships
A nuclear test could have caused an earthquakeif the earthquake occurred some time after thenuclear test was conducted and in a nearby region
NuclearTest Causes Earthquakelt= dateDifference( NuclearTesteventDate
EarthquakeeventDate ) lt 30AND distance( NuclearTestlatitude
NuclearTestlongitudeEarthquakelatitudeEarthquakelongitude ) lt 10000
Knowledge Discovery Knowledge Discovery --ExampleExample
When was the first recorded nuclear test conducted
1950Find the total number of earthquakes with a magnitude58 or higher on the Richter scale per year starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery Knowledge Discovery --ExamplehellipExamplehellip
For each group of earthquakes with magnitudes in the ranges58-6 6-7 7-8 8-9 and gt9 on the Richter scale per yearstarting from 1900 find average number of earthquakes
Number of earthquakes with magnitude gt 7 almost constant So nuclear tests probably only cause earthquakes with magnitude lt 7
KnowledgeKnowledge DiscoveryDiscovery --ExampleExamplehelliphellip
Find pairs of nuclear tests and earthquakes such that the earthequakeoccurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake
Demo
ResourcesReferences
RDFwwww3orgTRREC-rdf-syntaxICE wwwicestandardorgMeta Object Facility (MOF) Specification Version 13 September 27 1999 httpcgiomgorgcgi-bindocad99-09-05XML Metadata Interchange (XMI) Specification Version 11 October 25 1999 httpcgiomgorgcgi-bindocad9910-02httpcgiomgorgcgi-bindocad99-10-03DAML wwwdamlorgNEWSML newsshowcasereuterscomPRISM wwwprismstandardorgtechdevprismspec1aspXCM wwwvignettecomOIL wwwontoknowledgeorgoilSEMANTICWEB wwwsemanticweborgVOICEXML wwwvoicexmlorgMPEG7 wwwdarmstadtgmddemobileMPEG7Taalee wwwtaaleecomOingo wwwoingocom
Multimedia Data Management Using Metadata to Integrate and Apply Digital Media Amit Sheth and Wolfgang Klas Eds McGraw Hill ISBN 0-07-057735-8 1998
HP 72
Taalee's Semantic Search
Highly customizable, precise, and freshest A/V search
Context and Domain Specific Attributes
Uniform Metadata for Content from Multiple Sources
Can be sorted by any field
Delightful, relevant information; exceptional targeting opportunity
HP 73
Creating a Web of related information
What can a context do?
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY & CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic applications for highly relevant and fresh content: personalization and targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you.
Please enter such semantic keywords below.
Change Context
Personalized Queries & Hot Topics
PERSONALIZATION
Personalized Queries
1 My Stock Portfolio
  Microsoft suffers serious hack attack
  Cisco Systems Inc.
  Analyst Safa Rashtchy on Yahoo
  PeopleSoft Inc.
  AT&T Corp.
  more…
2 My Football Fantasy Team
  Gators' Spurrier ready for big game
  Tech's Vick looks to become complete QB
  Bucs excited about Hamilton
  Jasper Sanks rumbles into the end zone…
  Edwards explains reasons for leaving BYU
  more…
3 Julia Roberts Collection
  Movie Trailer: Notting Hill
  Trailer - Runaway Bride
  Patrick
  Movie Trailer: Stepmom
  Conspiracy Theory
  more…
4 Pink Floyd Collection
  Set the Controls for the Heart of the Sun…
  Wish You Were Here
  Round And Around
  Keep Talking
  The Post War Dream
  more…
HOT Topics
1 Election 2000
  Video: Explaining the electoral map
  Race for White House hots up
  Seniors Give Gore Florida Edge
  more…
2 Middle East Peace Conflict
  More die as Israel steps up security
  Israel braces for suicide bombs
  Pentagon probes Cole's security
  more…
3 Napster Controversy
  The Brain Behind Napster
  Napster Lawsuit
  Creative Nomad II
  more…
HP 80
Metadata Targeting
Semantic/Interactive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
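The targeting idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `asset` dictionary layout and the `targeted_offers` function are hypothetical and are not Taalee's actual data model or API. It shows how structured metadata (a title plus a cast list) could yield one offer per entity.

```python
from typing import List


def targeted_offers(asset: dict) -> List[str]:
    """Build one offer per cast member, then one for the title itself.

    The field names "cast" and "title" are assumptions for illustration.
    """
    offers = [f"Buy {name} Videos" for name in asset.get("cast", [])]
    title = asset.get("title")
    if title:
        offers.append(f"Buy {title} Video")
    return offers


# Structured metadata for the asset shown on the slide.
insider = {
    "title": "The Insider",
    "cast": [
        "Al Pacino",
        "Russell Crowe",
        "Christopher Plummer",
        "Diane Venora",
        "Philip Baker Hall",
    ],
}
offers = targeted_offers(insider)
```

Because the offers are derived from metadata fields rather than page keywords, the same function would work for any asset that carries these fields.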
HP 82
Web: Extreme Personalization
[Diagram labels: Realtime Feeds; Web sites and Pages; Content Databases; Time-Shifted Content Aggregator; Interests, Preferences; Semantic Engine™; Structured Hi-Quality Semantic Metabase; Personalized Content]
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
[Screen menu: MyStocks, News, Sports, Music, MyMedia, $]
The user has already completed Web-based registration and personalization at Voquette's enterprise customer site.
The user's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories.
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
[Screen menu: MyStocks, News, Sports, Music, MyMedia, $]
[Screen list: My Stocks: CSCO, NT, IBM, Market]
Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide).
Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu.
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
[Screen menu: MyStocks, News, Sports, Music, MyMedia, $]
[Screen list: My Stocks: CSCO, NT, IBM, Market]
[Screen list: CSCO: Analyst Call, Conf Call, Earnings]
Different types of recent audio content about Cisco are available.
The user clicks to see a listing of Analyst Calls on Cisco (next slide).
Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
[Screen menu: MyStocks, News, Sports, Music, MyMedia, $]
[Screen list: My Stocks: CSCO, NT, IBM, Market]
[Screen list: CSCO: Analyst Call, Conf Call, Earnings]
[Screen list: CSCO Analysis: 11/08 ON24 Payne; 11/07 ON24 H&Q; 11/06 CBS Langlesis]
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., an Analyst Call listing focuses on the source and the analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst.
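One way to picture semantic filtering for a small screen is a per-content-type field priority list that is cut off at whatever the display can hold. The sketch below is a hedged illustration, not the product's implementation: `FIELD_PRIORITY`, the field names, and `filter_for_screen` are all assumptions introduced here.

```python
from typing import Dict, List

# Hypothetical per-type field priorities, most important field first.
FIELD_PRIORITY: Dict[str, List[str]] = {
    "Analyst Call": ["date", "source", "analyst", "rating"],
    "Conf Call": ["date", "source", "company"],
}


def filter_for_screen(item: Dict[str, str], max_fields: int) -> str:
    """Keep only the highest-priority metadata fields that fit the screen."""
    priority = FIELD_PRIORITY.get(item.get("type", ""), [])
    if not priority:
        # Unknown content type: fall back to the item's own field order.
        priority = [key for key in item if key != "type"]
    chosen = [item[field] for field in priority if field in item][:max_fields]
    return " ".join(chosen)


call = {
    "type": "Analyst Call",
    "company": "CSCO",
    "date": "11/08",
    "source": "ON24",
    "analyst": "Payne",
}
line = filter_for_screen(call, max_fields=3)  # "11/08 ON24 Payne"
```

The company ticker is dropped here because the listing is already scoped to CSCO, which matches the slide's remark that an Analyst Call entry focuses on source and analyst.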
HP 87
iTV: Taalee's Extreme Personalization
[Diagram labels: Content Provider (DBS, DISH, Wink, AOL-TV); Content "Programs"; Meta-Data Tagged Content; Semantic Engine™; Structured Hi-Quality Semantic Metabase; Immediate Interests; Preferences; Personalized Content Capsules; Redirects and Programming]
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.
This screen is customizable, with an interactivity feature using metadata such as whether there is a new Conference Call video on CSCO.
Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata.
The Conference Call itself can have embedded metadata to support personalization and interactivity.
HP 89
Metadata in Enterprise Apps
[Diagram labels: Network Content; Affiliate Feeds; Public Sources; Collection; Processing (Categorize, Catalog, Integrate); Rich Data Metabase; Production Support (Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication); Sony]
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp. (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were
-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
Sixty Die In Nigeria Blast
Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) – Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday, the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore
HP 92
Metadata's role in emerging iTV infrastructure
[Diagram labels: Video; MPEG Encoder (MPEG-2/4/7); Create Scene Description Tree; Scene Description Tree; Node = AVO Object; "Cisco Systems" Node; Taalee Semantic Engine; Enhanced XML Description; Metadata-rich Value-added Node; Object Content Information (OCI); Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; GREAT USER EXPERIENCE]
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers and Broadcasters
Example metadata: Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: John Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta
HP 93
Metadata for Intelligent Content
Intelligent Metadata Creation
Extractor Agents: content which does contain the words the user asked for
+ Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for
+ Semantic Associations: content the user did not think to ask for, but which he needs to know
Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.
• If a keyword is not in the content, it cannot be found.
• The burden is on the user to think of and ask for the "right" keyword.
For example, if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".
Understanding of the content is needed to create new metadata.
Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses such Semantic Associations (another example: COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely.
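The enrichment step described above can be sketched as a lookup against a small knowledge base. This is a minimal illustration under stated assumptions, not Taalee's patented extraction technology: the dictionaries `PLAYER_TEAM` and `TEAM_LEAGUE`, the field names, and both functions are hypothetical, and the facts are limited to the players named in this tutorial.

```python
from typing import Dict, List

# Hypothetical toy knowledge base (player -> team, team -> league).
PLAYER_TEAM: Dict[str, str] = {
    "Roger Clemens": "New York Yankees",
    "Jamal Anderson": "Atlanta Falcons",
    "Gary Sheffield": "Los Angeles Dodgers",
}
TEAM_LEAGUE: Dict[str, str] = {
    "New York Yankees": "American League",
    "Atlanta Falcons": "NFL",
    "Los Angeles Dodgers": "National League",
}


def add_value_added_metadata(extracted: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Return a copy of the extracted metadata with team and league added."""
    enriched = {field: list(values) for field, values in extracted.items()}
    teams = enriched.setdefault("Teams", [])
    leagues = enriched.setdefault("League", [])
    for player in extracted.get("Players", []):
        team = PLAYER_TEAM.get(player)
        if team is None:
            continue  # unknown player: nothing to add
        if team not in teams:
            teams.append(team)
        league = TEAM_LEAGUE.get(team)
        if league and league not in leagues:
            leagues.append(league)
    return enriched


def matches(metadata: Dict[str, List[str]], query: str) -> bool:
    """Plain keyword match over all metadata values."""
    needle = query.lower()
    return any(needle in value.lower() for values in metadata.values() for value in values)


story = {"Players": ["Roger Clemens"]}           # what extraction alone finds
enriched = add_value_added_metadata(story)       # adds Teams and League
```

With only the extracted metadata, a search for "Yankees" misses the story; after enrichment the same keyword search finds it, which is the behavior the slide describes.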
HP 96
Guided Demo for Value-Added Metadata – Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson.
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page.
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL").
• View the source HTML page that has the original story and locate the story with the heading "Week 3 top 10 Anderson TD run".
• Verify that Team = Atlanta Falcons and League = NFL were not present in the source content.
• Taalee attached this value-added metadata to the asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who plays for the Atlanta Falcons.
• Note that other search engines and directories will not be able to do this.
HP 97
Guided Demo for Value-Added Metadata – Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield.
• Click on the first result (titled "I want out") & view the metadata on the following RMR page.
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL").
• View the source HTML page that has the original story and locate the story with the heading "I want out".
• Verify that Team = Los Angeles Dodgers and League = National League were not present in the source content.
• Taalee attached this value-added metadata to the asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who plays for the Los Angeles Dodgers.
• Note that other search engines and directories will not be able to do this.
HP 98
Example 1 – Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'.
Click on the first result for Jamal Anderson.
View the metadata. Note that Team name and League name are also included in the metadata.
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search.
HP 99
Example 2 – Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'.
Click on the first result for Gary Sheffield.
View the metadata. Note that Team name and League name are also included in the metadata.
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search.
HP 100
Intelligent Content – Value-Added Metadata
Rich Media Sports Asset
Producer Name: name of the content provider that produced the asset
Posted Date: date of asset posting – extracted automatically
Player Names: names of players mentioned explicitly in the asset – extracted automatically
Sport: name of sport
Team Name: name of the team for which the player plays – not mentioned explicitly in the asset – value-added using Taalee's semantic relationships
League Name: name of the league to which the player's team belongs – not mentioned explicitly in the asset – value-added by Taalee's processing based on semantic associations
Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.
The asset is richly and fully described in the many ways users choose to interact.
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content.
• They do not understand the meaning, context, or relationships of keywords.
For example, a search engine may see that the phrase "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
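A small Python sketch can make the contrast with keyword search concrete. It is an illustration only: the triple store `TRIPLES`, the `NEWS` items and their titles, and the function names are hypothetical, and the relationship names simply echo the capitalized terms used above.

```python
from typing import Dict, List, Tuple

# Hypothetical toy knowledge base of (subject, relationship, object) triples.
TRIPLES: List[Tuple[str, str, str]] = [
    ("Commerce One", "IS_A", "COMPANY"),
    ("Ariba", "IS_A", "COMPANY"),
    ("Commerce One", "PARTICIPATES_IN", "Corporate, Professional & Financial Software"),
    ("Commerce One", "COMPETES_WITH", "Ariba"),
]

# Hypothetical catalog entries, each tagged with the companies it is about.
NEWS = [
    {"title": "Story on Commerce One", "companies": ["Commerce One"]},
    {"title": "Story on Ariba", "companies": ["Ariba"]},
    {"title": "Story on an unrelated company", "companies": ["Other Corp"]},
]


def associated(entity: str, relationship: str, symmetric: bool = False) -> List[str]:
    """Entities linked to `entity` by `relationship` (optionally in either direction)."""
    found = [o for s, r, o in TRIPLES if r == relationship and s == entity]
    if symmetric:
        found += [s for s, r, o in TRIPLES if r == relationship and o == entity]
    return found


def search_with_associations(company: str) -> Dict[str, List[str]]:
    """Return what was asked for plus news on the company's competitors."""
    competitors = associated(company, "COMPETES_WITH", symmetric=True)
    asked_for = [n["title"] for n in NEWS if company in n["companies"]]
    related = [n["title"] for n in NEWS
               if any(c in n["companies"] for c in competitors)]
    return {"asked_for": asked_for, "related": related}


result = search_with_associations("Commerce One")
```

Treating COMPETES_WITH as symmetric is a design choice made for this sketch, so that a search for Ariba also surfaces news on Commerce One.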
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata-centric Content Management Architecture
[Diagram labels: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters); Extractor Agents 1, 2, 3; Semantic Engine; World Model; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication; XCM-compliant metadata, XML or other format; Story on Cisco; Story on Lucent]
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults Knowledge Base for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story – passed on to Dashboard
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
SEC, EPA Regulations: Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for the future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF – one proposal (cf. Ora Lassila)
Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility.
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data.
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]
A personal definition: Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods.
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner.
Based on RDF + XML
Agent-readable tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-layered Architecture of Semantic Web
Logical Layer
Formal Semantics and Reasoning Support – OIL, DAML-O
Schema Layer
Definition of Vocabulary – RDF Schema
Data Layer
Simple data model and syntax for metadata – RDF
OIL – as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL – Evolving towards Semantic Web
OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Tests May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.
NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
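The rule above can be read as an executable predicate. The following Python sketch is one possible reading, with several assumptions that are not stated in the rule itself: the date difference is measured in days and must be non-negative (earthquake after test), the distance is in miles, and great-circle distance is computed with the haversine formula. The `Event` class and function names are hypothetical.

```python
import math
from dataclasses import dataclass
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


@dataclass
class Event:
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees


def distance_miles(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance via the haversine formula (an assumed choice)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, a)))


def could_have_caused(test: Event, quake: Event,
                      max_days: int = 30, max_miles: float = 10000) -> bool:
    """True when the quake follows the test within max_days and max_miles."""
    days = (quake.event_date - test.event_date).days
    near = distance_miles(test.latitude, test.longitude,
                          quake.latitude, quake.longitude) < max_miles
    return 0 <= days < max_days and near


# Made-up sample events, for illustration only.
test_event = Event(date(1970, 1, 1), 37.0, -116.0)
quake_soon = Event(date(1970, 1, 10), 36.0, -117.0)
quake_late = Event(date(1970, 3, 1), 36.0, -117.0)
```

Here `could_have_caused(test_event, quake_soon)` holds, while `quake_late` fails the 30-day condition. The predicate only expresses the candidate relationship; it does not establish causation.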
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900.
Increase in number of earthquakes since 1945
Knowledge Discovery - Example…
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes.
The number of earthquakes with magnitude > 7 is almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7.
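The grouping query above can be sketched as a small aggregation. This is an interpretation rather than the InfoQuilt implementation: "average number" is read as earthquakes per year within each magnitude range over a chosen span of years, the ranges are treated as half-open intervals (so a magnitude of exactly 9 falls in the last group), and the sample records are invented for illustration.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# (label, lower bound inclusive, upper bound exclusive); boundaries are an assumption.
BINS: List[Tuple[str, float, float]] = [
    ("5.8-6", 5.8, 6.0),
    ("6-7", 6.0, 7.0),
    ("7-8", 7.0, 8.0),
    ("8-9", 8.0, 9.0),
    (">9", 9.0, float("inf")),
]


def bin_label(magnitude: float) -> Optional[str]:
    """Return the range label for a magnitude, or None if below 5.8."""
    for label, low, high in BINS:
        if low <= magnitude < high:
            return label
    return None


def average_per_year(quakes: Iterable[Tuple[int, float]],
                     first_year: int, last_year: int) -> Dict[str, float]:
    """Average number of earthquakes per year in each magnitude range."""
    totals = {label: 0 for label, _, _ in BINS}
    for year, magnitude in quakes:
        if not first_year <= year <= last_year:
            continue
        label = bin_label(magnitude)
        if label is not None:
            totals[label] += 1
    n_years = last_year - first_year + 1
    return {label: total / n_years for label, total in totals.items()}


# Invented (year, magnitude) records, for illustration only.
sample = [(1900, 5.9), (1900, 6.5), (1901, 6.1), (1901, 7.2), (1899, 8.0), (1901, 5.0)]
averages = average_per_year(sample, 1900, 1901)
```

Running the same aggregation separately for the years before and after a cutoff would support the kind of comparison the slide draws between magnitude ranges.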
Knowledge Discovery - Example…
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and the test site lies within a radius of 10,000 miles of the epicenter of the earthquake.
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw-Hill, ISBN 0-07-057735-8, 1998.
HP 73
Cre
atin
g a
Web
of
rela
ted
info
rmat
ion
Wha
t can
a c
onte
xt d
o
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY amp CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh contentPersonalization andTargetinginteractive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries amp Hot Topics
PERSONALIZATION
3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
4 Pink Floyd Collection
Personalized Queries
Set the Controls for the Heart of the Sunhellip
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
ATampT Corp
2 My Football Fantasy Team
Gators Spurrier ready for big game
Techs Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zonehellip
Edwards explains reasons for leaving BYU morehellip
morehellip
morehellip
morehellip
1 Election 2000
2 Middle East Peace Conflict
3 Napster Controversy
Video Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Coles security
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II morehellip
HOT Topics
morehellip
morehellip
HP 80
Metadata Targeting
SemanticInteractive Targeting
Buy Al Pacino VideosBuy Russell Crowe VideosBuy Christopher Plummer VideosBuy Diane Venora VideosBuy Philip Baker Hall VideosBuy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
Realtime Feeds
Interests Preferences
Time-ShiftedContent Aggregator
Web sites and Pages
ContentDatabases Personalized
Content
Semantic EngineTM
Personalized Content
Content
Structured Hi-Quality
Semantic Metabase
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed WebBased registration and personalization at VoquettersquosEnterprise Customer site
Userrsquos ldquoWireless Home pagerdquo shows the categories for his interests There is an alert (new content) for his stock and sportscategories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide).
Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, a search can be done by typing one word or selecting from a dynamic personalized menu.
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available.
The user clicks to see a listing of Analyst Calls on Cisco (next slide).
Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., an Analyst Call listing focuses on the source and the analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst.
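The semantic filtering described on this slide can be sketched as choosing, per content type, the highest-priority metadata fields that still fit the display. This is only an illustration: the field names, the priority table, and the character budget are assumptions, not Taalee's actual schema or algorithm.

```python
# Hypothetical priority order of metadata fields per content type
# (assumed for illustration; not taken from Taalee's product).
FIELD_PRIORITY = {
    "analyst_call": ["date", "source", "analyst", "rating"],
    "earnings": ["date", "source", "quarter"],
}


def filter_for_screen(asset, content_type, max_chars):
    """Return the highest-priority field values whose joined text fits max_chars."""
    shown = []
    for field in FIELD_PRIORITY.get(content_type, []):
        value = asset.get(field)
        if value is None:
            continue  # this asset has no value for the field
        candidate = " ".join(shown + [value])
        if len(candidate) > max_chars:
            break  # the next field would overflow the screen budget
        shown.append(value)
    return " ".join(shown)


call = {"date": "11/07", "source": "ON24", "analyst": "H&Q", "rating": "Strong Buy"}
print(filter_for_screen(call, "analyst_call", 16))  # small screen
print(filter_for_screen(call, "analyst_call", 40))  # larger screen
```

On the small budget the rating is dropped from the text line, which matches the slide's idea of moving such metadata into an icon.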
HP 87
iTV: Taalee's Extreme Personalization
Content Provider
(DBS, DISH, Wink, AOL-TV)
Semantic Engine™
Meta-Data Tagged Content
Content "Programs"
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-Quality Semantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.
This screen is customizable with an interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO.
Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata.
The Conference Call itself can have embedded metadata to support personalization and interactivity.
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support
Sony
Categorize
Catalog
Integrate
Collection, Processing
Network Content
Affiliate Feeds
Public Sources, Rich Data
Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp. (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers
• Greatly enhances newsroom productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore
HP 92
Metadata's role in emerging iTV infrastructure
Video
MPEG-2/4/7
MPEG Encoder
Create Scene Description Tree
Scene Description Tree
Node = AVO Object
Enhanced XML Description
Taalee Semantic Engine
"Cisco Systems" Node
Metadata-rich Value-added Node
Object Content Information (OCI):
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta
Enhanced Digital Cable
MPEG Decoder
Retrieve Scene Description Track
GREAT USER EXPERIENCE
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers, and Broadcasters
HP 93
Metadata for Intelligent Content
Intelligent Metadata Creation
Extractor Agents: content which does contain the words the user asked for
+ Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for
+ Semantic Associations: content the user did not think to ask for, but which he needs to know
Usage
HP 94
Intelligent Content via
Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content:
• If a keyword is not in the content, the content cannot be found.
• The burden is on the user to think of and ask for the "right" keyword.
For example, if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".
Understanding of the content is needed to create new metadata.
Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses such Semantic Associations (another example: COMPANY participates in INDUSTRY) to add missing metadata that describes content more completely.
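A minimal sketch of the idea on this slide, assuming a tiny hand-built knowledge base: metadata extracted from a story is enriched with facts inferred from semantic associations, so a keyword search can match terms the story never mentions. The dictionaries, field names, and functions below are hypothetical and only illustrate the technique; they are not Taalee's implementation.

```python
# Toy knowledge base of semantic associations (facts from the slide's example).
PLAYS_FOR = {"Roger Clemens": "New York Yankees"}   # PERSON plays for TEAM
TEAM_SPORT = {"New York Yankees": "Baseball"}       # TEAM plays SPORT


def _add(metadata, field, value):
    """Append value to a metadata field, avoiding duplicates."""
    values = metadata.setdefault(field, [])
    if value not in values:
        values.append(value)


def enrich(metadata):
    """Return a copy of metadata with team and sport inferred from player names."""
    enriched = {key: list(values) for key, values in metadata.items()}
    for player in metadata.get("players", []):
        team = PLAYS_FOR.get(player)
        if team is None:
            continue  # nothing known about this player
        _add(enriched, "teams", team)
        sport = TEAM_SPORT.get(team)
        if sport:
            _add(enriched, "sports", sport)
    return enriched


def matches(metadata, query):
    """Plain keyword match over every metadata value."""
    q = query.lower()
    return any(q in value.lower() for values in metadata.values() for value in values)


story = {"players": ["Roger Clemens"]}     # extracted directly from the story
print(matches(story, "Yankees"))           # keyword search on raw metadata
print(matches(enrich(story), "Yankees"))   # same search after enrichment
```

The first search fails and the second succeeds, which is the gap between syntactic indexing and value-added metadata that the slide describes.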
HP 96
Guided Demo for Value-Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com, Posted Date: 9/20/2000, League: NFL
Teams: Atlanta Falcons, Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that neither Team = Atlanta Falcons nor League = NFL was present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value-Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN, Posted Date: 3/03/2001, League: National League
Teams: Los Angeles Dodgers, Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that neither Team = Los Angeles Dodgers nor League = National League was present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata, so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 - Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Posted Date: date of asset posting - extracted automatically
Producer Name: name of content provider that produced the asset
Player Names: names of players mentioned explicitly in the asset - extracted automatically
Sport: name of sport
Team Name: name of team for which the player plays - not mentioned explicitly in the asset - value-added using Taalee's semantic relationships
League Name: name of league to which the player's team belongs - not mentioned explicitly in the asset - value-added by Taalee's processing based on semantic associations
Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.
The asset is richly and fully described in the many ways users choose to interact with it.
HP 101
Intelligent Content via
Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context, or relationships of keywords
For example, a search engine may see that the phrase "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
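One way to picture such associations is as typed edges between entities that can be followed at query time. The sketch below is an assumption-laden illustration: the `AssociationGraph` class, the relation names, and the placeholder story IDs are invented for this example, and only the Commerce One facts come from the slide.

```python
from collections import defaultdict


class AssociationGraph:
    """Stores (subject, relation) -> set of objects; a hypothetical helper."""

    def __init__(self):
        self._edges = defaultdict(set)

    def add(self, subject, relation, obj, symmetric=False):
        self._edges[(subject, relation)].add(obj)
        if symmetric:  # e.g. competition holds in both directions
            self._edges[(obj, relation)].add(subject)

    def related(self, subject, relation):
        return sorted(self._edges.get((subject, relation), set()))


graph = AssociationGraph()
graph.add("Commerce One", "participates_in",
          "Corporate, Professional & Financial Software")
graph.add("Commerce One", "competes_with", "Ariba", symmetric=True)


def related_news(company, news_by_company):
    """Collect stories about the company's competitors (content not asked for)."""
    return {rival: news_by_company.get(rival, [])
            for rival in graph.related(company, "competes_with")}


# Placeholder story identifiers, not real headlines.
news = {"Ariba": ["story-1"], "Commerce One": ["story-2"]}
print(related_news("Commerce One", news))
```

A keyword index alone would return only `story-2`; following the `competes_with` edge is what surfaces the Ariba story.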
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata-centric Content Management Architecture
Sources: Internal Source 1 (Research), Internal Source 2, External feeds/Web (e.g., Reuters)
Extractor Agent 1, Extractor Agent 2, Extractor Agent 3
Semantic Engine, World Model, Taalee Metabase
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults Knowledge Base for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard
Story on Cisco, Story on Lucent
XCM-compliant metadata, XML, or other format
Semantic Application (ASP/Enterprise hosted)
Third-party Content Mgmt and Syndication
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked For + What You Need to Know
COMPANY
Related Stock News
Industry News: COMPANIES in Same or Related INDUSTRY
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS
Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic collation of semantically related digital media information from multiple sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for the future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility.
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data.
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]
A personal definition
Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods.
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-Layered Architecture of the Semantic Web
Logical Layer: formal semantics and reasoning support - OIL, DAML-O
Schema Layer: definition of vocabulary - RDF Schema
Data Layer: simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL - Evolving towards the Semantic Web
OIL Mission
OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Test May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.
NuclearTest Causes Earthquake <=
    dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30
    AND distance(NuclearTest.latitude, NuclearTest.longitude,
                 Earthquake.latitude, Earthquake.longitude) < 10000
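The rule above can be sketched as an executable predicate. Several things here are assumptions rather than part of the slide: the units (days and miles, taken from the later query slide), the use of a haversine great-circle formula for `distance`, the requirement that the earthquake not precede the test, and the sample events, whose dates and coordinates are made up.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance; an assumed stand-in for distance()."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """True if the quake follows the test within max_days and max_miles."""
    days_after = (quake["eventDate"] - test["eventDate"]).days
    if not 0 <= days_after < max_days:
        return False  # quake is before the test or too long after it
    return distance_miles(test["latitude"], test["longitude"],
                          quake["latitude"], quake["longitude"]) < max_miles


def candidate_pairs(tests, quakes):
    """All (test id, quake id) pairs satisfying the rule."""
    return [(t["id"], q["id"]) for t in tests for q in quakes
            if may_have_caused(t, q)]


# Made-up events for illustration only.
tests = [{"id": "T1", "eventDate": date(1970, 1, 1),
          "latitude": 37.0, "longitude": -116.0}]
quakes = [
    {"id": "Q1", "eventDate": date(1970, 1, 15), "latitude": 36.0, "longitude": -117.0},
    {"id": "Q2", "eventDate": date(1970, 3, 1), "latitude": 36.0, "longitude": -117.0},
    {"id": "Q3", "eventDate": date(1969, 12, 25), "latitude": 36.0, "longitude": -117.0},
]
print(candidate_pairs(tests, quakes))
```

Only Q1 qualifies: Q2 falls outside the 30-day window and Q3 precedes the test. A match under this rule is a candidate relationship to investigate, not evidence of causation.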
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example…
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes
Number of earthquakes with magnitude > 7 is almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
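The grouping query on this slide might look like the following sketch. The range boundaries are treated as lower-inclusive and upper-exclusive, which is an assumption (the slide does not say how boundary values are assigned), and the sample records are invented purely to exercise the code.

```python
from collections import Counter

# Magnitude ranges from the slide; the inclusive/exclusive bounds are assumed.
RANGES = [
    ("5.8-6", 5.8, 6.0),
    ("6-7", 6.0, 7.0),
    ("7-8", 7.0, 8.0),
    ("8-9", 8.0, 9.0),
    (">9", 9.0, float("inf")),
]


def magnitude_range(magnitude):
    """Return the label of the range containing magnitude, or None if below 5.8."""
    for label, low, high in RANGES:
        if low <= magnitude < high:
            return label
    return None


def yearly_counts(quakes, start_year=1900):
    """Count earthquakes per (year, magnitude range) from start_year onward."""
    counts = Counter()
    for year, magnitude in quakes:
        label = magnitude_range(magnitude)
        if year >= start_year and label is not None:
            counts[(year, label)] += 1
    return counts


def average_per_year(counts, label, first_year, last_year):
    """Average yearly count for one range over an inclusive span of years."""
    years = range(first_year, last_year + 1)
    return sum(counts[(year, label)] for year in years) / len(years)


# Invented (year, magnitude) records, not real seismic data.
sample = [(1899, 7.1), (1901, 5.9), (1901, 6.5), (1902, 6.1), (1902, 7.3), (1902, 5.0)]
counts = yearly_counts(sample)
print(average_per_year(counts, "6-7", 1901, 1902))
print(average_per_year(counts, "7-8", 1901, 1902))
```

Comparing these averages before and after a chosen year, per range, is the kind of check the slide's conclusion about magnitudes above and below 7 rests on.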
Knowledge Discovery - Example…
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas (Eds.), McGraw-Hill, ISBN 0-07-057735-8, 1998
Taalee Directory
Georgia Bulldogs
System recognizes ENTITY amp CATEGORY
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content: Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries amp Hot Topics
PERSONALIZATION
3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
4 Pink Floyd Collection
Personalized Queries
Set the Controls for the Heart of the Sunhellip
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
ATampT Corp
2 My Football Fantasy Team
Gators Spurrier ready for big game
Techs Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zonehellip
Edwards explains reasons for leaving BYU morehellip
morehellip
morehellip
morehellip
1 Election 2000
2 Middle East Peace Conflict
3 Napster Controversy
Video Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Coles security
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II morehellip
HOT Topics
morehellip
morehellip
HP 80
Metadata Targeting
SemanticInteractive Targeting
Buy Al Pacino VideosBuy Russell Crowe VideosBuy Christopher Plummer VideosBuy Diane Venora VideosBuy Philip Baker Hall VideosBuy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
Realtime Feeds
Interests Preferences
Time-ShiftedContent Aggregator
Web sites and Pages
ContentDatabases Personalized
Content
Semantic EngineTM
Personalized Content
Content
Structured Hi-Quality
Semantic Metabase
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed WebBased registration and personalization at VoquettersquosEnterprise Customer site
Userrsquos ldquoWireless Home pagerdquo shows the categories for his interests There is an alert (new content) for his stock and sportscategories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down userrsquos Personal Portfolio list The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and the knowledge of userrsquos portfolio Typically search can be done by typing one word or selecting from a dynamic personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions listen set alert on story add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
1108 ON24 Payne1107 ON24 HampQ 1106 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listingsorted by date Semantic filtering uses just the right metadata to meet screen and other constrains Eg Analyst Call focuses onthe source and analyst name or company The icon denote additional metadata such as ldquoStrong Buyrdquo by HampQ Analyst
HP 87
iTV Taaleersquos Extreme Personalization
Content Provider
(DBS DISH Wink AOL-TV)
Semantic EngineTM
Meta-DataTagged Content
ContentldquoProgramsrdquo
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-QualitySemantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that isused by personalization application to show only the stocksthat user is interested in
This screen is customizablewith interactivity featureusing metadata such as whetherthere is a new ConferenceCall video on CSCO
Part of the screen can beautomatically customized to show conference call specific informationndash including transcriptparticipation etc all of which arerelevant metadata
Conference Call itself can have embedded metadata to support personalization andinteractivity
HP 89
Metadata in Enterprise Apps
Filter Search ConsolidatePersonalize ArchiveLicensing Syndication
Production SupportProduction SupportSony
Categorize
Catalog
Integrate
CollectionCollection ProcessingProcessing
NetworkContent
AffiliateFeeds
Public Sources Rich Data
Metabase
HP 90
t
A leaking gasoline pipeline burst into flames Thursday killing more than 60 people near Nigerias commercial capital of Lagos Many of the dead were fisherman in wooden canoes engulfed in the inferno
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko facing the central business district of Lagos across a lagoon
At least 60 people died in this needless fire senior local official Karimu Alabi said
Fire crews from state-run Nigerian National Petroleum Corp (NNPC) which owns the pipeline were joined by other firemen from construction company Julius Berger in battling the blaze
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak ravaging a cluster of huts and log houses
At about the same time a second fire razed Makoko shantytown where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in thewooden huts for days Many victims of the Ebute-Oke fire were
Gore Demands That Recount Restart (940 PM)Gore Says Fla Cant Name Electors (450 PM)Bush Meets Colin Powell at Ranch (122 PM)
Market Tumbles on Earnings Warning (927 AM)Barak Outlines His Peace Plan (630 AM)
-- Breaking News for 11302000 --Customize Page Settings | Content | Layout | Color
Sixty Die In Nigeria BlastProduced by Euronews Posted Date 11302000 Event Election 2000 Location Tallahassee Florida USAPeople Al Gore George W Bush
Video
bull Value-add for production broadcast amp syndication
bull Taaleersquos semantic metadata enables powerful access to content used by Enterprisersquos customers
bull Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --Gore Demands That Recount Restart
Gore Says Fla Cant Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(133) ndash 120600 - ABC
(253) - 120600 - CBS
(516) - 120600 - ABC
(246) - 120600 - FOX
(133) - 120600 - NBC
(533) - 120600
(357) - 120600 - CBS
(427) - 120600 - ABC
(344) - 120600 - FOX
(724) - 120600 - CBS
(133) - 120600 - CBS
TALLAHASSEE Florida (CNN) ndashThough the two presidential candidates have until noon Wednesday to file briefs in Al Gores appeal to the Florida Supreme Court the outcome of two trials set on the same day in Leon County Florida may offer Gore his best hope for the presidency Democrats in Seminole County are seeking to have 15000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5000 votes statewide Lawyers for the plaintiff Harry Jacobs claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2126 incomplete absentee ballot applications sent in by GOP voters while refusing to allow Democratic workers to do the same thing for Democratic voters
The GOP says that suit and one similar to it from Martin County demonstrates Democratic Party politics at its most desperate Gore is not a party to either of those lawsuits On Tuesday the judge in the
(133) - 120600 - ABC
(233) - 120600 - CBS
(312) - 120600 - NNS
(032) - 120600 - CBS
(133) - 120600 - CBS
DescriptionProduced by CNNPosted Date 12072000 Reporter David Lewis Event Election 2000 Location Tallahassee Florida USAPeople Al Gore
HP 92
Retrieve Scene Description Track
Enhanced Digital Cable
Video
MPEGDecoder
Node = AVO Object
Create Scene Description Tree
GREATUSER
EXPERIENCE
Metadatarsquos role in emerging iTV infrastructure
MPEG-247MPEG
Encoder
SceneDescriptionTree
License metadata decoder and semantic applications to
device makers
Channel salesthrough Video Server Vendors
Video App Servers and Broadcasters
Enhanced XML
Description
ldquoCisco Systemsrdquo
Node
TaaleeSemanticEngine
ldquoCisco Systemsrdquo
Produced by Fox Sports Creation Date 12052000 League NFLTeams Seattle Seahawks
Atlanta Falcons Players John KitnaCoaches Mike Holmgren
Dan ReevesLocation Atlanta
Object Content Information (OCI)
Metadata-richValue-added Node
HP 93
Intelligent Metadata Creation
Content which doescontain the wordsthe user asked for
Extractor Agents
Content which does not contain the words
the user asked for but is about what he asked
for
Value-added Metadata
Content the user did not think to ask for but
which he needs to know
Semantic Associations
+ +
Metadata for Intelligent ContentMetadata for Intelligent Content
Usage
HP 94
Intelligent Contentvia
Value-Added Metadata
HP 95
Value-added MetadataTraditional methods rely solely on (syntactic) indexing of keywords to enable
users to access content
bull If a keyword is not in the content it cannot be found
bull The burden is on the user to think of and ask for the ldquorightrdquo keyword
For example If a story is about ldquoRoger Clemensrdquo but does not contain the
words ldquoNew York Yankeesrdquo that story cannot and will not be found if the user
searches for ldquoNew York Yankeesrdquo or ldquoYankeesrdquo
Understanding of the content is needed to create new metadata
Taalee understands Roger Clemens is a PERSON who Plays a SPORT called
Baseball for a TEAM from New York called the Yankees Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY)
to add missing metadata to describe content more completely
HP 96
Guided Demo for Value Added Metadata ndashExample one
bull Go to httpwwwmediaanywherecomFootballhtml amp search for Player = Jamal Anderson
bull Click on the first result (titled ldquoWeek 3 Top10 Anderson TD Runrdquo) and view the metadata
on the following RMR page
bull Here is what you see
Produced by NFLcom Posted Date 9202000 League NFL
Teams Atlanta Falcons Players Jamal Anderson
bull Now click on the button to play the asset (button marked ldquoREALrdquo)
bull View the source HTML page that has the original story and locate this story with the
heading ldquoWeek 3 top 10 Anderson TD runrdquo
bull Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user
searching for Atlanta Falcons will find this story on Jamal Anderson who is a player of
Atlanta Falcons team
bull Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata ndashExample Two
bull Go to httpwwwmediaanywherecomBaseballhtml amp search for Player = Gary Sheffield
bull Click on the first result (titled ldquoI want outrdquo) amp view the metadata on the following RMR page
bull Here is what you see
Produced by ESPN Posted Date 3032001 League National League
Teams Los Angeles Dodgers Players Gary Sheffield
bull Now click on the button to play the asset (button marked ldquoREALrdquo)
bull View the source HTML page that has the original story and locate this story with the
heading ldquoI want outrdquo
bull Verify that Team=Los Angeles Dodgers or League=National League was not present in
the source content
bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user
searching for Los Angeles Dodgers will find this story on Gary Sheffield who is a player of
Los Angeles Dodgers team
N t th t th h i d di t i ill t b bl t d thi
HP 98
Example 1 ndash Snapshots (ldquoJamal Andersonrdquo)
Search for lsquoJamal Andersonrsquo in lsquoFootballrsquo
Click on first result for Jamal Anderson
View metadata Note that Team name and League name are also included
in the metadata
View the original source HTML page Verify that
the source page contains no mention of Team nameand League name They
were Taaleersquos value-additions to the metadata to facilitate easier search
HP 99
Example 2 - Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on the first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name or League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Posted Date: date of asset posting - extracted automatically
Producer Name: name of content provider that produced the asset
Player Names: names of players mentioned explicitly in the asset - extracted automatically
Sport: name of sport
Team Name: name of team for which the player plays - not mentioned explicitly in the asset - value-added using Taalee's semantic relationships
League Name: name of league to which the player's team belongs - not mentioned explicitly in the asset - value-added by Taalee's processing based on semantic associations
Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships
The asset is richly and fully described in the many ways users choose to interact
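The chain from player to team to league on this slide can be sketched in a few lines. This is an illustration only, not Taalee's implementation: the lookup tables `PLAYS_FOR` and `BELONGS_TO`, the `Field` record and the `describe` function are hypothetical names, and the sketch assumes a single player per asset.

```python
from dataclasses import dataclass

# Hypothetical knowledge base; the facts mirror the two guided demos.
PLAYS_FOR = {
    "Jamal Anderson": "Atlanta Falcons",
    "Gary Sheffield": "Los Angeles Dodgers",
}
BELONGS_TO = {
    "Atlanta Falcons": "NFL",
    "Los Angeles Dodgers": "National League",
}


@dataclass(frozen=True)
class Field:
    name: str
    value: str
    origin: str  # "extracted" or "value-added"


def describe(extracted: dict) -> list:
    """Return the extracted fields plus fields implied by player -> team -> league."""
    fields = [Field(name, value, "extracted") for name, value in extracted.items()]
    team = PLAYS_FOR.get(extracted.get("Players"))
    if team is not None:
        fields.append(Field("Teams", team, "value-added"))
        league = BELONGS_TO.get(team)
        if league is not None:
            fields.append(Field("League", league, "value-added"))
    return fields


if __name__ == "__main__":
    for field in describe({"Produced by": "ESPN", "Players": "Gary Sheffield"}):
        print(f"{field.name}: {field.value} ({field.origin})")
```

Recording the origin of each field keeps the distinction the slide draws between metadata read from the asset and metadata added from the knowledge base.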
HP 101
Intelligent Content via
Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context, or relationships of keywords
For example, a search engine may see that the phrase "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
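A minimal sketch of how typed relationships let a search return associated content as well as what was asked for. The triple layout, the relation names, the news titles and the `competitors` and `search` functions are assumptions made for illustration; only the Commerce One and Ariba facts come from the slide.

```python
# Knowledge base as (subject, relation, object) triples.
TRIPLES = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Ariba", "is_a", "COMPANY"),
    ("Commerce One", "participates_in", "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]

# Hypothetical catalog: each story is tagged with the entity it is about.
NEWS = [
    {"title": "Commerce One quarterly update", "about": "Commerce One"},
    {"title": "Ariba product announcement", "about": "Ariba"},
    {"title": "Falcons game recap", "about": "Atlanta Falcons"},
]


def competitors(entity: str) -> set:
    """Entities linked by competes_with, read in both directions."""
    found = set()
    for subject, relation, obj in TRIPLES:
        if relation != "competes_with":
            continue
        if subject == entity:
            found.add(obj)
        elif obj == entity:
            found.add(subject)
    return found


def search(entity: str) -> dict:
    """Return stories about the entity plus stories about its competitors."""
    rivals = competitors(entity)
    return {
        "asked_for": [n["title"] for n in NEWS if n["about"] == entity],
        "associated": [n["title"] for n in NEWS if n["about"] in rivals],
    }


if __name__ == "__main__":
    print(search("Commerce One"))
```

A keyword index would stop at the `asked_for` list; the `associated` list exists only because the competes-with relationship is stored explicitly.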
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata-centric Content Management Architecture
Content sources: Internal Source 1 (Research), Internal Source 2, External feeds/Web (e.g. Reuters)
Components: Extractor Agents 1-3, Semantic Engine, World Model, Taalee Metabase, Semantic Application (ASP/Enterprise hosted), Third-party Content Mgmt and Syndication
Metadata format: XCM-compliant metadata, XML or other format
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults Knowledge Base for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard
Story on Cisco
Story on Lucent
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked For + What You Need to Know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with competing PRODUCTS; COMPANIES in same or related INDUSTRY
Regulations (SEC, EPA): impacting INDUSTRY or filed by COMPANY
Technology Products: important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic collation of semantically related digital media information from multiple sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for the future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling is obviously not enough:
• we need a "logic layer" on top of RDF
• some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic rules which allow us to draw inferences from this data
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]
A personal definition
Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-layered Architecture of the Semantic Web
Logical Layer: formal semantics and reasoning support - OIL, DAML-O
Schema Layer: definition of vocabulary - RDF Schema
Data Layer: simple data model and syntax for metadata - RDF
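To make the layering concrete, here is a small sketch that keeps instance statements (data layer) apart from vocabulary statements (schema layer) and derives additional types by following subclass links. It is a toy model under stated assumptions, not an RDF library: the tuples, the resource names `cow` and `lion`, the class `living_thing`, and the functions `superclasses` and `types_of` are all hypothetical. The negation and other constructs of the logical layer are not modeled here.

```python
# Data layer: simple statements about resources.
DATA = {
    ("cow", "type", "herbivore"),
    ("lion", "type", "carnivore"),
}

# Schema layer: vocabulary definitions (class hierarchy).
SCHEMA = {
    ("herbivore", "subClassOf", "animal"),
    ("carnivore", "subClassOf", "animal"),
    ("animal", "subClassOf", "living_thing"),  # hypothetical extra class
}


def superclasses(cls: str) -> set:
    """All classes reachable from cls by following subClassOf links."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for subject, predicate, obj in SCHEMA:
            if subject == current and predicate == "subClassOf" and obj not in seen:
                seen.add(obj)
                stack.append(obj)
    return seen


def types_of(resource: str) -> set:
    """Asserted types plus the types implied by the class hierarchy."""
    direct = {o for s, p, o in DATA if s == resource and p == "type"}
    inferred = set()
    for cls in direct:
        inferred |= superclasses(cls)
    return direct | inferred


if __name__ == "__main__":
    print(sorted(types_of("cow")))
```

The data layer alone only says that the cow is a herbivore; the schema layer is what allows a query for animals to find it.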
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL - Evolving towards Semantic Web
OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Tests May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake
<= dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
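The rule above can be written as an executable predicate. This is a sketch under assumptions rather than the InfoQuilt implementation: the `Event` record and function names are hypothetical, distance is taken as great-circle distance in miles via the haversine formula, and a lower bound of zero days is added so that only earthquakes occurring after the test qualify, as the prose on the slide requires.

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


@dataclass(frozen=True)
class Event:
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees


def date_difference(test_date: date, quake_date: date) -> int:
    """Days from the test to the earthquake (negative if the quake came first)."""
    return (quake_date - test_date).days


def distance(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in miles, using the haversine formula."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(min(1.0, a)))


def may_have_caused(test: Event, quake: Event,
                    max_days: int = 30, max_miles: float = 10000) -> bool:
    """True if the quake follows the test within max_days and max_miles."""
    days = date_difference(test.event_date, quake.event_date)
    miles = distance(test.latitude, test.longitude,
                     quake.latitude, quake.longitude)
    return 0 <= days < max_days and miles < max_miles


if __name__ == "__main__":
    # Hypothetical events, for illustration only.
    t = Event(date(2000, 1, 1), 10.0, 20.0)
    q = Event(date(2000, 1, 20), 12.0, 21.0)
    print(may_have_caused(t, q))
```

Both thresholds are parameters because the slide's values are one choice among many; half the Earth's circumference is roughly 12,400 miles, so a 10000-mile radius excludes only the most distant pairs.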
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example...
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, find the average number of earthquakes per year, starting from 1900
The number of earthquakes with magnitude > 7 is almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
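The grouping query on this slide can be sketched as follows. The function name `average_per_year`, the `(year, magnitude)` input format and the half-open treatment of range boundaries (a magnitude of exactly 6.0 falls in the 6-7 group) are assumptions; the slide does not say how boundaries are handled.

```python
from collections import Counter

# (label, inclusive lower bound, exclusive upper bound)
RANGES = [
    ("5.8-6", 5.8, 6.0),
    ("6-7", 6.0, 7.0),
    ("7-8", 7.0, 8.0),
    ("8-9", 8.0, 9.0),
    (">9", 9.0, float("inf")),
]


def average_per_year(quakes, first_year: int, last_year: int) -> dict:
    """Average yearly earthquake count per magnitude range.

    quakes is an iterable of (year, magnitude) pairs; events outside the
    year span or below magnitude 5.8 are ignored.
    """
    years = last_year - first_year + 1
    counts = Counter()
    for year, magnitude in quakes:
        if not first_year <= year <= last_year:
            continue
        for label, low, high in RANGES:
            if low <= magnitude < high:
                counts[label] += 1
                break
    return {label: counts[label] / years for label, _, _ in RANGES}


if __name__ == "__main__":
    # Hypothetical records, for illustration only.
    sample = [(1900, 5.9), (1901, 6.5), (1901, 6.7), (1902, 7.2)]
    print(average_per_year(sample, 1900, 1902))
```

Running the same function over the years before and after a chosen cut-off gives the per-range comparison the slide relies on.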
Knowledge Discovery - Example...
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted, and the test took place within a radius of 10000 miles of the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NewsML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
Semantic Web: www.semanticweb.org
VoiceXML: www.voicexml.org
MPEG-7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas (Eds.), McGraw-Hill, ISBN 0-07-057735-8, 1998
Taalee Directory
Careless whisper
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh content: Personalization and Targeting/interactive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries & Hot Topics
PERSONALIZATION
Personalized Queries
1 My Stock Portfolio
  Microsoft suffers serious hack attack
  Cisco Systems Inc
  Analyst Safa Rashtchy on Yahoo
  PeopleSoft Inc
  AT&T Corp
2 My Football Fantasy Team
  Gators' Spurrier ready for big game
  Tech's Vick looks to become complete QB
  Bucs excited about Hamilton
  Jasper Sanks rumbles into the end zone...
  Edwards explains reasons for leaving BYU
3 Julia Roberts Collection
  Movie Trailer: Notting Hill
  Trailer - Runaway Bride
  Patrick
  Movie Trailer: Stepmom
  Conspiracy Theory
4 Pink Floyd Collection
  Set the Controls for the Heart of the Sun...
  Wish You Were Here
  Round And Around
  Keep Talking
  The Post War Dream
HOT Topics
1 Election 2000
  Video: Explaining the electoral map
  Race for White House hots up
  Seniors Give Gore Florida Edge
2 Middle East Peace Conflict
  More die as Israel steps up security
  Israel braces for suicide bombs
  Pentagon probes Cole's security
3 Napster Controversy
  The Brain Behind Napster
  Napster Lawsuit
  Creative Nomad II
HP 80
Metadata Targeting
Semantic/Interactive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of structured metadata and integration from multiple sources
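As a rough illustration of how such offers could be derived from structured metadata rather than written by hand, the sketch below generates the list above from one asset record. The record layout and the `targeted_offers` function are hypothetical; the title and cast names are the ones shown on the slide.

```python
def targeted_offers(asset: dict) -> list:
    """Build one offer per cast member, then one for the title itself."""
    offers = [f"Buy {name} Videos" for name in asset.get("cast", [])]
    offers.append(f"Buy {asset['title']} Video")
    return offers


if __name__ == "__main__":
    asset = {
        "title": "The Insider",
        "cast": [
            "Al Pacino",
            "Russell Crowe",
            "Christopher Plummer",
            "Diane Venora",
            "Philip Baker Hall",
        ],
    }
    for offer in targeted_offers(asset):
        print(offer)
```

The point of the design is that the cast field may come from a different source than the asset itself, so the offers improve as more sources are integrated.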
HP 82
Web Extreme Personalization
Inputs: Realtime Feeds, Web Sites and Pages, Content Databases
Time-Shifted Content Aggregator
Semantic Engine™
Structured Hi-Quality Semantic Metabase
Interests, Preferences
Personalized Content
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
The user has already completed Web-based registration and personalization at Voquette's Enterprise Customer site
The user's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings up the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and knows the user's portfolio. Typically, a search can be done by typing one word or selecting from a dynamic personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and the analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst
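One way to picture semantic filtering for a small screen is a per-content-type field priority combined with a character budget. Everything in this sketch is an assumption made for illustration: the `FIELD_PRIORITY` table, the item layout, the `render` function, and the reading of the listing entries as date, source, analyst.

```python
# Hypothetical priority order of metadata fields for each content type.
FIELD_PRIORITY = {
    "Analyst Call": ["date", "source", "analyst"],
}
DEFAULT_PRIORITY = ["date", "title"]


def render(item: dict, max_chars: int) -> str:
    """Join the highest-priority fields that fit within max_chars."""
    parts = []
    for field in FIELD_PRIORITY.get(item.get("type"), DEFAULT_PRIORITY):
        value = item.get(field)
        if not value:
            continue
        candidate = " ".join(parts + [value])
        if len(candidate) > max_chars:
            break  # the next field would overflow the line
        parts.append(value)
    return " ".join(parts)


if __name__ == "__main__":
    call = {
        "type": "Analyst Call",
        "date": "11/08",
        "source": "ON24",
        "analyst": "Payne",
        "rating": "Strong Buy",  # kept as metadata, shown as an icon instead
    }
    print(render(call, max_chars=20))
    print(render(call, max_chars=10))
```

Because the priority list is keyed by content type, an earnings report or conference call could show different fields on the same screen width.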
HP 87
iTV: Taalee's Extreme Personalization
Content Provider (DBS, DISH, Wink, AOL-TV)
Semantic Engine™
Metadata-Tagged Content
Content "Programs"
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-Quality Semantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in
This screen is customizable with an interactivity feature, using metadata such as whether there is a new Conference Call video on CSCO
Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata
The Conference Call itself can have embedded metadata to support personalization and interactivity
HP 89
Metadata in Enterprise Apps
Collection: Network Content, Affiliate Feeds, Public Sources
Processing: Categorize, Catalog, Integrate
Rich Data Metabase
Production Support: Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Sony
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers
• Greatly enhances newsroom productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore
HP 92
Metadata's role in emerging iTV infrastructure
Diagram labels: Video; MPEG Encoder (MPEG-2/4/7); Create Scene Description Tree; Scene Description Tree; Node = AVO Object; Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; GREAT USER EXPERIENCE
Taalee Semantic Engine: "Cisco Systems" node; Enhanced XML Description; Metadata-rich, Value-added Node
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers and Broadcasters
Object Content Information (OCI):
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta
HP 93
Metadata for Intelligent Content
Intelligent Metadata Creation
Extractor Agents: content which does contain the words the user asked for
+ Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for
+ Semantic Associations: content the user did not think to ask for, but which he needs to know
Usage
HP 94
Intelligent Content via
Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content
• If a keyword is not in the content, the content cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example, if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".
Understanding of the content is needed to create new metadata.
Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses such Semantic Associations (another example: COMPANY participates in INDUSTRY) to add missing metadata that describes content more completely.
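The keyword limitation described above is easy to reproduce with a toy inverted index. In this sketch the story text, the `ASSOCIATIONS` table and all function names are hypothetical; the only facts taken from the slide are that Roger Clemens plays Baseball for the New York Yankees.

```python
import re
from collections import defaultdict

# Hypothetical association table: entity -> facts worth adding as metadata.
ASSOCIATIONS = {
    "Roger Clemens": {"Sport": "Baseball", "Team": "New York Yankees"},
}


def tokens(text: str) -> list:
    """Lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs: dict) -> dict:
    """Inverted index: token -> set of document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokens(text):
            index[token].add(doc_id)
    return index


def search(index: dict, query: str) -> set:
    """Documents containing every token of the query."""
    terms = tokens(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


def enrich(text: str) -> str:
    """Append associated facts for every known entity mentioned in the text."""
    extra = []
    for entity, facts in ASSOCIATIONS.items():
        if entity.lower() in text.lower():
            extra.extend(facts.values())
    return " ".join([text] + extra)


if __name__ == "__main__":
    docs = {"story-1": "Roger Clemens struck out ten batters"}
    plain = build_index(docs)
    enriched = build_index({k: enrich(v) for k, v in docs.items()})
    print(search(plain, "Yankees"), search(enriched, "Yankees"))
```

The search function is identical in both cases; only the indexed text changes, which is the sense in which the added metadata, not the search engine, does the work.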
HP 96
Guided Demo for Value-Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html and search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com
Posted Date: 9/20/2000
League: NFL
Teams: Atlanta Falcons
Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate the story with the heading "Week 3 top 10 Anderson TD run"
• Verify that neither Team=Atlanta Falcons nor League=NFL is present in the source content
• Taalee attached this value-added metadata to the asset's existing metadata, so a user searching for Atlanta Falcons will find this story on Jamal Anderson, who plays for the Atlanta Falcons
• Note that search engines and directories that index only keywords will not be able to do this
HP 97
Guided Demo for Value Added Metadata ndashExample Two
bull Go to httpwwwmediaanywherecomBaseballhtml amp search for Player = Gary Sheffield
bull Click on the first result (titled ldquoI want outrdquo) amp view the metadata on the following RMR page
bull Here is what you see
Produced by ESPN Posted Date 3032001 League National League
Teams Los Angeles Dodgers Players Gary Sheffield
bull Now click on the button to play the asset (button marked ldquoREALrdquo)
bull View the source HTML page that has the original story and locate this story with the
heading ldquoI want outrdquo
bull Verify that Team=Los Angeles Dodgers or League=National League was not present in
the source content
bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user
searching for Los Angeles Dodgers will find this story on Gary Sheffield who is a player of
Los Angeles Dodgers team
N t th t th h i d di t i ill t b bl t d thi
HP 98
Example 1 ndash Snapshots (ldquoJamal Andersonrdquo)
Search for lsquoJamal Andersonrsquo in lsquoFootballrsquo
Click on first result for Jamal Anderson
View metadata Note that Team name and League name are also included
in the metadata
View the original source HTML page Verify that
the source page contains no mention of Team nameand League name They
were Taaleersquos value-additions to the metadata to facilitate easier search
HP 99
Example 2 ndash Snapshots (ldquoGary Sheffieldrdquo)
Click on first result for Gary Sheffield
View metadata Note that Team name and League name are also included
in the metadata
Search for lsquoGary Sheffieldrsquo in lsquoBaseballrsquo
View the original source HTML page Verify that
the source page contains no mention of Team nameand League name They
were Taaleersquos value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content ndash Value-Added Metadata
Posted Date
Posted Date
Date of asset posting ndashExtracted automatically
League Name
Name of league to which the payerrsquos team belongs ndash Not mentioned explicitly in asset ndash Value-added by Taaleersquos processing based on semantic associations
Name of team for which player plays ndash Not mentioned explicitly in assetndash Value-added using Taaleersquos semantic relationships
Team NameTeam Name
Producer Name
Producer Name
Rich MediaSports AssetRich Media
Sports Asset
Name of content provider that produced the asset
Some Metadata are obtained explicitly from the asset Others (not present in the asset) are added
by Taalee using its semantic relationships
The asset is richly fully described in the many ways the users chose to interact
Player NamesPlayer Names
SportSportName of
sport
LegendX Y meansTaalee uses X to add Yas value-added metadatato the asset
Name of players mentioned explicitly in the asset ndash Extracted automatically
HP 101
Intelligent Contentvia
Semantic Associations
HP 102
Semantic Associations
bull Traditional search engines rely solely on (syntactic) keywords to find content
bull They do not understand the meaning context or relationships of keywords
For example a search engine may see that the word ldquoCommerce Onerdquo occurs
but it does not know that Commerce One is a COMPANY which Participates in
the Corporate Professional amp Financial Software INDUSTRY and COMPETESWITH Ariba
As a result search engines cannot go beyond returning a list (or directory view)
of what the user has asked for Their ability to provide associated information is
extremely limited static and difficult to scale Taaleersquos Semantic Content Model
goes beyond indexing keywords and classifying assets toUnderstand and Associate all content it catalogs
HP 103
Example (test on httpdirectorymediaanywherecom)
Search for company lsquoCommerce Onersquo
Links to news on companies that compete against
Commerce One
Links to news on companies Commerce One competes
against(To view news on Ariba click
on the link for Ariba)
Crucial news on Commerce Onersquos
competitors (Ariba) can be accessed easily and
automatically
HP 104
Internal Source 1Research
Internal Source 2
External feedsWeb(eg Reuters)
1
2
3
4
Cisco story from PW Source 1passed on to addsemanticassociations
ConsultsKnowledgeBasefor Ciscorsquoscompetition
Returns resultLucent is a competitor of Cisco
Lucent story from external
feeds picked for publishing as ldquosemantically
relatedrdquo to Ciscostory ndash passed
on to Dashboard
Story onLucent
Story onCisco
XCM-compliant metadata XML or other format
SemanticApplication
ASPEnterprise hosted
Extractor Agent 1
Extractor Agent 2
Extractor Agent 3
Metadata centricContent Management Architecture
SemanticEngine
World Model
TaaleeMetabase
Third-partyContent Mgmt
AndSyndication
HP 105
Semantic Associationssupported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANYCOMPANYRelated Stock News
Related Stock News
IndustryNews
IndustryNews
CompetitionCompetitionCOMPANIES inINDUSTRY with Competing PRODUCTSCOMPANIES in Same or
Related INDUSTRY
SECEPAEPA
RegulationsRegulationsImpacting INDUSTRY or Filed By COMPANY
Technology Products
Technology ProductsImportant to INDUSTRY or COMPANY
HP 106
Semantic Web Application ExampleFinancial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic SearchPersonalization etc
A vision for future
Semantic Web Complex Relationships and Knowledge Discovery
Eg InfoQuilt project at LSDIS Lab Univ of Georgia
HP 108
Beyond RDF ndash one proposal (cf Ora Lassila)
Structural modeling obviously not enoughwe need a ldquologic layerrdquo on top of RDFsome type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful particularly if we have logicrules which allow us to draw inference from this data
RDF + DL = ldquoFrame System for WWWrdquo
Source wwwontoknowledgeorgoil
HP 109
Semantic Web - next step in Web evolution
ldquoA Web in which machine reasoning will be ubiquitous and devastatingly powerfulrdquo [Berners-Lee]
ldquoA place where the whim of a human being and the reasoning of a machine coexist in an ideal powerful mixturerdquo [Berners-Lee]
ldquoA semantic Web would permit more accurate and efficient Web searches which are among the most important Web-based activitiesrdquo [Berners-Lee]
A personal definitionSemantic Web The concept that Web-accessible
content can be organized semantically rather than though syntactic and structural methods
What is DAML (DARPA Agent Markup Language)
a proposal to create technologies that will enable software agents to dynamically identify and understand information sources and to provide interoperability between agents in a semantic mannerBased on RDF+XMLAgent readable Tags
wwwdamlorg
DAML Example
Sou
rce
http
w
ww
zdn
etc
omp
cwee
kst
orie
sju
mps
04
270
2432
946
00h
tml
Three layered Architecture Of Semantic Web
Logical Layer
Formal Semantics and Reasoning Support ndash OIL DAML-O
Schema Layer
Definition of Vocabulary RDF Schema
Data Layer
Simple data model and syntax for metadata - RDF
OIL ndash as RDF Extension
ltrdfsClass rdfID=rdquoherbivorerdquogtltrdftype
rdfresource=rdquohttpwwwontoknowledgeorgDefinedClassrdquogtltrdfssubClassOf rdfresource=rdquoanimalrdquogtltrdfssubClassOfgt
ltoilNOTgtltoilhasOperand rdfresource=rdquocarnivorerdquogt
ltoilNOTgtltrdfssubClassOfgt
ltrdfsClassgt
DAML and OIL ndash Evolving towards Semantic Web
OIL MissionOIL is a Web-based representation and inference layer for ontologies which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery Knowledge Discovery --ExampleExample
Earthquake Sources(USGS NEIC)
Nuclear Test Sources(Oklahoma Observatory etc)
Nuclear Test May Cause Earthquakes
Is it really true
Complex RelationshipsComplex Relationships
A nuclear test could have caused an earthquakeif the earthquake occurred some time after thenuclear test was conducted and in a nearby region
NuclearTest Causes Earthquakelt= dateDifference( NuclearTesteventDate
EarthquakeeventDate ) lt 30AND distance( NuclearTestlatitude
NuclearTestlongitudeEarthquakelatitudeEarthquakelongitude ) lt 10000
Knowledge Discovery Knowledge Discovery --ExampleExample
When was the first recorded nuclear test conducted
1950Find the total number of earthquakes with a magnitude58 or higher on the Richter scale per year starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery Knowledge Discovery --ExamplehellipExamplehellip
For each group of earthquakes with magnitudes in the ranges58-6 6-7 7-8 8-9 and gt9 on the Richter scale per yearstarting from 1900 find average number of earthquakes
Number of earthquakes with magnitude gt 7 almost constant So nuclear tests probably only cause earthquakes with magnitude lt 7
KnowledgeKnowledge DiscoveryDiscovery --ExampleExamplehelliphellip
Find pairs of nuclear tests and earthquakes such that the earthequakeoccurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake
Demo
ResourcesReferences
RDFwwww3orgTRREC-rdf-syntaxICE wwwicestandardorgMeta Object Facility (MOF) Specification Version 13 September 27 1999 httpcgiomgorgcgi-bindocad99-09-05XML Metadata Interchange (XMI) Specification Version 11 October 25 1999 httpcgiomgorgcgi-bindocad9910-02httpcgiomgorgcgi-bindocad99-10-03DAML wwwdamlorgNEWSML newsshowcasereuterscomPRISM wwwprismstandardorgtechdevprismspec1aspXCM wwwvignettecomOIL wwwontoknowledgeorgoilSEMANTICWEB wwwsemanticweborgVOICEXML wwwvoicexmlorgMPEG7 wwwdarmstadtgmddemobileMPEG7Taalee wwwtaaleecomOingo wwwoingocom
Multimedia Data Management Using Metadata to Integrate and Apply Digital Media Amit Sheth and Wolfgang Klas Eds McGraw Hill ISBN 0-07-057735-8 1998
HP 76
Semantic Relationships
HP 77
Metadata Application Example
Semantic Applications for highly relevant and fresh contentPersonalization andTargetinginteractive marketing
Please contact Taalee for live demonstrations
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries amp Hot Topics
PERSONALIZATION
3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
4 Pink Floyd Collection
Personalized Queries
Set the Controls for the Heart of the Sunhellip
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
ATampT Corp
2 My Football Fantasy Team
Gators Spurrier ready for big game
Techs Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zonehellip
Edwards explains reasons for leaving BYU morehellip
morehellip
morehellip
morehellip
1 Election 2000
2 Middle East Peace Conflict
3 Napster Controversy
Video Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Coles security
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II morehellip
HOT Topics
morehellip
morehellip
HP 80
Metadata Targeting
SemanticInteractive Targeting
Buy Al Pacino VideosBuy Russell Crowe VideosBuy Christopher Plummer VideosBuy Diane Venora VideosBuy Philip Baker Hall VideosBuy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
Realtime Feeds
Interests Preferences
Time-ShiftedContent Aggregator
Web sites and Pages
ContentDatabases Personalized
Content
Semantic EngineTM
Personalized Content
Content
Structured Hi-Quality
Semantic Metabase
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
[Screen menu: MyStocks, News, Sports, Music, MyMedia]
User has already completed Web-based registration and personalization at Voquette’s Enterprise Customer site.
User’s “Wireless Home page” shows the categories for his interests. There is an alert (new content) for his stock and sports categories.
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
[Screen menu: MyStocks, News, Sports, Music, MyMedia]
[Screen list: My Stocks: CSCO, NT, IBM, Market]
Clicking on MyStocks brings down the user’s Personal Portfolio list. The user wants to see news items about Cisco (see next slide).
Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user’s portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu.
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
[Screen menu: MyStocks, News, Sports, Music, MyMedia]
[Screen list: My Stocks: CSCO, NT, IBM, Market]
[Screen list: CSCO: Analyst Call, Conf Call, Earnings]
Different types of recent audio content about Cisco are available.
The user clicks to see a listing of Analyst Calls on Cisco (next slide).
Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
[Screen menu: MyStocks, News, Sports, Music, MyMedia]
[Screen list: My Stocks: CSCO, NT, IBM, Market]
[Screen list: CSCO: Analyst Call, Conf Call, Earnings]
[Screen list: CSCO Analysis: 11/08 ON24 Payne; 11/07 ON24 H&Q; 11/06 CBS Langlesis]
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., an Analyst Call entry focuses on the source and the analyst name or company. The icons denote additional metadata, such as “Strong Buy” by the H&Q analyst.
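The semantic filtering described on this slide can be sketched in Python. This is only an illustration under stated assumptions: the `DISPLAY_FIELDS` table, the `render` function, the field names, and the 20-character screen width are hypothetical and not taken from Taalee's or Voquette's products. The three sample entries reuse the listing shown on the screen.

```python
# Hypothetical per-content-type field choices for a small wireless screen.
# Only the "Analyst Call" entry reflects the slide; the default is assumed.
DISPLAY_FIELDS = {
    "Analyst Call": ["date", "source", "analyst"],
}

def render(item, max_chars=20):
    """Build a one-line listing entry from the fields chosen for the item type."""
    fields = DISPLAY_FIELDS.get(item["type"], ["date", "title"])
    line = " ".join(str(item[f]) for f in fields if f in item)
    return line[:max_chars]  # crude truncation to fit the assumed screen width

calls = [
    {"type": "Analyst Call", "date": "11/06", "source": "CBS", "analyst": "Langlesis"},
    {"type": "Analyst Call", "date": "11/08", "source": "ON24", "analyst": "Payne"},
    {"type": "Analyst Call", "date": "11/07", "source": "ON24", "analyst": "H&Q",
     "rating": "Strong Buy"},  # extra metadata, not shown in the one-line listing
]

# Newest first; comparing MM/DD strings only works within a single month.
listing = [render(c) for c in sorted(calls, key=lambda c: c["date"], reverse=True)]
```

In this sketch the "Strong Buy" rating stays in the metadata but is left out of the one-line entry, mirroring how the slide shows it as an icon rather than as text.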
HP 87
iTV: Taalee’s Extreme Personalization
[Diagram labels: Content Provider (DBS, DISH, Wink, AOL-TV); Content “Programs”; Semantic Engine(TM); Meta-Data Tagged Content; Structured Hi-Quality Semantic Metabase; Immediate Interests; Preferences; Personalized Content Capsules; Redirects and Programming]
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.
This screen is customizable with an interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO.
Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata.
The Conference Call itself can have embedded metadata to support personalization and interactivity.
HP 89
Metadata in Enterprise Apps
[Diagram labels: Network Content; Affiliate Feeds; Public Sources; Rich Data; Collection; Processing (Categorize, Catalog, Integrate); Metabase; Production Support; Sony; Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication]
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria’s commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
“At least 60 people died in this needless fire,” senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp. (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were
-- Breaking News for 11/30/2000 --
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can’t Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
Customize Page Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee’s semantic metadata enables powerful access to content used by the Enterprise’s customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can’t Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore’s appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore
HP 92
Metadata’s role in emerging iTV infrastructure
[Diagram labels: Video; MPEG-2/4/7; MPEG Encoder; Create Scene Description Tree; Scene Description Tree; Node = AVO Object; Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; Taalee Semantic Engine; “Cisco Systems” Node; Enhanced XML Description; Object Content Information (OCI); Metadata-rich Value-added Node; GREAT USER EXPERIENCE]
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers and Broadcasters
Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: John Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta
HP 93
Intelligent Metadata Creation
Extractor Agents: content which does contain the words the user asked for
+
Value-added Metadata: content which does not contain the words the user asked for but is about what he asked for
+
Semantic Associations: content the user did not think to ask for but which he needs to know
Metadata for Intelligent Content
Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the “right” keyword
For example: if a story is about “Roger Clemens” but does not contain the words “New York Yankees”, that story cannot and will not be found if the user searches for “New York Yankees” or “Yankees”
Understanding of the content is needed to create new metadata
Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
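The enrichment step described on this slide can be sketched in Python. This is a minimal illustration, not Taalee's actual implementation: the `KNOWLEDGE_BASE` dictionary, the relation names, and the `enrich` and `matches` functions are hypothetical, and the only facts used are the ones stated in the Roger Clemens example.

```python
# Hypothetical knowledge base: entity -> {relation: value}. The facts come
# from the slide's Roger Clemens example; the structure is an assumption.
KNOWLEDGE_BASE = {
    "Roger Clemens": {
        "type": "PERSON",
        "plays_sport": "Baseball",
        "plays_for_team": "New York Yankees",
    },
}

def enrich(metadata):
    """Return a copy of `metadata` with value-added Teams and Sport fields."""
    enriched = {key: list(values) for key, values in metadata.items()}
    for player in metadata.get("Players", []):
        facts = KNOWLEDGE_BASE.get(player, {})
        for relation, field in (("plays_for_team", "Teams"), ("plays_sport", "Sport")):
            value = facts.get(relation)
            if value and value not in enriched.setdefault(field, []):
                enriched[field].append(value)
    return enriched

def matches(metadata, query):
    """Case-insensitive substring match over all metadata values."""
    q = query.lower()
    return any(q in v.lower() for values in metadata.values() for v in values)

extracted = {"Players": ["Roger Clemens"]}  # what was found in the story itself
enriched = enrich(extracted)                # adds Teams and Sport from the knowledge base
```

With only the extracted metadata, a search for "Yankees" finds nothing. After enrichment, the same story matches both "New York Yankees" and "Yankees", even though neither phrase appears in the story.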
HP 96
Guided Demo for Value-Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled “Week 3 Top10 Anderson TD Run”) and view the metadata on the following RMR page
• Here is what you see:
  Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked “REAL”)
• View the source HTML page that has the original story and locate this story with the heading “Week 3 top 10 Anderson TD run”
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset’s existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value-Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled “I want out”) & view the metadata on the following RMR page
• Here is what you see:
  Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked “REAL”)
• View the source HTML page that has the original story and locate this story with the heading “I want out”
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset’s existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots (“Jamal Anderson”)
Search for ‘Jamal Anderson’ in ‘Football’
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee’s value-additions to the metadata to facilitate easier search
HP 99
Example 2 - Snapshots (“Gary Sheffield”)
Search for ‘Gary Sheffield’ in ‘Baseball’
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee’s value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
• Producer Name: name of content provider that produced the asset
• Posted Date: date of asset posting - extracted automatically
• Player Names: names of players mentioned explicitly in the asset - extracted automatically
• Team Name: name of team for which the player plays - not mentioned explicitly in the asset - value-added using Taalee’s semantic relationships
• League Name: name of league to which the player’s team belongs - not mentioned explicitly in the asset - value-added by Taalee’s processing based on semantic associations
• Sport: name of sport
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.
The asset is richly and fully described in the many ways users choose to interact.
Legend: “X → Y” means Taalee uses X to add Y as value-added metadata to the asset
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context or relationships of keywords
For example, a search engine may see that the word “Commerce One” occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee’s Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
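A small Python sketch can make the contrast with keyword search concrete. Everything here is a hypothetical illustration, not Taalee's Semantic Content Model: the `TRIPLES` list, the relation names, the `STORIES` catalog, and the `related` and `search` functions are assumptions, while the Commerce One facts are the ones stated on the slide.

```python
# Hypothetical relationship store, written as (subject, relation, object)
# triples. The facts are from the slide; the layout is an assumption.
TRIPLES = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Commerce One", "participates_in", "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]

# Hypothetical catalog of stories, each tagged with the companies it covers.
STORIES = [
    {"title": "Story on Commerce One", "companies": ["Commerce One"]},
    {"title": "Story on Ariba", "companies": ["Ariba"]},
    {"title": "Unrelated story", "companies": ["PeopleSoft"]},
]

def related(entity, relation):
    """Entities linked to `entity` by `relation`, in either direction."""
    forward = {o for s, r, o in TRIPLES if s == entity and r == relation}
    backward = {s for s, r, o in TRIPLES if o == entity and r == relation}
    return forward | backward

def search(company):
    """Return (direct hits, semantically associated hits) for a company query."""
    direct = [s["title"] for s in STORIES if company in s["companies"]]
    rivals = related(company, "competes_with")
    associated = [s["title"] for s in STORIES if rivals & set(s["companies"])]
    return direct, associated

direct, associated = search("Commerce One")
```

A keyword index would stop at `direct`. The `associated` list is the additional result that comes from following the competes-with relationship to Ariba.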
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company ‘Commerce One’
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One’s competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata-centric Content Management Architecture
[Diagram labels: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters); Extractor Agents 1-3; Semantic Engine; World Model; Taalee Metabase; XCM-compliant metadata (XML or other format); Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication; Story on Cisco; Story on Lucent]
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults Knowledge Base for Cisco’s competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as “semantically related” to Cisco story - passed on to Dashboard
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
• Related Stock News
• Industry News
• Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
• Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY
• Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling is obviously not enough
• we need a “logic layer” on top of RDF
• some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data
RDF + DL = “Frame System for WWW”
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
“A Web in which machine reasoning will be ubiquitous and devastatingly powerful.” [Berners-Lee]
“A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture.” [Berners-Lee]
“A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities.” [Berners-Lee]
A personal definition
Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-layered Architecture of Semantic Web
• Logical Layer: Formal Semantics and Reasoning Support - OIL, DAML-O
• Schema Layer: Definition of Vocabulary - RDF Schema
• Data Layer: Simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL - Evolving towards Semantic Web
OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Test May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
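The rule above can be sketched as an executable predicate in Python. This is an illustration under stated assumptions, not the InfoQuilt implementation: the haversine formula is swapped in for the unspecified `distance` function, the unit is taken to be miles (as a later slide states), the requirement that the earthquake not precede the test is read from the slide's prose rather than from the rule itself, and the sample records are made-up values used only to exercise the code.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius

def distance(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in miles between two lat/lon points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))

def date_difference(test_date, quake_date):
    """Days from the nuclear test to the earthquake (negative if the quake came first)."""
    return (quake_date - test_date).days

def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """The slide's rule, plus the assumption that the quake does not precede the test."""
    days = date_difference(test["eventDate"], quake["eventDate"])
    miles = distance(test["latitude"], test["longitude"],
                     quake["latitude"], quake["longitude"])
    return 0 <= days < max_days and miles < max_miles

# Made-up records for illustration only; not real seismic or test data.
test = {"eventDate": date(1970, 1, 1), "latitude": 37.0, "longitude": -116.0}
quake_near = {"eventDate": date(1970, 1, 10), "latitude": 36.0, "longitude": -117.0}
quake_before = {"eventDate": date(1969, 12, 25), "latitude": 36.0, "longitude": -117.0}
quake_late = {"eventDate": date(1970, 3, 1), "latitude": 36.0, "longitude": -117.0}
```

Note that a 10000-mile threshold covers most of the globe, since no two points on Earth are more than about 12,400 miles apart, so in practice the date condition does most of the filtering.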
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example...
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
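The per-year grouping queries above can be sketched in Python. This is a hedged illustration, not the InfoQuilt query language: the function names are hypothetical, the half-open treatment of range boundaries (so that exactly 6.0 falls in 6-7, and 9.0 falls in the top range) is an assumption the slide does not settle, and the sample tuples are made-up values.

```python
from collections import Counter

# Magnitude ranges from the slide; each is treated as [low, high) by assumption,
# with the last range open-ended.
RANGES = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, float("inf"))]

def bucket(magnitude):
    """Return the range containing `magnitude`, or None if it is below 5.8."""
    for low, high in RANGES:
        if low <= magnitude < high:
            return (low, high)
    return None

def counts_per_year(quakes, start_year=1900):
    """Count earthquakes per (year, magnitude range) from (year, magnitude) pairs."""
    counts = Counter()
    for year, magnitude in quakes:
        rng = bucket(magnitude)
        if year >= start_year and rng is not None:
            counts[(year, rng)] += 1
    return counts

def average_per_year(counts, rng, first_year, last_year):
    """Average yearly count for one magnitude range over an inclusive span of years."""
    total = sum(n for (year, r), n in counts.items()
                if r == rng and first_year <= year <= last_year)
    return total / (last_year - first_year + 1)

# Made-up (year, magnitude) pairs for illustration only.
quakes = [(1899, 7.5), (1900, 5.9), (1900, 6.5), (1901, 6.1), (1901, 5.0), (1901, 9.2)]
counts = counts_per_year(quakes)
```

Comparing `average_per_year` for a given range before and after a chosen year is one way to reproduce the kind of comparison the slide draws for magnitudes above and below 7.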
Knowledge Discovery - Example...
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/9910-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries amp Hot Topics
PERSONALIZATION
3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
4 Pink Floyd Collection
Personalized Queries
Set the Controls for the Heart of the Sunhellip
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
ATampT Corp
2 My Football Fantasy Team
Gators Spurrier ready for big game
Techs Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zonehellip
Edwards explains reasons for leaving BYU morehellip
morehellip
morehellip
morehellip
1 Election 2000
2 Middle East Peace Conflict
3 Napster Controversy
Video Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Coles security
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II morehellip
HOT Topics
morehellip
morehellip
HP 80
Metadata Targeting
SemanticInteractive Targeting
Buy Al Pacino VideosBuy Russell Crowe VideosBuy Christopher Plummer VideosBuy Diane Venora VideosBuy Philip Baker Hall VideosBuy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
Realtime Feeds
Interests Preferences
Time-ShiftedContent Aggregator
Web sites and Pages
ContentDatabases Personalized
Content
Semantic EngineTM
Personalized Content
Content
Structured Hi-Quality
Semantic Metabase
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed WebBased registration and personalization at VoquettersquosEnterprise Customer site
Userrsquos ldquoWireless Home pagerdquo shows the categories for his interests There is an alert (new content) for his stock and sportscategories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down userrsquos Personal Portfolio list The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and the knowledge of userrsquos portfolio Typically search can be done by typing one word or selecting from a dynamic personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions listen set alert on story add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
1108 ON24 Payne1107 ON24 HampQ 1106 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listingsorted by date Semantic filtering uses just the right metadata to meet screen and other constrains Eg Analyst Call focuses onthe source and analyst name or company The icon denote additional metadata such as ldquoStrong Buyrdquo by HampQ Analyst
HP 87
iTV Taaleersquos Extreme Personalization
Content Provider
(DBS DISH Wink AOL-TV)
Semantic EngineTM
Meta-DataTagged Content
ContentldquoProgramsrdquo
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-QualitySemantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that isused by personalization application to show only the stocksthat user is interested in
This screen is customizablewith interactivity featureusing metadata such as whetherthere is a new ConferenceCall video on CSCO
Part of the screen can beautomatically customized to show conference call specific informationndash including transcriptparticipation etc all of which arerelevant metadata
Conference Call itself can have embedded metadata to support personalization andinteractivity
HP 89
Metadata in Enterprise Apps
Filter Search ConsolidatePersonalize ArchiveLicensing Syndication
Production SupportProduction SupportSony
Categorize
Catalog
Integrate
CollectionCollection ProcessingProcessing
NetworkContent
AffiliateFeeds
Public Sources Rich Data
Metabase
HP 90
t
A leaking gasoline pipeline burst into flames Thursday killing more than 60 people near Nigerias commercial capital of Lagos Many of the dead were fisherman in wooden canoes engulfed in the inferno
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko facing the central business district of Lagos across a lagoon
At least 60 people died in this needless fire senior local official Karimu Alabi said
Fire crews from state-run Nigerian National Petroleum Corp (NNPC) which owns the pipeline were joined by other firemen from construction company Julius Berger in battling the blaze
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak ravaging a cluster of huts and log houses
At about the same time a second fire razed Makoko shantytown where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in thewooden huts for days Many victims of the Ebute-Oke fire were
Gore Demands That Recount Restart (940 PM)Gore Says Fla Cant Name Electors (450 PM)Bush Meets Colin Powell at Ranch (122 PM)
Market Tumbles on Earnings Warning (927 AM)Barak Outlines His Peace Plan (630 AM)
-- Breaking News for 11302000 --Customize Page Settings | Content | Layout | Color
Sixty Die In Nigeria BlastProduced by Euronews Posted Date 11302000 Event Election 2000 Location Tallahassee Florida USAPeople Al Gore George W Bush
Video
bull Value-add for production broadcast amp syndication
bull Taaleersquos semantic metadata enables powerful access to content used by Enterprisersquos customers
bull Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --Gore Demands That Recount Restart
Gore Says Fla Cant Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(133) ndash 120600 - ABC
(253) - 120600 - CBS
(516) - 120600 - ABC
(246) - 120600 - FOX
(133) - 120600 - NBC
(533) - 120600
(357) - 120600 - CBS
(427) - 120600 - ABC
(344) - 120600 - FOX
(724) - 120600 - CBS
(133) - 120600 - CBS
TALLAHASSEE Florida (CNN) ndashThough the two presidential candidates have until noon Wednesday to file briefs in Al Gores appeal to the Florida Supreme Court the outcome of two trials set on the same day in Leon County Florida may offer Gore his best hope for the presidency Democrats in Seminole County are seeking to have 15000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5000 votes statewide Lawyers for the plaintiff Harry Jacobs claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2126 incomplete absentee ballot applications sent in by GOP voters while refusing to allow Democratic workers to do the same thing for Democratic voters
The GOP says that suit and one similar to it from Martin County demonstrates Democratic Party politics at its most desperate Gore is not a party to either of those lawsuits On Tuesday the judge in the
(133) - 120600 - ABC
(233) - 120600 - CBS
(312) - 120600 - NNS
(032) - 120600 - CBS
(133) - 120600 - CBS
DescriptionProduced by CNNPosted Date 12072000 Reporter David Lewis Event Election 2000 Location Tallahassee Florida USAPeople Al Gore
HP 92
Retrieve Scene Description Track
Enhanced Digital Cable
Video
MPEGDecoder
Node = AVO Object
Create Scene Description Tree
GREATUSER
EXPERIENCE
Metadatarsquos role in emerging iTV infrastructure
MPEG-247MPEG
Encoder
SceneDescriptionTree
License metadata decoder and semantic applications to
device makers
Channel salesthrough Video Server Vendors
Video App Servers and Broadcasters
Enhanced XML
Description
ldquoCisco Systemsrdquo
Node
TaaleeSemanticEngine
ldquoCisco Systemsrdquo
Produced by Fox Sports Creation Date 12052000 League NFLTeams Seattle Seahawks
Atlanta Falcons Players John KitnaCoaches Mike Holmgren
Dan ReevesLocation Atlanta
Object Content Information (OCI)
Metadata-richValue-added Node
HP 93
Intelligent Metadata Creation
Content which doescontain the wordsthe user asked for
Extractor Agents
Content which does not contain the words
the user asked for but is about what he asked
for
Value-added Metadata
Content the user did not think to ask for but
which he needs to know
Semantic Associations
+ +
Metadata for Intelligent ContentMetadata for Intelligent Content
Usage
HP 94
Intelligent Contentvia
Value-Added Metadata
HP 95
Value-added MetadataTraditional methods rely solely on (syntactic) indexing of keywords to enable
users to access content
bull If a keyword is not in the content it cannot be found
bull The burden is on the user to think of and ask for the ldquorightrdquo keyword
For example If a story is about ldquoRoger Clemensrdquo but does not contain the
words ldquoNew York Yankeesrdquo that story cannot and will not be found if the user
searches for ldquoNew York Yankeesrdquo or ldquoYankeesrdquo
Understanding of the content is needed to create new metadata
Taalee understands Roger Clemens is a PERSON who Plays a SPORT called
Baseball for a TEAM from New York called the Yankees Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY)
to add missing metadata to describe content more completely
HP 96
Guided Demo for Value Added Metadata ndashExample one
bull Go to httpwwwmediaanywherecomFootballhtml amp search for Player = Jamal Anderson
bull Click on the first result (titled ldquoWeek 3 Top10 Anderson TD Runrdquo) and view the metadata
on the following RMR page
bull Here is what you see
Produced by NFLcom Posted Date 9202000 League NFL
Teams Atlanta Falcons Players Jamal Anderson
bull Now click on the button to play the asset (button marked ldquoREALrdquo)
bull View the source HTML page that has the original story and locate this story with the
heading ldquoWeek 3 top 10 Anderson TD runrdquo
bull Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user
searching for Atlanta Falcons will find this story on Jamal Anderson who is a player of
Atlanta Falcons team
bull Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata ndashExample Two
bull Go to httpwwwmediaanywherecomBaseballhtml amp search for Player = Gary Sheffield
bull Click on the first result (titled ldquoI want outrdquo) amp view the metadata on the following RMR page
bull Here is what you see
Produced by ESPN Posted Date 3032001 League National League
Teams Los Angeles Dodgers Players Gary Sheffield
bull Now click on the button to play the asset (button marked ldquoREALrdquo)
bull View the source HTML page that has the original story and locate this story with the
heading ldquoI want outrdquo
bull Verify that Team=Los Angeles Dodgers or League=National League was not present in
the source content
bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user
searching for Los Angeles Dodgers will find this story on Gary Sheffield who is a player of
Los Angeles Dodgers team
N t th t th h i d di t i ill t b bl t d thi
HP 98
Example 1 ndash Snapshots (ldquoJamal Andersonrdquo)
Search for lsquoJamal Andersonrsquo in lsquoFootballrsquo
Click on first result for Jamal Anderson
View metadata Note that Team name and League name are also included
in the metadata
View the original source HTML page Verify that
the source page contains no mention of Team nameand League name They
were Taaleersquos value-additions to the metadata to facilitate easier search
HP 99
Example 2 – Snapshots (“Gary Sheffield”)
Search for ‘Gary Sheffield’ in ‘Baseball’
Click on the first result for Gary Sheffield
View the metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee’s value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content – Value-Added Metadata
Rich Media Sports Asset
Producer Name: name of the content provider that produced the asset
Posted Date: date of asset posting – extracted automatically
Player Names: names of players mentioned explicitly in the asset – extracted automatically
Sport: name of the sport
Team Name: name of the team for which the player plays – not mentioned explicitly in the asset – value-added using Taalee’s semantic relationships
League Name: name of the league to which the player’s team belongs – not mentioned explicitly in the asset – value-added by Taalee’s processing based on semantic associations
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships
The asset is richly and fully described in the many ways users choose to interact
Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context, or relationships of keywords
For example, a search engine may see that the term “Commerce One” occurs, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee’s Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
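The idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Taalee's implementation: the `KnowledgeBase` class, the relationship names, and the story titles are all invented for the example. It shows how typed entities and a COMPETES_WITH relationship let a lookup return both what the user asked for and associated news the user did not ask for.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeBase:
    """Toy store of typed entities and named relationships (hypothetical)."""
    entity_types: dict = field(default_factory=dict)  # entity name -> type, e.g. "COMPANY"
    relations: set = field(default_factory=set)       # (subject, predicate, object) triples

    def add_entity(self, name, entity_type):
        self.entity_types[name] = entity_type

    def relate(self, subject, predicate, obj):
        self.relations.add((subject, predicate, obj))

    def related(self, subject, predicate):
        """All objects linked to `subject` by `predicate`, in sorted order."""
        return sorted(o for s, p, o in self.relations
                      if s == subject and p == predicate)


def associated_stories(kb, stories, company):
    """Return (titles about the company, titles about its competitors)."""
    competitors = kb.related(company, "COMPETES_WITH")
    asked_for = [s["title"] for s in stories if company in s["companies"]]
    need_to_know = [s["title"] for s in stories
                    if any(c in s["companies"] for c in competitors)]
    return asked_for, need_to_know


kb = KnowledgeBase()
kb.add_entity("Commerce One", "COMPANY")
kb.add_entity("Ariba", "COMPANY")
kb.relate("Commerce One", "COMPETES_WITH", "Ariba")

# Invented sample stories, each already tagged with the companies it mentions.
stories = [
    {"title": "Story on Commerce One", "companies": ["Commerce One"]},
    {"title": "Story on Ariba", "companies": ["Ariba"]},
    {"title": "Unrelated story", "companies": []},
]

asked_for, need_to_know = associated_stories(kb, stories, "Commerce One")
print(asked_for)     # stories that mention the company itself
print(need_to_know)  # stories about companies it competes with
```

A plain keyword index could only produce the first list; the second list depends entirely on the relationship stored in the knowledge base.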
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company ‘Commerce One’
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One’s competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata-centric Content Management Architecture
[Diagram components: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g. Reuters); Extractor Agents 1, 2, 3; Semantic Engine; World Model; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication; XCM-compliant metadata (XML or other format); Story on Cisco; Story on Lucent]
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults Knowledge Base for Cisco’s competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as “semantically related” to Cisco story – passed on to Dashboard
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic collation of semantically related digital media information from multiple sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for the future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF – one proposal (cf. Ora Lassila)
Structural modeling is obviously not enough: we need a “logic layer” on top of RDF; some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data
RDF + DL = “Frame System for WWW”
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
“A Web in which machine reasoning will be ubiquitous and devastatingly powerful.” [Berners-Lee]
“A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture.” [Berners-Lee]
“A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities.” [Berners-Lee]
A personal definition: Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF+XML
Agent-readable tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-layered Architecture of the Semantic Web
Logical Layer: formal semantics and reasoning support – OIL, DAML-O
Schema Layer: definition of vocabulary – RDF Schema
Data Layer: simple data model and syntax for metadata – RDF
OIL – as RDF Extension
    <rdfs:Class rdf:ID="herbivore">
      <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
      <rdfs:subClassOf rdf:resource="#animal"/>
      <rdfs:subClassOf>
        <oil:NOT>
          <oil:hasOperand rdf:resource="#carnivore"/>
        </oil:NOT>
      </rdfs:subClassOf>
    </rdfs:Class>
DAML and OIL – Evolving towards Semantic Web
OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Tests May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
    NuclearTest Causes Earthquake <=
        dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
        AND distance( NuclearTest.latitude, NuclearTest.longitude,
                      Earthquake.latitude, Earthquake.longitude ) < 10000
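The rule above can be written as an ordinary predicate. The following is a minimal sketch, assuming that `dateDifference` returns days, that `distance` is a great-circle distance in miles (computed here with the haversine formula, which the slides do not specify), and that the earthquake must not precede the test, as the prose on the slide suggests. The event records are invented.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an assumption for this sketch


def date_difference(test_date, quake_date):
    """Days from the nuclear test to the earthquake (negative if the quake came first)."""
    return (quake_date - test_date).days


def distance(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees (haversine)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """NuclearTest Causes Earthquake, with the slide's thresholds as defaults."""
    days = date_difference(test["eventDate"], quake["eventDate"])
    miles = distance(test["latitude"], test["longitude"],
                     quake["latitude"], quake["longitude"])
    # The lower bound 0 <= days is this sketch's reading of "some time after".
    return 0 <= days < max_days and miles < max_miles


# Invented sample events.
test = {"eventDate": date(1970, 1, 1), "latitude": 37.0, "longitude": -116.0}
quake_soon = {"eventDate": date(1970, 1, 15), "latitude": 36.0, "longitude": -117.0}
quake_late = {"eventDate": date(1970, 3, 1), "latitude": 36.0, "longitude": -117.0}

print(may_have_caused(test, quake_soon))  # within 30 days and nearby
print(may_have_caused(test, quake_late))  # outside the 30-day window
```

Note that a 10000-mile radius covers most of the globe, so with the slide's thresholds the date condition does nearly all of the filtering.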
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example…
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
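The grouping query on this slide amounts to binning earthquakes by magnitude and averaging the yearly counts. Here is a minimal sketch with invented data; the half-open bin boundaries (for example, magnitude 6.0 counted in 6-7, and 9.0 counted in >9) are an assumption, since the slide does not say how boundary values are treated.

```python
# Magnitude ranges from the slide: (label, inclusive lower bound, exclusive upper bound).
BINS = [
    ("5.8-6", 5.8, 6.0),
    ("6-7", 6.0, 7.0),
    ("7-8", 7.0, 8.0),
    ("8-9", 8.0, 9.0),
    (">9", 9.0, float("inf")),
]


def magnitude_bin(magnitude):
    """Label of the range containing `magnitude`, or None if it is below 5.8."""
    for label, low, high in BINS:
        if low <= magnitude < high:
            return label
    return None


def average_per_year(quakes, start_year, end_year):
    """Average yearly earthquake count per magnitude range over [start_year, end_year]."""
    counts = {label: 0 for label, _, _ in BINS}
    for year, magnitude in quakes:
        label = magnitude_bin(magnitude)
        if label is not None and start_year <= year <= end_year:
            counts[label] += 1
    n_years = end_year - start_year + 1
    return {label: count / n_years for label, count in counts.items()}


# Invented (year, magnitude) records, not real catalog data.
sample = [(1900, 5.9), (1901, 6.5), (1901, 7.2),
          (1946, 6.1), (1946, 6.3), (1947, 7.5), (1947, 4.0)]

before = average_per_year(sample, 1900, 1901)
after = average_per_year(sample, 1946, 1947)
print(before)
print(after)
```

Comparing the two dictionaries range by range is the comparison the slide draws: in this toy data the 7-8 average is unchanged while the 6-7 average rises.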
Knowledge Discovery - Example…
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/9910-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
Personalized Directory
Obtain a whole universe of information (that you may not even have thought of) about some entities that have always been of interest to you
Please enter such semantic keywords below
Change Context
Personalized Queries & Hot Topics
PERSONALIZATION
Personalized Queries
1. My Stock Portfolio: Microsoft suffers serious hack attack; Cisco Systems Inc; Analyst Safa Rashtchy on Yahoo; PeopleSoft Inc; AT&T Corp
2. My Football Fantasy Team: Gators’ Spurrier ready for big game; Tech’s Vick looks to become complete QB; Bucs excited about Hamilton; Jasper Sanks rumbles into the end zone…; Edwards explains reasons for leaving BYU
3. Julia Roberts Collection: Movie Trailer Notting Hill; Trailer - Runaway Bride; Patrick; Movie Trailer Stepmom; Conspiracy Theory
4. Pink Floyd Collection: Set the Controls for the Heart of the Sun…; Wish You Were Here; Round And Around; Keep Talking; The Post War Dream
HOT Topics
1. Election 2000: Video Explaining the electoral map; Race for White House hots up; Seniors Give Gore Florida Edge
2. Middle East Peace Conflict: More die as Israel steps up security; Israel braces for suicide bombs; Pentagon probes Cole’s security
3. Napster Controversy: The Brain Behind Napster; Napster Lawsuit; Creative Nomad II
HP 80
Metadata Targeting
Semantic/Interactive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web: Extreme Personalization
[Diagram labels: Realtime Feeds; Web sites and Pages; Content Databases; Time-Shifted Content Aggregator; Interests, Preferences; Semantic Engine™; Structured Hi-Quality Semantic Metabase; Personalized Content]
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
[Screen: MyStocks | News | Sports | Music | MyMedia]
User has already completed Web-based registration and personalization at Voquette’s Enterprise Customer site
User’s “Wireless Home page” shows the categories for his interests. There is an alert (new content) for his stock and sports categories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
[Screen: MyStocks | News | Sports | Music | MyMedia; My Stocks: CSCO, NT, IBM, Market]
Clicking on MyStocks brings down the user’s Personal Portfolio list. The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user’s portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
[Screen: MyStocks | News | Sports | Music | MyMedia; My Stocks: CSCO, NT, IBM, Market; CSCO: Analyst Call, Conf Call, Earnings]
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
[Screen: MyStocks | News | Sports | Music | MyMedia; My Stocks: CSCO, NT, IBM, Market; CSCO: Analyst Call, Conf Call, Earnings]
CSCO Analysis
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as “Strong Buy” by H&Q Analyst
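One way to picture semantic filtering for a small screen is a per-content-type priority list of metadata fields, cut off when the line no longer fits. The sketch below is only an assumption about how such filtering could work; the field names, the priority order, and the character budgets are invented, and only the sample values come from the slide.

```python
# Hypothetical priority order of metadata fields for each content type.
FIELD_PRIORITY = {
    "Analyst Call": ["date", "source", "analyst", "rating"],
}
DEFAULT_PRIORITY = ["date", "title"]


def filter_for_screen(item, max_chars):
    """Join the highest-priority fields of `item` that fit within `max_chars`."""
    parts = []
    for name in FIELD_PRIORITY.get(item["type"], DEFAULT_PRIORITY):
        value = item.get(name)
        if value is None:
            continue
        candidate = " ".join(parts + [value])
        if len(candidate) > max_chars:
            break  # stop at the first field that would overflow the line
        parts.append(value)
    return " ".join(parts)


call = {"type": "Analyst Call", "date": "11/07", "source": "ON24",
        "analyst": "H&Q", "rating": "Strong Buy"}

print(filter_for_screen(call, 40))  # roomy display: all four fields
print(filter_for_screen(call, 14))  # phone-sized line: rating is left out
```

On the narrow line the rating is dropped from the text, which matches the slide's approach of moving such extra metadata into an icon.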
HP 87
iTV: Taalee’s Extreme Personalization
[Diagram labels: Content Provider (DBS, DISH, Wink, AOL-TV); Content “Programs”; Meta-Data Tagged Content; Semantic Engine™; Structured Hi-Quality Semantic Metabase; Immediate Interests; Preferences; Personalized Content Capsules; Redirects and Programming]
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in
This screen is customizable with interactivity features using metadata, such as whether there is a new Conference Call video on CSCO
Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata
The Conference Call itself can have embedded metadata to support personalization and interactivity
HP 89
Metadata in Enterprise Apps
[Diagram labels: Network Content; Affiliate Feeds; Public Sources; Collection; Processing (Categorize, Catalog, Integrate); Rich Data; Metabase; Production Support; Sony; Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication]
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria’s commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
“At least 60 people died in this needless fire,” senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were …
-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can’t Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
Sixty Die In Nigeria Blast
Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee’s semantic metadata enables powerful access to content used by the Enterprise’s customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can’t Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) – Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore’s appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the …
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore
HP 92
Metadata’s role in emerging iTV infrastructure
[Diagram labels: Video; MPEG Encoder; MPEG-2/4/7; Create Scene Description Tree; Scene Description Tree; Node = AVO Object; Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; Taalee Semantic Engine; Enhanced XML Description; Node “Cisco Systems”; Object Content Information (OCI); Metadata-rich Value-added Node; GREAT USER EXPERIENCE]
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers, and Broadcasters
Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: John Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta
HP 93
Intelligent Metadata Creation
Metadata for Intelligent Content
• Extractor Agents: content which does contain the words the user asked for
• Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for
• Semantic Associations: content the user did not think to ask for, but which he needs to know
Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the “right” keyword
For example: if a story is about “Roger Clemens” but does not contain the words “New York Yankees”, that story cannot and will not be found if the user searches for “New York Yankees” or “Yankees”
Understanding of the content is needed to create new metadata
Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
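The enrichment step described on this slide can be sketched as a lookup against stored relationships. This is a toy illustration, not Taalee's patented extraction technology: the dictionaries `PLAYS_FOR` and `TEAM_LEAGUE`, the function names, and the asset layout are assumptions made for the example, populated only with facts that appear in the guided demos.

```python
# Hypothetical relationship tables standing in for a knowledge base.
PLAYS_FOR = {
    "Roger Clemens": "New York Yankees",
    "Jamal Anderson": "Atlanta Falcons",
    "Gary Sheffield": "Los Angeles Dodgers",
}
TEAM_LEAGUE = {
    "Atlanta Falcons": "NFL",
    "Los Angeles Dodgers": "National League",
}


def add_value_added_metadata(metadata):
    """Return a copy of `metadata` with Teams and League inferred from Players."""
    enriched = dict(metadata)
    teams = list(metadata.get("Teams", []))
    leagues = list(metadata.get("League", []))
    for player in metadata.get("Players", []):
        team = PLAYS_FOR.get(player)
        if team and team not in teams:
            teams.append(team)
    for team in teams:
        league = TEAM_LEAGUE.get(team)
        if league and league not in leagues:
            leagues.append(league)
    if teams:
        enriched["Teams"] = teams
    if leagues:
        enriched["League"] = leagues
    return enriched


def search(assets, field, value):
    """Titles of assets whose metadata `field` contains `value`."""
    return [a["title"] for a in assets if value in a.get(field, [])]


# Metadata extracted directly from the asset: no team or league is mentioned.
asset = {"title": "Week 3 Top10 Anderson TD Run", "Players": ["Jamal Anderson"]}
enriched = add_value_added_metadata(asset)

print(search([asset], "Teams", "Atlanta Falcons"))     # the raw asset is not found
print(search([enriched], "Teams", "Atlanta Falcons"))  # the enriched asset is found
```

The search itself stays a simple field match; what changes is that the metadata now contains terms the source content never mentioned.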
HP 96
Guided Demo for Value-Added Metadata – Example One
• Go to http://www.mediaanywhere.com/Football.html and search for Player = Jamal Anderson
• Click on the first result (titled “Week 3 Top10 Anderson TD Run”) and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL
Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked “REAL”)
• View the source HTML page that has the original story and locate the story with the heading “Week 3 top 10 Anderson TD run”
• Verify that neither Team=Atlanta Falcons nor League=NFL was present in the source content
• Taalee attached this value-added metadata to the asset’s existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who plays for the Atlanta Falcons
• Note that other search engines and directories will not be able to do this
Personalized Queries amp Hot Topics
PERSONALIZATION
3 Julia Roberts Collection
Movie Trailer Notting Hill
Trailer - Runaway Bride
Patrick
Movie Trailer Stepmom
Conspiracy Theory
4 Pink Floyd Collection
Personalized Queries
Set the Controls for the Heart of the Sunhellip
Wish You Were Here
Round And Around
Keep Talking
The Post War Dream
1 My Stock Portfolio
Microsoft suffers serious hack attack
Cisco Systems Inc
Analyst Safa Rashtchy on Yahoo
PeopleSoft Inc
ATampT Corp
2 My Football Fantasy Team
Gators Spurrier ready for big game
Techs Vick looks to become complete QB
Bucs excited about Hamilton
Jasper Sanks rumbles into the end zonehellip
Edwards explains reasons for leaving BYU morehellip
morehellip
morehellip
morehellip
1 Election 2000
2 Middle East Peace Conflict
3 Napster Controversy
Video Explaining the electoral map
Race for White House hots up
Seniors Give Gore Florida Edge
More die as Israel steps up security
Israel braces for suicide bombs
Pentagon probes Coles security
The Brain Behind Napster
Napster Lawsuit
Creative Nomad II morehellip
HOT Topics
morehellip
morehellip
HP 80
Metadata Targeting
SemanticInteractive Targeting
Buy Al Pacino VideosBuy Russell Crowe VideosBuy Christopher Plummer VideosBuy Diane Venora VideosBuy Philip Baker Hall VideosBuy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
Realtime Feeds
Interests Preferences
Time-ShiftedContent Aggregator
Web sites and Pages
ContentDatabases Personalized
Content
Semantic EngineTM
Personalized Content
Content
Structured Hi-Quality
Semantic Metabase
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed WebBased registration and personalization at VoquettersquosEnterprise Customer site
Userrsquos ldquoWireless Home pagerdquo shows the categories for his interests There is an alert (new content) for his stock and sportscategories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down the user’s Personal Portfolio list. The user wants to see news items about Cisco (see next slide).
Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user’s portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu.
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available.
The user clicks to see a listing of Analyst Calls on Cisco (next slide).
Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., an Analyst Call listing focuses on the source and the analyst name or company. The icons denote additional metadata, such as “Strong Buy” by the H&Q analyst.
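The semantic filtering described on this slide can be sketched roughly as below. This is a minimal Python illustration, not Taalee's implementation: the `DISPLAY_FIELDS` table, the field names, and the `max_chars` screen limit are all assumptions made for the example.

```python
# Hypothetical per-content-type field priorities (assumed for illustration only).
DISPLAY_FIELDS = {
    "Analyst Call": ["date", "source", "analyst"],
    "Earnings": ["date", "source", "quarter"],
}

def listing_line(item, max_chars=20):
    """Build one listing line, dropping the lowest-priority fields that do not fit."""
    fields = DISPLAY_FIELDS.get(item["type"], ["date", "source"])
    values = [str(item[f]) for f in fields if f in item]
    # Drop trailing (least important) values until the line fits the screen.
    while len(values) > 1 and len(" ".join(values)) > max_chars:
        values.pop()
    return " ".join(values)

calls = [
    {"type": "Analyst Call", "date": "11/07", "source": "ON24",
     "analyst": "H&Q", "rating": "Strong Buy"},
    {"type": "Analyst Call", "date": "11/08", "source": "ON24",
     "analyst": "Payne"},
]

# Newest first; comparing "MM/DD" strings only works within a single year.
for call in sorted(calls, key=lambda c: c["date"], reverse=True):
    print(listing_line(call))
```

Metadata that is not in the priority list for a content type (here the `rating` field) stays attached to the item and could drive an icon instead of taking up text on a small screen.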
HP 87
iTV: Taalee’s Extreme Personalization
Content Provider
(DBS, DISH, Wink, AOL-TV)
Semantic Engine™
Meta-Data Tagged Content
Content “Programs”
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-Quality Semantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in.
This screen is customizable with an interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO.
Part of the screen can be automatically customized to show conference-call-specific information – including transcript, participation, etc., all of which are relevant metadata.
The Conference Call itself can have embedded metadata to support personalization and interactivity.
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support
Sony
Categorize
Catalog
Integrate
Collection
Processing
Network Content
Affiliate Feeds
Public Sources Rich Data
Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria’s commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
“At least 60 people died in this needless fire,” senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can’t Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews
Posted Date: 11/30/2000
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee’s semantic metadata enables powerful access to content used by the Enterprise’s customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can’t Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) – Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore’s appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN
Posted Date: 12/07/2000
Reporter: David Lewis
Event: Election 2000
Location: Tallahassee, Florida, USA
People: Al Gore
HP 92
Retrieve Scene Description Track
Enhanced Digital Cable
Video
MPEG Decoder
Node = AVO Object
Create Scene Description Tree
GREAT USER EXPERIENCE
Metadata’s role in emerging iTV infrastructure
MPEG-2/4/7
MPEG Encoder
Scene Description Tree
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers and Broadcasters
Enhanced XML
Description
“Cisco Systems”
Node
Taalee Semantic Engine
“Cisco Systems”
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta
Object Content Information (OCI)
Metadata-rich Value-added Node
HP 93
Intelligent Metadata Creation
Extractor Agents: Content which does contain the words the user asked for
+
Value-added Metadata: Content which does not contain the words the user asked for but is about what he asked for
+
Semantic Associations: Content the user did not think to ask for but which he needs to know
Metadata for Intelligent Content Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content:
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the “right” keyword
For example: if a story is about “Roger Clemens” but does not contain the words “New York Yankees”, that story cannot and will not be found if the user searches for “New York Yankees” or “Yankees”.
Understanding of the content is needed to create new metadata.
Taalee understands that Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (another example: COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely.
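The enrichment step described on this slide can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Taalee's patented extraction technology: the `PLAYER_TEAM` and `TEAM_LEAGUE` tables stand in for a real knowledge base, and the metadata layout (field name mapped to a list of values) is hypothetical.

```python
# Hypothetical stand-in for a knowledge base of semantic associations.
PLAYER_TEAM = {
    "Roger Clemens": "New York Yankees",
    "Jamal Anderson": "Atlanta Falcons",
    "Gary Sheffield": "Los Angeles Dodgers",
}
TEAM_LEAGUE = {
    "New York Yankees": "American League",
    "Atlanta Falcons": "NFL",
    "Los Angeles Dodgers": "National League",
}

def add_value(metadata, field, value):
    """Append value to a metadata field, avoiding duplicates."""
    values = metadata.setdefault(field, [])
    if value not in values:
        values.append(value)

def enrich(extracted):
    """Return a copy of the extracted metadata plus value-added Teams/League."""
    enriched = {field: list(values) for field, values in extracted.items()}
    for player in extracted.get("Players", []):
        team = PLAYER_TEAM.get(player)
        if team is None:
            continue  # unknown player: nothing to add
        add_value(enriched, "Teams", team)
        league = TEAM_LEAGUE.get(team)
        if league is not None:
            add_value(enriched, "League", league)
    return enriched

def matches(metadata, query):
    """Exact-match search over all metadata values."""
    return any(query in values for values in metadata.values())

story = {"Producer": ["NFL.com"], "Players": ["Jamal Anderson"]}
enriched = enrich(story)
print(matches(story, "Atlanta Falcons"))     # keyword-only metadata misses the story
print(matches(enriched, "Atlanta Falcons"))  # value-added metadata finds it
```

The original metadata is left untouched, so extracted and value-added fields can still be told apart if needed.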
HP 96
Guided Demo for Value Added Metadata – Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled “Week 3 Top10 Anderson TD Run”) and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com
Posted Date: 9/20/2000
League: NFL
Teams: Atlanta Falcons
Players: Jamal Anderson
• Now click on the button to play the asset (button marked “REAL”)
• View the source HTML page that has the original story and locate this story with the heading “Week 3 top 10 Anderson TD run”
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset’s existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata – Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled “I want out”) & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN
Posted Date: 3/03/2001
League: National League
Teams: Los Angeles Dodgers
Players: Gary Sheffield
• Now click on the button to play the asset (button marked “REAL”)
• View the source HTML page that has the original story and locate this story with the heading “I want out”
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset’s existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 – Snapshots (“Jamal Anderson”)
Search for ‘Jamal Anderson’ in ‘Football’
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee’s value-additions to the metadata to facilitate easier search
HP 99
Example 2 – Snapshots (“Gary Sheffield”)
Search for ‘Gary Sheffield’ in ‘Baseball’
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee’s value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content – Value-Added Metadata
Rich Media Sports Asset
Producer Name: Name of content provider that produced the asset
Posted Date: Date of asset posting – Extracted automatically
Player Names: Names of players mentioned explicitly in the asset – Extracted automatically
Team Name: Name of team for which player plays – Not mentioned explicitly in asset – Value-added using Taalee’s semantic relationships
League Name: Name of league to which the player’s team belongs – Not mentioned explicitly in asset – Value-added by Taalee’s processing based on semantic associations
Sport: Name of sport
Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.
The asset is richly, fully described in the many ways the users choose to interact.
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context or relationships of keywords
For example, a search engine may see that the word “Commerce One” occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee’s Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
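A rough Python sketch of how such associations can return related content alongside the direct hits. The triple representation, the relation names, and the `STORIES` catalog are assumptions chosen for illustration; the slides do not describe how Taalee's Semantic Content Model is stored.

```python
# Hypothetical (subject, relation, object) triples; the facts mirror the slide.
TRIPLES = [
    ("Commerce One", "IS_A", "COMPANY"),
    ("Ariba", "IS_A", "COMPANY"),
    ("Commerce One", "PARTICIPATES_IN",
     "Corporate, Professional & Financial Software"),
    ("Commerce One", "COMPETES_WITH", "Ariba"),
]

# Hypothetical catalog of assets already tagged with company metadata.
STORIES = [
    {"title": "Story on Commerce One", "companies": ["Commerce One"]},
    {"title": "Story on Ariba", "companies": ["Ariba"]},
    {"title": "Story on Cisco", "companies": ["Cisco"]},
]

def competitors(company):
    """Collect competitors, treating COMPETES_WITH as a symmetric relation."""
    found = set()
    for subject, relation, obj in TRIPLES:
        if relation != "COMPETES_WITH":
            continue
        if subject == company:
            found.add(obj)
        elif obj == company:
            found.add(subject)
    return found

def search(company):
    """Return what the user asked for plus semantically related stories."""
    asked = [s["title"] for s in STORIES if company in s["companies"]]
    rivals = competitors(company)
    related = [s["title"] for s in STORIES
               if rivals.intersection(s["companies"])]
    return {"asked_for": asked, "related": related}

print(search("Commerce One"))
```

A keyword index alone would return only the `asked_for` list; the `related` list exists because the COMPETES_WITH association is known.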
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company ‘Commerce One’
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One’s competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata-centric Content Management Architecture
Internal Source 1 (Research)
Internal Source 2
External feeds/Web (e.g., Reuters)
Extractor Agent 1
Extractor Agent 2
Extractor Agent 3
Semantic Engine
World Model
Taalee Metabase
XCM-compliant metadata, XML or other format
Semantic Application
ASP/Enterprise hosted
Third-party Content Mgmt and Syndication
Story on Cisco
Story on Lucent
1 Cisco story from PW Source 1 passed on to add semantic associations
2 Consults Knowledge Base for Cisco’s competition
3 Returns result: Lucent is a competitor of Cisco
4 Lucent story from external feeds picked for publishing as “semantically related” to Cisco story – passed on to Dashboard
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
SEC, EPA
Regulations: Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for the future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF – one proposal (cf. Ora Lassila)
Structural modeling is obviously not enough: we need a “logic layer” on top of RDF; some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data
RDF + DL = “Frame System for WWW”
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
“A Web in which machine reasoning will be ubiquitous and devastatingly powerful.” [Berners-Lee]
“A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture.” [Berners-Lee]
“A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities.” [Berners-Lee]
A personal definition: Semantic Web: the concept that Web-accessible content can be organized semantically rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three layered Architecture Of Semantic Web
Logical Layer
Formal Semantics and Reasoning Support – OIL, DAML-O
Schema Layer
Definition of Vocabulary RDF Schema
Data Layer
Simple data model and syntax for metadata - RDF
OIL – as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL – Evolving towards Semantic Web
OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Test May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake
<= dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
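The rule above can be evaluated with a short Python sketch. Several points are assumptions rather than part of the slide: the units (days and miles, taken from the later query slide), the use of the haversine great-circle formula for `distance`, the requirement that the earthquake not precede the test, and the record layout with `name`, `eventDate`, `latitude` and `longitude` keys.

```python
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius

def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points via the haversine formula."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))

def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """Apply the rule: quake follows the test within max_days and max_miles."""
    days_after = (quake["eventDate"] - test["eventDate"]).days
    if not 0 <= days_after < max_days:
        return False
    return distance_miles(test["latitude"], test["longitude"],
                          quake["latitude"], quake["longitude"]) < max_miles

def candidate_pairs(tests, quakes):
    """All (test, quake) name pairs that satisfy the rule."""
    return [(t["name"], q["name"]) for t in tests for q in quakes
            if may_have_caused(t, q)]

# Made-up events, for illustration only.
tests = [{"name": "T1", "eventDate": date(1970, 1, 1),
          "latitude": 0.0, "longitude": 0.0}]
quakes = [
    {"name": "Q1", "eventDate": date(1970, 1, 15), "latitude": 0.0, "longitude": 10.0},
    {"name": "Q2", "eventDate": date(1969, 12, 31), "latitude": 0.0, "longitude": 10.0},
    {"name": "Q3", "eventDate": date(1970, 3, 1), "latitude": 0.0, "longitude": 10.0},
]
print(candidate_pairs(tests, quakes))
```

Only Q1 qualifies: Q2 precedes the test and Q3 falls outside the 30-day window. Satisfying the rule marks a pair as a candidate for further study; it does not establish causation.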
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example…
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9 and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
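The grouping query on this slide could be computed along the following lines. This is a sketch under assumptions: the ranges are treated as half-open intervals (so a magnitude of exactly 9 falls in the last group), "average number" is read as the mean yearly count over the chosen span of years, and the input is a plain list of hypothetical `(year, magnitude)` pairs.

```python
from collections import Counter

# Magnitude ranges from the slide, as half-open intervals [low, high).
RANGES = [
    ("5.8-6", 5.8, 6.0),
    ("6-7", 6.0, 7.0),
    ("7-8", 7.0, 8.0),
    ("8-9", 8.0, 9.0),
    (">9", 9.0, float("inf")),
]

def range_label(magnitude):
    """Return the label of the range containing magnitude, or None if below 5.8."""
    for label, low, high in RANGES:
        if low <= magnitude < high:
            return label
    return None

def average_per_year(quakes, first_year, last_year):
    """Mean yearly earthquake count per magnitude range over the given years."""
    counts = Counter()
    for year, magnitude in quakes:
        label = range_label(magnitude)
        if label is not None and first_year <= year <= last_year:
            counts[label] += 1
    years = last_year - first_year + 1
    return {label: counts[label] / years for label, _, _ in RANGES}

# Made-up sample records, for illustration only.
sample = [(1900, 5.9), (1900, 6.5), (1901, 6.2), (1901, 7.4), (1901, 4.0)]
print(average_per_year(sample, 1900, 1901))
```

Running the same function over the years before and after the first nuclear test would give the two sets of averages that the slide compares.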
Knowledge Discovery - Example…
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NewsML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
Semantic Web: www.semanticweb.org
VoiceXML: www.voicexml.org
MPEG-7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw-Hill, ISBN 0-07-057735-8, 1998
HP 80
Metadata Targeting
SemanticInteractive Targeting
Buy Al Pacino VideosBuy Russell Crowe VideosBuy Christopher Plummer VideosBuy Diane Venora VideosBuy Philip Baker Hall VideosBuy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
Realtime Feeds
Interests Preferences
Time-ShiftedContent Aggregator
Web sites and Pages
ContentDatabases Personalized
Content
Semantic EngineTM
Personalized Content
Content
Structured Hi-Quality
Semantic Metabase
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed WebBased registration and personalization at VoquettersquosEnterprise Customer site
Userrsquos ldquoWireless Home pagerdquo shows the categories for his interests There is an alert (new content) for his stock and sportscategories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down userrsquos Personal Portfolio list The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and the knowledge of userrsquos portfolio Typically search can be done by typing one word or selecting from a dynamic personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions listen set alert on story add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
1108 ON24 Payne1107 ON24 HampQ 1106 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listingsorted by date Semantic filtering uses just the right metadata to meet screen and other constrains Eg Analyst Call focuses onthe source and analyst name or company The icon denote additional metadata such as ldquoStrong Buyrdquo by HampQ Analyst
HP 87
iTV Taaleersquos Extreme Personalization
Content Provider
(DBS DISH Wink AOL-TV)
Semantic EngineTM
Meta-DataTagged Content
ContentldquoProgramsrdquo
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-QualitySemantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that isused by personalization application to show only the stocksthat user is interested in
This screen is customizablewith interactivity featureusing metadata such as whetherthere is a new ConferenceCall video on CSCO
Part of the screen can beautomatically customized to show conference call specific informationndash including transcriptparticipation etc all of which arerelevant metadata
Conference Call itself can have embedded metadata to support personalization andinteractivity
HP 89
Metadata in Enterprise Apps
Filter Search ConsolidatePersonalize ArchiveLicensing Syndication
Production SupportProduction SupportSony
Categorize
Catalog
Integrate
CollectionCollection ProcessingProcessing
NetworkContent
AffiliateFeeds
Public Sources Rich Data
Metabase
HP 90
t
A leaking gasoline pipeline burst into flames Thursday killing more than 60 people near Nigerias commercial capital of Lagos Many of the dead were fisherman in wooden canoes engulfed in the inferno
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko facing the central business district of Lagos across a lagoon
At least 60 people died in this needless fire senior local official Karimu Alabi said
Fire crews from state-run Nigerian National Petroleum Corp (NNPC) which owns the pipeline were joined by other firemen from construction company Julius Berger in battling the blaze
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak ravaging a cluster of huts and log houses
At about the same time a second fire razed Makoko shantytown where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in thewooden huts for days Many victims of the Ebute-Oke fire were
Gore Demands That Recount Restart (940 PM)Gore Says Fla Cant Name Electors (450 PM)Bush Meets Colin Powell at Ranch (122 PM)
Market Tumbles on Earnings Warning (927 AM)Barak Outlines His Peace Plan (630 AM)
-- Breaking News for 11302000 --Customize Page Settings | Content | Layout | Color
Sixty Die In Nigeria BlastProduced by Euronews Posted Date 11302000 Event Election 2000 Location Tallahassee Florida USAPeople Al Gore George W Bush
Video
bull Value-add for production broadcast amp syndication
bull Taaleersquos semantic metadata enables powerful access to content used by Enterprisersquos customers
bull Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --Gore Demands That Recount Restart
Gore Says Fla Cant Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(133) ndash 120600 - ABC
(253) - 120600 - CBS
(516) - 120600 - ABC
(246) - 120600 - FOX
(133) - 120600 - NBC
(533) - 120600
(357) - 120600 - CBS
(427) - 120600 - ABC
(344) - 120600 - FOX
(724) - 120600 - CBS
(133) - 120600 - CBS
TALLAHASSEE Florida (CNN) ndashThough the two presidential candidates have until noon Wednesday to file briefs in Al Gores appeal to the Florida Supreme Court the outcome of two trials set on the same day in Leon County Florida may offer Gore his best hope for the presidency Democrats in Seminole County are seeking to have 15000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5000 votes statewide Lawyers for the plaintiff Harry Jacobs claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2126 incomplete absentee ballot applications sent in by GOP voters while refusing to allow Democratic workers to do the same thing for Democratic voters
The GOP says that suit and one similar to it from Martin County demonstrates Democratic Party politics at its most desperate Gore is not a party to either of those lawsuits On Tuesday the judge in the
(133) - 120600 - ABC
(233) - 120600 - CBS
(312) - 120600 - NNS
(032) - 120600 - CBS
(133) - 120600 - CBS
DescriptionProduced by CNNPosted Date 12072000 Reporter David Lewis Event Election 2000 Location Tallahassee Florida USAPeople Al Gore
HP 92
Retrieve Scene Description Track
Enhanced Digital Cable
Video
MPEGDecoder
Node = AVO Object
Create Scene Description Tree
GREATUSER
EXPERIENCE
Metadatarsquos role in emerging iTV infrastructure
MPEG-247MPEG
Encoder
SceneDescriptionTree
License metadata decoder and semantic applications to
device makers
Channel salesthrough Video Server Vendors
Video App Servers and Broadcasters
Enhanced XML
Description
ldquoCisco Systemsrdquo
Node
TaaleeSemanticEngine
ldquoCisco Systemsrdquo
Produced by Fox Sports Creation Date 12052000 League NFLTeams Seattle Seahawks
Atlanta Falcons Players John KitnaCoaches Mike Holmgren
Dan ReevesLocation Atlanta
Object Content Information (OCI)
Metadata-richValue-added Node
HP 93
Intelligent Metadata Creation
Content which doescontain the wordsthe user asked for
Extractor Agents
Content which does not contain the words
the user asked for but is about what he asked
for
Value-added Metadata
Content the user did not think to ask for but
which he needs to know
Semantic Associations
+ +
Metadata for Intelligent ContentMetadata for Intelligent Content
Usage
HP 94
Intelligent Contentvia
Value-Added Metadata
HP 95
Value-added MetadataTraditional methods rely solely on (syntactic) indexing of keywords to enable
users to access content
bull If a keyword is not in the content it cannot be found
bull The burden is on the user to think of and ask for the ldquorightrdquo keyword
For example If a story is about ldquoRoger Clemensrdquo but does not contain the
words ldquoNew York Yankeesrdquo that story cannot and will not be found if the user
searches for ldquoNew York Yankeesrdquo or ldquoYankeesrdquo
Understanding of the content is needed to create new metadata
Taalee understands Roger Clemens is a PERSON who Plays a SPORT called
Baseball for a TEAM from New York called the Yankees Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY)
to add missing metadata to describe content more completely
HP 96
Guided Demo for Value Added Metadata ndashExample one
bull Go to httpwwwmediaanywherecomFootballhtml amp search for Player = Jamal Anderson
bull Click on the first result (titled ldquoWeek 3 Top10 Anderson TD Runrdquo) and view the metadata
on the following RMR page
bull Here is what you see
Produced by NFLcom Posted Date 9202000 League NFL
Teams Atlanta Falcons Players Jamal Anderson
bull Now click on the button to play the asset (button marked ldquoREALrdquo)
bull View the source HTML page that has the original story and locate this story with the
heading ldquoWeek 3 top 10 Anderson TD runrdquo
bull Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user
searching for Atlanta Falcons will find this story on Jamal Anderson who is a player of
Atlanta Falcons team
bull Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata ndashExample Two
bull Go to httpwwwmediaanywherecomBaseballhtml amp search for Player = Gary Sheffield
bull Click on the first result (titled ldquoI want outrdquo) amp view the metadata on the following RMR page
bull Here is what you see
Produced by ESPN Posted Date 3032001 League National League
Teams Los Angeles Dodgers Players Gary Sheffield
bull Now click on the button to play the asset (button marked ldquoREALrdquo)
bull View the source HTML page that has the original story and locate this story with the
heading ldquoI want outrdquo
bull Verify that Team=Los Angeles Dodgers or League=National League was not present in
the source content
bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user
searching for Los Angeles Dodgers will find this story on Gary Sheffield who is a player of
Los Angeles Dodgers team
N t th t th h i d di t i ill t b bl t d thi
HP 98
Example 1 ndash Snapshots (ldquoJamal Andersonrdquo)
Search for lsquoJamal Andersonrsquo in lsquoFootballrsquo
Click on first result for Jamal Anderson
View metadata Note that Team name and League name are also included
in the metadata
View the original source HTML page Verify that
the source page contains no mention of Team nameand League name They
were Taaleersquos value-additions to the metadata to facilitate easier search
HP 99
Example 2 ndash Snapshots (ldquoGary Sheffieldrdquo)
Click on first result for Gary Sheffield
View metadata Note that Team name and League name are also included
in the metadata
Search for lsquoGary Sheffieldrsquo in lsquoBaseballrsquo
View the original source HTML page Verify that
the source page contains no mention of Team nameand League name They
were Taaleersquos value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content ndash Value-Added Metadata
Posted Date
Posted Date
Date of asset posting ndashExtracted automatically
League Name
Name of league to which the payerrsquos team belongs ndash Not mentioned explicitly in asset ndash Value-added by Taaleersquos processing based on semantic associations
Name of team for which player plays ndash Not mentioned explicitly in assetndash Value-added using Taaleersquos semantic relationships
Team NameTeam Name
Producer Name
Producer Name
Rich MediaSports AssetRich Media
Sports Asset
Name of content provider that produced the asset
Some Metadata are obtained explicitly from the asset Others (not present in the asset) are added
by Taalee using its semantic relationships
The asset is richly fully described in the many ways the users chose to interact
Player NamesPlayer Names
SportSportName of
sport
LegendX Y meansTaalee uses X to add Yas value-added metadatato the asset
Name of players mentioned explicitly in the asset ndash Extracted automatically
HP 101
Intelligent Contentvia
Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context, or relationships of keywords
For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
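The difference between matching a keyword and knowing what it denotes can be sketched with a small store of typed entities and named relationships. This is only an illustration in Python: the `KnowledgeBase` class, its method names, and the relation labels `PARTICIPATES_IN` and `COMPETES_WITH` are assumptions made for this example, not Taalee's actual Semantic Content Model.

```python
from collections import defaultdict


class KnowledgeBase:
    """Tiny in-memory store of typed entities and named relationships (hypothetical)."""

    def __init__(self):
        self.entity_type = {}           # entity name -> type label, e.g. "COMPANY"
        self.edges = defaultdict(set)   # (subject, relation) -> set of objects

    def add_entity(self, name, kind):
        self.entity_type[name] = kind

    def relate(self, subject, relation, obj, symmetric=False):
        self.edges[(subject, relation)].add(obj)
        if symmetric:
            self.edges[(obj, relation)].add(subject)

    def related(self, subject, relation):
        """Return the objects linked to `subject` by `relation`, sorted by name."""
        return sorted(self.edges.get((subject, relation), set()))


kb = KnowledgeBase()
kb.add_entity("Commerce One", "COMPANY")
kb.add_entity("Ariba", "COMPANY")
kb.add_entity("Corporate, Professional & Financial Software", "INDUSTRY")
kb.relate("Commerce One", "PARTICIPATES_IN",
          "Corporate, Professional & Financial Software")
kb.relate("Commerce One", "COMPETES_WITH", "Ariba", symmetric=True)

# A keyword index only knows the string occurred; the store also knows
# what kind of thing it is and what it is associated with.
print(kb.entity_type["Commerce One"])
print(kb.related("Commerce One", "COMPETES_WITH"))
```

A search front end could use `related()` to offer links to news on competitors alongside the direct keyword hits.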
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata-centric Content Management Architecture
Sources: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g. Reuters)
Components: Extractor Agent 1; Extractor Agent 2; Extractor Agent 3; Semantic Engine; World Model; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication
Output: XCM-compliant metadata, XML or other format
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults Knowledge Base for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard
Delivered: Story on Cisco; Story on Lucent
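The four-step flow in this architecture (a Cisco story arrives, the knowledge base is consulted for Cisco's competition, Lucent is returned, and a Lucent story from an external feed is passed to the dashboard) can be sketched as follows. The `COMPETITORS` table, the story dictionaries, and the `related_stories` function are hypothetical names used only for illustration.

```python
# Hypothetical knowledge-base fragment: company -> competitors.
COMPETITORS = {"Cisco": ["Lucent"]}


def related_stories(story, feed, competitors):
    """Return feed stories that mention a competitor of any company in `story`."""
    wanted = {
        rival
        for company in story["companies"]
        for rival in competitors.get(company, [])
    }
    return [item for item in feed if wanted & set(item["companies"])]


# Step 1: a story from an internal source, already tagged with its companies.
internal_story = {"title": "Story on Cisco", "companies": ["Cisco"]}

# External feed items, also tagged by extractor agents (assumed).
external_feed = [
    {"title": "Story on Lucent", "companies": ["Lucent"]},
    {"title": "Story on IBM", "companies": ["IBM"]},
]

# Steps 2-4: consult the knowledge base, pick semantically related stories,
# and pass both to the dashboard.
dashboard = [internal_story] + related_stories(internal_story, external_feed, COMPETITORS)
print([item["title"] for item in dashboard])
```

The IBM story is left out because the knowledge base records no relationship between IBM and Cisco; nothing in the story text itself needs to mention Lucent.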
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
SEC, EPA Regulations: Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g. InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]
A personal definition: Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-layered Architecture of Semantic Web
Logical Layer: Formal Semantics and Reasoning Support - OIL, DAML-O
Schema Layer: Definition of Vocabulary - RDF Schema
Data Layer: Simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL - Evolving towards Semantic Web
OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Tests May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
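The rule above can be turned into a small executable check. The sketch below is one possible reading, not the InfoQuilt implementation: it assumes the date difference is measured in days and must be non-negative (the earthquake follows the test), that distance is a great-circle distance in miles computed with the haversine formula, and that events are plain dictionaries with an added, hypothetical `id` field.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an assumed constant


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points via the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # Clamp to guard against rounding slightly above 1 for antipodal points.
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """Apply the rule: quake within `max_days` after the test and within `max_miles`."""
    days = (quake["eventDate"] - test["eventDate"]).days
    if not 0 <= days < max_days:
        return False
    return distance_miles(test["latitude"], test["longitude"],
                          quake["latitude"], quake["longitude"]) < max_miles


def find_pairs(tests, quakes):
    """Return (test id, quake id) for every pair satisfying the rule."""
    return [(t["id"], q["id"]) for t in tests for q in quakes if may_have_caused(t, q)]


# Made-up records for illustration only.
tests = [{"id": "T1", "eventDate": date(1998, 5, 11), "latitude": 27.1, "longitude": 71.8}]
quakes = [
    {"id": "Q1", "eventDate": date(1998, 5, 21), "latitude": 27.1, "longitude": 71.8},
    {"id": "Q2", "eventDate": date(1998, 5, 1), "latitude": 27.1, "longitude": 71.8},
    {"id": "Q3", "eventDate": date(1998, 6, 20), "latitude": 27.1, "longitude": 71.8},
]
print(find_pairs(tests, quakes))
```

Only Q1 pairs with T1 here: Q2 precedes the test and Q3 falls outside the 30-day window.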
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year, starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example...
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
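The grouping query on this slide can be sketched as a bucketed count averaged over the years in a period. This is an illustrative reading only: the half-open bucket boundaries (a magnitude of exactly 6.0 falls in the 6-7 group, and exactly 9.0 in the top group), the `(year, magnitude)` tuple format, and the function names are assumptions, and the sample data is made up.

```python
from collections import Counter

# Magnitude groups from the slide, treated here as half-open intervals [low, high).
BUCKETS = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, float("inf"))]


def bucket_of(magnitude):
    """Return the (low, high) group containing `magnitude`, or None if below 5.8."""
    for low, high in BUCKETS:
        if low <= magnitude < high:
            return (low, high)
    return None


def average_per_year(quakes, first_year, last_year):
    """Average number of earthquakes per year in each magnitude group."""
    counts = Counter()
    for year, magnitude in quakes:
        group = bucket_of(magnitude)
        if group is not None and first_year <= year <= last_year:
            counts[group] += 1
    n_years = last_year - first_year + 1
    return {group: counts[group] / n_years for group in BUCKETS}


# Made-up (year, magnitude) records for illustration only.
sample = [(1900, 5.9), (1900, 6.5), (1901, 6.2), (1901, 7.4), (1901, 5.0)]
averages = average_per_year(sample, 1900, 1901)
for (low, high), value in averages.items():
    print(f"{low}-{high}: {value}")
```

Running the same function over two periods (for example before and after a chosen year) gives the per-group comparison the slide draws its conclusion from.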
Knowledge Discovery - Example...
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
Semantic Interactive Targeting
Buy Al Pacino Videos
Buy Russell Crowe Videos
Buy Christopher Plummer Videos
Buy Diane Venora Videos
Buy Philip Baker Hall Videos
Buy The Insider Video
Precisely targeted through the use of Structured Metadata and integration from multiple sources
HP 82
Web Extreme Personalization
Realtime Feeds
Time-Shifted Content Aggregator
Web sites and Pages
Content Databases
Interests, Preferences
Semantic Engine(TM)
Structured Hi-Quality Semantic Metabase
Personalized Content
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
[Screen: MyStocks, News, Sports, Music, MyMedia]
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site
User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
[Screen: MyStocks, News, Sports, Music, MyMedia; My Stocks: CSCO, NT, IBM, Market]
Clicking on MyStocks brings down user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and the knowledge of user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
[Screen: MyStocks, News, Sports, Music, MyMedia; My Stocks: CSCO, NT, IBM, Market; CSCO: Analyst Call, Conf Call, Earnings]
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
[Screen: MyStocks, News, Sports, Music, MyMedia; My Stocks: CSCO, NT, IBM, Market; CSCO: Analyst Call, Conf Call, Earnings]
CSCO Analysis: 11/08 ON24 Payne; 11/07 ON24 H&Q; 11/06 CBS Langlesis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst
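One way to picture "semantic filtering" under a screen constraint is to rank metadata fields per content type and emit them until the line no longer fits. The sketch below is an assumption-laden illustration: the `FIELD_PRIORITY` table, the field names, and the character budget are hypothetical, and only the Analyst Call emphasis on source and analyst comes from the slide.

```python
# Hypothetical priority of metadata fields per content type.
FIELD_PRIORITY = {
    "Analyst Call": ["date", "source", "analyst", "rating"],
    "Earnings": ["date", "source", "quarter"],
}
DEFAULT_FIELDS = ["date", "source"]


def render_line(item, max_chars):
    """Append fields in priority order, stopping before the line exceeds `max_chars`."""
    parts = []
    for field in FIELD_PRIORITY.get(item["type"], DEFAULT_FIELDS):
        value = item.get(field)
        if value is None:
            continue
        candidate = " ".join(parts + [value])
        if len(candidate) > max_chars:
            break
        parts.append(value)
    return " ".join(parts)


call = {
    "type": "Analyst Call",
    "date": "11/07",
    "source": "ON24",
    "analyst": "H&Q",
    "rating": "Strong Buy",
}
print(render_line(call, 14))   # narrow wireless screen
print(render_line(call, 40))   # wider display
```

On the narrow budget the rating is dropped from the text line, which is where an icon could carry it instead.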
HP 87
iTV: Taalee's Extreme Personalization
Content Provider (DBS, DISH, Wink, AOL-TV)
Content "Programs"
Semantic Engine(TM)
Meta-Data Tagged Content
Structured Hi-Quality Semantic Metabase
Immediate Interests, Preferences
Personalized Content Capsules
Redirects and Programming
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by personalization application to show only the stocks that user is interested in
This screen is customizable with interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO
Part of the screen can be automatically customized to show conference call specific information - including transcript, participation, etc., all of which are relevant metadata
Conference Call itself can have embedded metadata to support personalization and interactivity
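The two metadata uses described on this slide, showing only the stocks a user follows and flagging a new Conference Call video, might look like the following. The segment structure and the metadata keys (`tickers`, `type`, `new`) are hypothetical; the slide does not specify a format.

```python
def personalize(segments, interests):
    """Keep only segments whose metadata tickers overlap the user's interests."""
    return [
        seg for seg in segments
        if interests & set(seg["metadata"].get("tickers", []))
    ]


def has_new_conference_call(segments, ticker):
    """True if any segment is tagged as a new Conference Call for `ticker`."""
    return any(
        seg["metadata"].get("type") == "Conference Call"
        and seg["metadata"].get("new", False)
        and ticker in seg["metadata"].get("tickers", [])
        for seg in segments
    )


# Made-up segments with embedded metadata, for illustration only.
segments = [
    {"title": "CSCO ticker", "metadata": {"type": "Quote", "tickers": ["CSCO"]}},
    {"title": "MSFT ticker", "metadata": {"type": "Quote", "tickers": ["MSFT"]}},
    {"title": "Cisco call", "metadata": {"type": "Conference Call", "tickers": ["CSCO"], "new": True}},
]
user_interests = {"CSCO", "NT", "IBM"}

print([seg["title"] for seg in personalize(segments, user_interests)])
print(has_new_conference_call(segments, "CSCO"))
```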
HP 89
Metadata in Enterprise Apps
Collection: Network Content; Affiliate Feeds; Public Sources
Processing: Categorize; Catalog; Integrate
Rich Data Metabase
Production Support (Sony): Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp. (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were
-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
Video: Sixty Die In Nigeria Blast
Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W. Bush
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday, the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore
HP 92
Metadata's role in emerging iTV infrastructure
MPEG-2/4/7
Video; MPEG Encoder; Create Scene Description Tree; Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track
Scene Description Tree (Node = AVO Object)
Taalee Semantic Engine: Node "Cisco Systems"; Enhanced XML Description; Metadata-rich Value-added Node
Object Content Information (OCI): Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: John Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers, and Broadcasters
GREAT USER EXPERIENCE
HP 93
Intelligent Metadata Creation
Metadata for Intelligent Content
Extractor Agents: content which does contain the words the user asked for
+ Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for
+ Semantic Associations: content the user did not think to ask for, but which he needs to know
Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example: If a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"
Understanding of the content is needed to create new metadata
Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
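The idea of adding metadata that the source never states can be sketched as a lookup against a small world model. Everything named here is an assumption made for illustration: the `PLAYS_FOR` and `TEAM_LEAGUE` tables stand in for Taalee's semantic relationships, and the field names follow the metadata shown in the guided demos ("Players", "Teams", "League").

```python
# Hypothetical world-model fragments: player -> team, team -> league.
PLAYS_FOR = {
    "Roger Clemens": "New York Yankees",
    "Jamal Anderson": "Atlanta Falcons",
    "Gary Sheffield": "Los Angeles Dodgers",
}
TEAM_LEAGUE = {
    "New York Yankees": "American League",
    "Atlanta Falcons": "NFL",
    "Los Angeles Dodgers": "National League",
}


def enrich(metadata):
    """Return a copy of `metadata` with team and league inferred from the players."""
    enriched = dict(metadata)
    teams = list(metadata.get("Teams", []))
    for player in metadata.get("Players", []):
        team = PLAYS_FOR.get(player)
        if team and team not in teams:
            teams.append(team)
    if teams:
        enriched["Teams"] = teams
        leagues = {TEAM_LEAGUE[t] for t in teams if t in TEAM_LEAGUE}
        if len(leagues) == 1:          # only add a league when it is unambiguous
            enriched.setdefault("League", leagues.pop())
    return enriched


def matches(metadata, query):
    """True if `query` equals any metadata value (or any element of a list value)."""
    for value in metadata.values():
        values = value if isinstance(value, list) else [value]
        if query in values:
            return True
    return False


# Metadata extracted directly from the asset: the team is never mentioned.
extracted = {"Produced by": "NFL.com", "Posted Date": "9/20/2000", "Players": ["Jamal Anderson"]}
value_added = enrich(extracted)

print(matches(extracted, "Atlanta Falcons"))
print(matches(value_added, "Atlanta Falcons"))
print(value_added["League"])
```

A search on the team name fails against the extracted metadata and succeeds against the enriched copy, which is the behavior the slide describes.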
HP 96
Guided Demo for Value-Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value-Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 82
Web Extreme Personalization
Realtime Feeds
Interests Preferences
Time-ShiftedContent Aggregator
Web sites and Pages
ContentDatabases Personalized
Content
Semantic EngineTM
Personalized Content
Content
Structured Hi-Quality
Semantic Metabase
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed WebBased registration and personalization at VoquettersquosEnterprise Customer site
Userrsquos ldquoWireless Home pagerdquo shows the categories for his interests There is an alert (new content) for his stock and sportscategories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down userrsquos Personal Portfolio list The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and the knowledge of userrsquos portfolio Typically search can be done by typing one word or selecting from a dynamic personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions listen set alert on story add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
1108 ON24 Payne1107 ON24 HampQ 1106 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listingsorted by date Semantic filtering uses just the right metadata to meet screen and other constrains Eg Analyst Call focuses onthe source and analyst name or company The icon denote additional metadata such as ldquoStrong Buyrdquo by HampQ Analyst
HP 87
iTV Taaleersquos Extreme Personalization
Content Provider
(DBS DISH Wink AOL-TV)
Semantic EngineTM
Meta-DataTagged Content
ContentldquoProgramsrdquo
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-QualitySemantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that isused by personalization application to show only the stocksthat user is interested in
This screen is customizablewith interactivity featureusing metadata such as whetherthere is a new ConferenceCall video on CSCO
Part of the screen can beautomatically customized to show conference call specific informationndash including transcriptparticipation etc all of which arerelevant metadata
Conference Call itself can have embedded metadata to support personalization andinteractivity
HP 89
Metadata in Enterprise Apps
Filter Search ConsolidatePersonalize ArchiveLicensing Syndication
Production SupportProduction SupportSony
Categorize
Catalog
Integrate
CollectionCollection ProcessingProcessing
NetworkContent
AffiliateFeeds
Public Sources Rich Data
Metabase
HP 90
t
A leaking gasoline pipeline burst into flames Thursday killing more than 60 people near Nigerias commercial capital of Lagos Many of the dead were fisherman in wooden canoes engulfed in the inferno
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko facing the central business district of Lagos across a lagoon
At least 60 people died in this needless fire senior local official Karimu Alabi said
Fire crews from state-run Nigerian National Petroleum Corp (NNPC) which owns the pipeline were joined by other firemen from construction company Julius Berger in battling the blaze
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak ravaging a cluster of huts and log houses
At about the same time a second fire razed Makoko shantytown where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in thewooden huts for days Many victims of the Ebute-Oke fire were
Gore Demands That Recount Restart (940 PM)Gore Says Fla Cant Name Electors (450 PM)Bush Meets Colin Powell at Ranch (122 PM)
Market Tumbles on Earnings Warning (927 AM)Barak Outlines His Peace Plan (630 AM)
-- Breaking News for 11302000 --Customize Page Settings | Content | Layout | Color
Sixty Die In Nigeria BlastProduced by Euronews Posted Date 11302000 Event Election 2000 Location Tallahassee Florida USAPeople Al Gore George W Bush
Video
bull Value-add for production broadcast amp syndication
bull Taaleersquos semantic metadata enables powerful access to content used by Enterprisersquos customers
bull Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --Gore Demands That Recount Restart
Gore Says Fla Cant Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(133) ndash 120600 - ABC
(253) - 120600 - CBS
(516) - 120600 - ABC
(246) - 120600 - FOX
(133) - 120600 - NBC
(533) - 120600
(357) - 120600 - CBS
(427) - 120600 - ABC
(344) - 120600 - FOX
(724) - 120600 - CBS
(133) - 120600 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore
HP 92
Metadata's role in emerging iTV infrastructure
[Diagram labels: Video; MPEG Encoder (MPEG-2/4/7); Create Scene Description Tree; Scene Description Tree; Node = AVO Object; Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; GREAT USER EXPERIENCE]
[Diagram labels: Taalee Semantic Engine; "Cisco Systems" Node; Enhanced XML Description; Object Content Information (OCI); Metadata-rich Value-added Node]
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers and Broadcasters
Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: John Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta
HP 93
Metadata for Intelligent Content
Intelligent Metadata Creation / Usage
Extractor Agents: content which does contain the words the user asked for
+ Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for
+ Semantic Associations: content the user did not think to ask for, but which he needs to know
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"
Understanding of the content is needed to create new metadata
Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
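The enrichment step on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Taalee's actual (patented) implementation: the `KNOWLEDGE_BASE` dictionary, the `enrich` and `matches` functions, and the field names are hypothetical, and the only facts encoded are ones that appear on the slides.

```python
# Hypothetical knowledge base: maps a player to facts a story may not state.
KNOWLEDGE_BASE = {
    "Roger Clemens": {"Teams": "New York Yankees", "Sport": "Baseball"},
    "Jamal Anderson": {"Teams": "Atlanta Falcons", "League": "NFL"},
}


def enrich(metadata):
    """Return a copy of the extracted metadata with value-added fields."""
    enriched = {field: list(values) for field, values in metadata.items()}
    for player in metadata.get("Players", []):
        for field, value in KNOWLEDGE_BASE.get(player, {}).items():
            values = enriched.setdefault(field, [])
            if value not in values:
                values.append(value)
    return enriched


def matches(metadata, query):
    """Case-insensitive keyword match against every metadata value."""
    needle = query.lower()
    return any(needle in value.lower()
               for values in metadata.values() for value in values)


extracted = {"Players": ["Roger Clemens"]}  # what keyword extraction found
enriched = enrich(extracted)                # what would be stored instead

print(matches(extracted, "Yankees"))  # False: the keyword never appears
print(matches(enriched, "Yankees"))   # True: added via the association
```

The design point is that the search function itself stays purely syntactic; all of the "understanding" lives in the metadata that was attached before indexing.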
HP 96
Guided Demo for Value Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who plays for the Atlanta Falcons
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who plays for the Los Angeles Dodgers
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 - Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Producer Name: name of content provider that produced the asset
Posted Date: date of asset posting - extracted automatically
Player Names: names of players mentioned explicitly in the asset - extracted automatically
Team Name: name of team for which player plays - not mentioned explicitly in asset - value-added using Taalee's semantic relationships
League Name: name of league to which the player's team belongs - not mentioned explicitly in asset - value-added by Taalee's processing based on semantic associations
Sport: name of sport
Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships
The asset is richly and fully described in the many ways users choose to interact with it
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context or relationships of keywords
For example, a search engine may see that the words "Commerce One" occur, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
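A minimal sketch of how typed associations could turn a company search into "what you asked for" plus related content. Everything here is an assumption for illustration: the triple list `ASSOCIATIONS`, the relationship names, the `Story` record and the sample stories are hypothetical, and treating COMPETES_WITH as symmetric is a modelling choice, not something the slides state.

```python
from collections import namedtuple

Story = namedtuple("Story", ["title", "company"])

# Hypothetical typed associations: (subject, relationship, object).
ASSOCIATIONS = [
    ("Commerce One", "COMPETES_WITH", "Ariba"),
    ("Commerce One", "PARTICIPATES_IN",
     "Corporate, Professional & Financial Software"),
]


def competitors(company):
    """Companies linked to `company` by COMPETES_WITH, in either direction."""
    rivals = set()
    for subject, relationship, obj in ASSOCIATIONS:
        if relationship != "COMPETES_WITH":
            continue
        if subject == company:
            rivals.add(obj)
        elif obj == company:
            rivals.add(subject)
    return rivals


def intelligent_content(company, stories):
    """Split stories into those asked for and those semantically related."""
    rivals = competitors(company)
    asked_for = [s.title for s in stories if s.company == company]
    related = [s.title for s in stories if s.company in rivals]
    return asked_for, related


stories = [  # made-up catalog entries, for illustration only
    Story("Commerce One quarterly results", "Commerce One"),
    Story("Ariba product launch", "Ariba"),
    Story("Unrelated weather report", None),
]
asked_for, related = intelligent_content("Commerce One", stories)
```

A keyword engine would return only the first list; the second list exists only because the relationship between the two companies is stored explicitly.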
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata-centric Content Management Architecture
Sources: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters)
Extractor Agent 1; Extractor Agent 2; Extractor Agent 3
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults Knowledge Base for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard
Output: Story on Cisco; Story on Lucent; XCM-compliant metadata (XML or other format)
Components: Semantic Engine; World Model; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What You Need to Know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY
Technology / Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example
Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search/Personalization, etc.
A vision for future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling obviously not enough:
we need a "logic layer" on top of RDF
some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]
A personal definition
Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF+XML
Agent-readable Tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-layered Architecture of Semantic Web
Logical Layer: Formal Semantics and Reasoning Support - OIL, DAML-O
Schema Layer: Definition of Vocabulary - RDF Schema
Data Layer: Simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL - Evolving towards Semantic Web
OIL Mission
OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Test May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
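The rule above can be read as an executable predicate. The following Python sketch is an interpretation, not InfoQuilt's implementation: the `Event` class, the function names, the haversine formula for `distance`, the use of days and miles as units, and the requirement that the earthquake not precede the test are all assumptions layered on the slide's rule.

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; miles is an assumed unit


@dataclass
class Event:
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees


def distance_miles(a, b):
    """Great-circle (haversine) distance between two events, in miles."""
    lat1, lon1, lat2, lon2 = map(
        radians, (a.latitude, a.longitude, b.latitude, b.longitude))
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * asin(min(1.0, sqrt(h)))


def could_have_caused(test, quake, max_days=30, max_miles=10000):
    """True if the quake follows the test closely enough in time and space."""
    days_after = (quake.event_date - test.event_date).days
    return (0 <= days_after < max_days
            and distance_miles(test, quake) < max_miles)


# Made-up coordinates and dates, for illustration only.
test = Event(date(2000, 1, 1), 0.0, 0.0)
near_quake = Event(date(2000, 1, 15), 10.0, 10.0)
late_quake = Event(date(2000, 3, 1), 10.0, 10.0)
early_quake = Event(date(1999, 12, 25), 10.0, 10.0)
far_quake = Event(date(2000, 1, 10), 0.0, 180.0)
```

Note that if the distance is measured in miles along the Earth's surface, a 10000-mile threshold excludes only points close to the antipode, so the temporal condition does most of the filtering.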
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year, starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example...
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9 and >9 on the Richter scale per year, starting from 1900, find average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
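The grouping query on this slide amounts to a count per (year, magnitude band) followed by an average over a span of years. A small Python sketch, assuming each earthquake record is a `(year, magnitude)` pair; the function names, the half-open band boundaries (so the last band is 9 and above rather than strictly above 9) and the sample records are assumptions made for illustration, not data from the demo.

```python
from collections import Counter

# Magnitude bands from the slide, modelled as half-open intervals [low, high).
BANDS = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, float("inf"))]


def band_of(magnitude):
    """Return the band containing `magnitude`, or None if below 5.8."""
    for low, high in BANDS:
        if low <= magnitude < high:
            return (low, high)
    return None


def counts_per_year(quakes, start_year=1900):
    """Count earthquakes per (year, band) from `start_year` onward."""
    counts = Counter()
    for year, magnitude in quakes:
        band = band_of(magnitude)
        if year >= start_year and band is not None:
            counts[(year, band)] += 1
    return counts


def average_per_year(counts, band, first_year, last_year):
    """Average yearly count for one band over an inclusive range of years."""
    years = range(first_year, last_year + 1)
    return sum(counts[(year, band)] for year in years) / len(years)


# Made-up records, for illustration only.
quakes = [(1899, 7.5), (1950, 5.9), (1950, 6.5), (1951, 6.2), (1951, 5.0)]
counts = counts_per_year(quakes)
```

Comparing `average_per_year` for a band before and after a chosen year is one way to reproduce the kind of before/after comparison the slide describes.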
Knowledge Discovery - Example...
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 83
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
User has already completed Web-based registration and personalization at Voquette's Enterprise Customer site
User's "Wireless Home page" shows the categories for his interests. There is an alert (new content) for his stock and sports categories
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide)
Search at the bottom is a semantic search that understands the financial domain and the knowledge of user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst
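One way to picture semantic filtering for a small screen is a per-content-type choice of which metadata fields to render. The sketch below is hypothetical throughout: `FIELDS_BY_TYPE`, `render_line`, `render_listing`, the 20-character width and the record layout are assumptions, and sorting relies on the dates being zero-padded strings within a single year.

```python
# Hypothetical mapping from content type to the fields worth showing.
FIELDS_BY_TYPE = {
    "Analyst Call": ("date", "source", "analyst"),
}
DEFAULT_FIELDS = ("date", "source")


def render_line(item, content_type, width=20):
    """Render only the fields chosen for this content type, cut to width."""
    fields = FIELDS_BY_TYPE.get(content_type, DEFAULT_FIELDS)
    line = " ".join(item[field] for field in fields if field in item)
    return line[:width]


def render_listing(items, content_type, width=20):
    """Newest first; assumes zero-padded MM/DD dates within a single year."""
    ordered = sorted(items, key=lambda item: item["date"], reverse=True)
    return [render_line(item, content_type, width) for item in ordered]


calls = [
    {"date": "11/06", "source": "CBS", "analyst": "Langlesis"},
    {"date": "11/08", "source": "ON24", "analyst": "Payne"},
    {"date": "11/07", "source": "ON24", "analyst": "H&Q",
     "rating": "Strong Buy"},
]
listing = render_listing(calls, "Analyst Call")
```

Fields such as the rating stay in the metadata but are left off the line; in the slide's design they surface as icons instead of text.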
HP 87
iTV: Taalee's Extreme Personalization
Content Provider (DBS, DISH, Wink, AOL-TV)
Semantic Engine(TM)
Meta-Data Tagged Content
Content "Programs"
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-Quality Semantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by personalization application to show only the stocks that user is interested in
This screen is customizable with interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO
Part of the screen can be automatically customized to show conference call specific information - including transcript, participation, etc., all of which are relevant metadata
Conference Call itself can have embedded metadata to support personalization and interactivity
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support
Sony
Categorize
Catalog
Integrate
Collection
Processing
Network Content
Affiliate Feeds
Public Sources, Rich Data
Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
-- Breaking News for 11/30/2000 --
Customize Page Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 84
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
Clicking on MyStocks brings down the user's Personal Portfolio list. The user wants to see news items about Cisco (see next slide).
Search at the bottom is a semantic search that understands the financial domain and has knowledge of the user's portfolio. Typically, search can be done by typing one word or selecting from a dynamic personalized menu.
My Stocks
CSCO
NT
IBM
Market
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available.
The user clicks to see a listing of Analyst Calls on Cisco (next slide).
Icons at the bottom of the screen enable contextually relevant functions: listen, set alert on story, add to playlist.
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
11/08 ON24 Payne
11/07 ON24 H&Q
11/06 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and analyst name or company. The icons denote additional metadata, such as "Strong Buy" by H&Q Analyst.
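One way to picture the semantic filtering described above is a per-content-type priority list of metadata fields, cut off at the available screen width. This is a sketch under assumptions, not Taalee's implementation: FIELD_PRIORITY, the field names, and the character budget are hypothetical. The sample values are taken from the listing on this slide.

```python
# Hypothetical priority of metadata fields per content type
FIELD_PRIORITY = {
    "Analyst Call": ["date", "source", "analyst"],
}


def filter_for_screen(item, max_chars):
    """Build a one-line entry from the highest-priority fields that fit in max_chars."""
    parts = []
    for field in FIELD_PRIORITY.get(item["type"], []):
        value = item.get(field)
        if value is None:
            continue
        if len(" ".join(parts + [value])) > max_chars:
            break
        parts.append(value)
    return " ".join(parts)


def listing(items, max_chars=20):
    """Entries sorted by date, newest first (MM/DD strings within a single year)."""
    ordered = sorted(items, key=lambda item: item["date"], reverse=True)
    return [filter_for_screen(item, max_chars) for item in ordered]


calls = [
    {"type": "Analyst Call", "date": "11/06", "source": "CBS", "analyst": "Langlesis"},
    {"type": "Analyst Call", "date": "11/08", "source": "ON24", "analyst": "Payne"},
    {"type": "Analyst Call", "date": "11/07", "source": "ON24", "analyst": "H&Q",
     "rating": "Strong Buy"},
]
```

With the default budget, listing(calls) reproduces the three entries shown on the slide. With a narrower screen (max_chars=10) the analyst name is dropped first, and the rating never appears in the text line because the slide shows it as an icon.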
HP 87
iTV: Taalee's Extreme Personalization
Content Provider
(DBS, DISH, Wink, AOL-TV)
Semantic Engine(TM)
Meta-Data Tagged Content
Content "Programs"
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-Quality Semantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in
This screen is customizable with an interactivity feature, using metadata such as whether there is a new Conference Call video on CSCO
Part of the screen can be automatically customized to show conference call specific information - including transcript, participation, etc., all of which are relevant metadata
The Conference Call itself can have embedded metadata to support personalization and interactivity
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support
Sony
Categorize
Catalog
Integrate
Collection
Processing
Network Content
Affiliate Feeds
Public Sources Rich Data
Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore
HP 92
Retrieve Scene Description Track
Enhanced Digital Cable
Video
MPEG Decoder
Node = AVO Object
Create Scene Description Tree
GREAT USER EXPERIENCE
Metadata's role in emerging iTV infrastructure
MPEG-2/4/7
MPEG Encoder
Scene Description Tree
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers and Broadcasters
Enhanced XML Description
"Cisco Systems"
Node
Taalee Semantic Engine
"Cisco Systems"
Produced by: Fox Sports
Creation Date: 12/05/2000
League: NFL
Teams: Seattle Seahawks, Atlanta Falcons
Players: John Kitna
Coaches: Mike Holmgren, Dan Reeves
Location: Atlanta
Object Content Information (OCI)
Metadata-rich Value-added Node
HP 93
Intelligent Metadata Creation
Extractor Agents: content which does contain the words the user asked for
+
Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for
+
Semantic Associations: content the user did not think to ask for, but which he needs to know
Metadata for Intelligent Content
Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example: If a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"
Understanding of the content is needed to create new metadata
Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely
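The idea of value-added metadata can be sketched with a small lookup table standing in for the knowledge base. This is an illustration only: KNOWLEDGE_BASE, the field names, and both functions are hypothetical, and the facts are those stated on this slide.

```python
# Hypothetical stand-in for the knowledge base; the facts come from the slide
KNOWLEDGE_BASE = {
    "Roger Clemens": {"sports": "Baseball", "teams": "New York Yankees"},
}


def enrich(metadata):
    """Return a copy of the metadata with facts implied by the listed players added."""
    enriched = {field: list(values) for field, values in metadata.items()}
    for player in metadata.get("players", []):
        for field, value in KNOWLEDGE_BASE.get(player, {}).items():
            bucket = enriched.setdefault(field, [])
            if value not in bucket:
                bucket.append(value)
    return enriched


def matches(metadata, query):
    """Plain keyword match over every metadata value."""
    return any(query.lower() in value.lower()
               for values in metadata.values() for value in values)


story = {"players": ["Roger Clemens"]}  # metadata extracted from the text alone
```

Here matches(story, "Yankees") is False, while matches(enrich(story), "Yankees") is True: the same keyword search succeeds once the implied team has been attached to the asset.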
HP 96
Guided Demo for Value Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who plays for the Atlanta Falcons
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who plays for the Los Angeles Dodgers
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 - Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Posted Date: date of asset posting - extracted automatically
League Name: name of league to which the player's team belongs - not mentioned explicitly in asset - value-added by Taalee's processing based on semantic associations
Team Name: name of team for which player plays - not mentioned explicitly in asset - value-added using Taalee's semantic relationships
Producer Name: name of content provider that produced the asset
Player Names: names of players mentioned explicitly in the asset - extracted automatically
Sport: name of sport
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships
The asset is richly and fully described in the many ways the users choose to interact
Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context or relationships of keywords
For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
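A semantic association of this kind can be pictured as typed relationships between entities that a search consults in addition to the keyword itself. The sketch below is illustrative only: the triple layout, the relationship names, and the story titles are hypothetical, and treating "competes with" as symmetric is an assumption.

```python
# (subject, relationship, object) triples; the facts are those on the slide
TRIPLES = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Commerce One", "participates_in", "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]


def competitors(company):
    """Companies linked to `company` by competes_with, in either direction."""
    found = set()
    for subject, relationship, obj in TRIPLES:
        if relationship != "competes_with":
            continue
        if subject == company:
            found.add(obj)
        elif obj == company:
            found.add(subject)
    return found


def related_news(company, stories):
    """Split stories into those asked for and those about competitors."""
    rivals = competitors(company)
    asked = [s["title"] for s in stories if s["company"] == company]
    related = [s["title"] for s in stories if s["company"] in rivals]
    return asked, related


# Placeholder stories for illustration
stories = [
    {"title": "Commerce One story", "company": "Commerce One"},
    {"title": "Ariba story", "company": "Ariba"},
    {"title": "Cisco story", "company": "Cisco"},
]
```

A keyword search for "Commerce One" returns only the first story. related_news("Commerce One", stories) also surfaces the Ariba story, which the user did not ask for by name.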
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata centric Content Management Architecture
Internal Source 1
Research
Internal Source 2
External feeds/Web (e.g. Reuters)
Extractor Agent 1
Extractor Agent 2
Extractor Agent 3
Semantic Engine
World Model
Taalee Metabase
Semantic Application
ASP/Enterprise hosted
Third-party Content Mgmt and Syndication
XCM-compliant metadata, XML or other format
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults Knowledge Base for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard
Story on Lucent
Story on Cisco
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
SEC, EPA
Regulations: Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example
Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search/Personalization, etc.
A vision for future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling obviously not enough
we need a "logic layer" on top of RDF
some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
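What a rule layer over RDF-style data buys can be shown with a toy forward-chaining loop over triples. This is a sketch, not a description-logic reasoner: it applies only two RDFS-style rules (subclass transitivity and type propagation), and the property names and example classes are hypothetical.

```python
def infer(triples):
    """Apply two rules until no new triples appear.

    (A subClassOf B), (B subClassOf C)  =>  (A subClassOf C)
    (x type A),       (A subClassOf B)  =>  (x type B)
    """
    facts = set(triples)
    while True:
        new = set()
        for subject, predicate, obj in facts:
            if predicate not in ("subClassOf", "type"):
                continue
            for subject2, predicate2, obj2 in facts:
                if predicate2 == "subClassOf" and subject2 == obj:
                    new.add((subject, predicate, obj2))
        if new <= facts:
            return facts
        facts |= new


# Hypothetical data, stated as plain triples
data = {
    ("Cisco", "type", "NetworkingCompany"),
    ("NetworkingCompany", "subClassOf", "Company"),
    ("Company", "subClassOf", "Organization"),
}
```

After infer(data), the triple ("Cisco", "type", "Organization") is present although no source stated it, which is the kind of inference the slide has in mind.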
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]
A personal definition
Semantic Web: The concept that Web-accessible content can be organized semantically rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-layered Architecture of Semantic Web
Logical Layer
Formal Semantics and Reasoning Support - OIL, DAML-O
Schema Layer
Definition of Vocabulary - RDF Schema
Data Layer
Simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
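Read as a definition, the class above says that a herbivore is an animal that is not a carnivore. The sketch below classifies individuals against that definition. It is a deliberate simplification: it treats a missing "carnivore" assertion as "not a carnivore" (a closed-world assumption), whereas a description-logic reasoner would not conclude that from missing data. The function name and the example individuals are hypothetical.

```python
def classify_herbivores(individuals):
    """Names whose asserted classes satisfy: animal AND NOT carnivore.

    `individuals` maps a name to the set of classes asserted for it.
    Absence of 'carnivore' is treated as its negation (closed-world assumption).
    """
    return sorted(name for name, classes in individuals.items()
                  if "animal" in classes and "carnivore" not in classes)


# Hypothetical individuals for illustration
zoo = {
    "cow": {"animal"},
    "lion": {"animal", "carnivore"},
    "fern": {"plant"},
}
```

classify_herbivores(zoo) returns ["cow"]. Because the class is a defined class, membership follows from the definition and is not asserted per individual.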
DAML and OIL - Evolving towards Semantic Web
OIL Mission
OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery Knowledge Discovery --ExampleExample
Earthquake Sources(USGS NEIC)
Nuclear Test Sources(Oklahoma Observatory etc)
Nuclear Test May Cause Earthquakes
Is it really true
Complex RelationshipsComplex Relationships
A nuclear test could have caused an earthquakeif the earthquake occurred some time after thenuclear test was conducted and in a nearby region
NuclearTest Causes Earthquakelt= dateDifference( NuclearTesteventDate
EarthquakeeventDate ) lt 30AND distance( NuclearTestlatitude
NuclearTestlongitudeEarthquakelatitudeEarthquakelongitude ) lt 10000
Knowledge Discovery Knowledge Discovery --ExampleExample
When was the first recorded nuclear test conducted
1950Find the total number of earthquakes with a magnitude58 or higher on the Richter scale per year starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery Knowledge Discovery --ExamplehellipExamplehellip
For each group of earthquakes with magnitudes in the ranges58-6 6-7 7-8 8-9 and gt9 on the Richter scale per yearstarting from 1900 find average number of earthquakes
Number of earthquakes with magnitude gt 7 almost constant So nuclear tests probably only cause earthquakes with magnitude lt 7
KnowledgeKnowledge DiscoveryDiscovery --ExampleExamplehelliphellip
Find pairs of nuclear tests and earthquakes such that the earthequakeoccurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake
Demo
ResourcesReferences
RDFwwww3orgTRREC-rdf-syntaxICE wwwicestandardorgMeta Object Facility (MOF) Specification Version 13 September 27 1999 httpcgiomgorgcgi-bindocad99-09-05XML Metadata Interchange (XMI) Specification Version 11 October 25 1999 httpcgiomgorgcgi-bindocad9910-02httpcgiomgorgcgi-bindocad99-10-03DAML wwwdamlorgNEWSML newsshowcasereuterscomPRISM wwwprismstandardorgtechdevprismspec1aspXCM wwwvignettecomOIL wwwontoknowledgeorgoilSEMANTICWEB wwwsemanticweborgVOICEXML wwwvoicexmlorgMPEG7 wwwdarmstadtgmddemobileMPEG7Taalee wwwtaaleecomOingo wwwoingocom
Multimedia Data Management Using Metadata to Integrate and Apply Digital Media Amit Sheth and Wolfgang Klas Eds McGraw Hill ISBN 0-07-057735-8 1998
HP 85
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
Different types of recent audio content about Cisco are available
The user clicks to see a listing of Analyst Calls on Cisco (next slide)
Icons at the bottom of the screen enable contextually relevant functions listen set alert on story add to playlist
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
MyStocks
News
Sports
Music
MyMedia
$
My Stocks
CSCO
NT
IBM
Market
CSCO
Analyst Call
Conf Call
Earnings
1108 ON24 Payne1107 ON24 HampQ 1106 CBS Langlesis
CSCO Analysis
Clicking on the link for Cisco Analyst Calls displays a listingsorted by date Semantic filtering uses just the right metadata to meet screen and other constrains Eg Analyst Call focuses onthe source and analyst name or company The icon denote additional metadata such as ldquoStrong Buyrdquo by HampQ Analyst
HP 87
iTV Taaleersquos Extreme Personalization
Content Provider
(DBS DISH Wink AOL-TV)
Semantic EngineTM
Meta-DataTagged Content
ContentldquoProgramsrdquo
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-QualitySemantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that isused by personalization application to show only the stocksthat user is interested in
This screen is customizablewith interactivity featureusing metadata such as whetherthere is a new ConferenceCall video on CSCO
Part of the screen can beautomatically customized to show conference call specific informationndash including transcriptparticipation etc all of which arerelevant metadata
Conference Call itself can have embedded metadata to support personalization andinteractivity
HP 89
Metadata in Enterprise Apps
Filter Search ConsolidatePersonalize ArchiveLicensing Syndication
Production SupportProduction SupportSony
Categorize
Catalog
Integrate
CollectionCollection ProcessingProcessing
NetworkContent
AffiliateFeeds
Public Sources Rich Data
Metabase
HP 90
t
A leaking gasoline pipeline burst into flames Thursday killing more than 60 people near Nigerias commercial capital of Lagos Many of the dead were fisherman in wooden canoes engulfed in the inferno
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko facing the central business district of Lagos across a lagoon
At least 60 people died in this needless fire senior local official Karimu Alabi said
Fire crews from state-run Nigerian National Petroleum Corp (NNPC) which owns the pipeline were joined by other firemen from construction company Julius Berger in battling the blaze
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak ravaging a cluster of huts and log houses
At about the same time a second fire razed Makoko shantytown where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in thewooden huts for days Many victims of the Ebute-Oke fire were
Gore Demands That Recount Restart (940 PM)Gore Says Fla Cant Name Electors (450 PM)Bush Meets Colin Powell at Ranch (122 PM)
Market Tumbles on Earnings Warning (927 AM)Barak Outlines His Peace Plan (630 AM)
-- Breaking News for 11302000 --Customize Page Settings | Content | Layout | Color
Sixty Die In Nigeria BlastProduced by Euronews Posted Date 11302000 Event Election 2000 Location Tallahassee Florida USAPeople Al Gore George W Bush
Video
bull Value-add for production broadcast amp syndication
bull Taaleersquos semantic metadata enables powerful access to content used by Enterprisersquos customers
bull Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --Gore Demands That Recount Restart
Gore Says Fla Cant Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(133) ndash 120600 - ABC
(253) - 120600 - CBS
(516) - 120600 - ABC
(246) - 120600 - FOX
(133) - 120600 - NBC
(533) - 120600
(357) - 120600 - CBS
(427) - 120600 - ABC
(344) - 120600 - FOX
(724) - 120600 - CBS
(133) - 120600 - CBS
TALLAHASSEE Florida (CNN) ndashThough the two presidential candidates have until noon Wednesday to file briefs in Al Gores appeal to the Florida Supreme Court the outcome of two trials set on the same day in Leon County Florida may offer Gore his best hope for the presidency Democrats in Seminole County are seeking to have 15000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5000 votes statewide Lawyers for the plaintiff Harry Jacobs claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2126 incomplete absentee ballot applications sent in by GOP voters while refusing to allow Democratic workers to do the same thing for Democratic voters
The GOP says that suit and one similar to it from Martin County demonstrates Democratic Party politics at its most desperate Gore is not a party to either of those lawsuits On Tuesday the judge in the
(133) - 120600 - ABC
(233) - 120600 - CBS
(312) - 120600 - NNS
(032) - 120600 - CBS
(133) - 120600 - CBS
DescriptionProduced by CNNPosted Date 12072000 Reporter David Lewis Event Election 2000 Location Tallahassee Florida USAPeople Al Gore
HP 92
Retrieve Scene Description Track
Enhanced Digital Cable
Video
MPEGDecoder
Node = AVO Object
Create Scene Description Tree
GREATUSER
EXPERIENCE
Metadatarsquos role in emerging iTV infrastructure
MPEG-247MPEG
Encoder
SceneDescriptionTree
License metadata decoder and semantic applications to
device makers
Channel salesthrough Video Server Vendors
Video App Servers and Broadcasters
Enhanced XML
Description
ldquoCisco Systemsrdquo
Node
TaaleeSemanticEngine
ldquoCisco Systemsrdquo
Produced by Fox Sports Creation Date 12052000 League NFLTeams Seattle Seahawks
Atlanta Falcons Players John KitnaCoaches Mike Holmgren
Dan ReevesLocation Atlanta
Object Content Information (OCI)
Metadata-richValue-added Node
HP 93
Intelligent Metadata Creation
Content which doescontain the wordsthe user asked for
Extractor Agents
Content which does not contain the words
the user asked for but is about what he asked
for
Value-added Metadata
Content the user did not think to ask for but
which he needs to know
Semantic Associations
+ +
Metadata for Intelligent ContentMetadata for Intelligent Content
Usage
HP 94
Intelligent Contentvia
Value-Added Metadata
HP 95
Value-added MetadataTraditional methods rely solely on (syntactic) indexing of keywords to enable
users to access content
bull If a keyword is not in the content it cannot be found
bull The burden is on the user to think of and ask for the ldquorightrdquo keyword
For example If a story is about ldquoRoger Clemensrdquo but does not contain the
words ldquoNew York Yankeesrdquo that story cannot and will not be found if the user
searches for ldquoNew York Yankeesrdquo or ldquoYankeesrdquo
Understanding of the content is needed to create new metadata
Taalee understands Roger Clemens is a PERSON who Plays a SPORT called
Baseball for a TEAM from New York called the Yankees Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY)
to add missing metadata to describe content more completely
HP 96
Guided Demo for Value Added Metadata ndashExample one
bull Go to httpwwwmediaanywherecomFootballhtml amp search for Player = Jamal Anderson
bull Click on the first result (titled ldquoWeek 3 Top10 Anderson TD Runrdquo) and view the metadata
on the following RMR page
bull Here is what you see
Produced by NFLcom Posted Date 9202000 League NFL
Teams Atlanta Falcons Players Jamal Anderson
bull Now click on the button to play the asset (button marked ldquoREALrdquo)
bull View the source HTML page that has the original story and locate this story with the
heading ldquoWeek 3 top 10 Anderson TD runrdquo
bull Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user
searching for Atlanta Falcons will find this story on Jamal Anderson who is a player of
Atlanta Falcons team
bull Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata ndashExample Two
bull Go to httpwwwmediaanywherecomBaseballhtml amp search for Player = Gary Sheffield
bull Click on the first result (titled ldquoI want outrdquo) amp view the metadata on the following RMR page
bull Here is what you see
Produced by ESPN Posted Date 3032001 League National League
Teams Los Angeles Dodgers Players Gary Sheffield
bull Now click on the button to play the asset (button marked ldquoREALrdquo)
bull View the source HTML page that has the original story and locate this story with the
heading ldquoI want outrdquo
bull Verify that Team=Los Angeles Dodgers or League=National League was not present in
the source content
bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user
searching for Los Angeles Dodgers will find this story on Gary Sheffield who is a player of
Los Angeles Dodgers team
N t th t th h i d di t i ill t b bl t d thi
HP 98
Example 1 ndash Snapshots (ldquoJamal Andersonrdquo)
Search for lsquoJamal Andersonrsquo in lsquoFootballrsquo
Click on first result for Jamal Anderson
View metadata Note that Team name and League name are also included
in the metadata
View the original source HTML page Verify that
the source page contains no mention of Team nameand League name They
were Taaleersquos value-additions to the metadata to facilitate easier search
HP 99
Example 2 ndash Snapshots (ldquoGary Sheffieldrdquo)
Click on first result for Gary Sheffield
View metadata Note that Team name and League name are also included
in the metadata
Search for lsquoGary Sheffieldrsquo in lsquoBaseballrsquo
View the original source HTML page Verify that
the source page contains no mention of Team nameand League name They
were Taaleersquos value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content ndash Value-Added Metadata
Posted Date
Posted Date
Date of asset posting ndashExtracted automatically
League Name
Name of league to which the payerrsquos team belongs ndash Not mentioned explicitly in asset ndash Value-added by Taaleersquos processing based on semantic associations
Name of team for which player plays ndash Not mentioned explicitly in assetndash Value-added using Taaleersquos semantic relationships
Team NameTeam Name
Producer Name
Producer Name
Rich MediaSports AssetRich Media
Sports Asset
Name of content provider that produced the asset
Some Metadata are obtained explicitly from the asset Others (not present in the asset) are added
by Taalee using its semantic relationships
The asset is richly fully described in the many ways the users chose to interact
Player NamesPlayer Names
SportSportName of
sport
LegendX Y meansTaalee uses X to add Yas value-added metadatato the asset
Name of players mentioned explicitly in the asset ndash Extracted automatically
HP 101
Intelligent Contentvia
Semantic Associations
HP 102
Semantic Associations
bull Traditional search engines rely solely on (syntactic) keywords to find content
bull They do not understand the meaning context or relationships of keywords
For example a search engine may see that the word ldquoCommerce Onerdquo occurs
but it does not know that Commerce One is a COMPANY which Participates in
the Corporate Professional amp Financial Software INDUSTRY and COMPETESWITH Ariba
As a result search engines cannot go beyond returning a list (or directory view)
of what the user has asked for Their ability to provide associated information is
extremely limited static and difficult to scale Taaleersquos Semantic Content Model
goes beyond indexing keywords and classifying assets toUnderstand and Associate all content it catalogs
HP 103
Example (test on httpdirectorymediaanywherecom)
Search for company lsquoCommerce Onersquo
Links to news on companies that compete against
Commerce One
Links to news on companies Commerce One competes
against(To view news on Ariba click
on the link for Ariba)
Crucial news on Commerce Onersquos
competitors (Ariba) can be accessed easily and
automatically
HP 104
Metadata-centric Content Management Architecture
Sources: Internal Source 1 (Research); Internal Source 2; External feeds / Web (e.g. Reuters)
Components: Extractor Agents 1-3; Semantic Engine; World Model; Taalee Metabase; Semantic Application (ASP / Enterprise hosted); Third-party Content Mgmt and Syndication
Output: XCM-compliant metadata, XML or other format
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults Knowledge Base for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story; passed on to Dashboard
Stories shown: Story on Cisco; Story on Lucent
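The four numbered steps in this diagram can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the architecture's actual interfaces: the `Story` class, the `COMPETITORS` table standing in for the knowledge base, and the function `add_semantic_associations` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Story:
    title: str
    source: str
    companies: list                      # companies tagged by an extractor agent
    related_titles: list = field(default_factory=list)


# Stand-in for the knowledge base lookup in steps 2 and 3 (assumed data).
COMPETITORS = {"Cisco": ["Lucent"]}


def add_semantic_associations(story, feed, competitors):
    """Attach stories from the feed that mention a competitor of the story's companies."""
    rivals = {r for company in story.companies
              for r in competitors.get(company, [])}
    for other in feed:
        if other is not story and rivals.intersection(other.companies):
            story.related_titles.append(other.title)
    return story


cisco = Story("Story on Cisco", "Internal Source 1", ["Cisco"])
external_feed = [
    Story("Story on Lucent", "External feed", ["Lucent"]),
    Story("Story on IBM", "External feed", ["IBM"]),
]

add_semantic_associations(cisco, external_feed, COMPETITORS)
print(cisco.related_titles)
```

Only the Lucent story is attached, because the lookup table lists Lucent as a competitor of Cisco; the IBM story is left out.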
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked For + What You Need to Know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
SEC, EPA
Regulations: Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic collation of semantically related digital media information from multiple sources
Research inferred automatically
Semantically related news not specifically asked for
Semantic Search, Personalization, etc.
A vision for the future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]
A personal definition: Semantic Web is the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-layered Architecture of Semantic Web
Logical Layer: formal semantics and reasoning support - OIL, DAML-O
Schema Layer: definition of vocabulary - RDF Schema
Data Layer: simple data model and syntax for metadata - RDF
OIL - as RDF Extension
  <rdfs:Class rdf:ID="herbivore">
    <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
    <rdfs:subClassOf rdf:resource="#animal"/>
    <rdfs:subClassOf>
      <oil:NOT>
        <oil:hasOperand rdf:resource="#carnivore"/>
      </oil:NOT>
    </rdfs:subClassOf>
  </rdfs:Class>
DAML and OIL - Evolving towards Semantic Web
OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Tests May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
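The rule above can be written as an executable predicate. What follows is a sketch under several assumptions rather than the InfoQuilt implementation: `distance` is taken to be great-circle (haversine) distance in miles, `dateDifference` is taken to be whole days from test to earthquake, the earthquake is required to fall on or after the test date, and the three sample records are made up purely to exercise the code.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; using miles is an assumption


def distance(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # min() guards against rounding pushing sqrt(a) slightly above 1.
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))


def date_difference(test_date, quake_date):
    """Days from the nuclear test to the earthquake (negative if the quake came first)."""
    return (quake_date - test_date).days


def causes(test, quake, max_days=30, max_miles=10000):
    """NuclearTest Causes Earthquake, per the rule on the slide."""
    days = date_difference(test["eventDate"], quake["eventDate"])
    miles = distance(test["latitude"], test["longitude"],
                     quake["latitude"], quake["longitude"])
    return 0 <= days < max_days and miles < max_miles


def find_pairs(tests, quakes):
    """All (test id, earthquake id) pairs that satisfy the rule."""
    return [(t["id"], q["id"]) for t in tests for q in quakes if causes(t, q)]


# Hypothetical records, not real observations.
tests = [{"id": "T1", "eventDate": date(1970, 1, 1), "latitude": 37.0, "longitude": -116.0}]
quakes = [
    {"id": "Q1", "eventDate": date(1970, 1, 10), "latitude": 36.0, "longitude": -115.0},
    {"id": "Q2", "eventDate": date(1970, 3, 1), "latitude": 36.0, "longitude": -115.0},
    {"id": "Q3", "eventDate": date(1969, 12, 25), "latitude": 36.0, "longitude": -115.0},
]

print(find_pairs(tests, quakes))  # [('T1', 'Q1')]
```

Q2 is rejected because it falls 59 days after the test, and Q3 because it precedes the test.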
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year, starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example...
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale per year, starting from 1900, find the average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
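The grouping query on this slide can be sketched as follows. The bin edges come from the slide; treating each range as half-open (lower bound included, upper bound excluded), placing a magnitude of exactly 9 in the last bin, and the names `magnitude_bin` and `average_per_year` are assumptions made for the example. The input records are invented.

```python
from collections import Counter

# (lower bound inclusive, upper bound exclusive, label) - half-open bins are an assumption.
BINS = [
    (5.8, 6.0, "5.8-6"),
    (6.0, 7.0, "6-7"),
    (7.0, 8.0, "7-8"),
    (8.0, 9.0, "8-9"),
    (9.0, float("inf"), ">9"),
]


def magnitude_bin(magnitude):
    """Return the label of the range a magnitude falls in, or None if below 5.8."""
    for low, high, label in BINS:
        if low <= magnitude < high:
            return label
    return None


def average_per_year(quakes, first_year, last_year):
    """Average number of earthquakes per year in each magnitude range."""
    counts = Counter()
    for year, magnitude in quakes:
        label = magnitude_bin(magnitude)
        if label is not None and first_year <= year <= last_year:
            counts[label] += 1
    n_years = last_year - first_year + 1
    return {label: counts[label] / n_years for _, _, label in BINS}


# Hypothetical (year, magnitude) records.
sample = [(1900, 5.9), (1900, 6.5), (1901, 6.2), (1901, 5.0), (1901, 9.1)]
print(average_per_year(sample, 1900, 1901))
```

Comparing these averages for the years before and after a chosen date is one way to reproduce the kind of observation made on the slide.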
Knowledge Discovery - Example...
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and the test was within a radius of 10000 miles of the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 86
Application of Semantic Metadata and Automatic Content Enrichment
[Screenshot: MyMedia portal with tabs MyStocks, News, Sports, Music. My Stocks list: CSCO, NT, IBM, Market. CSCO links: Analyst Call, Conf Call, Earnings. CSCO Analysis listing: 11/08 ON24 Payne; 11/07 ON24 H&Q; 11/06 CBS Langlesis]
Clicking on the link for Cisco Analyst Calls displays a listing sorted by date. Semantic filtering uses just the right metadata to meet screen and other constraints. E.g., Analyst Call focuses on the source and the analyst name or company. The icons denote additional metadata, such as "Strong Buy" by the H&Q analyst
HP 87
iTV: Taalee's Extreme Personalization
Content Provider (DBS, DISH, Wink, AOL-TV)
Semantic Engine(TM)
Metadata-Tagged Content
Content "Programs"
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-Quality Semantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by the personalization application to show only the stocks that the user is interested in
This screen is customizable with an interactivity feature, using metadata such as whether there is a new Conference Call video on CSCO
Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata
The Conference Call itself can have embedded metadata to support personalization and interactivity
HP 89
Metadata in Enterprise Apps
Uses: Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support (Sony)
Processing: Categorize, Catalog, Integrate
Collection: Network Content, Affiliate Feeds, Public Sources
Rich Data Metabase
HP 90
[Screenshot: news story text]
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were
-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
Sixty Die In Nigeria Blast
Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore
HP 92
Metadata's role in emerging iTV infrastructure
[Diagram elements: Video; MPEG Encoder (MPEG-2/4/7); Create Scene Description Tree (Node = AVO Object); Scene Description Tree; Enhanced Digital Cable; MPEG Decoder; Retrieve Scene Description Track; GREAT USER EXPERIENCE]
Taalee Semantic Engine: "Cisco Systems" node; Enhanced XML Description; Metadata-rich, Value-added Node
Object Content Information (OCI): Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: John Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers, and Broadcasters
HP 93
Intelligent Metadata Creation
Extractor Agents: content which does contain the words the user asked for
+ Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for
+ Semantic Associations: content the user did not think to ask for, but which he needs to know
Metadata for Intelligent Content
Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees".
Understanding of the content is needed to create new metadata.
Taalee understands that Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely.
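The Roger Clemens example can be sketched as a lookup that adds fields the extractor could not find in the text. This is a minimal illustration, not Taalee's engine: the dictionary `WORLD_MODEL`, the field names `Players`, `Teams`, and `Sport`, and the functions `enrich` and `matches` are hypothetical.

```python
# Hypothetical world-model facts keyed by player name.
WORLD_MODEL = {
    "Roger Clemens": {"type": "PERSON", "Team": "New York Yankees", "Sport": "Baseball"},
}

# World-model fact name -> metadata field it populates (assumed naming).
FACT_TO_FIELD = {"Team": "Teams", "Sport": "Sport"}


def enrich(metadata, world_model):
    """Return a copy of the metadata with value-added fields derived from the players."""
    enriched = {name: list(values) for name, values in metadata.items()}
    for player in metadata.get("Players", []):
        facts = world_model.get(player, {})
        for fact, field_name in FACT_TO_FIELD.items():
            if fact in facts:
                values = enriched.setdefault(field_name, [])
                if facts[fact] not in values:
                    values.append(facts[fact])
    return enriched


def matches(metadata, term):
    """Case-insensitive search over every metadata value."""
    term = term.lower()
    return any(term in value.lower()
               for values in metadata.values() for value in values)


extracted = {"Players": ["Roger Clemens"]}   # what the story itself mentions
enriched = enrich(extracted, WORLD_MODEL)

print(matches(extracted, "Yankees"))  # False
print(matches(enriched, "Yankees"))   # True
```

A search for "Yankees" misses the story when only the extracted metadata is indexed and finds it once the value-added fields are included.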
HP 96
Guided Demo for Value-Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team = Atlanta Falcons or League = NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value-Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team = Los Angeles Dodgers or League = National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 - Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Posted Date: date of asset posting - extracted automatically
League Name: name of league to which the player's team belongs - not mentioned explicitly in asset - value-added by Taalee's processing based on semantic associations
Team Name: name of team for which player plays - not mentioned explicitly in asset - value-added using Taalee's semantic relationships
Producer Name: name of content provider that produced the asset
Player Names: names of players mentioned explicitly in the asset - extracted automatically
Sport: name of sport
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.
The asset is richly and fully described in the many ways users choose to interact.
Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset
HP 87
iTV Taaleersquos Extreme Personalization
Content Provider
(DBS DISH Wink AOL-TV)
Semantic EngineTM
Meta-DataTagged Content
ContentldquoProgramsrdquo
Immediate Interests
Preferences
Personalized Content Capsules
Redirects and Programming
Structured Hi-QualitySemantic Metabase
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that isused by personalization application to show only the stocksthat user is interested in
This screen is customizablewith interactivity featureusing metadata such as whetherthere is a new ConferenceCall video on CSCO
Part of the screen can beautomatically customized to show conference call specific informationndash including transcriptparticipation etc all of which arerelevant metadata
Conference Call itself can have embedded metadata to support personalization andinteractivity
HP 89
Metadata in Enterprise Apps
Filter Search ConsolidatePersonalize ArchiveLicensing Syndication
Production SupportProduction SupportSony
Categorize
Catalog
Integrate
CollectionCollection ProcessingProcessing
NetworkContent
AffiliateFeeds
Public Sources Rich Data
Metabase
HP 90
t
A leaking gasoline pipeline burst into flames Thursday killing more than 60 people near Nigerias commercial capital of Lagos Many of the dead were fisherman in wooden canoes engulfed in the inferno
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko facing the central business district of Lagos across a lagoon
At least 60 people died in this needless fire senior local official Karimu Alabi said
Fire crews from state-run Nigerian National Petroleum Corp (NNPC) which owns the pipeline were joined by other firemen from construction company Julius Berger in battling the blaze
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak ravaging a cluster of huts and log houses
At about the same time a second fire razed Makoko shantytown where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in thewooden huts for days Many victims of the Ebute-Oke fire were
Gore Demands That Recount Restart (940 PM)Gore Says Fla Cant Name Electors (450 PM)Bush Meets Colin Powell at Ranch (122 PM)
Market Tumbles on Earnings Warning (927 AM)Barak Outlines His Peace Plan (630 AM)
-- Breaking News for 11302000 --Customize Page Settings | Content | Layout | Color
Sixty Die In Nigeria BlastProduced by Euronews Posted Date 11302000 Event Election 2000 Location Tallahassee Florida USAPeople Al Gore George W Bush
Video
bull Value-add for production broadcast amp syndication
bull Taaleersquos semantic metadata enables powerful access to content used by Enterprisersquos customers
bull Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --Gore Demands That Recount Restart
Gore Says Fla Cant Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(133) ndash 120600 - ABC
(253) - 120600 - CBS
(516) - 120600 - ABC
(246) - 120600 - FOX
(133) - 120600 - NBC
(533) - 120600
(357) - 120600 - CBS
(427) - 120600 - ABC
(344) - 120600 - FOX
(724) - 120600 - CBS
(133) - 120600 - CBS
TALLAHASSEE Florida (CNN) ndashThough the two presidential candidates have until noon Wednesday to file briefs in Al Gores appeal to the Florida Supreme Court the outcome of two trials set on the same day in Leon County Florida may offer Gore his best hope for the presidency Democrats in Seminole County are seeking to have 15000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5000 votes statewide Lawyers for the plaintiff Harry Jacobs claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2126 incomplete absentee ballot applications sent in by GOP voters while refusing to allow Democratic workers to do the same thing for Democratic voters
The GOP says that suit and one similar to it from Martin County demonstrates Democratic Party politics at its most desperate Gore is not a party to either of those lawsuits On Tuesday the judge in the
(133) - 120600 - ABC
(233) - 120600 - CBS
(312) - 120600 - NNS
(032) - 120600 - CBS
(133) - 120600 - CBS
DescriptionProduced by CNNPosted Date 12072000 Reporter David Lewis Event Election 2000 Location Tallahassee Florida USAPeople Al Gore
HP 92
Retrieve Scene Description Track
Enhanced Digital Cable
Video
MPEGDecoder
Node = AVO Object
Create Scene Description Tree
GREATUSER
EXPERIENCE
Metadatarsquos role in emerging iTV infrastructure
MPEG-247MPEG
Encoder
SceneDescriptionTree
License metadata decoder and semantic applications to
device makers
Channel salesthrough Video Server Vendors
Video App Servers and Broadcasters
Enhanced XML
Description
ldquoCisco Systemsrdquo
Node
TaaleeSemanticEngine
ldquoCisco Systemsrdquo
Produced by Fox Sports Creation Date 12052000 League NFLTeams Seattle Seahawks
Atlanta Falcons Players John KitnaCoaches Mike Holmgren
Dan ReevesLocation Atlanta
Object Content Information (OCI)
Metadata-richValue-added Node
HP 93
Intelligent Metadata Creation
Content which doescontain the wordsthe user asked for
Extractor Agents
Content which does not contain the words
the user asked for but is about what he asked
for
Value-added Metadata
Content the user did not think to ask for but
which he needs to know
Semantic Associations
+ +
Metadata for Intelligent ContentMetadata for Intelligent Content
Usage
HP 94
Intelligent Contentvia
Value-Added Metadata
HP 95
Value-added MetadataTraditional methods rely solely on (syntactic) indexing of keywords to enable
users to access content
bull If a keyword is not in the content it cannot be found
bull The burden is on the user to think of and ask for the ldquorightrdquo keyword
For example If a story is about ldquoRoger Clemensrdquo but does not contain the
words ldquoNew York Yankeesrdquo that story cannot and will not be found if the user
searches for ldquoNew York Yankeesrdquo or ldquoYankeesrdquo
Understanding of the content is needed to create new metadata
Taalee understands Roger Clemens is a PERSON who Plays a SPORT called
Baseball for a TEAM from New York called the Yankees Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY)
to add missing metadata to describe content more completely
HP 96
Guided Demo for Value Added Metadata ndashExample one
bull Go to httpwwwmediaanywherecomFootballhtml amp search for Player = Jamal Anderson
bull Click on the first result (titled ldquoWeek 3 Top10 Anderson TD Runrdquo) and view the metadata
on the following RMR page
bull Here is what you see
Produced by NFLcom Posted Date 9202000 League NFL
Teams Atlanta Falcons Players Jamal Anderson
bull Now click on the button to play the asset (button marked ldquoREALrdquo)
bull View the source HTML page that has the original story and locate this story with the
heading ldquoWeek 3 top 10 Anderson TD runrdquo
bull Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user
searching for Atlanta Falcons will find this story on Jamal Anderson who is a player of
Atlanta Falcons team
bull Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value-Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League
Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that neither Team=Los Angeles Dodgers nor League=National League was present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 - Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Posted Date: date of asset posting. Extracted automatically
Producer Name: name of content provider that produced the asset
Player Names: names of players mentioned explicitly in the asset. Extracted automatically
Team Name: name of team for which the player plays. Not mentioned explicitly in the asset. Value-added using Taalee's semantic relationships
League Name: name of league to which the player's team belongs. Not mentioned explicitly in the asset. Value-added by Taalee's processing based on semantic associations
Sport: name of sport
Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships
The asset is richly and fully described in the many ways the users choose to interact
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context or relationships of keywords
For example, a search engine may see that the phrase "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale.
Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
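One way to picture typed relationships such as "COMPANY competes with COMPANY" is a small store of (subject, predicate, object) triples. The sketch below is an illustration only: the predicate names and helper functions are hypothetical, and treating "competes_with" as symmetric is an assumption, not something the slides state about Taalee's model.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# Facts taken from the Commerce One example; predicate names are hypothetical.
TRIPLES: List[Triple] = [
    ("Commerce One", "is_a", "COMPANY"),
    ("Ariba", "is_a", "COMPANY"),
    ("Commerce One", "participates_in", "Corporate, Professional & Financial Software"),
    ("Commerce One", "competes_with", "Ariba"),
]


def query(subject: Optional[str] = None,
          predicate: Optional[str] = None,
          obj: Optional[str] = None) -> List[Triple]:
    """Return every triple that matches the given pattern (None is a wildcard)."""
    return [
        triple for triple in TRIPLES
        if (subject is None or triple[0] == subject)
        and (predicate is None or triple[1] == predicate)
        and (obj is None or triple[2] == obj)
    ]


def competitors_of(company: str) -> List[str]:
    """Companies related by competes_with, read in both directions (assumed symmetric)."""
    found = [o for _, _, o in query(subject=company, predicate="competes_with")]
    found += [s for s, _, _ in query(predicate="competes_with", obj=company)]
    return sorted(set(found))
```

A keyword index sees only the string "Commerce One". A lookup such as competitors_of("Commerce One") can return Ariba even when no story mentions both names.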
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata-centric Content Management Architecture
Sources: Internal Source 1 (Research), Internal Source 2, External feeds/Web (e.g. Reuters)
Components: Extractor Agents 1-3, Semantic Engine, World Model, Taalee Metabase, Semantic Application (ASP/Enterprise hosted), Third-party Content Mgmt and Syndication
Output: XCM-compliant metadata, XML or other format
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults Knowledge Base for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story and passed on to Dashboard
Stories: Story on Cisco, Story on Lucent
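The four numbered steps can be sketched as a small selection routine. This is a sketch under stated assumptions, not the Taalee Semantic Engine: the Story class, the COMPETITORS table and related_stories are hypothetical names, and the knowledge base holds only the single fact named on the slide (Lucent is a competitor of Cisco).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Story:
    title: str
    source: str
    companies: Set[str] = field(default_factory=set)  # companies found by an extractor


# Hypothetical knowledge base: company -> its competitors.
COMPETITORS: Dict[str, Set[str]] = {"Cisco": {"Lucent"}}


def related_stories(story: Story, feed: List[Story]) -> List[Story]:
    """Pick feed stories about competitors of the companies in the given story."""
    rivals: Set[str] = set()
    for company in story.companies:           # steps 1-2: consult the knowledge base
        rivals |= COMPETITORS.get(company, set())
    # steps 3-4: keep feed stories that mention at least one competitor
    return [candidate for candidate in feed if candidate.companies & rivals]
```

For a Cisco story from an internal source, a Lucent story from an external feed is selected as semantically related, while feed stories about unrelated companies are left out.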
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked For + What You Need to Know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for the future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling is obviously not enough: we need a "logic layer" on top of RDF; some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data
RDF + DL = "Frame System for WWW"
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
"A Web in which machine reasoning will be ubiquitous and devastatingly powerful." [Berners-Lee]
"A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture." [Berners-Lee]
"A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities." [Berners-Lee]
A personal definition
Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-layered Architecture of the Semantic Web
Logical Layer: Formal Semantics and Reasoning Support - OIL, DAML-O
Schema Layer: Definition of Vocabulary - RDF Schema
Data Layer: Simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL - Evolving towards Semantic Web
OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Test May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake <=
  dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
  AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
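The "Causes" rule above can be written as an executable predicate. The sketch below is not InfoQuilt code: Event, date_difference, distance and could_have_caused are hypothetical names. It assumes the thresholds are days and miles, computes distance with the haversine great-circle formula, and adds the condition that the earthquake does not precede the test, which the prose states but the rule leaves implicit.

```python
import math
from dataclasses import dataclass
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


@dataclass
class Event:
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees


def date_difference(test: Event, quake: Event) -> int:
    """Days from the nuclear test to the earthquake (negative if the quake came first)."""
    return (quake.event_date - test.event_date).days


def distance(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in miles between two points (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))


def could_have_caused(test: Event, quake: Event,
                      max_days: int = 30, max_miles: float = 10000.0) -> bool:
    """True when the quake follows the test within max_days and within max_miles."""
    days = date_difference(test, quake)
    miles = distance(test.latitude, test.longitude, quake.latitude, quake.longitude)
    return 0 <= days < max_days and miles < max_miles
```

The rule only marks candidate pairs. Whether a test actually caused an earthquake still has to be examined with the kinds of queries discussed in the following slides.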
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example...
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9 and >9 on the Richter scale per year, starting from 1900, find the average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
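The per-range query can be sketched as a small grouping routine. This is an illustration with hypothetical names (BINS, bin_label, average_per_year) and no real earthquake data. It assumes half-open ranges, so a magnitude of exactly 6.0 falls in "6-7" and exactly 9.0 in ">9"; the slide does not say how boundaries are handled.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple

# (label, inclusive lower bound, exclusive upper bound); boundary handling is an assumption.
BINS = [
    ("5.8-6", 5.8, 6.0),
    ("6-7", 6.0, 7.0),
    ("7-8", 7.0, 8.0),
    ("8-9", 8.0, 9.0),
    (">9", 9.0, float("inf")),
]


def bin_label(magnitude: float) -> Optional[str]:
    """Return the magnitude range label, or None for magnitudes below 5.8."""
    for label, low, high in BINS:
        if low <= magnitude < high:
            return label
    return None


def average_per_year(quakes: Iterable[Tuple[int, float]],
                     first_year: int, last_year: int) -> Dict[str, float]:
    """Average number of earthquakes per year in each magnitude range.

    quakes is an iterable of (year, magnitude) pairs; years outside
    [first_year, last_year] are ignored.
    """
    counts: Dict[str, int] = defaultdict(int)
    for year, magnitude in quakes:
        label = bin_label(magnitude)
        if label is not None and first_year <= year <= last_year:
            counts[label] += 1
    years = last_year - first_year + 1
    return {label: counts[label] / years for label, _, _ in BINS}
```

Running the routine separately for the years before and after the first recorded test gives the per-range averages that the slide compares.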
Knowledge Discovery - Example...
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and within a radius of 10,000 miles of the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/9910-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NewsML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
Semantic Web: www.semanticweb.org
VoiceXML: www.voicexml.org
MPEG-7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 88
Metadata for Automatic Content Enrichment
Interactive Television
This segment has embedded or referenced metadata that is used by a personalization application to show only the stocks that the user is interested in
This screen is customizable with an interactivity feature using metadata, such as whether there is a new Conference Call video on CSCO
Part of the screen can be automatically customized to show conference-call-specific information, including transcript, participation, etc., all of which are relevant metadata
The Conference Call itself can have embedded metadata to support personalization and interactivity
HP 89
Metadata in Enterprise Apps
Filter, Search, Consolidate, Personalize, Archive, Licensing, Syndication
Production Support
Sony
Categorize
Catalog
Integrate
Collection, Processing
Network Content
Affiliate Feeds
Public Sources, Rich Data
Metabase
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria's commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
"At least 60 people died in this needless fire," senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oko fire were
-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can't Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
Sixty Die In Nigeria Blast
Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee's semantic metadata enables powerful access to content used by the Enterprise's customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can't Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(1:33) - 12/06/00 - ABC
(2:53) - 12/06/00 - CBS
(5:16) - 12/06/00 - ABC
(2:46) - 12/06/00 - FOX
(1:33) - 12/06/00 - NBC
(5:33) - 12/06/00
(3:57) - 12/06/00 - CBS
(4:27) - 12/06/00 - ABC
(3:44) - 12/06/00 - FOX
(7:24) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
TALLAHASSEE, Florida (CNN) - Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore's appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
(1:33) - 12/06/00 - ABC
(2:33) - 12/06/00 - CBS
(3:12) - 12/06/00 - NNS
(0:32) - 12/06/00 - CBS
(1:33) - 12/06/00 - CBS
Description
Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore
HP 92
Metadata's role in emerging iTV infrastructure
MPEG-2/4/7
MPEG Encoder
Enhanced Digital Cable
MPEG Decoder
Video
Retrieve Scene Description Track
Create Scene Description Tree
Scene Description Tree (Node = AVO Object)
Taalee Semantic Engine
"Cisco Systems" Node
Enhanced XML Description
Metadata-rich Value-added Node
Object Content Information (OCI)
Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: John Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta
GREAT USER EXPERIENCE
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers and Broadcasters
HP 93
Intelligent Metadata Creation
Metadata for Intelligent Content
Extractor Agents: content which does contain the words the user asked for
+ Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for
+ Semantic Associations: content the user did not think to ask for, but which he needs to know
Usage
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"
Understanding of the content is needed to create new metadata
Taalee understands that Roger Clemens is a PERSON who plays a SPORT called Baseball for a TEAM from New York called the Yankees.
Taalee uses such Semantic Associations (another example: COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely.
HP 89
Metadata in Enterprise Apps
Filter Search ConsolidatePersonalize ArchiveLicensing Syndication
Production SupportProduction SupportSony
Categorize
Catalog
Integrate
CollectionCollection ProcessingProcessing
NetworkContent
AffiliateFeeds
Public Sources Rich Data
Metabase
HP 90
t
A leaking gasoline pipeline burst into flames Thursday killing more than 60 people near Nigerias commercial capital of Lagos Many of the dead were fisherman in wooden canoes engulfed in the inferno
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko facing the central business district of Lagos across a lagoon
At least 60 people died in this needless fire senior local official Karimu Alabi said
Fire crews from state-run Nigerian National Petroleum Corp (NNPC) which owns the pipeline were joined by other firemen from construction company Julius Berger in battling the blaze
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak ravaging a cluster of huts and log houses
At about the same time a second fire razed Makoko shantytown where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in thewooden huts for days Many victims of the Ebute-Oke fire were
Gore Demands That Recount Restart (940 PM)Gore Says Fla Cant Name Electors (450 PM)Bush Meets Colin Powell at Ranch (122 PM)
Market Tumbles on Earnings Warning (927 AM)Barak Outlines His Peace Plan (630 AM)
-- Breaking News for 11302000 --Customize Page Settings | Content | Layout | Color
Sixty Die In Nigeria BlastProduced by Euronews Posted Date 11302000 Event Election 2000 Location Tallahassee Florida USAPeople Al Gore George W Bush
Video
bull Value-add for production broadcast amp syndication
bull Taaleersquos semantic metadata enables powerful access to content used by Enterprisersquos customers
bull Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --Gore Demands That Recount Restart
Gore Says Fla Cant Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(133) ndash 120600 - ABC
(253) - 120600 - CBS
(516) - 120600 - ABC
(246) - 120600 - FOX
(133) - 120600 - NBC
(533) - 120600
(357) - 120600 - CBS
(427) - 120600 - ABC
(344) - 120600 - FOX
(724) - 120600 - CBS
(133) - 120600 - CBS
TALLAHASSEE Florida (CNN) ndashThough the two presidential candidates have until noon Wednesday to file briefs in Al Gores appeal to the Florida Supreme Court the outcome of two trials set on the same day in Leon County Florida may offer Gore his best hope for the presidency Democrats in Seminole County are seeking to have 15000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5000 votes statewide Lawyers for the plaintiff Harry Jacobs claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2126 incomplete absentee ballot applications sent in by GOP voters while refusing to allow Democratic workers to do the same thing for Democratic voters
The GOP says that suit and one similar to it from Martin County demonstrates Democratic Party politics at its most desperate Gore is not a party to either of those lawsuits On Tuesday the judge in the
(133) - 120600 - ABC
(233) - 120600 - CBS
(312) - 120600 - NNS
(032) - 120600 - CBS
(133) - 120600 - CBS
DescriptionProduced by CNNPosted Date 12072000 Reporter David Lewis Event Election 2000 Location Tallahassee Florida USAPeople Al Gore
HP 92
Retrieve Scene Description Track
Enhanced Digital Cable
Video
MPEGDecoder
Node = AVO Object
Create Scene Description Tree
GREATUSER
EXPERIENCE
Metadatarsquos role in emerging iTV infrastructure
MPEG-247MPEG
Encoder
SceneDescriptionTree
License metadata decoder and semantic applications to
device makers
Channel salesthrough Video Server Vendors
Video App Servers and Broadcasters
Enhanced XML
Description
ldquoCisco Systemsrdquo
Node
TaaleeSemanticEngine
ldquoCisco Systemsrdquo
Produced by Fox Sports Creation Date 12052000 League NFLTeams Seattle Seahawks
Atlanta Falcons Players John KitnaCoaches Mike Holmgren
Dan ReevesLocation Atlanta
Object Content Information (OCI)
Metadata-richValue-added Node
HP 93
Intelligent Metadata Creation
Content which doescontain the wordsthe user asked for
Extractor Agents
Content which does not contain the words
the user asked for but is about what he asked
for
Value-added Metadata
Content the user did not think to ask for but
which he needs to know
Semantic Associations
+ +
Metadata for Intelligent ContentMetadata for Intelligent Content
Usage
HP 94
Intelligent Contentvia
Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content.
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the “right” keyword
For example: if a story is about “Roger Clemens” but does not contain the words “New York Yankees”, that story cannot and will not be found if the user searches for “New York Yankees” or “Yankees”.
Understanding of the content is needed to create new metadata.
Taalee understands Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees. Taalee uses these Semantic Associations (e.g. COMPANY participates in INDUSTRY) to add missing metadata to describe content more completely.
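The idea on this slide can be illustrated with a minimal Python sketch. It is not Taalee's implementation: the `KNOWLEDGE_BASE` and `TEAM_LEAGUE` tables, the field names, and both function names are hypothetical, chosen only to show how a small set of known relationships can add team and league metadata that the story text never mentions.

```python
# Hypothetical knowledge base: a tiny stand-in for a real domain model.
KNOWLEDGE_BASE = {
    "Roger Clemens": {"sport": "Baseball", "team": "New York Yankees"},
    "Jamal Anderson": {"sport": "Football", "team": "Atlanta Falcons"},
}
TEAM_LEAGUE = {
    "New York Yankees": "American League",
    "Atlanta Falcons": "NFL",
}


def add_value_added_metadata(metadata):
    """Return a copy of the metadata with sport, team and league added
    for every player that the (hypothetical) knowledge base knows."""
    enriched = {field: list(values) for field, values in metadata.items()}
    for player in metadata.get("Players", []):
        facts = KNOWLEDGE_BASE.get(player)
        if facts is None:
            continue  # unknown player: nothing can be inferred
        team = facts["team"]
        additions = {
            "Sport": facts["sport"],
            "Teams": team,
            "League": TEAM_LEAGUE.get(team),
        }
        for field, value in additions.items():
            if value is not None and value not in enriched.setdefault(field, []):
                enriched[field].append(value)
    return enriched


def keyword_search(metadata, keyword):
    """Purely syntactic match: true if the keyword occurs in any value."""
    needle = keyword.lower()
    return any(
        needle in value.lower()
        for values in metadata.values()
        for value in values
    )


# Metadata extracted directly from a story that only names the player.
story = {"Players": ["Roger Clemens"]}
enriched_story = add_value_added_metadata(story)
```

With only the extracted metadata, `keyword_search(story, "Yankees")` finds nothing; after enrichment, the same search matches the added `Teams` value.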
HP 96
Guided Demo for Value Added Metadata – Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled “Week 3 Top10 Anderson TD Run”) and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL; Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked “REAL”)
• View the source HTML page that has the original story and locate this story with the heading “Week 3 top 10 Anderson TD run”
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset’s existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player of the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata – Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled “I want out”) & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League; Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked “REAL”)
• View the source HTML page that has the original story and locate this story with the heading “I want out”
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset’s existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player of the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 – Snapshots (“Jamal Anderson”)
Search for ‘Jamal Anderson’ in ‘Football’
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee’s value-additions to the metadata to facilitate easier search
HP 99
Example 2 – Snapshots (“Gary Sheffield”)
Search for ‘Gary Sheffield’ in ‘Baseball’
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee’s value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content – Value-Added Metadata
Rich Media Sports Asset
• Producer Name: name of content provider that produced the asset
• Posted Date: date of asset posting – extracted automatically
• Player Names: names of players mentioned explicitly in the asset – extracted automatically
• Sport: name of sport
• Team Name: name of team for which player plays – not mentioned explicitly in asset – value-added using Taalee’s semantic relationships
• League Name: name of league to which the player’s team belongs – not mentioned explicitly in asset – value-added by Taalee’s processing based on semantic associations
Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.
The asset is richly and fully described in the many ways the users choose to interact.
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context or relationships of keywords
For example, a search engine may see that the word “Commerce One” occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee’s Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
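A rough Python sketch of the behaviour described above, offered only as an illustration: relationships are held as (subject, relation, object) triples, and a search returns both the stories asked for and stories about competitors. The triple format, the relation names as strings, the story records, and the function names are all assumptions, not the Semantic Content Model itself.

```python
# Hypothetical association store as (subject, relation, object) triples.
ASSOCIATIONS = [
    ("Commerce One", "PARTICIPATES_IN",
     "Corporate, Professional & Financial Software"),
    ("Commerce One", "COMPETES_WITH", "Ariba"),
]


def competitors(company):
    """Companies linked to `company` by COMPETES_WITH, in either direction
    (treating the relation as symmetric is an assumption of this sketch)."""
    found = set()
    for subject, relation, obj in ASSOCIATIONS:
        if relation != "COMPETES_WITH":
            continue
        if subject == company:
            found.add(obj)
        elif obj == company:
            found.add(subject)
    return found


def search(company, stories):
    """Split results into what was asked for and semantically related news."""
    rivals = competitors(company)
    asked_for = [s["title"] for s in stories if company in s["companies"]]
    need_to_know = [
        s["title"]
        for s in stories
        if company not in s["companies"] and rivals & set(s["companies"])
    ]
    return {"asked_for": asked_for, "need_to_know": need_to_know}


# Illustrative story records; titles and company tags are made up.
stories = [
    {"title": "Story on Commerce One", "companies": ["Commerce One"]},
    {"title": "Story on Ariba", "companies": ["Ariba"]},
    {"title": "Story on an unrelated company", "companies": ["Example Corp"]},
]
result = search("Commerce One", stories)
```

A keyword-only engine would return just the first list; the second list exists only because the COMPETES_WITH relationship is stored explicitly.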
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company ‘Commerce One’
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One’s competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata-centric Content Management Architecture
Sources: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g. Reuters)
Components: Extractor Agent 1, Extractor Agent 2, Extractor Agent 3; Semantic Engine; World Model; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication
Output: XCM-compliant metadata, XML or other format
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults Knowledge Base for Cisco’s competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as “semantically related” to Cisco story – passed on to Dashboard
Result: Story on Cisco; Story on Lucent
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
• Related Stock News
• Industry News: COMPANIES in Same or Related INDUSTRY
• Competition: COMPANIES in INDUSTRY with Competing PRODUCTS
• Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY
• Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g. InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF – one proposal (cf. Ora Lassila)
• Structural modeling obviously not enough: we need a “logic layer” on top of RDF; some type of description logic is a possibility
• Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inference from this data
• RDF + DL = “Frame System for WWW”
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
“A Web in which machine reasoning will be ubiquitous and devastatingly powerful.” [Berners-Lee]
“A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture.” [Berners-Lee]
“A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities.” [Berners-Lee]
A personal definition: Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods.
What is DAML (DARPA Agent Markup Language)?
• A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
• Based on RDF+XML
• Agent readable Tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-layered Architecture of Semantic Web
Logical Layer: Formal Semantics and Reasoning Support – OIL, DAML-O
Schema Layer: Definition of Vocabulary – RDF Schema
Data Layer: Simple data model and syntax for metadata – RDF
OIL – as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL – Evolving towards Semantic Web
OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Test May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.
NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
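The rule above can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the InfoQuilt implementation: the `Event` record, the function names, the choice of the haversine great-circle formula for `distance`, the Earth radius constant, and reading the thresholds as days and miles are all assumptions made for the sketch.

```python
import math
from dataclasses import dataclass
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an assumption of this sketch


@dataclass(frozen=True)
class Event:
    """Hypothetical record shared by nuclear tests and earthquakes."""
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points using the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))


def could_have_caused(test, quake, max_days=30, max_miles=10000):
    """Evaluate the 'NuclearTest Causes Earthquake' rule for one pair.

    The earthquake must fall on or after the test date and fewer than
    `max_days` days later, and the two locations must be closer than
    `max_miles`.
    """
    days_after = (quake.event_date - test.event_date).days
    if not 0 <= days_after < max_days:
        return False
    return distance_miles(test.latitude, test.longitude,
                          quake.latitude, quake.longitude) < max_miles


# Made-up events, for illustration only.
sample_test = Event(date(2000, 1, 1), 10.0, 20.0)
sample_quake = Event(date(2000, 1, 15), 12.0, 21.0)
```

Applying `could_have_caused` to every (test, earthquake) pair from the two sources yields the candidate pairs the rule describes. Note that a 10,000-mile threshold covers most of the globe, so in practice the date condition does most of the filtering.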
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year, starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example…
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9 and >9 on the Richter scale per year, starting from 1900, find average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
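The grouping query on this slide might be expressed as the following Python sketch. The input format (a list of `(year, magnitude)` pairs), the function names, and the treatment of each range as half-open (lower bound included, upper bound excluded) are assumptions; the slide does not say how boundary values are assigned.

```python
from collections import Counter

# Magnitude ranges from the slide, as (low, high, label); half-open
# intervals [low, high) are an assumption of this sketch.
MAGNITUDE_BINS = [
    (5.8, 6.0, "5.8-6"),
    (6.0, 7.0, "6-7"),
    (7.0, 8.0, "7-8"),
    (8.0, 9.0, "8-9"),
    (9.0, float("inf"), ">9"),
]


def magnitude_range(magnitude):
    """Return the label of the range containing `magnitude`, or None."""
    for low, high, label in MAGNITUDE_BINS:
        if low <= magnitude < high:
            return label
    return None


def average_per_year(quakes, first_year, last_year):
    """Average yearly earthquake count per magnitude range.

    `quakes` is an iterable of (year, magnitude) pairs; years outside
    [first_year, last_year] and magnitudes below 5.8 are ignored.
    """
    counts = Counter()
    for year, magnitude in quakes:
        label = magnitude_range(magnitude)
        if label is not None and first_year <= year <= last_year:
            counts[label] += 1
    n_years = last_year - first_year + 1
    return {label: counts[label] / n_years for _, _, label in MAGNITUDE_BINS}


# Made-up records, for illustration only.
sample_quakes = [(1900, 5.9), (1900, 6.5), (1901, 6.2), (1901, 7.4), (1901, 5.0)]
averages = average_per_year(sample_quakes, 1900, 1901)
```

Running the same function over two periods, for example before and after the first recorded test, gives the per-range averages that the slide compares.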
Knowledge Discovery - Example…
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted, and the test was conducted within a radius of 10,000 miles of the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 90
A leaking gasoline pipeline burst into flames Thursday, killing more than 60 people near Nigeria’s commercial capital of Lagos. Many of the dead were fishermen in wooden canoes engulfed in the inferno.
More than a dozen burned bodies lay on a beach at the village of Ebute-Oko, facing the central business district of Lagos across a lagoon.
“At least 60 people died in this needless fire,” senior local official Karimu Alabi said.
Fire crews from state-run Nigerian National Petroleum Corp. (NNPC), which owns the pipeline, were joined by other firemen from construction company Julius Berger in battling the blaze.
Residents said the fire started near Ebute-Oko at daybreak and spread rapidly along the line of the oil leak, ravaging a cluster of huts and log houses.
At about the same time, a second fire razed Makoko shantytown, where thousands of fishermen and their families live in wood cabins erected on stilts in the lagoon near Lagos University.
Residents said fishermen from Makoko had been scavenging for gasoline from the leaking pipeline and storing it in cans in the wooden huts for days. Many victims of the Ebute-Oke fire were
Gore Demands That Recount Restart (9:40 PM)
Gore Says Fla. Can’t Name Electors (4:50 PM)
Bush Meets Colin Powell at Ranch (1:22 PM)
Market Tumbles on Earnings Warning (9:27 AM)
Barak Outlines His Peace Plan (6:30 AM)
-- Breaking News for 11/30/2000 --
Customize Page: Settings | Content | Layout | Color
Sixty Die In Nigeria Blast
Produced by: Euronews; Posted Date: 11/30/2000; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore, George W. Bush
Video
• Value-add for production, broadcast & syndication
• Taalee’s semantic metadata enables powerful access to content used by Enterprise’s customers
• Greatly enhances news-room productivity and time-to-market
HP 91
-- Breaking News --
Gore Demands That Recount Restart
Gore Says Fla. Can’t Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
[Clip list (duration - date - source): (1:33) - 12/06/00 - ABC; (2:53) - 12/06/00 - CBS; (5:16) - 12/06/00 - ABC; (2:46) - 12/06/00 - FOX; (1:33) - 12/06/00 - NBC; (5:33) - 12/06/00; (3:57) - 12/06/00 - CBS; (4:27) - 12/06/00 - ABC; (3:44) - 12/06/00 - FOX; (7:24) - 12/06/00 - CBS; (1:33) - 12/06/00 - CBS]
TALLAHASSEE, Florida (CNN) – Though the two presidential candidates have until noon Wednesday to file briefs in Al Gore’s appeal to the Florida Supreme Court, the outcome of two trials set on the same day in Leon County, Florida, may offer Gore his best hope for the presidency. Democrats in Seminole County are seeking to have 15,000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5,000 votes statewide. Lawyers for the plaintiff, Harry Jacobs, claim the ballots should be rejected because, they say, County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2,126 incomplete absentee ballot applications sent in by GOP voters, while refusing to allow Democratic workers to do the same thing for Democratic voters.
The GOP says that suit, and one similar to it from Martin County, demonstrates Democratic Party politics at its most desperate. Gore is not a party to either of those lawsuits. On Tuesday the judge in the
[Clip list (duration - date - source): (1:33) - 12/06/00 - ABC; (2:33) - 12/06/00 - CBS; (3:12) - 12/06/00 - NNS; (0:32) - 12/06/00 - CBS; (1:33) - 12/06/00 - CBS]
Description: Produced by: CNN; Posted Date: 12/07/2000; Reporter: David Lewis; Event: Election 2000; Location: Tallahassee, Florida, USA; People: Al Gore
HP 91
-- Breaking News --Gore Demands That Recount Restart
Gore Says Fla Cant Name Electors
Bush Meets Colin Powell at Ranch
Market Tumbles on Earnings Warning
Barak Outlines His Peace Plan
(133) ndash 120600 - ABC
(253) - 120600 - CBS
(516) - 120600 - ABC
(246) - 120600 - FOX
(133) - 120600 - NBC
(533) - 120600
(357) - 120600 - CBS
(427) - 120600 - ABC
(344) - 120600 - FOX
(724) - 120600 - CBS
(133) - 120600 - CBS
TALLAHASSEE Florida (CNN) ndashThough the two presidential candidates have until noon Wednesday to file briefs in Al Gores appeal to the Florida Supreme Court the outcome of two trials set on the same day in Leon County Florida may offer Gore his best hope for the presidency Democrats in Seminole County are seeking to have 15000 absentee ballots thrown out in that heavily Republican jurisdiction -- a move that would give Gore a lead of up to 5000 votes statewide Lawyers for the plaintiff Harry Jacobs claim the ballots should be rejected because they say County Elections Supervisor Sandra Goard allowed Republican workers to fill out voter identification numbers on 2126 incomplete absentee ballot applications sent in by GOP voters while refusing to allow Democratic workers to do the same thing for Democratic voters
The GOP says that suit and one similar to it from Martin County demonstrates Democratic Party politics at its most desperate Gore is not a party to either of those lawsuits On Tuesday the judge in the
(133) - 120600 - ABC
(233) - 120600 - CBS
(312) - 120600 - NNS
(032) - 120600 - CBS
(133) - 120600 - CBS
DescriptionProduced by CNNPosted Date 12072000 Reporter David Lewis Event Election 2000 Location Tallahassee Florida USAPeople Al Gore
HP 92
Retrieve Scene Description Track
Enhanced Digital Cable
Video
MPEGDecoder
Node = AVO Object
Create Scene Description Tree
GREATUSER
EXPERIENCE
Metadatarsquos role in emerging iTV infrastructure
MPEG-247MPEG
Encoder
SceneDescriptionTree
License metadata decoder and semantic applications to
device makers
Channel salesthrough Video Server Vendors
Video App Servers and Broadcasters
Enhanced XML
Description
ldquoCisco Systemsrdquo
Node
TaaleeSemanticEngine
ldquoCisco Systemsrdquo
Produced by Fox Sports Creation Date 12052000 League NFLTeams Seattle Seahawks
Atlanta Falcons Players John KitnaCoaches Mike Holmgren
Dan ReevesLocation Atlanta
Object Content Information (OCI)
Metadata-richValue-added Node
HP 93
Intelligent Metadata Creation
Content which doescontain the wordsthe user asked for
Extractor Agents
Content which does not contain the words
the user asked for but is about what he asked
for
Value-added Metadata
Content the user did not think to ask for but
which he needs to know
Semantic Associations
+ +
Metadata for Intelligent ContentMetadata for Intelligent Content
Usage
HP 94
Intelligent Contentvia
Value-Added Metadata
HP 95
Value-added MetadataTraditional methods rely solely on (syntactic) indexing of keywords to enable
users to access content
bull If a keyword is not in the content it cannot be found
bull The burden is on the user to think of and ask for the ldquorightrdquo keyword
For example If a story is about ldquoRoger Clemensrdquo but does not contain the
words ldquoNew York Yankeesrdquo that story cannot and will not be found if the user
searches for ldquoNew York Yankeesrdquo or ldquoYankeesrdquo
Understanding of the content is needed to create new metadata
Taalee understands Roger Clemens is a PERSON who Plays a SPORT called
Baseball for a TEAM from New York called the Yankees Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY)
to add missing metadata to describe content more completely
HP 96
Guided Demo for Value Added Metadata ndashExample one
bull Go to httpwwwmediaanywherecomFootballhtml amp search for Player = Jamal Anderson
bull Click on the first result (titled ldquoWeek 3 Top10 Anderson TD Runrdquo) and view the metadata
on the following RMR page
bull Here is what you see
Produced by NFLcom Posted Date 9202000 League NFL
Teams Atlanta Falcons Players Jamal Anderson
bull Now click on the button to play the asset (button marked ldquoREALrdquo)
bull View the source HTML page that has the original story and locate this story with the
heading ldquoWeek 3 top 10 Anderson TD runrdquo
bull Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user
searching for Atlanta Falcons will find this story on Jamal Anderson who is a player of
Atlanta Falcons team
bull Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value-Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html and search for Player = Gary Sheffield.
• Click on the first result (titled “I want out”) and view the metadata on the following RMR page.
• Here is what you see:
  Produced by: ESPN; Posted Date: 3/03/2001; League: National League
  Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked “REAL”).
• View the source HTML page that has the original story and locate this story with the heading “I want out”.
• Verify that neither Team=Los Angeles Dodgers nor League=National League is present in the source content.
• Taalee attached this value-added metadata to this asset’s existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who plays for the Los Angeles Dodgers.
• Note that other search engines and directories will not be able to do this.
HP 98
Example 1 - Snapshots (“Jamal Anderson”)
Search for ‘Jamal Anderson’ in ‘Football’.
Click on the first result for Jamal Anderson.
View the metadata. Note that Team name and League name are also included in the metadata.
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee’s value-additions to the metadata to facilitate easier search.
HP 99
Example 2 - Snapshots (“Gary Sheffield”)
Search for ‘Gary Sheffield’ in ‘Baseball’.
Click on the first result for Gary Sheffield.
View the metadata. Note that Team name and League name are also included in the metadata.
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee’s value-additions to the metadata to facilitate easier search.
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Producer Name: name of the content provider that produced the asset
Posted Date: date of asset posting - extracted automatically
Player Names: names of players mentioned explicitly in the asset - extracted automatically
Sport: name of sport
Team Name: name of the team for which the player plays - not mentioned explicitly in the asset - value-added using Taalee’s semantic relationships
League Name: name of the league to which the player’s team belongs - not mentioned explicitly in the asset - value-added by Taalee’s processing based on semantic associations
Legend: X → Y means Taalee uses X to add Y as value-added metadata to the asset
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.
The asset is richly and fully described in the many ways users choose to interact.
HP 101
Intelligent Content via
Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content.
• They do not understand the meaning, context or relationships of keywords.
For example, a search engine may see that the phrase “Commerce One” occurs, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static and difficult to scale. Taalee’s Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company ‘Commerce One’.
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One’s competitors (Ariba) can be accessed easily and automatically.
HP 104
Metadata-centric Content Management Architecture
Sources: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g., Reuters)
Components: Extractor Agents 1, 2 and 3; Semantic Engine; World Model; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); Third-party Content Mgmt and Syndication
Metadata format: XCM-compliant metadata, XML or other format
1. Cisco story from PW Source 1 is passed on to add semantic associations.
2. Consults Knowledge Base for Cisco’s competition.
3. Returns result: Lucent is a competitor of Cisco.
4. Lucent story from external feeds is picked for publishing as “semantically related” to the Cisco story and passed on to the Dashboard.
Dashboard output: Story on Cisco; Story on Lucent
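The four numbered steps in the architecture slide can be sketched as follows. This is a minimal illustration, assuming a tiny in-memory knowledge base; the `Story` type, the function names and the headlines are hypothetical and are not part of Taalee's product.

```python
from dataclasses import dataclass

# Hypothetical knowledge base of COMPETES WITH associations,
# treated here as symmetric.
COMPETES_WITH = {("Cisco", "Lucent"), ("Commerce One", "Ariba")}


@dataclass(frozen=True)
class Story:
    headline: str      # invented headlines, for illustration only
    companies: tuple   # companies named in the story's extracted metadata
    source: str


def competitors_of(company):
    """Steps 2 and 3: consult the knowledge base for a company's competition."""
    found = set()
    for first, second in COMPETES_WITH:
        if first == company:
            found.add(second)
        elif second == company:
            found.add(first)
    return found


def semantically_related(story, feed):
    """Step 4: pick feed stories about competitors of the companies in `story`."""
    rivals = set()
    for company in story.companies:
        rivals |= competitors_of(company)
    return [other for other in feed if rivals & set(other.companies)]


# Step 1: a Cisco story arrives from an internal source.
cisco_story = Story("Cisco announces new router", ("Cisco",), "internal research")
external_feed = [
    Story("Lucent reports quarterly results", ("Lucent",), "external feed"),
    Story("Ariba signs new customer", ("Ariba",), "external feed"),
]
related = semantically_related(cisco_story, external_feed)
```

Here `related` holds only the Lucent story: it never mentions Cisco, yet it is selected because the knowledge base records that Lucent competes with Cisco.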
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
• Related Stock News
• Industry News
• Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
• Regulations (SEC, EPA): Impacting INDUSTRY or Filed By COMPANY
• Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic collation of semantically related digital media information from multiple sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search, Personalization, etc.
A vision for the future
Semantic Web Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
• Structural modeling is obviously not enough: we need a “logic layer” on top of RDF; some type of description logic is a possibility.
• Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data.
• RDF + DL = “Frame System for WWW”
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
“A Web in which machine reasoning will be ubiquitous and devastatingly powerful.” [Berners-Lee]
“A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture.” [Berners-Lee]
“A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities.” [Berners-Lee]
A personal definition of the Semantic Web: the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods.
What is DAML (DARPA Agent Markup Language)?
• A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
• Based on RDF + XML
• Agent-readable tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-layered Architecture of the Semantic Web
Logical Layer: Formal Semantics and Reasoning Support - OIL, DAML-O
Schema Layer: Definition of Vocabulary - RDF Schema
Data Layer: Simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
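To make the division of labour between the layers concrete, the herbivore example can be mimicked in plain Python. This is a rough sketch, not an RDF or OIL implementation: the dictionaries and function names are hypothetical, and the statement that carnivore is a subclass of animal is an added assumption that the slide does not make.

```python
# Schema layer: rdfs:subClassOf edges (class -> direct superclasses).
# "carnivore" -> "animal" is an assumption added for illustration.
SUBCLASS_OF = {
    "herbivore": {"animal"},
    "carnivore": {"animal"},
}

# Logical layer: pairs of classes that exclude each other, standing in for
# "herbivore is a subclass of NOT carnivore" in the OIL example.
DISJOINT = {frozenset({"herbivore", "carnivore"})}


def superclasses(cls):
    """All classes reachable from `cls` via subClassOf, including `cls`."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(SUBCLASS_OF.get(current, ()))
    return seen


def inferred_types(asserted):
    """Close a set of asserted types under subClassOf."""
    result = set()
    for cls in asserted:
        result |= superclasses(cls)
    return result


def is_consistent(asserted):
    """False if the inferred types contain two classes declared disjoint."""
    types = inferred_types(asserted)
    return not any(pair <= types for pair in DISJOINT)


# Data layer: the types asserted for one hypothetical individual.
cow_types = inferred_types({"herbivore"})
```

The schema layer alone lets a program conclude that a herbivore is also an animal; the logical layer adds the ability to reject a description that claims something is both a herbivore and a carnivore.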
DAML and OIL - Evolving towards Semantic Web
OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Test May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.
NuclearTest Causes Earthquake <=
    dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
    AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
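The rule above can be expressed as an executable predicate. The following Python sketch makes several assumptions that the slide does not state: distance is taken as great-circle distance in miles (haversine formula), dateDifference is measured in days, and a lower bound of zero days is added so that the earthquake must follow the test, as the prose requires. The `Event` type and all names are hypothetical, and the sample events are invented.

```python
import math
from dataclasses import dataclass
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; miles are an assumed unit


@dataclass(frozen=True)
class Event:
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees


def date_difference(test, quake):
    """Days from the nuclear test to the earthquake (negative if the quake came first)."""
    return (quake.event_date - test.event_date).days


def distance(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points (haversine formula)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # min() guards against rounding pushing the argument just above 1.0
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """The 'NuclearTest Causes Earthquake' rule, with the thresholds from the slide."""
    days = date_difference(test, quake)
    miles = distance(test.latitude, test.longitude, quake.latitude, quake.longitude)
    return 0 <= days < max_days and miles < max_miles


# Invented events, used only to exercise the rule.
test_event = Event(date(1990, 1, 1), 37.0, -116.0)
quakes = [
    Event(date(1990, 1, 15), 36.0, -117.5),   # 14 days later, nearby
    Event(date(1990, 3, 1), 36.0, -117.5),    # 59 days later: too late
    Event(date(1989, 12, 20), 36.0, -117.5),  # before the test
]
candidate_pairs = [(test_event, q) for q in quakes if may_have_caused(test_event, q)]
```

Only the first earthquake qualifies. As on the slide, a match means the test "could have caused" the earthquake; the rule selects candidates and does not establish causation.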
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900.
Increase in number of earthquakes since 1945
Knowledge Discovery - Example…
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9 and >9 on the Richter scale, find the average number of earthquakes per year, starting from 1900.
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7.
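The per-year, per-magnitude-range query can be sketched as a simple aggregation. This is an illustrative Python version, not the InfoQuilt query itself: the function names are hypothetical, each range is treated as including its lower bound and excluding its upper bound (an assumption, since the slide does not say how boundaries are handled), and the sample records are made up.

```python
from collections import Counter

# Magnitude ranges from the slide; each is [low, high), the last is open-ended.
BANDS = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, float("inf"))]


def band_of(magnitude):
    """Return the range containing `magnitude`, or None if it is below 5.8."""
    for low, high in BANDS:
        if low <= magnitude < high:
            return (low, high)
    return None


def counts_per_year(quakes, start_year=1900):
    """Count earthquakes per (year, range); `quakes` yields (year, magnitude) pairs."""
    counts = Counter()
    for year, magnitude in quakes:
        band = band_of(magnitude)
        if year >= start_year and band is not None:
            counts[(year, band)] += 1
    return counts


def average_per_year(counts, band, first_year, last_year):
    """Average yearly count for one range over an inclusive span of years.
    Years with no recorded events count as zero."""
    years = range(first_year, last_year + 1)
    return sum(counts[(year, band)] for year in years) / len(years)


# Made-up records purely to exercise the functions; not real seismic data.
sample = [(1901, 6.1), (1901, 6.5), (1902, 7.2), (1902, 5.0), (1899, 8.0)]
counts = counts_per_year(sample)
```

Comparing `average_per_year` for each range before and after a chosen year is the kind of comparison the slide draws its conclusion from.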
Knowledge Discovery - Example…
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted, and the test site lay within a radius of 10000 miles of the epicenter of the earthquake.
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NewsML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
Semantic Web: www.semanticweb.org
VoiceXML: www.voicexml.org
MPEG-7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 92
Metadata’s role in emerging iTV infrastructure
[Diagram] MPEG-2/4/7 chain: MPEG Encoder; Create Scene Description Tree; Scene Description Tree (Node = AVO Object); Enhanced Digital Cable Video; MPEG Decoder; Retrieve Scene Description Track; GREAT USER EXPERIENCE
[Diagram] Node (“Cisco Systems”); Taalee Semantic Engine; Enhanced XML Description; Object Content Information (OCI); Metadata-rich, Value-added Node
Example metadata: Produced by: Fox Sports; Creation Date: 12/05/2000; League: NFL; Teams: Seattle Seahawks, Atlanta Falcons; Players: John Kitna; Coaches: Mike Holmgren, Dan Reeves; Location: Atlanta
License metadata decoder and semantic applications to device makers
Channel sales through Video Server Vendors, Video App Servers and Broadcasters
HP 93
Intelligent Metadata Creation
Extractor Agents: content which does contain the words the user asked for
+ Value-added Metadata: content which does not contain the words the user asked for, but is about what he asked for
+ Semantic Associations: content the user did not think to ask for, but which he needs to know
Metadata for Intelligent Content Usage
HP 93
Intelligent Metadata Creation
Content which doescontain the wordsthe user asked for
Extractor Agents
Content which does not contain the words
the user asked for but is about what he asked
for
Value-added Metadata
Content the user did not think to ask for but
which he needs to know
Semantic Associations
+ +
Metadata for Intelligent ContentMetadata for Intelligent Content
Usage
HP 94
Intelligent Contentvia
Value-Added Metadata
HP 95
Value-added MetadataTraditional methods rely solely on (syntactic) indexing of keywords to enable
users to access content
bull If a keyword is not in the content it cannot be found
bull The burden is on the user to think of and ask for the ldquorightrdquo keyword
For example If a story is about ldquoRoger Clemensrdquo but does not contain the
words ldquoNew York Yankeesrdquo that story cannot and will not be found if the user
searches for ldquoNew York Yankeesrdquo or ldquoYankeesrdquo
Understanding of the content is needed to create new metadata
Taalee understands Roger Clemens is a PERSON who Plays a SPORT called
Baseball for a TEAM from New York called the Yankees Taalee uses these Semantic Associations (COMPANY participates in INDUSTRY)
to add missing metadata to describe content more completely
HP 96
Guided Demo for Value Added Metadata ndashExample one
bull Go to httpwwwmediaanywherecomFootballhtml amp search for Player = Jamal Anderson
bull Click on the first result (titled ldquoWeek 3 Top10 Anderson TD Runrdquo) and view the metadata
on the following RMR page
bull Here is what you see
Produced by NFLcom Posted Date 9202000 League NFL
Teams Atlanta Falcons Players Jamal Anderson
bull Now click on the button to play the asset (button marked ldquoREALrdquo)
bull View the source HTML page that has the original story and locate this story with the
heading ldquoWeek 3 top 10 Anderson TD runrdquo
bull Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user
searching for Atlanta Falcons will find this story on Jamal Anderson who is a player of
Atlanta Falcons team
bull Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata ndashExample Two
bull Go to httpwwwmediaanywherecomBaseballhtml amp search for Player = Gary Sheffield
bull Click on the first result (titled ldquoI want outrdquo) amp view the metadata on the following RMR page
bull Here is what you see
Produced by ESPN Posted Date 3032001 League National League
Teams Los Angeles Dodgers Players Gary Sheffield
bull Now click on the button to play the asset (button marked ldquoREALrdquo)
bull View the source HTML page that has the original story and locate this story with the
heading ldquoI want outrdquo
bull Verify that Team=Los Angeles Dodgers or League=National League was not present in
the source content
bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user
searching for Los Angeles Dodgers will find this story on Gary Sheffield who is a player of
Los Angeles Dodgers team
N t th t th h i d di t i ill t b bl t d thi
HP 98
Example 1 ndash Snapshots (ldquoJamal Andersonrdquo)
Search for lsquoJamal Andersonrsquo in lsquoFootballrsquo
Click on first result for Jamal Anderson
View metadata Note that Team name and League name are also included
in the metadata
View the original source HTML page Verify that
the source page contains no mention of Team nameand League name They
were Taaleersquos value-additions to the metadata to facilitate easier search
HP 99
Example 2 ndash Snapshots (ldquoGary Sheffieldrdquo)
Click on first result for Gary Sheffield
View metadata Note that Team name and League name are also included
in the metadata
Search for lsquoGary Sheffieldrsquo in lsquoBaseballrsquo
View the original source HTML page Verify that
the source page contains no mention of Team nameand League name They
were Taaleersquos value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content ndash Value-Added Metadata
Posted Date
Posted Date
Date of asset posting ndashExtracted automatically
League Name
Name of league to which the payerrsquos team belongs ndash Not mentioned explicitly in asset ndash Value-added by Taaleersquos processing based on semantic associations
Name of team for which player plays ndash Not mentioned explicitly in assetndash Value-added using Taaleersquos semantic relationships
Team NameTeam Name
Producer Name
Producer Name
Rich MediaSports AssetRich Media
Sports Asset
Name of content provider that produced the asset
Some Metadata are obtained explicitly from the asset Others (not present in the asset) are added
by Taalee using its semantic relationships
The asset is richly fully described in the many ways the users chose to interact
Player NamesPlayer Names
SportSportName of
sport
LegendX Y meansTaalee uses X to add Yas value-added metadatato the asset
Name of players mentioned explicitly in the asset ndash Extracted automatically
HP 101
Intelligent Contentvia
Semantic Associations
HP 102
Semantic Associations
bull Traditional search engines rely solely on (syntactic) keywords to find content
bull They do not understand the meaning context or relationships of keywords
For example a search engine may see that the word ldquoCommerce Onerdquo occurs
but it does not know that Commerce One is a COMPANY which Participates in
the Corporate Professional amp Financial Software INDUSTRY and COMPETESWITH Ariba
As a result search engines cannot go beyond returning a list (or directory view)
of what the user has asked for Their ability to provide associated information is
extremely limited static and difficult to scale Taaleersquos Semantic Content Model
goes beyond indexing keywords and classifying assets toUnderstand and Associate all content it catalogs
HP 103
Example (test on httpdirectorymediaanywherecom)
Search for company lsquoCommerce Onersquo
Links to news on companies that compete against
Commerce One
Links to news on companies Commerce One competes
against(To view news on Ariba click
on the link for Ariba)
Crucial news on Commerce Onersquos
competitors (Ariba) can be accessed easily and
automatically
HP 104
Internal Source 1Research
Internal Source 2
External feedsWeb(eg Reuters)
1
2
3
4
Cisco story from PW Source 1passed on to addsemanticassociations
ConsultsKnowledgeBasefor Ciscorsquoscompetition
Returns resultLucent is a competitor of Cisco
Lucent story from external
feeds picked for publishing as ldquosemantically
relatedrdquo to Ciscostory ndash passed
on to Dashboard
Story onLucent
Story onCisco
XCM-compliant metadata XML or other format
SemanticApplication
ASPEnterprise hosted
Extractor Agent 1
Extractor Agent 2
Extractor Agent 3
Metadata centricContent Management Architecture
SemanticEngine
World Model
TaaleeMetabase
Third-partyContent Mgmt
AndSyndication
HP 105
Semantic Associationssupported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANYCOMPANYRelated Stock News
Related Stock News
IndustryNews
IndustryNews
CompetitionCompetitionCOMPANIES inINDUSTRY with Competing PRODUCTSCOMPANIES in Same or
Related INDUSTRY
SECEPAEPA
RegulationsRegulationsImpacting INDUSTRY or Filed By COMPANY
Technology Products
Technology ProductsImportant to INDUSTRY or COMPANY
HP 106
Semantic Web Application ExampleFinancial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic SearchPersonalization etc
A vision for future
Semantic Web Complex Relationships and Knowledge Discovery
Eg InfoQuilt project at LSDIS Lab Univ of Georgia
HP 108
Beyond RDF ndash one proposal (cf Ora Lassila)
Structural modeling obviously not enoughwe need a ldquologic layerrdquo on top of RDFsome type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful particularly if we have logicrules which allow us to draw inference from this data
RDF + DL = ldquoFrame System for WWWrdquo
Source wwwontoknowledgeorgoil
HP 109
Semantic Web - next step in Web evolution
ldquoA Web in which machine reasoning will be ubiquitous and devastatingly powerfulrdquo [Berners-Lee]
ldquoA place where the whim of a human being and the reasoning of a machine coexist in an ideal powerful mixturerdquo [Berners-Lee]
ldquoA semantic Web would permit more accurate and efficient Web searches which are among the most important Web-based activitiesrdquo [Berners-Lee]
A personal definitionSemantic Web The concept that Web-accessible
content can be organized semantically rather than though syntactic and structural methods
What is DAML (DARPA Agent Markup Language)
a proposal to create technologies that will enable software agents to dynamically identify and understand information sources and to provide interoperability between agents in a semantic mannerBased on RDF+XMLAgent readable Tags
wwwdamlorg
DAML Example
Sou
rce
http
w
ww
zdn
etc
omp
cwee
kst
orie
sju
mps
04
270
2432
946
00h
tml
Three-layered Architecture of Semantic Web
Logical Layer: Formal Semantics and Reasoning Support - OIL, DAML-O
Schema Layer: Definition of Vocabulary - RDF Schema
Data Layer: Simple data model and syntax for metadata - RDF
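A minimal sketch of how the three layers could fit together, written in Python rather than RDF. The triples, the class names, and the helper functions are illustrative assumptions, not part of RDF, RDF Schema, OIL, or DAML; the point is only that the data layer holds statements, the schema layer names the vocabulary, and the logical layer derives facts that were never stated.

```python
# Data layer: RDF-style (subject, predicate, object) statements.
# All resource and class names below are hypothetical examples.
TRIPLES = {
    ("herbivore", "subClassOf", "animal"),
    ("carnivore", "subClassOf", "animal"),
    ("rabbit", "subClassOf", "herbivore"),
    ("peter", "type", "rabbit"),
}


def superclasses(cls, triples):
    """Schema layer: follow subClassOf links transitively."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in triples:
            if subject == current and predicate == "subClassOf" and obj not in found:
                found.add(obj)
                frontier.append(obj)
    return found


def inferred_types(resource, triples):
    """Logical layer: every class the resource belongs to, stated or derived."""
    direct = {o for s, p, o in triples if s == resource and p == "type"}
    result = set(direct)
    for cls in direct:
        result |= superclasses(cls, triples)
    return result
```

Only "peter type rabbit" is stated; that peter is also a herbivore and an animal is derived from the schema.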
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL - Evolving towards Semantic Web
OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Test May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake
<= dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
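The rule above can be sketched as an executable predicate. This is an illustration under stated assumptions, not InfoQuilt's implementation: the units are taken to be days and miles, the distance is computed with the haversine great-circle formula, events are plain dictionaries with the field names used in the rule, and the earthquake is required to fall on or after the test date.

```python
import math
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; assumes the rule's distance is in miles


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    a = min(1.0, a)  # guard against rounding slightly above 1
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def could_have_caused(test, quake, max_days=30, max_miles=10000):
    """True if the earthquake follows the test within max_days and max_miles."""
    days = (quake["eventDate"] - test["eventDate"]).days
    if not 0 <= days < max_days:
        return False
    dist = distance_miles(test["latitude"], test["longitude"],
                          quake["latitude"], quake["longitude"])
    return dist < max_miles


def candidate_pairs(tests, quakes):
    """All (test, quake) pairs that satisfy the rule."""
    return [(t, q) for t in tests for q in quakes if could_have_caused(t, q)]
```

A match only says the pair is a candidate for the "Causes" relationship; it is not evidence of causation.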
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year, starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example...
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
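The grouping query above could be sketched as follows. The input format (a list of (year, magnitude) pairs), the half-open bin edges, and the function names are assumptions made for illustration; the slide does not say how boundary magnitudes such as exactly 6 or 9 are assigned.

```python
from collections import Counter

# Magnitude ranges from the slide; each bin is treated as [low, high),
# and the last bin collects everything from 9 upward (an assumption).
BINS = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, float("inf"))]


def magnitude_bin(magnitude):
    """Return the (low, high) bin containing the magnitude, or None if below 5.8."""
    for low, high in BINS:
        if low <= magnitude < high:
            return (low, high)
    return None


def average_per_year(quakes, start_year, end_year):
    """Average yearly earthquake count per magnitude bin over [start_year, end_year]."""
    counts = Counter()
    for year, magnitude in quakes:
        bin_key = magnitude_bin(magnitude)
        if bin_key is not None and start_year <= year <= end_year:
            counts[bin_key] += 1
    n_years = end_year - start_year + 1
    return {bin_key: counts[bin_key] / n_years for bin_key in BINS}
```

Comparing these averages for the years before and after testing began is the kind of check the slide describes for the bins above magnitude 7.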
Knowledge Discovery - Example...
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/9910-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 94
Intelligent Content via Value-Added Metadata
HP 95
Value-added Metadata
Traditional methods rely solely on (syntactic) indexing of keywords to enable users to access content
• If a keyword is not in the content, it cannot be found
• The burden is on the user to think of and ask for the "right" keyword
For example: if a story is about "Roger Clemens" but does not contain the words "New York Yankees", that story cannot and will not be found if the user searches for "New York Yankees" or "Yankees"
Understanding of the content is needed to create new metadata
Taalee understands that Roger Clemens is a PERSON who Plays a SPORT called Baseball for a TEAM from New York called the Yankees
Taalee uses these Semantic Associations (e.g., COMPANY participates in INDUSTRY) to add missing metadata and describe content more completely
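A minimal sketch of the idea, assuming a small hand-built table of associations. The dictionaries, field names, and functions are hypothetical and stand in for a real knowledge base; nothing here reflects how Taalee's Semantic Engine is actually built.

```python
# Hypothetical knowledge base of semantic associations.
PLAYS_FOR = {
    "Roger Clemens": "New York Yankees",
    "Jamal Anderson": "Atlanta Falcons",
}
TEAM_LEAGUE = {
    "New York Yankees": "American League",
    "Atlanta Falcons": "NFL",
}


def add_value_added_metadata(metadata):
    """Return a copy of the metadata with team and league inferred from players."""
    enriched = {key: list(values) for key, values in metadata.items()}
    teams = enriched.setdefault("Teams", [])
    leagues = enriched.setdefault("League", [])
    for player in enriched.get("Players", []):
        team = PLAYS_FOR.get(player)
        if team is None:
            continue  # unknown player: nothing can be inferred
        if team not in teams:
            teams.append(team)
        league = TEAM_LEAGUE.get(team)
        if league is not None and league not in leagues:
            leagues.append(league)
    return enriched


def matches(metadata, keyword):
    """Keyword search over metadata values (case-insensitive substring match)."""
    keyword = keyword.lower()
    return any(keyword in value.lower()
               for values in metadata.values()
               for value in values)
```

With only the extracted field Players = Roger Clemens, a search for "Yankees" misses the story; after enrichment the same search finds it.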
HP 96
Guided Demo for Value Added Metadata - Example One
• Go to http://www.mediaanywhere.com/Football.html & search for Player = Jamal Anderson
• Click on the first result (titled "Week 3 Top10 Anderson TD Run") and view the metadata on the following RMR page
• Here is what you see:
Produced by: NFL.com; Posted Date: 9/20/2000; League: NFL
Teams: Atlanta Falcons; Players: Jamal Anderson
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "Week 3 top 10 Anderson TD run"
• Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Atlanta Falcons will find this story on Jamal Anderson, who is a player on the Atlanta Falcons team
• Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled "I want out") & view the metadata on the following RMR page
• Here is what you see:
Produced by: ESPN; Posted Date: 3/03/2001; League: National League
Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked "REAL")
• View the source HTML page that has the original story and locate this story with the heading "I want out"
• Verify that Team=Los Angeles Dodgers or League=National League was not present in the source content
• Taalee attached this value-added metadata to this asset's existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
• Note that other search engines and directories will not be able to do this
HP 98
Example 1 - Snapshots ("Jamal Anderson")
Search for 'Jamal Anderson' in 'Football'
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 99
Example 2 - Snapshots ("Gary Sheffield")
Search for 'Gary Sheffield' in 'Baseball'
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee's value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Posted Date: Date of asset posting - Extracted automatically
Producer Name: Name of content provider that produced the asset
Player Names: Names of players mentioned explicitly in the asset - Extracted automatically
Sport: Name of sport
Team Name: Name of team for which player plays - Not mentioned explicitly in asset - Value-added using Taalee's semantic relationships
League Name: Name of league to which the player's team belongs - Not mentioned explicitly in asset - Value-added by Taalee's processing based on semantic associations
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships
The asset is richly and fully described in the many ways the users choose to interact
Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content
• They do not understand the meaning, context, or relationships of keywords
For example, a search engine may see that the word "Commerce One" occurs, but it does not know that Commerce One is a COMPANY which Participates in the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale
Taalee's Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs
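A minimal sketch of search that returns associated content along with the direct hits. The relation table, the story records, and the function names are hypothetical; the two competitor pairs are the ones named in these slides, and the lookup is a plain set scan rather than any real Semantic Content Model.

```python
# Hypothetical "competes with" relation, stored as unordered pairs.
COMPETES_WITH = {("Commerce One", "Ariba"), ("Cisco", "Lucent")}


def competitors(company):
    """Companies linked to the given one by the competes-with relation."""
    rivals = set()
    for left, right in COMPETES_WITH:
        if left == company:
            rivals.add(right)
        elif right == company:
            rivals.add(left)
    return rivals


def search_with_associations(company, stories):
    """Return stories about the company plus stories about its competitors."""
    rivals = competitors(company)
    asked_for = [s["title"] for s in stories if company in s["companies"]]
    related = [s["title"] for s in stories
               if company not in s["companies"] and rivals.intersection(s["companies"])]
    return {"asked_for": asked_for, "semantically_related": related}
```

A keyword engine would stop at the "asked_for" list; the second list is what the relation adds.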
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company 'Commerce One'
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (To view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One's competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata-centric Content Management Architecture
Sources: Internal Source 1 (Research), Internal Source 2, External feeds/Web (e.g., Reuters)
Extractor Agent 1, Extractor Agent 2, Extractor Agent 3
Semantic Engine, World Model, Taalee Metabase
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults Knowledge Base for Cisco's competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as "semantically related" to Cisco story - passed on to Dashboard
Story on Lucent, Story on Cisco
XCM-compliant metadata, XML or other format
Semantic Application: ASP/Enterprise hosted
Third-party Content Mgmt and Syndication
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
SEC, EPA
Regulations: Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY
HP 96
Guided Demo for Value Added Metadata ndashExample one
bull Go to httpwwwmediaanywherecomFootballhtml amp search for Player = Jamal Anderson
bull Click on the first result (titled ldquoWeek 3 Top10 Anderson TD Runrdquo) and view the metadata
on the following RMR page
bull Here is what you see
Produced by NFLcom Posted Date 9202000 League NFL
Teams Atlanta Falcons Players Jamal Anderson
bull Now click on the button to play the asset (button marked ldquoREALrdquo)
bull View the source HTML page that has the original story and locate this story with the
heading ldquoWeek 3 top 10 Anderson TD runrdquo
bull Verify that Team=Atlanta Falcons or League=NFL was not present in the source content
bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user
searching for Atlanta Falcons will find this story on Jamal Anderson who is a player of
Atlanta Falcons team
bull Note that other search engines and directories will not be able to do this
HP 97
Guided Demo for Value Added Metadata ndashExample Two
bull Go to httpwwwmediaanywherecomBaseballhtml amp search for Player = Gary Sheffield
bull Click on the first result (titled ldquoI want outrdquo) amp view the metadata on the following RMR page
bull Here is what you see
Produced by ESPN Posted Date 3032001 League National League
Teams Los Angeles Dodgers Players Gary Sheffield
bull Now click on the button to play the asset (button marked ldquoREALrdquo)
bull View the source HTML page that has the original story and locate this story with the
heading ldquoI want outrdquo
bull Verify that Team=Los Angeles Dodgers or League=National League was not present in
the source content
bull Taalee attached this value-added metadata to this assetrsquos existing metadata so that a user
searching for Los Angeles Dodgers will find this story on Gary Sheffield who is a player of
Los Angeles Dodgers team
N t th t th h i d di t i ill t b bl t d thi
HP 98
Example 1 ndash Snapshots (ldquoJamal Andersonrdquo)
Search for lsquoJamal Andersonrsquo in lsquoFootballrsquo
Click on first result for Jamal Anderson
View metadata Note that Team name and League name are also included
in the metadata
View the original source HTML page Verify that
the source page contains no mention of Team nameand League name They
were Taaleersquos value-additions to the metadata to facilitate easier search
HP 99
Example 2 ndash Snapshots (ldquoGary Sheffieldrdquo)
Click on first result for Gary Sheffield
View metadata Note that Team name and League name are also included
in the metadata
Search for lsquoGary Sheffieldrsquo in lsquoBaseballrsquo
View the original source HTML page Verify that
the source page contains no mention of Team nameand League name They
were Taaleersquos value-additions to the metadata to facilitate easier search
HP 100
Intelligent Content ndash Value-Added Metadata
Posted Date
Posted Date
Date of asset posting ndashExtracted automatically
League Name
Name of league to which the payerrsquos team belongs ndash Not mentioned explicitly in asset ndash Value-added by Taaleersquos processing based on semantic associations
Name of team for which player plays ndash Not mentioned explicitly in assetndash Value-added using Taaleersquos semantic relationships
Team NameTeam Name
Producer Name
Producer Name
Rich MediaSports AssetRich Media
Sports Asset
Name of content provider that produced the asset
Some Metadata are obtained explicitly from the asset Others (not present in the asset) are added
by Taalee using its semantic relationships
The asset is richly fully described in the many ways the users chose to interact
Player NamesPlayer Names
SportSportName of
sport
LegendX Y meansTaalee uses X to add Yas value-added metadatato the asset
Name of players mentioned explicitly in the asset ndash Extracted automatically
HP 101
Intelligent Contentvia
Semantic Associations
HP 102
Semantic Associations
bull Traditional search engines rely solely on (syntactic) keywords to find content
bull They do not understand the meaning context or relationships of keywords
For example a search engine may see that the word ldquoCommerce Onerdquo occurs
but it does not know that Commerce One is a COMPANY which Participates in
the Corporate Professional amp Financial Software INDUSTRY and COMPETESWITH Ariba
As a result search engines cannot go beyond returning a list (or directory view)
of what the user has asked for Their ability to provide associated information is
extremely limited static and difficult to scale Taaleersquos Semantic Content Model
goes beyond indexing keywords and classifying assets toUnderstand and Associate all content it catalogs
HP 103
Example (test on httpdirectorymediaanywherecom)
Search for company lsquoCommerce Onersquo
Links to news on companies that compete against
Commerce One
Links to news on companies Commerce One competes
against(To view news on Ariba click
on the link for Ariba)
Crucial news on Commerce Onersquos
competitors (Ariba) can be accessed easily and
automatically
HP 104
Metadata-centric Content Management Architecture
Sources: Internal Source 1 (Research); Internal Source 2; External feeds/Web (e.g. Reuters)
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults Knowledge Base for Cisco’s competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as “semantically related” to Cisco story - passed on to Dashboard
Output: Story on Lucent; Story on Cisco
Components: Extractor Agent 1, Extractor Agent 2, Extractor Agent 3; Semantic Engine; World Model; Taalee Metabase; Semantic Application (ASP/Enterprise hosted); XCM-compliant metadata, XML or other format; Third-party Content Mgmt and Syndication
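The four-step flow on this slide can be sketched in Python as follows. This is an illustration under stated assumptions, not the Semantic Engine itself: the Story class, the COMPETITORS table standing in for the knowledge base, the substring-matching extractor, and the headlines are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Story:
    title: str
    source: str
    companies: list = field(default_factory=list)


# Hypothetical stand-in for the knowledge base: company -> competitors.
COMPETITORS = {"Cisco": ["Lucent"]}


def extract_companies(title, known_companies):
    """Stand-in for an extractor agent: naive case-insensitive match."""
    return [c for c in known_companies if c.lower() in title.lower()]


def add_semantic_associations(story, external_feed, competitors):
    """Steps 2-4: look up rivals, then pick feed stories about them."""
    rivals = [r for c in story.companies for r in competitors.get(c, [])]
    related = [
        s for s in external_feed
        if any(r in s.companies for r in rivals)
    ]
    return {"story": story, "semantically_related": related}
```

The returned bundle is what a dashboard would render: the story that was asked for, plus stories chosen only because of the stored competitor relationship.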
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked For + What You Need to Know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with competing PRODUCTS; COMPANIES in same or related INDUSTRY
Regulations (SEC, EPA): impacting INDUSTRY or filed by COMPANY
Technology Products: important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic collation of semantically related digital media information from multiple sources
Research inferred automatically
Semantically related news not specifically asked for
Semantic Search, Personalization, etc.
A vision for the future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling is obviously not enough: we need a “logic layer” on top of RDF; some type of description logic is a possibility.
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data.
RDF + DL = “Frame System for WWW”
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
“A Web in which machine reasoning will be ubiquitous and devastatingly powerful.” [Berners-Lee]
“A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture.” [Berners-Lee]
“A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities.” [Berners-Lee]
A personal definition: Semantic Web - the concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods.
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner. Based on RDF + XML. Agent-readable tags.
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-layered Architecture of the Semantic Web
Logical Layer: formal semantics and reasoning support - OIL, DAML-O
Schema Layer: definition of vocabulary - RDF Schema
Data Layer: simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
DAML and OIL - Evolving towards the Semantic Web
OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear tests may cause earthquakes. Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.
NuclearTest Causes Earthquake <=
  dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
  AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
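One way to read the rule above as executable code is sketched below in Python. Several things are assumptions rather than part of the slide: the Event class, the use of the haversine great-circle formula for distance(), the Earth radius constant, treating the thresholds as 30 days and 10000 miles, and the extra check that the earthquake does not precede the test (which the prose states but the rule leaves implicit).

```python
import math
from dataclasses import dataclass
from datetime import date

EARTH_RADIUS_MILES = 3958.8  # mean radius; an assumption of this sketch


@dataclass
class Event:
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance via the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to guard against rounding just above 1.0 near antipodes.
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, a)))


def could_have_caused(test, quake, max_days=30, max_miles=10000):
    """Evaluate the 'NuclearTest Causes Earthquake' rule for one pair."""
    days_after = (quake.event_date - test.event_date).days
    if days_after < 0 or days_after >= max_days:
        return False
    return distance_miles(test.latitude, test.longitude,
                          quake.latitude, quake.longitude) < max_miles
```

Applying could_have_caused to every (test, earthquake) pair from the two sources yields the candidate pairs. The rule only expresses temporal and spatial proximity, so a match is a candidate for inspection, not evidence of causation.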
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year, starting from 1900.
Increase in number of earthquakes since 1945
Knowledge Discovery - Example…
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes.
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7.
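The two aggregate queries above could be computed along these lines. This Python sketch assumes earthquake records are available as (year, magnitude) pairs, treats each magnitude range as a half-open interval, and places magnitude 9.0 itself in the top bin. The function names and those boundary choices are assumptions of the sketch, and no real earthquake data is included.

```python
from collections import Counter

# (label, low, high): half-open [low, high) intervals, an assumption.
BINS = [
    ("5.8-6", 5.8, 6.0),
    ("6-7", 6.0, 7.0),
    ("7-8", 7.0, 8.0),
    ("8-9", 8.0, 9.0),
    (">9", 9.0, float("inf")),
]


def bin_label(magnitude):
    """Return the label of the bin containing magnitude, or None."""
    for label, low, high in BINS:
        if low <= magnitude < high:
            return label
    return None


def counts_per_year(quakes, start_year=1900, min_magnitude=5.8):
    """Total earthquakes at or above min_magnitude, per year."""
    counts = Counter()
    for year, magnitude in quakes:
        if year >= start_year and magnitude >= min_magnitude:
            counts[year] += 1
    return dict(counts)


def average_per_year_by_bin(quakes, start_year, end_year):
    """Average yearly count in each magnitude bin over the period."""
    n_years = end_year - start_year + 1
    totals = Counter()
    for year, magnitude in quakes:
        label = bin_label(magnitude)
        if label is not None and start_year <= year <= end_year:
            totals[label] += 1
    return {label: totals[label] / n_years for label, _, _ in BINS}
```

Comparing average_per_year_by_bin for a period before 1945 with one after it is the kind of check the slide's conclusion rests on.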
Knowledge Discovery - Example…
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and the test site lay within a radius of 10000 miles of the epicenter of the earthquake.
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NewsML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
Semantic Web: www.semanticweb.org
VoiceXML: www.voicexml.org
MPEG-7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998.
HP 97
Guided Demo for Value-Added Metadata - Example Two
• Go to http://www.mediaanywhere.com/Baseball.html & search for Player = Gary Sheffield
• Click on the first result (titled “I want out”) & view the metadata on the following RMR page
• Here is what you see:
  Produced by: ESPN; Posted Date: 3032001; League: National League
  Teams: Los Angeles Dodgers; Players: Gary Sheffield
• Now click on the button to play the asset (button marked “REAL”)
• View the source HTML page that has the original story and locate this story with the heading “I want out”
• Verify that neither Team = Los Angeles Dodgers nor League = National League was present in the source content
• Taalee attached this value-added metadata to this asset’s existing metadata so that a user searching for Los Angeles Dodgers will find this story on Gary Sheffield, who is a player on the Los Angeles Dodgers team
N t th t th h i d di t i ill t b bl t d thi
HP 98
Example 1 - Snapshots (“Jamal Anderson”)
Search for ‘Jamal Anderson’ in ‘Football’
Click on first result for Jamal Anderson
View metadata. Note that Team name and League name are also included in the metadata.
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee’s value-additions to the metadata to facilitate easier search.
HP 99
Example 2 - Snapshots (“Gary Sheffield”)
Search for ‘Gary Sheffield’ in ‘Baseball’
Click on first result for Gary Sheffield
View metadata. Note that Team name and League name are also included in the metadata.
View the original source HTML page. Verify that the source page contains no mention of Team name and League name. They were Taalee’s value-additions to the metadata to facilitate easier search.
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Posted Date: date of asset posting - extracted automatically
Producer Name: name of content provider that produced the asset
Player Names: names of players mentioned explicitly in the asset - extracted automatically
Sport: name of the sport
Team Name: name of team for which the player plays - not mentioned explicitly in the asset - value-added using Taalee’s semantic relationships
League Name: name of league to which the player’s team belongs - not mentioned explicitly in the asset - value-added by Taalee’s processing based on semantic associations
Legend: “X → Y” means Taalee uses X to add Y as value-added metadata to the asset
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.
The asset is richly and fully described, in the many ways users choose to interact with it.
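As a minimal sketch of the value-adding step described on this slide, the Python below derives Team and League from extracted player names through two lookup tables. The tables stand in for Taalee's semantic relationships and hold only the Gary Sheffield facts shown in the guided demo. The function and key names are hypothetical.

```python
# Hypothetical stand-ins for the player -> team and team -> league
# relationships; populated only with the facts from the demo slide.
PLAYER_TEAM = {"Gary Sheffield": "Los Angeles Dodgers"}
TEAM_LEAGUE = {"Los Angeles Dodgers": "National League"}


def add_value_added_metadata(extracted, player_team=PLAYER_TEAM,
                             team_league=TEAM_LEAGUE):
    """Return a copy of the extracted metadata with Teams/League added.

    Only "Players" is read from the input; fields already present are
    kept unchanged, and unknown players contribute nothing.
    """
    enriched = dict(extracted)
    teams, leagues = [], []
    for player in extracted.get("Players", []):
        team = player_team.get(player)
        if team is None:
            continue
        if team not in teams:
            teams.append(team)
        league = team_league.get(team)
        if league is not None and league not in leagues:
            leagues.append(league)
    if teams:
        enriched["Teams"] = teams
    if leagues:
        enriched["League"] = leagues
    return enriched
```

With the added fields in place, a search on "Los Angeles Dodgers" can match the asset even though the source story never names the team.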
HP 100
Intelligent Content - Value-Added Metadata
Rich Media Sports Asset
Posted Date: date of asset posting. Extracted automatically.
Player Names: names of players mentioned explicitly in the asset. Extracted automatically.
Producer Name: name of the content provider that produced the asset.
Sport: name of the sport.
Team Name: name of the team for which the player plays. Not mentioned explicitly in the asset; value-added using Taalee’s semantic relationships.
League Name: name of the league to which the player’s team belongs. Not mentioned explicitly in the asset; value-added by Taalee’s processing based on semantic associations.
Some metadata are obtained explicitly from the asset. Others (not present in the asset) are added by Taalee using its semantic relationships.
The asset is richly and fully described in the many ways users choose to interact.
Legend: X -> Y means Taalee uses X to add Y as value-added metadata to the asset.
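The value-added step on this slide can be sketched in a few lines. This is only an illustration, not Taalee's implementation: the lookup tables, field names, and sample values below are all hypothetical, and a real system would consult a knowledge base rather than in-memory dictionaries.

```python
# Hypothetical relationship tables standing in for a knowledge base.
PLAYER_TEAM = {"Player A": "Team X", "Player B": "Team X"}
TEAM_LEAGUE = {"Team X": "League Q"}


def add_value_added_metadata(extracted):
    """Return a copy of the extracted metadata with inferred team and league names."""
    enriched = dict(extracted)
    teams = []
    for player in extracted.get("player_names", []):
        team = PLAYER_TEAM.get(player)
        if team and team not in teams:
            teams.append(team)  # player -> team association
    leagues = []
    for team in teams:
        league = TEAM_LEAGUE.get(team)
        if league and league not in leagues:
            leagues.append(league)  # team -> league association
    enriched["team_names"] = teams
    enriched["league_names"] = leagues
    return enriched


# Metadata extracted directly from the asset (made-up sample values).
asset = {
    "posted_date": "2001-05-15",
    "player_names": ["Player A", "Player B"],
    "producer_name": "Producer P",
    "sport": "Basketball",
}
enriched = add_value_added_metadata(asset)
```

The extracted fields are left untouched; only the two inferred fields are added, which mirrors the split between explicit and value-added metadata on the slide.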
HP 101
Intelligent Content via Semantic Associations
HP 102
Semantic Associations
• Traditional search engines rely solely on (syntactic) keywords to find content.
• They do not understand the meaning, context, or relationships of keywords.
For example, a search engine may see that the phrase “Commerce One” occurs, but it does not know that Commerce One is a COMPANY which PARTICIPATES IN the Corporate, Professional & Financial Software INDUSTRY and COMPETES WITH Ariba.
As a result, search engines cannot go beyond returning a list (or directory view) of what the user has asked for. Their ability to provide associated information is extremely limited, static, and difficult to scale. Taalee’s Semantic Content Model goes beyond indexing keywords and classifying assets to Understand and Associate all content it catalogs.
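To make the contrast with keyword matching concrete, here is a minimal sketch of typed entities and named relationships. The class, method names, and relation labels (`PARTICIPATES_IN`, `COMPETES_WITH`) are assumptions chosen to echo the slide; they do not describe Taalee's actual model.

```python
from collections import defaultdict


class KnowledgeBase:
    """Tiny in-memory store of typed entities and named relationships."""

    def __init__(self):
        self.types = {}                # entity name -> entity type
        self.edges = defaultdict(set)  # (subject, relation) -> set of objects

    def add_entity(self, name, entity_type):
        self.types[name] = entity_type

    def relate(self, subject, relation, obj, symmetric=False):
        self.edges[(subject, relation)].add(obj)
        if symmetric:
            self.edges[(obj, relation)].add(subject)

    def related(self, subject, relation):
        # .get avoids creating empty entries for unknown keys
        return sorted(self.edges.get((subject, relation), set()))


kb = KnowledgeBase()
kb.add_entity("Commerce One", "COMPANY")
kb.add_entity("Ariba", "COMPANY")
kb.add_entity("Corporate, Professional & Financial Software", "INDUSTRY")
kb.relate("Commerce One", "PARTICIPATES_IN", "Corporate, Professional & Financial Software")
kb.relate("Commerce One", "COMPETES_WITH", "Ariba", symmetric=True)
```

A keyword index only knows that the string "Commerce One" occurs; with the store above, `kb.related("Commerce One", "COMPETES_WITH")` can also surface Ariba.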
HP 103
Example (test on http://directory.mediaanywhere.com)
Search for company ‘Commerce One’
Links to news on companies that compete against Commerce One
Links to news on companies Commerce One competes against (to view news on Ariba, click on the link for Ariba)
Crucial news on Commerce One’s competitors (Ariba) can be accessed easily and automatically
HP 104
Metadata-centric Content Management Architecture
Sources: Internal Source 1 (Research), Internal Source 2, External feeds/Web (e.g., Reuters)
Components: Extractor Agent 1, Extractor Agent 2, Extractor Agent 3, Semantic Engine, World Model, Taalee Metabase, Semantic Application (ASP/Enterprise hosted), Third-party Content Mgmt and Syndication
Output: XCM-compliant metadata, XML, or other format
1. Cisco story from PW Source 1 passed on to add semantic associations
2. Consults Knowledge Base for Cisco’s competition
3. Returns result: Lucent is a competitor of Cisco
4. Lucent story from external feeds picked for publishing as “semantically related” to Cisco story, passed on to Dashboard
Result: Story on Cisco, Story on Lucent
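The four numbered steps can be sketched as a small function. This is an assumption-laden illustration: the competitor table, story dictionaries, and the naive substring match for company names are all hypothetical stand-ins for the extractor agents and knowledge base named in the diagram.

```python
def companies_in(text, names):
    """Naive entity spotting: return the known names that occur in the text."""
    return [name for name in names if name in text]


def pick_related_stories(story, external_feed, competitors):
    """Select external stories about competitors of the companies in `story`."""
    known = sorted(set(competitors) | {c for rivals in competitors.values() for c in rivals})
    # Step 1: the incoming story is examined for known companies.
    subjects = companies_in(story["text"], known)
    # Steps 2-3: consult the knowledge base for each subject's competition.
    rivals = sorted({r for s in subjects for r in competitors.get(s, [])})
    # Step 4: pick external stories that mention a competitor.
    related = []
    for candidate in external_feed:
        matched = companies_in(candidate["text"], rivals)
        if matched:
            related.append({"story": candidate, "related_to": subjects, "because": matched})
    return related


# Hypothetical knowledge-base content and made-up story texts.
competitors = {"Cisco": ["Lucent"]}
cisco_story = {"source": "Internal Source 1", "text": "Cisco reports quarterly results"}
feed = [
    {"source": "External feed", "text": "Lucent unveils new switch"},
    {"source": "External feed", "text": "Unrelated market summary"},
]
dashboard_items = pick_related_stories(cisco_story, feed, competitors)
```

Each selected item records why it was chosen, so a dashboard could label it as semantically related rather than as a direct search hit.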
HP 105
Semantic Associations supported by Taalee Semantic Engine
Intelligent Content = What You Asked For + What You Need to Know
COMPANY
Related Stock News
Industry News
Competition: COMPANIES in INDUSTRY with Competing PRODUCTS; COMPANIES in Same or Related INDUSTRY
SEC, EPA Regulations: Impacting INDUSTRY or Filed By COMPANY
Technology Products: Important to INDUSTRY or COMPANY
HP 106
Semantic Web Application Example: Financial Advisor Research Dashboard
Automatic collation of semantically related digital media information from multiple sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic Search/Personalization, etc.
A vision for the future
Semantic Web: Complex Relationships and Knowledge Discovery
E.g., InfoQuilt project at LSDIS Lab, Univ. of Georgia
HP 108
Beyond RDF - one proposal (cf. Ora Lassila)
Structural modeling is obviously not enough: we need a “logic layer” on top of RDF; some type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful, particularly if we have logic/rules which allow us to draw inferences from this data
RDF + DL = “Frame System for WWW”
Source: www.ontoknowledge.org/oil
HP 109
Semantic Web - next step in Web evolution
“A Web in which machine reasoning will be ubiquitous and devastatingly powerful.” [Berners-Lee]
“A place where the whim of a human being and the reasoning of a machine coexist in an ideal, powerful mixture.” [Berners-Lee]
“A semantic Web would permit more accurate and efficient Web searches, which are among the most important Web-based activities.” [Berners-Lee]
A personal definition
Semantic Web: The concept that Web-accessible content can be organized semantically, rather than through syntactic and structural methods
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-layered Architecture of Semantic Web
Logical Layer
Formal Semantics and Reasoning Support - OIL, DAML-O
Schema Layer
Definition of Vocabulary - RDF Schema
Data Layer
Simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
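As a rough illustration of how a program might read such a class definition, the sketch below parses a herbivore definition with Python's standard ElementTree. The `rdf` and `rdfs` namespace URIs are the W3C ones; the `oil` namespace URI is a placeholder assumption, the `rdf:type` line is omitted for brevity, and the output dictionary layout is invented for this example. It reads structure only and performs no description-logic reasoning.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OIL = "http://example.org/oil#"  # placeholder namespace URI (assumption)

DOC = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}" xmlns:oil="{OIL}">
  <rdfs:Class rdf:ID="herbivore">
    <rdfs:subClassOf rdf:resource="#animal"/>
    <rdfs:subClassOf>
      <oil:NOT>
        <oil:hasOperand rdf:resource="#carnivore"/>
      </oil:NOT>
    </rdfs:subClassOf>
  </rdfs:Class>
</rdf:RDF>"""


def describe_classes(xml_text):
    """Map each class ID to its named superclasses and its negated classes."""
    root = ET.fromstring(xml_text)
    result = {}
    for cls in root.findall(f"{{{RDFS}}}Class"):
        name = cls.get(f"{{{RDF}}}ID")
        supers, excluded = [], []
        for sub in cls.findall(f"{{{RDFS}}}subClassOf"):
            target = sub.get(f"{{{RDF}}}resource")
            if target is not None:
                supers.append(target.lstrip("#"))  # plain RDF Schema superclass
            for operand in sub.findall(f"{{{OIL}}}NOT/{{{OIL}}}hasOperand"):
                excluded.append(operand.get(f"{{{RDF}}}resource").lstrip("#"))
        result[name] = {"subclass_of": supers, "not": excluded}
    return result


classes = describe_classes(DOC)
```

An RDF Schema-only consumer would see just the `animal` superclass; the negated `carnivore` operand is the part that OIL adds on top.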
DAML and OIL - Evolving towards Semantic Web
OIL Mission
OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Test May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region
NuclearTest Causes Earthquake
<= dateDifference( NuclearTest.eventDate, Earthquake.eventDate ) < 30
AND distance( NuclearTest.latitude, NuclearTest.longitude, Earthquake.latitude, Earthquake.longitude ) < 10000
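The rule above can be read as a predicate over two events. The sketch below is one possible reading, under stated assumptions: dates are compared in days, distance is great-circle distance in miles computed with the haversine formula, and the earthquake must not precede the test (the slide's prose says "after", the rule itself only bounds the difference). The `Event` type and function names are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


@dataclass(frozen=True)
class Event:
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, via the haversine formula."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(min(1.0, sqrt(a)))


def could_have_caused(test, quake, max_days=30, max_miles=10000):
    """True if the quake follows the test within max_days and max_miles."""
    days = (quake.event_date - test.event_date).days
    near = distance_miles(test.latitude, test.longitude,
                          quake.latitude, quake.longitude) < max_miles
    return 0 <= days < max_days and near


# Made-up sample events for illustration only.
sample_test = Event(date(1970, 1, 1), 37.0, -116.0)
sample_quake = Event(date(1970, 1, 15), 36.0, -117.0)
is_candidate = could_have_caused(sample_test, sample_quake)
```

Applying the predicate to every (test, earthquake) combination yields candidate pairs only; it expresses a possible relationship, not evidence of causation.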
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude 5.8 or higher on the Richter scale per year starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example…
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale per year starting from 1900, find the average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
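The two queries above amount to counting earthquakes per year within magnitude ranges. A minimal sketch follows; the half-open treatment of range boundaries, the function names, and the sample records are assumptions made for illustration and are not real USGS or NEIC data.

```python
from collections import Counter, defaultdict

# (low, high, label): low <= magnitude < high. Boundary handling is an assumption.
BINS = [
    (5.8, 6.0, "5.8-6"),
    (6.0, 7.0, "6-7"),
    (7.0, 8.0, "7-8"),
    (8.0, 9.0, "8-9"),
    (9.0, float("inf"), ">9"),
]


def magnitude_bin(magnitude):
    """Return the range label for a magnitude, or None if it is below 5.8."""
    for low, high, label in BINS:
        if low <= magnitude < high:
            return label
    return None


def counts_per_year(quakes, start_year=1900):
    """Count earthquakes per year and magnitude range from (year, magnitude) pairs."""
    table = defaultdict(Counter)
    for year, magnitude in quakes:
        label = magnitude_bin(magnitude)
        if year >= start_year and label is not None:
            table[year][label] += 1
    return table


def average_per_year(table, label, first_year, last_year):
    """Average yearly count for one magnitude range over an inclusive span of years."""
    years = range(first_year, last_year + 1)
    return sum(table.get(y, Counter())[label] for y in years) / len(years)


# Made-up sample records: (year, magnitude).
sample_quakes = [(1899, 7.5), (1900, 5.9), (1900, 6.5), (1900, 6.9),
                 (1901, 7.2), (1901, 5.0), (1902, 9.1)]
sample_table = counts_per_year(sample_quakes)
```

Comparing `average_per_year` for spans before and after a chosen year, range by range, is the kind of comparison behind the slide's observation about magnitudes above and below 7.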
Knowledge Discovery - Example…
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and in a radius of 10000 miles from the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NEWSML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
SEMANTICWEB: www.semanticweb.org
VOICEXML: www.voicexml.org
MPEG7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw Hill, ISBN 0-07-057735-8, 1998
HP 104
Internal Source 1Research
Internal Source 2
External feedsWeb(eg Reuters)
1
2
3
4
Cisco story from PW Source 1passed on to addsemanticassociations
ConsultsKnowledgeBasefor Ciscorsquoscompetition
Returns resultLucent is a competitor of Cisco
Lucent story from external
feeds picked for publishing as ldquosemantically
relatedrdquo to Ciscostory ndash passed
on to Dashboard
Story onLucent
Story onCisco
XCM-compliant metadata XML or other format
SemanticApplication
ASPEnterprise hosted
Extractor Agent 1
Extractor Agent 2
Extractor Agent 3
Metadata centricContent Management Architecture
SemanticEngine
World Model
TaaleeMetabase
Third-partyContent Mgmt
AndSyndication
HP 105
Semantic Associationssupported by Taalee Semantic Engine
Intelligent Content = What You Asked for + What you need to know
COMPANYCOMPANYRelated Stock News
Related Stock News
IndustryNews
IndustryNews
CompetitionCompetitionCOMPANIES inINDUSTRY with Competing PRODUCTSCOMPANIES in Same or
Related INDUSTRY
SECEPAEPA
RegulationsRegulationsImpacting INDUSTRY or Filed By COMPANY
Technology Products
Technology ProductsImportant to INDUSTRY or COMPANY
HP 106
Semantic Web Application ExampleFinancial Advisor Research Dashboard
Automatic Collation of semantically related digital media information from Multiple Sources
Research Inferred Automatically
Semantically Related News Not Specifically Asked For
Semantic SearchPersonalization etc
A vision for future
Semantic Web Complex Relationships and Knowledge Discovery
Eg InfoQuilt project at LSDIS Lab Univ of Georgia
HP 108
Beyond RDF ndash one proposal (cf Ora Lassila)
Structural modeling obviously not enoughwe need a ldquologic layerrdquo on top of RDFsome type of description logic is a possibility
Exposing a wide variety of data sources as RDF is useful particularly if we have logicrules which allow us to draw inference from this data
RDF + DL = ldquoFrame System for WWWrdquo
Source wwwontoknowledgeorgoil
HP 109
Semantic Web - next step in Web evolution
ldquoA Web in which machine reasoning will be ubiquitous and devastatingly powerfulrdquo [Berners-Lee]
ldquoA place where the whim of a human being and the reasoning of a machine coexist in an ideal powerful mixturerdquo [Berners-Lee]
ldquoA semantic Web would permit more accurate and efficient Web searches which are among the most important Web-based activitiesrdquo [Berners-Lee]
A personal definitionSemantic Web The concept that Web-accessible
content can be organized semantically rather than though syntactic and structural methods
What is DAML (DARPA Agent Markup Language)?
A proposal to create technologies that will enable software agents to dynamically identify and understand information sources, and to provide interoperability between agents in a semantic manner
Based on RDF + XML
Agent-readable tags
www.daml.org
DAML Example
Source: http://www.zdnet.com/pcweek/stories/jumps/0,4270,2432946,00.html
Three-Layered Architecture of the Semantic Web
Logical Layer
Formal semantics and reasoning support - OIL, DAML-O
Schema Layer
Definition of vocabulary - RDF Schema
Data Layer
Simple data model and syntax for metadata - RDF
OIL - as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
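As a rough illustration of how an agent might read such a class definition, the sketch below parses the herbivore example with Python's standard xml.etree.ElementTree and pulls out the named superclass and the negated operand. The enclosing rdf:RDF element, the rdfs namespace URI and especially the oil namespace URI (a placeholder) are assumptions added so the fragment parses on its own; read_class() is a hypothetical helper, not an OIL or RDF API.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"   # assumed namespace URI
OIL = "http://example.org/oil#"                  # placeholder, not the real OIL URI

DOC = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}" xmlns:oil="{OIL}">
  <rdfs:Class rdf:ID="herbivore">
    <rdfs:subClassOf rdf:resource="#animal"/>
    <rdfs:subClassOf>
      <oil:NOT>
        <oil:hasOperand rdf:resource="#carnivore"/>
      </oil:NOT>
    </rdfs:subClassOf>
  </rdfs:Class>
</rdf:RDF>"""


def read_class(xml_text):
    """Return (class name, named superclasses, classes excluded via oil:NOT)."""
    root = ET.fromstring(xml_text)
    cls = root.find(f"{{{RDFS}}}Class")
    name = cls.get(f"{{{RDF}}}ID")
    supers, excluded = [], []
    for sub in cls.findall(f"{{{RDFS}}}subClassOf"):
        target = sub.get(f"{{{RDF}}}resource")
        if target is not None:                       # plain named superclass
            supers.append(target.lstrip("#"))
        for operand in sub.findall(f"{{{OIL}}}NOT/{{{OIL}}}hasOperand"):
            excluded.append(operand.get(f"{{{RDF}}}resource").lstrip("#"))
    return name, supers, excluded


name, supers, excluded = read_class(DOC)
print(name, supers, excluded)
```

The RDF Schema part (subClassOf animal) is readable by any RDF-aware tool, while the oil:NOT construct carries the extra meaning "is not a carnivore" that only an OIL-aware reasoner would use.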
DAML and OIL - Evolving towards Semantic Web
OIL Mission: OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.
Knowledge Discovery - Example
Earthquake Sources (USGS, NEIC)
Nuclear Test Sources (Oklahoma Observatory, etc.)
Nuclear Tests May Cause Earthquakes
Is it really true?
Complex Relationships
A nuclear test could have caused an earthquake if the earthquake occurred some time after the nuclear test was conducted and in a nearby region.
NuclearTest Causes Earthquake <=
    dateDifference(NuclearTest.eventDate, Earthquake.eventDate) < 30
    AND distance(NuclearTest.latitude, NuclearTest.longitude,
                 Earthquake.latitude, Earthquake.longitude) < 10000
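The rule above can be sketched as an executable predicate. This is only an illustration under stated assumptions: the date difference is taken in days and the distance in miles (as in the query later in the example), distance is computed with the haversine great-circle formula, and the Event class, the function names and the sample coordinates are hypothetical rather than InfoQuilt's actual representation.

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


@dataclass(frozen=True)
class Event:
    event_date: date
    latitude: float   # degrees
    longitude: float  # degrees


def distance_miles(a, b):
    """Great-circle distance between two events (haversine formula)."""
    dlat = radians(b.latitude - a.latitude)
    dlon = radians(b.longitude - a.longitude)
    h = (sin(dlat / 2) ** 2
         + cos(radians(a.latitude)) * cos(radians(b.latitude)) * sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * asin(min(1.0, sqrt(h)))


def may_have_caused(test, quake, max_days=30, max_miles=10000):
    """True if the quake follows the test by fewer than max_days days
    and lies closer than max_miles miles."""
    days = (quake.event_date - test.event_date).days
    return 0 <= days < max_days and distance_miles(test, quake) < max_miles


# Made-up events, for illustration only.
test = Event(date(2000, 1, 1), 0.0, 0.0)
near_quake = Event(date(2000, 1, 15), 1.0, 1.0)
early_quake = Event(date(1999, 12, 20), 1.0, 1.0)
far_quake = Event(date(2000, 1, 2), 0.0, 180.0)

print(may_have_caused(test, near_quake))   # meets both conditions
print(may_have_caused(test, early_quake))  # occurred before the test
print(may_have_caused(test, far_quake))    # on the opposite side of the globe
```

Applying may_have_caused to every (test, earthquake) pair from the two sources yields the candidate pairs the example asks for; the relationship expresses possible causation only, not proof.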
Knowledge Discovery - Example
When was the first recorded nuclear test conducted?
1950
Find the total number of earthquakes with a magnitude of 5.8 or higher on the Richter scale per year, starting from 1900
Increase in number of earthquakes since 1945
Knowledge Discovery - Example...
For each group of earthquakes with magnitudes in the ranges 5.8-6, 6-7, 7-8, 8-9, and >9 on the Richter scale, per year starting from 1900, find the average number of earthquakes
Number of earthquakes with magnitude > 7 almost constant. So nuclear tests probably only cause earthquakes with magnitude < 7
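The grouping query can be sketched as follows. This is a minimal illustration, assuming each earthquake is available as a (year, magnitude) pair and that the ranges are half-open (a magnitude of exactly 6.0 falls in 6-7, and exactly 9.0 in the top range); the names BANDS, band_of and average_per_year and the sample records are hypothetical.

```python
from collections import Counter

# Magnitude ranges from the slide, treated as half-open intervals [low, high).
BANDS = [(5.8, 6.0), (6.0, 7.0), (7.0, 8.0), (8.0, 9.0), (9.0, float("inf"))]


def band_of(magnitude):
    """Return the (low, high) band containing the magnitude, or None if below 5.8."""
    for low, high in BANDS:
        if low <= magnitude < high:
            return (low, high)
    return None


def average_per_year(quakes, first_year, last_year):
    """quakes: iterable of (year, magnitude).
    Returns {band: average number of earthquakes per year in the period}."""
    totals = Counter()
    for year, magnitude in quakes:
        band = band_of(magnitude)
        if band is not None and first_year <= year <= last_year:
            totals[band] += 1
    n_years = last_year - first_year + 1
    return {band: totals[band] / n_years for band in BANDS}


# Made-up records, for illustration only.
sample = [(1900, 5.9), (1900, 6.5), (1901, 6.2), (1899, 7.5), (1901, 5.0)]
averages = average_per_year(sample, 1900, 1901)
print(averages)
```

Running the same function over two periods, for example before and after the first recorded test, gives per-band averages that can be compared, which is the comparison the slide's conclusion rests on.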
Knowledge Discovery - Example...
Find pairs of nuclear tests and earthquakes such that the earthquake occurred within 30 days after the test was conducted and the test site was within a radius of 10000 miles of the epicenter of the earthquake
Demo
Resources/References
RDF: www.w3.org/TR/REC-rdf-syntax
ICE: www.icestandard.org
Meta Object Facility (MOF) Specification, Version 1.3, September 27, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-09-05
XML Metadata Interchange (XMI) Specification, Version 1.1, October 25, 1999: http://cgi.omg.org/cgi-bin/doc?ad/99-10-02, http://cgi.omg.org/cgi-bin/doc?ad/99-10-03
DAML: www.daml.org
NewsML: newsshowcase.reuters.com
PRISM: www.prismstandard.org/techdev/prismspec1.asp
XCM: www.vignette.com
OIL: www.ontoknowledge.org/oil
Semantic Web: www.semanticweb.org
VoiceXML: www.voicexml.org
MPEG-7: www.darmstadt.gmd.de/mobile/MPEG7
Taalee: www.taalee.com
Oingo: www.oingo.com
Multimedia Data Management: Using Metadata to Integrate and Apply Digital Media, Amit Sheth and Wolfgang Klas, Eds., McGraw-Hill, ISBN 0-07-057735-8, 1998
Three-Layered Architecture of the Semantic Web
Logical Layer
Formal Semantics and Reasoning Support – OIL, DAML-O
Schema Layer
Definition of Vocabulary – RDF Schema
Data Layer
Simple data model and syntax for metadata – RDF
OIL – as RDF Extension
<rdfs:Class rdf:ID="herbivore">
  <rdf:type rdf:resource="http://www.ontoknowledge.org/#DefinedClass"/>
  <rdfs:subClassOf rdf:resource="#animal"/>
  <rdfs:subClassOf>
    <oil:NOT>
      <oil:hasOperand rdf:resource="#carnivore"/>
    </oil:NOT>
  </rdfs:subClassOf>
</rdfs:Class>
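To show how such a class definition can be read by a program, here is a small Python sketch using the standard library's `xml.etree.ElementTree`. It is a plain XML walk, not an RDF or OIL reasoner. The `rdf` and `rdfs` namespace URIs are the W3C ones; the `oil` namespace URI is a placeholder chosen for this example, and the wrapping `rdf:RDF` element and the function name are likewise assumptions.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OIL = "http://example.org/oil#"  # placeholder, not the real OIL namespace

DOC = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}" xmlns:oil="{OIL}">
  <rdfs:Class rdf:ID="herbivore">
    <rdfs:subClassOf rdf:resource="#animal"/>
    <rdfs:subClassOf>
      <oil:NOT>
        <oil:hasOperand rdf:resource="#carnivore"/>
      </oil:NOT>
    </rdfs:subClassOf>
  </rdfs:Class>
</rdf:RDF>"""


def describe_class(xml_text: str):
    """Return (class id, named superclasses, classes excluded via oil:NOT)."""
    root = ET.fromstring(xml_text)
    cls = root.find(f"{{{RDFS}}}Class")
    name = cls.get(f"{{{RDF}}}ID")
    superclasses, excluded = [], []
    for sub in cls.findall(f"{{{RDFS}}}subClassOf"):
        # A subClassOf with rdf:resource names a superclass directly.
        target = sub.get(f"{{{RDF}}}resource")
        if target is not None:
            superclasses.append(target.lstrip("#"))
        # A nested oil:NOT expresses "subclass of the complement of ...".
        for operand in sub.findall(f"{{{OIL}}}NOT/{{{OIL}}}hasOperand"):
            excluded.append(operand.get(f"{{{RDF}}}resource").lstrip("#"))
    return name, superclasses, excluded
```

Calling `describe_class(DOC)` reports that `herbivore` is a subclass of `animal` and of the complement of `carnivore`, which is the extra expressiveness OIL layers on top of RDF Schema.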
DAML and OIL – Evolving towards the Semantic Web
OIL Mission
OIL is a Web-based representation and inference layer for ontologies, which combines the widely used modeling primitives from frame-based languages with the formal semantics and reasoning services provided by description logics.