This slide deck provides a high level overview of semantics and digital asset management systems, and shows how semantic technologies play a major part in making assets more discoverable and manageable
Semantics & the Art of Metadata Management
Avalon Consulting
Kurt Cagle
Principal Evangelist
Avalon Consulting, LLC
Who We Are
• Avalon Consulting, LLC
  • Enterprise Web Presence, Enterprise Search, Big Data Solutions
  • Founded 2003, headquartered in Plano, TX with consultants placed nationally
  • Clients include media companies (Disney, Warner Bros.), publishers (McGraw Hill, Wolters Kluwer), government agencies (US Nat’l Archives, Library of Congress, US Patent & Trademark Office, DoD, OECD)
  • http://www.avalonconsult.com/DAMstrategies
• Kurt Cagle
  • Principal Evangelist, Semantic Technologies & Machine Learning
  • Author of 18 books on web technologies
  • [email protected]
The Problems with Metadata
• Metadata Describes Things …
  • Videos, Documents, Pictures, People, Companies, Music, Places, Concepts, Units, etc.
• But Search Looks for Words, Not Things
  • Proximity of terms, not proximity of concepts or items
• Context Is Important, Yet Often Lost
  • A “large” planet is large on a different scale than a “large” molecule
• Controlled Vocabularies Aren’t
  • Lists of terms change, often rapidly
Metadata in the Wild
• Lack of discoverability
• Potential for production or distribution errors
• Siloization of metadata
• Internationalization woes
• Spiraling metadata capture costs
• Others (outside entities) control your messaging
Metadata Acquisition
• Production Time Metadata
  • Entered by the producers of data
  • Subject Matter Experts, highest quality, need consistent standards
• Curational Metadata
  • Entered by archivists
  • Time/personnel intensive, less context, more error prone
• Synthetic Curators
  • Image, Speech, Music Recognition, OCR, Document/Entity Enrichment
  • Third party data providers and linked data
  • Fastest, requires context, but limited by data base
  • Semantics goes here
Any reasonable curation strategy should use all three approaches
Semantic Modeling
People, places, things and events can be linked and modeled
[Diagram. Nodes: Event 1, Event 2, Jane Doe, John Smith, GeoRegion 1, GeoRegion 2, 18 Dec, ABC Corp, XYZ Corp, Aircraft, Document 1, Document 2. Edge labels: Actor, Org, Location, Time, About, Member Of, Subsidiary, Document.]
Semantic Modeling
People, places, things and events can be linked and modeled
[Diagram. Nodes: org:ABC_Corp, org:XYZ_Corp, topic:Aircraft, event:Evt1, event:Evt2, person:JaneDoe, person:JohnSmith, geoRegion:Eurasia, geoRegion:Americas, time:2014-11-24, time:2014-12-18. Edge labels: person:memberOf, event:organization, event:actor, event:location, event:about, event:startTime, org:subsidiary.]
Semantic Modeling
People, places, things and events can be linked and modeled

org:ABC_Corp rdf:type class:Organization;
    org:name "ABC Corporation".
org:XYZ_Corp rdf:type class:Organization;
    org:name "XYZ Corporation";
    org:subsidiary org:ABC_Corp.
person:JaneDoe rdf:type class:Person;
    person:name "Jane Doe";
    person:org org:ABC_Corp.
person:JohnSmith rdf:type class:Person;
    person:name "John Smith";
    person:org org:XYZ_Corp.
geoRegion:Eurasia rdf:type class:GeoRegion.
geoRegion:Americas rdf:type class:GeoRegion.
topic:Aircraft rdf:type class:Project;
    topic:name "Aircraft".
event:Evt1 rdf:type class:Event;
    event:agent person:JaneDoe;
    event:about topic:Aircraft;
    event:location geoRegion:Eurasia, geoRegion:Americas;
    event:org org:ABC_Corp;
    event:label "Training Module";
    event:startDate "2014-11-24"^^xs:date;
    event:endDate "2014-11-26"^^xs:date.
event:Evt2 rdf:type class:Event;
    event:agent person:JohnSmith;
    event:about topic:Aircraft;
    event:location geoRegion:Americas;
    event:label "Surveillance Component";
    event:org org:XYZ_Corp;
    event:startDate "2014-12-18"^^xs:date;
    event:endDate "2014-12-21"^^xs:date.
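As a minimal sketch of how statements like these can be held and matched in a program, the Python below stores a few of the triples above as plain (subject, predicate, object) tuples. The `match` helper and the in-memory list are illustrative assumptions, not part of any RDF library; a production system would use a real triple store.

```python
# Illustrative only: a handful of the slide's triples as
# (subject, predicate, object) tuples held in a plain list.
TRIPLES = [
    ("org:ABC_Corp", "rdf:type", "class:Organization"),
    ("org:ABC_Corp", "org:name", "ABC Corporation"),
    ("org:XYZ_Corp", "rdf:type", "class:Organization"),
    ("org:XYZ_Corp", "org:name", "XYZ Corporation"),
    ("org:XYZ_Corp", "org:subsidiary", "org:ABC_Corp"),
    ("person:JaneDoe", "rdf:type", "class:Person"),
    ("person:JaneDoe", "person:name", "Jane Doe"),
    ("person:JaneDoe", "person:org", "org:ABC_Corp"),
    ("person:JohnSmith", "rdf:type", "class:Person"),
    ("person:JohnSmith", "person:name", "John Smith"),
    ("person:JohnSmith", "person:org", "org:XYZ_Corp"),
]


def match(triples, s=None, p=None, o=None):
    """Return every triple that fits the pattern; None acts as a wildcard."""
    return [
        (ts, tp, to)
        for ts, tp, to in triples
        if (s is None or ts == s)
        and (p is None or tp == p)
        and (o is None or to == o)
    ]


# Everything typed as an organization.
organizations = [s for s, _, _ in match(TRIPLES, p="rdf:type", o="class:Organization")]

# Names of the people attached to ABC Corp: follow person:org, then person:name.
abc_people = [
    name
    for person, _, _ in match(TRIPLES, p="person:org", o="org:ABC_Corp")
    for _, _, name in match(TRIPLES, s=person, p="person:name")
]

print(organizations)
print(abc_people)
```

The second lookup chains two patterns through a shared subject, which is the same idea a graph query language expresses with shared variables.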
Inferencing
Using RDF to Find Relationships and Surface New Information

Select ?eventLabel ?orgName ?actorName ?startDate ?endDate where {
    ?project rdf:type class:Project;
        project:name $projectName.
    ?event event:project ?project;
        event:actor ?actor;
        event:label ?eventLabel;
        event:startDate ?startDateISO;
        event:endDate ?endDateISO.
    ?actor actor:name ?actorName;
        actor:memberOf ?org.
    ?org org:name ?orgName.
    bind (format-date(?startDateISO, "[MNn] [D], [Y]") as ?startDate)
    bind (format-date(?endDateISO, "[MNn] [D], [Y]") as ?endDate)
} order by ?startDateISO ?endDateISO
For a given project, identify the names of the actors, and their associated organizations, who were involved in an event focused on that project, along with when the event started and ended, sorted by start and end dates respectively.
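The join that the query performs can be sketched in plain Python, under some loud assumptions: the sample triples below are hypothetical and shaped to fit the query's property names, every property is treated as single-valued, and the `subjects`, `value`, and `format_date` helpers are illustrative stand-ins rather than any library's API.

```python
from datetime import date

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# Hypothetical sample data using the property names that appear in the query.
TRIPLES = [
    ("topic:Aircraft", "rdf:type", "class:Project"),
    ("topic:Aircraft", "project:name", "Aircraft"),
    ("event:Evt2", "event:project", "topic:Aircraft"),
    ("event:Evt2", "event:actor", "person:JohnSmith"),
    ("event:Evt2", "event:label", "Surveillance Component"),
    ("event:Evt2", "event:startDate", date(2014, 12, 18)),
    ("event:Evt2", "event:endDate", date(2014, 12, 21)),
    ("event:Evt1", "event:project", "topic:Aircraft"),
    ("event:Evt1", "event:actor", "person:JaneDoe"),
    ("event:Evt1", "event:label", "Training Module"),
    ("event:Evt1", "event:startDate", date(2014, 11, 24)),
    ("event:Evt1", "event:endDate", date(2014, 11, 26)),
    ("person:JaneDoe", "actor:name", "Jane Doe"),
    ("person:JaneDoe", "actor:memberOf", "org:ABC_Corp"),
    ("person:JohnSmith", "actor:name", "John Smith"),
    ("person:JohnSmith", "actor:memberOf", "org:XYZ_Corp"),
    ("org:ABC_Corp", "org:name", "ABC Corporation"),
    ("org:XYZ_Corp", "org:name", "XYZ Corporation"),
]


def subjects(p, o):
    """All subjects that have predicate p with object o."""
    return [s for s, tp, to in TRIPLES if tp == p and to == o]


def value(s, p):
    """First object for (s, p), or None; assumes single-valued properties."""
    for ts, tp, to in TRIPLES:
        if ts == s and tp == p:
            return to
    return None


def format_date(d):
    """Render a date as 'Month D, YYYY' without relying on the locale."""
    return f"{MONTHS[d.month - 1]} {d.day}, {d.year}"


def events_for_project(project_name):
    rows = []
    for project in subjects("project:name", project_name):
        if value(project, "rdf:type") != "class:Project":
            continue
        for event in subjects("event:project", project):
            actor = value(event, "event:actor")
            org = value(actor, "actor:memberOf")
            rows.append((
                value(event, "event:startDate"),
                value(event, "event:endDate"),
                value(event, "event:label"),
                value(org, "org:name"),
                value(actor, "actor:name"),
            ))
    # Sort on the raw dates, as the query's "order by" does, then format.
    rows.sort(key=lambda r: (r[0], r[1]))
    return [(label, org, actor, format_date(start), format_date(end))
            for start, end, label, org, actor in rows]


rows = events_for_project("Aircraft")
for row in rows:
    print(row)
```

Sorting happens on the ISO dates before formatting, since formatted month names would otherwise sort alphabetically rather than chronologically.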
Inferencing
Using RDF to Find Relationships and Surface New Information
For a given project, identify the names of the actors, and their associated organizations, who were involved in an event focused on that project, along with when the event started and ended, sorted by start and end dates respectively.

eventLabel          orgName    actorName    startDate     endDate
Training Module 5   ABC Corp.  Jane Doe     Nov 24, 2014  Dec 17, 2014
Surveillance Comp.  XYZ, Inc.  John Smith   Dec 18, 2014  Mar 12, 2015
Nav Sys 1           ABC Corp.  Jane Doe     Dec 19, 2014  Feb 9, 2015
Gyro Sys 3          ABC Corp.  Jane Doe     Dec 19, 2014  Mar 1, 2015
AutoSensor 2        XYZ, Inc.  Steve Deere  Dec 21, 2014  Apr 7, 2015
Left Fore Aileron   XYZ, Inc.  John Smith   Mar 14, 2015  Jun 1, 2015
Semantic Data
• Assets and concepts are globally identified
• Discoverable Models
• Open World Assumption
  • Incompleteness of knowledge is a given
• Queries are Web Aware, Distributed & Federated
• Standardized, Simple RESTful Interfaces
• Format Agnostic – XML, JSON, Turtle, Yours
• Data models are refinable
• Plays well with Hadoop, NoSQL, Big Data Solutions
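The open world assumption can be illustrated with a small sketch. Under a closed-world reading, a statement that is not in the data is treated as false; under an open-world reading, it is simply unknown. The fact set and both function names below are hypothetical, chosen only to show the contrast.

```python
from typing import Optional

# Hypothetical facts; anything not listed here is merely unstated.
FACTS = {
    ("person:JaneDoe", "person:org", "org:ABC_Corp"),
    ("person:JohnSmith", "person:org", "org:XYZ_Corp"),
}


def closed_world(s: str, p: str, o: str) -> bool:
    """Closed world: a missing statement is taken to be false."""
    return (s, p, o) in FACTS


def open_world(s: str, p: str, o: str) -> Optional[bool]:
    """Open world: a missing statement is unknown (None), never false."""
    return True if (s, p, o) in FACTS else None


question = ("person:JaneDoe", "person:org", "org:XYZ_Corp")
print(closed_world(*question))  # missing, so treated as false
print(open_world(*question))    # missing, so left as unknown
```

In this sketch the open-world function never returns False, because nothing in the data explicitly denies a statement; that is the sense in which incompleteness of knowledge is a given.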
Search + Semantics
• Search identifies potential related terms
• Semantics build context from search results
• With context, relationships can be followed
• With related items, relevance extends beyond terminology
• Semantics builds navigational structures, and binds atomic content together
• Enables more effective media search
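One way to picture how following relationships extends relevance beyond terminology is the sketch below. It runs a plain keyword match over titles, then follows a shared-topic link to surface assets whose titles never mention the search term. The asset identifiers, titles, and the `asset:about` predicate are all hypothetical.

```python
# Hypothetical assets: identifier -> free-text title used by keyword search.
TITLES = {
    "asset:video1": "Aileron installation walkthrough",
    "asset:doc7": "Gyro calibration notes",
    "asset:img3": "Hangar floor plan",
}

# Hypothetical semantic links between assets and topics.
LINKS = [
    ("asset:video1", "asset:about", "topic:Aircraft"),
    ("asset:doc7", "asset:about", "topic:Aircraft"),
    ("asset:img3", "asset:about", "topic:Facilities"),
]


def keyword_search(term):
    """Word-based search: assets whose title contains the term."""
    term = term.lower()
    return sorted(a for a, title in TITLES.items() if term in title.lower())


def related(asset):
    """Assets that share at least one topic with the given asset."""
    topics = {o for s, p, o in LINKS if s == asset and p == "asset:about"}
    return sorted({s for s, p, o in LINKS
                   if p == "asset:about" and o in topics and s != asset})


def semantic_search(term):
    """Return (direct keyword hits, extra items reached through shared topics)."""
    hits = keyword_search(term)
    extra = sorted({r for h in hits for r in related(h)} - set(hits))
    return hits, extra


hits, extra = semantic_search("aileron")
print(hits)
print(extra)
```

Here the gyro document is surfaced for an "aileron" search only because both assets are about the same topic, not because of any shared wording.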
Who’s Becoming Semantic?
Metadata Managed
Semantics
• establishes a metadata management framework,
• simplifies automated ingest,
• constrains manual metadata capture,
• increases search relevance,
• enables “atomization” of content,
• learns over time,
• encourages data interchange.
Questions?