© COPYRIGHT 2017 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.
The Value of Metadata
Matt Turner
CTO Media & Entertainment | @matt_turner_nyc | [email protected]
#Dataversity
SLIDE: 2
Disclaimer
At no time will the speaker presenting this webinar seek to declaratively define the elusive and debatable term ‘metadata’, and attendees should be warned that this term will be used broadly in an attempt to convey the value and importance of this data. In deference to Mike Ellis, Head of Production Architecture at the BBC, this presentation will attempt not to use the term ‘technical metadata’ because that’s just ‘technical data’, isn’t it? However, audience members should consider themselves warned that there may be other uses of the term ‘metadata’ in this presentation and ensuing discussion that some audience members could find unsuitable.
Sincerely, the Metadata Presenter Disclaimer Committee
• “Metadata is anything that describes or helps understand the data objects itself.”
• “Metadata is the term used to describe all the information about data, i.e. data name, type, business definition, business rules, etc.”
• “Data about data; including name, definition, Data Steward, data type, valid values, location, security classification, source of record, volumetrics, lifecycle and retention, and lineage.”
• “The context within which a given piece of data exists. Where it came from, where it is headed, what it is used for, why it is used that way, when it was created / last altered etc.”
Quotes from Emerging Trends in Metadata Management, Oct 2016
SLIDE: 6
The Value of Metadata
1. MANAGEMENT | 2. GOVERNANCE | 3. OPERATIONS | 4. ANALYTICS | 5. DISCOVERY | 6. ARCHIVE
SEMANTICS | TAXONOMIES | TECHNICAL | BUSINESS | PROVENANCE
Metadata is the key to leveraging data across every type of data process.
Does this make metadata the most important type of data?
• Two-thirds of respondents said that metadata is more important now than it was ten years ago
• 36% of respondents aren’t doing metadata management or have been doing it for less than a year
• More than 25% of respondents said they were unable to manage critical metadata, including RDBMS data, data warehouses and business glossaries
SLIDE: 9
Metadata in Silos
1. MANAGEMENT | 2. GOVERNANCE | 3. OPERATIONS | 4. ANALYTICS | 5. DISCOVERY | 6. ARCHIVE
SEMANTICS | TAXONOMIES | TECHNICAL | BUSINESS | PROVENANCE
Metadata is Sneaky!
SLIDE: 10
Metadata in Rows and Columns
Define everything up front
Complexity of schema versus flexibility
- How many of each entity?
- What deserves a separate table? (the sparse data problem)
Selective inclusion of data
- MUST be designed for a single purpose
Difficult to adapt and include new data
Title    ProductionDate    Category    AssetType    Length
Film1    3/1/14            Feature     HD Master    2:40
Show1    6/4/13            Series      HD720        0:40
Film2    6/4/05            Feature     Archive      1:55
SLIDE: 11
With Fixed Taxonomies
Hierarchical levels of metadata
Fixed to a specific business purpose
Each asset can only be associated with one level
- How many category fields?
- What if they overlap?
Category
  Feature
    Action, Drama, Comedy, Documentary…
  Series
    Cable
      Drama, Comedy…
    Broadcast
      Action, Drama, Family, Documentary…
SLIDE: 12
Traditional Approach
Multiple categories, overlapping entities and real-world data problems stymie this approach
- 100s of metadata fields
- Not to mention provenance, governance and mastering
Metadata for a specific purpose can’t be used across all parts of the business
- The model is inflexible and incomplete
SLIDE: 13
Result?
Complexity
Inflexibility
Lost Time!
SLIDE: 16
Schema Flexibility with NoSQL
What if you didn’t need to define everything up front?
And what if many of the nasty data modeling issues with metadata were just as simple as adding new elements?
What if you could keep EVERYTHING?
- Values, provenance … every bit of data
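The idea of "just adding new elements" can be sketched in a few lines of Python. This is only an illustration using plain dictionaries as documents, not MarkLogic's API; the helper names (`with_elements`, `find`) and the `Provenance` and `Genres` values are hypothetical. It assumes Python 3.9 or later.

```python
from typing import Any

Document = dict[str, Any]


def with_elements(doc: Document, **elements: Any) -> Document:
    """Return a copy of doc with extra elements added; no schema change is needed."""
    return {**doc, **elements}


def find(docs: list[Document], **criteria: Any) -> list[Document]:
    """Return documents whose fields equal every criterion; absent fields never match."""
    return [d for d in docs if all(k in d and d[k] == v for k, v in criteria.items())]


# The three rows from the earlier table, held as free-form documents.
assets: list[Document] = [
    {"Title": "Film1", "ProductionDate": "3/1/14", "Category": "Feature",
     "AssetType": "HD Master", "Length": "2:40"},
    {"Title": "Show1", "ProductionDate": "6/4/13", "Category": "Series",
     "AssetType": "HD720", "Length": "0:40"},
    {"Title": "Film2", "ProductionDate": "6/4/05", "Category": "Feature",
     "AssetType": "Archive", "Length": "1:55"},
]

# Add provenance to one record only (hypothetical values); other records are untouched.
assets[2] = with_elements(assets[2], Provenance={"source": "example-archive"})

# Overlapping categories become a list on the one document that needs it (hypothetical values).
assets[1] = with_elements(assets[1], Genres=["Drama", "Comedy"])

features = find(assets, Category="Feature")
```

Records that lack `Provenance` or `Genres` simply do not carry those elements, so there are no empty columns to manage.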
SLIDE: 17
Semantics to Model Relationships
Data model to manage relationships and link together data
‘Triples’ describe single facts
Collections of facts describe complex real-world scenarios
"Chevy" livesIn "New York"
"New York" isIn "USA"
"Chevy" livesIn "USA"
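A minimal Python sketch of triples as (subject, predicate, object) tuples follows. The rule that combines livesIn with isIn is an assumption made for illustration, and the function names are hypothetical; a real triple store would express this with an ontology or inference rules rather than hand-written code. It assumes Python 3.9 or later.

```python
Triple = tuple[str, str, str]

# Two single facts, each stored as one triple.
facts: set[Triple] = {
    ("Chevy", "livesIn", "New York"),
    ("New York", "isIn", "USA"),
}


def infer_lives_in(triples: set[Triple]) -> set[Triple]:
    """Apply the assumed rule (x livesIn y) and (y isIn z) => (x livesIn z) until nothing new appears."""
    known = set(triples)
    while True:
        new = {
            (s, "livesIn", o2)
            for (s, p, o) in known if p == "livesIn"
            for (s2, p2, o2) in known if p2 == "isIn" and s2 == o
        } - known
        if not new:
            return known
        known |= new


def objects(triples: set[Triple], subject: str, predicate: str) -> set[str]:
    """Return every object linked to subject by predicate."""
    return {o for (s, p, o) in triples if s == subject and p == predicate}


all_facts = infer_lives_in(facts)
where_chevy_lives = objects(all_facts, "Chevy", "livesIn")
```

Here `where_chevy_lives` contains both "New York" and "USA": the second value was never stored directly but follows from the two stored facts.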
SLIDE: 18
Ontologies Instead of Categories
Actually model information as it is in the real world
Not limited to a single purpose
- Ontologies for all categories of metadata
- Even ‘impossible’ categories like fictional worlds
- Especially complex data used in governance and business glossaries
SLIDE: 19
NoSQL and Semantics: a New Way to Manage Metadata!
SLIDE: 20
The MarkLogic Alternative
An Operational and Transactional Enterprise NoSQL Database
EASY TO GET DATA IN: Flexible Data Model
- Data ingested as is (no ETL)
- Structured and unstructured data
- Data and metadata together
- Adapts to changing data and changing data structures
EASY TO GET DATA OUT: Ask Anything Universal Index
- Index once and query endlessly
- Real-time and lightning fast
- Multi-model database unifies query across JSON, XML, text, geospatial, and semantic triples in one database
TRUSTED TO RUN YOUR BUSINESS: Enterprise Ready
- Flexible cloud deployment
- Enterprise-grade data security
- Reliable data and transactions (100% ACID compliant)
- Seamless integration with your existing environment
SLIDE: 21
Technical Asset Inventory System That Enriches Data and Provides Real-Time Analytics
Morgan Stanley – TAI FIRE
WHY NOSQL + SEMANTICS?
- Faster data integration
- Structured and unstructured data
- Graph view with semantics
- Bitemporal audit trail
- Real-time analytics
[Diagram labels: 100+ DATA FEEDS (TAI CORE, VOLUME MGMT SYSTEM, FIXED ASSET RELISTER, AFS USAGE (FILE SYSTEM), SERVER REPORT, ENTERPRISE SERVER, PORTFOLIO, LDAP, + MORE); TAI CLASSIC; TAI FIRE DNA; SEMANTIC LAYER (ONTOLOGIES); TEMPORAL LAYER; REAL-TIME SEARCH & QUERY; ASSET MANAGEMENT & TRACKING]
SLIDE: 22
The Results
Timely – First production deployment within 6 months
Flexible – Integrated over 100 data sources
Cost efficient – Elasticity with commodity hardware
Modern – Smarter data with semantics and bitemporal
Success – “TAI FIRE has turned into one of the bank’s most important systems, with significant growth plans” (Managing Director, Morgan Stanley)
SLIDE: 23
Integrated Data, Integrated Intelligence, Surveillance, and Reconnaissance (ISR)
U.S. Combatant Command – DCGS-SOF Data Layer
WHY NOSQL + SEMANTICS?
- Oracle data model too rigid
- Complex data integration
- Trusted platform
- Lower cost in the cloud
- Support for disconnected, intermittent, latent (DIL) operations
[Diagram labels: DISCONNECTED USERS; FORWARD OPERATING BASE; THEATER COMMAND; HQ OR EXTERNAL USERS; DATA ENRICHMENT; METADATA CATALOG; ENTERPRISE DATA LAYER]
SLIDE: 24
The Results
Scalable – 100+ million documents on 70 clustered servers
Fast – 59 times faster than relational solution
Efficient – 57% reduction in disk space used
Flexible – Replicated data for global sharing
Success – Converting data to information
SLIDE: 27
Semantics-driven search
[Graph labels recovered from the slide]
Nodes: Talent: Kristen Wiig; Episode 4: Anne Hathaway and Killers; Character: Maharelle Sister; Season 34; Segment: The Lawrence Welk Show; Date: 10/4/08; Era
Relationships: Acted in; Part of; Played; Aired on; Includes
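Semantics-driven search of this kind can be sketched as a walk over triples. The example below is a rough illustration, not the SNL 40 app's implementation: which relationship connects which pair of nodes is an assumption, since the slide's layout did not survive, and the predicate spellings and the function name `related` are hypothetical.

```python
from collections import deque

# ASSUMED edges: node and relationship names come from the slide, the pairings do not.
triples = [
    ("Kristen Wiig", "actedIn", "Episode 4"),
    ("Kristen Wiig", "played", "Maharelle Sister"),
    ("Episode 4", "partOf", "Season 34"),
    ("Episode 4", "airedOn", "10/4/08"),
    ("Episode 4", "includes", "The Lawrence Welk Show"),
]


def related(triples, start, max_hops=2):
    """Breadth-first walk over triples in either direction; returns {node: hops from start}."""
    neighbors = {}
    for s, _p, o in triples:
        neighbors.setdefault(s, set()).add(o)
        neighbors.setdefault(o, set()).add(s)
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in neighbors.get(node, ()):
            if nxt not in seen:
                seen[nxt] = seen[node] + 1
                queue.append(nxt)
    del seen[start]
    return seen


hits = related(triples, "Kristen Wiig")
```

A search that starts from a talent name can then surface episodes, characters, seasons and segments by how many relationships away they are, instead of relying on a fixed category field.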
SLIDE: 29
Great Content + Great Metadata =
SLIDE: 30
The Complete Picture of Metadata
1. MANAGEMENT | 2. GOVERNANCE | 3. OPERATIONS | 4. ANALYTICS | 5. DISCOVERY | 6. ARCHIVE
SEMANTICS | TAXONOMIES | TECHNICAL | BUSINESS | PROVENANCE
SLIDE: 32
Get Value From Your Metadata!
How can I make metadata mission critical in my organization?
- Emerging Trends in Metadata Management: http://whitepapers.dataversity.net/content60809
- Getty #MetadataMonday: http://blogs.getty.edu/iris/metadata-specialists-share-their-challenges-defeats-and-triumphs/
- SNL 40: http://www.marklogic.com/blog/live-on-enterprise-nosql-its-the-snl-40-app/
How can I get started with a metadata project?
- Financial Services: http://www.marklogic.com/resources/regulatory-compliance-insight-one-solution
- Healthcare: http://www.marklogic.com/blog/accelerate-real-world-evidence
What does a project look like?
- Enterprise Metadata Management Whitepaper: http://www.marklogic.com/resources/agile-enterprise-metadata-management/resource_download/whitepapers/
SLIDE: 33
Getting Started
MarkLogic University: http://www.marklogic.com/training/
Real World Evidence Webinar: http://www.marklogic.com/events/real-world-data-transforming-disconnected-data-invaluable-insight/
Meet: www.marklogic.com/company/contact-us/
Thank You!
Matt Turner
CTO Media & Entertainment | @matt_turner_nyc | [email protected]
#Dataversity