
© COPYRIGHT 2017 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.

The Value of Metadata

CTO Media & Entertainment | @matt_turner_nyc | Matt [email protected]

Matt Turner

#Dataversity

SLIDE: 2

Disclaimer

At no time will the speaker presenting this webinar seek to declaratively define the elusive and debatable term ‘metadata’ and attendees should be warned that this term, ‘metadata’, will be used broadly in an attempt to convey the value and importance of this data. In deference to Mike Ellis, Head of Production Architecture at the BBC, this presentation will attempt to not use the term ‘technical metadata’ because, that’s just ‘technical data’, isn’t it! However, audience members should consider themselves warned that there may be other uses of the term ‘metadata’ in this presentation and ensuing discussion that some audience members could find unsuitable.

Sincerely, the Metadata Presenter Disclaimer Committee

Metadata All Around Assets

… And Every Type of Data

• “Metadata is anything that describes or helps understand the data objects itself.”

• “Metadata is the term used to describe all the information about data, i.e. data name, type, business definition, business rules, etc.”

• “Data about data; including name, definition, Data Steward, data type, valid values, location, security classification, source of record, volumetrics, lifecycle and retention, and lineage.”

• “The context within which a given piece of data exists. Where it came from, where it is headed, what it is used for, why it is used that way, when it was created / last altered etc.”

Quotes from Emerging Trends in Metadata Management, Oct 2016

SLIDE: 6

The Value of Metadata

1 MANAGEMENT | 2 GOVERNANCE | 3 OPERATIONS | 4 ANALYTICS | 5 DISCOVERY | 6 ARCHIVE

SEMANTICS | TAXONOMIES | TECHNICAL | BUSINESS | PROVENANCE

Metadata is the key to leveraging data across every type of data process

Does this make metadata the most important type of data?

• Two-thirds of respondents said that metadata is more important now than it was ten years ago

• 36% of respondents aren’t doing metadata management or have been doing it for less than a year

• More than 25% of respondents said they were unable to manage critical metadata, including RDBMS data, data warehouses and business glossaries

What’s Going On?

SLIDE: 9

Metadata in Silos

1 MANAGEMENT | 2 GOVERNANCE | 3 OPERATIONS | 4 ANALYTICS | 5 DISCOVERY | 6 ARCHIVE

SEMANTICS | TAXONOMIES | TECHNICAL | BUSINESS | PROVENANCE

Metadata is Sneaky!

SLIDE: 10

Metadata in Rows and Columns

Define everything up front

Complexity of schema versus flexibility

- How many of each entity?

- What deserves a separate table? (the sparse data problem)

Selective inclusion of data

- MUST be designed for a single purpose

Difficult to adapt and include new data

Title   ProductionDate   Category   AssetType   Length
Film1   3/1/14           Feature    HD Master   2:40
Show1   6/4/13           Series     HD720       0:40
Film2   6/4/05           Feature    Archive     1:55
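To make the "define everything up front" point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name `assets`, the snake_case column names, and the extra `rights_holder` field are assumptions made for illustration; only the three sample rows come from the slide.

```python
import sqlite3

# In-memory database; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE assets ("
    "title TEXT, production_date TEXT, category TEXT, "
    "asset_type TEXT, length TEXT)"
)
conn.executemany(
    "INSERT INTO assets VALUES (?, ?, ?, ?, ?)",
    [
        ("Film1", "3/1/14", "Feature", "HD Master", "2:40"),
        ("Show1", "6/4/13", "Series", "HD720", "0:40"),
        ("Film2", "6/4/05", "Feature", "Archive", "1:55"),
    ],
)


def try_insert_with_rights(connection):
    """Try to store a field (hypothetical 'rights_holder') the schema may lack."""
    try:
        connection.execute(
            "INSERT INTO assets (title, rights_holder) VALUES (?, ?)",
            ("Film3", "Studio A"),
        )
        return True
    except sqlite3.OperationalError:
        # Raised because the column does not exist in the fixed schema.
        return False


# The new field is rejected until the schema itself is changed.
rejected_before = not try_insert_with_rights(conn)
conn.execute("ALTER TABLE assets ADD COLUMN rights_holder TEXT")
accepted_after = try_insert_with_rights(conn)

# Every pre-existing row now carries an empty value for the new column,
# which is the sparse data problem in miniature.
sparse_rows = conn.execute(
    "SELECT COUNT(*) FROM assets WHERE rights_holder IS NULL"
).fetchone()[0]
```

Each new kind of metadata needs a schema change first, and the rows that predate it are left with empty cells.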

SLIDE: 11

With Fixed Taxonomies

Hierarchical levels of metadata

Fixed to a specific business purpose

Each asset can only be associated with one level

- How many category fields?

- What if they overlap?

[Diagram: category taxonomy]

Category

Feature | Series

Action | Drama | Comedy | Documentary | …

Cable | Broadcast

Drama | Comedy | …

Action | Drama | Family | Documentary | …
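The overlap problem can be sketched in a few lines of Python. The parent/child links below are an assumed reading of the diagram (the exact hierarchy is not recoverable from the slide text), and the single `asset_category` field per asset is likewise a hypothetical stand-in for a fixed taxonomy column.

```python
# Assumed hierarchy for illustration only; each node has exactly one parent.
PARENT = {
    "Feature": "Category",
    "Series": "Category",
    "Feature/Drama": "Feature",
    "Feature/Comedy": "Feature",
    "Series/Cable": "Series",
    "Series/Broadcast": "Series",
    "Series/Cable/Drama": "Series/Cable",
    "Series/Cable/Comedy": "Series/Cable",
}


def path_to_root(node):
    """Walk from a node up to the top of the taxonomy."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


# One category field per asset: a comedy-drama has to pick a single leaf.
asset_category = {"Show1": "Series/Cable/Drama"}


def is_in(asset, node):
    """True if the asset's single category sits at or under the given node."""
    return node in path_to_root(asset_category[asset])
```

With this model `is_in("Show1", "Series")` holds, but `is_in("Show1", "Series/Cable/Comedy")` does not, even if the show is also a comedy. The only fixes are more category fields or a different model.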

SLIDE: 12

Traditional Approach

Multiple categories, overlapping entities and real-world data problems stymie this approach

- 100s of metadata fields

- Not to mention provenance, governance and mastering

Metadata for a specific purpose can’t be used across all parts of the business

- The model is inflexible and incomplete

Title   ProductionDate   Category   AssetType   Length
Film1   3/1/14           Feature    HD Master   2:40
Show1   6/4/13           Series     HD720       0:40
Film2   6/4/05           Feature    Archive     1:55

[Diagram: category taxonomy]

Category

Feature | Series

Action | Drama | Comedy | Documentary | …

Cable | Broadcast

Drama | Comedy | …

Action | Drama | Family | Documentary | … ?

SLIDE: 13

Result?

Complexity

Inflexibility

Lost Time!

SLIDE: 14

A New Approach

SLIDE: 16

Schema Flexibility with NoSQL

What if you didn’t need to define everything up front?

And what if many of the nasty data modeling issues with metadata were just as simple as adding new elements?

What if you could keep EVERYTHING?

- Values, provenance … every bit of data
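As a rough illustration of "just adding new elements", the sketch below stores each asset as a free-form document (a plain Python dict that round-trips through JSON). It is not MarkLogic code; the field names `genres` and `provenance`, their values, and the `find` helper are all assumptions made up for this example.

```python
import json

# Two documents in the same collection with different shapes.
documents = [
    {
        "title": "Film1",
        "productionDate": "3/1/14",
        "category": "Feature",
        "assetType": "HD Master",
        "length": "2:40",
    },
    {
        "title": "Show1",
        "productionDate": "6/4/13",
        "category": "Series",
        "assetType": "HD720",
        "length": "0:40",
        # Elements added later, with no schema change (hypothetical fields).
        "genres": ["Drama", "Comedy"],
        "provenance": {"source": "ingest-feed-A", "loaded": "2017-01-15"},
    },
]


def find(docs, field, value):
    """Return documents whose field equals the value or, for lists, contains it."""
    hits = []
    for doc in docs:
        stored = doc.get(field)  # Missing fields are simply skipped.
        if stored == value or (isinstance(stored, list) and value in stored):
            hits.append(doc)
    return hits


# The documents serialize as-is, provenance included.
serialized = json.dumps(documents)
```

Here `find(documents, "genres", "Comedy")` returns only the `Show1` document, while older documents without a `genres` element stay valid and queryable on the fields they do have.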

SLIDE: 17

Semantics to Model Relationships

Data model to manage relationships and link together data

‘triples’ describe single facts

Collections of facts describe complex real-world scenarios

"Chevy" livesIn "New York"

"New York" isIn "USA"

"Chevy" livesIn "USA"
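A minimal sketch of how single-fact triples combine into a richer picture, using plain Python tuples rather than any particular triple store. The `match` and `infer_lives_in` helpers, and the rule that `livesIn` carries through `isIn`, are assumptions about what the diagram on this slide illustrates.

```python
# Each fact is a (subject, predicate, object) triple.
triples = {
    ("Chevy", "livesIn", "New York"),
    ("New York", "isIn", "USA"),
}


def match(store, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return {
        t
        for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    }


def infer_lives_in(store):
    """Assumed rule: if X livesIn Y and Y isIn Z, then X livesIn Z."""
    facts = set(store)
    while True:
        new = {
            (x, "livesIn", z)
            for (x, p1, y) in facts
            if p1 == "livesIn"
            for (y2, p2, z) in facts
            if p2 == "isIn" and y2 == y
        }
        if new <= facts:  # Nothing new was derived, so stop.
            return facts
        facts |= new


all_facts = infer_lives_in(triples)
```

After inference, `all_facts` also contains `("Chevy", "livesIn", "USA")`, a fact that was never stored directly.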

SLIDE: 18

Ontologies Instead of Categories

Actually model information as it is in the real world

Not limited to a single purpose

- Ontologies for all categories of metadata

- Even ‘impossible’ categories like fictional worlds

- Especially complex data used in governance and business glossaries
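One way to picture the difference from a fixed category tree: in an ontology a class can sit under several broader classes at once. The sketch below is only an illustration; the class names (`CableDramedy`, `CableSeries`, `Genre`) and the `all_classes` helper are hypothetical and not taken from the presentation.

```python
from collections import deque

# Hypothetical ontology fragment: a class may have several broader classes.
SUBCLASS_OF = {
    "CableDramedy": {"Drama", "Comedy", "CableSeries"},
    "CableSeries": {"Series"},
    "Drama": {"Genre"},
    "Comedy": {"Genre"},
}

# Assets are linked to classes instead of being filed under one category.
INSTANCE_OF = {"Show1": {"CableDramedy"}}


def all_classes(asset):
    """Collect every class an asset belongs to, following subclass links."""
    seen = set()
    queue = deque(INSTANCE_OF.get(asset, ()))
    while queue:
        cls = queue.popleft()
        if cls not in seen:
            seen.add(cls)
            queue.extend(SUBCLASS_OF.get(cls, ()))
    return seen
```

`all_classes("Show1")` includes `Drama`, `Comedy`, and `Series` together, so the same asset can serve a genre view and a distribution view without duplicate category fields.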

SLIDE: 19

NoSQL and Semantics: a New Way to Manage Metadata!

SLIDE: 20

The MarkLogic Alternative

An Operational and Transactional Enterprise NoSQL Database

EASY TO GET DATA IN: Flexible Data Model

- Data ingested as is (no ETL)
- Structured and unstructured data
- Data and metadata together
- Adapts to changing data and changing data structures

EASY TO GET DATA OUT: Ask Anything Universal Index

- Index once and query endlessly
- Real-time and lightning fast
- Multi-model database unifies query across JSON, XML, text, geospatial, and semantic triples in one database

TRUSTED TO RUN YOUR BUSINESS: Enterprise Ready

- Flexible cloud deployment
- Enterprise-grade data security
- Reliable data and transactions (100% ACID compliant)
- Seamless integration with your existing environment

SLIDE: 21

Technical Asset Inventory System That Enriches Data and Provides Real-Time Analytics

Morgan Stanley – TAI FIRE

WHY NOSQL + SEMANTICS?

- Faster data integration
- Structured and unstructured data
- Graph view with semantics
- Bitemporal audit trail
- Real-time analytics

[Diagram labels: TAI CORE; VOLUME MGMT SYSTEM; FIXED ASSET RELISTER; AFS USAGE (FILE SYSTEM); SERVER REPORT; ENTERPRISE SERVER; PORTFOLIO; LDAP; + MORE; TAI CLASSIC; TAI FIRE DNA; SEMANTIC LAYER (ONTOLOGIES); TEMPORAL LAYER; REAL-TIME SEARCH & QUERY; 100+ DATA FEEDS; ASSET MANAGEMENT & TRACKING]

SLIDE: 22

Technical Asset Inventory System That Enriches Data and Provides Real-Time Analytics

Morgan Stanley – TAI FIRE

WHY MARKLOGIC?

- Faster data integration
- Structured and unstructured data
- Graph view with semantics
- Bitemporal audit trail
- Real-time analytics

[Diagram labels: TAI CORE; VOLUME MGMT SYSTEM; FIXED ASSET RELISTER; AFS USAGE (FILE SYSTEM); SERVER REPORT; ENTERPRISE SERVER; PORTFOLIO; LDAP; + MORE; TAI CLASSIC; TAI FIRE DNA; SEMANTIC LAYER (ONTOLOGIES); TEMPORAL LAYER; REAL-TIME SEARCH & QUERY; 100+ DATA FEEDS; ASSET MANAGEMENT & TRACKING]

The Results

- Timely – First production deployment within 6 months
- Flexible – Integrated over 100 data sources
- Cost efficient – Elasticity with commodity hardware
- Modern – Smarter data with semantics and bitemporal
- Success – “TAI FIRE has turned into one of the bank’s most important systems, with significant growth plans” (MANAGING DIRECTOR, MORGAN STANLEY)

SLIDE: 23

Integrated Data, Integrated Intelligence, Surveillance, and Reconnaissance (ISR)

U.S. Combatant Command – DCGS-SOF Data Layer

WHY NOSQL + SEMANTICS?

- Oracle data model too rigid
- Complex data integration
- Trusted platform
- Lower cost in the cloud
- Support for disconnected, intermittent, latent (DIL) operations

[Diagram labels: DISCONNECTED USERS; FORWARD OPERATING BASE; THEATER COMMAND; HQ OR EXTERNAL USERS; DATA ENRICHMENT; METADATA CATALOG; ENTERPRISE DATA LAYER]

SLIDE: 24

Integrated Data, Integrated Intelligence, Surveillance, and Reconnaissance (ISR)

U.S. Combatant Command – DCGS-SOF Data Layer

WHY MARKLOGIC?

- Oracle data model too rigid
- Complex data integration
- Trusted platform
- Lower cost in the cloud
- Support for disconnected, intermittent, latent (DIL) operations

[Diagram labels: DISCONNECTED USERS; FORWARD OPERATING BASE; THEATER COMMAND; HQ OR EXTERNAL USERS; DATA ENRICHMENT; METADATA CATALOG; ENTERPRISE DATA LAYER]

The Results

- Scalable – 100+ million documents on 70 clustered servers
- Fast – 59 times faster than relational solution
- Efficient – 57% reduction in disk space used
- Flexible – Replicated data for global sharing
- Success – Converting data to information

SLIDE: 26

The SNL App

SLIDE: 27

Semantics-driven search

[Diagram: graph of linked entities]

Nodes:

- Talent: Kristen Wiig
- Episode 4: Anne Hathaway and Killers
- Character: Maharelle Sister
- Season 34
- Segment: The Lawrence Welk Show
- Date: 10/4/08
- Era

Relationships: Acted in, Part of, Played, Aired on, Includes
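The graph on this slide can be approximated as a small set of triples to show how a semantics-driven search might traverse it. Which node each relationship connects is an assumption here (the extracted text keeps the labels but not the arrows), the Era node is left out for that reason, and the helper names are hypothetical.

```python
# Assumed edges; the slide text preserves labels but not their endpoints.
EDGES = [
    ("Kristen Wiig", "actedIn", "The Lawrence Welk Show"),
    ("Kristen Wiig", "actedIn", "Episode 4"),
    ("Kristen Wiig", "played", "Maharelle Sister"),
    ("The Lawrence Welk Show", "partOf", "Episode 4"),
    ("Episode 4", "partOf", "Season 34"),
    ("Episode 4", "airedOn", "10/4/08"),
]


def objects(subject, predicate):
    """All objects linked from a subject by the given relationship."""
    return {o for s, p, o in EDGES if s == subject and p == predicate}


def part_of_closure(node):
    """Every container reachable by following partOf edges upward."""
    found, frontier = set(), {node}
    while frontier:
        frontier = {o for n in frontier for o in objects(n, "partOf")} - found
        found |= frontier
    return found


def acted_in_within(talent, container):
    """Items the talent acted in that belong, directly or not, to a container."""
    return {
        item
        for item in objects(talent, "actedIn")
        if container in part_of_closure(item)
    }
```

A question such as "what did Kristen Wiig appear in during Season 34?" becomes `acted_in_within("Kristen Wiig", "Season 34")`. The segment qualifies because it is part of an episode that is part of the season, even though no single record states that directly.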

SLIDE: 28

Continuous Watch Experience

SLIDE: 29

Great Content + Great Metadata =

SLIDE: 30

The Complete Picture of Metadata

1 MANAGEMENT | 2 GOVERNANCE | 3 OPERATIONS | 4 ANALYTICS | 5 DISCOVERY | 6 ARCHIVE

SEMANTICS | TAXONOMIES | TECHNICAL | BUSINESS | PROVENANCE

What Next?

SLIDE: 32

Get Value From Your Metadata!

How can I make metadata mission critical in my organization?

- Emerging Trends in Metadata Management: http://whitepapers.dataversity.net/content60809

- Getty #MetadataMonday: http://blogs.getty.edu/iris/metadata-specialists-share-their-challenges-defeats-and-triumphs/

- SNL 40: http://www.marklogic.com/blog/live-on-enterprise-nosql-its-the-snl-40-app/

How can I get started with a metadata project?

- Financial Services: http://www.marklogic.com/resources/regulatory-compliance-insight-one-solution

- Healthcare: http://www.marklogic.com/blog/accelerate-real-world-evidence

What does a project look like?

- Enterprise Metadata Management Whitepaper: http://www.marklogic.com/resources/agile-enterprise-metadata-management/resource_download/whitepapers/

SLIDE: 33

Getting Started

MarkLogic University: http://www.marklogic.com/training/

Real World Evidence Webinar: http://www.marklogic.com/events/real-world-data-transforming-disconnected-data-invaluable-insight/

Meet: www.marklogic.com/company/contact-us/


Thank You!

CTO Media & Entertainment | @matt_turner_nyc | Matt [email protected]

Matt Turner

#Dataversity