The Role of Metadata and Data Quality in Driving Enterprise Data Governance and Culture Michelle Pinheiro and Ram Kumar Enterprise Information Management and Privacy Chief Analytics Office February 2016

IAG presentation at the Chief Data Officer Forum, Sydney


Page 1: IAG presentation at the Chief Data Officer Forum, Sydney

The Role of Metadata and Data Quality in Driving Enterprise Data Governance and Culture Michelle Pinheiro and Ram Kumar

Enterprise Information Management and Privacy

Chief Analytics Office

February 2016

Page 2: IAG presentation at the Chief Data Officer Forum, Sydney


Insurance Australia Group (IAG): No. 1 General Insurer in Australia and New Zealand

Page 3: IAG presentation at the Chief Data Officer Forum, Sydney


AN ORGANISATION THAT LEARNS TO TREAT DATA AS A CORPORATE ASSET AND VALUE DATA WILL GAIN SIGNIFICANT ADVANTAGE OVER ITS COMPETITORS

“Some day, on the corporate balance sheet, there will be an entry that reads, ‘information’; for in most cases, the information is more valuable than the software and hardware that processes it.”

- Admiral Grace Murray Hopper (Inventor of COBOL), 1965

Page 4: IAG presentation at the Chief Data Officer Forum, Sydney


Performing analytics/advanced analytics using Data/Big Data to drive business outcomes does not necessarily mean an organisation is “data driven” or has a “data driven culture”.

Data driven culture: How the Data Lifecycle is managed effectively end to end so that it enables the organisation to organise activities, make decisions, manage risks and resolve conflicts.

Page 5: IAG presentation at the Chief Data Officer Forum, Sydney


DATA LIFECYCLE

Collection, Organisation, Storage, Use, Retention, Destruction

Collection

Data collection controls, including manual entry of data as well as automated data feeds from internal business units and external sources, are designed to ensure that newly introduced data meets data quality, security and, where applicable, ethical and privacy requirements.

Use

Ensure that data processing (the application of business rules to data) and the output generated continue to meet data quality, security and, where applicable, social, ethical and privacy requirements. This includes distribution of data to other storage systems and retrieval of data from these storage systems. The originator and stakeholders using the data must abide by the organisation’s practices around social and ethical obligations, privacy and security of data to maintain confidentiality, integrity and quality.

Storage

Ensure data is stored in a secure manner and only where the organisation has a valid need and as per legal and regulatory requirements. While the data is stored, the organisation maintains its quality, ensuring the information is correct and up to date, and meets business availability requirements and, where applicable, privacy requirements.

Destruction

Data destruction controls and timeframes are established to ensure that data quality, legal and regulatory requirements are not compromised. Where personal information is no longer required, it needs to be destroyed or de-identified (e.g. removing names and contact details) when there is no valid business need for its storage. Re-identification of the de-identified data must not be possible.
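
As a rough illustration of the de-identification step described above, the following Python sketch drops direct identifier fields from a record. The field names and the sample record are hypothetical and are not taken from the presentation. Removing names and contact details is only a first step: the remaining attributes still need to be assessed to confirm that re-identification is not possible.

```python
# Minimal sketch: strip direct identifiers from a record.
# The field names below are hypothetical examples, not IAG's schema.
DIRECT_IDENTIFIERS = {"name", "email", "phone", "address"}


def de_identify(record: dict) -> dict:
    """Return a copy of the record without direct identifier fields."""
    return {key: value for key, value in record.items()
            if key not in DIRECT_IDENTIFIERS}


# Invented sample record, for illustration only.
customer = {
    "name": "Jane Citizen",
    "email": "jane@example.com",
    "phone": "0400 000 000",
    "postcode": "2000",
    "policy_type": "home",
}

de_identified = de_identify(customer)
print(de_identified)
```

The original record is left unchanged, so destruction of the source data remains a separate, explicit step.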

Retention

Ensure that business, legal and regulatory requirements with respect to storing the data are not compromised as hardware, software or data reach the end of their useful life or the hardware/software is recommissioned for another use. Data retention controls (including archiving) and timeframes are established to ensure that data quality, legal and regulatory requirements are not compromised.

Organisation

Information is classified according to its significance. This enables control of access to and use of information as per roles and responsibilities.
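
One way to picture how classification enables role-based control of access is the Python sketch below. The classification levels, role names and clearance mapping are all assumptions made for illustration. The presentation does not specify IAG's actual scheme.

```python
from enum import IntEnum


class Classification(IntEnum):
    """Hypothetical classification levels, ordered by sensitivity."""
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


# Hypothetical mapping of roles to the highest level each may access.
ROLE_CLEARANCE = {
    "analyst": Classification.INTERNAL,
    "data_steward": Classification.CONFIDENTIAL,
    "privacy_officer": Classification.RESTRICTED,
}


def can_access(role: str, classification: Classification) -> bool:
    """Allow access only when the role's clearance covers the classification."""
    clearance = ROLE_CLEARANCE.get(role)
    return clearance is not None and clearance >= classification


print(can_access("analyst", Classification.CONFIDENTIAL))       # False
print(can_access("data_steward", Classification.CONFIDENTIAL))  # True
```

Unknown roles are denied by default, which keeps the sketch on the conservative side.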

Page 6: IAG presentation at the Chief Data Officer Forum, Sydney


DATA ANALYTICS LIFECYCLE

Collection, Preparation and Exploration, Organisation and Storage, Generation, Retention and Destruction

Collection

Data collection controls, including manual entry of data as well as automated data feeds from internal business divisions and external sources, are designed to ensure that newly introduced data is trustworthy and meets data quality, security, ethical and privacy requirements where applicable.

Preparation and Exploration

Prepare the data for exploration and analysis without changing/cleansing the data. Explore and analyse the data to generate insights and predictive models without changing the data. Ensure compliance with data classification requirements.

Organisation and Storage

Classify the data according to its significance (e.g. confidentiality and sensitivity) and ensure data is stored in a secure manner in a Hadoop file system or structured database environment and only where the organisation has a valid need. While the data is stored, the organisation, where applicable, maintains its quality, ensuring the information is correct and up to date, and meets the business, legal and regulatory requirements.

Generation

Generate new data products (e.g. insights, analytical and predictive models) that would provide disruptive, differentiated and measurable value to the business.

Retention and Destruction

Data retention controls and timeframes are established to ensure that legal and regulatory requirements are not compromised as a result of risks associated with the storage of data or data products. From a destruction perspective, ensure that business, legal and regulatory requirements with respect to privacy and confidentiality (e.g. de-identification) are not compromised as hardware, software or data reach the end of their useful life or the hardware/software is recommissioned for another use.

Publication and Monetization

Publish the generated data product for consumption by business and consumers to create measurable value. Publication or monetization of the generated data product must consider social, ethical and privacy obligations, and legal and regulatory compliance requirements.

Page 7: IAG presentation at the Chief Data Officer Forum, Sydney


“Big Data” Governance does not mean “Big” Data Governance

Data Management Practices remain the same

• Data Governance

• Data Architecture

• Information Quality Management

• Reference and Master Data Management

• Data Analytics and Reporting

• Structured Data Lifecycle Management

• Unstructured Data Lifecycle Management

• Metadata Management

• Data Security Management

• Reference and Master Data

• Transaction Data

• Analytics and Reporting Data

• Unstructured and Structured Data

(including Data Classification, Privacy and Ethics)

• Data Processing Infrastructure

Bigger isn’t better, better is better. Although big data may indeed be followed by more data, that doesn’t necessarily mean we require more data management in order to prevent more data from becoming “Morbidly Obese Data”. I think that we just need to exercise better data management practices.

– Jim Harris

Whether you choose to measure it in terabytes, petabytes, exabytes, HoardaBytes, or how much reality bites, the truth is we were consuming way more than our recommended daily allowance of data long before the data management industry took a tip from McDonald’s and put the word “big” in front of its signature sandwich. More Data becomes “Morbidly Obese Data” only if we don’t exercise better data management practices.

– Jim Harris

Irrespective of the size of the data: small, medium, big, bigger, lake, swamp, massive, infinite, ubiquitous, ...

Page 8: IAG presentation at the Chief Data Officer Forum, Sydney


Gartner’s Information Management Capability Maturity Model

Level 0: Unaware

Level 1: Aware

Level 2: Reactive

Level 3: Proactive

Level 4: Managed

Level 5: Effective

• Some awareness of the importance of Information Strategy and Data Governance.

• Organisation understands the need for common standards, tools and models.

• Minimal investment in people, process and technology; lack of business sponsorship and involvement.

• Effort to document risk associated with uncontrolled information assets has begun.

• Awareness is growing of poor data quality and of fragmented and inconsistent information in key subject areas.

• IT departments seek efficiencies through vertical silo consolidation (Data Warehouse) using data modellers and database administrators.

• BI continues to generate inconsistent or redundant reports.

• Business and IT do not know that information is a problem, while users mistrust the data.

• Organisation runs significant risk from under-managed information, such as compliance failures, poor customer service and poor productivity.

• Emerging Information Management (IM) Strategy & Roadmap, Governance and Processes.

• Organisation takes steps toward cross-department sharing to achieve operational efficiency.

• Investments are being made to uplift IM capabilities in some areas.

• Risks of not managing information as a corporate asset are well understood.

• Metrics are emerging to focus on retention for information, files, email, etc. to address known security and compliance risks.

• IM Policies and Standards are published, but adherence and compliance are still low.

• Essential elements of a well-architected Data Warehouse (DW)/Business Intelligence (BI) framework/technology are emerging.

• Organisation sets standards for IM technology: distinct architecture for Analytics, Enterprise Data Management, Federated DW/BI unified at the business level, and information as a service (information service layer and Service-Oriented Architecture (SOA)).

• Full compliance with IM Policies and Standards.

• Operational risk is minimised: IM is embedded in SDLC as projects are implemented and Governance Committees with Information Owners and Information Stewards help manage information as an asset.

• Strategy & Roadmap, Governance and Processes are in place enabling proactive IM.

• Full business sponsorship and involvement, promoting enterprise-wide initiatives.

• Senior Management in Business and IT recognises Information as a strategic asset, and embraces and funds an Enterprise Information Management (EIM) Program that addresses key stakeholders and requirements and is aligned with priorities based on the business strategy.

• Consistent and mature governance, policies, procedures and EIM operating model ensure coordination of all IM activities.

• Best practices are identified, and EIM team ensures that these are extended across the enterprise.

• Corporate performance is managed by DW/BI dashboards and scorecards.

• Senior management sees information as a competitive advantage and exploits it to create value and increase efficiency.

• Enterprise Data is managed as a strategic asset across the entire value chain – effectively, ethically and efficiently.

• EIM Group coordinates all information efforts across the enterprise.

• Organisation strives to make IM more transparent to both Business and Technology organisations, with business-level data owners and stewards playing active roles.

• The organisation has achieved its EIM goals.

Where is your organisation positioned?

Page 9: IAG presentation at the Chief Data Officer Forum, Sydney


How do we create a true data driven culture?

• Believe that data is a “strategic asset” and that it has “value”

• Accountability from the TOP

• Think data centricity for any initiatives

    • Transformation, new business, new products, etc.

• Define and implement processes

    • Policies, procedures, governance, technology, tools, risk, etc.

• Define and implement metrics to measure outcomes

• Educate, Communicate, Train

• Reward the good and punish the bad

• Continuous improvement through learnings

While 80% of CEOs claim to have operationalised the notion of data as an asset, only 10% say that their company actually treats it that way – Gartner, 2015

Page 10: IAG presentation at the Chief Data Officer Forum, Sydney


DQ for Advanced Analytics – It’s a different Approach

Throughout the lifecycle of analytical model development, both data quality and metadata are essential enablers of accuracy in our insights.

• In Business to Customer

• In Advanced Analytical Modelling

• In Data Acquisition and Analysis

Page 11: IAG presentation at the Chief Data Officer Forum, Sydney


Data Acquisition and Analysis

Analytics requires metadata to understand the meaning and context of the data, and to know its lifecycle. For data quality, a new approach is needed.

• Data dictionary, business rules and mapping of data elements to business vocabulary to provide context

• Data classification of all elements acquired into the data lake to understand their significance

Our vision

• To enable dynamic access to graded datasets for Advanced Analytics teams, based on an aggregation of data quality metrics using new concepts.

• Mapping of relationships between data and Information Assets, and enhancing with business IP

• Automated qualitative analysis, beyond data profiling and aligned to data quality domains
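
The idea of grading a dataset from an aggregation of data quality metrics could be sketched as below. This is only an illustration: the quality dimensions, weights, thresholds and grade labels are assumptions and do not describe IAG's actual method or the "new concepts" the slide refers to.

```python
# Hypothetical weights per data quality dimension (they sum to 1.0).
WEIGHTS = {"completeness": 0.4, "validity": 0.4, "uniqueness": 0.2}


def aggregate_score(metrics: dict) -> float:
    """Weighted average of per-dimension scores, each in the range 0..1."""
    total_weight = sum(WEIGHTS.values())
    weighted = sum(WEIGHTS[dim] * metrics[dim] for dim in WEIGHTS)
    return weighted / total_weight


def grade(score: float) -> str:
    """Map an aggregate score to a grade; thresholds are illustrative."""
    if score >= 0.9:
        return "A"
    if score >= 0.75:
        return "B"
    return "C"


# Invented metric values for a single dataset.
dataset_metrics = {"completeness": 0.98, "validity": 0.91, "uniqueness": 0.85}
score = aggregate_score(dataset_metrics)
print(grade(score))
```

An access layer could then expose only datasets at or above the grade a given analytics use case requires.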

“The practices and tools of big data and data science do not stand alone in the data ecosystem.”1

1 Edd Dumbill – www.forbes.com, 2013

Page 12: IAG presentation at the Chief Data Officer Forum, Sydney


Business to Customer

Effective delivery of the right analytical “personalised insight” to the right customer at the right time and with the right context is completely dependent upon your ability to uniquely identify your customer.

“As soon as integration is introduced, just as in the past, data quality plays a critical role.”1

1 Michele Goetz – Social Media Today

What do I know about Christine Johnson?
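
To show why unique customer identification depends on data quality, here is a minimal Python sketch that groups records from several systems by a normalised match key. The source system names, the date-of-birth field and the sample records are invented for illustration. Real matching typically needs far richer rules than this.

```python
from collections import defaultdict


def match_key(name: str, date_of_birth: str) -> tuple:
    """Build a simple key: lower-cased name with tidy spacing, plus date of birth."""
    normalised = " ".join(name.lower().replace(".", "").split())
    return (normalised, date_of_birth)


def group_by_customer(records: list) -> dict:
    """Group the source systems of records that share the same match key."""
    groups = defaultdict(list)
    for record in records:
        key = match_key(record["name"], record["dob"])
        groups[key].append(record["source"])
    return dict(groups)


# Invented records held in three hypothetical systems.
records = [
    {"source": "policy", "name": "Christine Johnson", "dob": "1980-05-17"},
    {"source": "claims", "name": "  christine  JOHNSON ", "dob": "1980-05-17"},
    {"source": "web", "name": "Chris Johnson", "dob": "1980-05-17"},
]

groups = group_by_customer(records)
print(groups)
```

The first two records collapse into one customer, while the "Chris Johnson" record stays separate. Variations like this are why a single view of the customer cannot be built without data quality work on the identifying fields.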

Page 13: IAG presentation at the Chief Data Officer Forum, Sydney


Metadata 101

Metadata

• is data about data; it is not the data itself

• is supplemental information that provides context relating to a physical or digital object, a person, an event, or a process. It can be applied to anything.

Object Type: Information Asset

Example Object: Customer contact database

Example Metadata:

• Name and purpose

• Business usage

• Categories of data and their respective owners

• Database schema

Object Type: Information Item

Example Object: Customer, Customer Call

Example Metadata (metadata of the tables and columns):

• Name and definition

• Security classification

• Relationships

• Business Rules & formulas

• Data Quality & standards

• Ownership

Object Type: Instance of an item

Example Object: Call #1, Call #2, Call #3

Example Metadata (metadata of the call):

• Date and time

• Duration

• Incoming number

• Consultant

• CAS Score

High level, low volume

Low volumes of data, subsistent with intellectual property that is managed through a metadata application.

This level of metadata is in scope for IAG Metadata Management.

Granular, high volume

Large volumes and transactional in nature, therefore it is managed as data in a structured database repository.

This level of metadata is in scope for the Australian Government’s Data Retention scheme.

This level of metadata is treated as data by IAG and is managed through our operational core business applications and data warehouses.
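
The three object types on this slide (information asset, information item, instance of an item) can be pictured as a simple nested data structure. The Python sketch below is only an illustration. The class names, metadata keys and sample values are assumptions, not a description of IAG's metadata application.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ItemInstance:
    """One occurrence of an information item, e.g. a single call."""
    label: str
    metadata: Dict[str, str] = field(default_factory=dict)


@dataclass
class InformationItem:
    """A table-level object, e.g. Customer or Customer Call."""
    name: str
    metadata: Dict[str, str] = field(default_factory=dict)
    instances: List[ItemInstance] = field(default_factory=list)


@dataclass
class InformationAsset:
    """A high-level asset, e.g. the customer contact database."""
    name: str
    metadata: Dict[str, str] = field(default_factory=dict)
    items: List[InformationItem] = field(default_factory=list)


def instance_count(asset: InformationAsset) -> int:
    """Count item instances, the granular, high-volume level."""
    return sum(len(item.instances) for item in asset.items)


# Invented sample values, for illustration only.
call_1 = ItemInstance("Call #1", {"date_time": "2016-02-01T09:30", "duration": "00:04:12"})
customer_call = InformationItem(
    "Customer Call",
    {"security_classification": "Confidential", "owner": "Contact Centre"},
    [call_1],
)
asset = InformationAsset(
    "Customer contact database",
    {"purpose": "Record customer contacts"},
    [InformationItem("Customer"), customer_call],
)

print(instance_count(asset))
```

Asset and item metadata stay small and descriptive, while instance metadata grows with every call, which is consistent with the slide's point that the instance level is managed as data.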

Page 14: IAG presentation at the Chief Data Officer Forum, Sydney

Thank you. Questions?