The Role of Metadata and Data Quality in Driving Enterprise Data Governance and Culture
Michelle Pinheiro and Ram Kumar
Enterprise Information Management and Privacy
Chief Analytics Office
February 2016
00.00.2015 • Page 2
Insurance Australia Group (IAG) No. 1 General Insurer in Australia and New Zealand
AN ORGANISATION THAT LEARNS TO TREAT DATA AS A CORPORATE ASSET
AND VALUE DATA WILL GAIN SIGNIFICANT ADVANTAGE
OVER ITS COMPETITORS
“Some day, on the corporate balance sheet, there will be an entry that
reads, ‘information’; for in most cases, the information is more valuable
than the software and hardware that processes it.”
- Admiral Grace Murray Hopper (Inventor of COBOL), 1965
Performing analytics or advanced analytics using Data/Big Data to drive business outcomes
does not necessarily mean an organisation is “data driven” or has a “data
driven culture”.
Data driven culture: the Data Lifecycle is managed effectively end to end so that it enables the organisation
to organise activities, make decisions, manage risks and resolve conflicts.
DATA LIFECYCLE
Collection, Organisation, Storage, Use, Retention, Destruction
Collection
Data collection controls,
including manual entry of data
as well as automated data
feeds from
internal business units and
external sources, are designed
to ensure that newly introduced
data meets data quality,
security, and where applicable
ethical and privacy
requirements.
Use
Ensure that data processing (the
application of business rules to data)
and the output generated continue
to meet data quality, security, and
where applicable social, ethical and
privacy requirements. This includes
distribution of data to other storage
systems and retrieval of data from
these storage systems. The
originator and stakeholders using
the data must abide by the
organisation’s practices around
social and ethical obligations,
privacy and security of data to
maintain confidentiality, integrity and
quality.
Storage
Ensure data is stored in a
secure manner and only where
the organisation has a valid
need and as per legal and
regulatory requirements. While
stored, the organisation
maintains quality of the data,
ensuring the information is
correct, up-to-date, meets
business availability
requirements and where
applicable privacy requirements.
Destruction
Data destruction controls and
timeframes are established to
ensure that data quality, legal and
regulatory requirements are not
compromised. Where personal
information is no longer required and
there is no valid business need for its
storage, it needs to be destroyed or
de-identified (e.g. by removing names
and contact details).
Re-identification of the de-identified
data must not be possible.
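As a minimal sketch of the de-identification step described above, the following Python function drops directly identifying fields from a record. The field names (`name`, `email`, `phone`, `address`), the sample record and the `deidentify` helper are hypothetical and are not taken from IAG's systems. Removing direct identifiers alone does not guarantee that re-identification is impossible; quasi-identifiers such as date of birth or postcode would need separate treatment.

```python
# Hypothetical set of directly identifying fields; a real list would come
# from the organisation's own data classification.
DIRECT_IDENTIFIERS = frozenset({"name", "email", "phone", "address"})


def deidentify(record: dict) -> dict:
    """Return a copy of the record without directly identifying fields."""
    return {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}


# Illustrative record, not real data.
policy_record = {
    "name": "Jane Citizen",
    "email": "jane@example.com",
    "phone": "0400 000 000",
    "policy_type": "motor",
    "claim_count": 2,
}
print(deidentify(policy_record))
```

The function returns a new dictionary and leaves the input untouched, so the original record can still be destroyed under its own control.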
Retention
Ensure that business, legal and
regulatory requirements with respect to
storing the data are not compromised
as hardware, software or data reach
the end of their useful life or the
hardware/software is recommissioned
for another use. Data retention controls
(includes archive) and timeframes are
established to ensure that data quality,
legal and regulatory requirements are
not compromised.
Organisation
Information is classified
according to its significance.
This enables control of access
to and use of information as
per roles and responsibilities.
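One way to picture classification-driven access is sketched below in Python. The classification levels, role names and the `can_access` helper are all assumptions made for illustration; the slides do not name IAG's actual classification scheme or roles.

```python
from enum import IntEnum


class Classification(IntEnum):
    """Hypothetical classification levels, ordered by sensitivity."""
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


# Hypothetical mapping of role to the highest classification it may read.
ROLE_CLEARANCE = {
    "contractor": Classification.PUBLIC,
    "analyst": Classification.INTERNAL,
    "data_steward": Classification.CONFIDENTIAL,
    "privacy_officer": Classification.RESTRICTED,
}


def can_access(role: str, classification: Classification) -> bool:
    """Allow access only when the role's clearance covers the classification."""
    clearance = ROLE_CLEARANCE.get(role)
    if clearance is None:
        return False  # unknown roles get no access
    return clearance >= classification


print(can_access("analyst", Classification.INTERNAL))
print(can_access("analyst", Classification.CONFIDENTIAL))
```

Ordering the levels lets a single comparison express "this role may see data up to this significance", which is the control the classification is meant to enable.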
DATA ANALYTICS LIFECYCLE
Collection, Preparation and Exploration, Organisation and Storage, Generation, Retention and Destruction, Publication and Monetization
Collection
Data collection controls,
including manual entry of data
as well as automated data
feeds from internal business
divisions and external sources,
are designed to ensure that
newly introduced data is
trustworthy, meets data quality,
security, ethical and privacy
requirements where applicable.
Preparation and Exploration
Prepare the data for
exploration and analysis
without changing/cleansing the
data. Explore and analyse the
data to generate insights and
predictive models without
changing the data. Ensure
compliance with data
classification requirements.
Organisation and Storage
Classify the data according to
its significance (e.g.
confidentiality and sensitivity)
and ensure data is stored in a
secure manner in a Hadoop file
system or structured database
environment and only where
the organisation has a valid
need. While stored, the
organisation, where applicable,
maintains quality of the data,
ensuring the information is
correct, up-to-date and meets
the business, legal and
regulatory requirements.
Generation
Generate new data products
(e.g. insights, analytical and
predictive models) that would
provide disruptive, differentiated
and measurable value to the
business.
Retention and Destruction
Data retention controls and timeframes
are established to ensure that legal
and regulatory requirements are not
compromised as a result of risks
associated with the storage of data or
data products. From a destruction
perspective, ensure that business,
legal and regulatory requirements with
respect to privacy and confidentiality
(e.g. de-identification) are not
compromised as hardware, software or
data reach the end of their useful life or
the hardware/software is
recommissioned for another use.
Publication and Monetization
Publish the generated data
product for consumption by
business and consumers to
create measurable value.
Publication or Monetization of
the generated data product must
consider social, ethical and
privacy obligations, and legal
and regulatory compliance
requirements.
“Big Data” Governance does not mean “Big” Data Governance
Data Management Practices remain the same
• Data Governance
• Data Architecture
• Information Quality Management
• Reference and Master Data Management
• Data Analytics and Reporting
• Structured Data Lifecycle Management
• Unstructured Data Lifecycle Management
• Metadata Management
• Data Security Management

• Reference and Master Data
• Transaction Data
• Analytics and Reporting Data
• Unstructured and Structured Data
(including Data Classification, Privacy and Ethics)
Data Processing Infrastructure
Bigger isn’t better, better is better. Although big
data may indeed be followed by more data that
doesn’t necessarily mean we require more data
management in order to prevent more data from
becoming “Morbidly Obese Data”. I think that we
just need to exercise better data management
practices.
– Jim Harris
Whether you choose to measure it in terabytes,
petabytes, exabytes, HoardaBytes, or how much
reality bites, the truth is we were consuming way
more than our recommended daily allowance of
data long before the data management industry
took a tip from McDonald’s and put the word
“big” in front of its signature sandwich. More
Data becomes “Morbidly Obese Data” only if we
don’t exercise better data management
practices.
– Jim Harris
Irrespective of the size of the data: small, medium, big, bigger, lake, swamp, massive, infinite, ubiquitous ...
Gartner’s Information Management Capability Maturity Model
Level 0: Unaware; Level 1: Aware; Level 2: Reactive; Level 3: Proactive; Level 4: Managed; Level 5: Effective
• Some awareness of the importance of Information Strategy and Data Governance.
• Organisation understands the need for common standards, tools and models.
• Minimal investment in people, process and technology; lack of business sponsorship and involvement.
• Effort to document risk associated with uncontrolled information assets has begun.
• Awareness is growing of poor data quality and of fragmented and inconsistent information in key subject areas.
• IT departments seek efficiencies through vertical silo consolidation (Data Warehouse) using data modellers and database administrators.
• BI continues generating inconsistent or redundant reports.
• Business and IT do not know that information is a problem, while users mistrust the data.
• Organisation runs significant risk from under-managed information, such as compliance failures, poor customer service and low productivity.
• Emerging Information Management (IM) Strategy & Roadmap, Governance and Processes.
• Organisation takes steps toward cross-department sharing to achieve operational efficiency.
• Investments are being made to uplift IM capabilities in some areas.
• Risks of not managing information as corporate asset are well understood.
• Metrics are emerging to focus on retention for information, files, email, etc. to address known security and compliance risks.
• IM Policies and Standards are published, but adherence and compliance are still low.
• Essential elements of well-architected Data Warehouse (DW)/Business Intelligence (BI) framework / technology are emerging.
• Organisation sets standards for IM technology: distinct architecture for Analytics, Enterprise Data Management, Federated DW/BI unified at the business level - information as a service (Information service layer and Service-Oriented Architecture (SOA)).
• Full compliance with IM Policies and Standards
• Operational risk is minimised: IM is embedded in SDLC as projects are implemented and Governance Committees with Information Owners and Information Stewards help manage information as an asset.
• Strategy & Roadmap, Governance and Processes are in place enabling proactive IM.
• Full business sponsorship and involvement, promoting enterprise-wide initiatives.
• Senior Management in Business and IT recognises Information as a strategic asset, and embraces and funds an Enterprise Information Management (EIM) Program that addresses key stakeholders and requirements and is aligned with priorities based on the business strategy.
• Consistent and mature governance, policies, procedures and EIM operating model ensure coordination of all IM activities.
• Best practices are identified, and EIM team ensures that these are extended across the enterprise.
• Corporate performance is managed by DW/BI dashboards and scorecards.
• Senior management sees information as a competitive advantage and exploits it to create value and increase efficiency.
• Enterprise Data is managed as a strategic asset across the entire value chain – effectively, ethically and efficiently.
• EIM Group coordinates all information efforts across the enterprise.
• Organisation strives to make IM more transparent to both Business and Technology organisations, with business-level data owners and stewards playing active roles.
• The organisation has achieved its EIM goals.
Where is your organisation positioned?
How do we create a true data driven culture?
• Believe that data is a “strategic asset” and that it has “value”
• Accountability from the TOP
• Think data centricity for any initiatives
  • Transformation, New business, New products, etc.
• Define and implement processes
  • Policies, procedures, governance, technology, tools, risk, etc.
• Define and implement metrics to measure outcomes
• Educate, Communicate, Train
• Reward the good and punish the bad
• Continuous improvement through learnings
While 80% of CEOs claim to have operationalised the notion of data as an asset, only 10% say that their company actually treats it that way – Gartner, 2015
DQ for Advanced Analytics – It’s a different Approach
Throughout the lifecycle of Analytical Model development both data quality and
metadata are essential enablers to facilitate accuracy in our insights.
• In Business to Customer
• In Advanced Analytical Modelling
• In Data Acquisition and Analysis
Data Acquisition and Analysis
Analytics requires metadata to understand the meaning and context of the data, and to know its lifecycle. For data quality, a new approach is needed.
• Data dictionary, business rules and mapping of data elements to business vocabulary to provide context
• Data classification of all elements acquired into the data lake to understand their significance
Our vision
• To enable dynamic access to graded datasets for Advanced Analytics teams, based on an aggregation of data quality metrics using new concepts
• Mapping of relationships between data and Information Assets, and enhancing with business IP
• Automated qualitative analysis, beyond data profiling and aligned to data quality domains
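The idea of grading datasets from an aggregation of data quality metrics can be sketched as follows. This is an illustrative Python example only: the dimensions (`completeness`, `validity`, `timeliness`), their weights, the grade thresholds and the sample scores are assumptions, not the "new concepts" the slide refers to.

```python
# Hypothetical weights per data quality dimension; they sum to 1.0.
WEIGHTS = {"completeness": 0.4, "validity": 0.3, "timeliness": 0.3}


def quality_score(metrics: dict) -> float:
    """Weighted average of per-dimension scores, each in the range 0..1."""
    return sum(WEIGHTS[dim] * metrics[dim] for dim in WEIGHTS)


def grade(score: float) -> str:
    """Map an aggregate score to a grade using illustrative thresholds."""
    if score >= 0.9:
        return "A"
    if score >= 0.75:
        return "B"
    if score >= 0.5:
        return "C"
    return "D"


# Illustrative scores for a hypothetical dataset.
claims_metrics = {"completeness": 1.0, "validity": 0.5, "timeliness": 1.0}
print(grade(quality_score(claims_metrics)))
```

An analytics team could then request, for example, only datasets graded "B" or better, with the grade recalculated whenever the underlying metrics change.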
“The practices and tools of big data and data science do not stand alone in the data ecosystem.”1
1 Edd Dumbill – www.forbes.com, 2013
Business 2 Customer
Effective delivery of the right analytical “personalised insight” to the right customer at the right time and with the right context is completely dependent upon your ability to uniquely identify your customer.
“As soon as integration is introduced, just as in the past, data quality plays a critical role.”1
1 Michelle Goetz – Social Media Today
What do I know about Christine Johnson?
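To illustrate why unique identification depends on data quality, the Python sketch below groups records from different source systems under a normalised match key. The matching rule (lower-cased, whitespace-normalised name plus date of birth), the field names and the sample records are all hypothetical; real customer matching would typically use more attributes and fuzzier comparison.

```python
from collections import defaultdict


def match_key(record: dict) -> tuple:
    """Build a normalised key from name and date of birth (an assumed rule)."""
    name = " ".join(record["name"].lower().split())
    return (name, record["date_of_birth"])


def group_customers(records: list) -> dict:
    """Group the source systems of records that share the same match key."""
    groups = defaultdict(list)
    for record in records:
        groups[match_key(record)].append(record["source"])
    return dict(groups)


# Illustrative records, not real data.
records = [
    {"name": "Christine Johnson", "date_of_birth": "1980-01-01", "source": "motor"},
    {"name": "christine  JOHNSON", "date_of_birth": "1980-01-01", "source": "home"},
    {"name": "Christine Johnson", "date_of_birth": "1975-06-30", "source": "claims"},
]
print(group_customers(records))
```

Here the first two records resolve to one customer despite differences in case and spacing, while the third stays separate because the date of birth differs.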
Metadata 101
Metadata
• is data about data; it is not the data itself
• is supplemental information that provides context relating to a physical or digital object, a person, an event, or a process. It can be applied to anything.

Object Type: Information Asset
Example Object: Customer contact database
Example Metadata:
• Name and purpose
• Business usage
• Categories of data and their respective owners
• Database schema

Object Type: Information Item
Example Object: Customer, Customer Call
Example Metadata (metadata of the tables and columns):
• Name and definition
• Security classification
• Relationships
• Business Rules & formulas
• Data Quality & standards
• Ownership

Object Type: Instance of an item
Example Object: Call #1, Call #2, Call #3
Example Metadata (metadata of the call):
• Date and time
• Duration
• Incoming number
• Consultant
• CAS Score

High level, low volume
• This level of metadata is in scope for IAG Metadata Management.
• Low volumes of data, subsistent with intellectual property that is managed through a metadata application.

Granular, high volume
• Large volumes and transactional in nature, therefore it is managed as data in a structured database repository.
• This level of metadata is in scope for the Australian Government’s Data Retention scheme.
• This level of metadata is treated as data by IAG and is managed through our operational core business applications and data warehouses.
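The three object types on this slide (information asset, information item, instance of an item) can be modelled as a small data structure. The Python sketch below is illustrative only: the class and field names are derived loosely from the example metadata listed on the slide, and the sample values are invented.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class InformationItem:
    """A table or column within an asset, e.g. the Customer Call table."""
    name: str
    definition: str
    security_classification: str
    owner: str


@dataclass
class InformationAsset:
    """A high-level, low-volume asset, e.g. the customer contact database."""
    name: str
    purpose: str
    items: List[InformationItem] = field(default_factory=list)


@dataclass
class CallInstance:
    """One call; its metadata is high volume and handled as ordinary data."""
    date_time: str
    duration_seconds: int
    incoming_number: str
    consultant: str


# Illustrative values only.
asset = InformationAsset(
    name="Customer contact database",
    purpose="Record customer interactions",
)
asset.items.append(
    InformationItem(
        name="Customer Call",
        definition="A call between a customer and a consultant",
        security_classification="Confidential",
        owner="Contact centre",
    )
)
call_1 = CallInstance("2016-02-01T09:30:00", 240, "0400 000 000", "A. Consultant")
print(asset.items[0].name, call_1.duration_seconds)
```

Asset and item descriptions are few and change rarely, so they suit a metadata application; call instances accumulate with every transaction, which is why the slide treats them as data.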
Thank you. Questions?