Data Modeling & Master Data Management (MDM)
Donna Burbank, Global Data Strategy Ltd.
Lessons in Data Modeling DATAVERSITY Series
September 28th, 2017
Global Data Strategy, Ltd. 2017
Donna Burbank
Donna is a recognised industry expert in information management with over 20 years of experience in data strategy, information management, data modeling, metadata management, and enterprise architecture. Her background is multi-faceted across consulting, product development, product management, brand strategy, marketing, and business leadership.
She is currently the Managing Director at Global Data Strategy, Ltd., an international information management consulting company that specializes in the alignment of business drivers with data-centric technology. In past roles, she has served in key brand strategy and product management roles at CA Technologies and Embarcadero Technologies for several of the leading data management products in the market.
As an active contributor to the data management community, she is a long-time DAMA International member, Past President and Advisor to the DAMA Rocky Mountain chapter, and was recently awarded the Excellence in Data Management Award from DAMA International in 2016. She was on the review committee for the Object Management Group’s Information Management Metamodel (IMM) and the Business Process Modeling Notation (BPMN). Donna is also an analyst at the Boulder BI Brain Trust (BBBT), where she provides advice and gains insight on the latest BI and Analytics software in the market.
She has worked with dozens of Fortune 500 companies worldwide in the Americas, Europe, Asia, and Africa and speaks regularly at industry conferences. She has co-authored two books, Data Modeling for the Business and Data Modeling Made Simple with ERwin Data Modeler, and is a regular contributor to industry publications. She can be reached at [email protected] and is based in Boulder, Colorado, USA.
Follow on Twitter @donnaburbank
Today’s hashtag: #LessonsDM
DATAVERSITY Lessons in Data Modeling Series
• January - on demand: How Data Modeling Fits Into an Overall Enterprise Architecture
• February - on demand: Data Modeling and Business Intelligence
• March - on demand: Conceptual Data Modeling – How to Get the Attention of Business Users
• April - on demand: The Evolving Role of the Data Architect – What does it mean for your Career?
• May - on demand: Data Modeling & Metadata Management
• June - on demand: Self-Service Data Analysis, Data Wrangling, Data Munging, and Data Modeling
• July - on demand: Data Modeling & Metadata for Graph Databases
• August - on demand: Data Modeling & Data Integration
• September 28: Data Modeling & Master Data Management (MDM)
• October 26: Agile & Data Modeling – How Can They Work Together?
• December 5: Data Modeling, Data Quality & Data Governance
This Year’s Line Up
What is Master Data?
• Master Data is the consistent and uniform set of identifiers and extended attributes that describes the core entities of the enterprise including customers, prospects, citizens, suppliers, sites, hierarchies and chart of accounts (sic).
• Master data management (MDM) is a technology-enabled discipline in which business and IT work together to ensure the uniformity, accuracy, stewardship, semantic consistency and accountability of the enterprise's official shared master data assets.
- Source: Gartner
Definition
A Data Model Describes the Entities of the Business
The “Who, What, Where, When, Why” of the Organization – the Nouns
Entity: A classification of the types of objects found in the real world (persons, places, things, concepts and events) of interest to the enterprise. 1
1 DAMA Dictionary of Data Management
[Figure: example entities (Product, Salesperson, Invoice, Order, Period, Location) arranged around the questions Who?, What?, Where?, When?, Why?, How?]
A Data Model Is a Visual Representation of Core Entities
A data model is a graphical view of the core entities important to the organization.
Humans tend to think in Pictures.
But… Not All Entities are Master Data Entities
From Data Modeling for the Business by Hoberman, Burbank, Bradley, Technics Publications, 2009
Early Master Data
Transaction Data vs. Master Data
Consider the following retail transaction data:
Customer | Date | Product | Code | Price | Quantity | Location
Stefan Kraus | 1/2/2017 | Scarpa Telemark Ski Boot | SC1279 | €250 | 1 | St. Moritz, CH
Donna Burbank | 1/5/2017 | Scarpa Telemark Ski Boot | SCU1289 | $150 | 1 | Boulder, CO
Stefan Kraus | 1/2/2017 | North Face Down Jacket | NF8392 | €450 | 1 | Zurich, CH
Stefan Kraus | 1/2/2017 | Garmin Sports Watch | GM29384 | €200 | 2 | Zurich, CH
Wendy Hu | 3/4/2017 | Prana Yoga Pant | PN82734 | $51 | 5 | New York, NY
Joe Smith | 4/1/2017 | Garmin Sports Watch | GM29384 | $150 | 1 | Albany, NY
Transaction Data
• Describes an action (verb), e.g. “buy”
• May include measurements about the action (Who, When, What, How Many, Where, How Much, etc.)
• E.g. Stefan Kraus, 1/2/2017, Scarpa Telemark Ski Boot, St. Moritz, CH, €250
Master Data
• Describes the key entities (nouns), e.g. Customer, Product, Location
• Provides attributes & context for these nouns
• E.g. Wendy Hu, age 25, female, resident of New York, NY, Customer since 2005, preferred customer card, etc.
Transaction Data vs. Master Data
The same table, annotated by type of data:
• Master Data: Customer
• Master Data: Product
• Master Data: Location
• Reference Data: Country Codes
• Reference Data: State Codes
• Transaction Data
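The split between master data and transaction data can be sketched in Python using the rows of the retail table. This is a minimal illustration, not a recommended design: the surrogate customer IDs are hypothetical, and treating the customer name and product code as unique keys is a simplifying assumption that real systems rarely satisfy.

```python
# Rows from the retail table:
# (customer, date, product, code, price, quantity, location)
rows = [
    ("Stefan Kraus", "1/2/2017", "Scarpa Telemark Ski Boot", "SC1279", "€250", 1, "St. Moritz, CH"),
    ("Donna Burbank", "1/5/2017", "Scarpa Telemark Ski Boot", "SCU1289", "$150", 1, "Boulder, CO"),
    ("Stefan Kraus", "1/2/2017", "North Face Down Jacket", "NF8392", "€450", 1, "Zurich, CH"),
    ("Stefan Kraus", "1/2/2017", "Garmin Sports Watch", "GM29384", "€200", 2, "Zurich, CH"),
    ("Wendy Hu", "3/4/2017", "Prana Yoga Pant", "PN82734", "$51", 5, "New York, NY"),
    ("Joe Smith", "4/1/2017", "Garmin Sports Watch", "GM29384", "$150", 1, "Albany, NY"),
]

# Master data: one entry per distinct noun.
# Assumption: the surrogate customer IDs (1, 2, ...) are hypothetical.
customers = {}  # customer name -> customer_id
products = {}   # product code -> product name
for name, _date, product, code, _price, _qty, _loc in rows:
    customers.setdefault(name, len(customers) + 1)
    products.setdefault(code, product)

# Transaction data: one entry per purchase, pointing at master data by key.
sales = [
    {
        "customer_id": customers[name],
        "product_code": code,
        "date": date,
        "price": price,
        "quantity": qty,
        "location": loc,
    }
    for name, date, _product, code, price, qty, loc in rows
]

print(len(customers), "customers,", len(products), "products,", len(sales), "sales")
```

Each sale now carries only keys and measurements, while descriptive attributes such as the customer name live once in the master data.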
Master Data – the Opportunity
A 360 Degree View through Data
Stefan Krauss
• Age = 31
• Occupation = Ski Instructor
• Purchased €500 in outdoor gear in 2016
• 100% of purchases online
• Top Finisher in Engadin Ski Marathon 2010-2015
• Member of Loyalty Program since 2010
• Prefers Text Message
• Address = Pontresina, Switzerland
Master Data – the Opportunity (& Need)
A 360 Degree View through Data
Stefan Krauss
• Age = 62
• Occupation = Banker
• Member of Loyalty Program since 1990
• Football Fan
• Prefers Physical Mail
• 100% of spending in store
• 75% of spending is while on holiday
• Purchased €3,500 in outdoor gear in 2016
• Address = Zurich, Switzerland
Master Data Management (MDM)
• There are many architectural approaches to MDM. Two are the following:
Centralized (diagram: MDM hub)
• Core data stored in a common schema in a centralized “hub”.
• Used as a common reference for operational systems, DW, etc.
Virtualized/Registry (diagram: Virtualization Layer)
• Data remains in source systems.
• Referenced through a common virtualization layer.
BOTH require a Data Model
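The difference between the two approaches can be sketched as follows. This is only an illustration under stated assumptions: the source systems (`crm`, `online_sales`), the keys, the master ID `M-1` and the "first listed system wins" rule in `read_virtual` are all hypothetical.

```python
# Hypothetical source systems, each holding its own customer record.
crm = {"C-17": {"name": "Wendy Hu", "city": "New York, NY"}}
online_sales = {"9001": {"name": "W. Hu", "email_opt_in": True}}
SOURCES = {"crm": crm, "online_sales": online_sales}

# Centralized: the hub stores the mastered attributes itself.
hub = {"M-1": {"name": "Wendy Hu", "city": "New York, NY", "email_opt_in": True}}

# Virtualized/registry: only cross-references are stored; data stays in the sources.
registry = {"M-1": [("crm", "C-17"), ("online_sales", "9001")]}


def read_virtual(master_id):
    """Assemble a view at read time; the first listed system wins per attribute."""
    view = {}
    for system, key in registry[master_id]:
        for attr, value in SOURCES[system][key].items():
            view.setdefault(attr, value)
    return view


print(hub["M-1"])
print(read_virtual("M-1"))
```

In both cases the attribute names (`name`, `city`, `email_opt_in`) have to be agreed on up front, which is why both approaches need a data model.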
MDM Data Models
• In an MDM Data Model, the core attributes for master data entities can be identified.
• This is typically the superset of attributes used by core systems & stakeholders in the organization.
[Figure: Core, Shared Attributes drawn from Source System A, Source System B and Source System C]
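As a rough sketch of how the candidate attributes for an MDM model might be derived, the attribute lists of the source systems can be combined with set operations. The attribute names below are hypothetical, and reading "core, shared attributes" as the intersection across systems is an assumption made for this example.

```python
# Hypothetical attribute lists for a Customer entity in three source systems.
source_attributes = {
    "Source System A": {"customer_id", "first_name", "last_name", "email"},
    "Source System B": {"customer_id", "last_name", "date_of_birth", "email"},
    "Source System C": {"customer_id", "last_name", "loyalty_tier"},
}

# Superset: every attribute that appears in at least one source system.
superset = set().union(*source_attributes.values())

# Shared attributes: those present in every source system (assumed reading).
shared = set.intersection(*source_attributes.values())

# Which systems could populate each attribute of the MDM model.
populated_by = {
    attr: sorted(name for name, attrs in source_attributes.items() if attr in attrs)
    for attr in sorted(superset)
}

for attr, systems in populated_by.items():
    print(f"{attr}: {', '.join(systems)}")
```

In practice stakeholders would select from this superset rather than adopt it wholesale.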
Master Data Overview
• Each system has its own unique functionality and associated data model.
• The MDM data model is a selected superset of the source system models.
• MDM can feed the dimensional model for the data warehouse (e.g. customer, location, etc.)
• Applications can reference the “Golden Record” for lookup.
[Figure: architecture diagram with the labels CRM, In-Store Sales, Marketing, Finance, Online Sales, Supply Chain; ETL; MDM “Golden Record”; Data Model; Data Quality & Matching; Reference Data Sets; Publish & Subscribe; Lookup; Data Warehouse; BI & Reporting; End User Applications]
Business Rules & Matching
• Once master attributes have been assigned, and their populating source systems identified, the next step in MDM is to clarify how records are identified as equivalent, in a process known as matching.
  • Matching rules provide the criteria used to match records from disparate systems as candidates for a golden record.
• Matching strategies are based on identifying attributes, and multiple match strategies can be defined, for example:
  • Match Strategy 1: Match on Date of Birth + Social Security Number
  • Match Strategy 2: Match on Social Security Number + Last Name
• Match strategies can be executed in sequential order. For example, if no match is found using Strategy 1, a match will be searched for using Strategy 2, and so on through the list of match strategies.
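Sequential execution of match strategies can be sketched as below. This is a minimal illustration assuming exact equality on each attribute; the field names, the sample records and the rule of skipping a strategy when the incoming record lacks one of its attributes are assumptions for the example.

```python
# Each strategy is a tuple of identifying attributes, tried in order.
MATCH_STRATEGIES = [
    ("date_of_birth", "ssn"),  # Match Strategy 1
    ("ssn", "last_name"),      # Match Strategy 2
]


def find_match(record, candidates):
    """Return (strategy_number, matched_candidate), or None if nothing matches."""
    for number, attrs in enumerate(MATCH_STRATEGIES, start=1):
        # Assumption: a strategy is skipped when the record lacks one of its attributes.
        if any(not record.get(a) for a in attrs):
            continue
        for candidate in candidates:
            if all(candidate.get(a) == record[a] for a in attrs):
                return number, candidate
    return None


# Hypothetical records; the SSN values are placeholders.
hub_records = [
    {"id": "M-1", "ssn": "000-00-0001", "last_name": "Smith", "date_of_birth": "1980-02-01"},
]
incoming = {"ssn": "000-00-0001", "last_name": "Smith", "date_of_birth": None}

result = find_match(incoming, hub_records)
print(result)
```

Here the incoming record has no date of birth, so Strategy 1 cannot apply and the match is found by Strategy 2.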
Data Model / Database Keys for Matching Rules
• Candidate attribute combinations for matching are often aligned with the primary and alternate keys from the logical data model.
Matching on Primary Key
• Ideally, if all systems use the same unique identifier, matching is easier. But this isn’t often realistic in “real world” systems.
Matching on Alternate Keys
• First, match on Date of Birth + SSN
• Then, match on SSN + Last Name
• Etc.
Fuzzy Matching
• Fuzzy matching logic can also be used, which is particularly helpful in matching string fields such as names and addresses, where human error or different entry standards between systems can cause slight variations in similar values, e.g.
  • “101 Main St” vs. “101 Main Street”
  • “John Smith” vs. “J Smith”
• In addition, synonyms can be created to assist with matching, for example:
  • “St”, “St.”, “Street”, etc. for addresses
  • “Tim”, “Timothy” for names and nicknames.
• When using fuzzy matching, data quality thresholds can be defined for auto approval.
  • Match scores are created for each fuzzy match, for example 0.9 would indicate a strong match and 0.2 a weak one.
  • Using these scores as a guide, thresholds can be defined for which matches can be auto-approved, which can be auto-rejected, and which need human review from a data steward.
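A minimal sketch of fuzzy matching with synonyms and thresholds, using `difflib.SequenceMatcher` from the Python standard library as a stand-in for whatever scoring an MDM tool applies. The threshold values reuse the 0.9 and 0.2 figures from the bullets purely for illustration, and the synonym table is an assumption.

```python
from difflib import SequenceMatcher

# Assumed synonym table: each variant maps to one canonical token.
SYNONYMS = {"st": "street", "st.": "street", "tim": "timothy"}

# Assumed thresholds, borrowed from the example scores above.
AUTO_APPROVE = 0.9
AUTO_REJECT = 0.2


def normalize(value):
    """Lowercase the value and replace known synonyms token by token."""
    return " ".join(SYNONYMS.get(token, token) for token in value.lower().split())


def match_score(a, b):
    """Similarity between 0.0 and 1.0 after normalization."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def decide(score):
    """Route a match score to auto-approval, auto-rejection or steward review."""
    if score >= AUTO_APPROVE:
        return "auto-approve"
    if score <= AUTO_REJECT:
        return "auto-reject"
    return "steward review"


for left, right in [("101 Main St", "101 Main Street"), ("John Smith", "J Smith")]:
    score = match_score(left, right)
    print(f"{left!r} vs {right!r}: {score:.2f} -> {decide(score)}")
```

With these settings the two address spellings normalize to the same string and are auto-approved, while “John Smith” vs. “J Smith” lands between the thresholds and is routed to a data steward.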
Matching Approval – Key Stewardship Role
• A key responsibility of Data Stewards for MDM is the manual review and approval of potential matches that cannot be auto-approved.
• In these cases, the match score falls below the defined threshold, so a data steward must review the proposed matches and the MDM golden record. Each steward reviews only the items for their own area.
Match Group ID | Name | Match Status | Match Score | Record Source
000007 | John R Smith | Proposed | .7843 | System A
000007 | Jack Smith | Proposed | .6532 | System B
000007 | John Smith | Proposed | .6894 | System C
000007 | John R Smith | Proposed | etc. | System D
Survivorship – Attribute Groups
• Once matches have been approved, a golden record can be assigned from a match group through a process of Survivorship. (Note: These rules are distinct from Matching Rules.)
• In order to create a mastered record from the various source systems, a series of attribute groups is defined, with specific survivorship rules for each of those groups.
• For example, attribute groups could be defined for the following scenarios:
  • Name fields (e.g. containing First Name, Last Name, Maiden Name, etc.). Rules could be defined that these attributes are populated from System A.
  • Demographic fields (e.g. containing Race, Ethnicity, Gender, etc.). Rules could be defined that these fields are populated from System B.
[Figure: System A and System B populating the MDM “Golden Record”]
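Survivorship by attribute group can be sketched as a lookup table that names the winning source system for each group. This is a simplified illustration: the field names and sample values are hypothetical, and real survivorship rules often add fallbacks such as most recent update or most complete value.

```python
# Attribute groups and the source system that populates each one.
SURVIVORSHIP_RULES = {
    "name": {
        "source": "System A",
        "attributes": ["first_name", "last_name", "maiden_name"],
    },
    "demographic": {
        "source": "System B",
        "attributes": ["race", "ethnicity", "gender"],
    },
}


def build_golden_record(match_group):
    """match_group maps a source system name to that system's record."""
    golden = {}
    for group in SURVIVORSHIP_RULES.values():
        source_record = match_group.get(group["source"], {})
        for attr in group["attributes"]:
            # Missing attributes survive as None rather than being guessed.
            golden[attr] = source_record.get(attr)
    return golden


# Hypothetical approved match group.
match_group = {
    "System A": {"first_name": "John", "last_name": "Smith", "gender": "M"},
    "System B": {"first_name": "Jack", "last_name": "Smith", "gender": "Male"},
}

golden_record = build_golden_record(match_group)
print(golden_record)
```

The name fields come from System A even though System B also holds a first name, and the gender value comes from System B.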
Harmonization
• Harmonization: The process of harmonization pushes the mastered record back to source systems.
  • While this helps keep the MDM and source systems in sync, and works to improve overall data quality …
  • … it should be handled carefully, in close coordination with the owners and stewards of the source systems.
Governance & Business Process for MDM
• Successful MDM depends critically on collaboration between the owners and stewards of various systems, and between business and IT stakeholders.
• In fact, the top two reasons for failure of MDM systems cited by the Gartner analyst group 1 are:
  • Failure of IT to Align With Business Process Improvements and Document Business Value
  • Delaying or Mismanaging Information Governance Implementation
• While the implementation of the hub and population strategies is complex, understanding the business processes and governance processes around the populating and publishing systems is more complex still.
1 Top Four Reasons Your MDM Program Will Fail, and How to Avoid Them, Gartner, 2016, ID: G00223675, by Bill O’Kane. Note: The remaining two reasons are: Failure to Manage Initial Master Data Quality & Defining Transactional (Fact) Data as Master Data
The Importance of Business Process
• Process models are a helpful tool for describing core business processes (e.g. BPMN).
  • “Swimlanes” outline organizational considerations
  • Data can be mapped to key business processes to understand creation & usage of information.
• Understanding business process is critical to Data Governance
  • Who is using data?
  • How is it used in business processes?
  • Are there redundancies, conflicts, etc.?
Identifying key data dependencies in core business processes
CRUD Matrix – Understanding Data Usage
Create, Read, Update, Delete
• CRUD Matrices show where data is Created, Read, Updated or Deleted across the various areas of the organization.
• This can be a helpful tool in data governance & data quality to support root cause analysis.
[Table: CRUD matrix. Columns (Users, Departments, and/or Systems): Product Development, Supply Chain, Accounting, Marketing, Finance. Rows (Data entities or attributes) and their entries: Product Assembly Instructions: C, R; Product Components: C, R; Product Price: C, U, R; Product Name: C, U,D; Etc.]
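A CRUD matrix can also be held as a small data structure and queried, for example to find data items that more than one department can change, which is a common starting point for root cause analysis. The department assignments in this sample matrix are illustrative assumptions, not taken from the slide.

```python
# Hypothetical CRUD matrix: data item -> {department: operations}.
crud = {
    "Product Price": {"Product Development": "C", "Marketing": "U", "Finance": "R"},
    "Product Name": {"Product Development": "C", "Marketing": "UD"},
    "Product Components": {"Product Development": "C", "Supply Chain": "R"},
}


def who(operation, item):
    """Departments that perform the given operation (C, R, U or D) on an item."""
    return sorted(dept for dept, ops in crud[item].items() if operation in ops)


def multiple_writers(matrix):
    """Items that more than one department can create, update or delete."""
    result = {}
    for item, cells in matrix.items():
        writers = sorted(dept for dept, ops in cells.items() if set(ops) & set("CUD"))
        if len(writers) > 1:
            result[item] = writers
    return result


print(who("R", "Product Price"))
print(multiple_writers(crud))
```

Items returned by `multiple_writers` are candidates for a stewardship conversation about who owns the value.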
Case Study: Linking Data with Process for MDM
• An international restaurant chain realized through its digital strategy that:
  • While menus are the core product that drives their business…
  • They had little control or visibility over their menu data
  • Menu data was scattered across multiple systems in the organization from supply chain to kitchen prep to marketing, restaurant operations, etc.
• Menu data was consolidated & managed in a central hub:
  • Master Data Management created a “single view of menu” for business efficiency & quality control
  • Data Governance created the workflow & policies around managing menu data
• Process Models & Data Mappings were critical
  • Business Process Models to identify the flow of information
  • CRUD Matrices to understand usage, stewardship & ownership
Managing the Data that Runs the Business
[Figure: Product Creation & Testing; Menu Display & Marketing; Supply Chain; Point of Sale & Restaurant Operations]
Summary
• Master Data focuses on the core entities of the business (e.g. Customer, Product, Supplier, etc.)
• Data models are a critical part of any MDM initiative – defining & managing these core entities
• Master Data Management can provide significant business opportunity, as long as governance, process, data quality, survivorship rules, etc. are managed correctly.
• Data Governance is critical to any MDM initiative
• Business Process Models and CRUD matrices are important tools in aligning MDM to business success
About Global Data Strategy, Ltd
• Global Data Strategy is an international information management consulting company that specializes in the alignment of business drivers with data-centric technology.
• Our passion is data, and helping organizations enrich their business opportunities through data and information.
• Our core values center around providing solutions that are:
  • Business-Driven: We put the needs of your business first, before we look at any technology solution.
  • Clear & Relevant: We provide clear explanations using real-world examples.
  • Customized & Right-Sized: Our implementations are based on the unique needs of your organization’s size, corporate culture, and geography.
  • High Quality & Technically Precise: We pride ourselves in excellence of execution, with years of technical expertise in the industry.
Data-Driven Business Transformation
Business Strategy Aligned With Data Strategy
Visit www.globaldatastrategy.com for more information
Contact Info
• Email: [email protected]
• Twitter: @donnaburbank, @GlobalDataStrat
• Website: www.globaldatastrategy.com
Questions?
Thoughts? Ideas?