…optimise your IT investments Data Discovery Understanding data relationships Philip Howard Research Director – Bloor Research


Data Discovery

Understanding data relationships

Philip Howard

Research Director – Bloor Research


Confidential © Bloor Research 2009. Telling the Information Management story.

Agenda

What are data relationships and why are they important?

Different approaches to discovering data relationships

Features you might look for in a data discovery tool


What is a data relationship?

1. A relationship between database tables, either within or across databases

2. A relationship within or across non-relational data sources

3. A relationship between a relational and non-relational source

Note that relationships may be complex and/or involve more than 2 elements


Why are data relationships important?

1. Data migration


2. Data archival


3. Master data management


4. Data governance


5. Data modelling


6. Business intelligence


7 & 8 & 9 & …

Data integration

Legacy migration

Data warehousing


Why are data relationships difficult?

No definition exists across multiple sources

Within a source many relationships are not explicit

Ownership of relationships is diverse

Many relationships are defined within application software and not in the data source


Data relationships in place

Different issues arise when you consider relationships within systems versus across systems


Data relationships within systems

Typical functions:

Identification of primary-foreign key pairs

Dependency analysis

Redundant columns

Usually provided through data profiling, which also provides error statistics
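Identification of primary-foreign key pairs can be illustrated with a minimal sketch. It assumes tables are held as lists of dictionaries and treats a pair as a candidate when the parent column is unique and every child value also appears in it. The sample tables and the function names are hypothetical, and a real profiling tool would apply further tests (data types, value distributions, sampling) before proposing a key.

```python
from itertools import product

def column_values(rows, column):
    """Return the non-null values of one column as a list."""
    return [row[column] for row in rows if row.get(column) is not None]

def is_unique(values):
    """A column can act as a primary key only if its values never repeat."""
    return len(values) == len(set(values))

def candidate_key_pairs(parent_rows, child_rows):
    """Yield (parent_column, child_column) pairs where the parent column is
    unique and every child value also appears in the parent column."""
    parent_cols = list(parent_rows[0].keys())
    child_cols = list(child_rows[0].keys())
    for p_col, c_col in product(parent_cols, child_cols):
        p_vals = column_values(parent_rows, p_col)
        c_vals = column_values(child_rows, c_col)
        if not c_vals or not is_unique(p_vals):
            continue
        if set(c_vals) <= set(p_vals):
            yield (p_col, c_col)

# Hypothetical sample data, for illustration only
customers = [
    {"cust_id": 1, "name": "Ann"},
    {"cust_id": 2, "name": "Bob"},
    {"cust_id": 3, "name": "Cy"},
]
orders = [
    {"order_id": 10, "customer": 1, "amount": 250},
    {"order_id": 11, "customer": 1, "amount": 400},
    {"order_id": 12, "customer": 3, "amount": 250},
]

pairs = list(candidate_key_pairs(customers, orders))
print(pairs)
```

Value containment alone produces false positives on small or low-cardinality columns, so results of this kind are candidates for review rather than confirmed keys.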


Data relationships across systems

Requirement for relationship discovery

No requirement for error statistics

Requirement to report rule violations where these represent a violation of a cross-source relationship
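A violation of a cross-source relationship might look like the following sketch. It assumes two hypothetical sources, a CRM system and a billing system, and a rule that every billing account must reference a customer known to the CRM. The source names, field names and records are invented for illustration.

```python
def find_orphans(child_rows, child_key, parent_rows, parent_key):
    """Return child rows whose key has no match in the parent source."""
    parent_keys = {row[parent_key] for row in parent_rows}
    return [row for row in child_rows if row[child_key] not in parent_keys]

# Hypothetical records from two separate systems
crm_customers = [{"custID": "C001"}, {"custID": "C002"}]
billing_accounts = [
    {"account": "A1", "cust#": "C001"},
    {"account": "A2", "cust#": "C009"},  # no such customer in the CRM
]

violations = find_orphans(billing_accounts, "cust#", crm_customers, "custID")
print(violations)
```

Only the rows that break the relationship are reported, in line with the point that general error statistics are not the requirement here.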


Specific requirements

For MDM – overlap & precedence analysis, transformation & business rules and exceptions, outlier analysis, matching keys

For data migration & archival – business entities
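Overlap analysis for MDM can be sketched as a comparison of matching keys between two sources. The example assumes keys only need trimming and case-folding before comparison; the source names and key values are hypothetical, and real matching keys usually need richer standardisation.

```python
def normalise(key):
    """Standardise a matching key so trivial differences do not hide overlap."""
    return str(key).strip().upper()

def overlap_report(source_a, source_b):
    """Summarise which keys appear in both sources and which in only one."""
    a = {normalise(k) for k in source_a}
    b = {normalise(k) for k in source_b}
    return {
        "in_both": sorted(a & b),
        "only_in_a": sorted(a - b),
        "only_in_b": sorted(b - a),
    }

# Hypothetical customer keys from two systems
crm_keys = ["c001", "C002 ", "C003"]
erp_keys = ["C002", "C003", "C004"]

report = overlap_report(crm_keys, erp_keys)
print(report)
```

Precedence analysis would then decide, for the keys in both sources, which source supplies the trusted value for each attribute; that step is not shown here.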


General functions

Automation of MDM and Profiling functions

Visualisation of relationships

Semantics: the semantic type of the data, e.g. zip code

Context-free discovery: e.g. recognising that cust# is equivalent to custID

Data classification: recognising the relationship between a pre-defined, business-user-maintained domain of values and the actual content of a field in order to identify the content of a field as well as unexpected values.

Business glossary
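Three of the functions above (semantic typing, context-free discovery and data classification) can be sketched in a few lines. The rules used here are assumptions made for illustration: "#" is read as "id", a zip code is taken to be a US ZIP or ZIP+4, and the domain of country codes is invented. Commercial tools use far richer techniques.

```python
import re

def canonical_name(column_name):
    """Reduce a column name to a comparable form (hypothetical rule set)."""
    name = column_name.strip().lower().replace("#", "id")
    return re.sub(r"[^a-z0-9]", "", name)

def looks_like_us_zip(value):
    """Semantic type check: 5-digit US ZIP code, optionally ZIP+4."""
    return re.fullmatch(r"\d{5}(-\d{4})?", value.strip()) is not None

def classify(values, domain):
    """Compare field content with a pre-defined domain of values.
    Returns the share of values inside the domain and the unexpected ones."""
    allowed = {d.upper() for d in domain}
    unexpected = [v for v in values if v.strip().upper() not in allowed]
    coverage = 1 - len(unexpected) / len(values)
    return coverage, unexpected

same_column = canonical_name("cust#") == canonical_name("custID")
zip_ok = looks_like_us_zip("90210")
coverage, unexpected = classify(["GB", "us", "DE", "XX"], ["GB", "US", "DE"])
print(same_column, zip_ok, coverage, unexpected)
```

A high coverage figure suggests the field holds the classified domain, while the list of unexpected values points to content that needs investigation.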


Tools Landscape


Conclusion

Understanding data relationships across data sources is important in many data management disciplines

There are relatively few tools that are good at discovering such relationships – moreover, data discovery is a broad discipline and no one tool is good at all aspects of relationship discovery.
