www.techinmotioninc.com Data Quality: Prerequisite for Data Sharing Bonnie K. O’Neil Sr. Principal Data Architect PPC Yazmin Rowe Data Architect Technology in Motion, Inc.




www.techinmotioninc.com

Data Quality: Prerequisite for Data Sharing

Bonnie K. O’Neil

Sr. Principal Data Architect

PPC

Yazmin Rowe

Data Architect

Technology in Motion, Inc.


Agenda

►Case Study Background
►Data Quality Framework
►Proof of Concept
►Incremental Approach
►Take-aways


Case Study

►Federal Bureau
►Legal/Catching Bad Guys
►Data, data everywhere…
►President’s directives to share data
  ►Especially around terrorism
►Data quality an issue (where is it NOT an issue?!)
►Initiative: Implement Data Quality Program to enable Data Sharing


State of Affairs…

►No enterprise data model
►No enterprise data dictionary
►They DO have Data Management in place
  ►DM Reviews
  ►DM Handbook
►Data Sharing Challenges:
  ►No “gold copy” or source of record for data elements
  ►Different systems, supposedly same data element, different values
  ►How many systems is this data element in?
  ►Impact analysis
  ►Data sharing simply not possible!


Framework Fundamentals: Dictionary is the Heart of Data Quality

►How can you tell what “bad data” is if you don’t know what it is supposed to be in the first place?
  ►Dictionary tells you this
  ►Good definitions spell out the expectation for the data
►You know it is a good definition when you are able to tell if the data conforms to expectations or not
  ►Should be able to compare results of profiling with the definition for the field
  ►Does the data conform to the definition?
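
One way to compare data with its dictionary definition is to express the definition as checkable rules and count how many values conform. The sketch below is only an illustration under stated assumptions: the `FieldDefinition` entry (a nullability flag plus a regex pattern), the field name, and the sample values are all invented for the example and do not come from the case study.

```python
import re
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class FieldDefinition:
    """Hypothetical dictionary entry: the stated expectation for one field."""
    name: str
    nullable: bool
    pattern: Optional[str] = None  # regex that every non-null value should match


def conformance_rate(values: Sequence[Optional[str]],
                     definition: FieldDefinition) -> float:
    """Return the fraction of values that conform to the definition."""
    if not values:
        return 1.0  # nothing to violate the definition
    conforming = 0
    for value in values:
        if value is None:
            # A missing value conforms only if the definition allows NULLs.
            if definition.nullable:
                conforming += 1
        elif definition.pattern is None or re.fullmatch(definition.pattern, value):
            conforming += 1
    return conforming / len(values)


# Invented definition and sample data, for illustration only.
birth_date = FieldDefinition(name="BIRTH_DATE", nullable=False,
                             pattern=r"\d{4}-\d{2}-\d{2}")
sample = ["1970-01-31", "01/31/1970", None, "1982-11-05"]
rate = conformance_rate(sample, birth_date)
print(f"{birth_date.name}: {rate:.0%} of values conform")
```

A definition that cannot be turned into a check like this is a hint that it does not yet spell out the expectation for the data.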


Framework Fundamentals: Top Down and Bottom Up Data Management

►Profiling to see what’s really in the fields
►Examine data element definitions (where they exist) to determine what the business thinks the fields contain
►Enabled business person (Data Steward) to be key player


Study in 2007

►First Six Months 2007:
  ►Enterprise-Wide Data Quality Standard/Procedures
    ►Standards are part of the infrastructure necessary to share data externally
  ►Enterprise Data Quality Framework
    ►Strategic and tactical approach
  ►Successful Proof-of-Concept Project
    ►An enterprise data model
    ►A data dictionary
►In 2007:
  ►Software Selection for Data Quality Framework
  ►Kick-off First Business Unit Data Quality Project


The Proof of Concept

►All kinds of politics with getting new projects approved
  ►Especially with Production Data
►We performed a Proof of Concept (POC)
  ►Isolated environment; testing, using PCs
  ►Agreed to get rid of production data after we had profiled it
►Language is Critical
  ►Can’t call it “Data Profiling”
  ►Instead, called it “Data Demographic Analysis”


Proof of Concept Cont’d

►Found a Sponsor
  ►Good friends with the DM Team Lead
  ►Recent DQ “issue”
  ►Has motivation to look into this
►Formed an interdisciplinary project team (IPT)
  ►Involved many people from different areas of the business


Shoestring Principle

►Bonnie’s Law:
  ►“Use Whatever is Lying Around”
►You will be surprised at what you find when you look for “whatever is lying around”
  ►Already purchased software
  ►Software/hardware scrapped from a failed project
  ►Under-utilized systems


Using Bonnie’s Law

►Repository products were too expensive
►Had Oracle Warehouse Builder (OWB) lying around
►New OWB has a data profiling option
  ►Good News: Saved us from having to buy a separate profiling tool
  ►Bad News: Profiling in OWB was an option (meaning money)
    ►Still cheaper than having to buy a separate profiling tool
►HAD TO HAVE Profiling!!
  ►Using it NOT for ETL!


Benefits

►Statistics on their data
  ►Profiling: Min/Max, % NULL, % Distinct, format/pattern, etc.
►Cannot manage what you cannot measure!
►Immediately pinpoint data quality issues
►Traceability to data concepts (EDM)
  ►Show multiple occurrences of same type of data
  ►Setting the infrastructure in place for a “super query”
►Provided a “straw man” data quality methodology (Framework)
  ►In draft
  ►Solicited comments from everybody
  ►Helps get buy-in BIG TIME instead of shoving it down their throats
  ►Users felt included instead of alienated
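
The profiling statistics named on this slide (min/max, % NULL, % distinct, format/pattern) can be sketched in a few lines of Python. This is a minimal illustration, not the OWB profiler: the function names `profile_column` and `value_pattern`, the sample postal codes, and the choice to compute % distinct over all rows (some tools use non-null rows instead) are assumptions made for the example.

```python
import re
from collections import Counter
from typing import Optional, Sequence


def value_pattern(value: str) -> str:
    """Reduce a value to its shape: digits become 9, letters become A."""
    shape = re.sub(r"[0-9]", "9", value)
    return re.sub(r"[A-Za-z]", "A", shape)


def profile_column(values: Sequence[Optional[str]]) -> dict:
    """Compute min/max, % NULL, % distinct and pattern frequencies."""
    total = len(values)
    present = [v for v in values if v is not None]
    return {
        "rows": total,
        # Min/max use plain string ordering here.
        "min": min(present) if present else None,
        "max": max(present) if present else None,
        "pct_null": 100.0 * (total - len(present)) / total if total else 0.0,
        # Assumption: distinct non-null values as a share of all rows.
        "pct_distinct": 100.0 * len(set(present)) / total if total else 0.0,
        "patterns": Counter(value_pattern(v) for v in present),
    }


# Invented sample column, for illustration only.
postal_codes = ["80403", "80403-1234", None, "80403"]
stats = profile_column(postal_codes)
for key, value in stats.items():
    print(f"{key}: {value}")
```

Two different patterns in a column that the dictionary defines with one format is exactly the kind of issue such statistics pinpoint.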


Scoping

►Divide & conquer
►Pick a subject area
  ►Less complex semantics
  ►PEOPLE
►Limit systems for the POC
  ►Three, but ended up with two
  ►Not overly complex but not simple either
►Kept refining the scope


Incremental: Growing the Project

►If the user likes it, this project can “graduate” to a “real” project
  ►More complex subject areas
  ►More systems
►This is actually what happened!
►Sell to other groups in the bureau
►EDM will grow incrementally
►Successfully established a Data Quality Program at the Bureau


Take-Aways

► You must ALWAYS do data profiling!! Essential!! For anything. Period.
► Try to use what you have instead of buying something new & expensive
  ► You’d be surprised what you find “lying around”
► Involve the users and other groups within the business
  ► Especially in creating a methodology
    ► Lets them feel a part of the creation
► Language is very important to people
  ► Sometimes I have seen the term “Data Warehouse” disliked
► Get your project funded by tagging it along with a business goal
  ► Funding the EDM by way of data quality
  ► Find business hot button and propose to solve it


Future Plans

►Data Governance
►Data Inventory
►Master Data Management (MDM)
►Formal integration of data quality measurements into SLC
►Linking the EDM to application data
  ►Suck in the data from the source systems
  ►Suck in the EDM from data modeling tool
  ►Map the two
  ►“Virtual” mapping
    ►No data movement taking place
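
A “virtual” mapping of this kind can be pictured as a small lookup structure that links EDM concepts to the physical columns that hold them, without copying any data. The sketch below is hypothetical: the `ColumnMapping` type and every concept, system, table, and column name are invented for illustration and do not describe the Bureau's systems.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class ColumnMapping(NamedTuple):
    """Links one physical column to one EDM concept; no data is copied."""
    concept: str  # name of the EDM data concept
    system: str   # source system that holds the column
    table: str
    column: str


# Hypothetical mappings; all names are invented for this sketch.
MAPPINGS = [
    ColumnMapping("Person.BirthDate", "CaseSystem", "SUBJECT", "DOB"),
    ColumnMapping("Person.BirthDate", "HRSystem", "EMPLOYEE", "BIRTH_DT"),
    ColumnMapping("Person.LastName", "CaseSystem", "SUBJECT", "LNAME"),
]


def occurrences_by_concept(mappings: Iterable[ColumnMapping]) -> Dict[str, List[str]]:
    """Group physical columns under the EDM concept they implement."""
    index = defaultdict(list)
    for m in mappings:
        index[m.concept].append(f"{m.system}.{m.table}.{m.column}")
    return dict(index)


index = occurrences_by_concept(MAPPINGS)
for concept, columns in sorted(index.items()):
    print(f"{concept}: {len(columns)} occurrence(s) -> {columns}")
```

An index like this answers the earlier question of how many systems a data element lives in, and gives a starting point for impact analysis.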


Conclusion

►In order to achieve data sharing, you must clean up the data first
►You can get data quality projects funded if you:
  ►Start small
  ►Solve an important business problem
  ►Establish a framework
  ►Get a sponsor who sees the business value in what you are providing
  ►Be politically savvy about word usage; don’t use their charged words
  ►Get business people involved and participating
  ►Limit expenditures at first, until you have proven the business benefit
  ►Do a POC to test drive your approach (“Try it, you’ll like it”)
    ►Isolate it from production applications


Thanks!

Bonnie O’Neil
Project Performance Corporation
24771 Westridge Rd.
Golden, CO 80403

Office: 303-642-3534 Cell: 303-725-1737

[email protected]

PPC is based in the Washington, DC area and performs both Government and Commercial work

IT Consulting, Project Management

Yazmin Rowe
Technology in Motion, Inc.

Office: 703-278-0792 Cell: 301-915-4471

[email protected]


Reference

►Newly released book:
  ►Business Metadata
  ►Authors:
    ►Bill Inmon
    ►Bonnie O’Neil
    ►Lowell Fryman
►Making metadata useful to the business
  ►Does metadata need to be translated into “business speak”?
  ►Where does business metadata live?
  ►What do you have to set in place to implement it?
  ►How do you do a Vulcan Mind Meld to get it out of people’s heads?