Richard Wang, Ph.D. Deputy Chief Data Officer Chief Data Quality Officer Office of the U.S. Army CIO/G-6 Director, MIT Information Quality Program (on leave) Massachusetts Institute of Technology Vesting University Professor of Information Quality University of Arkansas at Little Rock On the Authoritative Data Sources: One Data Element at a Time DAMA National Capital Region Chapter Meeting March 9, 2010 Washington, DC

Richard Wang, Ph.D.

Deputy Chief Data Officer

Chief Data Quality Officer

Office of the U.S. Army CIO/G-6

Director, MIT Information Quality Program (on leave) Massachusetts Institute of Technology

Vesting University Professor of Information Quality

University of Arkansas at Little Rock

On the Authoritative Data Sources: One Data Element at a Time

DAMA National Capital Region Chapter Meeting

March 9, 2010

Washington, DC

© 1988-2010 MIT IQ Program, Army - 2 - For DAMA-NCR March 9, 2010 meeting

Data Quality Books by MIT Information Quality Program

http://mitiq.mit.edu/Publications.htm

[Figure: book covers, labeled 2006, 1999, 2000, 2005, 2006]

MIT’s role in the foundations for IQ education (2007, Madnick)

Articles
- 1990 Polygen Data Quality Model (VLDB + ICIS)
- 1996 Beyond Accuracy
- 1998 Managing Information as a Product

Books
- Journey to Data Quality (2006)
- … and many others

Conferences and Certification Programs
- 1996 International Conference on Information Quality (ICIQ)
- 2002 MIT-IQ program for Executives
- 2003 IQ-1: Principles and Foundations
- 2007 IQ Industry Symposium

Journals
- 2007 ACM Journal on Data and Information Quality (JDIQ)

Research Projects
- 1988 Total Data Quality Management Program (TDQM)
- 2002 MIT Information Quality (MITIQ) Program

Rich Wang (our Harry Potter)

* Not complete list

Lots of time & energy

UALR: MSIQ and IQ PhD Degree Programs

One Data Element At a Time: Federal Agency Case

Stakeholders Meeting

Data Element Identification

$1M+ impact per data element

90-day progress

Private Sector Case

Data Element Selection Criteria
√ Critical to Business
√ Recognized Pain Point
√ $1M+ impact
√ Practical to model
√ Practical to Implement
√ Owner identified
√ Commitment by the Stakeholders: 3 C’s + Management

Army Chief Data Quality Officer FY10 Priorities

1. 300-500 critical Army Data Elements in FY10, 5000 by FY13

2. Army Staffing of Data Elements from Bronze to Silver and Gold

3. Vertical integration up with semantics, business logic, objects (U-Core, C2-Core ontology)

Authoritative Data Sources

Designated Data Sources

Authoritative Data Elements

Single Element Approach

Challenge: Establish a Total Data Quality Management (TDQM) Program in the Army while utilizing limited resources

Solution:

1. Address one data element at a time using priority data elements within priority projects.

2. Take a first few data elements through the entire TDQM cycle to educate and illustrate value.

3. Establish and populate a catalog of data element quality specifications (the “Define” of TDQM) containing priority data elements for broad use.

TDQM Cycle

Early Success

Project: Suicide Mitigation - NIMH Study feed

Elements: UIC, SSN

- Developed Data Quality Specification to define data quality rules.
- Constructed Information Product Map (IP-Map) that shows the flow of the data element and its quality checks from data providers to NIMH Study consumer.
- ADCF implemented quality checks and reported results.
- Captured DQ Process metrics and DQ element metrics in a Dashboard.
- Preparing DQ element metric details to feed back to data providers.
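The element-level metrics mentioned above can be pictured with a minimal Python sketch. It is not the ADCF implementation: the two format rules (nine digits for SSN, six uppercase alphanumeric characters for UIC) and the sample records are assumptions made up for illustration, since the slides do not reproduce the actual Data Quality Specification.

```python
import re
from collections import Counter

# Hypothetical format rules for illustration only; the actual rules in the
# Army Data Quality Specification are not given in the slides.
RULES = {
    "SSN": re.compile(r"\d{9}"),        # assumed: nine digits, no separators
    "UIC": re.compile(r"[A-Z0-9]{6}"),  # assumed: six uppercase alphanumerics
}

def element_metrics(records):
    """Return {element: (checked, passed, pass_rate)} for each rule."""
    checked, passed = Counter(), Counter()
    for record in records:
        for element, pattern in RULES.items():
            value = record.get(element)
            checked[element] += 1
            # A missing value counts as a failed check.
            if value is not None and pattern.fullmatch(value):
                passed[element] += 1
    return {e: (checked[e], passed[e], passed[e] / checked[e]) for e in checked}

# Made-up sample feed, not real data.
sample = [
    {"SSN": "123456789", "UIC": "WABCD0"},
    {"SSN": "12345678", "UIC": "WABCD0"},   # SSN too short
    {"SSN": "123456789", "UIC": None},      # UIC missing
    {"SSN": "123456789", "UIC": "wabcd0"},  # UIC in lowercase
]

metrics = element_metrics(sample)
for element, (n, ok, rate) in metrics.items():
    print(f"{element}: {ok}/{n} passed ({rate:.0%})")
```

Per-element counts like these are the kind of figure a dashboard could show, and the failing records are what would be fed back to data providers.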

(Army) Data Element Yellow Pages

[Diagram: Data Element Quality Definition Process, IP Producer, IP, IP Consumer]

A. Army Data Element specifications are developed through the Data Element Quality Definition Process and entered in the Data Element Yellow Pages.

B. IP Producers utilize the Data Element Yellow Pages to discover Data Element specifications and integrate them into their Information Products.

C. IP Consumers access the Data Element Yellow Pages to find Data Element specifications for understanding and correctly using the data.

IP = Information Product

http://architecture.army.mil/data/DEYP

Data Element Yellow Pages Content

Data Element Quality Specification:

• Element Name
• Definition
• Data Quality Rules
• Approval Level
• Examples
• Data Element Owner (Steward?)
• Authoritative References
• Usage Notes
• more…

Data Quality Rules: Supports “fit for use”. Segmented into Three Levels:

1. Container (conceptual format)
2. Content (correct in itself)
3. Context (correct in context)

Approval Level:

1. Gold – ADB Approved
2. Silver – ADC Approved
3. Bronze – CDQO Approved
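One way to picture a specification entry with its three rule levels is the Python sketch below. It is an illustration under stated assumptions, not the Yellow Pages schema: the class and field names, the `enlistment_date` element, and its three rules are all hypothetical. Only the level names and the approval levels come from the slide.

```python
import re
from dataclasses import dataclass, field
from datetime import date
from enum import Enum
from typing import Callable, List, Optional

class ApprovalLevel(Enum):
    GOLD = "ADB Approved"
    SILVER = "ADC Approved"
    BRONZE = "CDQO Approved"

LEVEL_ORDER = {"container": 0, "content": 1, "context": 2}

@dataclass
class Rule:
    level: str                          # "container", "content", or "context"
    description: str
    check: Callable[[str, dict], bool]  # (value, full record) -> passes?

@dataclass
class ElementSpec:
    name: str
    definition: str
    approval: ApprovalLevel
    rules: List[Rule] = field(default_factory=list)

    def first_failure(self, value: str, record: dict) -> Optional[Rule]:
        """Check container, then content, then context; return the first failing rule."""
        for rule in sorted(self.rules, key=lambda r: LEVEL_ORDER[r.level]):
            if not rule.check(value, record):
                return rule
        return None

def has_iso_shape(value, record):
    return re.fullmatch(r"\d{4}-\d{2}-\d{2}", value) is not None

def is_calendar_date(value, record):
    try:
        date.fromisoformat(value)
    except ValueError:
        return False
    return True

def not_after_report(value, record):
    return date.fromisoformat(value) <= date.fromisoformat(record["report_date"])

# Hypothetical element and rules, invented for this sketch.
spec = ElementSpec(
    name="enlistment_date",
    definition="Date of enlistment (hypothetical element).",
    approval=ApprovalLevel.BRONZE,
    rules=[
        Rule("container", "formatted as YYYY-MM-DD", has_iso_shape),
        Rule("content", "is a real calendar date", is_calendar_date),
        Rule("context", "not later than the record's report date", not_after_report),
    ],
)

record = {"report_date": "2010-03-09"}
for value in ["2009-06-01", "06/01/2009", "2010-02-30", "2011-01-01"]:
    failed = spec.first_failure(value, record)
    status = "passes all levels" if failed is None else f"fails {failed.level}: {failed.description}"
    print(value, "->", status)
```

Evaluating the levels in order means a value is only tested for correctness in itself once it fits the container, and only tested against its record once it is correct in itself.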

Data Element Quality Specification Process

ADC Review and Comment Process (proposed)

1. Review DE Specifications with your SMEs

Note: you will find some documents cover the entire project; others have only the definition and quality sections completed. Review the definition and quality sections.

2. Gather and submit your comments to CDQO. All comments welcomed (positive, corrections, content, format, unaddressed). No comment [silence] is concurrence. Send your comments to CDQO Office.

3. Suspense Date: Week before next ADC Meeting, for readout at the monthly ADC meeting.

ADS Defined

Authoritative Data Source:

A recognized or official data production source with a designated mission statement or source/product to publish reliable and accurate data for subsequent use by customers. An authoritative data source may be the functional combination of multiple, separate data sources.

To assure data quality…

A data source is a mechanism through which the publication, storage, or retrieval of data is possible. Within the scope of the Information Technology domain, a data source consists of digitized data, such as a database, a machine readable file, or a data stream. Data sources contain or provide information and fulfill specific data needs within an identified mission context.

A data element is an attribute in a database, a field in a machine readable file, or a basic unit in a data stream.

The association of a data need and a given mission characterizes a data source’s intended use.

A data source is referred to as a Designated Data Source if the mission and the needed data elements from the data source for this mission are clearly specified.

An authoritative body that is responsible for fulfilling a particular data need designates a data source as a designated data source.

A designated data source is referred to as an Authoritative Data Source if the underlying data of the data elements needed in the specified mission is certified as accurate, timely, and fit for subsequent use by data consumers.
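The chain of definitions above (data source, Designated Data Source, Authoritative Data Source) can be read as a classification rule. The Python sketch below is one possible reading, not an official procedure: the `DataSource` fields, the `classify` function, and the example names are hypothetical, and certification is modeled simply as a set of certified element names.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional

@dataclass(frozen=True)
class DataSource:
    name: str
    mission: Optional[str] = None            # mission the source serves, if specified
    designated_by: Optional[str] = None      # authoritative body, if any
    needed_elements: FrozenSet[str] = frozenset()     # elements needed for the mission
    certified_elements: FrozenSet[str] = frozenset()  # elements certified fit for use

def classify(source: DataSource) -> str:
    """Apply the definitions in order: data source, designated, authoritative."""
    designated = bool(source.mission and source.designated_by and source.needed_elements)
    if not designated:
        return "Data Source"
    # Authoritative only if every needed element is certified.
    if source.needed_elements <= source.certified_elements:
        return "Authoritative Data Source"
    return "Designated Data Source"

# Hypothetical examples, invented for this sketch.
plain = DataSource(name="personnel_db")
designated = DataSource(
    name="personnel_db",
    mission="hypothetical study feed",
    designated_by="hypothetical data board",
    needed_elements=frozenset({"UIC", "SSN"}),
    certified_elements=frozenset({"SSN"}),
)
authoritative = DataSource(
    name="personnel_db",
    mission="hypothetical study feed",
    designated_by="hypothetical data board",
    needed_elements=frozenset({"UIC", "SSN"}),
    certified_elements=frozenset({"UIC", "SSN"}),
)

for source in (plain, designated, authoritative):
    print(classify(source))
```

The same source can be authoritative for one mission and only designated for another, because the test is made per mission and per needed data element.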

Thank you!

Q & A