Richard Wang, Ph.D.
Deputy Chief Data Officer
Chief Data Quality Officer
Office of the U.S. Army CIO/G-6
Director, MIT Information Quality Program (on leave)
Massachusetts Institute of Technology
Visiting University Professor of Information Quality
University of Arkansas at Little Rock
On the Authoritative Data Sources: One Data Element at a Time
DAMA National Capital Region Chapter Meeting
March 9, 2010
Washington, DC
© 1988-2010 MIT IQ Program, Army - 2 - For DAMA-NCR March 9, 2010 meeting
Data Quality Books by MIT Information Quality Program
http://mitiq.mit.edu/Publications.htm
MIT’s role in the foundations for IQ education (2007, Madnick)
Articles
- 1990 Polygen Data Quality Model (VLDB + ICIS)
- 1996 Beyond Accuracy
- 1998 Managing Information as a Product
Books
- Journey to Data Quality (2006)
- … and many others
Conferences and Certification Programs
- 1996 International Conference on Information Quality (ICIQ)
- 2002 MIT-IQ program for Executives
- 2003 IQ-1: Principles and Foundations
- 2007 IQ Industry Symposium
Journals
- 2007 ACM Journal on Data and Information Quality (JDIQ)
Research Projects
- 1988 Total Data Quality Management Program (TDQM)
- 2002 MIT Information Quality (MITIQ) Program
Rich Wang (our Harry Potter)
* Not complete list
Lots of time & energy
UALR: MSIQ and IQ PhD Degree Programs
One Data Element at a Time: Federal Agency Case
Stakeholders Meeting
Data Element Identification
$1M+ impact per data element
90-day progress
Private Sector Case
Data Element Selection Criteria
√ Critical to Business
√ Recognized Pain Point
√ $1M+ impact
√ Practical to Model
√ Practical to Implement
√ Owner identified
√ Commitment by the Stakeholders: 3 C’s + Management
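The selection criteria above can be read as a checklist that a candidate data element must pass in full. A minimal sketch in Python, assuming hypothetical field names and a hypothetical candidate record (none of these identifiers come from the presentation):

```python
from dataclasses import dataclass
from typing import List, Optional

IMPACT_THRESHOLD_USD = 1_000_000  # the "$1M+ impact" criterion from the slide


@dataclass
class Candidate:
    """A candidate data element; all field names are illustrative assumptions."""
    name: str
    critical_to_business: bool
    recognized_pain_point: bool
    estimated_impact_usd: float
    practical_to_model: bool
    practical_to_implement: bool
    owner: Optional[str]
    stakeholders_committed: bool


def unmet_criteria(c: Candidate) -> List[str]:
    """Return the labels of the selection criteria that the candidate fails."""
    checks = {
        "Critical to business": c.critical_to_business,
        "Recognized pain point": c.recognized_pain_point,
        "$1M+ impact": c.estimated_impact_usd >= IMPACT_THRESHOLD_USD,
        "Practical to model": c.practical_to_model,
        "Practical to implement": c.practical_to_implement,
        "Owner identified": bool(c.owner),
        "Commitment by the stakeholders": c.stakeholders_committed,
    }
    return [label for label, ok in checks.items() if not ok]


# Hypothetical candidate: no owner yet and an impact estimate below the threshold.
candidate = Candidate("example_element", True, True, 500_000, True, True, None, True)
print(unmet_criteria(candidate))
```

Returning the list of unmet criteria, rather than a single yes/no, shows stakeholders what remains to be resolved before an element is selected.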
Army Chief Data Quality Officer FY10 Priorities
1. 300-500 critical Army Data Elements in FY10, 5000 by FY13
2. Army staffing of Data Elements to advance them from Bronze to Silver and Gold
3. Vertical integration up with semantics, business logic, objects (U-Core, C2-Core ontology)
Authoritative Data Sources
Designated Data Sources
Authoritative Data Elements
Single Element Approach
Challenge: Establish a Total Data Quality Management (TDQM) Program in the Army while utilizing limited resources
Solution:
1. Address one data element at a time using priority data elements within priority projects.
2. Take the first few data elements through the entire TDQM cycle to educate and illustrate value.
3. Establish and populate a catalog of data element quality specifications (the “Define” of TDQM) containing priority data elements for broad use.
TDQM Cycle
Early Success
Project: Suicide Mitigation - NIMH Study feed
Elements: UIC, SSN
• Developed Data Quality Specification to define data quality rules.
• Constructed Information Product Map (IP-Map) that shows the flow of the data element and its quality checks from data providers to NIMH Study consumer.
• ADCF implemented quality checks and reported results.
• Captured DQ Process metrics and DQ element metrics in a Dashboard.
• Preparing DQ element metric details to feed back to data providers.
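The DQ element metrics mentioned above can be thought of as per-rule pass rates over the records in a feed. A sketch, assuming illustrative format rules (a 9-digit SSN and a 6-character alphanumeric UIC) and made-up sample records; the actual rules in the Army Data Quality Specification are not given in the slides:

```python
import re
from collections import Counter


def _matches(pattern, value):
    """True if value (treated as a string, None as empty) fully matches pattern."""
    return re.fullmatch(pattern, str(value or "")) is not None


# Illustrative rules only; the real specification is not shown in the slides.
RULES = {
    "SSN is 9 digits": lambda rec: _matches(r"[0-9]{9}", rec.get("SSN")),
    "UIC is 6 alphanumeric characters": lambda rec: _matches(r"[A-Z0-9]{6}", rec.get("UIC")),
}


def pass_rates(records):
    """Return {rule name: fraction of records passing} for a list of dict records."""
    passed = Counter()
    for rec in records:
        for name, rule in RULES.items():
            if rule(rec):
                passed[name] += 1
    total = len(records)
    return {name: (passed[name] / total if total else 0.0) for name in RULES}


# Made-up sample feed (not real data).
feed = [
    {"SSN": "123456789", "UIC": "WABC12"},
    {"SSN": "12345678", "UIC": "WABC12"},   # SSN too short
    {"SSN": "123456789", "UIC": "wabc"},    # UIC wrong case and length
    {"SSN": "000000000", "UIC": "W1A2B3"},
]
for rule_name, rate in pass_rates(feed).items():
    print(f"{rule_name}: {rate:.0%}")
```

Per-rule rates of this kind are what a dashboard could display and what could be fed back to data providers, since they point to the specific rule that records fail.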
(Army) Data Element Yellow Pages
http://architecture.army.mil/data/DEYP
Diagram labels: Data Element Quality Definition Process; IP Producer; IP; IP Consumer (IP = Information Product)
A. Army Data Element specifications are developed through the Data Element Quality Definition Process and entered in the Data Element Yellow Pages.
B. IP Producers utilize the Data Element Yellow Pages to discover Data Element specifications and integrate them into their Information Products.
C. IP Consumers access the Data Element Yellow Pages to find Data Element specifications for understanding and correctly using the data.
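Steps A through C amount to a publish-and-lookup catalog: the definition process writes specifications, and producers and consumers read them by element name. A toy in-memory sketch of that pattern; the class and method names are hypothetical, and the interface of the actual Data Element Yellow Pages is not described in the slides:

```python
class DataElementYellowPages:
    """Toy in-memory catalog keyed by data element name (illustrative only)."""

    def __init__(self):
        self._specs = {}

    def publish(self, name, spec):
        """Step A: enter a specification produced by the definition process."""
        self._specs[name.upper()] = spec

    def find(self, name):
        """Steps B and C: producers and consumers look up a specification.

        Returns None when no specification has been published for the name.
        """
        return self._specs.get(name.upper())


pages = DataElementYellowPages()
# Hypothetical specification content, for illustration only.
pages.publish("SSN", {"definition": "Social Security Number", "rules": ["9 digits"]})

spec = pages.find("ssn")  # lookup is case-insensitive in this sketch
print(spec["definition"])
```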
Data Element Yellow Pages Content
Data Element Quality Specification:
• Element Name
• Definition
• Data Quality Rules
• Approval Level
• Examples
• Data Element Owner (Steward?)
• Authoritative References
• Usage Notes
• more…
Data Quality Rules: Supports “fit for use”; segmented into three levels
1. Container (conceptual format)
2. Content (correct in itself)
3. Context (correct in context)
Approval Level:
1. Gold – ADB Approved
2. Silver – ADC Approved
3. Bronze – CDQO Approved
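The three rule levels and the approval levels above suggest a simple record structure in which rules are checked in order: Container first, then Content, then Context. A sketch under stated assumptions: the `Spec` class, the `end_date` element, and its three rules are all hypothetical examples chosen to show one rule per level, not content from the Army specification.

```python
import re
from dataclasses import dataclass, field
from datetime import date
from enum import Enum


class Approval(Enum):
    BRONZE = "CDQO Approved"
    SILVER = "ADC Approved"
    GOLD = "ADB Approved"


@dataclass
class Spec:
    name: str
    definition: str
    approval: Approval
    # Each level holds (description, predicate) pairs; predicates take (value, record).
    container: list = field(default_factory=list)
    content: list = field(default_factory=list)
    context: list = field(default_factory=list)

    def first_failure(self, value, record):
        """Check Container, then Content, then Context rules.

        Returns (level, description) for the first failed rule, or None if all pass.
        """
        for level in ("container", "content", "context"):
            for description, predicate in getattr(self, level):
                if not predicate(value, record):
                    return level, description
        return None


def has_iso_shape(value, record):
    return re.fullmatch(r"[0-9]{4}-[0-9]{2}-[0-9]{2}", value) is not None


def is_real_date(value, record):
    try:
        date.fromisoformat(value)
        return True
    except ValueError:
        return False


def not_before_start(value, record):
    return date.fromisoformat(value) >= date.fromisoformat(record["start_date"])


# Hypothetical element with one rule at each level.
end_date_spec = Spec(
    name="end_date",
    definition="Hypothetical element: last day of an assignment",
    approval=Approval.BRONZE,
    container=[("formatted as YYYY-MM-DD", has_iso_shape)],
    content=[("is a real calendar date", is_real_date)],
    context=[("not earlier than start_date", not_before_start)],
)

record = {"start_date": "2010-01-01"}
for value in ("2010-03-09", "03/09/2010", "2010-02-30", "2009-12-31"):
    print(value, end_date_spec.first_failure(value, record))
```

Checking the levels in order means a later rule can rely on earlier ones: the Context rule here parses the value as a date only after the Container and Content rules have confirmed that it is one.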
Data Element Quality Specification Process
ADC Review and Comment Process (proposed)
1. Review DE Specifications with your SMEs
Note: you will find some documents cover the entire project; others have only the definition and quality sections completed. Review the definition and quality sections.
2. Gather and submit your comments to CDQO
- All comments welcomed (positive, corrections, content, format, unaddressed).
- No comment [silence] is concurrence.
- Send your comments to CDQO Office.
3. Suspense Date: the week before the next ADC meeting, for readout at the monthly ADC meeting.
ADS Defined
Authoritative Data Source:
A recognized or official data production source with a designated mission statement or source/product to publish reliable and accurate data for subsequent use by customers. An authoritative data source may be the functional combination of multiple, separate data sources.
To assure data quality…
A data source is a mechanism through which the publication, storage, or retrieval of data is possible. Within the scope of the Information Technology domain, a data source consists of digitized data, such as a database, a machine readable file, or a data stream. Data sources contain or provide information and fulfill specific data needs within an identified mission context.
A data element is an attribute in a database, a field in a machine readable file, or a basic unit in a data stream.
The association of a data need and a given mission characterizes a data source’s intended use.
A data source is referred to as a Designated Data Source if the mission and the needed data elements from the data source for this mission are clearly specified.
An authoritative body that has responsibility for fulfilling a particular data need identifies a data source as a designated data source.
A designated data source is referred to as an Authoritative Data Source if the underlying data of the data elements needed in the specified mission is certified as accurate, timely, and fit for subsequent use by data consumers.
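The definitions above form a ladder: data source, then Designated Data Source, then Authoritative Data Source. A sketch of that classification logic, assuming hypothetical field names and assuming that certification is tracked per data element, with every needed element requiring certification (the slides do not say how certification is recorded):

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass
class DataSource:
    """Illustrative record of a data source; field names are assumptions."""
    name: str
    mission: Optional[str] = None                 # the mission the source serves
    needed_elements: FrozenSet[str] = frozenset()  # elements needed for that mission
    designated_by: Optional[str] = None           # authoritative body making the designation
    certified_elements: FrozenSet[str] = frozenset()  # certified accurate, timely, fit for use


def classify(src: DataSource) -> str:
    """Return the most specific term from the definitions that applies to src."""
    designated = bool(src.mission) and bool(src.needed_elements) and bool(src.designated_by)
    if not designated:
        return "data source"
    if src.needed_elements <= src.certified_elements:
        return "authoritative data source"
    return "designated data source"


# Hypothetical example: designated for a mission, but only one of two elements certified.
src = DataSource(
    name="example personnel database",
    mission="example study feed",
    needed_elements=frozenset({"UIC", "SSN"}),
    designated_by="example authoritative body",
    certified_elements=frozenset({"SSN"}),
)
print(classify(src))
```

Under these assumptions a source becomes authoritative one data element at a time: it moves up only when the last needed element for the mission is certified.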
Thank you!
Q & A