1

Business Register: Quality Practices

Eddie Salyers
Eddie.Joe.Salyers@Census.GOV

301-763-2638


2

An Assessment of Current Quality Assurance Practices and Ongoing Work to Develop a Comprehensive Quality Plan for the U.S. Census Bureau Business Register

3

• Introduction
  – Database Redesign
  – Quality Assurance Team
• Business Register Overview
• Quality Assurance
  – Migration
  – Administrative Records
  – Census Bureau Data Collections
  – Recommendations
• Conclusion


4

BR Database Redesign

• Complete redesign
• Old Standard Statistical Establishment List (SSEL): VAX RDB
• New Business Register (BR): Oracle
• All software rewritten
• New BR production Fall 2002

5

Quality Assurance Team

Mission: Assure that the quality of the new BR is, at a minimum, commensurate with that of the old SSEL, which it replaces, and establish a complete quality framework.

6

Quality Assurance Team

Definitions:

Quality - "The totality of features and characteristics of a product or service that bear on its ability to satisfy specified or implied needs." (ISO, 1986).

Reliability - “The ability of a system or component to perform its required functions under stated conditions for a specified period of time.” [IEEE 90].

Integrity - Information in the system follows designated standards and is consistent both within an individual table and between associated tables.
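As a rough illustration of the Integrity definition, the sketch below runs one within-table check (values follow a designated standard) and one between-table check (every establishment points at a known enterprise). The table layouts, field names, and state list are hypothetical; the slides do not show the actual BR schema.

```python
# Hypothetical tables; the real BR schema is not shown in these slides.
VALID_STATES = {"MD", "VA", "DC"}  # illustrative subset of a designated standard

establishments = [
    {"estab_id": 1, "enterprise_id": 10, "state": "MD"},
    {"estab_id": 2, "enterprise_id": 10, "state": "ZZ"},  # invalid state code
    {"estab_id": 3, "enterprise_id": 99, "state": "VA"},  # no such enterprise
]
enterprises = [{"enterprise_id": 10}, {"enterprise_id": 11}]


def within_table_violations(rows):
    """IDs of rows whose state code is not in the designated list."""
    return [r["estab_id"] for r in rows if r["state"] not in VALID_STATES]


def between_table_violations(child_rows, parent_rows):
    """IDs of establishments whose enterprise is missing from the parent table."""
    known = {p["enterprise_id"] for p in parent_rows}
    return [c["estab_id"] for c in child_rows if c["enterprise_id"] not in known]


bad_format = within_table_violations(establishments)
orphans = between_table_violations(establishments, enterprises)
print(bad_format, orphans)
```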

7

Business Register Overview

• Primary Functions
  – Economic Census enumeration list
  – Survey sampling frames
  – Central storage of administrative data
  – Control file for data collection/processing
  – Data for statistical products
  – Data for economic research

8

Key Concepts and Definitions

The BR’s Units

Business/Statistical
– Establishment (standard statistical unit)
– Enterprise (standard statistical unit)
– Enterprise segment, e.g., alternate reporting unit (variable)

Administrative (mainly for IRS tax reporting)
– EIN unit
– SSN unit

9

Business Organization

Basic Types

Single-establishment enterprise:
– An enterprise that operates just one establishment (i.e., at one physical location); a single unit, or SU

Multi-establishment enterprise:
– An enterprise that operates two or more establishments (2-plus locations)
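The two basic types can be told apart by counting establishments per enterprise. The following is a minimal sketch under that reading; the identifiers and the pair layout are invented for illustration, and the "SU"/"MU" labels follow the abbreviations used on these slides.

```python
from collections import Counter

# (establishment_id, enterprise_id) pairs; all identifiers are made up.
establishments = [
    ("E1", "ENT_A"),
    ("E2", "ENT_B"),
    ("E3", "ENT_B"),
    ("E4", "ENT_B"),
]


def classify_enterprises(pairs):
    """Label each enterprise SU (one establishment) or MU (two or more)."""
    counts = Counter(enterprise for _, enterprise in pairs)
    return {ent: ("SU" if n == 1 else "MU") for ent, n in counts.items()}


types = classify_enterprises(establishments)
print(types)
```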

10

Multiunit

[Diagram: an Enterprise (Parent) with a Subsidiary; one EIN (Consolidated Income Tax) and three EIN (Payroll Only) units; multiple Establishments]

A more complex MU may have:
• Multiple EIN units
• One or more subsidiary enterprises

11

Complex Multiunits

The largest U.S. Multi-units may have:

Several thousand EINs

More than 10,000 establishments

12

System

• Oracle Database

• Many Related Tables

• Interactive Web-Based Interface built with Oracle Forms & PL/SQL

• Interface used for research and updates

• Software for interactive and batch updates and edits

13

Migration

• Complete Redesign
  – New IDs
  – New Table Structures
  – All New Software
  – Copy Existing data - 2001
  – Load “new” data - 2002

14

Migration

• Quality Checks
  – Create SAS Datasets from Old SSEL and New BR for 2001 Records
  – Record to Record Match of 2001 SSEL and 2001 BR
    • After accounting for differences caused by design, no significant differences were found
  – Comparison of 2001 BR to 2002 BR
    • Checks both migration and software used to load 2002 records
    • Year to Year Changes as Expected
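The record-to-record match was done with SAS datasets; the Python sketch below only illustrates the general idea. It assumes a crosswalk from old SSEL IDs to new BR IDs exists (the redesign issued new IDs), and all IDs, field names, and values are hypothetical.

```python
# Hypothetical extracts keyed by record ID; field names are illustrative only.
ssel_2001 = {
    "S1": {"employment": 12, "payroll": 340},
    "S2": {"employment": 5, "payroll": 90},
    "S3": {"employment": 40, "payroll": 1200},
}
br_2001 = {
    "B1": {"employment": 12, "payroll": 340},
    "B2": {"employment": 5, "payroll": 95},  # value differs after migration
}
# Assumed crosswalk from old SSEL IDs to new BR IDs.
crosswalk = {"S1": "B1", "S2": "B2", "S3": "B3"}


def match_records(old, new, xwalk):
    """Return old IDs with no migrated record, and old IDs whose fields differ."""
    unmatched, differing = [], []
    for old_id, old_rec in old.items():
        new_rec = new.get(xwalk.get(old_id))
        if new_rec is None:
            unmatched.append(old_id)
            continue
        changed = sorted(f for f in old_rec if old_rec[f] != new_rec.get(f))
        if changed:
            differing.append((old_id, changed))
    return unmatched, differing


unmatched, differing = match_records(ssel_2001, br_2001, crosswalk)
print(unmatched, differing)
```

Differences that the redesign introduced on purpose would be filtered out of `differing` before judging whether anything significant remains.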

15

Administrative Records

• Internal Revenue Service:
  – Business Master File (BMF)
  – Payroll tax returns
  – Business income tax returns
• Bureau of Labor Statistics (BLS):
  – Description: Industrial classification assigned by State Employment Security Agencies as part of Covered Employment and Wages
• Social Security Administration
  – Applications for new Employer Identification Number (EIN)

16

Administrative Records

Over 100 Million administrative records are received each year.

17

Administrative Records Quality Assurance

Current Practices:
• Stage 1:
  – Tabulate distributions of variables on incoming files and compare to expected values.
  – Unchanged with redesign, works on inputs
• Stage 2:
  – Basic Validity Test: Edits to assure each item has a valid form (valid states, data type, etc.)
  – Ratio Edits: Examine consistency of correlated data, e.g., payroll per employee
  – Data failing edits are replaced with imputed values and referred to an analyst for review
  – Done as part of load to BR database
  – Process is similar to old, but all software rewritten for new BR
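The Stage 2 edits might look roughly like the sketch below: a validity test on form, followed by a ratio edit on payroll per employee. The state list, the ratio bounds, and the field names are assumptions made for illustration, not the BR's actual edit parameters. Imputation is left out; failing records are only collected for referral.

```python
VALID_STATES = {"MD", "VA", "DC"}           # illustrative subset
PAYROLL_PER_EMPLOYEE_BOUNDS = (5.0, 200.0)  # assumed bounds, in $1,000s


def edit_record(rec):
    """Return the list of edit failures for one incoming record."""
    failures = []
    # Basic validity: state code must be on the list.
    if rec.get("state") not in VALID_STATES:
        failures.append("invalid_state")
    emp, pay = rec.get("employment"), rec.get("payroll")
    # Basic validity: employment must be a non-negative integer.
    if not isinstance(emp, int) or emp < 0:
        failures.append("invalid_employment")
    # Ratio edit: payroll per employee must fall inside the assumed bounds.
    elif emp > 0 and isinstance(pay, (int, float)):
        low, high = PAYROLL_PER_EMPLOYEE_BOUNDS
        if not low <= pay / emp <= high:
            failures.append("ratio_payroll_per_employee")
    return failures


records = [
    {"state": "MD", "employment": 10, "payroll": 400.0},
    {"state": "XX", "employment": 10, "payroll": 400.0},
    {"state": "VA", "employment": 2, "payroll": 5000.0},
    {"state": "DC", "employment": "ten", "payroll": 100.0},
]

# Records with at least one failure would be imputed and sent to an analyst.
referrals = {}
for index, rec in enumerate(records):
    failures = edit_record(rec)
    if failures:
        referrals[index] = failures
print(referrals)
```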

18

Administrative Records Quality Assurance

• Current Practices:
  – Strengths:
    • Identifies systematic file errors well
  – Weaknesses:
    • Lack of Macro-Level Post Processing Quality Assurance
    • Communication
    • Identifying significant problems with large cases

19

Administrative Records Quality Assurance

• Recommendations
  – Using SAS datasets that are created monthly from the BR, perform a routine macro-level review.
  – Creation of a Centralized Administrative Record Tracking System
  – Standardization and Automation of all Current QA Reports
  – Increase Ability to Identify Important Companies with Missing or Inaccurate Administrative Records
  – Development of Systematic Review of Post-Processing Administrative Record QA
  – Monitor Cost of Current Administrative Record Quality Assurance Activities
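One possible shape for the routine macro-level review is a month-over-month comparison of aggregate totals, flagging any total that moves more than a tolerance. This is only a sketch: the recommendation names SAS datasets, and the totals, variable names, and 5% tolerance below are invented for illustration.

```python
def macro_review(previous, current, tolerance=0.05):
    """Flag totals whose relative change from the prior month exceeds tolerance."""
    flagged = {}
    for name, prev_total in previous.items():
        change = (current[name] - prev_total) / prev_total
        if abs(change) > tolerance:
            flagged[name] = change
    return flagged


# Invented monthly totals, not actual BR figures.
march = {"establishments": 1_000, "employment": 20_000, "payroll": 500_000}
april = {"establishments": 1_010, "employment": 18_000, "payroll": 510_000}

flags = macro_review(march, april)
for name, change in flags.items():
    print(f"{name}: {change:+.1%} change, refer for review")
```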

20

Census Bureau Data Collections

Company Organization Survey

Description: Register proving survey directed to selected multiunit enterprises

Content
– Ownership or control by a United States parent
– Ownership or control by a foreign parent
– Inventory of establishments, verifying or collecting the following for each:
  • Primary and secondary name
  • Physical location
  • EIN used for payroll tax reporting
  • SIC
  • Employment for pay period including March 12
  • First quarter and annual payroll
  • Year-end operating status

21

Census Bureau Data Collections

Economic Census

Description: Enumeration of establishments in covered industries

Content for each establishment:
– Ownership or control by a parent enterprise
– Locations of operation
– Primary and secondary name
– Physical location address
– EIN used for payroll tax reporting
– SIC and Type of Operation
– Employment for pay period including March 12
– First quarter and annual payroll
– Dollar volume of business (value of shipments, sales, receipts, revenue)
– Year-end operating status
– Value of products and services by category (selectively)
– Other industry-specific content

22

Census Bureau Data Collections Quality Assurance

Current Practices:
• Data Entry
  – Independent Verification of samples
  – Data are re-keyed and differences adjudicated
  – Lots accepted or rejected based on error rates.
• Batch Update Operations
  – Basic Validity Test: Edits to assure each item has a valid form (valid states, data type, etc.)
  – Ratio Edits: Examine consistency of correlated data, e.g., payroll per employee
  – Data failing edits are replaced with imputed values and referred to an analyst for review
  – Done as part of load to BR database
  – Process is similar to old, but all software rewritten for new BR
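The data entry step (re-key a sample, then accept or reject the lot on its error rate) can be sketched as follows. The sample size, the acceptance threshold, and the function name are hypothetical, and the sketch simplifies adjudication by counting every keyed/re-keyed mismatch as an error in the original keying.

```python
import random


def verify_lot(keyed, rekeyed, sample_size, max_error_rate, seed=0):
    """Accept or reject a non-empty lot based on a re-keyed sample of its records."""
    rng = random.Random(seed)
    sample_ids = rng.sample(sorted(keyed), min(sample_size, len(keyed)))
    # Simplification: any mismatch counts as an error in the original keying.
    errors = [i for i in sample_ids if keyed[i] != rekeyed[i]]
    error_rate = len(errors) / len(sample_ids)
    decision = "accept" if error_rate <= max_error_rate else "reject"
    return decision, error_rate


# Made-up lot of four keyed records and an independent re-keying of them.
keyed = {1: "ACME CO", 2: "SMITH LLC", 3: "JONES INC", 4: "DOE BROS"}
rekeyed = {1: "ACME CO", 2: "SMITH LLC", 3: "JONES INC.", 4: "DOE BROS"}

# Sampling the whole lot here so the outcome does not depend on the draw.
decision, rate = verify_lot(keyed, rekeyed, sample_size=4, max_error_rate=0.10)
print(decision, rate)
```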

23

Census Bureau Data Collections Quality Assurance

Current Practices:
• Clerical Operations
  – A second person who is qualified as a verifier selects and inspects a sample of the referrals from each completed work unit (dependent verification)
  – Rejected work units are subjected to 100% re-inspection
  – Note: the “old” SSEL had functionality to hold corrections until they passed inspection

24

Additional QA Team Recommendations

• Improve Error Tracking
• Improve Imputation for missing Employment and Payroll Values
• Evaluate ORACLE DQI (Data Quality Inspector) as a way to identify problems
• Expand use of SAS datasets built from the BR to assess quality
• Review and documentation of user needs and how the BR meets those needs
• Comparison to Bureau of Labor Statistics (BLS) Business Establishment List (BEL)

25

Conclusion

• No identifiable difference in quality of new BR and old SSEL

• Most procedures remain the same
• Migration completed accurately
• Concerns
  – Clerical processing
  – Dependence on staff expertise

• Several Areas for Potential Improvements