Test Data Privacy Best Practices Methodology
Bill Mackey, Subject Matter Expert


Introduction

Why Do Companies Care About Data Privacy?

Worldwide Data Privacy Drivers

• Regulatory Compliance
– United States Gramm-Leach-Bliley Act, Sarbanes-Oxley Act
– European Union Personal Data Protection Directive, 1998
– Health Insurance Portability and Accountability Act (HIPAA)
– Australia Privacy Amendment Act of 2000
– Japanese Personal Information Protection Law
– Canadian Personal Information Protection and Electronic Documents Act (PIPEDA)

• Internal auditors are forcing data protection controls and procedures, especially for offshore use/outsourcing arrangements

• Risk of exposure can cause significant damage
– Corporate embarrassment, lawsuits, negative press, fines/penalties, loss of customers, etc.

Data Breaches Reported Since the ChoicePoint Incident

2,846 Incidents Reported Between 2-15-05 and 1-19-12
543,066,426 Consumers Impacted

• The catalyst for reporting data breaches to the affected individuals has been the California law that requires notice of security breaches, the first of its kind in the nation, implemented July 2003.

• Personal information compromised includes data elements useful to identity thieves, such as Social Security numbers, account numbers, and driver's license numbers.

A Chronology of Data Breaches Reported Since the ChoicePoint Incident

Privacy Rights Clearinghouse, January 19, 2012

How are Companies Addressing this Issue?

• Signing non-disclosure agreements

• Restricting security access to sensitive/confidential data

• Applying minimal “de-identifying” rules

• Implementing a complete data disguise solution with processes and procedures

(Approaches are listed from low effectiveness to high effectiveness.)

Best Practices Approach to Data Privacy

Technology alone is not the answer

Services

• Repeatable Best Practices
• Assessment
• Implementation
• Superior Expertise with

o 3rd Party Software

o Financial

o Healthcare

o Government

• Meet dates within high risk projects

Technology

• Related Data Extraction

• Data Sub-setting

• Data Format Conversion

• Disguise Rules Definition

• Common Rules Across the Enterprise

• Unified Rules Repository

• Support for Mainframe and Distributed Environments

• Roles Based Authorization

• Audit and Reporting

Methodology

• Data Analysis
o Analyze metadata
o Discover PII
o Classify data
• Design
o Associate disguise rules
o Define extract criteria
o Identify target environment(s)
o Identify load method(s)
o Define population strategy
• Develop
o Extract data and relationships
o Apply rules across data sources
o Load data
• Deliver
o Produce reports
o Audit results
o Enable best practices

Comprehensive Solution

Process: Data Privacy Methodology

Analyze – Understand each application’s sensitive information

Design – Define strategies for disguising test data

Develop – Build the processes to disguise test data

Deliver – Deploy and maintain data protection processes

Data Privacy Best Practices


Data Privacy Project Plan


Data Privacy

Best Practices Process Overview

Deployment Approaches

• Two project approaches:

– Progressive: Organizations that have large numbers of applications and multiple lines of business benefit more from a progressive approach. The progressive approach builds upon the success of early efforts, building up a library of disguise routines and process definitions that align with existing projects within the organization.

– Parallel: Organizations that have small to medium numbers of applications benefit more from the parallel approach. The parallel approach covers a wider range of applications at the same time, which is possible when the applications are less intertwined or more independent.

• Both approaches use a risk based methodology.

Operational Structure

Centralized: A single team is responsible for performing the data masking function for all lines of business or application areas. This organization is also often referred to as a center of excellence model.

Benefits
• Fewer resources need to be trained on the data disguise software and activities
• Increased control over consistency of the disguise techniques and behavior
• Increased productivity of these resources as they work across applications

Drawbacks
• Increased effort during the Analyze phase as these resources gain the necessary application-centric knowledge
• Increased duration, because there are typically fewer of these resources and the same effort spread across fewer people takes longer

Decentralized: Each application group is responsible for the data masking functions.

Benefits
• Existing application domain knowledge can be leveraged
• The duration of the Analyze phase may be shortened as activities can be performed in parallel
• This model streamlines the communication model between the groups

Drawbacks
• Increased effort related to training
• Increased demand on communications in order to maintain consistency

Process: How we get there

• Establish an actionable roadmap
– Determine the scope
– Establish a strategy
– Identify constraints (internal and external)
• Select the technology
– Recognized and adaptable
– Support multiple environments, platforms, & techniques
• Partner to gain the experience
– Minimize first time hurdles, pitfalls, & dead-ends
– Maximize analysis and design efficiency

Project Overview – Planning


Project Phases

• Perform the Analyze methodology phase
– Data Model Analysis
– Function Model Analysis
• Perform the Design methodology phase
– Design extract process
– Design disguise techniques
– Design load process
• Perform the Develop methodology phase
– Creation and population of Translation/Association tables
– Creation and population of Encryption keys
– Development and Unit Testing of Extract/Disguise/Load tasks
• Perform the Deliver methodology phase
– Create the repeatable process

Data Privacy

Analysis Phase

Analysis

The Analysis phase can be broken down into two major activities:
– Identification and documentation of the data model (DM)
– Identification and documentation of the functional model (FM) components of the application

These two activities provide the cornerstone for a Data Privacy initiative, and as such, are arguably the most critical of the entire project scope.

Managing Analysis Tasks


Data Model Analysis

The goal of the Data Model Analysis activities is to provide knowledge about the environment’s data.

• Determine the elements that are considered sensitive
• Define their association to other data objects
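The first step, determining which elements are sensitive, can be partly automated. The following is a minimal sketch only, assuming column metadata and a few sample values are available as plain Python dictionaries. The name hints, the labels, and the `classify_column` and `discover_sensitive` helpers are hypothetical and are not part of any product named in this deck.

```python
import re

# Hypothetical name fragments that suggest a sensitive column; a real
# project would maintain its own list per application.
SENSITIVE_NAME_HINTS = {
    "SOC_SEC": "social security number",
    "CREDIT_CARD": "credit card number",
    "MAID_NAME": "security question answer",
    "TELEPHONE": "phone number",
    "CONTACT_NAME": "personal name",
}

# Value pattern used to catch sensitive data stored in generically named columns.
SSN_PATTERN = re.compile(r"^\d{3}-\d{2}-\d{4}$")


def classify_column(name, sample_values=()):
    """Return a sensitivity label for a column, or None if nothing matches."""
    upper = name.upper()
    for hint, label in SENSITIVE_NAME_HINTS.items():
        if hint in upper:
            return label
    # Fall back to inspecting the sampled values themselves.
    if sample_values and all(SSN_PATTERN.match(v) for v in sample_values):
        return "social security number"
    return None


def discover_sensitive(tables):
    """Map each table to a {column: label} dictionary of its sensitive columns."""
    findings = {}
    for table, columns in tables.items():
        hits = {}
        for column, samples in columns.items():
            label = classify_column(column, samples)
            if label:
                hits[column] = label
        if hits:
            findings[table] = hits
    return findings


if __name__ == "__main__":
    metadata = {
        "ORDER_TBL": {
            "ORDER_NUMBER": ["1001", "1002"],
            "SOC_SEC_NUM": ["123-45-6789"],
            "ORD_DESCRIPTION": ["987-65-4321"],  # SSN hiding in a free-text column
        },
        "PART_TBL": {"PART_NAME": ["Widget"]},
    }
    for table, hits in discover_sensitive(metadata).items():
        print(table, hits)
```

Name matching alone misses sensitive values stored in free-text or reused columns, which is why the sketch also samples values. Results of either check still need review by someone who knows the application.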

Data Privacy_1.1.1.4_Data_Model_Analysis

Function Model Analysis

Function Model Analysis identifies and documents information about the application processes.

• Determine what business rules and logic apply to the data considered sensitive or private.
• Outline how the affected data should be changed.
• Identify all data validations and checks done against sensitive fields within the application programs.
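Validations matter because a disguised value that fails an application check makes the test data unusable. As an illustration only, assume the application validates credit card numbers with the Luhn checksum (an assumption, not something this deck states). A generated replacement must then carry a correct check digit. The function names below are hypothetical.

```python
import random


def luhn_check_digit(partial):
    """Compute the Luhn check digit for a digit string that lacks one."""
    total = 0
    # Walk right to left; the rightmost digit of `partial` is doubled first.
    for i, ch in enumerate(reversed(partial)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return str((10 - total % 10) % 10)


def is_luhn_valid(number):
    """Return True when the last digit is the correct Luhn check digit."""
    return number.isdigit() and luhn_check_digit(number[:-1]) == number[-1]


def generate_card_number(rng, prefix="4", length=16):
    """Generate a fictitious number that still passes the Luhn validation."""
    body_len = length - len(prefix) - 1
    body = prefix + "".join(str(rng.randrange(10)) for _ in range(body_len))
    return body + luhn_check_digit(body)


if __name__ == "__main__":
    card = generate_card_number(random.Random(2011))
    print(card, is_luhn_valid(card))
```

The same idea applies to any field with an embedded check: document the validation during analysis, then make sure the chosen disguise rule reproduces it.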

Analysis Tasks

CONTACT_TBL
  PK, FK1: CUSTOMER_NUMBER
  PK: CONTACT_ID
  Columns: CONTACT_NAME, TITLE, CONTACT_CODE, ADDRESS, CITY, STATE, ZIP_CODE, COUNTRY, AREA_CODE, TELEPHONE_NUM

PART_TBL
  PK: PART_NUMBER
  Columns: PART_NAME, EFFECT_DATE, EQUIVALENT_PART, PURCH_PRICE, SETUP_COST, LABOR_COST, UNIT_OF_MEASURE, MATERIAL_COST, REWORK_COST, AVAILABILITY_IND, ENGR_DRAW_NUM

ORDER_LINE_TBL
  PK, FK1: ORDER_NUM
  PK: ORDER_LINE_NUMBER
  FK2: PART_NUM
  Columns: PLAN_QTY, UNITS_COMPLETE, UNITS_STARTED, SCRAP_QTY, START_DATE, LINE_STATUS

CUSTOMER_HIST_TBL
  Columns: CUSTOMER_ROWID, CUSTOMER_NUMBER, COMPANY_NAME, TELEPHONE_NUM, CONTACT_NAME, CONTACT_TITLE

SUPPLIER_TBL
  PK, FK1: PART_NUMBER
  PK: SUPPLIER_CODE
  Columns: SUPPLIER_NAME, SUPPLIER_MODEL_NUM, WHOLESALE_PRICE, DISCOUNT_QUANTITY, PREFERRED_SUPPLIER, LEAD_TIME, LEAD_TIME_UNITS

ORDER_TBL
  PK: ORDER_NUMBER
  FK1: CUST_NUM
  Columns: SOC_SEC_NUM, CREDIT_CARD_NUM, MOTHERS_MAID_NAME, ORD_TYPE, ORD_DATE, ORD_STAT, ORD_AMOUNT, ORD_DEPOSIT, ORD_LINE_COUNT, SHIP_CODE, SHIP_DATE, ORD_DESCRIPTION

CUSTOMER_TBL
  PK: CUSTOMER_NUMBER
  Columns: COMPANY_NAME, ADDRESS, CITY, STATE, ZIP_CODE, COUNTRY, AREA_CODE, TELEPHONE_NUM, CONTACT_NAME, CONTACT_TITLE, CONTACT_ADDR, CONTACT_CITY, CONTACT_STATE, CONTACT_ZIP, CONTACT_COUNTRY, CONTACT_AREA_CD, CONTACT_TELEPHONE

Utilize Technology For Analysis

• Data Modeling Tools
• Data Management Tools
• File-AID/DB2 / DBA-Xpert Impact Analysis
• File-AID/Data Solutions Analysis

Understand the Sensitive Elements

Document Analysis Results

Data Privacy_1.1.1.5_Data_Model_Analysis

Design Overview


Design is the second phase of the Compuware Data Privacy Best Practices methodology and it is broken down into three major activities:

– Documentation of the Data Extracts to be created

– Identification and documentation of the data disguise rules to be created/implemented

– Documentation of the Data Loads to be created

These activities provide the background for creating the actual rules and specifications that produce a disguised copy of the data.

Design

Define application disguise strategy and process
– Field-level disguise rules (encrypt, translate, age, generate)
– Source extract criteria for data (filters, naming conventions, etc.)
– Security rules for supporting files
– Structure, value domain (content), population strategy for translate table(s)
– Target environment(s) and load method(s) to be used

Managing Design Tasks

Data Extract Design


Identifies the required information to extract the data from the original source tables/files/environments.

• Includes the following:

– environmental data (region, subsystem, server, etc.),

– driving object identification (which table/file do we drive the extract from),

– selection criteria information,

– extract specific information needed to pull the needed information from the source tables/files.

• Finally, the overall extract execution strategy will be documented (when to execute, frequency of execution, etc)
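The items an extract design documents can be captured in a small structure and then used to drive a related extract. The sketch below is an illustration only, using Python's built-in `sqlite3` module with an in-memory database. The `ExtractSpec` class, the `run_extract` and `build_demo_db` helpers, and the sample rows are hypothetical; the table and column names are borrowed from the sample data model.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class ExtractSpec:
    """Hypothetical container for the items an extract design documents."""
    environment: str         # region, subsystem, or server name
    driving_table: str       # table the extract is driven from
    driving_key: str         # key column on the driving table
    selection_criteria: str  # SQL predicate applied to the driving table
    related: dict            # child table -> foreign key column
    schedule: str            # when and how often the extract runs


def run_extract(conn, spec):
    """Return {table: rows} for the selected driving rows and their children."""
    # Table and column names come from the trusted spec, never from user input.
    driving_rows = conn.execute(
        f"SELECT * FROM {spec.driving_table} WHERE {spec.selection_criteria}"
    ).fetchall()
    keys = [row[spec.driving_key] for row in driving_rows]
    result = {spec.driving_table: driving_rows}
    placeholders = ",".join("?" for _ in keys)
    for child, fk in spec.related.items():
        result[child] = conn.execute(
            f"SELECT * FROM {child} WHERE {fk} IN ({placeholders})", keys
        ).fetchall()
    return result


def build_demo_db():
    """Create a tiny in-memory database with made-up rows for the demo."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row  # allows access to columns by name
    conn.executescript("""
        CREATE TABLE CUSTOMER_TBL (CUSTOMER_NUMBER INTEGER PRIMARY KEY,
                                   COMPANY_NAME TEXT, STATE TEXT);
        CREATE TABLE ORDER_TBL (ORDER_NUMBER INTEGER PRIMARY KEY,
                                CUST_NUM INTEGER, ORD_AMOUNT REAL);
        INSERT INTO CUSTOMER_TBL VALUES (1, 'Acme', 'MI'), (2, 'Globex', 'OH'),
                                        (3, 'Initech', 'MI');
        INSERT INTO ORDER_TBL VALUES (10, 1, 50.0), (11, 2, 75.0),
                                     (12, 3, 20.0), (13, 1, 5.0);
    """)
    return conn


if __name__ == "__main__":
    spec = ExtractSpec("demo-server", "CUSTOMER_TBL", "CUSTOMER_NUMBER",
                       "STATE = 'MI'", {"ORDER_TBL": "CUST_NUM"}, "weekly")
    for table, rows in run_extract(build_demo_db(), spec).items():
        print(table, [tuple(r) for r in rows])
```

Driving the extract from one table and following its keys keeps the subset referentially complete: every extracted order has its customer in the same extract.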

Data Disguise Design

• Takes the fields to be disguised and begins to scope out exactly what will be done to these fields to create a disguised test environment.

• Identifies the specific disguise technique

• Identifies the selection criteria to be applied

• Identifies the field masking to be applied

• If any translations will be done, the Translation Table information is also documented (creation data, fields to be created, etc.).
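A translation table can be sketched as a two-column lookup that is filled the first time each original value is seen. This is an illustrative sketch, not the product's table layout: the `NAME_XLATE` table, its columns, the replacement pool, and the `translate` helper are all assumptions.

```python
import sqlite3

# Fictitious replacement pool; a real project defines its own value domain.
REPLACEMENT_NAMES = ["Alex Morgan", "Jamie Lee", "Sam Carter", "Taylor Reed"]


def create_translation_table(conn):
    """Create the lookup table that maps original values to replacements."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS NAME_XLATE ("
        "ORIGINAL TEXT PRIMARY KEY, REPLACEMENT TEXT NOT NULL)"
    )


def translate(conn, original):
    """Return the replacement for `original`, adding a mapping on first use."""
    row = conn.execute(
        "SELECT REPLACEMENT FROM NAME_XLATE WHERE ORIGINAL = ?", (original,)
    ).fetchone()
    if row:
        return row[0]
    count = conn.execute("SELECT COUNT(*) FROM NAME_XLATE").fetchone()[0]
    replacement = REPLACEMENT_NAMES[count % len(REPLACEMENT_NAMES)]
    if count >= len(REPLACEMENT_NAMES):
        # Pool exhausted: add a numeric suffix so replacements stay unique.
        replacement += f" {count // len(REPLACEMENT_NAMES) + 1}"
    conn.execute("INSERT INTO NAME_XLATE VALUES (?, ?)", (original, replacement))
    return replacement


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_translation_table(conn)
    for name in ["John Smith", "Mary Jones", "John Smith"]:
        print(name, "->", translate(conn, name))
```

Because the mapping is stored, the same CONTACT_NAME receives the same replacement wherever it appears, for example in both CUSTOMER_TBL and CUSTOMER_HIST_TBL. The table itself links real values to disguised ones, so it is one of the supporting files that needs its own security rules.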

Data Disguise Techniques

• Encrypt: Replace sensitive values with formulated data based on a user-defined key
• Translate: Replace sensitive values with meaningful, readable data using a translation table
• Age: Replace sensitive dates consistently while maintaining the integrity of a date field
• Mask: Conceal partial fields
• Generate: Generate fictitious data from scratch or from some other source
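To make the five techniques concrete, here is a minimal sketch of each in plain Python. These are illustrations under stated assumptions, not the behavior of any particular product: `encrypt_digits` is a keyed, repeatable digit substitution built on HMAC-SHA256 rather than reversible encryption, and every function name, key, and sample value is hypothetical.

```python
import hashlib
import hmac
import random
from datetime import date, timedelta


def encrypt_digits(value, key):
    """Replace each digit with a key-dependent digit; punctuation is kept.

    Stand-in for an encrypt rule: repeatable for the same value and key,
    not reversible, and distinct inputs can occasionally collide.
    """
    digest = hmac.new(key, value.encode(), hashlib.sha256).digest()
    out, i = [], 0
    for ch in value:
        if ch.isdigit():
            out.append(str(digest[i % len(digest)] % 10))
            i += 1
        else:
            out.append(ch)
    return "".join(out)


def translate(value, table):
    """Look up a readable replacement; raises KeyError if no entry exists."""
    return table[value]


def age_date(value, days):
    """Shift a date by a fixed offset so intervals between dates are preserved."""
    return value + timedelta(days=days)


def mask(value, visible=4, fill="X"):
    """Conceal all but the last `visible` characters of a field."""
    if visible <= 0:
        return fill * len(value)
    return fill * max(len(value) - visible, 0) + value[-visible:]


def generate_phone(rng):
    """Generate a fictitious number in the 555-0100 to 555-0199 range."""
    return f"555-01{rng.randrange(100):02d}"


if __name__ == "__main__":
    key = b"user-defined-key"  # hypothetical key
    print(encrypt_digits("123-45-6789", key))
    print(translate("John Smith", {"John Smith": "Alex Morgan"}))
    print(age_date(date(2011, 1, 31), 30))
    print(mask("123-45-6789"))
    print(generate_phone(random.Random(0)))
```

The choice among them usually follows from how the field is used: translated and encrypted values stay consistent across data sources, aged dates keep their relative order, and masked or generated values suit fields that nothing else joins on.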

Data Privacy_1.2.2.1_Disguise Rule Design

Data Privacy_1.2.2.3_Disguise Rule Design

Data Privacy_1.2.3.3_Data Load Design

Data Privacy_1.2.3.4_Data Load Design

Data Privacy Develop Phase

Develop Phase

Develop

[Diagram: Production (z/OS, Distributed); Subset Extract; Data Privacy Manager (Build, Test, Validate); Load Maintain Integrity; Test (z/OS, Distributed)]

Develop - z/OS Relationships

[Diagram: AR/RI; Production (z/OS)]

Develop - z/OS Extract

[Diagram: Production (z/OS); Subset Extract]

Develop - Distributed Related Extract

[Diagram: Production (Distributed); Subset Extract]

Develop - Disguise

[Diagram: Test Data Privacy Manager (Build, Test, Validate)]

Develop - z/OS Load

[Diagram: Disguised Extract; Load Maintain Integrity; Test (z/OS)]

Develop - Distributed Load

[Diagram: Extract File; Load Maintain Integrity; Test (Distributed)]
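One part of maintaining integrity during a load is ordering: parent rows must exist before the child rows that reference them. The sketch below derives a load order from foreign keys using `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later). The parent table for each foreign key is inferred from the column names in the sample data model, which is an assumption, and `load_order` is a hypothetical helper.

```python
from graphlib import TopologicalSorter

# Child table -> parent tables it references. Parent targets are inferred
# from the foreign key column names and are an assumption.
FOREIGN_KEYS = {
    "CUSTOMER_TBL": set(),
    "PART_TBL": set(),
    "CONTACT_TBL": {"CUSTOMER_TBL"},
    "ORDER_TBL": {"CUSTOMER_TBL"},
    "ORDER_LINE_TBL": {"ORDER_TBL", "PART_TBL"},
    "SUPPLIER_TBL": {"PART_TBL"},
}


def load_order(foreign_keys):
    """Return tables ordered so every parent is loaded before its children."""
    # static_order() raises graphlib.CycleError if the keys form a cycle.
    return list(TopologicalSorter(foreign_keys).static_order())


if __name__ == "__main__":
    for position, table in enumerate(load_order(FOREIGN_KEYS), start=1):
        print(position, table)
```

Deleting or refreshing existing test data would use the reverse order, children first.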

Validate Results

Execution Reports

Audit Reports

Data Privacy

Deliver Phase

Deliver

[Diagram: Subset Extract; Apply Privacy Rules (Data Privacy Manager); Load Maintain Integrity; environments on z/OS and Distributed labeled Production Test, System Test, Unit Test, QA Test, and Acceptance Test; Privacy Audit Reports]

Managing Delivery Tasks

[Diagram: Fictionalized Data for System, Unit, QA, and Acceptance; Privacy Audit Reports]

Deliver - Disguise Rule Administration

[Diagram: Disguise Rules; Test Data Privacy Manager]

Document - Extract & Disguise Reports

Document - Audit Reports

Data Privacy_1.4.1_Deliver Execution Sequence

Data Privacy_1.4.1.1_Deliver Execution Sequence

Data Privacy Solution

• Product Technology: Tools that can deliver quality data that meets the integrity, consistency and usability demands of your data privacy requirements

• Process: A clear strategy backed up by a methodology that serves as a roadmap or blueprint for an enterprise-wide data privacy initiative

• Expertise: The knowledge and experience to effectively manage the process and drive the technology to implement data privacy assurance in the application testing environment

© 2011 Compuware Corporation — All Rights Reserved
