Test Data Privacy Best Practices Methodology



Bill Mackey, Subject Matter Expert

Introduction

Why Do Companies Care About Data Privacy?

Worldwide Data Privacy Drivers

  • Regulatory compliance
      • United States: Gramm-Leach-Bliley Act, Sarbanes-Oxley Act
      • European Union Personal Data Protection Directive, 1998
      • Health Insurance Portability and Accountability Act (HIPAA)
      • Australia Privacy Amendment Act of 2000
      • Japanese Personal Information Protection Law
      • Canadian Personal Information Protection and Electronic Documents Act (PIPEDA)
  • Internal auditors are forcing data protection controls and procedures, especially for offshore use/outsourcing arrangements
  • Risk of exposure can cause significant damage: corporate embarrassment, lawsuits, negative press, fines/penalties, loss of customers, etc.

At Compuware, we are seeing three primary data privacy business drivers.

First, federal, state, and country regulations, such as the Sarbanes-Oxley Act, the Gramm-Leach-Bliley Act, and the European Union Personal Data Protection Directive, require companies to scrutinize the way they handle sensitive data.

Second, internal auditors are playing an important role in enforcing data protection controls and procedures, especially for offshore development or support.

Lastly, many companies are taking measures to protect personal or customer information because they want to mitigate the risk of exposure that could cause significant damage to their organization in the form of lawsuits, negative press, fines, penalties, and even loss of customers.

Data Breaches Reported Since the ChoicePoint Incident

  • 2,846 incidents reported between 2-15-05 and 1-19-12
  • 543,066,426 consumers impacted

The catalyst for reporting data breaches to the affected individuals has been the California law that requires notice of security breaches, the first of its kind in the nation, implemented July 2003. Personal information compromised includes data elements useful to identity thieves, such as Social Security numbers, account numbers, and driver's license numbers.

A Chronology of Data Breaches Reported Since the ChoicePoint Incident, Privacy Rights Clearinghouse, January 19, 2012

Privacy Rights Clearinghouse, a nonprofit consumer information and advocacy organization, tracks data breaches and has been doing so since the ChoicePoint incident in February of 2005. Its research indicates that there have been at least 480 reported data breaches from that starting point through April 8th, impacting over 153 million consumers.

The catalyst for reporting data breaches has been California Senate Bill 1386, which requires notice of security breaches to the California residents who are affected.

And of course, the kinds of information that thieves are looking for are data elements such as Social Security numbers, account numbers, and driver's license numbers.

How Are Companies Addressing This Issue?

  • Signing non-disclosure agreements
  • Restricting security access to sensitive/confidential data
  • Applying minimal de-identifying rules
  • Implementing a complete data disguise solution with processes and procedures

Scale: Low Effectiveness to High Effectiveness

But not all companies address data privacy in the same manner. Today, they are taking different actions, for different reasons:

  • Some are doing nothing, and assuming great risk.
  • Some require anyone with access to sensitive data to sign non-disclosure agreements, in an effort to shift the liability.
  • Some allow only minimal or no access to production data, while continuing to assume the risk for those who do have access.
  • Some simply blank out, X-out, or zero-out sensitive data fields.

But the organizations that are maximizing compliance and minimizing their risk are those that are implementing a complete solution, including a well-defined methodology that wraps processes and procedures around the technology.

Best Practices Approach to Data Privacy

Technology alone is not the answer

Services
  • Repeatable Best Practices
  • Assessment
  • Implementation
  • Superior Expertise with 3rd Party Software
  • Financial
  • Healthcare
  • Government
  • Meet dates within high risk projects

Technology
  • Related Data Extraction
  • Data Sub-setting
  • Data Format Conversion
  • Disguise Rules Definition
  • Common Rules Across the Enterprise
  • Unified Rules Repository
  • Support for Mainframe and Distributed Environments
  • Roles Based Authorization
  • Audit and Reporting

Methodology
  • Data Analysis: Analyze metadata; Discover PII; Classify data
  • Design: Associate disguise rules; Define extract criteria; Identify target environment(s); Identify load method(s); Define population strategy
  • Develop: Extract data and relationships; Apply rules across data sources; Load data
  • Deliver: Produce reports; Audit results; Enable best practices

Comprehensive Solution

Based upon numerous Data Privacy engagements, we've learned that it takes more than just technology to successfully implement a Test Data Privacy Solution.

It takes great tools, but more importantly, it takes a proven methodology to accurately complete the job. At Compuware we also have a Solutions Delivery Group that can deliver whatever level of service our customers need to make them successful.

Process: Data Privacy Methodology

  • Analyze: Understand each application's sensitive information
  • Design: Define strategies for disguising test data
  • Develop: Build the processes to disguise test data
  • Deliver: Deploy and maintain data protection processes

So let me introduce Compuware's Methodology for Data Privacy.

As with many other software development life cycles used by IT professionals, our methodology is structured in four major and easily recognizable phases as described in this slide: Analysis, Design, Development, and Deliver.

Regardless of the application development standards in place, our model integrates with existing practices, and provides a foundation for implementing Data Privacy Assurance in the Application Testing Environment.

Data Privacy Best Practices

Behind the methodology template, Compuware also provides a Test Data Privacy Best Practices Guide, which discusses each step of the process in detail and shares advice and recommendations based upon our experience having undertaken data privacy projects with many clients. As you can see from the highlighted Design Phase part, the guide is very detailed.

Data Privacy Project Plan

As part of the methodology, Compuware also delivers a Microsoft Project template that documents each phase of the project in detail. The template can easily be customized to accommodate client-specific needs and also acts as a central repository for the project documentation. This ensures that over time none of the analysis and design work is lost, and that it remains available to be referenced for subsequent disguise projects. Notice the four phases we mentioned on the previous slide. Each phase of the project breaks down into a series of tasks. These tasks are fully documented to help the project team understand the purpose and scope of each task as they work through the project.

Data Privacy

Best Practices Process Overview

Deployment Approaches

Two project approaches:

  • Progressive: Organizations that have large numbers of applications and multiple lines of business benefit more from a progressive approach. The progressive approach builds upon the success of early efforts, building up a library of disguise routines and process definitions that align with existing projects within the organization.
  • Parallel: Organizations that have small to medium numbers of applications benefit more from the parallel approach. The parallel approach covers a wider range of applications at the same time, which is possible when the applications are less intertwined or more independent.

Both approaches use a risk-based methodology.

Operational Structure

Centralized: A single team is responsible for performing the data masking function for all lines of business or application areas. This organization is also often referred to as a center of excellence model.

  Benefits
  • Fewer resources need to be trained on the data disguise software and activities
  • Increased control over consistency of the disguise techniques and behavior
  • Increased productivity of these resources as they work across applications

  Drawbacks
  • Increased effort during the Analyze phase as these resources gain the necessary application-centric knowledge
  • Increased duration: there are typically fewer of these resources, and more effort with fewer people results in a longer duration

Decentralized: Each application group is responsible for the data masking functions.

  Benefits
  • Existing application domain knowledge can be leveraged
  • The duration of the Analyze phase may be shortened, as activities can be performed in parallel
  • This model streamlines the communication model between the groups

  Drawbacks
  • Increased effort related to training
  • Increased demand on communications in order to maintain consistency

Process: How we get there

  • Establish an actionable roadmap
      • Determine the scope
      • Establish a strategy
      • Identify constraints (internal and external)
  • Select the technology
      • Recognized and adaptable
      • Support multiple environments, platforms, & techniques
  • Partner to gain the experience
      • Minimize first time hurdles, pit-falls, & dead-ends
      • Maximize analysis and design efficiency

Level Set: Us, the Market, and You

Best Practices: Achieving your objectives

Experience: What is being done

Project Overview Planning

Project Phases

  • Perform the Analyze methodology phase
      • Data Model Analysis
      • Function Model Analysis
  • Perform the Design methodology phase
      • Design extract process
      • Design disguise techniques
      • Design load process
  • Perform the Develop methodology phase
      • Creation and population of Translation/Association tables
      • Creation and population of Encryption keys
      • Development and Unit Testing of Extract/Disguise/Load tasks
  • Perform the Deliver methodology phase
      • Create the repeatable process

Author: Stuart Feravich, Bill Petty
Date: March 28, 2011

  • Introductions: All participants, including their job function
  • Project Overview: Anjan and Stuart; quick recap, including the overall project schedule and format
  • Analyze Overview: Stuart and Gigi; discuss the methodology and review the deliverables
  • Roles and Responsibilities: Stuart, Gigi, and Anjan; we would like to put names to the roles for the Franklin Templeton resources
  • Analyze Workshop: FT resources, Bill Petty, and Stuart; take one object (table/file) from the GID or INVESTAR application and step through the workbooks

Data Privacy

Analysis Phase

Analysis

The Analysis phase can be broken down into two major activities: identification and documentation of the data model (DM), and identification and documentation of the functional model (FM) components of the application. These two activities provide the cornerstone for a Data Privacy initiative and, as such, are arguably the most critical of the entire project scope.

Managing Analysis Tasks

Don't worry, the project template supplied with the methodology will guide you. If you look closely, you'll see a breakdown of Analysis activities in the project plan, illustrating how each task has an informational note to describe it. The notes icon next to each task points to additional information to help the project team fully understand the purpose and scope of each task as they work through the project.

In addition to these notes, each task provides links to additional documentation templates, such as spreadsheets and workbooks, where the project team can record additional documentation, or retrieve any other objects that can be accessed through a hyperlink. The project plan is customizable to our clients' specific needs.

As you can see, this project template is a very well thought-out guide and eliminates the costly "How do we get started?" and "What's next?" type problems encountered in many unmanaged disguise projects.

These documents are part of the reason we've been so successful in this space, and a big reason our customers have made us the market leader.

Data Model Analysis

The goal of the Data Model Analysis activities is to provide knowledge about the environment's data:

  • Determine the elements that are considered sensitive
  • Define their association to other data objects

Data Privacy_1.1.1.4_Data_Model_Analysis

Function Model Analysis

Function Model Analysis identifies and documents information about the application processes:

  • Determine what business rules and logic apply to the data considered sensitive or private
  • Outline how the affected data should be changed
  • Identify all data validations and checks done against sensitive fields within the application programs
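Validation checks matter because a disguised value that fails an application edit check makes the test data unusable. As an illustration only, suppose Function Model Analysis finds that the application validates card numbers with the Luhn checksum (an assumption; the slides do not name a specific check). A minimal sketch of producing a replacement value that still passes that check, with hypothetical function names:

```python
def luhn_check_digit(payload: str) -> str:
    """Return the Luhn check digit for a digit string that lacks one."""
    total = 0
    # Walk right to left, doubling every other digit starting with the rightmost.
    for i, ch in enumerate(reversed(payload)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return str((10 - total % 10) % 10)


def luhn_valid(number: str) -> bool:
    """True if the full number (payload plus check digit) passes the Luhn check."""
    return number.isdigit() and luhn_check_digit(number[:-1]) == number[-1]


def disguise_card(number: str, replacement_payload: str) -> str:
    """Swap in a fictitious payload of the same length and recompute the check digit."""
    if len(replacement_payload) != len(number) - 1:
        raise ValueError("replacement payload must match the original length")
    return replacement_payload + luhn_check_digit(replacement_payload)


disguised = disguise_card("4111111111111111", "400000000000000")
```

The same idea applies to any edit check found during analysis: the disguise rule has to reproduce whatever the application programs verify.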

Analysis Tasks

[Data model diagram: seven tables and their columns]

  • CONTACT_TBL: CUSTOMER_NUMBER (PK, FK1), CONTACT_ID (PK), CONTACT_NAME, TITLE, CONTACT_CODE, ADDRESS, CITY, STATE, ZIP_CODE, COUNTRY, AREA_CODE, TELEPHONE_NUM
  • PART_TBL: PART_NUMBER (PK), PART_NAME, EFFECT_DATE, EQUIVALENT_PART, PURCH_PRICE, SETUP_COST, LABOR_COST, UNIT_OF_MEASURE, MATERIAL_COST, REWORK_COST, AVAILABILITY_IND, ENGR_DRAW_NUM
  • ORDER_LINE_TBL: ORDER_NUM (PK, FK1), ORDER_LINE_NUMBER (PK), PART_NUM (FK2), PLAN_QTY, UNITS_COMPLETE, UNITS_STARTED, SCRAP_QTY, START_DATE, LINE_STATUS
  • CUSTOMER_HIST_TBL: CUSTOMER_ROWID, CUSTOMER_NUMBER, COMPANY_NAME, TELEPHONE_NUM, CONTACT_NAME, CONTACT_TITLE
  • SUPPLIER_TBL: PART_NUMBER (PK, FK1), SUPPLIER_CODE (PK), SUPPLIER_NAME, SUPPLIER_MODEL_NUM, WHOLESALE_PRICE, DISCOUNT_QUANTITY, PREFERRED_SUPPLIER, LEAD_TIME, LEAD_TIME_UNITS
  • ORDER_TBL: ORDER_NUMBER (PK), CUST_NUM (FK1), SOC_SEC_NUM, CREDIT_CARD_NUM, MOTHERS_MAID_NAME, ORD_TYPE, ORD_DATE, ORD_STAT, ORD_AMOUNT, ORD_DEPOSIT, ORD_LINE_COUNT, SHIP_CODE, SHIP_DATE, ORD_DESCRIPTION
  • CUSTOMER_TBL: CUSTOMER_NUMBER (PK), COMPANY_NAME, ADDRESS, CITY, STATE, ZIP_CODE, COUNTRY, AREA_CODE, TELEPHONE_NUM, CONTACT_NAME, CONTACT_TITLE, CONTACT_ADDR, CONTACT_CITY, CONTACT_STATE, CONTACT_ZIP, CONTACT_COUNTRY, CONTACT_AREA_CD, CONTACT_TELEPHONE

  • Data Modeling Tools
  • Data Management Tools
  • File-AID/DB2 / DBA-Xpert Impact Analysis
  • File-AID/Data Solutions Analysis

In summary, the Analysis phase is about understanding the application data and locating the sensitive information.

Here's an example of just three of the types of resources that could help accomplish individual tasks. As part of the analysis project, a hyperlink to a data model diagram, a DBA-Xpert database impact report, or a File-AID Data Analysis Report can be created and accessed in order to document data relationships defined within an application. No matter where the information resides, the project plan will help to obtain and document it.
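To make the "locate the sensitive information" step concrete, here is a minimal sketch that flags candidate columns by name, using a few columns from the sample data model. The pattern list and category names are assumptions for illustration, not rules from the methodology or from any Compuware product.

```python
import re

# Hypothetical name patterns; a real project would refine these during the Analyze phase.
SENSITIVE_PATTERNS = {
    "national_id": re.compile(r"SOC_SEC"),
    "payment_card": re.compile(r"CREDIT_CARD"),
    "name": re.compile(r"(CONTACT|COMPANY)_NAME|MAID_NAME"),
    "phone": re.compile(r"TELEPHONE"),
    "address": re.compile(r"ADDR|ZIP"),
}

# A subset of the columns shown in the sample data model.
SCHEMA = {
    "ORDER_TBL": ["ORDER_NUMBER", "CUST_NUM", "SOC_SEC_NUM",
                  "CREDIT_CARD_NUM", "MOTHERS_MAID_NAME", "ORD_DATE"],
    "CUSTOMER_TBL": ["CUSTOMER_NUMBER", "COMPANY_NAME", "ADDRESS",
                     "ZIP_CODE", "TELEPHONE_NUM", "CONTACT_NAME"],
    "PART_TBL": ["PART_NUMBER", "PART_NAME", "PURCH_PRICE"],
}


def classify(schema):
    """Return {table: [(column, category), ...]} for columns whose names match a pattern."""
    findings = {}
    for table, columns in schema.items():
        for column in columns:
            for category, pattern in SENSITIVE_PATTERNS.items():
                if pattern.search(column):
                    findings.setdefault(table, []).append((column, category))
                    break  # first matching category wins
    return findings


findings = classify(SCHEMA)
```

Name matching only produces candidates. Columns with uninformative names still need to be found by inspecting the data itself and by talking to the application owners.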

Utilize Technology For Analysis

Understand the Sensitive Elements

Document Analysis Results

Data Privacy_1.1.1.5_Data_Model_Analysis

Design Overview

Design is the second phase of the Compuware Data Privacy Best Practices methodology, and it is broken down into three major activities:

  • Documentation of the data extracts to be created
  • Identification and documentation of the data disguise rules to be created/implemented
  • Documentation of the data loads to be created

These activities provide the background for the creation of the actual rules and specifications to create a disguised copy of the data.

Design

Define the application disguise strategy and process:

  • Field-level disguise rules (encrypt, translate, age, generate)
  • Source extract criteria for data (filters, naming conventions, etc.)
  • Security rules for supporting files
  • Structure, value domain (content), and population strategy for translate table(s)
  • Target environment(s) and load method(s) to be used

The Design phase focuses on defining a strategy for the privacy solution, determining things like: which method of disguise to use on specific fields (I'll discuss this further in a minute); how data is to be selected and extracted from the source environment; if we are to use replacement data values, where these values are to come from; and whether we need to set up any specific security rules to protect project data that may itself be sensitive. The design also covers how the target environment is to be built and loaded.

Managing Design Tasks

As the project transitions into the Design phase, you can see that the breakdown of Design tasks includes the same type of embedded information, leading the project team through the definition of privacy strategies and the specification of detailed extract, disguise, and load rules.

Because the methodology flows from one activity to the next, a properly executed and thoroughly documented Analysis will lead to the creation of accurate and reusable design blueprints for each application disguise strategy.

This in turn will enable a much more rapid and less costly development phase.

Data Extract Design

Data Extract Design identifies the information required to extract the data from the original source tables/files/environments. It includes the following:

  • Environmental data (region, subsystem, server, etc.)
  • Driving object identification (which table/file do we drive the extract from)
  • Selection criteria information
  • Extract-specific information needed to pull the needed information from the source tables/files

Finally, the overall extract execution strategy is documented (when to execute, frequency of execution, etc.).
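One way to keep an extract design reviewable is to record it as structured data rather than free text. The sketch below is only an illustration: the `ExtractDesign` class, its field names, and the sample values (`PROD`, `DB2P`) are hypothetical and are not part of the Compuware project templates.

```python
from dataclasses import dataclass, field


@dataclass
class ExtractDesign:
    """One extract design record covering the items listed above."""
    environment: dict            # e.g. region, subsystem, server
    driving_object: str          # table/file the extract is driven from
    selection_criteria: str      # filter applied to the driving object
    extract_columns: list = field(default_factory=list)  # extract-specific details
    schedule: str = ""           # execution strategy: when and how often

    def missing_items(self) -> list:
        """List the design items that have not been documented yet."""
        missing = []
        if not self.environment:
            missing.append("environment")
        if not self.driving_object:
            missing.append("driving_object")
        if not self.selection_criteria:
            missing.append("selection_criteria")
        if not self.schedule:
            missing.append("schedule")
        return missing


design = ExtractDesign(
    environment={"region": "PROD", "subsystem": "DB2P"},  # hypothetical values
    driving_object="CUSTOMER_TBL",
    selection_criteria="STATE = 'MI'",
)
gaps = design.missing_items()
```

A check like `missing_items` makes it easy to see which design decisions are still open before the Develop phase starts.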

Data Disguise Design

Data Disguise Design takes the fields to be disguised and begins to scope out exactly what will be done to these fields to create a disguised test environment. It identifies:

  • The specific disguise technique
  • The selection criteria to be applied
  • The field masking to be applied

If any translations will be done, the Translation Table information is also documented (creation data, fields to be created, etc.).

Data Disguise Techniques

  • Encrypt: Replace sensitive values with formulated data based on a user-defined key
  • Translate: Replace sensitive values with meaningful, readable data using a translation table
  • Age: Replace sensitive dates consistently while maintaining the integrity of a date field
  • Mask: Conceal partial fields
  • Generate: Generate fictitious data from scratch or from some other source

This is a closer look at some of the various disguise methods that are available to assist in building and implementing disguise rules. These methods are all available in our File-AID products. The specific method chosen will be determined during the design phase, based on specific privacy needs, and then implemented during the development phase.

You may need to expand on some of these. This is for reference purposes only.

  • Encryption: Replace sensitive values with formulated data based on a user-defined key. This is much like scrambling the data.
  • Translation: Replace sensitive values with meaningful, readable data using a translation table / lookup table.
  • Aging: Replace sensitive dates consistently while maintaining the integrity of a date field. This allows dates to be aged according to the business need.
  • Masking: Conceal only portions of a field by X-ing certain portions out.
  • Generation: Generate fictitious data from scratch or from some other source.
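The five techniques can be illustrated with a few lines each. This is a minimal sketch under stated assumptions, not how the File-AID products implement them: all function names are hypothetical, and `keyed_scramble` is a one-way HMAC-based stand-in for the "Encrypt" technique, so it is not reversible and distinct inputs can occasionally collide.

```python
import hashlib
import hmac
import random
from datetime import date, timedelta


def mask(value: str, visible: int = 4, fill: str = "X") -> str:
    """Mask: conceal all but the last `visible` characters."""
    if visible <= 0:
        return fill * len(value)
    return fill * max(len(value) - visible, 0) + value[-visible:]


def age(value: date, days: int) -> date:
    """Age: shift a date by a fixed offset so intervals between dates are preserved."""
    return value + timedelta(days=days)


def translate(value: str, table: dict) -> str:
    """Translate: replace a value with a readable one from a translation table."""
    return table[value]


def keyed_scramble(digits: str, key: bytes) -> str:
    """Stand-in for Encrypt: derive same-length digits from a keyed hash of the value."""
    digest = hmac.new(key, digits.encode(), hashlib.sha256).hexdigest()
    return "".join(str(int(c, 16) % 10) for c in digest[:len(digits)])


FIRST_NAMES = ["Alex", "Jordan", "Morgan"]
LAST_NAMES = ["Rivera", "Chen", "Novak"]


def generate_name(rng: random.Random) -> str:
    """Generate: build a fictitious name from scratch."""
    return f"{rng.choice(FIRST_NAMES)} {rng.choice(LAST_NAMES)}"


masked = mask("123456789")
aged = age(date(2011, 3, 28), 30)
translated = translate("ACME CORP", {"ACME CORP": "NORTHWIND LTD"})
scrambled = keyed_scramble("123456789", b"user-defined-key")
fictitious = generate_name(random.Random(7))
```

Because `keyed_scramble` is deterministic for a given key, the same source value always produces the same disguised value, which is what allows related fields to stay consistent.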

Data Privacy_1.2.2.1_Disguise Rule Design

Data Privacy_1.2.2.3_Disguise Rule Design

Data Privacy_1.2.3.3_Data Load Design

Data Privacy_1.2.3.4_Data Load Design

Data Privacy

Develop Phase

Develop

[Diagram: Production (z/OS, Distributed), Data Privacy Manager (Build, Test, Validate), Subset Extract, Load, Maintain Integrity, Test (z/OS, Distributed)]

The first step in Development is obtaining the source data. This involves the construction of extract processes while taking into consideration the existing complex relationships, and applying selection criteria techniques to obtain the right set of data. These complex relationships include both the relationships defined within the database management system and the relationships coded into the application code.

As part of this, the actual disguise rules down to the field level need to be defined according to the strategies determined in the analysis and design phases.

The final step focuses on making the disguised data reach its destination, through a load process. This results in a quality test environment which maintains integrity, offers consistent results, adheres to privacy policy, and is useful.

Compuware's Test Data Privacy solution offers technology that allows data to be extracted, disguised, and loaded across both the mainframe and distributed environments.

Develop - z/OS Relationships

[Diagram: AR/RI, Production, z/OS]

On the mainframe side, we have a relationship repository that contains details of all of the database-managed relationships and application-managed relationships for our environment. This is the part that ensures that we selectively extract the data we need in order for our application to run. This information is also used in the disguise process to make sure that new disguised values are correctly propagated across the related fields and columns, which maintains data integrity.
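The propagation idea can be sketched in a few lines: every column that takes part in a relationship is disguised through one shared mapping, so parent and child rows still join afterwards. This is a simplified illustration with hypothetical names; the relationship list stands in for the repository, and the column names come from the sample data model.

```python
from itertools import count

# Stand-in for the relationship repository: columns that must stay consistent.
RELATED_COLUMNS = [("CUSTOMER_TBL", "CUSTOMER_NUMBER"), ("ORDER_TBL", "CUST_NUM")]


def disguise_related(tables, related_columns, start=900000):
    """Replace values in all related columns using a single shared mapping."""
    mapping = {}
    next_value = count(start)
    for table, column in related_columns:
        for row in tables[table]:
            original = row[column]
            if original not in mapping:
                mapping[original] = next(next_value)  # first sighting gets a new value
            row[column] = mapping[original]
    return mapping


tables = {
    "CUSTOMER_TBL": [{"CUSTOMER_NUMBER": 101}, {"CUSTOMER_NUMBER": 102}],
    "ORDER_TBL": [{"CUST_NUM": 101}, {"CUST_NUM": 101}, {"CUST_NUM": 102}],
}
mapping = disguise_related(tables, RELATED_COLUMNS)
```

The mapping itself links original to disguised values, so it is sensitive project data and needs the same protection as the source.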

Develop - z/OS Extract

[Diagram: Production (z/OS), Subset Extract]

In building the extract, we have full control over where we start; which files or tables we wish to include or exclude; how we chase data between objects (for example, for a given relationship, do we want to select only parent rows?); what selection criteria and quantity limits we wish to impose; and whether we wish to disguise the data at extract time. Notice the word DISGUISED in the highlighted area. It indicates that this extract will also disguise the data at extract time.

The bottom line here is that this allows you to get just the data you need and ensure it meets your privacy obligations.
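A related subset extract can be sketched as follows: filter the driving table, apply a quantity limit, then chase each relationship from parent to child so that only rows belonging to the selected parents are pulled. The function, the in-memory tables, and the relationship tuples are assumptions for illustration; they use column names from the sample data model.

```python
def subset_extract(tables, driving_table, predicate, limit, relationships):
    """Extract a related subset.

    relationships: (child_table, child_fk, parent_table, parent_key) tuples,
    listed so that each parent is extracted before its children.
    """
    selected = [row for row in tables[driving_table] if predicate(row)]
    extract = {driving_table: selected[:limit]}
    for child, fk, parent, pk in relationships:
        parent_keys = {row[pk] for row in extract.get(parent, [])}
        extract[child] = [row for row in tables[child] if row[fk] in parent_keys]
    return extract


TABLES = {
    "CUSTOMER_TBL": [
        {"CUSTOMER_NUMBER": 1, "STATE": "MI"},
        {"CUSTOMER_NUMBER": 2, "STATE": "OH"},
        {"CUSTOMER_NUMBER": 3, "STATE": "MI"},
    ],
    "ORDER_TBL": [
        {"ORDER_NUMBER": 10, "CUST_NUM": 1},
        {"ORDER_NUMBER": 11, "CUST_NUM": 2},
        {"ORDER_NUMBER": 12, "CUST_NUM": 3},
    ],
    "ORDER_LINE_TBL": [
        {"ORDER_NUM": 10, "ORDER_LINE_NUMBER": 1},
        {"ORDER_NUM": 11, "ORDER_LINE_NUMBER": 1},
        {"ORDER_NUM": 12, "ORDER_LINE_NUMBER": 1},
    ],
}
RELATIONSHIPS = [
    ("ORDER_TBL", "CUST_NUM", "CUSTOMER_TBL", "CUSTOMER_NUMBER"),
    ("ORDER_LINE_TBL", "ORDER_NUM", "ORDER_TBL", "ORDER_NUMBER"),
]

extract = subset_extract(
    TABLES, "CUSTOMER_TBL", lambda row: row["STATE"] == "MI", 1, RELATIONSHIPS
)
```

Here the selection criterion matches two customers, the quantity limit keeps one, and only that customer's orders and order lines follow it into the extract.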

Develop - Distributed Related Extract

[Diagram: Production (Distributed), Subset Extract]

For organizations that have applications spanning mainframe and distributed platforms, the good news is that Compuware's Test Data Privacy solution also covers the distributed environment, in addition to the mainframe, offering an end-to-end solution. In the distributed world, our Test Data Privacy Solution supports Oracle, SQL Server, DB2 UDB, and Sybase. As with the mainframe, we support database-managed relationships as well as application-defined relationships, and offer full control over what we extract and how we extract it.

Develop - Disguise

[Diagram: Test Data Privacy Manager (Build, Test, Validate)]

As we discussed earlier, Compuware's Test Data Privacy solution provides many options for how you disguise the data. Generally, Date Aging, Translation, Generation, and Encryption are the main methods we see deployed, in addition to field exits, which allow custom-built disguise methods to be used as well. For example, an organization may have very specific requirements for how ID numbers or client account numbers are processed. Field exits allow these site-specific needs to be accommodated.

Develop - z/OS Load

[Diagram: Disguised Extract, Load, Maintain Integrity, Test (z/OS)]

Like the extract, the load process is very flexible. This is where we take the disguised data and populate the target environment. In our options here, we can control which objects we load. Just because we extracted something does not necessarily mean we wish to load it. For example, suppose that in our testing we have corrupted one table. We can come back and refresh just that table, without reloading all of the tables we originally extracted. We also have full control over what we do with existing data if we are loading into an existing environment. This can be helpful if we need to supplement the data we already have in order to satisfy additional test cases.
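The two load controls described here, choosing which objects to load and deciding what happens to existing data, can be sketched as follows. The `load` function and its `existing` option are hypothetical names for illustration, not the product's load options.

```python
def load(target, extract, objects=None, existing="replace"):
    """Load extracted tables into a target environment.

    objects:  names of the tables to load; None means load everything extracted.
    existing: "replace" refreshes a table, "append" supplements the rows already there.
    """
    if existing not in ("replace", "append"):
        raise ValueError(f"unknown option for existing data: {existing}")
    for name, rows in extract.items():
        if objects is not None and name not in objects:
            continue  # extracted but deliberately not loaded
        if existing == "replace":
            target[name] = list(rows)
        else:
            target.setdefault(name, []).extend(rows)
    return target


extract = {
    "CUSTOMER_TBL": [{"CUSTOMER_NUMBER": 900000}],
    "ORDER_TBL": [{"ORDER_NUMBER": 10, "CUST_NUM": 900000}],
}
target = {
    "CUSTOMER_TBL": [{"CUSTOMER_NUMBER": 900000}],
    "ORDER_TBL": [{"ORDER_NUMBER": -1, "CUST_NUM": None}],  # corrupted during testing
}

# Refresh only the corrupted table; CUSTOMER_TBL is left untouched.
load(target, extract, objects={"ORDER_TBL"}, existing="replace")
```

Calling `load` with `existing="append"` instead would keep the current rows and add the extracted ones, which matches the case of supplementing data for additional test cases.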

Develop - Distributed Load

[Diagram: Extract File, Load, Maintain Integrity, Test (Distributed)]

The load here in the distributed world is as flexible and easy to use as on the mainframe side. We even have the capability to use a mainframe extract file as input to the distributed load, making it really easy to mirror a mainframe-based table in the distributed environment.

Validate Results

Execution Reports

Audit Reports

Data Privacy

Deliver Phase

Deliver

[Diagram: Production, Data Privacy Manager (Apply Privacy Rules, Subset Extract, Load, Maintain Integrity), and the System Test, Unit Test, QA Test, and Acceptance Test environments on z/OS and Distributed platforms, with Privacy Audit Reports]

The Delivery phase is the implementation and execution of the Data Privacy Project.

Analysis has been completed, the extract, disguise, and load strategies have been designed, developed, tested, and validated; and now the process can be deployed across the different test environments.

This will provide the customer with an enterprise-wide process, that is well documented, repeatable, and can be audited for compliance.

Managing Delivery Tasks

[Diagram: System, Unit, QA, Acceptance; Fictionalized Data; Privacy Audit Reports]

The final phase in the project plan is the Delivery Phase.

This is where the privacy solution is rolled out across all of the application testing environments in the organization. Since the project templates are so carefully thought out and organized, the final outcome is a fully documented and auditable process that results in fictionalized test data that not only complies with the data privacy regulations, but seamlessly integrates into your current application testing standards and practices.

Deliver - Disguise Rule Administration

[Diagram: Disguise Rules, Test Data Privacy Manager]

Over time, requirements may change or disguise rules may need to be checked. Disguise rule administration allows rules to be easily viewed and/or changed. This enables organizations to adapt to an ever-changing application and legislative landscape.

Document - Extract & Disguise Reports

Reporting and auditing is another unique strength of Compuware's solution. We provide not only meaningful end-user reports, an example of which is shown here, but also critical audit reports.

Each type of report provides the level of detail needed by its intended audience. We'll see an example of an audit report on the next slide.

Examples like these can be captured and stored in the project templates, showing what was extracted and which fields had disguise rules applied against them.

Document - Audit Reports

To validate compliance, auditors and/or risk management personnel will require more detailed supporting documentation. This too can be generated and stored, if needed, for future reference.

In this slide, you can see that Customer Name and Social Security Number have been disguised. Notice also that audit reports are provided for both the mainframe and distributed environments.
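To show the kind of information such a report carries, here is a minimal sketch that compares rows before and after disguise and records which columns changed and how many values were affected. The report layout and the `audit_report` function are assumptions for illustration and do not reproduce the product's report format; the sample values are fictitious.

```python
def audit_report(table, before, after):
    """Summarize which columns were disguised, without recording original values."""
    if len(before) != len(after):
        raise ValueError("row counts differ; cannot compare")
    changed = {}
    for old, new in zip(before, after):
        for column in old:
            if old[column] != new[column]:
                changed[column] = changed.get(column, 0) + 1
    return {
        "table": table,
        "rows": len(before),
        "disguised_columns": sorted(changed),
        "changed_values": changed,
    }


before = [
    {"CUSTOMER_NAME": "Pat Example", "SOC_SEC_NUM": "000-00-0001", "ORD_DATE": "2011-03-28"},
    {"CUSTOMER_NAME": "Sam Sample", "SOC_SEC_NUM": "000-00-0002", "ORD_DATE": "2011-03-29"},
]
after = [
    {"CUSTOMER_NAME": "Alex Rivera", "SOC_SEC_NUM": "000-00-4417", "ORD_DATE": "2011-03-28"},
    {"CUSTOMER_NAME": "Jordan Chen", "SOC_SEC_NUM": "000-00-9032", "ORD_DATE": "2011-03-29"},
]

report = audit_report("ORDER_TBL", before, after)
```

The report stores only column names and counts, so it can be kept with the project documentation without itself becoming sensitive data.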

Data Privacy_1.4.1_Deliver Execution Sequence

Data Privacy_1.4.1.1_Deliver Execution Sequence

Data Privacy Solution

  • Product Technology: Tools that can deliver quality data that meets the integrity, consistency, and usability demands of your data privacy requirements
  • Process: A clear strategy backed up by a methodology that serves as a roadmap or blueprint for an enterprise-wide data privacy initiative
  • Expertise: The knowledge and experience to effectively manage the process and drive the technology to implement data privacy assurance in the application testing environment

We have also learned that in order for an organization to achieve those conditions successfully, a Data Privacy solution requires more than a de-identification, scrubbing, or masking tool.

Compuware offers a unique and comprehensive Data Privacy solution that involves a combination of processes, technology, and expertise to productively drive a data privacy initiative through all phases of its life cycle.

© 2011 Compuware Corporation. All Rights Reserved.