© 2011 IBM Corporation Information Management InfoSphere Optim Test Data Management Solution– IMS Focus Peter Costigan – Product Line Manager, Optim Solutions 9/28/2011

Optim test data management for IMS 2011

Page 1: Optim test data management for IMS 2011

© 2011 IBM Corporation

Information Management

InfoSphere Optim Test Data Management Solution– IMS Focus

Peter Costigan – Product Line Manager, Optim Solutions, 9/28/2011

Page 2: Optim test data management for IMS 2011

Agenda

Information Governance Overview

Risks and Challenges of Poor Test Data Management

Best Practices in Test Data Management

InfoSphere Optim Test Data Management

Data Privacy Concerns with Non-Production Data

IMS and z/OS Considerations

Other InfoSphere Optim Solutions: Discovery, Archiving, Application Retirement

Conclusion

Page 3: Optim test data management for IMS 2011

Mastering information across the Information Supply Chain

[Diagram: information from Transactional & Collaborative Applications, Business Analytics Applications, and External Information Sources passes through Manage, Integrate, and Analyze stages to become Trusted, Relevant, and Governed. Elements shown: Data, Content, Master Data, Data Warehouses, Cubes, Streams, Big Data, Streaming Information, Content Analytics. Information Governance (Govern) covers Quality, Security & Privacy, Lifecycle, and Standards.]

Page 4: Optim test data management for IMS 2011

Requirements to manage data across its lifecycle

Information Governance Core Disciplines: Lifecycle Management

Lifecycle phases:
– Discover & Define
– Develop & Test
– Optimize, Archive & Access
– Consolidate & Retire

Requirements shown around the lifecycle:
– Discover where data resides
– Classify & define data and relationships
– Define policies
– Develop database structures & code
– Create & refresh test data
– Validate test results
– Enhance performance
– Manage data growth
– Report & retrieve archived data
– Enable compliance with retention & e-discovery
– Move only the needed information
– Integrate into single data source

Page 5: Optim test data management for IMS 2011

How test data creation is often accomplished

[Diagram: Clone, from the Production Database to Test and Development databases.]

Positives
– Simple to do
– Requires little knowledge of the data model or infrastructure
– Creates an exact duplication of production

Negatives
– Uses more storage than needed, multiple times
– Production data is a privacy risk
– Data model changes are expected in Dev/Test, but require significant manual rework
– Takes much time to create and refresh
– No way to compare to original after test is complete
– Cannot span multiple data sources/applications
– Developer/Tester downtime when sharing data accessibility

Page 6: Optim test data management for IMS 2011

Test Data Management Best Practices

TDM refers to the need to manage data used in testing and other non-production environments

Extract related subsets of production data that are targeted to functionality under test

De-identify / mask related test data to protect privacy

Quickly and easily refresh test environments

Edit data to create error and boundary conditions

Compare “before” and “after” images of test data

Benefits: Improving application quality & customer satisfaction
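
The "before" and "after" comparison listed above can be illustrated with a minimal sketch. It assumes each image of the test data is held as a dictionary keyed by primary key; the function name `compare_images` and the sample rows are hypothetical and are not a representation of Optim's own Compare facility.

```python
def compare_images(before, after):
    """Diff two snapshots of a table, each a dict of primary key -> row dict."""
    inserted = sorted(k for k in after if k not in before)
    deleted = sorted(k for k in before if k not in after)
    changed = {}
    for key in before.keys() & after.keys():
        # Record (old, new) for every column whose value differs.
        diffs = {
            col: (before[key].get(col), after[key].get(col))
            for col in before[key].keys() | after[key].keys()
            if before[key].get(col) != after[key].get(col)
        }
        if diffs:
            changed[key] = diffs
    return {"inserted": inserted, "deleted": deleted, "changed": changed}


# Hypothetical test data captured before and after a test run.
before = {1: {"name": "A", "balance": 100}, 2: {"name": "B", "balance": 50}}
after = {1: {"name": "A", "balance": 75}, 3: {"name": "C", "balance": 10}}
result = compare_images(before, after)
```

Reporting inserted, deleted, and changed rows separately lets a tester check that the application touched only the rows the test case expected.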

Page 7: Optim test data management for IMS 2011

Optim Captures Complete Business Objects

Business data is related across a wide variety of data sources
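
As a rough illustration of extracting a complete, related subset rather than a full clone, the sketch below selects a few parent rows and follows the relationships down to their dependent rows. The `customers`, `orders`, and `items` tables and the `extract_subset` function are hypothetical, and the example uses SQLite only because it ships with Python; it does not show how Optim itself performs an extract.

```python
import sqlite3


def extract_subset(conn, customer_ids):
    """Return the chosen customers plus every order and order item that belongs to them."""
    ids = list(customer_ids)
    marks = ",".join("?" for _ in ids)  # one placeholder per selected key
    customers = conn.execute(
        f"SELECT id, name FROM customers WHERE id IN ({marks}) ORDER BY id", ids
    ).fetchall()
    orders = conn.execute(
        f"SELECT id, customer_id FROM orders WHERE customer_id IN ({marks}) ORDER BY id", ids
    ).fetchall()
    items = conn.execute(
        "SELECT i.id, i.order_id FROM items i "
        "JOIN orders o ON o.id = i.order_id "
        f"WHERE o.customer_id IN ({marks}) ORDER BY i.id", ids
    ).fetchall()
    return {"customers": customers, "orders": orders, "items": items}


# Hypothetical three-level data model standing in for "production".
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER REFERENCES customers(id));
    CREATE TABLE items (id INTEGER PRIMARY KEY, order_id INTEGER REFERENCES orders(id));
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO orders VALUES (10, 1), (11, 2), (12, 1);
    INSERT INTO items VALUES (100, 10), (101, 11), (102, 12);
""")
subset = extract_subset(conn, [1])
```

Because every child row is selected through its parent, the subset stays referentially intact when loaded into a test database.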

Page 8: Optim test data management for IMS 2011

InfoSphere Optim Test Data Management Solution

– Create targeted, right-sized test environments
– Automate support for Data Model changes
– Replace sensitive data with masked data
– Refresh, reset and maintain test environments
– Compare and resolve application defects
– Accelerate release schedules

[Diagram: a 2 TB Production or Production Clone database feeds right-sized environments (Development, Unit Test, Training, Integration Test; sizes shown: 25 GB, 25 GB, 50 GB, 100 GB). Steps shown: Extract related subsets, Mask / Remap, Insert / Update / Load, Compare.]

Page 9: Optim test data management for IMS 2011

Business benefits of Test Data Management

More time for testing
– In many organizations, 30-40% of test script execution time is spent on manufacturing new test data. Test Data Management reduces the time spent creating new data, allowing more tests to be executed.

Reduce cost
– Maximize allocated disk space
– Catch errors earlier in the testing cycle
– Shift errors from production to test

Increase data quality
– Refreshing test data from a baseline minimizes the manual intervention currently required when creating new test data, reducing triaging efforts and increasing test repeatability.

Enforce data ownership
– Often the “honor system” and spreadsheets are used to control test data ownership. Test Data Management offers role-driven security to support level segmentation of the development and testing teams.

Reduce data dependencies across test sets
– Multiple test sets often use the same data, but different tests can negatively impact other tests using the same data. Test Data Management allows for the creation of an unlimited number of test data sets and can create unique IDs each time to ensure clean data is used when testing.

Page 10: Optim test data management for IMS 2011

TDM Business Value Assessment: Detailed Financial Analysis

Page 11: Optim test data management for IMS 2011

Sensitive Production Data: What’s the risk?

Hackers obtained personal information on 70 million subscribers.
April 2011: Malicious outsiders stole name, address (city, state, zip), country, email address, birth date, PlayStation Network/Qriocity password and login, and handle/PSN online ID, and possibly credit card numbers from 70 million Sony PlayStation users.

SQL injection is fast becoming one of the biggest and most high-profile web security threats.
April 2011: A mass SQL injection attack that initially compromised 28,000 websites shows no sign of slowing down. Known as LizaMoon, this malicious code is after anything stored in a database.

Unprotected test data sent to and used by test/development teams as well as third-party consultants.
February 2009: An FAA server used for application development & testing was breached, exposing the personally identifiable information of 45,000+ employees.

Hundreds of thousands of secret reports regarding US wars in Iraq and Afghanistan published on WikiLeaks.
December 2010: A private in the US military downloaded top secret military documents and passed them to a journalist for publication. This puts US national security at risk as well as the lives of those named in the reports.

Page 12: Optim test data management for IMS 2011

What is data masking?

Definition
Method for creating a structurally similar but inauthentic version of an organization's data. The purpose is to protect the actual data while having a functional substitute for occasions when the real data is not required.

Requirement
Effective data masking requires data to be altered in such a way that the actual values cannot be determined or reverse-engineered, while functional appearance is maintained.

Other Terms Used
Obfuscation, scrambling, data de-identification

Commonly masked data types
Name, address, telephone, SSN/national identity number, credit card number

Methods
– Static Masking: Extracts rows from production databases, obfuscating data values that ultimately get stored in the columns in the test databases
– Dynamic Masking: Masks specific data elements on the fly without touching applications or the physical production data store
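
The difference between the two methods can be sketched as follows. This is an illustration under stated assumptions, not how any product implements masking: `mask_ssn`, `static_mask`, and `dynamic_read` are hypothetical names, and the hash-based substitution is chosen only because it is deterministic and keeps the SSN format.

```python
import hashlib


def mask_ssn(ssn):
    """Replace an SSN with a deterministic, same-format fake derived from a hash.

    Illustration only: an unsalted hash over a small value space can be
    reversed by enumeration, so this is not a production-grade technique.
    """
    digest = hashlib.sha256(ssn.encode("utf-8")).hexdigest()
    digits = str(int(digest, 16))[-9:].zfill(9)
    return f"{digits[:3]}-{digits[3:5]}-{digits[5:]}"


def static_mask(rows):
    """Static masking: build a separate, masked copy to load into the test database."""
    return [{**row, "ssn": mask_ssn(row["ssn"])} for row in rows]


def dynamic_read(rows, user_is_privileged):
    """Dynamic masking: mask at read time; the stored rows are never modified."""
    for row in rows:
        yield row if user_is_privileged else {**row, "ssn": mask_ssn(row["ssn"])}


# Hypothetical production rows.
production = [{"id": 1, "ssn": "123-45-6789"}, {"id": 2, "ssn": "333-22-4444"}]

test_db = static_mask(production)  # persisted masked copy
masked_view = list(dynamic_read(production, user_is_privileged=False))  # masked on the fly
```

In the static case the masked values are what the test database stores; in the dynamic case `production` still holds the real values and only the reader's view is altered.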

Page 13: Optim test data management for IMS 2011

InfoSphere Optim Data Masking Solution / Option

Data masking techniques include:
– String literal values
– Character substrings
– Random or sequential numbers
– Arithmetic expressions
– Concatenated expressions
– Date aging
– Lookup values
– Generic mask

Examples 1 and 2 (as labeled on the slide)

Data is masked with contextually correct data to preserve integrity of test data

Patient Information (one record, shown on the slide in two versions)

Patient No.  SSN          Name            Address            City    State  Zip
112233       123-45-6789  Amanda Winters  40 Bayberry Drive  Elgin   IL     60123
123456       333-22-4444  Erica Schafer   12 Murray Court    Austin  TX     78704

Referential integrity is maintained with key propagation

Personal Info Table (original)

PersNbr  FirstName  LastName
08054    Alice      Bennett
19101    Carl       Davis
27645    Elliot     Flynn

Personal Info Table (masked)

PersNbr  FirstName  LastName
10000    Jeanne     Renoir
10001    Claude     Monet
10002    Pablo      Picasso

Event Table (original)

PersNbr  FstNEvtOwn  LstNEvtOwn
27645    Elliot      Flynn
27645    Elliot      Flynn

Event Table (masked)

PersNbr  FstNEvtOwn  LstNEvtOwn
10002    Pablo       Picasso
10002    Pablo       Picasso

– Satisfy Privacy regulations
– Reduce risk of data breaches
– Maintain value of test data
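
Key propagation can be sketched with the slide's own sample values: the parent table's key is replaced with a sequential number, names come from a lookup list, and the same substitutions are carried into the child table so the rows still join. The function name `mask_with_key_propagation` and the dictionary-based row format are assumptions made for this illustration.

```python
def mask_with_key_propagation(parent_rows, child_rows, lookup_names, first_key=10000):
    """Mask parent keys and names, then propagate the same values to child rows."""
    key_map = {}
    masked_parents = []
    for offset, row in enumerate(parent_rows):
        new_key = str(first_key + offset)  # sequential-number technique
        first, last = lookup_names[offset % len(lookup_names)]  # lookup-value technique
        key_map[row["PersNbr"]] = new_key
        masked_parents.append({"PersNbr": new_key, "FirstName": first, "LastName": last})

    by_key = {p["PersNbr"]: p for p in masked_parents}
    masked_children = []
    for row in child_rows:
        parent = by_key[key_map[row["PersNbr"]]]  # same mapping as the parent table
        masked_children.append({
            "PersNbr": parent["PersNbr"],
            "FstNEvtOwn": parent["FirstName"],
            "LstNEvtOwn": parent["LastName"],
        })
    return masked_parents, masked_children


personal_info = [
    {"PersNbr": "08054", "FirstName": "Alice", "LastName": "Bennett"},
    {"PersNbr": "19101", "FirstName": "Carl", "LastName": "Davis"},
    {"PersNbr": "27645", "FirstName": "Elliot", "LastName": "Flynn"},
]
events = [
    {"PersNbr": "27645", "FstNEvtOwn": "Elliot", "LstNEvtOwn": "Flynn"},
    {"PersNbr": "27645", "FstNEvtOwn": "Elliot", "LstNEvtOwn": "Flynn"},
]
lookup = [("Jeanne", "Renoir"), ("Claude", "Monet"), ("Pablo", "Picasso")]

masked_people, masked_events = mask_with_key_propagation(personal_info, events, lookup)
```

Keeping a single old-key-to-new-key map is the design point: every table that references `PersNbr` is masked through the same map, so joins give the same results after masking as before.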

Page 14: Optim test data management for IMS 2011

What is IMS Data to InfoSphere Optim?

IMS = Hierarchical Database
– Database consists of segments
– Segments are related (physically)

Optim uses a relational model of tables, rows and columns

Optim Distributed uses Middleware to access IMS. More tied to relational model.

Optim z/OS uses native (DL/I) access to IMS data.

[Diagram: IMS segments EMPLOYEE, DEPARTMENT, and JOB, each drawn as a table of rows.]
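
To make the hierarchical-versus-relational point concrete, the sketch below flattens a nested segment structure into one table per segment type, with each child row carrying its parent's key. The parent/child arrangement of DEPARTMENT, EMPLOYEE, and JOB, the field names, and the `flatten` function are all assumptions for illustration; the slide does not state the actual hierarchy, and this is not how Optim or DL/I access is implemented.

```python
def flatten(segment_type, segment, parent_key=None, tables=None):
    """Walk a nested segment and emit one relational row per segment occurrence."""
    if tables is None:
        tables = {}
    row = dict(segment["fields"])
    if parent_key is not None:
        # The physical parent/child link becomes an explicit foreign key column.
        row["parent_key"] = parent_key
    tables.setdefault(segment_type, []).append(row)
    for child_type, children in segment.get("children", {}).items():
        for child in children:
            flatten(child_type, child, segment["fields"]["key"], tables)
    return tables


# Hypothetical database record: one root segment with dependent segments.
department = {
    "fields": {"key": "D01", "name": "Sales"},
    "children": {
        "EMPLOYEE": [
            {
                "fields": {"key": "E1", "name": "Ann"},
                "children": {"JOB": [{"fields": {"key": "J1", "title": "Rep"}}]},
            },
            {"fields": {"key": "E2", "name": "Bob"}},
        ]
    },
}

tables = flatten("DEPARTMENT", department)
```

Once each segment type is a table with a parent key, relational operations such as subsetting and masking can treat the parent/child link like any other relationship.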

Page 15: Optim test data management for IMS 2011

InfoSphere Optim z/OS IMS Definitions

Legacy Table Definition(s)

Describes physical layout of segment

Create from COBOL or PL/I copybook
– Associated with IMS segment
– Definition stored in the Optim Directory

Relate to other tables (DB2 or Legacy) via Optim Relationship

Segment treated as virtual DB2 table by any Optim process

[Diagram labels: IMS DB OPT.PROD.PSTDEPDB; VSAM File OPT.PROD.VENDITEM; copybooks; legacy tables EMPLOYEE and VENDITEM; Optim Directory containing IMS Definitions, Legacy Tables, Relationships, Maps, and Definitions.]
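
The idea of describing a segment's physical layout so it can be read as table rows can be sketched as below. The field names, lengths, and the `parse_segment` function are hypothetical, and the example handles only fixed-width character fields; real copybooks also involve EBCDIC encoding and numeric usages such as packed decimal, which this sketch deliberately ignores.

```python
# Hypothetical layout, as might be derived from a copybook such as:
#   05 EMP-ID    PIC X(5).
#   05 EMP-NAME  PIC X(10).
#   05 EMP-DEPT  PIC X(3).
LAYOUT = [("EMP_ID", 5), ("EMP_NAME", 10), ("EMP_DEPT", 3)]


def parse_segment(record, layout=LAYOUT):
    """Slice a fixed-width record into named columns according to the layout."""
    expected = sum(length for _, length in layout)
    if len(record) != expected:
        raise ValueError(f"expected {expected} characters, got {len(record)}")
    row, offset = {}, 0
    for name, length in layout:
        row[name] = record[offset:offset + length].rstrip()  # drop padding blanks
        offset += length
    return row


row = parse_segment("27645Elliot    D01")
```

With a layout like this stored once, any process can treat each segment occurrence as a row with named columns.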

Page 16: Optim test data management for IMS 2011

InfoSphere Optim z/OS Platform Access to Data Sources

[Diagram: InfoSphere Optim & DB2 for z/OS, with Native Client Access to DB2, IMS, and VSAM / Seq; DB2 AS400 is also shown.]

Excluded for IMS/VSAM/Seq:
– TDM Compare
– TDM Edit
– Archive
– Application Retirement

Page 17: Optim test data management for IMS 2011

InfoSphere Optim Distributed Platform Access to Data Sources

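[Diagram: the Optim Server reaches data sources through Native Client Access (clients) or by leveraging middleware (ODBC). With IBM Federation Server, data sources / tables are exposed as Nicknames; Classic Federation is shown for IMS / VSAM on z/OS. Data sources shown: Oracle 9 on HP-UX, DB2 on AIX, SQL Server on Win 2003, DB2 on Linux, IMS / VSAM on z/OS, DB2 on AS400.]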

Page 18: Optim test data management for IMS 2011

InfoSphere Optim z/OS Requirements for IMS / VSAM / Sequential

Available:
– IMS V12 Support (Optim z/OS V6 and V7)
– Support for masking data in fixed length arrays (OCCURS)
– IMS Sequential Dependent (SDEP) Segment Support
– Support multiple record layouts for an IMS segment
– Batch IMS/VSAM/Seq Table definition utility
– Date/Time/Timestamp data types in IMS/VSAM/Seq Table Definitions
– IMS Compression Exit

High Priority:
– VSAM, Sequential and IMS Related Compare
– Support for masking data in variable length arrays (ODO)
– More flexible Optim relationship support
– Tester productivity enhancements via Self-Service
– Improvements in unkeyed segment support (over time)
– Improvements in IMS access path selection (over time)
– Extract IMS data during IMS Unload
– Archive IMS, VSAM and Sequential natively on z/OS
– Common Eclipse-based UI (Optim Designer and Manager)

Page 19: Optim test data management for IMS 2011

Requirements to manage data across its lifecycle

[Lifecycle diagram repeated from Page 4. Information Governance Core Disciplines: Lifecycle Management, with the phases Discover & Define; Develop & Test; Optimize, Archive & Access; Consolidate & Retire.]

Page 20: Optim test data management for IMS 2011

Discovery: You can’t manage what you don’t understand

Challenges:
– How do I know what data is needed for test cases?
– Lack of understanding of where data is located and how the data is related
– Limited understanding of confidential data elements
– Cost prohibitive to conduct manual analysis and hand coding

Result:
– Lack of agility in testing
– Poor data governance
– Bad data = Bad business decisions
– Inadvertent exposure of sensitive information

Page 21: Optim test data management for IMS 2011

InfoSphere Discovery Speeds Understanding Data

Table 1

Row      Member     SS #         Age  Phone           Sex
1        595846226  123-45-6789  15   (123) 456-7890  M
2        567472596  138-27-1604  8    (138) 271-6037  F
3        540450092  154-86-4196  22   (154) 864-1961  M
4        514714372  173-44-7900  55   (173) 447-8996  F
5        490204164  194-26-1648  4    (194) 261-6476  F
6        466861109  217-57-3046  66   (217) 573-0453  M
...
987,623  444629628  243-68-1812  25   (243) 681-8107  F
987,624  423456789  272-92-3629  87   (272) 923-6280  M

Table 25

ID         Demo1
595846226  0
567472596  1
540450091  2
514714372  3
490204164  1
466861109  0
444629628  3
423456789  2

The Discovery Engine analyzes data values to automatically discover the columns that relate rows across data sources, and the columns which contain sensitive data.

IBM InfoSphere Discovery

Hit Rate: 98%
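
One simple way to think about a hit rate between two candidate key columns is the fraction of values in one column that also occur in the other. The sketch below applies that definition to the eight sample values visible on the slide; the function name `hit_rate` and this particular definition are assumptions, since the slide does not say how InfoSphere Discovery computes its figure.

```python
def hit_rate(candidate_values, reference_values):
    """Fraction of candidate values that also occur in the reference column."""
    reference = set(reference_values)
    values = list(candidate_values)
    if not values:
        return 0.0
    return sum(1 for value in values if value in reference) / len(values)


# The "Member" column of Table 1 and the "ID" column of Table 25, as shown on the slide.
members = [595846226, 567472596, 540450092, 514714372,
           490204164, 466861109, 444629628, 423456789]
ids = [595846226, 567472596, 540450091, 514714372,
       490204164, 466861109, 444629628, 423456789]

rate = hit_rate(members, ids)  # 7 of the 8 visible values match
```

On the visible sample only seven of eight values match, because 540450092 and 540450091 differ; the 98% on the slide presumably refers to the full tables rather than these few rows.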

Page 22: Optim test data management for IMS 2011

InfoSphere Optim Data Growth Solution

Business Value:
– Saves Production storage costs
– Improves Production performance
– Manage Archive Files through their lifecycle: retention policy compliance
– Mitigates risks of removing data from Prod.

[Diagram: InfoSphere Optim archives Production Data into Compressed Archives and can restore it (Archive / Restore). Tiers shown:
– 1 - 2 Years: Current Data (Production Data)
– 2 - 4 Years: Active/Historical Online
– 4 - 6 Years: On/Near-Line Archive. Non DBMS Retention Platform: ATA File Server, EMC Centera™, DR550, etc.
– 6+ Years: Off-Line Archive. Off-line Retention Platform: CD, Tape, Optical, WORM, IBM TSM, NetApp NearStore® SnapLock™, IBM Total Storage® solutions (including the DR550), EMC Centera™.
Universal Access: Native access, plus Additional Options: ODBC / JDBC, XML, SQL, Excel, Access.]
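
The age bands in the diagram can be read as a simple tiering policy. The sketch below assigns data to a tier by age; the exact boundary handling (for example, whether data that is exactly 2 years old counts as current) is an assumption, because the slide's ranges share their end points, and `retention_tier` is a hypothetical name.

```python
def retention_tier(age_years):
    """Map the age of a record (in years) to the storage tier named on the slide."""
    if age_years < 0:
        raise ValueError("age cannot be negative")
    if age_years < 2:   # assumed boundary: slide shows "1 - 2 Years"
        return "Current Data"
    if age_years < 4:   # assumed boundary: slide shows "2 - 4 Years"
        return "Active/Historical Online"
    if age_years < 6:   # assumed boundary: slide shows "4 - 6 Years"
        return "On/Near-Line Archive"
    return "Off-Line Archive"  # slide shows "6+ Years"


tiers = [retention_tier(age) for age in (0.5, 3, 5, 10)]
```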

Page 23: Optim test data management for IMS 2011

InfoSphere Optim Application Retirement

Preserve application data in its business context

Retire out-of-date packaged applications as well as legacy custom applications

Shut down legacy system without a replacement

[Diagram: Infrastructure before Retirement (several users, each with their own application, database, and data) compared with Archived Data after Consolidation (users reach the archive data through a single archive engine).]

Page 24: Optim test data management for IMS 2011

Conclusion

Test Data Management allows development teams to accelerate testing activities on a project

Test Data Management exploits production data while ensuring security of confidential data

Providing testers and developers with access to test data can improve operational efficiency and optimize resources on a project

A comprehensive Test Data Management solution is needed to minimize cost and shorten development cycles

Page 25: Optim test data management for IMS 2011

Learn more

Product Family Webpage

Solution Sheet: InfoSphere Test Data Management Solution brief

Whitepaper: Integrated Strategies to Improve Application Testing

Case Study: InfoSphere Test Data Management
