BGOUG: "Agile Data: Revolutionizing Database Cloning"


Agile Data: Revolutionizing database cloning


Kyle Hailey, kyle@delphix.com, http://kylehailey.com

Tim Gorman, tim@delphix.com

Are you too busy to Innovate?

Inertia

A new way : Welcome Agile & DevOps!

Waterfall, Agile, Devops

• Waterfall

• Agile

• Agile with Continuous Deploy

Continuous Deploy requires DevOps

Design Code test Deploy

Design Code test Code test Deploy Code test Code test

Design

What is DevOps = tools + culture

• Culture :

– Bridging silos between Dev & Ops

– Empathy, avoid blame

– Collaboration

• Tools :

– Automation VMs, Puppet, Jenkins

– Self-service

– Measurement


Note: DevOps > Tools + Culture

DevOps Goal= optimizing flow from Dev to Ops to Pro


Don’t copy steps. Copy the goal

Goal = company’s bottom line

Agile & CI vs Waterfall

[Charts comparing Agile & CI with Waterfall against the goal (labels: Achieved!, Missed!): bugs over time, profit over time, cost per deployment over time]

DevOps and Data : Impossible?

Waterfall

Agile & DevOps

DevOps Goal= optimizing flow from Dev to Ops to Pro

Big Software Release

Small Continuous Releases

The Goal : Theory of Constraints

Improvement not made at the constraint is an illusion

factory floor optimization

Factory floor

Factory floor

constraint

Not a relay race

Tune before constraint

constraint

Tuning here

Stock piling

Tune after constraint

constraint

Tuning here

Starvation

Factory floor : straight forward

constraint

Goal: find the constraint and optimize it
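The factory-floor argument can be sketched in a few lines of Python. This is a minimal illustration, not something from the slides: the stage names and capacities are hypothetical, and the model assumes a strictly serial line whose flow is capped by its slowest stage.

```python
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    capacity: int  # units of work per time step (hypothetical numbers)


def throughput(stages):
    """End-to-end flow of a serial line is capped by its slowest stage."""
    return min(stage.capacity for stage in stages)


def constraint(stages):
    """Return the stage with the lowest capacity, i.e. the constraint."""
    return min(stages, key=lambda stage: stage.capacity)


line = [Stage("design", 10), Stage("code", 8), Stage("test", 3), Stage("deploy", 9)]

# Tuning a stage that is not the constraint: 'code' doubles from 8 to 16.
tuned_elsewhere = [Stage("design", 10), Stage("code", 16), Stage("test", 3), Stage("deploy", 9)]

# Tuning the constraint itself: 'test' doubles from 3 to 6.
tuned_at_constraint = [Stage("design", 10), Stage("code", 8), Stage("test", 6), Stage("deploy", 9)]

baseline = throughput(line)
after_tuning_elsewhere = throughput(tuned_elsewhere)
after_tuning_constraint = throughput(tuned_at_constraint)
```

In this toy line, doubling "code" leaves the overall flow unchanged because "test" still caps it, while doubling "test" doubles the flow. That is the sense in which improvement away from the constraint is an illusion.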

Theory of Constraints work for IT ?

• Goals: Clarify
• Metrics: Define
• Constraints: Identify
• Priorities: Set
• Iterations: Fast

• CI
• Cloud
• Agile
• Kanban
• Kata

“IT is the factory floor of this century”

The Phoenix Project

What is the constraint in IT?

What are the top 5 constraints in IT?

1. Dev environments setup
2. QA setup
3. Code Architecture
4. Development
5. Product management

- Gene Kim

“One of the most powerful things that organizations can do is to enable development and testing to get the environment they need when they need it”

Data is the constraint

60% Projects Over Schedule

85% delayed waiting for data

Data is the Constraint

CIO Magazine Survey:

only getting worse

Gartner: Data Doomsday, by 2017 1/3rd IT in crisis

In this presentation:

• Data Constraint
• Solution
• Use Cases

Typical Architecture

[Diagram, built up over several slides: Production (instance, file system, database) is copied in full to Backup, Reporting, and Dev, QA, UAT, each copy with its own file system and database]

Triple Tax

– Storage & Systems
– Personnel
– Time

moving data is hard

– Servers
– Storage
– Network
– Data center floor space, power, cooling

copies take up space

Never enough environments

Your Project

Available Resources

• People: 1000s of hours per year just for DBAs
  – DBAs
  – SYS Admin
  – Storage Admin
  – Backup Admin
  – Network Admin

• $100s Millions for data center modernizations

Copies require People & Time

Data floods infrastructure

92% of the cost of business in financial services is “data”

www.wsta.org/resources/industry-articles

Most companies average 5% IT spending, ½ on “data”

http://uclue.com/?xq=1133

companies unaware


Developer or Analyst

Boss, Storage Admin, DBA

Metrics

– Time
– Old Data
– Storage

Other

– Analysts
– Audits

companies unaware

What problems does the data constraint cause?

1. Bottlenecks
2. Waiting for environments
3. Waiting to check in code
4. Production Bugs
5. Expensive Slow QA

Development : waiting

Development : bottlenecks

Frustration Waiting

Development : Bugs

Old Unrepresentative Data

Development : subsets

– False Negatives
– False Positives
– Bugs in Production

Production Wall


Development : silos

QA : Long Build times

[Chart: cost to correct (y-axis, 0 to 70) against delay in fixing the bug (x-axis, 1 to 7). Source: Software Engineering Economics, Barry Boehm (1981)]

• Need lots of copies

• Each copy is like

DevOps: Impossible with databases?

Design

In this presentation:

• Data Constraint
• Solution
• Use Cases

Development, QA, UAT

99% of blocks are identical

Solution

Development QA UAT

Thin Clone
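A rough back-of-the-envelope sketch of why sharing identical blocks matters. The 1,000 GB source size and the three copies are hypothetical numbers chosen for illustration; the 1% private-change figure simply mirrors the "99% of blocks are identical" claim above.

```python
def full_copy_storage_gb(source_gb, copies):
    """Every environment gets its own complete physical copy."""
    return source_gb * copies


def thin_clone_storage_gb(source_gb, copies, changed_percent):
    """One shared image plus each clone's privately changed blocks."""
    private_gb_per_clone = source_gb * changed_percent / 100
    return source_gb + copies * private_gb_per_clone


# Hypothetical example: a 1,000 GB source with Development, QA, and UAT copies,
# each of which changes about 1% of its blocks.
physical = full_copy_storage_gb(1000, 3)
thin = thin_clone_storage_gb(1000, 3, 1)
```

Under these assumptions the three physical copies need 3,000 GB, while the shared image plus three thin clones needs about 1,030 GB.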

• EMC
  – 16 snapshots on Symmetrix
  – Write performance impact
  – No snapshots of snapshots

• NetApp
  – 255 snapshots

• ZFS
  – Compression
  – Unlimited snapshots
  – Snapshots of snapshots

• DxFS
  – “”
  – Storage agnostic
  – Shared cache in memory

Technology Core : file system snapshots

Also check out new SSD storage such as: Pure Storage, EMC XtremIO

Fuel not equal car

Challenges

1. Technical
2. Bureaucracy

Bureaucracy

Developer Asks for DB Get Access

Manager approves

DBA Request system

Setup DB

System Admin

Request storage

Set up machine

Storage Admin

Allocate storage (take snapshot)

Why are hand offs so expensive?

1 hour

1 day

9 days

Bureaucracy

Technical Challenge

[Diagram: database LUNs on the Production Filer, with a snapshot and clones serving instances on Target A, Target B, and Target C]

Technical Challenge

[Diagram: Source instance and database LUNs on the Production Filer; a Development Filer holding the snapshot and clones that serve instances on Target A, Target B, and Target C]

Technical Challenge

[Diagram, three numbered parts spanning Production, Storage, and Development]

Production: File System, Instance

Copy: Time Flow, Purge

Storage: Clone (snapshot), Compress, Share Cache

Development: Provision (mount, recover, rename); Self Service, Roles & Security; Instance

– ZFS

– EMC + SRDF

– NetApp + SMO

– Oracle EM 12c + Data Guard + NetApp / ZFS

– Actifio - hardware

– Delphix - software


How to get data virtualization?

Source sync

Deploy automation

Storage snapshots

Goal : virtualize, govern, deliver


• Masking: Masking
• Security: Chain of custody
• Self Service: Logins
• Developer: Versioning, branching
• Audit: Live Archive

Snap Shots

Thin Cloning

Data Virtualization

Data Supply Chain

Intel hardware

DB2DataFile SystemsBinaries

Install Delphix on x86 hardware

Allocate Any Storage to Delphix

Allocate Storage: Any type

Pure Storage + Delphix: Better Performance for 1/10 the cost

One time backup of source database

[Diagram: Production database, file systems, instances]

DxFS (Delphix) Compress Data

Data is compressed, typically to 1/3 of its original size

[Diagram: Production database, file system, instances]

Incremental forever change collection

Changes

• Collected incrementally forever
• Old data purged

[Diagram: Production database and file system, instances, Time Flow]

Snapshot 1 – full backup once only at link time

Jonathan Lewis © 2013 Virtual DB


a b c d e f g h i

We start with a full backup, analogous to a level 0 RMAN backup. It includes the archived redo log files needed for recovery. Run in archivelog mode.

Snapshot 2 (from SCN)


b' c'

a b c d e f g h i

The "backup from SCN" is analogous to a level 1 incremental backup (which includes the relevant archived redo logs). It is sensible to enable BCT (block change tracking).

Delphix executes standard RMAN scripts.

Apply Snapshot 2

a b c d e f g h i
b' c'

The Delphix appliance unpacks the RMAN backup and "overwrites" the initial backup with the changed blocks, but DxFS makes new copies of the blocks rather than overwriting them in place.

Drop Snapshot 1

a b' c' d e f g h i

The call to RMAN leaves us with a new level 0 backup, waiting for recovery. But we can pick the snapshot root block, so we have EVERY level 0 backup.
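The snapshot sequence above can be modeled with a toy copy-on-write block store. This is only a sketch of the general technique, not DxFS or Delphix code: the `SnapshotStore` class, its method names, and the block positions are all hypothetical, and real file systems track references very differently.

```python
class SnapshotStore:
    """Toy copy-on-write store: snapshots are block maps that share unchanged blocks."""

    def __init__(self):
        self.blocks = {}     # physical block id -> content
        self.snapshots = {}  # snapshot name -> {logical position: physical block id}
        self._next_id = 0

    def _write(self, content):
        block_id = self._next_id
        self._next_id += 1
        self.blocks[block_id] = content
        return block_id

    def full_backup(self, name, contents):
        """Store every block once (the level 0 analogue)."""
        self.snapshots[name] = {pos: self._write(c) for pos, c in enumerate(contents)}

    def incremental(self, base, name, changes):
        """Copy the block map, not the blocks; write new copies only for changed blocks."""
        root = dict(self.snapshots[base])
        for pos, content in changes.items():
            root[pos] = self._write(content)
        self.snapshots[name] = root

    def read(self, name):
        root = self.snapshots[name]
        return [self.blocks[root[pos]] for pos in sorted(root)]

    def drop(self, name):
        """Remove a snapshot and free blocks no remaining snapshot references."""
        del self.snapshots[name]
        live = {bid for root in self.snapshots.values() for bid in root.values()}
        for bid in list(self.blocks):
            if bid not in live:
                del self.blocks[bid]


store = SnapshotStore()
store.full_backup("snap1", list("abcdefghi"))
store.incremental("snap1", "snap2", {1: "b'", 2: "c'"})

blocks_with_both = len(store.blocks)   # 9 original blocks plus 2 new copies
snap1_view = store.read("snap1")       # still a complete image
snap2_view = store.read("snap2")       # also a complete image

store.drop("snap1")
blocks_after_drop = len(store.blocks)  # only b and c are freed
```

Each snapshot root reads as a complete image, which is the sense in which every retained snapshot behaves like a level 0 backup, while the physical cost of a new snapshot is only the changed blocks.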

Creating a vDB

a b' c' d e f g h i

The first step in creating a vDB is to take a snapshot of the filesystem as at the backup you want (then roll it forward).

My vDB (filesystem)

Your vDB (filesystem)


[Diagram: My vDB and Your vDB both map to the shared blocks a b' c' d e f g h i; one vDB has its own changed copy i']
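A minimal sketch of how two vDBs can share one snapshot while keeping their own writes private. The `VirtualDB` class is hypothetical, and which vDB writes the changed block i' is an assumption made for the example; the slides do not say.

```python
class VirtualDB:
    """Toy writable clone: reads fall through to a shared snapshot unless overwritten."""

    def __init__(self, snapshot_blocks):
        self._base = snapshot_blocks  # shared, read-only tuple of blocks
        self._private = {}            # position -> block written by this vDB only

    def write(self, pos, content):
        # The shared snapshot is never modified; the change lives in this clone.
        self._private[pos] = content

    def read(self):
        return [self._private.get(pos, blk) for pos, blk in enumerate(self._base)]

    def private_block_count(self):
        return len(self._private)


snapshot = ("a", "b'", "c'", "d", "e", "f", "g", "h", "i")

my_vdb = VirtualDB(snapshot)
your_vdb = VirtualDB(snapshot)

# Assumption for illustration: "my" vDB changes the last block.
my_vdb.write(8, "i'")

my_view = my_vdb.read()
your_view = your_vdb.read()
```

Only the changed block costs extra storage: `my_vdb` holds one private block, `your_vdb` holds none, and the snapshot both were cloned from is untouched.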

Database Virtualization

Three Physical Copies

Three Virtual Copies

Data Virtualization Appliance

Before Virtual Data

[Diagram: Production, Dev, QA, UAT, Reporting, and Backup each hold their own file system and full database copy: the “triple data tax”]

With Virtual Data

[Diagram: Production, Dev & QA, Reporting, and Backup instances and databases served from one Data Virtualization Appliance]

• Problem in the Industry
• Solution
• Use Cases

Use Cases

1. Development and QA
2. Production Support
3. Business

Development: Virtual Data

• Unlimited
• Full size
• Self Service

Development

Virtual Data: Easy

Instance

Instance

Instance

Instance

Source

DVA

Development Virtual Data: Parallelize

gif by Steve Karam

Development Virtual Data: Full size

Development Virtual Data: Self Service

QA: Virtual Data

• Fast
• Parallel
• Rollback
• A/B testing

Dev

QA

Instance

Prod

DVA

• Low Resource

• Find bugs Fast

QA Virtual Data : Fast

Production Time Flow

QA with Virtual Data: Rewind

Instance

QA

Prod

Production Time Flow

QA with Virtual Data: A/B

Instance

Instance

Instance

Index 1

Index 2

Production Time Flow

Data Version Control

12/3/2014

Dev

QA

2.1

Dev

QA

2.2

2.1 2.2

Instance

Prod

DVA Production Time Flow

Use Cases

1. Development and QA
2. Production Support
3. Business

• Backups
• Recovery
• Forensics
• Migration
• Consolidation

Recovery

Storage requirements for 30 days of backups: 9 TB database, 1 TB of change per day

[Chart: storage required (y-axis, 0 to 70) by week (week 1 to week 4); series: original, Oracle, Delphix]
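A hedged worked example of the kind of arithmetic behind such a comparison. The 9 TB database, 1 TB of daily change, 30-day window, and the 1/3 compression figure come from the slides. The conventional schedule used here (one full backup per week plus six daily incrementals, kept for four weeks) is an assumption, and the results are not claimed to reproduce the chart's values.

```python
def conventional_backup_tb(db_tb, daily_change_tb, weeks):
    """Assumed schedule: one full backup plus six daily incrementals per week."""
    return weeks * (db_tb + 6 * daily_change_tb)


def incremental_forever_tb(db_tb, daily_change_tb, days, compression_divisor):
    """One full copy at link time plus daily changes, all stored compressed."""
    return (db_tb + days * daily_change_tb) / compression_divisor


# Figures from the slides: 9 TB database, 1 TB of change per day, 30 days retained,
# data compressed to roughly 1/3 of its size.
conventional = conventional_backup_tb(db_tb=9, daily_change_tb=1, weeks=4)
virtualized = incremental_forever_tb(db_tb=9, daily_change_tb=1, days=30, compression_divisor=3)
```

Under these assumptions the conventional schedule retains about 60 TB, while a single compressed full copy plus 30 days of compressed changes comes to about 13 TB.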

Recovery

Instance

Instance

Recover VDB

Drop

Source

DVA Production Time Flow

Forensics

Instance

Development

DVA

Source

Production Time Flow

Development (the new production)

Instance

Development

DVA

Source

Development

Prod & VDB Time Flow

X

Migration

Consolidation

Use Cases

1. Development and QA
2. Production Support
3. Business Intelligence

Business Intelligence

• ETL
• Temporal
• Confidence Testing
• Federated Databases
• Audits

Business Intelligence: ETL and DW Refreshes

Instance

Prod

Instance

DW & BI

• Collect only Changes
• Refresh in minutes

Instance

Prod

BI and DW

ETL 24x7

DVA

Virtual Data: Fast Refreshes

Production Time Flow

Temporal Data

Confidence testing

Modernization: Federated

Instance

Instance

Source1

Source2

DVA

Production Time Flow 1

Production Time Flow 2

Modernization: Federated

“I looked like a hero”
– Tony Young, CIO Informatica

Modernization: Federated

Production Time Flow

Audit


Instance

Prod

DVA

Live Archive

Use Case Summary

1. Development & QA
2. Production Support
3. Business

How expensive is the Data Constraint?

DVA at Fortune 500 :

Dev throughput increase by 2x

Faster

• Financial Close
• BI refreshes
• Surgical recovery
• Projects

How expensive is the Data Constraint?

• Projects “12 months to 6 months.” – New York Life

• Insurance product “about 50 days ... to about 23 days” – Presbyterian Health

• “Can't imagine working without it” – State of California

Virtual Data Quotes

Summary

• Problem: Data is the constraint
• Solution: Virtualize Data
• Results:
  – Half the time for projects
  – Higher quality
  – Increase revenue

Thank you!

• Kyle Hailey | Oracle ACE and Technical Evangelist, Delphix
  – Kyle@delphix.com
  – kylehailey.com
  – slideshare.net/khailey