Software Development using Production Data. By Karen Ambrose, Wellcome Trust Sanger Institute (WTSI, UK). Monday 28th April 2008 – OLSUG, BOSTON, MA

Software Development using Production Data

By Karen Ambrose, Wellcome Trust Sanger Institute (WTSI, UK)

Monday 28th April 2008 – OLSUG, BOSTON, MA

Disclaimer

The information contained within this presentation is based on systems at the Wellcome Trust Sanger Institute.

Overview

● About the Wellcome Trust Sanger Institute (WTSI)
● What do we do?
● Case Study – Sequencing pipeline
● Development requirements
● Smoke and Mirrors
● Logical intervention – Logical Standby
● Questions?

About the WTSI

● One of the leading genomics centres in the world.
● Founded in 1993 by the Wellcome Trust (major funding provider) and the UK Medical Research Council (MRC).
● Formerly the Sanger Centre.
● Named after the double Nobel prize-winning biochemist Dr. Fred Sanger.

What do we do?

Human Genetics - e.g. Cancer studies, SNP, WTCCC, Copy Number Variation (CNV)

Model Organism & Pathogen Genetics - Mouse, Zebrafish, Pathogens

What do we do?

Model Organism Sequencing – e.g. Human, Mouse, Yeast, Worm

Bioinformatics – Analysis, Annotation, Data Storage, Data Processing etc.

What do we do?

Current Finished Sequence

Total: 3,684,395,290 bases (25th April 2008)

Data produced is made freely available to researchers worldwide.

Case Study – Sequencing Pipeline

High throughput sequencing application pipelines to support laboratory practices.

Mixture of new bespoke and legacy systems.

Multiple Oracle databases support a collection of co-operating production application systems.

Result of multi-developer effort over a period of 10 years.

Abstract sequencing pipeline map

[Diagram of pipeline components. Labels: Mapping, Dna_reception, Subcloning, Picking, Prepping, Sequencing, Post sequence processing, Internal trace, External trace, Assembly, Finishing, Checking, Gull/kit, ETS, Archives, Chromoview, EPS, Genetrap, Exoseq, Viral sequencing, EST Back end, Devmin, Corf, Epigenomics, Restriction digests.]

Development requirements

Availability of “up to date” datasets for development testing.

Developer flexibility and autonomy.

Stable and robust development environment.

Ability to test component parts in isolation.

Easily define, store, recreate and test use cases.

Smoke and Mirrors – 1st solution

Using the production database system with an additional development database to support read/write processes.

Smoke and Mirrors - Architecture

[Architecture diagram. Labels: Production/Primary database (100GB); Physical Standby database (100GB); Development database (20GB); Archive Redo Logs; Manual copies of datasets.]

ORACLE 9i (9.2.0.5) on Compaq Tru64 UNIX V5.1

Smoke and Mirrors - Setup

● Set up a new “special_” user on the production database, i.e. SS1.
● Set up a new user on the development database, i.e. DS1.

[Diagram. Labels: Development DB (DS1); Production/Primary DB (production schema, SS1); private synonyms across dblinks; public synonyms.]

Smoke and Mirrors - Setup

● Isolate read/write (r/w) and read only (r/o) table access for an application system.
● Foreign keys replaced with triggers (where applicable).

[Diagram. Labels: Development DB (DS1) with development schemas holding read/write tables (T1); Production/Primary DB (production schema, SS1); database links; private synonyms across dblinks; public synonyms; triggers (T) replace FK which reference R/O production tables.]

Smoke and Mirrors - Setup

● For r/w access: set up private synonyms over database links to the new development user.
● For r/o access: use public synonyms from production schemas or create private synonyms.

[Diagram. Labels: Development DB (DS1) with development schemas holding read/write tables (T1); Production/Primary DB (production schema, SS1); database links; private synonyms across dblinks (S, T1@); public synonyms; triggers (T) replace FK which reference R/O production tables.]
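The r/w and r/o wiring described on this slide could look roughly like the sketch below. It is an illustration only, not the Institute's actual scripts: the link name dev_link, the connect string DEVDB, the password placeholder, and the names prod_schema and ref_table are all assumptions; SS1, DS1 and T1 follow the slide labels.

```sql
-- Run on the production database as the special_ user (SS1 in the slides).
-- dev_link, DEVDB and the password are hypothetical placeholders.
CREATE DATABASE LINK dev_link
  CONNECT TO ds1 IDENTIFIED BY "change_me"
  USING 'DEVDB';

-- r/w access: a private synonym redirects T1 to the development copy
-- owned by DS1 on the development database.
CREATE SYNONYM t1 FOR ds1.t1@dev_link;

-- r/o access: either rely on an existing public synonym, or create a
-- private synonym for the production table (hypothetical names).
CREATE SYNONYM ref_table FOR prod_schema.ref_table;
```

Because the application resolves unqualified table names through these synonyms, it needs no code changes: writes go across the link, reads stay on production.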

Smoke and Mirrors - Setup

● Run the software application using the new “special_” user login.

[Diagram. Labels: Application Login (via SS1); Development DB (DS1) with development schemas holding read/write tables (T1); Production/Primary DB (production schema, SS1); database links; private synonyms across dblinks (S, T1@); public synonyms; triggers (T) replace FK which reference R/O production tables.]

Smoke and Mirrors - Setup

Run SQL check to ensure there are no leakages using the new special_ user.

    select * from all_tab_privs
     where grantee in (select granted_role from user_role_privs
                        where username = (SELECT USER FROM DUAL))
    UNION
    select * from all_tab_privs
     where GRANTEE = (SELECT USER FROM DUAL)
    UNION
    select * from all_tab_privs
     where GRANTEE = 'PUBLIC'
       and TABLE_SCHEMA not in ('SYS', 'SYSTEM', 'WMSYS', 'EXFSYS', 'DMSYS', 'XDB')
       and PRIVILEGE != 'SELECT';

Smoke and Mirrors

Advantages

Current dataset available for R/O access.

Less disk space required for development database.

Quick setup.

Disadvantages

Possibility of leakage into production database.

Possible performance issues across database links.

Less developer autonomy.

Logical Standby – 2nd solution

Using the logical standby technology, the development schemas and production schemas are within the same database without possibility of leakage onto the production database.

Logical Standby - Architecture

ORACLE 10gR2 (10.2.0.2) on SUSE LINUX (x86_64) SLES9

[Architecture diagram. Labels: Production/Primary database (100GB); Physical Standby database (100GB); Logical Standby database (120GB); Archive Redo Logs (to each standby); SQL apply (on the logical standby).]

Logical Standby - Setup

Prepare the Production DB to support a logical standby.

Determine support for Datatypes and storage attributes.

Ensure table rows can be uniquely identified.
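Both checks can be made from the data dictionary on the production database. The queries below are a minimal sketch using the DBA_LOGSTDBY_UNSUPPORTED and DBA_LOGSTDBY_NOT_UNIQUE views; which schemas matter, and what to do about each table reported, is left to the reader.

```sql
-- Tables containing datatypes or storage attributes that SQL apply
-- does not support; changes to these will not reach the standby.
SELECT DISTINCT owner, table_name
  FROM dba_logstdby_unsupported
 ORDER BY owner, table_name;

-- Tables with no primary key or non-null unique index.
-- Review each one; adding a primary key is the usual remedy.
SELECT owner, table_name, bad_column
  FROM dba_logstdby_not_unique
 ORDER BY owner, table_name;
```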

Logical Standby – Setup

Production DB - Dataguard parameters

    db_name='DB1'
    db_unique_name='DB1'
    log_archive_config='DG_CONFIG=(DB1,DB3)'
    log_archive_dest_3='service=DB3 valid_for=(online_logfiles,primary_role) db_unique_name=DB3 optional reopen=15'
    log_archive_dest_state_3=ENABLE

Logical Standby - Setup

● Create physical standby using RMAN.

    duplicate target database for standby dorecover;

● Stop Redo apply.

● Build redo dictionary on production db.

    execute DBMS_LOGSTDBY.BUILD

Logical Standby – Setup

Physical Standby - Dataguard parameters

    *.db_name='DB1'
    *.db_unique_name='DB3'
    *.fal_client='DB3'
    *.fal_server='DB1'
    *.standby_file_management='AUTO'

Logical Standby - Setup

On the Standby

● Transition the Physical to a Logical standby.

    alter database recover to logical standby new_dbname;

● Shut down and amend the pfile parameters.

    *.db_name='DB3'
    *.standby_archive_dest='/oracle/lnnn/logstby/DB3'

Logical Standby – Setup

● Create a new password file.
● Create new SPFILE.
● Open database resetlogs.
● Commence the SQL apply process.

    alter database start logical standby apply;

Logical Standby – Development Setup

Alter Dataguard level to STANDBY.

Create separate tablespaces for new development users and objects.

Assign space quotas for each new user’s development schema.

Create new development user accounts in a standard format. *
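The four steps could be carried out by hand roughly as follows. This is a sketch under assumptions: the datafile path, the sizes, the password placeholder and the temporary tablespace name temp are hypothetical, and the privileges granted are a minimal illustrative set rather than what the Institute's package grants.

```sql
-- Run on the logical standby as a DBA.
-- STANDBY guard level: users may change objects that SQL apply does not
-- maintain, while the replicated production schemas stay protected.
ALTER DATABASE GUARD STANDBY;

-- Separate tablespace for development objects (hypothetical path and size).
CREATE TABLESPACE dev_tb_01
  DATAFILE '/path/to/dev_tb_01.dbf' SIZE 2G;

-- Development account with a space quota (assumes a temporary
-- tablespace called temp already exists).
CREATE USER dev_user IDENTIFIED BY "change_me"
  DEFAULT TABLESPACE dev_tb_01
  TEMPORARY TABLESPACE temp
  QUOTA 1024M ON dev_tb_01;

-- Minimal privileges for a development sandbox (illustrative choice).
GRANT CREATE SESSION, CREATE TABLE, CREATE TRIGGER, CREATE SYNONYM
  TO dev_user;
```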

Logical Standby - Development Setup

Create tables in development schema for read write (r/w) access.

Use tables maintained by SQL apply process for read only (r/o) access.

Foreign keys are replaced with triggers (where applicable). **

Run application with development user login.

Logical Standby – Development Setup

[Diagram. Labels: Production/Primary DB (production schema); archived redo log transfer; Logical Standby DB with the production schema maintained by SQL apply (R/O) and development schema DS1 (R/W); public synonyms; application login; T1, T, S.]

Development schema areas:

● Private synonyms.
● Local objects with R/W access.
● Triggers replace FK.
● Use public synonyms for R/O table access.

Development Area - Setup

Create additional development users in a standard format using the USER_ADMINISTRATION package.

PROCEDURE CREATE_USER

    Argument Name                  Type                    In/Out Default?
    ------------------------------ ----------------------- ------ --------
    P_DEBUG                        BOOLEAN                 IN     DEFAULT
    P_USERNAME                     VARCHAR2(30)            IN

Development Area - Setup

PROCEDURE DROP_USER

    Argument Name                  Type                    In/Out Default?
    ------------------------------ ----------------------- ------ --------
    P_DEBUG                        BOOLEAN                 IN     DEFAULT
    P_USERNAME                     VARCHAR2(30)            IN

PROCEDURE VALIDATE

    Argument Name                  Type                    In/Out Default?
    ------------------------------ ----------------------- ------ --------
    P_DEBUG                        BOOLEAN                 IN     DEFAULT
    P_USERNAME                     VARCHAR2(30)            IN
    P_MODE                         VARCHAR2                IN

Development Area – Setup

Package Usage Examples:-

    exec USER_ADMINISTRATION.CREATE_USER(TRUE,'DEV_user');

Output as follows:-

    CREATE USER DEV_user IDENTIFIED BY user
      DEFAULT TABLESPACE DEV_TB_01
      TEMPORARY TABLESPACE DEV_TMP_01
      QUOTA 1024M ON DEV_TB_01
    GRANT RESOURCE, CONNECT, DEV_USER_ADMINISTRATION TO DEV_user
    CREATE SYNONYM DEV_user.QUICK_FK_TRIGGER_PACKAGE
    GRANT EXECUTE ON QUICK_FK_TRIGGER_PACKAGE TO DEV_user

Development Area - Setup

NO foreign keys permitted from the development sandbox to SQL apply maintained schemas.

2 solutions:

● Copy the required table locally and create the FK within the development area.
● Use a package which replaces FK's with triggers.
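The first solution (a local copy plus an ordinary FK) might look like the sketch below. The table and column names FINISH_BATCH, PROJECT and PROJECTNAME are borrowed from the package example elsewhere in this deck; the constraint names are hypothetical, and it is an assumption that PROJECTNAME is unique in the copied data.

```sql
-- Run as the development user on the logical standby.

-- 1. Take a local snapshot of the table maintained by SQL apply.
CREATE TABLE project AS
  SELECT * FROM ref_user.project;

-- 2. The referenced column needs a primary or unique key.
ALTER TABLE project
  ADD CONSTRAINT project_pk PRIMARY KEY (projectname);

-- 3. Both tables are now local, so a normal FK is allowed.
ALTER TABLE finish_batch
  ADD CONSTRAINT finish_batch_project_fk
  FOREIGN KEY (projectname) REFERENCES project (projectname);
```

The trade-off is that the snapshot stops tracking production from the moment it is taken, whereas the trigger approach checks against the live SQL apply copy.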

Development Area - Setup

Create triggers to replace foreign keys between developer sandboxes and schemas maintained by the SQL apply process using the QUICK_FK_TRIGGER_PACKAGE package.

PROCEDURE QUICK_CREATE_FK_TRIG

    Argument Name                  Type                    In/Out Default?
    ------------------------------ ----------------------- ------ --------
    LOCAL_TABLE                    VARCHAR2                IN
    REFERENCE_TABLE                VARCHAR2                IN
    REFERENCE_DBLINK               VARCHAR2                IN     DEFAULT
    DEBUG                          NUMBER                  IN     DEFAULT

Development Area – Setup

Package Usage Examples:-

    exec QUICK_FK_TRIGGER_PACKAGE.QUICK_CREATE_FK_TRIG('DEV_user.table_name.column_name', 'ref_user.table_name.column_name', [db.domain|NULL], 1);

e.g.

    QUICK_FK_TRIGGER_PACKAGE.QUICK_CREATE_FK_TRIG('DEV_USER.FINISH_BATCH.PROJECTNAME', 'REF_USER.PROJECT.PROJECTNAME', 'testdb.world', 1);

Development Area - Setup

    CREATE OR REPLACE TRIGGER AG_PROJECTNAME_FINISH_B_BR_IU
    BEFORE INSERT OR UPDATE ON FINISH_BATCH
    FOR EACH ROW
    DECLARE
      FK_ENTRY FINISH_BATCH.PROJECTNAME%TYPE;
    BEGIN
      ----------------------------------------
      -- AUTO GENERATED TRIGGER FROM THE
      -- QUICK_FK_TRIGGER_PACKAGE
      --
      -- NAME: AG_PROJECTNAME_FINISH_B_BR_IU
      -- AUTH: kva
      -- DATE: 05-DEC-07
      ----------------------------------------
      FK_ENTRY := :NEW.PROJECTNAME;
      IF FK_ENTRY IS NOT NULL THEN
        QUICK_FK_TRIGGER_PACKAGE.QUICK_FK_CHECKS(FK_ENTRY, 'REF_USER', 'PROJECT', 'PROJECTNAME', 'TESTDB.WORLD');
      END IF;
    END AG_PROJECTNAME_FINISH_B_BR_IU;

Development Area - Setup

PROCEDURE QUICK_FK_CHECKS

    Argument Name                  Type                    In/Out Default?
    ------------------------------ ----------------------- ------ --------
    VAL                            VARCHAR2                IN
    REFOWNER                       VARCHAR2                IN
    REFTABLE                       VARCHAR2                IN
    REFCOL                         VARCHAR2                IN
    REFDBLINK                      VARCHAR2                IN     DEFAULT
    DEBUG                          NUMBER(38)              IN     DEFAULT

Development Area - Setup

PROCEDURE QUICK_FK_CHECKS

If no DBLINK specified

    SELECT projectname FROM ref_user.project WHERE projectname = 'PROJECT1';

If DB link specified

    SELECT projectname FROM ref_user.project@testdb.world WHERE projectname = 'PROJECT1';

EXCEPTION RAISED FOR MISSING ENTRY

RAISE_APPLICATION_ERROR(-20000,'Entry does not exist for PROJECT1 - MISSING ENTRY IN ref_user.project.projectname');
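The behaviour described on this slide can be approximated with dynamic SQL. The procedure below is a hypothetical stand-in written for illustration; it is not the body of QUICK_FK_CHECKS, whose source is not shown in this deck. The procedure name, the p_ and l_ prefixed names and the variable sizes are assumptions.

```sql
-- Hypothetical sketch of a lookup-and-raise check; not the package's real code.
CREATE OR REPLACE PROCEDURE fk_check_sketch (
    p_val       IN VARCHAR2,
    p_refowner  IN VARCHAR2,
    p_reftable  IN VARCHAR2,
    p_refcol    IN VARCHAR2,
    p_refdblink IN VARCHAR2 DEFAULT NULL
) IS
    l_target VARCHAR2(300);
    l_hits   PLS_INTEGER;
BEGIN
    -- Build owner.table, appending @dblink only when a link is supplied.
    l_target := p_refowner || '.' || p_reftable;
    IF p_refdblink IS NOT NULL THEN
        l_target := l_target || '@' || p_refdblink;
    END IF;

    -- Identifiers have to be concatenated; the looked-up value is bound.
    EXECUTE IMMEDIATE
        'SELECT COUNT(*) FROM ' || l_target ||
        ' WHERE ' || p_refcol || ' = :val AND ROWNUM = 1'
        INTO l_hits
        USING p_val;

    -- No matching parent row: reject the insert or update.
    IF l_hits = 0 THEN
        RAISE_APPLICATION_ERROR(-20000,
            'Entry does not exist for ' || p_val ||
            ' - MISSING ENTRY IN ' ||
            p_refowner || '.' || p_reftable || '.' || p_refcol);
    END IF;
END fk_check_sketch;
/
```

Unlike a declared foreign key, a check of this kind only runs when the child row is inserted or updated; it does not stop the parent row from being deleted later.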

Development Area - Setup

Any attempt to alter the schema being maintained by the SQL apply process raises the following error:

    ORA-16224: DATABASE GUARD IS ENABLED

Logical Standby

Advantages

Current dataset for R/O access.

All development contained within one database.

More developer autonomy.

No direct interaction with the production DB.

Full copy of production data.

Disadvantages

More disk space required (1 database plus development area)

Security of sensitive production data.

Maintenance & Monitoring

Sensitive schemas can be skipped by executing the DBMS_LOGSTDBY.SKIP procedure.

    exec dbms_logstdby.skip('SCHEMA_DDL','schema','%');

    exec dbms_logstdby.skip('DML','schema','%');
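Skip rules cannot be changed while SQL apply is running, so in practice the calls are wrapped in a stop and start. The sequence below is a sketch; SENSITIVE_SCHEMA is a hypothetical schema name.

```sql
-- Run on the logical standby as a DBA.
ALTER DATABASE STOP LOGICAL STANDBY APPLY;

-- Ignore DDL and DML for every object in the (hypothetical) schema.
EXECUTE DBMS_LOGSTDBY.SKIP('SCHEMA_DDL', 'SENSITIVE_SCHEMA', '%');
EXECUTE DBMS_LOGSTDBY.SKIP('DML', 'SENSITIVE_SCHEMA', '%');

ALTER DATABASE START LOGICAL STANDBY APPLY;

-- Confirm which skip rules are registered.
SELECT statement_opt, owner, name
  FROM dba_logstdby_skip;
```

Skipping only stops future changes from being applied; data already present on the standby for that schema has to be dealt with separately.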

Maintenance & Monitoring

Various views to monitor the processes which maintain the Logical Standby database.

DBA_LOGSTDBY_EVENTS Records SQL apply events

DBA_LOGSTDBY_LOG Details archived logs processed

V$LOGSTDBY_STATS SQL apply statistics

V$LOGSTDBY_PROCESS Processes involved with SQL apply process

V$LOGSTDBY_PROGRESS Progress made by SQL apply process

V$LOGSTDBY_STATE Current state of SQL apply process
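A quick health check against three of these views might look like this. The queries are a minimal sketch using columns available in Oracle 10g Release 2; thresholds and alerting are left to the reader.

```sql
-- Is SQL apply running, idle, or waiting?
SELECT state FROM v$logstdby_state;

-- How far has SQL apply progressed compared with the redo received?
SELECT applied_scn, applied_time, latest_scn, latest_time
  FROM v$logstdby_progress;

-- Events recorded by SQL apply (errors, skipped statements and so on).
SELECT event_time, status, event
  FROM dba_logstdby_events
 ORDER BY event_time;
```

A widening gap between APPLIED_SCN and LATEST_SCN suggests that SQL apply is falling behind the primary.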

Summary

Both solutions provide access to current datasets for application integration and development testing.

The Smoke and Mirrors solution is relatively quick to set up.

Logical standby is a more stable solution and provides more development flexibility and scaling.

Conclusion

The Logical Standby has so far proved to be a more stable and popular solution for our immediate development requirements.

Future Plans

Investigation of Real Application Testing (RAT), new in Oracle 11g.

Acknowledgements

● DBA group @Sanger Institute (UK)
● Production Development group @Sanger Institute (UK)
● Wellcome Trust Sanger Institute (UK)

Questions?

Contact [email protected], if you require more information.