
Eric Grancher, CERN IT department, eric.grancher@cern · Computing and storage needs 7 • Data volume • 25 PB per year (in files) • > 5.25 * 1012 rows in an Oracle table (IOT,


Eric Grancher, CERN IT department, eric.grancher@cern

(documents available at https://indico.cern.ch/conferenceDisplay.py?confId=276758)

Agenda

• A few words on CERN and the computing challenges, Oracle at CERN

• Consolidation challenge

• Oracle multitenant database

• Real Application Testing / capture and replay

• Conclusions (your turn!)

• (demos, experience and tips)

CERN

Credit: Jürgen Knobloch, CERN IT

27 km circumference

Staff members: about 2500

Research community: 10,000 scientists


Large Hadron Collider - LHC

The most complex machine on earth

• The world's biggest particle accelerator

• 600 million collisions / second

• Fundamental physics
  • Why do fundamental particles weigh the amount they do?
  • What is 96% of the Universe made of?
  • Where did the antimatter go to?
  • What was the universe like just after the « Big Bang »?
  • Are there extra dimensions of space?

ATLAS/CMS, March 1st 2013

• “Having analysed two and a half times more data than was available for the discovery announcement in July, they find that the new particle is looking more and more like a Higgs boson, the particle linked to the mechanism that gives mass to elementary particles. It remains an open question, however, whether this is the Higgs boson of the Standard Model of particle physics, or possibly the lightest of several bosons predicted in some theories that go beyond the Standard Model. Finding the answer to this question will take time.

• Whether or not it is a Higgs boson is demonstrated by how it interacts with other particles, and its quantum properties. For example, a Higgs boson is postulated to have no spin, and in the Standard Model its parity – a measure of how its mirror image behaves – should be positive.”

• http://home.web.cern.ch/fr/about/updates/2013/03/new-results-indicate-new-particle-higgs-boson

Computing and storage needs

• Data volume
  • 25 PB per year (in files)
  • > 5.25 × 10^12 rows in an Oracle table (IOT, compression, partition) in one of the databases

• Computing and storage capacity, distributed world-wide
  • > 150 sites (grid computing)
  • > 260 000 CPU cores
  • > 269 PB disk capacity
  • > 210 PB tape capacity

• Distributed analysis, with costs spread across the different sites (« LHC Computing Grid »)
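As a rough order-of-magnitude check on the data volume above, the yearly figure can be converted into an average sustained rate. This is only a sketch: it assumes decimal petabytes (10^15 bytes), a 365-day year and a perfectly even rate, none of which the slide states, and the function name is hypothetical.

```python
# Assumption: a non-leap year and a constant data rate (real data taking is bursty).
SECONDS_PER_YEAR = 365 * 24 * 3600


def average_rate_mb_per_s(petabytes_per_year: float) -> float:
    """Average rate in MB/s (decimal units) for a yearly volume given in PB."""
    bytes_per_year = petabytes_per_year * 10**15  # assumption: 1 PB = 10^15 bytes
    return bytes_per_year / SECONDS_PER_YEAR / 10**6


if __name__ == "__main__":
    # 25 PB per year is the figure quoted on the slide.
    print(f"{average_rate_mb_per_s(25):.0f} MB/s on average")
```

Under these assumptions the average comes out a little under 0.8 GB/s; peak rates would be higher.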

Oracle at CERN

• 1982: start with Oracle at CERN (accelerator control)

Credit: N. Segura Chinchilla


Credit: M. Piorkowski

Consolidation, not easy! (a priori)

• Change version, parameters, statistics gathering, hardware!

• Errors (ORA-600, 745), different execution plans, different results?

• Does it fit on the one system? (average / peak!)

• Does one workload impact the others, take all resources at some point?

• Multi instance, schema, virtualisation consolidation, etc.
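The average / peak question can be made concrete with a small sketch. The CPU figures below are made-up illustrative values, not CERN measurements, and the function names are hypothetical. The point is only that the peak of the combined load, not the sum of the averages, decides whether the workloads fit on one system.

```python
from statistics import mean


def combined_load(workloads):
    """Sum several equally long load series sample by sample."""
    return [sum(samples) for samples in zip(*workloads)]


def fits(workloads, capacity):
    """True if the combined load never exceeds the available capacity."""
    return max(combined_load(workloads)) <= capacity


# Illustrative CPU cores used per time period (made-up numbers).
db_a = [2, 2, 10, 2]
db_b = [3, 3, 9, 3]

if __name__ == "__main__":
    print("sum of averages:", mean(db_a) + mean(db_b))
    print("peak of combined load:", max(combined_load([db_a, db_b])))
    print("fits on 12 cores:", fits([db_a, db_b], 12))
```

With these sample values the averages look comfortable on 12 cores, yet both workloads peak in the same period and the combined peak does not fit.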

Oracle Multitenant Database

• Introduced in Oracle DB 12.1

• Ideal for consolidation, like virtualisation for database

• … but also additional features (cloning, rapid provisioning, regression testing, faster upgrades, move from one database to another -same storage-)

• SQL level compatibility: tablespaces, users, PL/SQL, application unchanged
  • Any difference can be reported as a bug

• But is this the case for your application?

Oracle Multitenant Database

Non CDB

• List of users / roles
• User PL/SQL software
• User tables / indexes
• Oracle foreground processes
• Database and instance parameters
• SYS PL/SQL software
• Oracle background processes

CDB - 1 PDB

• Common to the CDB: database and instance parameters, SYS PL/SQL software, Oracle background processes
• In the PDB: list of users / roles, user PL/SQL software, user tables / indexes, Oracle foreground processes

CDB - 2 PDBs

• Common to the CDB: database and instance parameters, SYS PL/SQL software, Oracle background processes
• In each PDB: list of users / roles, user PL/SQL software, user tables / indexes, Oracle foreground processes

Demo 1

• Create a pluggable database

• Create a tablespace, one user, two tables

• Clone a pluggable database

Real Application Testing Capture and Replay

• What if you could capture the whole workload at the database level (better than at the client level): select, insert, delete, update, PL/SQL calls, everything?

• Real Application Testing Capture and Replay

• Used at CERN for capture as of 10.2, replay on 11.1, 11.2 and 12.1

• Was a key component for our successful migration from 10.2 to 11.2

Your objective for the testing

• New hardware -> time matters

• New version -> execution plans and LIO (logical I/O) matter

• Difference in results

• Resource management or parameters impact…

• All lead to different tests and observations

Capture and Replay

[Diagram: Capture; copy of the database (RMAN, DG, expdp flashback_scn=nnn); upgrade to 12.1 and nonCDB to PDB; Replay]

Open sessions

• In principle, sessions already open when capture starts can create errors/issues at replay

• In our experience, this is not much of an issue: marginal differences

In-flight transactions

• The recommendation is to stop the database instances, then start the instance(s) in restricted mode and then enable capture. This is not possible in most cases

• Otherwise, transactions in flight when capture starts can cause errors for dependent transactions at replay

• Not an issue if the errors are a negligible percentage of the workload

[Diagram: Transaction A, Transaction B; possible cascading effect on some other transactions]

Capture files (1/2)

• One file created per server process (that is, one per session with dedicated server processes)

• Sequential, buffered writing per session

access("/…/wcr_7jya5h0000009.rec", F_OK) = -1 ENOENT (No such file or directory)

open("/…/wcr_7jya5h0000009.rec", O_RDWR|O_CREAT|O_TRUNC, 0666) = 10

Capture files (2/2)

$ ls -lrt /proc/7576/fd/

lrwx------ 1 oracle ci 64 Sep 20 22:06 9 -> […]wcr_7jwjrh0000002.rec

$ strace -tt -T -p 7937 2>&1 | grep "write(9"

22:24:30.745968 write(9, "..."..., 4096) = 4096 <0.000028>
22:24:30.746050 write(9, "..."..., 45056) = 45056 <0.000044>
22:24:30.746149 write(9, "..."..., 684) = 684 <0.000016>
22:24:31.111495 write(9, "..."..., 4096) = 4096 <0.000026>
22:24:31.111584 write(9, "..."..., 45056) = 45056 <0.000038>
22:24:31.111675 write(9, "..."..., 713) = 713 <0.000017>
22:24:31.474120 write(9, "..."..., 4096) = 4096 <0.000027>
22:24:31.474193 write(9, "..."..., 45056) = 45056 <0.000040>
22:24:31.474282 write(9, "..."..., 712) = 712 <0.000019>
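In the trace above, the writes appear to arrive in bursts of three calls, which is consistent with buffered writing. A minimal sketch for summarising such output follows. The regular expression assumes exactly the strace -tt -T line layout shown above (with the payload elided), the 0.1 second gap used to separate bursts is an arbitrary assumption, and the function names are hypothetical.

```python
import re
from datetime import datetime

# Matches e.g.: 22:24:30.745968 write(9, "..."..., 4096) = 4096 <0.000028>
WRITE_RE = re.compile(
    r"^(?P<ts>\d{2}:\d{2}:\d{2}\.\d{6}) write\((?P<fd>\d+), .*\) "
    r"= (?P<n>\d+) <(?P<dur>[\d.]+)>$"
)

SAMPLE = '''
22:24:30.745968 write(9, "..."..., 4096) = 4096 <0.000028>
22:24:30.746050 write(9, "..."..., 45056) = 45056 <0.000044>
22:24:30.746149 write(9, "..."..., 684) = 684 <0.000016>
22:24:31.111495 write(9, "..."..., 4096) = 4096 <0.000026>
22:24:31.111584 write(9, "..."..., 45056) = 45056 <0.000038>
22:24:31.111675 write(9, "..."..., 713) = 713 <0.000017>
22:24:31.474120 write(9, "..."..., 4096) = 4096 <0.000027>
22:24:31.474193 write(9, "..."..., 45056) = 45056 <0.000040>
22:24:31.474282 write(9, "..."..., 712) = 712 <0.000019>
'''


def parse_writes(lines):
    """Return (timestamp, bytes_written) for every write line that matches."""
    writes = []
    for line in lines:
        m = WRITE_RE.match(line.strip())
        if m:
            ts = datetime.strptime(m["ts"], "%H:%M:%S.%f")
            writes.append((ts, int(m["n"])))
    return writes


def group_bursts(writes, gap_seconds=0.1):
    """Sum the bytes of writes that follow each other within gap_seconds."""
    bursts = []
    last_ts = None
    for ts, nbytes in writes:
        if last_ts is None or (ts - last_ts).total_seconds() > gap_seconds:
            bursts.append(0)  # start a new burst
        bursts[-1] += nbytes
        last_ts = ts
    return bursts


if __name__ == "__main__":
    print(group_bursts(parse_writes(SAMPLE.splitlines())))
```

For this particular session the sketch groups the nine calls into three flushes of just under 50 KB each, roughly 0.36 seconds apart; other workloads will show different sizes and intervals.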

Synchronisation

• SCN: the COMMIT order in the captured workload will be preserved during replay and all replay actions will be executed only after all dependent COMMIT actions have completed

• OBJECT_ID: all replay actions will be executed only after all relevant COMMIT actions have completed

• OFF: no dependency (if independent transaction?)
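The difference between the three modes can be illustrated with a toy model. This is a simplified sketch of the dependency rules as worded above, not Oracle's implementation: each captured call is reduced to a hypothetical record holding its position in the capture, the objects it references and, for a COMMIT, the objects the committed transaction modified. It also assumes that "dependent" in SCN mode means every COMMIT captured earlier.

```python
from dataclasses import dataclass


@dataclass
class Call:
    seq: int                            # position in the captured workload
    objects: frozenset = frozenset()    # objects referenced by this call
    is_commit: bool = False
    modified: frozenset = frozenset()   # objects modified by the committed transaction


def commits_to_wait_for(call, calls, mode):
    """Return the seq numbers of earlier COMMITs the call waits for at replay."""
    earlier_commits = [c for c in calls if c.is_commit and c.seq < call.seq]
    if mode == "OFF":
        return []                       # no synchronisation at all
    if mode == "SCN":
        # Assumption: every earlier COMMIT counts as a dependency.
        return [c.seq for c in earlier_commits]
    if mode == "OBJECT_ID":
        # Only COMMITs that modified an object this call references.
        return [c.seq for c in earlier_commits if c.modified & call.objects]
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical capture: two commits on different tables, then a query on T2.
calls = [
    Call(seq=1, is_commit=True, modified=frozenset({"T1"})),
    Call(seq=2, is_commit=True, modified=frozenset({"T2"})),
    Call(seq=3, objects=frozenset({"T2"})),
]

if __name__ == "__main__":
    for mode in ("SCN", "OBJECT_ID", "OFF"):
        print(mode, commits_to_wait_for(calls[2], calls, mode))
```

In this model the query on T2 waits for both commits under SCN, only for the commit touching T2 under OBJECT_ID, and for nothing under OFF, which is why the less strict modes allow more concurrency at replay at the cost of possible divergence.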

sysdate and sequence

• Latest replay patch bundle (16086826, see references) captures sysdate and sequence calls so that they can be used at replay

Divergence

• DBA_WORKLOAD_REPLAY_DIVERGENCE

• GET_DIVERGING_STATEMENT function in DBMS_WORKLOAD_REPLAY

• Replay report provides a summary

SQL tuning set

• capture_sts => TRUE is not supported in RAC

• The tuning set can be used to compare SQL executions

Demo 2

• Capture on one database

• Check the capture files

• Replay

• Check the replay report

Methodology matters! (1/2)

• Reproducible tests (scripted): reload database, upgrade, set parameters, disable some of the jobs and the time-based resource manager settings

• Gather statistics, logs and reports

Methodology matters! (2/2: caching)

• Buffer cache and shared pool (globally less impact for a long replay)

Multiple strategies:

1. Take the AWR reports after a first period of replay

2. Do not compare execution time but LIO, execution plans, etc.

3. Pre-warm (hint: use Capture and Replay!)

[Diagram: timeline with a "Warming cache" phase followed by a "Measure" phase]
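Strategy 2 can be sketched as follows: compare logical I/O (buffer gets) per execution between two runs rather than elapsed time, since the number of gets does not depend on whether the blocks were already cached. The statistics dictionaries, the SQL identifiers and the 20% threshold are hypothetical illustrations; in practice the figures would come from AWR reports or a SQL tuning set.

```python
def gets_per_exec(stats):
    """Map sql_id -> buffer gets per execution, skipping statements never executed."""
    return {
        sql_id: gets / execs
        for sql_id, (gets, execs) in stats.items()
        if execs
    }


def lio_regressions(before, after, threshold=1.2):
    """List (sql_id, gets/exec before, gets/exec after) where LIO grew past the threshold."""
    b, a = gets_per_exec(before), gets_per_exec(after)
    return sorted(
        (sql_id, b[sql_id], a[sql_id])
        for sql_id in b.keys() & a.keys()
        if a[sql_id] > b[sql_id] * threshold
    )


# Made-up sample values: sql_id -> (total buffer gets, executions).
replay_1 = {"sql_a": (1000, 100), "sql_b": (5000, 50)}
replay_2 = {"sql_a": (1050, 100), "sql_b": (20000, 50)}

if __name__ == "__main__":
    for sql_id, old, new in lio_regressions(replay_1, replay_2):
        print(f"{sql_id}: {old:.1f} -> {new:.1f} gets per execution")
```

A statement flagged this way is a candidate for an execution plan comparison, independent of how warm the cache was during either run.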

Several replays

• When making multiple changes, the advice is to compare one replay with another rather than a replay with the capture (same version: this measures only the differences due to the changes, not the capture/replay differences)

[Diagram: Capture; Replay 1 (new platform, new version, capture->replay); Replay 2 (new platform, new version, capture->replay, changes)]

Demo 3

• Methodology, example CASTORNS

Multitenant Database – Capture/Replay

[Figures: "SQL ordered by Gets" for Non CDB and for CDB]

Consolidated replay

• In 11.2.0.3 apply patch 16086826

• In 11.2.0.4 and 12.1, no patch required

• My Oracle Support note 1453789.1, “Real Application Testing: Consolidated Database Replay Feature”


1. Copy data (plug)

2. Copy and process capture files

3. Configure and initialize replay

EXEC DBMS_WORKLOAD_REPLAY.BEGIN_REPLAY_SCHEDULE ('CONS_SCHEDULE');

SELECT DBMS_WORKLOAD_REPLAY.ADD_CAPTURE ('DBA') FROM dual;

SELECT DBMS_WORKLOAD_REPLAY.ADD_CAPTURE ('DBB') FROM dual;

EXEC DBMS_WORKLOAD_REPLAY.END_REPLAY_SCHEDULE;

EXEC DBMS_WORKLOAD_REPLAY.INITIALIZE_CONSOLIDATED_REPLAY ('CONS_REPLAY','CONS_SCHEDULE');

4. Remap connections

EXEC DBMS_WORKLOAD_REPLAY.REMAP_CONNECTION (schedule_cap_id => 1, connection_id => 1, replay_connection => 'db121ol5/pdba');

5. Prepare, launch wrc, start replay

EXEC DBMS_WORKLOAD_REPLAY.PREPARE_CONSOLIDATED_REPLAY (synchronization => 'OBJECT_ID');

wrc

EXEC DBMS_WORKLOAD_REPLAY.START_CONSOLIDATED_REPLAY;

Demo 4

• Replay multiple workloads into a pluggable database

Resource manager

• 16 4.496583 select /*+ parallel(3) */

• 16 7.048278 select /*+ parallel(3) */

• 16 3.281641 select /*+ parallel(3) */

Resource management

• Resource management is critical for consolidation

• Example:

BEGIN
  DBMS_RESOURCE_MANAGER.CREATE_CDB_PLAN_DIRECTIVE(
    plan => 'newcdb_plan',
    pluggable_database => 'salespdb',
    shares => 3,
    utilization_limit => 100,
    parallel_server_limit => 100);
END;
/
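In a CDB resource plan the shares are relative: under contention, a PDB's portion of the resource is its shares divided by the total shares across the plan's directives. A minimal sketch of that arithmetic follows. Only salespdb and its 3 shares come from the example above; the other PDB names and share values are hypothetical, as is the function name.

```python
from fractions import Fraction


def guaranteed_fractions(shares):
    """Map each PDB to shares / total shares, its portion under contention."""
    total = sum(shares.values())
    return {pdb: Fraction(value, total) for pdb, value in shares.items()}


# salespdb => 3 is taken from the directive above; hrpdb and testpdb are
# hypothetical PDBs added only to make the arithmetic visible.
plan = {"salespdb": 3, "hrpdb": 1, "testpdb": 1}

if __name__ == "__main__":
    for pdb, fraction in guaranteed_fractions(plan).items():
        print(f"{pdb}: {fraction} ({float(fraction):.0%})")
```

With utilization_limit => 100 as in the directive above, a PDB is not capped and can use more than this portion whenever the other PDBs are idle.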

If start_capture hangs

• From Szymon Skorupinski: dbms_workload_capture.start_capture can hang, see “ADDM Jobs are in Status Executing or Running for a Long Time” (Doc ID 1557550.1). The workaround of disabling automatic ADDM runs after snapshots are taken works:

alter system set "_addm_auto_enable"=false scope=both sid='*';

Library

• We have built a library of {source DB, captured workload}

• Very useful for testing a new version, new OS, new platform

Conclusion

• Oracle Database 12c Multitenant database for consolidation

• Replay with your applications is the only way to prepare

• Use Real Application Testing not only for major upgrades, but also for patching and/or parameter changes

• It is integrated with Multitenant

• Capture and Replay to measure the differences if any

• Methodology, reproducibility

• Your turn!

References

• Oracle 12c testing guide, http://docs.oracle.com/cd/E24628_01/server.121/e20852/part2.htm#CHDGFGCC

• Oracle Database Replay, http://www.vldb.org/pvldb/2/vldb09-588.pdf

• Consistent Synchronization Schemes for Workload Replay, http://www.vldb.org/pvldb/vol4/p1225-morfonios.pdf

• Master Note for Real Application Testing Option (MOS Doc ID 1464274.1)

• Real Application Testing: Consolidated Database Replay Feature (MOS Doc ID 1453789.1)

• Pre and Post Installation Readme for Patch 16086826 DBREPLAY Patch Bundle 2 and Database Replay Workload Consolidation Feature (MOS Doc ID 1565663.1)

• Scripts To Debug Slow Replay (MOS Doc ID 760402.1)

• Julian Dyke presentations on Database Replay, http://www.juliandyke.com/Presentations/Presentations.html
