Automatic Storage Management (ASM) Metrics are a Goldmine: Let’s Use Them! Bertrand Drouvot


Page 1: Automatic Storage Management (ASM) metrics are a goldmine: Let's use them!

Automatic Storage Management (ASM) Metrics are a Goldmine: Let’s Use Them!

Bertrand Drouvot

Page 2: Automatic Storage Management (ASM) metrics are a goldmine: Let's use them!

Oracle DBA since 1999

OCP 9i, 10g, 11g

RAC Certified Expert

Exadata Certified Implementation Specialist

Blogger since 2012

@bertranddrouvot

Basketball fan

About Me

Page 3: Automatic Storage Management (ASM) metrics are a goldmine: Let's use them!

asmcmd iostat?

asmiostat.sh (from MOS [ID 437996.1])?

I am not: the metrics provided are not enough, the way we can extract and display them is not customizable enough, and we don’t see the I/O distribution across all the ASM or database instances in a RAC environment.

Are you happy with?

Page 4: Automatic Storage Management (ASM) metrics are a goldmine: Let's use them!

1. It provides useful real-time metrics:

Reads/s: Number of reads per second.

KbyRead/s: Kbytes read per second.

Avg ms/Read: Average ms per read.

AvgBy/Read: Average bytes per read.

Writes/s: Number of writes per second.

KbyWrite/s: Kbytes written per second.

Avg ms/Write: Average ms per write.

AvgBy/Write: Average bytes per write.

2. It is RAC aware: You can display the metrics for all the ASM and/or database instances, or just a subset.

3. You can aggregate the results according to your needs in a customizable way: Aggregate per ASM instance, database instance, Diskgroup, Failgroup or a combination of all of them.

4. It does not need any change to the source: Simply download it and use it.

Welcome to asm_metrics

Page 5: Automatic Storage Management (ASM) metrics are a goldmine: Let's use them!

The script takes a snapshot each second (default interval) from the gv$asm_disk_iostat cumulative view (or gv$asm_disk_stat) and computes the delta with the previous snapshot.

gv$asm_disk_stat differs from gv$asm_disk only in that it returns the information available in memory, while gv$asm_disk accesses the disks to re-collect some information.

Since the required information does not need to be “re-collected” from the disks (a discovery of new disks is not needed), gv$asm_disk_stat is more appropriate here.
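The snapshot/delta mechanism can be sketched as follows. This is a minimal Python illustration, not the Perl code of asm_metrics: the snapshot layout (a dictionary keyed by instance, database instance, diskgroup, failgroup and disk) and the function name `snapshot_delta` are assumptions made for the example, and the counter values are invented.

```python
# Hypothetical snapshot layout:
# {(inst, dbinst, dg, fg, dsk): {column_name: cumulative_value}}
def snapshot_delta(previous, current):
    """Return, per key, the difference of each cumulative counter between two snapshots."""
    deltas = {}
    for key, counters in current.items():
        before = previous.get(key)
        if before is None:
            # Key absent from the previous snapshot: no delta can be computed yet.
            continue
        deltas[key] = {col: counters[col] - before[col] for col in counters}
    return deltas


key = ("+ASM1", "BDT10_1", "DATA", "HOST32", "HOST32CA0D1D")
prev = {key: {"READS": 1000, "BYTES_READ": 16384000}}  # invented cumulative values
curr = {key: {"READS": 1002, "BYTES_READ": 16416768}}
delta = snapshot_delta(prev, curr)
print(delta[key])
```

Because the view is cumulative, only the difference between two consecutive snapshots describes the activity during the interval.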

How does it work?

Page 6: Automatic Storage Management (ASM) metrics are a goldmine: Let's use them!

Important remark:

A blank value in one of these fields (INST, DBINST, DG, FG, DSK) means that the values have been aggregated over that field.
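To illustrate the blank-field convention, here is a small Python sketch (an assumption about the principle, not the utility's actual code): rows are summed per combination of the displayed fields, and every field that is not displayed is left blank. The function name `aggregate`, the row layout and the read counts are hypothetical.

```python
from collections import defaultdict

FIELDS = ("INST", "DBINST", "DG", "FG", "DSK")


def aggregate(rows, show):
    """Sum reads per combination of the fields in `show`; other fields become blank."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[f] if f in show else "" for f in FIELDS)
        totals[key] += row["reads"]
    return dict(totals)


rows = [  # invented read counts
    {"INST": "+ASM1", "DBINST": "BDT10_1", "DG": "DATA", "FG": "HOST31", "DSK": "HOST31CA0D1C", "reads": 3},
    {"INST": "+ASM1", "DBINST": "BDT10_1", "DG": "DATA", "FG": "HOST32", "DSK": "HOST32CA0D1D", "reads": 2},
    {"INST": "+ASM1", "DBINST": "BDT10_1", "DG": "FRA", "FG": "HOST31", "DSK": "HOST31CC8D0F", "reads": 1},
]
by_dg = aggregate(rows, show={"INST", "DG"})
for key, reads in by_dg.items():
    print(key, reads)
```

Here DBINST, FG and DSK come out blank because the reads were summed across them.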

Let’s use it

Page 7: Automatic Storage Management (ASM) metrics are a goldmine: Let's use them!

The metrics are computed this way:

Reads/s comes from the delta computation of the READS column divided by the snapshot wait interval.

KbyRead/s comes from the delta computation of the BYTES_READ column divided by the snapshot wait interval.

Avg ms/Read comes from the delta computation of the READ_TIME / READS columns.

AvgBy/Read comes from the delta computation of the BYTES_READ / READS columns.

Writes/s comes from the delta computation of the WRITES column divided by the snapshot wait interval.

KbyWrite/s comes from the delta computation of the BYTES_WRITTEN column divided by the snapshot wait interval.

Avg ms/Write comes from the delta computation of the WRITE_TIME / WRITES columns.

AvgBy/Write comes from the delta computation of the BYTES_WRITTEN / WRITES columns.
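The read-side formulas can be written down as a short Python sketch (the write side is symmetrical). Several points are assumptions: the name `read_metrics`, a read-time delta already converted to milliseconds (called `READ_TIME_MS` here), a Kbyte of 1024 bytes, and the guard returning 0 when no read occurred during the interval.

```python
def read_metrics(delta, interval_s):
    """Compute the read metrics from the deltas of two consecutive snapshots."""
    reads = delta["READS"]
    return {
        "Reads/s": reads / interval_s,
        "KbyRead/s": delta["BYTES_READ"] / 1024 / interval_s,
        # Assumption: report 0 when there were no reads, to avoid dividing by zero.
        "Avg ms/Read": delta["READ_TIME_MS"] / reads if reads else 0.0,
        "AvgBy/Read": delta["BYTES_READ"] / reads if reads else 0,
    }


# Invented deltas over a 1 second interval.
m = read_metrics({"READS": 2, "BYTES_READ": 32768, "READ_TIME_MS": 0.4}, interval_s=1)
print(m)
```

With these invented deltas the result is 2 reads/s, 32 Kbytes/s, 0.2 ms per read and 16384 bytes per read.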

How are the metrics computed?

Page 8: Automatic Storage Management (ASM) metrics are a goldmine: Let's use them!

What are the features? (1/3)

To explain the features, let’s have a look at the help

Page 9: Automatic Storage Management (ASM) metrics are a goldmine: Let's use them!

What are the features? (2/3)

1. You can choose the number of snapshots to display and the time to wait between snapshots, so that you see a limited number of snapshots taken at a specified interval.

2. You can choose the ASM instances from which to collect the metrics thanks to the -INST= parameter. Useful in a RAC configuration to see the distribution of the ASM metrics per ASM instance.

3. You can choose the DB instances for which to collect the metrics thanks to the -DBINST= parameter (wildcard allowed). Useful when you need to focus on a particular database or a subset of them.

4. You can choose the Diskgroups for which to collect the metrics thanks to the -DG= parameter (wildcard allowed). Useful when you need to focus on a particular diskgroup or a subset of them.

5. You can choose the Failgroups for which to collect the metrics thanks to the -FG= parameter (wildcard allowed). Useful when you need to focus on a particular failgroup or a subset of them.

Page 10: Automatic Storage Management (ASM) metrics are a goldmine: Let's use them!

What are the features? (3/3)

6. You can choose the Exadata Cells for which to collect the metrics thanks to the -IP= parameter (wildcard allowed). Useful when you need to focus on a particular cell or a subset of them.

7. You can aggregate the results at the ASM instance, DB instance, Diskgroup or Failgroup (or Exadata cell IP) level thanks to the -SHOW= parameter. Useful to get an overview of what is going on per ASM instance, per diskgroup or whatever you want, as this is fully customizable.

8. You can display the metrics per snapshot, the average metrics value since the collection began (that is to say, since the script was launched), or both thanks to the -DISPLAY= parameter.

9. You can sort on the number of reads, the number of writes or the number of IOPS (reads+writes) thanks to the -SORT_FIELD= parameter, so that you can find the ASM instances, database instances, diskgroups, failgroups or whatever you aggregate on that generate most of the reads, most of the writes or most of the IOPS (for example, which database is the top contributor to the I/O).

Page 11: Automatic Storage Management (ASM) metrics are a goldmine: Let's use them!

Find the top physical I/O consumers through ASM in real time. This is useful because you don’t need to connect to any database instance to get this information: it is "centralized" in the ASM instances.

Let's first sort on the number of reads per second:

./asm_metrics.pl -show=dbinst -sort_field=reads
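The idea behind the sort can be sketched in Python. This only illustrates the principle: the function name `top_consumers`, the row layout, the descending order and the numbers are assumptions, not the utility's code or output.

```python
def top_consumers(rows, sort_field="iops"):
    """Order aggregated rows from the largest to the smallest value of sort_field."""
    def sort_key(row):
        if sort_field == "iops":
            # IOPS is defined as reads + writes.
            return row["reads"] + row["writes"]
        return row[sort_field]
    return sorted(rows, key=sort_key, reverse=True)


rows = [  # invented per-second values, one row per database instance
    {"DBINST": "BDT10_1", "reads": 50, "writes": 400},
    {"DBINST": "BDT11_1", "reads": 300, "writes": 20},
    {"DBINST": "BDT12_1", "reads": 120, "writes": 120},
]
by_reads = [r["DBINST"] for r in top_consumers(rows, "reads")]
by_iops = [r["DBINST"] for r in top_consumers(rows, "iops")]
print(by_reads)
print(by_iops)
```

The two orderings differ, which is why the choice of sort field matters when looking for the top consumer.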


Use case 1

Page 12: Automatic Storage Management (ASM) metrics are a goldmine: Let's use them!

I want to see the ASM preferred read in action for a particular diskgroup (BDT_PREF for example) and see the IO metrics for the associated failgroups. I want to see that no reads are done "outside" the preferred failgroup.

Let’s configure the ASM preferred read parameters:

SQL> alter system set asm_preferred_read_failure_groups='BDT_PREF.WIN' sid='+ASM1';

System altered.

SQL> alter system set asm_preferred_read_failure_groups='BDT_PREF.JMO' sid='+ASM2';

System altered.

And check its behaviour thanks to the utility:

./asm_metrics.pl -show=dg,inst,fg -dg=BDT_PREF
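The check "no reads outside the preferred failgroup" can also be expressed programmatically once the per-instance, per-failgroup figures are available. The Python sketch below is an assumption-laden illustration: the row layout, the `reads_per_s` field, the function name and the numbers are hypothetical. Only the instance-to-failgroup mapping comes from the two `alter system` commands above.

```python
# Preferred failgroup per ASM instance, as configured above for diskgroup BDT_PREF.
PREFERRED = {"+ASM1": "WIN", "+ASM2": "JMO"}


def reads_outside_preferred(rows, preferred):
    """Return the rows showing read activity on a failgroup that is not the preferred one."""
    return [
        r for r in rows
        if r["reads_per_s"] > 0 and preferred.get(r["INST"]) != r["FG"]
    ]


rows = [  # invented values
    {"INST": "+ASM1", "FG": "WIN", "reads_per_s": 120},
    {"INST": "+ASM1", "FG": "JMO", "reads_per_s": 0},
    {"INST": "+ASM2", "FG": "JMO", "reads_per_s": 95},
    {"INST": "+ASM2", "FG": "WIN", "reads_per_s": 3},
]
violations = reads_outside_preferred(rows, PREFERRED)
print(violations)
```

An empty result would mean that every read was served by the preferred failgroup of the instance that issued it.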


Use case 2

Page 13: Automatic Storage Management (ASM) metrics are a goldmine: Let's use them!

I want to see the IO distribution on Exadata across the Cells (storage nodes). For example, I want to check that the IO load is well balanced across all the cells. This is feasible thanks to the -show=ip option:

./asm_metrics.pl -show=dbinst,dg,ip -dg=BDT


Use case 3

Page 14: Automatic Storage Management (ASM) metrics are a goldmine: Let's use them!

I want to see the IO distribution recorded into the ASM instances:

./asm_metrics.pl -show=inst

I want to see the IO distribution recorded into the ASM instances for each database instance:

./asm_metrics.pl -show=inst,dbinst

I want to see the IO distribution recorded into the ASM instances for the database instances linked to the BDT database:

./asm_metrics.pl -show=inst,dbinst -dbinst=%BDT%
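The `%BDT%` filter suggests that `%` stands for "any sequence of characters", as in SQL LIKE. Under that assumption (how the utility matches names internally is not shown here), the matching can be sketched in Python as follows; `like_match` is a hypothetical helper, and `ORCL1` is an invented instance name.

```python
import re


def like_match(pattern, value):
    """Match value against a pattern in which % stands for any sequence of characters."""
    # Escape the literal parts and join them with ".*" in place of each %.
    regex = ".*".join(re.escape(part) for part in pattern.split("%"))
    return re.fullmatch(regex, value) is not None


instances = ["BDT10_1", "BDT10_2", "ORCL1"]
selected = [name for name in instances if like_match("%BDT%", name)]
print(selected)
```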


Use case 4, 5 & 6

Page 15: Automatic Storage Management (ASM) metrics are a goldmine: Let's use them!

I want to see the IO distribution over the FAILGROUPS:

./asm_metrics.pl -show=fg

I want to see the IO distribution and the associated metrics across the ASM instances and the failgroups:

./asm_metrics.pl -show=fg,inst

I want to see the IO distribution across the ASM instances, diskgroups and failgroups:

./asm_metrics.pl -show=fg,inst,dg


Use case 7,8 & 9

Page 16: Automatic Storage Management (ASM) metrics are a goldmine: Let's use them!

The use cases focused only on snapshots taken during the last second, but you could also:

Take snapshots over a longer period of time thanks to the interval parameter:

./asm_metrics.pl -interval=10 (for snaps of 10 seconds)

View the average since the collection began (not only the snaps delta) thanks to the display parameter:

./asm_metrics.pl -show=dbinst -sort_field=iops -display=avg
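The difference between the per-snapshot value and the average since the collection began can be illustrated with a short Python sketch. It assumes the average is the total delta since the first snapshot divided by the elapsed time, which may not be exactly how the utility computes it. The class name `ReadRate` and the counter values are hypothetical.

```python
class ReadRate:
    """Track reads/s per snapshot and on average since the first snapshot."""

    def __init__(self, initial_reads):
        self.first = initial_reads
        self.last = initial_reads
        self.elapsed_s = 0

    def update(self, reads, interval_s):
        """Feed a new cumulative counter value; return (per-snapshot rate, average rate)."""
        snap_rate = (reads - self.last) / interval_s
        self.last = reads
        self.elapsed_s += interval_s
        avg_rate = (reads - self.first) / self.elapsed_s
        return snap_rate, avg_rate


rate = ReadRate(initial_reads=1000)   # invented cumulative READS values
first = rate.update(1100, interval_s=10)
second = rate.update(1400, interval_s=10)
print(first, second)
```

In the second snapshot the per-snapshot rate is 30 reads/s while the average since the start is 20 reads/s: the average smooths out bursts, the snapshot view shows them.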


Remark

Page 17: Automatic Storage Management (ASM) metrics are a goldmine: Let's use them!

To graph the ASM metrics, I created the csv_asm_metrics utility, which produces a csv file from the output of the asm_metrics utility.

Once you get the csv file you can graph the metrics with your favourite visualization tool (I’ll use Tableau as an example).

First, launch the asm_metrics utility with the following options (to ensure that all the fields are displayed):

-show=inst,dbinst,fg,dg,dsk for ASM >= 11g

-show=inst,fg,dg,dsk for ASM < 11g

and redirect the output to a text file:

./asm_metrics.pl -show=inst,dbinst,fg,dg,dsk > asm_metrics.txt


Graphing ASM metrics

Page 18: Automatic Storage Management (ASM) metrics are a goldmine: Let's use them!

./csv_asm_metrics.pl -if=asm_metrics.txt -of=asm_metrics.csv -d='2014/07/04'

The csv file looks like:

Snap Time,INST,DBINST,DG,FG,DSK,Reads/s,Kby Read/s,ms/Read,By/Read,Writes/s,Kby Write/s,ms/Write,By/Write

2014/07/04 13:48:54,+ASM1,BDT10_1,DATA,HOST31,HOST31CA0D1C,0,0,0.0,0,0,0,0.0,0

2014/07/04 13:48:54,+ASM1,BDT10_1,DATA,HOST31,HOST31CA0D1D,0,0,0.0,0,0,0,0.0,0

2014/07/04 13:48:54,+ASM1,BDT10_1,DATA,HOST32,HOST32CA0D1C,0,0,0.0,0,0,0,0.0,0

2014/07/04 13:48:54,+ASM1,BDT10_1,DATA,HOST32,HOST32CA0D1D,2,32,0.2,16384,0,0,0.0,0

2014/07/04 13:48:54,+ASM1,BDT10_1,FRA,HOST31,HOST31CC8D0F,0,0,0.0,0,0,0,0.0,0

2014/07/04 13:48:54,+ASM1,BDT10_1,FRA,HOST32,HOST32CC8D0F,0,0,0.0,0,0,0,0.0,0

2014/07/04 13:48:54,+ASM1,BDT10_1,REDO1,HOST31,HOST31CC0D13,0,0,0.0,0,0,0,0.0,0

As you can see:

1. The day has been added (to create a date) and next ones will be calculated (should the snaps ...

Produce the csv file
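If no visualization tool is at hand, a csv file with this layout can also be explored with a few lines of Python. The sketch below uses only the standard library. The header and the two data rows are copied from the sample shown on this slide, while the function name `reads_by_failgroup` is an assumption made for the example.

```python
import csv
import io
from collections import defaultdict

SAMPLE = """Snap Time,INST,DBINST,DG,FG,DSK,Reads/s,Kby Read/s,ms/Read,By/Read,Writes/s,Kby Write/s,ms/Write,By/Write
2014/07/04 13:48:54,+ASM1,BDT10_1,DATA,HOST31,HOST31CA0D1C,0,0,0.0,0,0,0,0.0,0
2014/07/04 13:48:54,+ASM1,BDT10_1,DATA,HOST32,HOST32CA0D1D,2,32,0.2,16384,0,0,0.0,0
"""


def reads_by_failgroup(csv_text):
    """Sum the Reads/s column per failgroup (FG)."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["FG"]] += float(row["Reads/s"])
    return dict(totals)


totals = reads_by_failgroup(SAMPLE)
print(totals)
```

To process a real file, the `io.StringIO` wrapper would be replaced by an `open("asm_metrics.csv", newline="")` call.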

Page 19: Automatic Storage Management (ASM) metrics are a goldmine: Let's use them!


Visualize top IO consumers

Page 20: Automatic Storage Management (ASM) metrics are a goldmine: Let's use them!


By ASM instances

Page 21: Automatic Storage Management (ASM) metrics are a goldmine: Let's use them!


Read IO distribution by Failgroup

Page 22: Automatic Storage Management (ASM) metrics are a goldmine: Let's use them!


Should I use the ASM preferred read?

Page 23: Automatic Storage Management (ASM) metrics are a goldmine: Let's use them!

Without (Read IOPS and Throughput)


Simulate the ASM preferred read (1/3)

Page 24: Automatic Storage Management (ASM) metrics are a goldmine: Let's use them!

Create calculated field


Simulate the ASM preferred read (2/3)

Page 25: Automatic Storage Management (ASM) metrics are a goldmine: Let's use them!

With simulated FG


Simulate the ASM preferred read (3/3)

Page 26: Automatic Storage Management (ASM) metrics are a goldmine: Let's use them!

Thanks to these use cases, I hope you can see how customizable the utility is and how you could benefit from it in day-to-day work with ASM.

The main entry point for the tool is this blog page: http://bdrouvot.wordpress.com/asm_metrics_script/ from which you’ll be able to download the script or copy the source code.

Feel free to download it and to provide any feedback.


Conclusion

Page 27: Automatic Storage Management (ASM) metrics are a goldmine: Let's use them!

Questions?