Automatic Storage Management (ASM) metrics are a goldmine: Let's use them!

About Me

Bertrand Drouvot

Oracle DBA since 1999

OCP 9i, 10g, 11g

RAC Certified Expert

Exadata Certified Implementation Specialist

Blogger since 2012

@bertranddrouvot

Basketball fan

Are you happy with:

asmcmd iostat?

asmiostat.sh (from MOS note 437996.1)?

I am not: the metrics provided are not sufficient, the way they can be extracted and displayed is not customizable enough, and they do not show how the I/O is distributed across all the ASM or database instances in a RAC environment.

Welcome to asm_metrics

1. It provides useful real-time metrics:

Reads/s: number of reads per second.

KbyRead/s: kilobytes read per second.

Avg ms/Read: average time per read, in milliseconds.

AvgBy/Read: average number of bytes per read.

Writes/s: number of writes per second.

KbyWrite/s: kilobytes written per second.

Avg ms/Write: average time per write, in milliseconds.

AvgBy/Write: average number of bytes per write.

2. It is RAC aware: you can display the metrics for all the ASM and/or database instances, or just for a subset of them.

3. You can aggregate the results according to your needs: per ASM instance, database instance, diskgroup, failgroup, or any combination of them.

4. It does not need any change to the source: simply download it and use it.

How does it work?

The script takes a snapshot every second (the default interval) from the cumulative view gv$asm_disk_iostat (or gv$asm_disk_stat) and computes the delta with the previous snapshot.

The only difference between gv$asm_disk_stat and gv$asm_disk is that gv$asm_disk_stat returns the information available in memory, while gv$asm_disk accesses the disks to re-collect some information.

Since the script does not need the information to be re-collected from the disks (no discovery of new disks is needed), gv$asm_disk_stat is more appropriate here.

Let’s use it

Important remark:

A blank value in one of these fields (INST, DBINST, DG, FG, DSK) means that the values have been aggregated over that field.
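To make the remark about blank fields concrete, here is a minimal Python sketch of that kind of aggregation. It is not the utility's actual code (asm_metrics is a Perl script): the `aggregate` function, the row layout and the counter values are assumptions made for illustration only. Fields that are not requested are blanked, so rows that differ only on those fields collapse into a single line.

```python
from collections import defaultdict

# Grouping fields as displayed by the utility.
FIELDS = ("INST", "DBINST", "DG", "FG", "DSK")


def aggregate(rows, show):
    """Sum reads and writes over every field that is not listed in `show`.

    `rows` is a list of dicts holding the five grouping fields plus
    'reads' and 'writes' counters (a hypothetical layout for this sketch).
    """
    totals = defaultdict(lambda: {"reads": 0, "writes": 0})
    for row in rows:
        # A field that is not shown is replaced by a blank, as in the output.
        key = tuple(row[f] if f in show else "" for f in FIELDS)
        totals[key]["reads"] += row["reads"]
        totals[key]["writes"] += row["writes"]
    return dict(totals)


# Made-up counter values, for illustration only.
rows = [
    {"INST": "+ASM1", "DBINST": "BDT10_1", "DG": "DATA", "FG": "HOST31",
     "DSK": "HOST31CA0D1C", "reads": 5, "writes": 1},
    {"INST": "+ASM1", "DBINST": "BDT10_1", "DG": "DATA", "FG": "HOST32",
     "DSK": "HOST32CA0D1D", "reads": 2, "writes": 0},
    {"INST": "+ASM1", "DBINST": "BDT10_1", "DG": "FRA", "FG": "HOST31",
     "DSK": "HOST31CC8D0F", "reads": 1, "writes": 4},
]

# Roughly what -show=dg would display: every other field is blank.
per_dg = aggregate(rows, show={"DG"})
for key, counters in sorted(per_dg.items()):
    print(key, counters)
```

Passing `show={"INST", "DG", "FG"}` instead keeps three distinct lines, one per (ASM instance, diskgroup, failgroup) combination, with DBINST and DSK left blank.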

How are the metrics computed?

The metrics are computed this way:

Reads/s is the delta of the READS column divided by the snapshot wait interval.

KbyRead/s is the delta of the BYTES_READ column divided by the snapshot wait interval.

Avg ms/Read is the delta of the READ_TIME column divided by the delta of the READS column.

AvgBy/Read is the delta of the BYTES_READ column divided by the delta of the READS column.

Writes/s is the delta of the WRITES column divided by the snapshot wait interval.

KbyWrite/s is the delta of the BYTES_WRITTEN column divided by the snapshot wait interval.

Avg ms/Write is the delta of the WRITE_TIME column divided by the delta of the WRITES column.

AvgBy/Write is the delta of the BYTES_WRITTEN column divided by the delta of the WRITES column.
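The computation described above can be sketched in a few lines of Python. This is an illustration, not the utility's own code: the `Snapshot` class, the `compute_metrics` function and the sample counters are hypothetical. The unit of READ_TIME and WRITE_TIME is also an assumption here (seconds, converted to milliseconds through the `time_to_ms` factor); adjust it to the unit your ASM version reports.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """Cumulative counters read from the view at one point in time."""
    reads: int
    bytes_read: int
    read_time: float
    writes: int
    bytes_written: int
    write_time: float


def _ratio(numerator, denominator):
    # Avoid dividing by zero when no I/O happened during the interval.
    return numerator / denominator if denominator else 0.0


def compute_metrics(prev, curr, interval_s, time_to_ms=1000.0):
    """Return the eight metrics for the interval between two snapshots.

    `time_to_ms` converts the time deltas to milliseconds; the default
    assumes the counters are in seconds (an assumption of this sketch).
    """
    d_reads = curr.reads - prev.reads
    d_bytes_read = curr.bytes_read - prev.bytes_read
    d_read_time = curr.read_time - prev.read_time
    d_writes = curr.writes - prev.writes
    d_bytes_written = curr.bytes_written - prev.bytes_written
    d_write_time = curr.write_time - prev.write_time
    return {
        "Reads/s": d_reads / interval_s,
        "KbyRead/s": d_bytes_read / 1024 / interval_s,
        "Avg ms/Read": _ratio(d_read_time * time_to_ms, d_reads),
        "AvgBy/Read": _ratio(d_bytes_read, d_reads),
        "Writes/s": d_writes / interval_s,
        "KbyWrite/s": d_bytes_written / 1024 / interval_s,
        "Avg ms/Write": _ratio(d_write_time * time_to_ms, d_writes),
        "AvgBy/Write": _ratio(d_bytes_written, d_writes),
    }


# Two made-up snapshots taken one second apart: 10 reads of 16 KB, no writes.
prev = Snapshot(100, 1_000_000, 1.0, 50, 500_000, 0.5)
curr = Snapshot(110, 1_163_840, 1.5, 50, 500_000, 0.5)
metrics = compute_metrics(prev, curr, interval_s=1)
for name, value in metrics.items():
    print(f"{name}: {value}")
```

The averages are guarded against a zero delta because a disk with no reads (or writes) during the interval would otherwise cause a division by zero.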

What are the features? (1/3)

To explain the features, let’s have a look at the help.

What are the features? (2/3)

1. You can choose the number of snapshots to display and the time to wait between snapshots. This lets you watch a limited number of snapshots taken at the interval of your choice.

2. You can choose the ASM instance on which to collect the metrics with the -INST= parameter. This is useful in a RAC configuration to see how the ASM metrics are distributed across the ASM instances.

3. You can choose the DB instance for which to collect the metrics with the -DBINST= parameter (wildcard allowed), in case you need to focus on a particular database or a subset of them.

4. You can choose the diskgroup on which to collect the metrics with the -DG= parameter (wildcard allowed), in case you need to focus on a particular diskgroup or a subset of them.

5. You can choose the failgroup on which to collect the metrics with the -FG= parameter (wildcard allowed), in case you need to focus on a particular failgroup or a subset of them.

What are the features? (3/3)

6. You can choose the Exadata cells on which to collect the metrics with the -IP= parameter (wildcard allowed), in case you need to focus on a particular cell or a subset of them.

7. You can aggregate the results at the ASM instance, DB instance, diskgroup or failgroup (or Exadata cell IP) level with the -SHOW= parameter. This is useful to get an overview of what is going on per ASM instance, per diskgroup or per any combination you want, as this is fully customizable.

8. You can display the metrics per snapshot, the average metrics since the collection began (that is to say, since the script was launched), or both, with the -DISPLAY= parameter.

9. You can sort on the number of reads, the number of writes or the number of IOPS (reads + writes) with the -SORT_FIELD= parameter. This way you can find which ASM instance, database instance, diskgroup or failgroup (or whatever combination you display) generates most of the reads, most of the writes or most of the IOPS. For example, you can find out which database is responsible for most of the I/O.

Use case 1

Find out the top physical IO consumers through ASM in real time. This is useful because you don’t need to connect to any database instance to get this information: it is "centralized" in the ASM instances.

Let's first sort on the number of reads per second:

./asm_metrics.pl -show=dbinst -sort_field=reads

Use case 2

I want to see the ASM preferred read in action for a particular diskgroup (BDT_PREF for example) and see the IO metrics for the associated failgroups. I want to see that no reads are done "outside" the preferred failgroup.

Let’s configure the ASM preferred read parameters:

SQL> alter system set asm_preferred_read_failure_groups='BDT_PREF.WIN' sid='+ASM1';

System altered.

SQL> alter system set asm_preferred_read_failure_groups='BDT_PREF.JMO' sid='+ASM2';

System altered.

And check the behaviour with the utility:

./asm_metrics.pl -show=dg,inst,fg -dg=BDT_PREF

Use case 3

I want to see the IO distribution on Exadata across the cells (storage nodes). For example, I want to check that the IO load is well balanced across all the cells. This is feasible thanks to the -show=ip option:

./asm_metrics.pl -show=dbinst,dg,ip -dg=BDT

Use case 4, 5 & 6

I want to see the IO distribution recorded in the ASM instances:

./asm_metrics.pl -show=inst

I want to see the IO distribution recorded in the ASM instances for each database instance:

./asm_metrics.pl -show=inst,dbinst

I want to see the IO distribution recorded in the ASM instances for the database instances linked to the BDT database:

./asm_metrics.pl -show=inst,dbinst -dbinst=%BDT%

Use case 7, 8 & 9

I want to see the IO distribution over the failgroups:

./asm_metrics.pl -show=fg

I want to see the IO distribution and the associated metrics across the ASM instances and the failgroups:

./asm_metrics.pl -show=fg,inst

I want to see the IO distribution across the ASM instances, diskgroups and failgroups:

./asm_metrics.pl -show=fg,inst,dg

Remark

The use cases above focused only on snapshots taken over the last second, but you could also:

Take snapshots over a longer period of time with the interval parameter:

./asm_metrics.pl -interval=10 (for snapshots of 10 seconds)

View the average since the collection began (not only the delta between snapshots) with the display parameter:

./asm_metrics.pl -show=dbinst -sort_field=iops -display=avg

Graphing ASM metrics

For this I created the csv_asm_metrics utility, which produces a csv file from the output of the asm_metrics utility.

Once you have the csv file, you can graph the metrics with your favourite visualization tool (I’ll use Tableau as an example: http://www.tableausoftware.com/public/).

First you have to launch the asm_metrics utility this way (to ensure that all the fields are displayed):

-show=inst,dbinst,fg,dg,dsk for ASM >= 11g

-show=inst,fg,dg,dsk for ASM < 11g

and redirect the output to a text file:

./asm_metrics.pl -show=inst,dbinst,fg,dg,dsk > asm_metrics.txt

Produce the csv file

./csv_asm_metrics.pl -if=asm_metrics.txt -of=asm_metrics.csv -d='2014/07/04'

The csv file looks like:

Snap Time,INST,DBINST,DG,FG,DSK,Reads/s,Kby Read/s,ms/Read,By/Read,Writes/s,Kby Write/s,ms/Write,By/Write

2014/07/04 13:48:54,+ASM1,BDT10_1,DATA,HOST31,HOST31CA0D1C,0,0,0.0,0,0,0,0.0,0

2014/07/04 13:48:54,+ASM1,BDT10_1,DATA,HOST31,HOST31CA0D1D,0,0,0.0,0,0,0,0.0,0

2014/07/04 13:48:54,+ASM1,BDT10_1,DATA,HOST32,HOST32CA0D1C,0,0,0.0,0,0,0,0.0,0

2014/07/04 13:48:54,+ASM1,BDT10_1,DATA,HOST32,HOST32CA0D1D,2,32,0.2,16384,0,0,0.0,0

2014/07/04 13:48:54,+ASM1,BDT10_1,FRA,HOST31,HOST31CC8D0F,0,0,0.0,0,0,0,0.0,0

2014/07/04 13:48:54,+ASM1,BDT10_1,FRA,HOST32,HOST32CC8D0F,0,0,0.0,0,0,0,0.0,0

2014/07/04 13:48:54,+ASM1,BDT10_1,REDO1,HOST31,HOST31CC0D13,0,0,0.0,0,0,0,0.0,0

As you can see:

1. The day has been added (to create a date) and next ones will be calculated (should the snaps
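If you prefer a quick look at the csv file before opening a visualization tool, a few lines of Python can rank the top consumers. This is only a sketch, not part of the asm_metrics or csv_asm_metrics utilities: the `top_consumers` function is hypothetical, and it relies only on the column names shown in the csv header above. The sample reuses three rows from that extract.

```python
import csv
import io
from collections import defaultdict


def top_consumers(csv_file, group_field="DBINST", metric="Reads/s"):
    """Sum `metric` per `group_field`; return (name, total) pairs, highest first."""
    totals = defaultdict(float)
    for row in csv.DictReader(csv_file):
        totals[row[group_field]] += float(row[metric])
    # Sort by descending total, then by name so that ties stay in a stable order.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


# Header and three rows taken from the csv extract shown above.
sample = io.StringIO(
    "Snap Time,INST,DBINST,DG,FG,DSK,Reads/s,Kby Read/s,ms/Read,By/Read,"
    "Writes/s,Kby Write/s,ms/Write,By/Write\n"
    "2014/07/04 13:48:54,+ASM1,BDT10_1,DATA,HOST32,HOST32CA0D1D,2,32,0.2,16384,0,0,0.0,0\n"
    "2014/07/04 13:48:54,+ASM1,BDT10_1,FRA,HOST31,HOST31CC8D0F,0,0,0.0,0,0,0,0.0,0\n"
    "2014/07/04 13:48:54,+ASM1,BDT10_1,REDO1,HOST31,HOST31CC0D13,0,0,0.0,0,0,0,0.0,0\n"
)

ranking = top_consumers(sample, group_field="DG", metric="Reads/s")
for name, total in ranking:
    print(name, total)
```

With a real file, open it with `open("asm_metrics.csv", newline="")` and pass the file object to `top_consumers`. Note that the totals are summed over every row of the file, so with several snapshots they are suited to ranking rather than to reading as a per-second rate.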

Visualize top IO consumers

By ASM instances

Read IO distribution by Failgroup

Should I use the ASM preferred read?

Simulate the ASM preferred read (1/3)

Without (Read IOPS and Throughput)

Create calculated field

With simulated FG

Conclusion

Thanks to these use cases, I hope you can see how customizable the utility is and how you could benefit from it in your day-to-day work with ASM.

The main entry point for the tool is this blog page: http://bdrouvot.wordpress.com/asm_metrics_script/ from which you’ll be able to download the script or copy the source code.

Feel free to download it and to provide any feedback.

http://bdrouvot.wordpress.com/asm_metrics_script/

Questions?
