PE62 Part II: Updated Real-world Case Histories -- How to Monitor and Analyze the VMM and Storage I/O Statistics of a Power/AIX LPAR

Earl Jew ([email protected]) 310-251-2907 cell
Senior IT Management Consultant - IBM Power Systems and IBM Systems Storage
IBM Lab Services and Training - US Power Systems (group/dept)
400 North Brand Blvd., c/o IBM 8th floor, Glendale, CA 91203
[Extended: April 4th, 2013]

2011 IBM Power Systems Technical University, October 10-14 | Fontainebleau Miami Beach | Miami, FL

© Copyright IBM Corporation 2012. Materials may not be reproduced in whole or in part without the prior written permission of IBM. 5.3



Part I: Updated Concepts and Tactics -- How to Monitor and Analyze the VMM and Storage I/O Statistics of a Power/AIX LPAR

ABSTRACT

This presentation updates AIX/VMM (Virtual Memory Management) and LVM/JFS2 storage IO performance concepts and tactics for the day-to-day Power/AIX system administrator. It explains the meaning of the numbers offered by AIX commands (vmstat, iostat, mpstat, sar, etc.) to monitor and analyze the AIX VMM and storage IO performance and capacity of a given Power7/AIX LPAR.

These tactics are further illustrated in Part II: Updated Real-world Case Histories -- How to Monitor and Analyze the VMM and Storage I/O Statistics of a Power/AIX LPAR.


Part II: Updated Real-world Case Histories -- How to Monitor and Analyze the VMM and Storage I/O Statistics of a Power/AIX LPAR

ABSTRACT

These updated case-histories further illustrate the content presented in Part I: Updated Concepts and Tactics -- How to Monitor and Analyze the VMM and Storage I/O Statistics of a Power/AIX LPAR.

This presentation includes suggested ranges and ratios of AIX statistics to guide VMM and storage IO performance and capacity analysis.

Each case is founded on a different real-world customer configuration and workload that manifests characteristically in the AIX performance statistics -- as performing: intensely in bursts, with hangs and releases, AIX:lrud constrained, AIX-buffer constrained, freely unconstrained, inode-lock contended, consistently light, atomic&synchronous, virtually nil IO workload, long avg-wait's, perfectly ideal, long avg-serv's, mostly rawIO, etc.


Strategic Thoughts, Concepts, Considerations, and Tactics

• Monitoring AIX – Usage, Meaning and Interpretation
  – Review component technology of the infrastructure, i.e. proper tuning-by-hardware
  – Review implemented AIX constructs, i.e. “firm” near-static structures and settings
  – Review historical/accumulated AIX events, i.e. usages, pendings, counts, blocks, etc.
  – Monitor dynamic AIX command behaviors, e.g. ps, vmstat, mpstat, iostat

• Recognizing Common Performance-degrading Scenarios
  – High Load Average relative to count-of-LCPUs, i.e. “over-threadedness”
  – vmstat:memory:avm near-to or greater-than lruable-gbRAM, i.e. over-committed
  – Continuous low vmstat:memory:fre with persistent lrud (fr:sr) activity
  – Continuous high ratio of vmstat:kthr:b relative to vmstat:kthr:r
  – Poor ratio of pages freed to pages examined (fr:sr ratio) in vmstat -s output


High Load Average relative to count-of-LCPUs, i.e. “over-threadedness”

System Configuration: lcpu=30 mem=123903MB

kthr memory page faults cpu time

----------- --------------------- ------------------------------------ ------------------ ----------- --------

r b p avm fre fi fo pi po fr sr in sy cs us sy id wa hr mi se

175 5 0 15545407 6363904 4659 378 0 0 4519 18608 6995 1798098 17456 89 11 0 0 07:39:31

169 5 0 15564915 6344425 4130 372 0 0 4269 25154 7503 1818519 17848 88 12 0 0 07:39:32

165 5 0 15592443 6316497 4697 496 0 0 4401 24161 7645 1722401 18499 88 12 0 0 07:39:33

175 4 0 15616366 6292662 5204 522 0 0 5576 39136 8115 1862990 19747 88 12 0 0 07:39:34

177 5 0 15619267 6290033 4224 673 0 0 4903 21061 7792 1772567 19948 88 12 0 0 07:39:35

186 7 0 15639664 6269361 4519 586 0 0 4394 19688 8235 1804175 22383 87 13 0 0 07:39:36

191 5 0 15651883 6257403 4158 286 0 0 4521 19371 6383 1833902 16902 88 12 0 0 07:39:37

199 2 0 15670551 6238481 2623 362 0 0 2470 11057 5157 1837258 13450 88 12 0 0 07:39:38

204 1 0 15691425 6217629 1823 292 0 0 1941 9109 3527 1885205 9547 89 11 0 0 07:39:39

207 4 0 15698443 6210608 2539 442 0 0 2718 13941 7907 1843474 19631 88 12 0 0 07:39:40

224 3 0 15715376 6194061 2113 230 0 0 2592 13741 5283 1853315 12240 88 12 0 0 07:39:41

0 0 0 15728361 6180673 2142 236 0 0 1814 9105 5295 1859673 12272 88 12 0 0 07:39:42

224 3 0 15737615 6171916 2275 220 0 0 2839 17585 5058 1931829 12176 88 12 0 0 07:39:43

238 4 0 15737613 6171746 3182 290 0 0 3108 16083 6011 1883330 14504 88 12 0 0 07:39:44

243 4 0 15739367 6169632 3016 356 0 0 2839 13574 4945 1917855 14129 89 11 0 0 07:39:45

245 2 0 15742352 6166712 2270 376 0 0 2463 15306 3546 1941029 10252 89 11 0 0 07:39:46

243 2 0 15754661 6154318 2280 312 0 0 2332 15982 3393 1892718 9638 89 11 0 0 07:39:47

244 3 0 15737393 6172133 1958 353 0 0 2592 14843 4138 1918667 11481 89 11 0 0 07:39:48

246 3 0 15737126 6172074 1682 311 0 0 1426 14808 4146 1922001 11942 89 11 0 0 07:39:49

242 2 0 15758610 6150400 1668 244 0 0 1555 8393 4324 1924860 10869 88 12 0 0 07:39:50

kthr memory page faults cpu time

----------- --------------------- ------------------------------------ ------------------ ----------- --------

r b p avm fre fi fo pi po fr sr in sy cs us sy id wa hr mi se

251 5 0 15790370 6118650 1985 452 0 0 2207 13685 5625 1903786 12530 88 12 0 0 07:39:51

259 2 0 15797418 6111729 1660 266 0 0 1937 12706 6979 1957108 16857 88 12 0 0 07:39:52
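As a rough check of the over-threadedness shown in this listing, the run-queue depth (kthr:r) can be compared with the LCPU count. The following is a minimal Python sketch, using a few kthr:r values copied from the listing and lcpu=30 from the configuration line. The helper name and the choice of a simple mean-r-per-LCPU figure are illustrative assumptions, not something the deck prescribes.

```python
# Hypothetical helper: mean runnable kernel threads (vmstat kthr:r) per logical CPU.
def run_queue_per_lcpu(r_values, lcpu):
    """Return the mean of the kthr:r samples divided by the LCPU count."""
    if lcpu <= 0:
        raise ValueError("lcpu must be positive")
    if not r_values:
        raise ValueError("need at least one sample")
    return (sum(r_values) / len(r_values)) / lcpu

# kthr:r samples copied from the listing (07:39:31 through 07:39:35)
samples = [175, 169, 165, 175, 177]
ratio = run_queue_per_lcpu(samples, lcpu=30)
print(f"mean runnable threads per LCPU: {ratio:.1f}")
```

A value well above 1 means that, on average, several runnable threads are queued for every logical CPU.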


vmstat:memory:avm near-to or greater-than lruable-gbRAM; memory over-committed: 3986734*4096 bytes = 16329mb in decimal units (about 15573mb in the 1024-based units of mem=) vs 15744mb

System configuration: lcpu=8 mem=15744MB

kthr memory page faults cpu time
----------- --------------------- ------------------------------------ ------------------ ----------- --------
r b p avm fre fi fo pi po fr sr in sy cs us sy id wa hr mi se
1 1 0 3986577 2652 1944 797 0 0 1536 12803 880 2377 4459 10 4 55 31 14:17:58
2 2 0 3986576 2553 1863 757 0 0 2557 37067 852 4053 4446 11 4 55 30 14:18:00
2 1 0 3986574 2206 1959 799 0 0 2559 37499 1009 2523 4559 10 6 53 31 14:18:02
0 3 0 3986573 2597 2044 843 0 0 3069 42804 912 2377 4553 11 4 55 30 14:18:04
1 2 0 3986571 2511 1870 754 0 0 2559 167438 804 2203 4247 10 4 56 30 14:18:06
0 2 0 3986571 2197 1944 787 0 0 2560 102054 814 2310 4063 10 4 56 30 14:18:08
0 2 0 3986570 2872 1960 792 0 0 3070 42557 889 4148 4532 11 4 54 30 14:18:10
1 2 0 3986569 3752 1876 764 0 0 3070 65622 933 2363 4834 10 5 53 32 14:18:12
1 2 0 3986568 3864 1787 730 0 0 2559 49907 880 2135 4617 9 4 53 33 14:18:14
1 1 0 3986567 2634 1915 767 0 0 2047 30676 785 2774 3948 10 4 55 31 14:18:16
0 3 0 3986567 2523 1890 759 0 0 2552 27693 877 2646 4443 10 4 55 32 14:18:18
1 2 0 3986573 2040 2008 810 0 0 2557 23419 928 5155 4671 12 4 54 30 14:18:20
1 2 0 3986572 1962 1878 761 0 0 2554 52663 905 2525 4795 10 4 56 29 14:18:22
2 2 0 3986587 2652 1960 798 3 0 3071 14081 1030 11377 7789 13 9 51 27 14:18:24
2 2 0 3986570 2363 1938 781 0 0 2558 30570 836 3004 5732 10 5 56 29 14:18:26
2 1 0 3986734 2056 1884 762 1 0 2557 32017 888 31414 6058 15 11 47 26 14:18:28
2 0 0 3986617 1933 1920 779 2 0 2558 15377 933 22108 5545 15 9 48 28 14:18:30
1 0 0 3986612 2463 2008 826 0 0 3069 25129 1192 2823 5935 11 9 52 28 14:18:32
1 2 0 3986586 3073 1988 810 0 0 3064 15116 816 2732 4430 10 4 56 30 14:18:34
0 1 0 3986587 3402 1719 685 0 0 2555 24262 799 3395 4429 9 4 58 29 14:18:36
kthr memory page faults cpu time
----------- --------------------- ------------------------------------ ------------------ ----------- --------
r b p avm fre fi fo pi po fr sr in sy cs us sy id wa hr mi se
0 3 0 3986582 2347 1841 748 0 0 2047 14678 865 2352 4683 10 4 56 31 14:18:38
0 1 0 3986580 3068 1945 784 0 0 3070 24649 784 4741 4233 11 4 55 29 14:18:40
0 2 0 3986583 2797 1929 780 0 0 2559 16436 806 2466 4205 10 4 57 29 14:18:42
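The avm-versus-RAM comparison in the slide title can be worked through explicitly. The sketch below converts the peak avm page count to megabytes in both decimal and 1024-based units. It assumes that the mem= figure on the configuration line is in 1024-based megabytes. Under that assumption, the 16329 figure in the title is a decimal-megabyte value, and the like-for-like figure is slightly below installed memory rather than above it. The function name is hypothetical.

```python
PAGE_BYTES = 4096  # avm is reported in 4 KB pages

def avm_megabytes(avm_pages):
    """Return (decimal MB, 1024-based MB) for a vmstat avm page count."""
    total_bytes = avm_pages * PAGE_BYTES
    return total_bytes / 1_000_000, total_bytes / (1024 * 1024)

avm_pages = 3986734   # peak avm in the listing (14:18:28)
mem_mb = 15744        # from "System configuration: lcpu=8 mem=15744MB"

decimal_mb, binary_mb = avm_megabytes(avm_pages)
print(f"avm = {decimal_mb:.0f} MB (decimal), {binary_mb:.0f} MB (1024-based)")
print(f"avm as a fraction of mem: {binary_mb / mem_mb:.3f}")
```

Either way the computational footprint sits within about one percent of total memory, which leaves almost nothing for file caching and is consistent with the near-empty free list and the constant fr:sr activity in the listing.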


Continuous low vmstat:memory:fre with persistent lrud (fr:sr) activity
Continuous high ratio of vmstat:kthr:b relative to vmstat:kthr:r

System configuration: lcpu=8 mem=20480MB ent=3.90

kthr memory page faults cpu time
----------- --------------------- ------------------------------------ ------------------ ----------------------- --------
r b p avm fre fi fo pi po fr sr in sy cs us sy id wa pc ec hr mi se
1 11 0 2381462 12283 574 41 4 0 0 0 282 14044 3759 36 3 36 24 1.59 40.8 02:00:03
5 6 0 2383434 9229 537 21 15 0 0 0 214 9309 1221 65 8 22 5 2.94 75.4 02:00:05
3 5 0 2381734 10665 571 5 4 0 451 1303 442 8902 3345 83 6 5 6 3.53 90.4 02:00:07
3 10 0 2383409 7700 666 1 3 0 0 0 232 2606 1828 46 7 20 28 2.08 53.3 02:00:09
6 10 0 2383410 7743 611 0 0 0 661 1956 270 5229 3206 93 2 2 4 3.72 95.3 02:00:11
5 8 0 2384836 7559 544 35 1 45 1150 3962 475 9629 4267 48 10 22 19 2.33 59.7 02:00:13
4 7 0 2385533 8572 569 82 1 30 1437 3311 259 8388 1512 83 2 6 10 3.33 85.3 02:00:15
2 8 0 2387636 7716 411 90 5 7 967 2971 467 40650 2367 58 15 12 16 2.89 74.1 02:00:17
5 7 0 2390359 7622 542 20 19 11 1908 5953 360 18005 2195 77 4 9 10 3.19 81.8 02:00:19
3 11 0 2390756 7956 511 0 91 10 977 3899 178 3897 1655 59 2 17 22 2.40 61.5 02:00:21
3 9 0 2390761 7750 487 16 87 0 504 1368 471 3018 2893 45 2 33 19 1.89 48.5 02:00:23
6 7 0 2392294 7597 483 4 80 40 1227 3822 233 4682 2070 69 5 17 9 2.92 75.0 02:00:25
4 8 0 2392294 7837 413 30 53 0 571 2025 416 4990 5019 85 2 5 8 3.45 88.4 02:00:27
8 8 0 2392294 7704 409 0 0 0 385 983 184 2894 3480 53 1 25 20 2.16 55.3 02:00:29
11 8 0 2387726 19279 325 114 118 177 3971 13773 224 39964 2564 84 15 0 1 3.88 99.6 02:00:31
4 13 0 2402933 7698 390 656 189 118 3732 14638 436 39993 4822 53 24 7 16 3.06 78.6 02:00:33
5 11 0 2403390 8018 396 57 95 6 946 3780 263 23753 1784 93 3 1 3 3.77 96.6 02:00:35
3 11 0 2402851 7753 424 17 73 0 64 98 393 6538 2994 52 6 9 33 2.31 59.3 02:00:37
5 11 0 2404724 7710 354 10 49 0 1356 5981 239 4234 1928 70 6 16 9 2.96 75.8 02:00:39
3 11 0 2404716 8279 324 23 36 0 586 2056 436 6414 3849 76 2 8 13 3.12 80.0 02:00:41
kthr memory page faults cpu time
----------- --------------------- ------------------------------------ ------------------ ----------------------- --------
r b p avm fre fi fo pi po fr sr in sy cs us sy id wa pc ec hr mi se
5 13 0 2404718 8054 373 0 0 0 274 784 192 1603 2058 56 1 26 17 2.36 60.4 02:00:43
5 13 0 2404716 7965 311 0 0 0 273 583 237 2342 2394 79 1 7 13 3.16 81.0 02:00:45
2 12 0 2388769 26143 353 22 35 3 322 590 390 3550 3130 48 4 25 24 2.06 52.9 02:00:47
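The blocked-to-runnable comparison named in this slide's title can be reduced to a single number. The sketch below sums kthr:b and kthr:r over the first five samples of the listing and divides one by the other. The function name is hypothetical, and the deck does not state a numeric cut-off for a "high" ratio.

```python
# (r, b) pairs copied from the first five rows of the listing (02:00:03 through 02:00:11)
rows = [(1, 11), (5, 6), (3, 5), (3, 10), (6, 10)]

def blocked_to_runnable(samples):
    """Return total kthr:b divided by total kthr:r over the samples."""
    total_r = sum(r for r, _ in samples)
    total_b = sum(b for _, b in samples)
    if total_r == 0:
        return float("inf")
    return total_b / total_r

b_per_r = blocked_to_runnable(rows)
print(f"blocked threads per runnable thread: {b_per_r:.2f}")
```

In these samples more than two threads are blocked for every runnable one, which is the pattern the slide flags.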


Poor ratio of pages freed to pages examined (fr:sr ratio) in vmstat -s output

$ uptime ; vmstat -s
02:17PM up 159 days, 21:47, 1 user, load average: 1.31, 1.52, 2.97

42359798551 total address trans. faults
25149263165 page ins
17902490831 page outs
52357061 paging space page ins
59626441 paging space page outs
0 total reclaims
13778823141 zero filled pages faults
804184 executable filled pages faults
580181169061 pages examined by clock
310896 revolutions of the clock hand
29494284299 pages freed by the clock
4191584238 backtracks
149218393 free frame waits
0 extend XPT waits
4506482991 pending I/O waits
29188011653 start I/Os
8946597697 iodones
204899338951 cpu context switches
26163416710 device interrupts
699186076 software interrupts
31029975857 decrementer interrupts
15560545 mpc-sent interrupts
15560524 mpc-receive interrupts
53335915 phantom interrupts
0 traps
432963088862 syscalls
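The cumulative fr:sr ratio can be read directly off these counters. The following Python sketch divides "pages examined by clock" by "pages freed by the clock", and also computes start I/Os per iodone, using the figures copied from this listing. The helper name is hypothetical.

```python
# Cumulative counters copied from the vmstat -s listing.
pages_examined = 580181169061   # "pages examined by clock"
pages_freed = 29494284299       # "pages freed by the clock"
start_ios = 29188011653         # "start I/Os"
iodones = 8946597697            # "iodones"

def per_unit(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    if denominator == 0:
        return None
    return numerator / denominator

examined_per_freed = per_unit(pages_examined, pages_freed)
start_ios_per_iodone = per_unit(start_ios, iodones)

print(f"fr:sr since boot is about 1:{examined_per_freed:.1f}")
print(f"start I/Os per iodone is about {start_ios_per_iodone:.1f}")
```

Over 159 days of uptime, lrud examined nearly twenty pages for every page it freed.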


Poor ratio of pages freed to pages examined (fr:sr ratio) in vmstat -s output
Given sustained fr:sr ratios: 1:1.1/blue 1:3/green 1:5/warning 1:10/red

$ uptime ; vmstat -Iwt 2
02:17PM up 159 days, 21:47, 1 user, load average: 1.31, 1.52, 2.97
System configuration: lcpu=8 mem=15744MB

kthr memory page faults cpu time
----------- --------------------- ------------------------------------ ------------------ ----------- --------
r b p avm fre fi fo pi po fr sr in sy cs us sy id wa hr mi se
1 1 0 3986577 2652 1944 797 0 0 1536 12803 880 2377 4459 10 4 55 31 14:17:58
2 2 0 3986576 2553 1863 757 0 0 2557 37067 852 4053 4446 11 4 55 30 14:18:00
2 1 0 3986574 2206 1959 799 0 0 2559 37499 1009 2523 4559 10 6 53 31 14:18:02
0 3 0 3986573 2597 2044 843 0 0 3069 42804 912 2377 4553 11 4 55 30 14:18:04
1 2 0 3986571 2511 1870 754 0 0 2559 167438 804 2203 4247 10 4 56 30 14:18:06
0 2 0 3986571 2197 1944 787 0 0 2560 102054 814 2310 4063 10 4 56 30 14:18:08
0 2 0 3986570 2872 1960 792 0 0 3070 42557 889 4148 4532 11 4 54 30 14:18:10
1 2 0 3986569 3752 1876 764 0 0 3070 65622 933 2363 4834 10 5 53 32 14:18:12
1 2 0 3986568 3864 1787 730 0 0 2559 49907 880 2135 4617 9 4 53 33 14:18:14
1 1 0 3986567 2634 1915 767 0 0 2047 30676 785 2774 3948 10 4 55 31 14:18:16
0 3 0 3986567 2523 1890 759 0 0 2552 27693 877 2646 4443 10 4 55 32 14:18:18
1 2 0 3986573 2040 2008 810 0 0 2557 23419 928 5155 4671 12 4 54 30 14:18:20
1 2 0 3986572 1962 1878 761 0 0 2554 52663 905 2525 4795 10 4 56 29 14:18:22
2 2 0 3986587 2652 1960 798 3 0 3071 14081 1030 11377 7789 13 9 51 27 14:18:24
2 2 0 3986570 2363 1938 781 0 0 2558 30570 836 3004 5732 10 5 56 29 14:18:26
2 1 0 3986734 2056 1884 762 1 0 2557 32017 888 31414 6058 15 11 47 26 14:18:28
2 0 0 3986617 1933 1920 779 2 0 2558 15377 933 22108 5545 15 9 48 28 14:18:30
1 0 0 3986612 2463 2008 826 0 0 3069 25129 1192 2823 5935 11 9 52 28 14:18:32
1 2 0 3986586 3073 1988 810 0 0 3064 15116 816 2732 4430 10 4 56 30 14:18:34
0 1 0 3986587 3402 1719 685 0 0 2555 24262 799 3395 4429 9 4 58 29 14:18:36
kthr memory page faults cpu time
----------- --------------------- ------------------------------------ ------------------ ----------- --------
r b p avm fre fi fo pi po fr sr in sy cs us sy id wa hr mi se
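The colour bands in this slide's title can be applied to each vmstat interval. The sketch below is one possible reading of those bands: a sample is placed in the band of the highest threshold it reaches (1:3 green, 1:5 warning, 1:10 red, otherwise blue). That banding rule, the handling of fr=0, and the function name are assumptions of this sketch. The slide only lists the four reference ratios.

```python
# Reference ratios from the slide: 1:1.1 blue, 1:3 green, 1:5 warning, 1:10 red.
# Assumption: a sample falls into the band of the highest threshold it reaches.
BANDS = [(10.0, "red"), (5.0, "warning"), (3.0, "green")]

def classify_fr_sr(fr, sr):
    """Classify one vmstat interval by pages scanned (sr) per page freed (fr)."""
    if fr == 0:
        # Assumption: scanning without freeing anything counts as the worst band.
        return "red" if sr > 0 else "idle"
    scanned_per_freed = sr / fr
    for threshold, label in BANDS:
        if scanned_per_freed >= threshold:
            return label
    return "blue"

# (fr, sr) pairs copied from the first five rows of the listing
samples = [(1536, 12803), (2557, 37067), (2559, 37499), (3069, 42804), (2559, 167438)]
for fr, sr in samples:
    print(f"fr={fr} sr={sr} 1:{sr / fr:.1f} {classify_fr_sr(fr, sr)}")
```

Under this reading, the first interval lands in the warning band and the next four in red.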


Next: Case-histories to illustrate indications of performance issues

We will next review a parade of customer case-history textfiles.
• These case-histories are founded on a mundane AIX command script (see appendix)
• Each sanitized textfile illustrates a common indicated performance issue, if not several.
• Remedies to resolve will be offered, but not illustrated; most remedies are surprisingly simple.
• Except for inexplicably poor performance, there are typically no other apparent issues.
• In other words, Recognition is notably more problematic than Resolution.

Some indicated performance issues are:
• A simple lack of CPU and/or gbRAM for the given workload, i.e. poor Tuning-by-Hardware
• Improperly implemented tactics, i.e. AIX VMM parameter values that are far out-of-whack
• Simply continuing to use old technologies when better technologies are free and available
• Implementing tactics without understanding their purpose, appropriateness or compromise

Heads-up on 2 Hot Tips:
• Understand JFS2 rbr,rbw,rbrw mount-options for predominantly sequential IO workloads
• Understand the JFS2 cio mount-option for concurrently read-write IO workloads


An AIXperftuning tactic: The Tractor for move-the-data Sequential Read workloads – How to Qualify this as an Appropriate Tactic before Implementation

• Objective: Substantially reducing near-continuous burns of lrud’s “fr:sr” activity

• Mandatory Qualifications are:
  – deploying a SAN storage system of sufficient performance, i.e. V7000, DS8000, XIV, etc.
  – using AIX:LVM/JFS2 filesystems in-service of an RDBMS w/internal buffer cache
  – observing near-continuous burns of lrud’s “fr:sr” activity in vmstat -Iwt 2
  – observing high runtime for lrud; check ps -ek | egrep "lrud|syncd|TIME"
  – observing a sustained 5-digits or more of AIX:vmstat -Iwt 2:page:fi readIO
  – a confirming 10:1 or greater ratio of AIX:vmstat -s:start I/Os-to-iodones

• Bluntly, if your Power/AIX LPAR infrastructure/workload meets the above criteria, then implementing this simple tactic is Gonna-Rock BIG TIME !!!
  – If implemented on a workload less intense than the above, then the rock will be more like a pebble…
  – Of course, this tactic assumes all other dependencies are properly/sufficiently tuned.
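The two numeric qualifications in the list above lend themselves to a quick scripted check. The sketch below is illustrative only: it assumes that "sustained 5-digits" means every vmstat page:fi sample is at least 10000, and that the 10:1 test is start I/Os divided by iodones. The function name and the input values are hypothetical, and the non-numeric qualifications (SAN class, RDBMS buffer cache, lrud runtime) still need to be confirmed by hand.

```python
def tractor_candidate(fi_samples, start_ios, iodones):
    """Check the two numeric qualifications for the rbr tactic.

    Assumptions of this sketch: "sustained 5-digits" means every page:fi
    sample is >= 10000, and the 10:1 ratio is start I/Os / iodones.
    """
    sustained_fi = bool(fi_samples) and all(fi >= 10_000 for fi in fi_samples)
    ratio = start_ios / iodones if iodones else float("inf")
    return {
        "sustained_5_digit_fi": sustained_fi,
        "start_ios_per_iodone": ratio,
        "ratio_10_to_1": ratio >= 10.0,
    }

# Hypothetical inputs, invented purely for illustration.
result = tractor_candidate([18500, 22100, 19750], start_ios=5_000_000, iodones=400_000)
print(result)
```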


An AIXperftuning tactic: The Tractor for move-the-data Sequential Read workloads – What to do to Implement this Tactic i.e. mount -o rbr <…>

From: http://pic.dhe.ibm.com/infocenter/aix/v6r1/index.jsp?topic=%2Fcom.ibm.aix.cmds%2Fdoc%2Faixcmds3%2Fmount.htm

AIX 6.1 information > Commands > m

mount Command

Purpose
Makes a file system available for use.

Syntax
mount [ -f ] [ -n node ] [ -o options ] [ -p ] [ -r ] [ -v vfsname ] [ -t type | [ device | node:directory ] directory | all | -a ] [-V [generic_options] special_mount_points ]

Description
The mount command instructs the operating system to make a file system available for use at a specified location (the mount point). In addition, you can use the mount command to build other file trees made up of directory and file mounts. The mount command mounts a file system expressed as a device using the device or node:directory parameter on the directory specified by the directory parameter. After the mount command has finished, the directory specified becomes the root directory of the newly mounted file system.

...

mount -o rbr <…>
Mount file system with the release-behind-when-reading capability. When sequential reading of a file in this file system is detected, the real memory pages used by the file will be released once the pages are copied to internal buffers. If none of the release-behind options are specified, norbrw is the default.
Note: When rbr is specified, the D_RB_READ flag is ultimately set in the _devflags field in the pdtentry structure.

mount -o rbw <…>
Mount file system with the release-behind-when-writing capability. When sequential writing of a file in this file system is detected, the real memory pages used by the file will be released once the pages written to disk. If none of the release-behind options are specified, norbrw is the default.
Note: When rbw is specified, the D_RB_WRITE flag is set.

mount -o rbrw <…>
Mount file system with both release-behind-when-reading and release-behind-when-writing capabilities. If none of the release-behind options are specified, norbrw is the default.
Note: If rbrw is specified, both the D_RB_READ and the D_RB_WRITE flags are set.


An AIXperftuning tactic: The Tractor for move-the-data Sequential Read workloads – What does mount -o rbr <…> do to work so effectively?

• AIX buffers JFS2 SAN IO:
  – to implement its ReadAhead and WriteBehind algorithms (IO coalescence is Good)
  – to optimize JFS2 read-rehits and write-rehits (note: “rehits” are usually random IO)

• Substantial streams of Sequential IO almost never read-rehit or write-rehit in the JFS2 buffer cache – thus buffering Sequential IO offers little, if any, rehit benefit

• AIX:lrud indiscriminately/non-selectively uses its Clockhand to scan for Least Recently Used buffer-cache’d IOs to steal&free in-order to supply free memory
  – the kernel processing overhead of lrud fr:sr is often critically overwhelming
  – unfortunately howsoever overwhelming, it is also virtually always overlooked

Question:
Is there a way to scan|steal|free Sequential IO, and only buffer-cache Random IO (for JFS2 rehits) – without suffering the kernel processing overhead of lrud fr:sr ?


An AIXperftuning tactic: The Tractor for move-the-data Sequential Read workloads – Question & Answer explanation regarding mount -o rbr <…>

Question:
Is there a way to scan|steal|free Sequential IO, and only buffer-cache Random IO (for JFS2 rehits) – without suffering the kernel processing overhead of lrud fr:sr ?

Answer:
Yes. Use mount -o rbr <…> to mount JFS2 RDBMS data filesystems.
– rbr replaces lrud fr:sr by immediately freeing only the memory used to convey Sequential Reads to the RDBMS (thus rbr for release-behind-read).
– Unlike lrud, rbr is selective: It does no scanning of the buffer cache !!!
– rbr only works when “sequential reading of a file in this file system is detected”. Thereafter, only “the real memory pages used by the file will be released once the pages are copied to internal buffers”. These internal buffers can be the RDBMS itself.

Result:
– Sequential Reads of a mount -o rbr <…> JFS2 filesystem are not buffer-cached.
– This also means the Random Reads are still buffer-cached for read-rehits.
– The kernel processing overhead of lrud fr:sr is substantially reduced.
– Overall SAN IO performance/throughput is noticeably improved with The Tractor.


An AIXperftuning tactic: The Tractor for move-the-data Sequential Read workloads – Other good tactics to go along with mount -o rbr <…>

Construct your LUN/hdiskVG/LV/JFS2 filesystems for the best IO performance

– Ask your storage admin for a RAID5/6/10 LUN-map to answer: Which LUNs share the same RAIDset (and which ones don't), etc.?

– Do not share and reshare the same /dev/loglv01…99 log devices with more than one JFS2 filesystem. That is, only ever assign dedicated /dev/loglv's.

– As well, howsoever convenient, try not to use INLINE jfs2log devices; they are about 5% slower than dedicated jfs2log devices.

– Create /dev/loglv01…99 on a different (not co-resident) set of RAID5 LUNs apart from its associated data LUN/LV (study all LUN/hdisk->LVM:vg->lv/JFS2 filesystem mappings)

– Universally adopt the use of mount -o noatime.

– Monitor & tune AIX:vmstat -v:pbuf|psbuf|fsbuf blocked IOs (see Part I, 39-47)

– Consider using the counterpart mount -o rbw <…> for Sequential Write JFS2 filesystem workloads. Most RDBMS have a Write-Once-Only Sequential-Write logging mechanism; when there is no chance of JFS2 write-rehits for Sequential Writes, rbw is likely appropriate.


Criteria for Creating a Write-Expedient pagingspace_vg

The first priority should be to preclude any pagingspace-pageouts. Thus, a write-expedient pagingspace is only needed if you have any unavoidable pagingspace-pageout activity. Ultimately, if we must suffer any pagingspace-pageouts, we want them to write-out to the pagingspace as quickly as possible (thus my term: write-expedient).

So, for the sake of prudence, we should always create a write-expedient pagingspace. The listed traits below are optimal for write-expediency; include as many as you can (but always apply the key tuning tactic below):

• Create a dedicated AIX:LVM:vg (VolumeGroup) called pagingspace_vg

• Create the pagingspace_vg using FC-SAN storage LUNs (ideally RAID5 LUNs on SSD, FC or SAS technology disk drives, and not on SATA disk drives (which are slower and employ RAID6), nor on any local/internal SAS disks)

• The total size of the pagingspace in pagingspace_vg should match the size of installed LPAR gbRAM

• Assign 3-to-8 LUN/hdisks to pagingspace_vg and size each LUN to be an even fraction of installed gbRAM. For instance, if the LPAR has 18gbRAM, then assign three 6gb LUN/hdisks to pagingspace_vg

• Configure one AIX:LVM:VG:lv (logical volume) for each LUN/hdisk in pagingspace_vg; do not deploy PP-striping (because it messes-up discrete hdisk IO monitoring) –- just map one hdisk to one lv

• The key tuning tactic: With root-user privileges, use AIX:lvmo to set pagingspace_vg:pv_pbuf_count=2048. This will ensure pagingspace_vg:total_vg_pbufs will equal [<VGLUNcount> * pv_pbuf_count].

• To set the pv_pbuf_count value to 2048, type the following:
    lvmo -v pagingspace_vg -o pv_pbuf_count=2048
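The sizing rules above involve a little arithmetic: equal LUNs that sum to installed RAM, and total_vg_pbufs equal to the LUN count times pv_pbuf_count. The following Python sketch works through the 18gbRAM example from the list. The function name is hypothetical, and reading "even fraction" as "divides into whole gigabytes" is an assumption of this sketch.

```python
def pagingspace_layout(ram_gb, lun_count, pv_pbuf_count=2048):
    """Size equal LUNs that sum to installed RAM and report total_vg_pbufs."""
    if not 3 <= lun_count <= 8:
        raise ValueError("the guidance above suggests 3-to-8 LUN/hdisks")
    if ram_gb % lun_count != 0:
        # Assumption: "even fraction" is read as whole gigabytes per LUN.
        raise ValueError("choose a LUN count that divides RAM evenly")
    return {
        "lun_size_gb": ram_gb // lun_count,
        "lun_count": lun_count,
        "total_vg_pbufs": lun_count * pv_pbuf_count,
    }

layout = pagingspace_layout(ram_gb=18, lun_count=3)
print(layout)
```

For the 18gbRAM example this gives three 6gb LUNs and 3 * 2048 = 6144 total pbufs for pagingspace_vg.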


vmstat -I 2 # Best 6-in-1 monitor; no-load leave-it-up all-day VMM monitor

kthr
Number of kernel threads in various queues averaged per second over the sampling interval. The kthr columns are as follows:

r
Average number of kernel threads that are runnable, which includes threads that are running and threads that are waiting for the CPU. If this number is greater than the number of CPUs, then there is at least one thread waiting for a CPU and the more threads there are waiting for CPUs, the greater the likelihood of a performance impact.

b
Average number of kernel threads in the VMM wait queue per second. This includes threads that are waiting on filesystem I/O or threads that are blocking on a shared resource, i.e. inode-lock.

p
For vmstat -I: the number of threads waiting on I/Os to raw devices per second. Threads waiting on I/Os to filesystems would not be included here.
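To work with these columns in a script, a sample row can be split into named fields. The sketch below assumes the 18-column layout of vmstat -I output (r b p avm fre fi fo pi po fr sr in sy cs us sy id wa), optionally followed by a timestamp as with the -t flag. Because vmstat prints "sy" twice, the sketch renames the two columns faults_sy and cpu_sy. Those two names and the function name are inventions of this example.

```python
# Column order of a vmstat -I data row; the two "sy" columns are renamed here.
COLUMNS = ["r", "b", "p", "avm", "fre", "fi", "fo", "pi", "po", "fr", "sr",
           "in", "faults_sy", "cs", "us", "cpu_sy", "id", "wa"]

def parse_vmstat_row(line):
    """Map the first 18 whitespace-separated fields of a row to column names."""
    fields = line.split()
    if len(fields) < len(COLUMNS):
        raise ValueError("row has fewer fields than expected")
    return {name: int(value) for name, value in zip(COLUMNS, fields)}

row = parse_vmstat_row(
    "1 1 0 3986577 2652 1944 797 0 0 1536 12803 880 2377 4459 10 4 55 31 14:17:58"
)
print(row["r"], row["b"], row["p"])
```

Any trailing field, such as the timestamp, is ignored because zip stops at the last named column.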


vmstat -I 2 # Best 6-in-1 monitor; no-load leave-it-up all-day VMM monitor

memory -- Provides information about the real and virtual memory.

avm -- The Active Virtual Memory, avm, column represents the number of active virtual memory pages present at the time the vmstat sample was collected. It is the sum-total of all computational memory – including content paged-out to the pagingspace. The avm statistics do not include file pages.

fre -- The fre column shows the average number of free memory pages. A page is a 4 KB area of real memory. The system maintains a buffer of memory pages, called the free list, that will be readily accessible when the VMM needs space. The minimum number of pages that the VMM keeps on the free list is determined by the minfree parameter of the vmo command.
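Because avm and fre are counts of 4 KB pages, it can help to convert them to megabytes when reading the output. A minimal sketch follows; pages_to_mb is a hypothetical helper name, and the page counts passed to it are illustrative values only.

```shell
# Convert a count of 4 KB pages to megabytes (integer division, rounds down).
# $1 = number of 4 KB pages, e.g. a value from the avm or fre column
pages_to_mb() {
    echo $(( $1 * 4 / 1024 ))
}

pages_to_mb 262144   # prints 1024 (262144 pages * 4 KB = 1 GB)
pages_to_mb 960      # prints 3    (3840 KB, rounded down)
```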


vmstat -I 2 # Best 6-in-1 monitor; no-load leave-it-up all-day VMM monitor

page [ fi and fo are only included with vmstat -I ] -- Information about page faults and paging activity. These are averaged over the interval and given in units per second.

fi -- The fi column details the number of pages paged-in from persistent storage, i.e. pages read-in from JFS/JFS2 file systems on disk. This does not include page-ins from the pagingspace; rather, these are filesystem-reads.

fo -- The fo column details the number of pages paged-out to persistent storage, i.e. pages written-out to JFS/JFS2 file systems on disk. This does not include page-outs to the pagingspace; rather, these are filesystem-writes.


vmstat -I 2 # Best 6-in-1 monitor; no-load leave-it-up all-day VMM monitor

page (continued) -- Information about page faults and paging activity. These are averaged over the interval and given in units per second.

pi -- The pi column details the number of pages paged in from paging space. Paging space is the part of virtual memory that resides on disk. It is used as an overflow when memory is overcommitted. Paging space consists of logical volumes dedicated to the storage of working set pages that have been stolen from real memory. When a stolen page is referenced by the process, a page fault occurs, and the page must be read into memory from paging space.

Due to the variety of configurations of hardware, software and applications, there is no absolute number to look out for. This field is important as a key indicator of paging-space activity. If a page-in occurs, there must have been a previous page-out for that page. It is also likely in a memory-constrained environment that each page-in will force a different page to be stolen and, therefore, paged out.

po -- The po column shows the number (rate) of pages paged out to paging space. Whenever a page of working storage is stolen, it is written to paging space if it does not yet reside in paging space or if it was modified. If not referenced again, it will remain on the paging device until the process terminates or disclaims the space. Subsequent references to addresses contained within the faulted-out pages result in page faults, and the pages are paged in individually by the system. When a process terminates normally, any paging space allocated to that process is freed. If the system is reading in a significant number of persistent pages, you might see an increase in po without corresponding increases in pi. This does not necessarily indicate thrashing, but may warrant investigation into data-access patterns of the applications.


vmstat -I 2 # Best 6-in-1 monitor; no-load leave-it-up all-day VMM monitor

page (continued) -- Information about page faults and paging activity. These are averaged over the interval and given in units per second.

fr -- Number of pages that were freed per second by the page-replacement algorithm during the interval. As the VMM page-replacement routine scans the Page Frame Table, or PFT, it uses criteria to select which pages are to be stolen to replenish the free list of available memory frames. The criteria include both kinds of pages, working (computational) and file (persistent) pages. Just because a page has been freed, it does not mean that any I/O has taken place. For example, if a persistent storage (file) page has not been modified, it will not be written back to the disk. If I/O is not necessary, minimal system resources are required to free a page.

sr -- Number of pages that were examined per second by the page-replacement algorithm during the interval. The page-replacement algorithm might have to scan many page frames before it can steal enough to satisfy the page-replacement thresholds. The higher the sr value compared to the fr value, the harder it is for the page-replacement algorithm to find eligible pages to steal.
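The sr-to-fr comparison can be expressed as a simple ratio. The sketch below is an assumption about how one might compute it, not a tool shipped with AIX; scan_to_free_ratio is a hypothetical name, and the slide gives no threshold for what ratio counts as "hard", so none is encoded here.

```shell
# Ratio of pages scanned (sr) to pages freed (fr), to one decimal place.
# Prints "n/a" when fr is 0, because the ratio is undefined in that case.
# $1 = sr column value, $2 = fr column value
scan_to_free_ratio() {
    awk -v sr="$1" -v fr="$2" 'BEGIN {
        if (fr == 0) { print "n/a"; exit }
        printf "%.1f\n", sr / fr
    }'
}

scan_to_free_ratio 4000 1000   # prints 4.0
scan_to_free_ratio 1500 1000   # prints 1.5
scan_to_free_ratio 0 0         # prints n/a
```

A ratio near 1.0 means almost every page examined could be stolen; a rising ratio means the page-replacement algorithm is scanning more frames for each page it frees.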


vmstat -I 2 # Best 6-in-1 monitor; no-load leave-it-up all-day VMM monitor

faults -- Information about process control, such as trap and interrupt rate. The faults columns are as follows:

in -- Number of device interrupts per second observed in the interval.

sy -- The number of system calls per second observed in the interval. Resources are available to user processes through well-defined system calls. These calls instruct the kernel to perform operations for the calling process and exchange data between the kernel and the process. Because workloads and applications vary widely, and different calls perform different functions, it is impossible to define how many system calls per second are too many. But typically, when the sy column rises above 10000 calls per second on a uniprocessor, further investigation is called for (on an SMP system the number is 10000 calls per second per processor). One reason could be "polling" subroutines like the select() subroutine. For this column, it is advisable to have a baseline measurement that gives a count for a normal sy value.

cs -- Number of context switches per second observed in the interval. The physical CPU resource is subdivided into logical time slices of 10 milliseconds each. Assuming a thread is scheduled for execution, it will run until its time slice expires, until it is preempted, or until it voluntarily gives up control of the CPU. When another thread is given control of the CPU, the context or working environment of the previous thread must be saved and the context of the current thread must be loaded. The operating system has a very efficient context switching procedure, so each switch is inexpensive in terms of resources. Any significant increase in context switches, such as when cs is a lot higher than the disk I/O and network packet rate, should be cause for further investigation.
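The rule of thumb for the sy column (10000 system calls per second per processor) can be sketched as a check like the one below. The function name is hypothetical, and a site baseline should take precedence over this generic threshold, as the text above advises.

```shell
# Rule-of-thumb check: is sy above 10000 system calls/sec per processor?
# $1 = sy column (system calls per second), $2 = number of processors
sy_needs_review() {
    sy=$1; ncpu=$2
    threshold=$((10000 * ncpu))
    if [ "$sy" -gt "$threshold" ]; then
        echo "review"
    else
        echo "ok"
    fi
}

sy_needs_review 25000 2   # threshold is 20000, prints review
sy_needs_review 25000 4   # threshold is 40000, prints ok
```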


vmstat -I 2 # Best 6-in-1 monitor; no-load leave-it-up all-day VMM monitor

cpu -- Percentage breakdown of CPU time usage during the interval. The cpu columns are as follows:

us -- The us column shows the percent of CPU time spent in user mode. A UNIX® process can execute in either user mode or system (kernel) mode. When in user mode, a process executes within its application code and does not require kernel resources to perform computations, manage memory, or set variables.

sy -- The sy column details the percentage of time the CPU was executing a process in system mode. This includes CPU resource consumed by kernel processes (kprocs) and others that need access to kernel resources. If a process needs kernel resources, it must execute a system call and is thereby switched to system mode to make that resource available. For example, reading or writing of a file requires kernel resources to open the file, seek a specific location, and read or write data, unless memory mapped files are used.

id -- The id column shows the percentage of time during which the CPU is idle, or waiting, without pending local disk I/O. If there are no threads available for execution (the run queue is empty), the system dispatches a thread called wait, which is also known as the idle kproc. On an SMP system, one wait thread per processor can be dispatched. The report generated by the ps command (with the -k or -g 0 option) identifies this as kproc or wait. If the ps report shows a high aggregate time for this thread, it means there were significant periods of time when no other thread was ready to run or waiting to be executed on the CPU. The system was therefore mostly idle and waiting for new tasks.

wa -- The wa column details the percentage of time the CPU was idle with pending I/O to local disks and NFS-mounted disks. If there is at least one outstanding I/O to a disk when wait is running, the time is classified as waiting for I/O. Unless asynchronous I/O is being used by the process, an I/O request to disk causes the calling process to block (or sleep) until the request has been completed. Once an I/O request for a process completes, it is placed on the run queue. If the I/Os were completing faster, more CPU time could be used.

A wa value over 25 percent could indicate that the disk subsystem might not be balanced properly, or it might be the result of a disk-intensive workload.


Exercise & experiment with the JFS2 default mount and Raw I/O

By default, file pages can be cached in real memory for file systems. The caching can be disabled using direct I/O or concurrent I/O mount options; also, the Release-Behind mount options can be used to quickly discard file pages from memory after they have been copied to the application's I/O buffers if the read-ahead and write-behind benefits of cached file systems are needed.

JFS2 default mount -- AIX uses file caching as the default method of file access. However, file caching consumes more CPU and significant system memory because of data duplication. The file buffer cache can improve I/O performance for workloads with a high cache-hit ratio. And file system readahead can help database applications that do a lot of table scans for tables that are much larger than the database buffer cache.

Raw I/O -- Database applications traditionally use raw logical volumes instead of the file system for performance reasons. Writes to a raw device bypass the caching, logging, and inode locks that are associated with the file system; data gets transferred directly from the application buffer cache to the disk. If an application is update-intensive with small I/O requests, then a raw device setup for database data and logging can help performance and reduce the usage of memory resources.


Exercise & experiment with the JFS2 Direct I/O and Concurrent I/O mount options

By default, file pages can be cached in real memory for file systems. The caching can be disabled using direct I/O or concurrent I/O mount options; also, the Release-Behind mount options can be used to quickly discard file pages from memory after they have been copied to the application's I/O buffers if the read-ahead and write-behind benefits of cached file systems are needed.

• Direct I/O – DIO is similar to rawIO except it is supported under a file system. DIO bypasses the file system buffer cache, which reduces CPU overhead and makes more memory available to others (that is, to the database instance). DIO offers performance benefits similar to rawIO but is easier to maintain for the purposes of system administration. DIO is provided for applications that need to bypass the buffering of memory within the file system cache. For instance, some technical workloads never reuse data because of the sequential nature of their data access. This lack of data reuse results in a poor buffer cache hit rate, which means that these workloads are good candidates for DIO.

• Concurrent I/O -- CIO supports concurrent access to files. In addition to bypassing the file cache, it also bypasses the inode lock, which allows multiple threads to perform reads and writes simultaneously on a shared file. CIO is designed for relational database applications, most of which will operate under CIO without any modification. Applications that do not enforce serialization for access to shared files should not use CIO. Applications that issue a large number of reads usually will not benefit from CIO either.


Exercise & experiment with the JFS2 Release-Behind Read/Write mechanisms

Release-behind-read and release-behind-write allow the file system to release the file pages from file system buffer cache as soon as an application has read or written the file pages. This feature helps the performance when an application performs a great deal of sequential reads or writes. Most often, these file pages will not be re-accessed after they are accessed.

Without this option, the memory will still be occupied with no benefit of reuse, which causes paging eventually after a long run. When writing a large file without using release-behind, writes will go very fast as long as pages are available on the free list. When the number of pages drops to minfree, VMM uses its LRU algorithm to find candidate pages for eviction.

This feature can be configured on a file system basis. When using the mount command, enable release-behind by specifying one of the three flags below:

– The release-behind sequential read flag (rbr)
– The release-behind sequential write flag (rbw)
– The release-behind sequential read and write flag (rbrw)

A trade-off of using the release-behind mechanism is that the application can experience an increase in CPU utilization for the same read or write throughput rate (as compared to not using release-behind). This is because of the work required to free the pages, which is normally handled at a later time by the LRU daemon. Also note that all sequential IO file page accesses result in disk I/O because sequential IO file data is not cached by VMM. However, applications (especially long-running applications) with the release-behind mechanism applied are still likely to perform more optimally and with greater stability.
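As a sketch of how one of the three release-behind flags would be passed to the mount command, the helper below only prints the command line rather than running it. The function name, the device /dev/fslv00, and the mount point /bigfiles are all hypothetical placeholders; substitute the logical volume and directory of the file system being tested.

```shell
# Print (do not run) a mount command that enables release-behind.
# $1 = rbr | rbw | rbrw, $2 = device (placeholder), $3 = mount point (placeholder)
jfs2_release_behind_cmd() {
    opt=$1; dev=$2; mnt=$3
    case "$opt" in
        rbr|rbw|rbrw)
            echo "mount -o $opt $dev $mnt" ;;
        *)
            echo "unknown release-behind flag: $opt" >&2
            return 1 ;;
    esac
}

jfs2_release_behind_cmd rbrw /dev/fslv00 /bigfiles
# prints: mount -o rbrw /dev/fslv00 /bigfiles
```

Printing the command first makes it easy to review before running it with root-user privileges on a test file system.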


Appendix: AIXperfdataScript 2012Oct30.txt (page 1)

#!/bin/ksh -x

# Assumes IBM System P POWER4-->POWER7+ and AIX 5.3/6.1/7.1
# Earl Jew -- Senior IT Management Consultant - IBM Power Systems and IBM Systems Storage
# IBM Lab Services and Training - US Power Systems (group/dept)
# IBM Certified Technical Sales Specialist - Power Systems with POWER7 and AIX - V1
# IBM Certified Specialist - Midrange Storage Technical Support V2
# IBM Certified Specialist - Enterprise Storage Technical Support V2
# 400 North Brand Blvd., c/o IBM 8th floor, Glendale, CA 91203
# [email protected] (310) 251-2907 cell
# Version: October 30, 2012

# Mundane Performance Data Collection script:
# NOTE: There is a subsection of rootuser commands in this script.

# Please execute for data-collection and send the collection to [email protected],
# and I will review and offer my findings by telephone/concall.

# Please execute this script when there is an active workload of concern.
# The script below collects 500kb-20mb of textdata per run.

#================================================================================

date
uname -a
id
oslevel -s

lparstat -i

uptime
vmstat -s
vmstat -v
vmstat -Iwt 1 80

ps -ekf | grep -v egrep | egrep "syncd|lrud|nfsd|biod|wait|getty|xmwlm"
…


Appendix: AIXperfdataScript 2012Oct30.txt (page 2)

…
ipcs -bm

lsps -a
lsps -s

lssrad -av

mount
df -k
cat /etc/filesystems
cat /etc/xtab
showmount

prtconf

ps -el | wc
ps -elmo THREAD | wc
ps -kl | wc
ps -klmo THREAD | wc

nfsstat

##### BEGIN rootuser-privileges section
vmo -L # requires root-user to execute; makes no changes
ioo -L # requires root-user to execute; makes no changes
no -L # requires root-user to execute; makes no changes
nfso -L # requires root-user to execute; makes no changes
schedo -L # requires root-user to execute; makes no changes
raso -L # requires root-user to execute; makes no changes

lvmo -L # requires root-user to execute; makes no changes

for VG in `lsvg`
do
    lvmo -a -v $VG # requires root-user to execute; makes no changes
    echo
done
…


Appendix: AIXperfdataScript 2012Oct30.txt (page 3)

…
uptime
sar -a 2 40 # requires root-user to execute; makes no changes
sar -b 2 40 # requires root-user to execute; makes no changes
sar -c 2 40 # requires root-user to execute; makes no changes
sar -k 2 40 # requires root-user to execute; makes no changes
sar -d 2 40 # requires root-user to execute; makes no changes
##### END rootuser-privileges section

aioo -a
lsdev
lscfg
lsconf

uptime
vmstat -Iwt 1 80

uptime
mpstat -w 2 40
uptime
mpstat -dw 2 40
uptime
mpstat -i 2 40
mpstat -w 2 40
vmstat -f
vmstat -i

nfso -a

lspv
for VG in `lsvg`
do
    lsvg $VG ; echo
    lsvg -p $VG ; echo ; echo ; echo
done

echo "\n\n============== ps -ef ==============================================================="
ps -ef
echo "\n\n============== ps -kf ==============================================================="
ps -kf
…


Appendix: AIXperfdataScript 2012Oct30.txt (page 4)

echo "\n\n============== ps -el ==============================================================="
ps -el
echo "\n\n============== ps -kl ==============================================================="
ps -kl
echo "\n\n============== ps -elmo THREAD ======================================================"
ps -elmo THREAD
echo "\n\n============== ps -klmo THREAD ======================================================"
ps -klmo THREAD
echo "\n\n============== ps guww =============================================================="
ps guww
echo "\n\n============== ps gvww =============================================================="
ps gvww
echo "\n\n============================================================================="
echo "\n\n============================================================================="

ifconfig -a
netstat -ss
netstat -in
netstat -rn
netstat -m
netstat -v
netstat -c
netstat -C
netstat -D
netstat -s
netstat -M
netstat -A

iostat -a
iostat -s

uptime
iostat -aT 2 40 | grep -v "0 0.0 0 0 0.0 0.0"
iostat -mT 2 40 | grep -v "0.0 0.0 0.0 0 0" | grep -v "tm_act"
iostat -AQ 2 40 | grep -v " 0 "
iostat -DRTl 60 4

uptime
vmstat -s ; vmstat -v
vmstat -Iwt 1 80 ; date ; id ; uname -a


Session Evaluations

• ibmtechu.com/vp

Prizes will be drawn from Evals


Thank you

Earl Jew ([email protected]) 310-251-2907 cell
Senior IT Management Consultant - IBM Power Systems and IBM Systems Storage
IBM Lab Services and Training - US Power Systems (group/dept)
400 North Brand Blvd., c/o IBM 8th floor, Glendale, CA 91203





