Understanding esxtop statistics

© 2009 VMware Inc. All rights reserved

Confidential

Matthew Mancini, Phoenix VMUG Leader

02/25/2010 Phoenix VMUG

Disclaimer... This presentation is not intended for any specific purpose.

These are simply my findings with ESXTOP

Use at your own risk...

Speaker bio
Matt Mancini
Leader of the Phoenix VMUG
Server Room Administrator at John C Lincoln
20+ Years in IT
[email protected]
Blog -- vmexplorer.blogspot.com

Agenda
References
What is esxtop
How to use it
- Basic commands
- Monitor CPU & Memory

References

VMware Performance & VMmark Communities

http://communities.vmware.com/community/vmtn/general/performance?view=overview

** Interpreting esxtop Statistics **

http://communities.vmware.com/docs/DOC-9279

Resource Management Guide

http://www.vmware.com/pdf/vi3_35/esx_3/r35u2/vi3_35_25_u2_resource_mgmt.pdf

Understanding Performance


Ready time

http://www.vmware.com/pdf/esx3_ready_time.pdf

Co-Scheduling vCPUs




ESXTOP Translations...

Does anyone know what this means? Τα ελληνικά του σε με >> It's Greek to me

How about these 2, can you translate them?

What is esxtop

Esxtop allows monitoring and collection of data for all system resources: CPU, memory, disk and network. (aka the 4 basic food groups)

When used interactively, the data for these “food groups” can be viewed on separate screens: CPU, memory, network, and disk adapter statistics.

Used correctly, it can help you interpret the performance of your ESX server and the VMs running on it.

What is esxtop

resxtop vs esxtop

The resxtop and esxtop command‐line utilities provide a detailed look at how ESX Server uses resources in real time. You can start either utility in one of three modes: interactive (default), batch, or replay.

The fundamental difference between resxtop and esxtop is that you can use resxtop remotely (or locally), whereas esxtop can be started only through the service console of a local ESX Server host.

The resxtop utility is a Remote Command Line Interface (Remote CLI) command and before you can use any Remote CLI commands, you must download, install, and configure the Remote CLI virtual appliance.

How to Use it – Basic commands

To launch – type esxtop and press Enter
Can only be executed at the Service Console and by the root user

Press q to exit
Press c for CPU
Press m for Memory
Press d for Disk
Press n for Network
Press f for fields
Press Space to refresh (default refresh is 5 seconds)

Help –
info esxtop << Detailed information
esxtop -h << Command line syntax


At the root# prompt, type esxtop and press Enter
The default display is CPU metrics (same as pressing c)

Press m for Memory


Press d for Disk

Press n for network


Press f for fields

Fields allow you to choose the fields you’d like displayed

In this case I pressed b to remove the GID field, then pressed Space to exit

Now the GID field is not displayed

Note - Exiting esxtop resets the fields


Press o for order

Use a-h to change the order
Uppercase moves the field to the left
Lowercase moves the field to the right

In this case I pressed ‘b’ to move the GID field to the first position

Now the GID & ID fields are swapped
Note - Exiting esxtop resets the fields


Set the screen delay with ‘-d’: how quickly the screen will refresh

esxtop -d XX, where XX is the delay in seconds

Specifies the delay between screen updates. Default is 5 seconds. Minimum is 2 seconds.

You can change this with the interactive command 's'. If a delay of less than 2 seconds is specified then the delay is set to 2 seconds.
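The delay rule above can be sketched in a few lines of Python. This is only an illustration of the rule as stated on the slide, not esxtop's actual code; the function name and constants are made up for the example.

```python
# Values taken from the slide: default 5 seconds, minimum 2 seconds.
DEFAULT_DELAY_SECONDS = 5
MIN_DELAY_SECONDS = 2


def effective_delay(requested=None):
    """Return the refresh delay that would apply for a requested -d value.

    Hypothetical helper: no value means the default, and anything below
    the minimum is raised to the minimum.
    """
    if requested is None:
        return DEFAULT_DELAY_SECONDS
    return max(requested, MIN_DELAY_SECONDS)


print(effective_delay())    # no -d given
print(effective_delay(1))   # below the minimum
print(effective_delay(10))  # above the minimum
```

So asking for a 1-second delay still gives you a 2-second refresh.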

How to Use it – Monitor CPU

Uptime - The elapsed time since the server was powered on
CPU load average - CPU load averages over 1 minute, 5 minutes, and 15 minutes, based on 6-second samples.

PCPU(%) - The percentage CPU utilization per physical CPU
Used total - The average of the PCPUs
CCPU(%) - Percentages of total CPU time as reported by the ESX Service Console.

"us" is for percentage user time, "sy" is for percentage system time, "id" is for percentage idle time and "wa" is for percentage wait time. "cs/sec" is for the context switches per second recorded by the ESX Service Console.
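As a quick illustration of "used total" being the average of the PCPU values, here is a minimal Python sketch. The sample numbers and the function name are made up; only the averaging comes from the slide.

```python
def used_total(pcpu_percentages):
    """Average the per-physical-CPU utilization values (in percent)."""
    if not pcpu_percentages:
        raise ValueError("need at least one PCPU value")
    return sum(pcpu_percentages) / len(pcpu_percentages)


# Hypothetical PCPU(%) readings for a 4-core host.
sample = [10.0, 20.0, 30.0, 40.0]
print(used_total(sample))  # 25.0
```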


Esxtop uses worlds and groups as the entities to show CPU usage. A world is an ESX Server VMkernel schedulable entity, similar to a process or thread in other operating systems. A group contains multiple worlds.

Note: the sum of the NWLD column equals the total of worlds at the top.

IDs 50-71 are VMs; their names have been blanked out
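The note about NWLD can be checked with simple arithmetic. The sketch below uses made-up group names and counts; the only thing taken from the slide is that the NWLD column sums to the worlds total shown at the top of the screen.

```python
# Hypothetical sample: group name -> NWLD value from the CPU screen.
nwld_by_group = {"idle": 8, "system": 7, "helper": 12, "vm-a": 5, "vm-b": 5}


def total_worlds(nwld_by_group):
    """Sum the NWLD column; this should match the worlds total at the top."""
    return sum(nwld_by_group.values())


print(total_worlds(nwld_by_group))  # 37
```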


Note that ID 7 has an NWLD of 12 (NWLD = number of worlds in this group)

Press e to expand, type 7, and press Enter

ID 7 is now expanded and more detailed information is revealed


%RDY – The percentage of time a world spent in a run queue, ready to run but waiting for the CPU scheduler to let it run on a PCPU. It is always smaller than 100%. This is the clearest indicator of CPU contention; anything above 5% is a concern.

%CSTP – A high value indicates co-scheduling overhead
%USED – The amount of CPU load incurred by the world
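To make the 5% rule of thumb concrete, here is a small Python sketch that flags worlds whose %RDY is above the threshold. The world names and readings are invented for the example, and the helper is hypothetical; only the 5% figure comes from the slide.

```python
RDY_CONCERN_PERCENT = 5.0  # rule of thumb from the slide


def worlds_with_high_ready(ready_by_world, threshold=RDY_CONCERN_PERCENT):
    """Return the names of worlds whose %RDY exceeds the threshold, sorted."""
    return sorted(
        name for name, rdy in ready_by_world.items() if rdy > threshold
    )


# Hypothetical %RDY readings per VM.
sample = {"vm-web": 1.2, "vm-db": 7.9, "vm-mail": 5.0, "vm-test": 12.4}
print(worlds_with_high_ready(sample))  # ['vm-db', 'vm-test']
```

A value of exactly 5.0 is not flagged here, since the slide says greater than 5%.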

How to Use it – Drill down CPU

Other CPU Commands

R Sort resource pools, virtual machines, and worlds by the resource pool’s or virtual machine’s %RDY column.

N Sort resource pools, virtual machines, and worlds by the GID column. This is the default sort order.

** V Display virtual machine instances only

How to Use it – Monitor Memory

It is important to note that some statistics refer to guest physical memory while others refer to machine memory. "Guest physical memory" is the virtual-hardware physical memory presented to the VM. "Machine memory" is actual physical RAM in the ESX host. Let's use the following figure to explain. In the figure, two VMs are running on an ESX host, where each block represents 4 KB of memory and each color represents a different set of data on a block.

Inside each VM, the guest OS maps the virtual memory to its physical memory. ESX Kernel maps the guest physical memory to machine memory. Due to ESX Page Sharing technology, guest physical pages with the same content can be mapped to the same machine memory page.


Server uptime
Memory overcommit averages over 1 min, 5 min, and 15 min (1.00 = 100%)

Worlds

How to Use – Monitor Memory

Memory overcommit, the basics. In short: you assign more RAM to your VMs than is available in your host.

Good memory overcommit: you assign more RAM to your VMs than is available in your host, BUT the amount of RAM actually USED by your VMs never exceeds what is available in your host.

Bad memory overcommit: you assign more RAM to your VMs than is available in your host, AND the amount of RAM actually USED by your VMs exceeds what is available in your host.

Reference - http://www.gabesvirtualworld.com/?p=930
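The good/bad distinction above boils down to two comparisons. Here is a minimal Python sketch of it; the function name, the labels, and the sample numbers are all made up for illustration, and it ignores details such as VM memory overhead.

```python
def classify_overcommit(assigned_mb, used_mb, host_mb):
    """Classify a host using the good/bad overcommit rule described above.

    assigned_mb: total RAM configured across all VMs
    used_mb:     total RAM actually used by the VMs
    host_mb:     RAM available in the host
    """
    if assigned_mb <= host_mb:
        return "no overcommit"
    if used_mb <= host_mb:
        return "good overcommit"
    return "bad overcommit"


# Hypothetical host with 32 GB of RAM.
print(classify_overcommit(assigned_mb=24576, used_mb=16384, host_mb=32768))
print(classify_overcommit(assigned_mb=49152, used_mb=28672, host_mb=32768))
print(classify_overcommit(assigned_mb=49152, used_mb=36864, host_mb=32768))
```

The three calls cover the three cases in order: not overcommitted, overcommitted but within what the host can hold, and overcommitted past it.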



"PMEM" (MB) The machine memory statistics for the server.

"total" : the total amount of machine memory in the server.

"cos" : the amount of machine memory for the ESX Service Console.

"vmk" : the amount of machine memory being used by the ESX VMKernel.

"other": the amount of machine memory being used by everything other than the ESX Service Console and ESX VMKernel. "other" contains not only the memory used by VM but also the user worlds that run directly on VMKernel.

"free" : the amount of machine memory that is free.


"VMKMEM" (MB) The machine memory statistics for VMKernel.

"managed" : the total amount of machine memory managed by VMKernel.

"minfree" : the minimum amount of machine memory that VMKernel would like to keep free. This is because VMKernel needs to keep some amount of free memory for critical uses.

"rsvd" : the amount of machine memory that is currently reserved. "rsvd" is the sum of three parts: (a) the reservation setting of the groups; (b) the overhead reservation of the groups; (c) "minfree".

"ursvd" : the amount of machine memory that is currently unreserved. It is the memory available for reservation.

"state" : the free memory state.
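The three parts of "rsvd" add up as plain arithmetic. The Python sketch below uses invented numbers and a hypothetical helper name; only the three-part sum comes from the slide.

```python
def reserved_mb(group_reservations_mb, group_overhead_reservations_mb, minfree_mb):
    """Add up the three parts of "rsvd" as listed above, in MB.

    (a) reservation settings of the groups
    (b) overhead reservations of the groups
    (c) minfree
    """
    return (
        sum(group_reservations_mb)
        + sum(group_overhead_reservations_mb)
        + minfree_mb
    )


# Hypothetical host with two groups.
print(reserved_mb([1024, 2048], [100, 150], 300))  # 3622
```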

Q: How do I know my ESX box is under memory pressure?

A: It is usually safe to say the ESX box is under memory pressure if "state" is "hard" or "low". But you also need to check "SWAP" and "MEMCTL" to find out whether you are experiencing memory problems. Basically, if there is not enough free memory and ESX is swapping or ballooning, the ESX box is under memory pressure.
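The answer above can be turned into a small checklist. This Python sketch collects the warning signs from the three places the slide mentions ("state", "SWAP", "MEMCTL"); the function name, the parameters, and the message wording are made up for the example.

```python
PRESSURE_STATES = ("hard", "low")  # the two states called out above


def memory_pressure_signals(state, swap_curr_mb, memctl_curr_mb):
    """Return a list of warning signs; an empty list means none were found."""
    signals = []
    if state in PRESSURE_STATES:
        signals.append(f"free memory state is {state}")
    if swap_curr_mb > 0:
        signals.append("VMkernel swap is in use")
    if memctl_curr_mb > 0:
        signals.append("balloon driver has reclaimed memory")
    return signals


# Hypothetical readings from the memory screen.
print(memory_pressure_signals("high", 0, 0))
print(memory_pressure_signals("hard", 512, 1024))
```

The more signs that show up together, the more confident you can be that the host really is short on memory.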


"COSMEM" (MB) The memory statistics reported by the ESX Service Console.

"free" : the amount of idle machine memory.
"swap_t" : the total swap configured.
"swap_f" : the amount of swap free.
"r/s" : the rate at which memory is swapped in from disk.
"w/s" : the rate at which memory is swapped out to disk.

Q: What does it mean if I see a high r/s or w/s?
A: Your console OS is swapping. It is highly likely that your COS free memory is low. You may either configure more memory for the COS and restart your ESX box, or stop some programs running inside your COS.

Q: Why can't I see this COSMEM line?
A: You are using ESXi, not classic ESX.


"PSHARE" (MB) The ESX page-sharing statistics.

"shared" : the amount of guest physical memory that is being shared.

"common": the amount of machine memory that is common across World(s).

"saving": the amount of machine memory that is saved due to page-sharing.
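One way to read these three counters together is sketched below in Python. It rests on an assumption that the slide does not state: that "saving" is simply "shared" minus "common". Treat it as an illustration, not a definition. The numbers and the function name are made up.

```python
def page_sharing_saving_mb(shared_mb, common_mb):
    """Estimate the page-sharing saving in MB.

    ASSUMPTION (not stated on the slide): saving = shared - common,
    i.e. the shared guest memory minus the machine memory that still
    has to back it.
    """
    return shared_mb - common_mb


# Hypothetical PSHARE readings.
print(page_sharing_saving_mb(shared_mb=2000, common_mb=500))  # 1500
```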


"SWAP" (MB) The ESX swap usage statistics.

"curr" : the current swap usage. This is the total swapped machine memory of all the groups.

"target": the expected swap usage. This is the total swap target of all the groups.

"r/s" : the rate at which machine memory is swapped in from disk.

"w/s" : the rate at which machine memory is swapped out to disk.


"MEMCTL" (MB) The memory balloon statistics.

"curr" : the total amount of physical memory reclaimed by the balloon driver. This is the total memory ballooned by the VMs.

"target": the expected total amount of ballooned memory. This is the total balloon target of the VMs.

"max" : the maximum amount of physical memory reclaimable.

How to Use it – Drill down Memory

"MEMSZ" (MB) For a VM, the amount of configured guest physical memory.

"SZTGT" (MB) This counter is used to determine future swapping and ballooning activities.

"TCHD" (MB) The amount of guest physical RAM recently used by the VM.


"%ACTV" Percentage of active guest physical memory, current value.

"%ACTVS" Percentage of active guest physical memory, slow moving average.

"%ACTVF" Percentage of active guest physical memory, fast moving average.

"%ACTVN" Percentage of active guest physical memory in the near future. This is an estimated value.

"OVHDUW" (MB) Amount of overhead memory reserved for the vmx user world of a VM group. This is an internal counter.

"OVHD" (MB) Amount of overhead memory currently consumed by a VM.

"OVHDMAX" (MB) Amount of reserved overhead memory for the entire VM.

Next VMUG Meeting –

esxtop Stats Part II –

Disk, Network, Batch Mode
