
PowerCLI & vSphere statistics – Part 1 – The basics

December 30th, 2009

Another popular subject in the VMTN PowerCLI community is statistics. Quite often it’s not entirely clear to the user what is available, how the data can be extracted and how PowerShell/PowerCLI can be used to convert the raw metrics into usable reports.

Before you can fully use all that is available, there are a few key concepts that you should understand.

In this series I will try to answer some common questions.

Introduction

Since I want to focus on the practicalities of statistics, I’m not going to repeat what is already available in several excellent papers. The following are some of the papers that you should read or locations you should visit regularly, if you want to fully understand the gathering and usage of vSphere statistical data.

  • Understanding VirtualCenter Performance Statistics should be read by anyone working with vCenter performance data. It gives a clear explanation of intervals, statistics levels and the update interval.
  • vCenter Performance Counters: an overview of the available vCenter performance counters, each with a short description of its purpose.
  • Understanding performance: a great collection of documents that explain several key concepts in virtual performance.
  • The Performance & VMmark VMTN community is where you will find interesting discussions and a lot of documents on performance. And where you can of course post your questions!
  • VROOM!: “THE BLOG” on performance. The posts are written by members of VMware’s Performance Engineering Team. It contains deep-dives, best practices, comparisons… Subscribe and read!
  • Technical papers: VMware publishes numerous technical papers on performance. There is a bit of overlap with the previous sources, but check the list of published papers regularly.
  • The vSphere Web Services SDK Programming Guide, which contains in chapter 12, “Monitoring Performance”, an excellent and detailed overview of the key concepts.
  • The excellent Performance Troubleshooting for VMware vSphere 4 and ESX 4.0 document. It offers a very useful wizard to analyse performance problems.

Intervals

Although I promised not to repeat anything from the sources I mentioned above, this is one key concept about performance data in a vCenter environment that should be clearly understood before you start coding.

In short:

  • On an ESX/ESXi server the performance data is collected in 20-second intervals. This is the so-called Realtime interval. This data is kept on the ESX/ESXi server for about one hour.
  • The ESX/ESXi server will aggregate the realtime data into 5 minute interval data.
  • The unmanaged ESX/ESXi server will keep this data for 1 day.
  • The managed ESX/ESXi server will send the data to the vCenter, which stores it in the vCenter database.
  • In the vCenter this 5-minute interval data is consolidated into 30-minute interval data. This data is kept for 1 week.
  • The aggregation/consolidation of the data is done on the database server where you host your vCenter database. The following are some examples from a SQL 2005 Server.
    • The actual “rollup” jobs. Ignore the “event cleanup” job.
    • The schedule of the daily “rollup” job.

  • There are two more intervals, the 2-hour interval, which is kept for 1 month, and the 1-day interval, which is kept for 1 year.
  • These 4 intervals are sometimes referred to as Historical Interval 1 (5 minutes), Historical Interval 2 (30 minutes), Historical Interval 3 (2 hours) and Historical Interval 4 (1 day).

A schematic will perhaps make this a bit clearer.
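
The same hierarchy can also be put in numbers. What follows is a minimal sketch that works out how many samples each interval retains per counter instance. The sampling periods and retention times come from the list above; the table is typed in by hand rather than read from a vCenter, and treating 1 month as 30 days and 1 year as 365 days is my assumption.

```powershell
# Interval definitions typed in by hand from the list above (not queried from a vCenter).
# Assumption: "1 month" = 30 days and "1 year" = 365 days.
$day = 24 * 60 * 60
$intervals = @(
    [pscustomobject]@{ Name = 'Realtime';              SamplingPeriodSecs = 20;    StorageTimeSecs = 60 * 60 }
    [pscustomobject]@{ Name = 'Historical Interval 1'; SamplingPeriodSecs = 300;   StorageTimeSecs = 1 * $day }
    [pscustomobject]@{ Name = 'Historical Interval 2'; SamplingPeriodSecs = 1800;  StorageTimeSecs = 7 * $day }
    [pscustomobject]@{ Name = 'Historical Interval 3'; SamplingPeriodSecs = 7200;  StorageTimeSecs = 30 * $day }
    [pscustomobject]@{ Name = 'Historical Interval 4'; SamplingPeriodSecs = 86400; StorageTimeSecs = 365 * $day }
)

# Samples kept = retention period divided by sampling period.
$retention = foreach ($i in $intervals) {
    [pscustomobject]@{
        Name        = $i.Name
        SamplingSec = $i.SamplingPeriodSecs
        SamplesKept = $i.StorageTimeSecs / $i.SamplingPeriodSecs
    }
}
$retention | Format-Table -AutoSize
```

With these figures every interval holds between 180 and 365 samples per counter instance, which shows how the coarser intervals trade detail for a longer history.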

Statistics level

The statistical levels are another key concept you need to understand before you dive into statistical data gathering. In short: the statistical levels define which metrics are available in which interval.

The available statistical levels are defined as follows:

Level  Description
1      Basic metrics. Device metrics excluded. Only average rollups.
2      All metrics except those for devices. Maximum and minimum rollups excluded.
3      All metrics, including devices. Maximum and minimum rollups excluded.
4      All metrics, including maximum and minimum rollups.

You define the statistical level for each “historical” interval in the vSphere Client under <Administration><vCenter Server Settings><Statistics>.

You could decide to keep all the metrics in all intervals, but you will most probably not need that level of detail for the “older” intervals. And don’t forget that the more data you keep, the bigger your VC database will become (longer roll up jobs, longer backups…).

As an example let’s take the metrics around cpu.usage. This metric comes in 3 types of rollup: minimum, average and maximum. While it definitely is useful to have all these rollup types for the realtime interval and quite possibly for historical interval 1, there is really no justification to keep all of them for historical interval 4.

In this case you can safely set the statistical level for historical interval 4 to level 2, provided you do not need device metrics there either; otherwise level 3 will do.

Note that the vSphere client gives you an estimate of the required space on your VC database for the statistics levels and retention periods you define. Just fill in the number of hosts and guests and you will have a rough idea!
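
The level table can be turned into a small lookup. The sketch below only illustrates the table above: the function name Test-MetricStored and the per-interval levels in $levelPerInterval are made up for the example, and it ignores the difference between “basic” and “all” metrics at level 1. The real settings live in vCenter.

```powershell
# Hypothetical helper that encodes the level table above (not a PowerCLI cmdlet).
function Test-MetricStored {
    param(
        [ValidateRange(1, 4)][int]$Level,
        [ValidateSet('average', 'minimum', 'maximum')][string]$Rollup,
        [switch]$DeviceMetric
    )
    # Minimum and maximum rollups are only kept at level 4.
    if ($Rollup -ne 'average' -and $Level -lt 4) { return $false }
    # Device metrics are only kept from level 3 upwards.
    if ($DeviceMetric -and $Level -lt 3) { return $false }
    return $true
}

# Made-up example settings: full detail for the recent interval, less for older ones.
$levelPerInterval = [ordered]@{
    'Historical Interval 1' = 4
    'Historical Interval 2' = 3
    'Historical Interval 3' = 2
    'Historical Interval 4' = 2
}

# For each interval, check whether cpu.usage.maximum would be stored.
$maxAvailable = foreach ($name in $levelPerInterval.Keys) {
    [pscustomobject]@{
        Interval           = $name
        Level              = $levelPerInterval[$name]
        HasCpuUsageMaximum = Test-MetricStored -Level $levelPerInterval[$name] -Rollup maximum
    }
}
$maxAvailable | Format-Table -AutoSize
```

With these example settings only historical interval 1 would hold cpu.usage.maximum, which is the kind of surprise you want to find out about before writing a report that depends on it.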

Instances

The last key concept you need to know (I promise ;-) ) before we start scripting.

The official definition from the VMware vSphere API Reference Documentation: “An identifier that is derived from configuration names for the device associated with the metric. It identifies the instance of the metric with its source.”

Let’s try to make this a bit more understandable through an example.

Take the CPU-related metrics for an ESX/ESXi host. If the ESX/ESXi server is equipped with a quad-core CPU, there will be four instances: 0, 1, 2 and 3. In this case the instance corresponds with the numeric position of the core within the CPU.

And there will be a so-called aggregate, which is the metric averaged over all the instances.

These instances each get their own identifier which will be part of the returned statistical data. The aggregate instance is always represented by a blank identifier.
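
To make the idea concrete without a live host, here is a small sketch with stand-in data. The objects only mimic the MetricId, Instance and Value properties of the statistical data, and all the values are made up; the point is that the blank instance carries the average of the numbered ones.

```powershell
# Stand-in for the statistical data of a quad-core host (values are made up).
$samples = @(
    [pscustomobject]@{ MetricId = 'cpu.usage.average'; Instance = '';  Value = 25.0 }
    [pscustomobject]@{ MetricId = 'cpu.usage.average'; Instance = '0'; Value = 10.0 }
    [pscustomobject]@{ MetricId = 'cpu.usage.average'; Instance = '1'; Value = 20.0 }
    [pscustomobject]@{ MetricId = 'cpu.usage.average'; Instance = '2'; Value = 30.0 }
    [pscustomobject]@{ MetricId = 'cpu.usage.average'; Instance = '3'; Value = 40.0 }
)

# The aggregate is the entry with the blank instance identifier.
$aggregate = $samples | Where-Object { $_.Instance -eq '' }

# The per-core entries are all the others.
$perCore  = $samples | Where-Object { $_.Instance -ne '' }
$computed = ($perCore | Measure-Object -Property Value -Average).Average

"Aggregate reported : {0}" -f $aggregate.Value
"Average of cores   : {0}" -f $computed
```

With these made-up values both lines come out as 25.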

Scripting

What is available ?

To see all the available metrics you can consult Appendix C: Performance Metrics in the Basic System Administration manual. But that appendix might not always reflect the actual situation, for example because upgrades or patches introduced or changed one or more metrics.

A better method is to just ask the system. PowerCLI 4 introduced a new cmdlet, Get-StatType, that does just that.

The cmdlet has one required parameter, called -Entity, that allows us to indicate for which object we want to query the available metrics.

Get-StatType -Entity (Get-VMHost $esxName)

In this form you will get all the metrics for an ESX/ESXi server and for all intervals. This last fact is not really useful if we have different statistical levels defined for the different intervals.
Luckily we have the -Interval parameter. It allows you to specify the interval for which you want to see the available metrics. But does this mean that you will have to learn all the statistical intervals, duration or name, by heart? No, you can use the Get-StatInterval cmdlet to get a list of all the historical intervals.

Get-StatInterval

It returns something like this

With this knowledge we can now launch the Get-StatType cmdlet with a more specific request

Get-StatType -Entity (Get-VMHost $esxName) -Interval "Past Day"

And this returns a list of metrics like this.

But it still looks confusing! Why do we have apparently nine times the same metric in the list ?

To give away the answer: it’s due to the instances that are available for this metric.

In this case the ESX/ESXi host contained a quad-core CPU with hyper-threading active, which means that there are 8 logical CPU cores as far as vSphere is concerned. The ninth entry comes from the aggregate value, the average over all logical CPU cores.

Unfortunately the Get-StatType cmdlet doesn’t give you the available instances. To see those you will have to look at the actual statistical data. Something like this


Get-Stat -Entity (Get-VMHost $esxName) -Stat cpu.usage.average -MaxSamples 1 -IntervalMins 5

And this returns something like this

In the Instance column you will now see all the available instances for this metric. In the first line the instance is blank, this represents the aggregate instance.

Three points to note in the cmdlet and its parameters:

  • Although I asked for 1 sample (-MaxSamples 1) the cmdlet returned 9 values. The -MaxSamples parameter apparently only looks at the Timestamp. It doesn’t count the number of returned values.
  • I used the -IntervalMins parameter to specify Historical Interval 1. If I had left out this parameter, the Get-Stat cmdlet would have used its default, which in this case would have returned values from Historical Interval 4. More on the Get-Stat defaults in a future post in this series.
  • The parameters in the Get-Stat and the Get-StatType cmdlets are not consistent. On the Get-StatType cmdlet you have to use the -Interval parameter and on the Get-Stat cmdlet you have to use the -IntervalMins or -IntervalSecs parameters to basically achieve the same thing.
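
The first point is easier to see with a count. This sketch uses stand-in objects rather than real Get-Stat output (the timestamp and values are made up) and assumes, as observed above, that one sample means one timestamp. It models the host with 8 logical CPUs plus the aggregate.

```powershell
# Stand-in data: one timestamp, eight logical CPUs plus the aggregate (values made up).
$timestamp = [datetime]'2009-12-30T10:00:00'
$instances = @('') + (0..7 | ForEach-Object { "$_" })

$samples = foreach ($instance in $instances) {
    [pscustomobject]@{ Timestamp = $timestamp; Instance = $instance; Value = 5.0 }
}

# Number of values versus number of distinct timestamps.
$valueCount     = @($samples).Count
$timestampCount = @($samples | Group-Object -Property Timestamp).Count

"Values returned     : $valueCount"
"Distinct timestamps : $timestampCount"
```

Nine values share a single timestamp, so a limit that counts timestamps is satisfied by one sample even though nine rows come back.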

Now that we know the instances for this metric (cpu.usage.average) we can limit the values to what we actually want to see.

In this case I’m interested in the aggregate value. Now unfortunately the Get-Stat cmdlet does not have an -Instance parameter (yet). That means we will have to filter the returned data.

You could do that like this


Get-Stat -Entity (Get-VMHost $esxName) -Stat cpu.usage.average -IntervalMins 5 | where{$_.instance -eq ""}

This returns something like this

Intervals and values

In the Intervals section I tried to explain what the statistics intervals are all about.

Now when you use the Get-Stat cmdlet it is quite important to know in which interval(s) your query will get its values and what the impact will be on the returned values.

First let’s have a look at how you specify a time range to which you want to limit your query. To define a time range you use the -Start and -Finish parameters. With the help of the Get-Date cmdlet and some of its methods you can easily define time ranges. Something like this.

Get-Stat -Entity (Get-VMHost $esxName) -Stat cpu.usage.average -Start (Get-Date).AddDays(-4) -Finish (Get-Date).AddDays(-3) | where{$_.Instance -eq ""}

And this returns

But for reporting you normally don’t want your intervals to start at the same time, albeit on another day, that you execute the Get-Stat cmdlet. In your reports the intervals should start at midnight and stop with the last interval before midnight.

Thanks to the PowerShell magic this is not too hard to accomplish. This is just one way of doing this, there are most probably other methods.

$todayMidnight = Get-Date -Hour 0 -Minute 0 -Second 0
Get-Stat -Entity (Get-VMHost $esxName) -Stat cpu.usage.average -Start $todayMidnight.AddDays(-4) -Finish $todayMidnight.AddDays(-3) | where{$_.Instance -eq ""}

And now we get these values.

Nearly there, but the start and end interval are off by one. Again this is easy to correct by subtracting a number of minutes (less than the interval of course) from the -Start and -Finish parameters.

$todayMidnight = (Get-Date -Hour 0 -Minute 0 -Second 0).AddMinutes(-1)
Get-Stat -Entity (Get-VMHost $esxName) -Stat cpu.usage.average -Start $todayMidnight.AddDays(-4) -Finish $todayMidnight.AddDays(-3) | where{$_.Instance -eq ""}

And now the interval starts at midnight and ends with the last interval of that day.
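
The date arithmetic can be checked on its own, without a vCenter. The sketch below wraps it in a helper; the function name Get-ReportWindow and its parameters are made up for this example. It uses the .Date property of a DateTime, which returns midnight of that day with the time of day fully zeroed, and keeps the one-minute offset from above.

```powershell
# Hypothetical helper: returns the -Start/-Finish pair for one full day, N days back.
function Get-ReportWindow {
    param(
        [int]$DaysBack = 4,
        [datetime]$Now = (Get-Date)
    )
    # .Date gives midnight of the current day (hours, minutes, seconds, milliseconds all zero).
    $midnight = $Now.Date
    # Shift one minute back, as in the correction above.
    $offset = $midnight.AddMinutes(-1)

    [pscustomobject]@{
        Start  = $offset.AddDays(-$DaysBack)
        Finish = $offset.AddDays(-($DaysBack - 1))
    }
}

# Fixed "now" so the result is reproducible.
$w = Get-ReportWindow -DaysBack 4 -Now ([datetime]'2009-12-30T15:42:10')
"Start  : {0:yyyy-MM-dd HH:mm:ss}" -f $w.Start
"Finish : {0:yyyy-MM-dd HH:mm:ss}" -f $w.Finish
```

For the fixed date used here the window runs from 2009-12-25 23:59:00 to 2009-12-26 23:59:00. The two values can then be passed to Get-Stat as -Start $w.Start -Finish $w.Finish.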

This concludes the first part in the “statistics” series. This post just touched the basics (hence the title) of statistics gathering with PowerCLI.

Watch out for future posts where I will go into more specifics.

In the meantime, if you have any questions or are trying to find out how something could be done with the Get-Stat cmdlet, feel free to post your question(s) or problem(s) in the comments.

Happy New Year and see you in 2010.


  1. Mike Leone
    January 12th, 2012 at 16:22 | #1

    Very informative article, thanks. One question, tho – if I was measuring cpu.usage.average of a *VM* (not an ESX/ESXi server), would I still need to specify an instance? Presume all VMs have at least 2 vCPUs; in that case, I would want to use the instance indicator as you have (.instance -eq “”)

    • January 12th, 2012 at 17:42 | #2

      @Mike, thanks.
      You are correct, if you are interested in the aggregate value, you would have to use the “” instance.
      Note that the Get-Stat cmdlet in the latest PowerCLI builds does have an Instance parameter. So you can specify the instance immediately with the Get-Stat cmdlet.

  2. cybercoaster
    April 8th, 2011 at 17:20 | #3

    @LucD

    Thanks so much, that did the trick! I have lots to learn.

  3. Cybercoaster
    April 7th, 2011 at 22:52 | #4

    I am wanting to get an output of the average cpu usage for all of our servers. I am using the following code however I would like each line to also list the host that it came from.
    Right now it just outputs the date/time/% I would like to output this to a CSV so I can import in Excel and sort by hostname.

    Any help would be much appreciated as I am pretty new to this and still looking through a lot of web sites and examples.

    $vhosts = (get-vmhost)

    foreach ($vhost in $vhosts){

    Get-Stat -Entity (Get-VMHost $vhost) -Stat cpu.usage.average

    }

    • April 7th, 2011 at 23:26 | #5

      @Cybercoaster. The server information is present in the returned objects, but you are not seeing that ‘property’ because of the Format file that comes with PowerCLI.
      In such a Format file you define which properties of a specific object are displayed by default.

      With the Select-Object cmdlet you can specify explicitly which properties you want to see.
      In your case, you could do:

      Get-Stat -Entity (Get-VMHost) -Stat cpu.usage.average | `
      Select @{N="Server";E={$_.Entity.Name}},
      MetricId,Timestamp,Value,Unit,Instance | `
      Export-Csv "C:\server-cpu-avg.csv" -NoTypeInformation -UseCulture

      The ‘Server’ property is a so-called calculated property. You provide a name (N) for the property and an expression (E) how to get the value of the property. The rest of the properties are the same level-1 properties you see in the default display of the object.
      Also note that you can provide all servers in 1 call to Get-Stat. This will be considerably faster than making an individual call for each host.

  4. March 2nd, 2011 at 09:32 | #6

    Thanks for sharing this. It has been useful in one of my client’s request.
    Drinks on me when you visit Singapore.

    e1@Singapore

  5. Tchek14
    October 9th, 2010 at 11:08 | #7

    Hello,

    When I launch this command :
    Get-StatType -Entity (Get-VMHost $esxName) -Interval “Past Day”

    I don’t have stat cpu.usage.minimum and cpu.usage.maximum. How can enable this metrics ?

    Thanks in advance

    • October 9th, 2010 at 11:52 | #8

      @Tchek14. To collect minimum and maximum metrics the statistics level has to be at level 4.
      See the Statistics level section above.

  6. lastcall
    March 4th, 2010 at 00:33 | #9

    Hi, can you help me? this script runs but it doesn’t give me the data i need. i need to come up with performance stats of % cpu usage and % memory usage over the past month per host per vm per hour. then i can create a pivotchart which i can automate into a macro hopefully (as i can do it manually). this script creates an excel spreadsheet and populates the sheets with data only not the data i need. also, can the vm’s be grouped by vmhosts which are grouped by clusters? so, this way my data is hierarchical ie: by cluster, vmhost, and vm. i havent written many scripts and i have been playing with the get-stat command however i dont know how to identify the host’s/vm’s that i am extracting data from.

    # Get VM / Datastore / Host information from VMware clusters and add it to a excel spreadsheet,
    # please note that you will need the VMware powershell plugins and a copy of Excel on the machine running the code.

    #YOURSERVER
    Connect-VIServer -Server 192.168.24.6 -User lab\john.weber -Password L3tm3in!

    $Excel = New-Object -Com Excel.Application
    $Excel.visible = $True
    $Excel = $Excel.Workbooks.Add()
    $Addsheet = $Excel.sheets.Add()
    $Sheet = $Excel.WorkSheets.Item(1)
    $Sheet.Cells.Item(1,1) = “Name”
    $Sheet.Cells.Item(1,2) = “Number of CPUs”
    $Sheet.Cells.Item(1,3) = “Memory (MB)”

    $WorkBook = $Sheet.UsedRange
    $WorkBook.Font.Bold = $True

    $intRow = 2
    $colItems = Get-VM | Select-Object -property “Name”,”NumCPU”,”MemoryMB”

    foreach ($objItem in $colItems)
    {
    $Sheet.Cells.Item($intRow,1) = $objItem.Name

    # $powerstate = $objItem.PowerState
    # If ($PowerState -eq 1) {$power = “Powerd On”}
    # Else {$power = “Powerd Off”}

    $Sheet.Cells.Item($intRow,2) = $objItem.NumCPU
    $Sheet.Cells.Item($intRow,3) = $objItem.MemoryMB

    $intRow = $intRow + 1

    }

    $WorkBook.EntireColumn.AutoFit()

    $Sheet = $Excel.WorkSheets.Item(2)
    $Sheet.Cells.Item(1,1) = “Name”
    # $Sheet.Cells.Item(1,2) = “Free Space”
    # $Sheet.Cells.Item(1,3) = “Capacity”

    $WorkBook = $Sheet.UsedRange
    $WorkBook.Font.Bold = $True

    $intRow = 2
    $colItems = Get-cluster | Select-Object -property “Name” #,”FreeSpaceMB”,”CapacityMB”

    foreach ($objItem in $colItems)
    {
    $Sheet.Cells.Item($intRow,1) = $objItem.Name
    # $Sheet.Cells.Item($intRow,2) = $objItem.FreeSpaceMB
    # $Sheet.Cells.Item($intRow,3) = $objItem.CapacityMB

    $intRow = $intRow + 1

    }

    $WorkBook.EntireColumn.AutoFit()

    $Sheet = $Excel.WorkSheets.Item(3)
    $Sheet.Cells.Item(1,1) = “Name”
    # $Sheet.Cells.Item(1,2) = “State”

    $WorkBook = $Sheet.UsedRange
    $WorkBook.Font.Bold = $True

    $intRow = 2
    $colItems = Get-VMhost | Select-Object -property “Name” #,”state”

    foreach ($objItem in $colItems)
    {
    $Sheet.Cells.Item($intRow,1) = $objItem.Name

    # $state = $objItem.State
    # If ($state -eq 0) {$status = “Connected”}
    # Else {$status = “Disconnected”}

    # $Sheet.Cells.Item($intRow,2) = $status

    $intRow = $intRow + 1

    }

    $WorkBook.EntireColumn.AutoFit()

    • March 4th, 2010 at 07:15 | #10

      Hi John,
      That sounds like a great practical application. Let me see what I can come up with. Might even do a dedicated post on this.
