PowerCLI & vSphere statistics – Part 1 – The basics

Statistics are another popular subject in the VMTN PowerCLI community. Quite often it’s not entirely clear to the user what is available, how the data can be extracted and how PowerShell/PowerCLI can be used to convert the raw metrics into usable reports.

Before you can fully use all that is available, there are a few key concepts that you should understand.

In this series I will try to answer some common questions.

Introduction

Since I want to focus on the practicalities of statistics, I’m not going to repeat what is already available in several excellent papers. The following are some of the papers that you should read or locations you should visit regularly, if you want to fully understand the gathering and usage of vSphere statistical data.

- Understanding VirtualCenter Performance Statistics: should be read by anyone working with vCenter performance data. It gives a clear explanation of intervals, statistics levels and the update interval.
- vCenter Performance Counters: an overview of the available vCenter performance counters, each with a short description of its purpose.
- Understanding performance: a great collection of documents that explain several key concepts in virtual performance.
- Performance & VMmark VMTN community: this is where you will find interesting discussions on performance and also a lot of documents on performance. And where you can of course post your questions!
- VROOM!: “THE BLOG” on performance. The posts are written by members of VMware’s Performance Engineering Team and contain deep-dives, best practices, comparisons… Subscribe and read!
- Technical papers: VMware publishes numerous technical papers on performance. There is a bit of overlap with the previous sources I mentioned, but check out the list of published papers regularly.
- The vSphere Web Services SDK Programming Guide: chapter 12, “Monitoring Performance”, contains an excellent and detailed overview of the key concepts.
- Performance Troubleshooting for VMware vSphere 4 and ESX 4.0: an excellent document that offers a very useful wizard to analyse performance problems.

Intervals

Although I promised not to repeat anything from the sources I mentioned above, this is one key concept about performance data in a vCenter environment that should be clearly understood before you start coding.

In short:

On an ESX/ESXi server the performance data is collected at 20-second intervals. This is the so-called Realtime interval. This data is kept on the ESX/ESXi server for about one hour.
The ESX/ESXi server will aggregate the realtime data into 5-minute interval data.
The unmanaged ESX/ESXi server will keep this data for 1 day.
The managed ESX/ESXi server will send the data to the vCenter, which stores it in the vCenter database.
In the vCenter this 5-minute interval is consolidated into 30-minute interval data. This data is kept for 1 week.
The aggregation/consolidation of the data is done on the database server that hosts your vCenter database. The following are some examples from a SQL 2005 Server.
- The actual “rollup” jobs. Ignore the “event cleanup” job.
- The schedule of the daily “rollup” job.
There are two more intervals, the 2-hour interval, which is kept for 1 month, and the 1-day interval, which is kept for 1 year.
These 4 intervals are sometimes referred to as Historical Interval 1 (5 minutes), Historical Interval 2 (30 minutes), Historical Interval 3 (2 hours) and Historical Interval 4 (1 day).
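
As a rough worked example of what these retention periods mean, the sketch below computes how many samples each historical interval holds for a single metric instance. It is plain PowerShell and needs no vCenter connection. The 30-day month and the 365-day year are simplifying assumptions for the arithmetic, not values taken from vCenter.

```powershell
# Historical intervals as described above: sampling period and retention.
# Assumption: 1 month = 30 days, 1 year = 365 days.
$intervals = @(
    [pscustomobject]@{ Name = 'Historical Interval 1'; Period = [timespan]::FromMinutes(5);  Retention = [timespan]::FromDays(1) }
    [pscustomobject]@{ Name = 'Historical Interval 2'; Period = [timespan]::FromMinutes(30); Retention = [timespan]::FromDays(7) }
    [pscustomobject]@{ Name = 'Historical Interval 3'; Period = [timespan]::FromHours(2);    Retention = [timespan]::FromDays(30) }
    [pscustomobject]@{ Name = 'Historical Interval 4'; Period = [timespan]::FromDays(1);     Retention = [timespan]::FromDays(365) }
)

# Number of samples kept = retention period divided by sampling period
$samples = $intervals | Select-Object Name,
    @{ Name = 'Samples'; Expression = { [int]($_.Retention.TotalMinutes / $_.Period.TotalMinutes) } }

$samples | Format-Table -AutoSize
```

Multiply these counts by the number of metrics, instances and entities you collect to get a feeling for how quickly the database grows.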

A schematic will perhaps make this a bit clearer.

Statistics level

The statistical levels are another key concept you need to understand before you dive into statistical data gathering. In short: the statistical levels define which metrics are available in which interval.

The available statistical levels are defined as follows:

Level	Description
1	Basic metrics. Device metrics excluded. Only average rollups.
2	All metrics except those for devices. Maximum and minimum rollups excluded.
3	All metrics, maximum and minimum rollups excluded
4	All metrics

You define the statistical level for each “historical” interval in the vSphere Client under <Administration><vCenter Server Settings><Statistics>.

You could decide to keep all the metrics in all intervals, but you will most probably not need that level of detail for the “older” intervals. And don’t forget that the more data you keep, the bigger your VC database will become (longer roll up jobs, longer backups…).

As an example let’s take the metrics around cpu.usage. This metric comes in 3 types of roll up: minimum, average and maximum. While it definitely is useful to have all these roll up types for the realtime interval and quite possibly for historical interval 1, there is really no justification to keep all of them for historical interval 4.

In this case you can then safely define the statistical level for historical interval 4 to be level 2.

Note that the vSphere Client gives you an estimate of the required space on your VC database for the statistics levels and retention periods you define. Just fill in the number of hosts and guests and you will have a rough idea!

Instances

The last key concept you need to know (I promise ;-)) before we start scripting.

The official definition from the VMware vSphere API Reference Documentation: “An identifier that is derived from configuration names for the device associated with the metric. It identifies the instance of the metric with its source.”

Let’s try to make this a bit more understandable through an example.

Take the CPU-related metrics for an ESX/ESXi host. If the ESX/ESXi server is equipped with a quadcore CPU, there will be four instances: 0, 1, 2 and 3. In this case the instance corresponds with the numeric position of the core within the CPU.

And there will be a so-called aggregate, which is the metric averaged over all the instances.

These instances each get their own identifier which will be part of the returned statistical data. The aggregate instance is always represented by a blank identifier.
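
To make the relation between the instances and the aggregate concrete, here is a small sketch that does not need a vCenter connection. The objects are hand-made stand-ins for statistical data (only the MetricId, Instance and Value properties are mimicked), and the values are invented purely for illustration.

```powershell
# Invented values for a quad-core host; the blank Instance is the aggregate
$samples = @(
    [pscustomobject]@{ MetricId = 'cpu.usage.average'; Instance = '';  Value = 25.0 }
    [pscustomobject]@{ MetricId = 'cpu.usage.average'; Instance = '0'; Value = 10.0 }
    [pscustomobject]@{ MetricId = 'cpu.usage.average'; Instance = '1'; Value = 20.0 }
    [pscustomobject]@{ MetricId = 'cpu.usage.average'; Instance = '2'; Value = 30.0 }
    [pscustomobject]@{ MetricId = 'cpu.usage.average'; Instance = '3'; Value = 40.0 }
)

# Split the aggregate from the per-core instances
$aggregate = $samples | Where-Object { $_.Instance -eq '' }
$perCore   = $samples | Where-Object { $_.Instance -ne '' }

# Average over the individual instances, to compare with the aggregate
$coreAverage = ($perCore | Measure-Object -Property Value -Average).Average

"Instances : {0}" -f (($perCore | ForEach-Object { $_.Instance }) -join ', ')
"Aggregate : {0}" -f $aggregate.Value
"Average   : {0}" -f $coreAverage
```

In this made-up data set the aggregate equals the average of the four cores, which is what the definition above leads you to expect.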

Scripting

What is available ?

To see all the available metrics you can consult Appendix C: Performance Metrics in the Basic System Administration manual. But that appendix might not always reflect the actual situation, for example because upgrades or patches introduced or changed one or more metrics.

A better method is to just ask the system. PowerCLI 4 introduced a new cmdlet, Get-StatType, that does just that.

The cmdlet has one required parameter, called -Entity, that allows us to indicate for which object we want to query the available metrics.

Get-StatType -Entity (Get-VMHost $esxName)

In this form you will get all the metrics for an ESX/ESXi server and for all intervals. This last fact is not really useful if we have different statistical levels defined for the different intervals.
Luckily we have the -Interval parameter. It allows you to specify the interval for which you want to see the available metrics. But does this mean that you will have to learn all the statistical intervals, by duration or by name, by heart? No, you can use the Get-StatInterval cmdlet to get a list of all the historical intervals.

Get-StatInterval

It returns something like this

With this knowledge we can now launch the Get-StatType cmdlet with a more specific request

Get-StatType -Entity (Get-VMHost $esxName) -Interval "Past Day"

And this returns a list of metrics like this.

But it still looks confusing! Why do we have apparently nine times the same metric in the list ?

To give away the answer: it’s due to the instances that are available for this metric.

In this case the ESX/ESXi host contained a quadcore CPU with hyper-threading active, which means that there are 8 logical CPU cores as far as vSphere is concerned. The ninth entry comes from the aggregate value, the average over all logical CPU cores.

Unfortunately the Get-StatType cmdlet doesn’t give you the available instances. To see those you will have to look at the actual statistical data. Something like this

Get-Stat -Entity (Get-VMHost $esxName) -Stat cpu.usage.average -MaxSamples 1 -IntervalMins 5

And this returns something like this

In the Instance column you will now see all the available instances for this metric. In the first line the instance is blank, this represents the aggregate instance.

Three points to note in the cmdlet and its parameters:

Although I asked for 1 sample (-MaxSamples 1) the cmdlet returned 9 values. The -MaxSamples parameter apparently only looks at the Timestamp. It doesn’t count the number of returned values.
I used the -IntervalMins parameter to specify Historical Interval 1. If I had left out this parameter, the Get-Stat cmdlet would have used its default, which in this case would have returned values from Historical Interval 4. More on the Get-Stat defaults in a future post in this series.
The parameters in the Get-Stat and the Get-StatType cmdlets are not consistent. On the Get-StatType cmdlet you have to use the -Interval parameter and on the Get-Stat cmdlet you have to use the -IntervalMins or -IntervalSecs parameters to basically achieve the same thing.

Now that we know the instances for this metric (cpu.usage.average) we can limit the values to what we actually want to see.

In this case I’m interested in the aggregate value. Now unfortunately the Get-Stat cmdlet does not have an -Instance parameter (yet). That means we will have to filter the returned data.

You could do that like this

Get-Stat -Entity (Get-VMHost $esxName) -Stat cpu.usage.average -IntervalMins 5 | where{$_.instance -eq ""}

This returns something like this

Intervals and values

In the Intervals section I tried to explain what the statistics intervals are all about.

Now when you use the Get-Stat cmdlet it is quite important to know in which interval(s) your query will get its values and what the impact will be on the returned values.

First let’s have a look at how you specify the time range to which you want to limit your query. To define a time range you use the -Start and -Finish parameters. With the help of the Get-Date cmdlet and some of its methods you can easily define time ranges. Something like this.

Get-Stat -Entity (Get-VMHost $esxName) -Stat cpu.usage.average -Start (Get-Date).AddDays(-4) -Finish (Get-Date).AddDays(-3) | where{$_.Instance -eq ""}

And this returns

But for reporting you normally don’t want your intervals to start at the same time, albeit on another day, that you execute the Get-Stat cmdlet. In your reports the intervals should start at midnight and stop with the last interval before midnight.

Thanks to the PowerShell magic this is not too hard to accomplish. This is just one way of doing it; there are most probably other methods.

$todayMidnight = Get-Date -Hour 0 -Minute 0 -Second 0
Get-Stat -Entity (Get-VMHost $esxName) -Stat cpu.usage.average -Start $todayMidnight.AddDays(-4) -Finish $todayMidnight.AddDays(-3) | where{$_.Instance -eq ""}

And now we get these values.

Nearly there, but the first and the last interval are each off by one. Again this is easy to correct by subtracting a number of minutes (less than the interval of course) from the values passed to the -Start and -Finish parameters.

$todayMidnight = (Get-Date -Hour 0 -Minute 0 -Second 0).AddMinutes(-1)
Get-Stat -Entity (Get-VMHost $esxName) -Stat cpu.usage.average -Start $todayMidnight.AddDays(-4) -Finish $todayMidnight.AddDays(-3) | where{$_.Instance -eq ""}

And now the interval starts at midnight and ends with the last interval of that day.
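
The midnight arithmetic above can also be wrapped in a small helper. The following is a minimal sketch, not part of the original scripts: the function name Get-DayWindow and its parameters are hypothetical. It uses the .Date property of a DateTime, which sets the milliseconds to zero as well (as far as I know, Get-Date -Hour 0 -Minute 0 -Second 0 leaves the current milliseconds in place), and it applies the same one-minute shift as above.

```powershell
# Hypothetical helper: returns the -Start/-Finish pair for one full day,
# shifted back by a small offset so the first sample lands on midnight.
function Get-DayWindow {
    param(
        [int]$DaysAgo = 1,        # 1 = yesterday, 4 = four days ago, ...
        [int]$OffsetMinutes = 1   # must be smaller than the statistics interval
    )

    # .Date is today's midnight: hours, minutes, seconds and milliseconds are zero
    $midnight = (Get-Date).Date

    [pscustomobject]@{
        Start  = $midnight.AddDays(-$DaysAgo).AddMinutes(-$OffsetMinutes)
        Finish = $midnight.AddDays(-$DaysAgo + 1).AddMinutes(-$OffsetMinutes)
    }
}

# Same window as in the example above: from 4 days ago until 3 days ago
$window = Get-DayWindow -DaysAgo 4
$window
```

The two values can then be passed to Get-Stat as -Start $window.Start -Finish $window.Finish, in the same way as the $todayMidnight expressions above.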

This concludes the first part in the “statistics” series. This post just touched the basics (hence the title) of statistics gathering with PowerCLI.

Watch out for future posts where I will go into more specifics.

In the meantime, if you have any questions or are trying to find out how something could be done with the Get-Stat cmdlet, feel free to post your question(s) or problem(s) in the comments.

Happy New Year and see you in 2010.

48 Comments

Jyoti Maindola

June 1, 2022 at 17:26

Hello LucD

I’m trying to export the realtime CPU and Memory usage of all the Virtual Machines in vCenter but my script is giving me the exact values.
Below is my script. Please Assist me. Thank you in advance!

$report = @()
$metrics = "cpu.usagemhz.average","mem.active.average"
$vms = Get-Vm | where {$_.PowerState -eq "PoweredOn"}

Get-Stat -Realtime -MaxSamples 1 -Entity ($vms) -stat $metrics | where{$_.instance -eq ""} | `
Group-Object -Property EntityId | %{
$row = ""| Select VmName, Timestamp, vCPU,CpuUsed,MemAlloc,MemUsed
$row.VmName = $_.Group[0].Entity.Name
$row.Timestamp = ($_.Group | Sort-Object -Property Timestamp)[0].Timestamp
$row.vCPU = $_.Group[0].Entity.NumCpu
$cpuStat = $_.Group | where {$_.MetricId -eq "cpu.usagemhz.average"} | Measure-Object -Property Value -Maximum
$row.CpuUsed = "{0:f2}" -f ($cpuStat.Maximum)
$row.MemAlloc = $_.Group[0].Entity.MemoryMB
$memStat = $_.Group | where {$_.MetricId -eq "mem.active.average"} | Measure-Object -Property Value -Average
$row.MemUsed = "{0:f2}" -f ($memStat.Average)
$report += $row
}
$report | Export-Csv "C:\JYOTI\Temp\VMUsage.csv" -NoTypeInformation -UseCulture
