PowerCLI & vSphere statistics – Part 4 – Grouping

In a previous post in this series (see PowerCLI & vSphere statistics – Part 2 – Come together), I showed the usefulness of the Group-Object cmdlet when working with statistical data. The script in that post grouped data samples for each hour together, which made it much easier to calculate the hourly average. With the Group-Object cmdlet you avoid numerous nested if- or switch-statements.

And best of all, you don’t have to code the grouping yourself; the PowerShell Team has already done it for you.

So make sure this cmdlet belongs to your basic PowerShell repertoire. It will prove invaluable when processing statistical data.

This post will show you several of the different options you have to group statistical data together. And I will illustrate each of these with a sample script.

Grouping using one property

Let’s start with something simple.

We want to produce a CSV file with the average CPU usage per core on a specific ESX host over the last full hour.

With the -Start and -Finish parameters of the Get-Stat cmdlet we can limit the returned data to the last full hour.
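As a minimal sketch of how such a time window could be computed (the variable names are my own and not necessarily those used in the script discussed here): the window ends one second before the top of the current hour and starts one hour before the top of the hour.

```powershell
# Assumption: variable names are illustrative only.
$now       = Get-Date
$topOfHour = $now.Date.AddHours($now.Hour)   # current hour, minutes and seconds zeroed
$finish    = $topOfHour.AddSeconds(-1)       # just before the full hour
$start     = $topOfHour.AddHours(-1)         # start of the previous full hour

# With a connected vCenter session these values could be passed on, for example:
# Get-Stat -Entity $esx -Stat cpu.usage.average -Start $start -Finish $finish -IntervalMins 5
"{0:HH:mm:ss} -> {1:HH:mm:ss}" -f $start, $finish
```

The Get-Stat call is left as a comment because it needs a live connection; `$esx` there is a placeholder for a host object.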

And we know that the Instance property holds either the core-Id or an empty string, which indicates the aggregate value (i.e. the average over all cores).

It is then the obvious choice to group the data on the Instance property.

Annotations

Line 3: Produces a timestamp a few seconds before the current full hour, so that the sample for the full hour itself is not included.

Line 6: We use the -IntervalMins 5 parameter to make sure Get-Stat returns values for the complete hour. If you leave out this parameter, the Get-Stat cmdlet in some cases returns samples from the Realtime interval. Depending on how many minutes after the full hour you run the script, you then risk not having samples for the complete hour.

Line 7: The returned statistical data is grouped on the Instance property, which is the core-Id.

Line 8: The Where-Object filters out the data for the aggregate instance (the average over all the cores).

Lines 9-12: This uses a PowerShell v2 feature (-Property) of the New-Object cmdlet.
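The steps described in these annotations can be sketched as follows. This is not the original script: it runs on a handful of synthetic samples instead of Get-Stat output, assumes that only the Instance and Value properties matter, and uses property names (CoreId, AvgCpuPerc) that are my own. The `[pscustomobject]` shorthand needs PowerShell 3.0 or later.

```powershell
# Synthetic stand-ins for Get-Stat samples (assumption).
# An empty Instance marks the aggregate over all cores.
$stats = @(
    [pscustomobject]@{ Instance = '';  Value = 10.0 }
    [pscustomobject]@{ Instance = '0'; Value = 12.0 }
    [pscustomobject]@{ Instance = '0'; Value = 14.0 }
    [pscustomobject]@{ Instance = '1'; Value = 6.0 }
    [pscustomobject]@{ Instance = '1'; Value = 10.0 }
)

$report = $stats |
    Group-Object -Property Instance |        # one group per core-Id
    Where-Object { $_.Name -ne '' } |        # drop the aggregate instance
    ForEach-Object {
        New-Object -TypeName PSObject -Property @{
            CoreId     = $_.Name
            AvgCpuPerc = ($_.Group | Measure-Object -Property Value -Average).Average
        }
    }

# ConvertTo-Csv shows the text that Export-Csv would write to a file.
$report | Select-Object CoreId, AvgCpuPerc | ConvertTo-Csv -NoTypeInformation
```

Select-Object is there only to fix the column order, since a hash table passed to -Property does not guarantee one.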

Since a picture says more than 1K words, here is a simple diagram.

  • Get-Stat returns a number of samples
  • The CPU core-Id is stored in the Instance property
  • The Group-Object cmdlet creates a new array element for each value present in the Instance property seen over all the samples
  • Such an array element contains:
    • a Name property that holds, in this example, the core-Id
    • a Group property that holds all the samples for the specific core-Id in the Name property

If we have a look with the Variable Inspector from PowerShell Plus, we can see how the $groups variable is composed.

The Group-Object cmdlet creates a GroupInfo object for each “group” it detects.

In the Name property we get the value that was used for a specific group. For $groups[0] this is an empty string, in other words the group for the aggregate value. In $groups[1] we can see from the Name property that this group instance was for core-Id 0.
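If you do not have a variable inspector at hand, the same structure can be explored from the prompt. The following is a small sketch on synthetic samples (an assumption; real samples would come from Get-Stat) that lists the groups and then looks inside the group for the aggregate instance.

```powershell
# Synthetic stand-ins for Get-Stat samples (assumption).
$stats = @(
    [pscustomobject]@{ Instance = '';  Value = 9.5 }
    [pscustomobject]@{ Instance = '0'; Value = 11.0 }
    [pscustomobject]@{ Instance = '1'; Value = 8.0 }
    [pscustomobject]@{ Instance = '';  Value = 10.5 }
    [pscustomobject]@{ Instance = '0'; Value = 13.0 }
    [pscustomobject]@{ Instance = '1'; Value = 7.0 }
)

$groups = $stats | Group-Object -Property Instance

# One GroupInfo object per distinct Instance value
$groups | Select-Object Name, Count

# The Group property holds the original samples for that value;
# the empty Name identifies the aggregate instance.
$aggregate = $groups | Where-Object { $_.Name -eq '' }
$aggregate.Group | Select-Object Instance, Value
```

The group is selected by its Name rather than by index, so the sketch does not depend on the order in which Group-Object returns the groups.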

The resulting CSV file looks like this when the script is executed on a single quad-core Nehalem processor with hyper-threading enabled.

Grouping using multiple properties

The previous script produced a CSV file for a specific host. But what if we want the same type of CSV file for multiple ESX hosts?

In fact this is quite easy with the Group-Object cmdlet: just add the Entity property to the -Property parameter.

Annotations

Line 6: The additional property Entity was added to include it in the grouping process.

Line 7: The values for both grouping conditions (Entity and Instance) are available in the Values property of the GroupInfo object. Note that the Name property in the GroupInfo object now contains the values for both grouping conditions.

Line 9: The first element in the Values array is the hostname. The order of the elements corresponds with the order that was used on the -Property parameter of the Group-Object cmdlet.

Line 10: The second element is the core-Id.

Another look at your best friend the Variable Inspector makes all this perhaps a bit clearer.

Notice how the Values property contains an array with values for the conditions for a specific group.
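A minimal sketch of grouping on two properties, again on synthetic samples. The host names are made up, and Entity is a plain string here, whereas Get-Stat returns an object in that property. The output property names are my own.

```powershell
# Synthetic stand-ins for Get-Stat samples from several hosts (assumption).
$stats = @(
    [pscustomobject]@{ Entity = 'esx1'; Instance = '0'; Value = 10.0 }
    [pscustomobject]@{ Entity = 'esx1'; Instance = '0'; Value = 20.0 }
    [pscustomobject]@{ Entity = 'esx1'; Instance = '1'; Value = 30.0 }
    [pscustomobject]@{ Entity = 'esx2'; Instance = '0'; Value = 40.0 }
    [pscustomobject]@{ Entity = 'esx2'; Instance = '1'; Value = 50.0 }
)

$report = $stats |
    Group-Object -Property Entity, Instance |
    ForEach-Object {
        New-Object -TypeName PSObject -Property @{
            VMHost     = $_.Values[0]   # first grouping property: Entity
            CoreId     = $_.Values[1]   # second grouping property: Instance
            AvgCpuPerc = ($_.Group | Measure-Object -Property Value -Average).Average
        }
    }

$report | Select-Object VMHost, CoreId, AvgCpuPerc
```

Reading the individual values from Values avoids having to split the combined Name string.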

And the resulting CSV file looks like this. This was produced for three ESX hosts each with a quad-core Nehalem processor with hyper-threading enabled.

Grouping with Boolean expressions

We take our first sample script one step further.

This time we only want to report the intervals where one or more of the CPU cores used more than 15% of their capacity. I know 15% is not a realistic threshold for a report, but it is for demonstration purposes only and I didn’t want to overload my test boxes 😉

Annotations

Line 7: The -Property parameter is now a script block that can return $true or $false. The test for the aggregate instance is now included in the Boolean expression.

Line 8: Another great concept by the PowerShell Team: when Group-Object is called with the -AsHashTable switch, the result is a hash table rather than an array of GroupInfo objects. This means you can access a specific group by using the required condition value as a key. Since we want the samples where all the conditions are met, we use the $true key.
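A minimal sketch of grouping on a Boolean script block, using synthetic samples. It uses -AsHashTable so that a group can be looked up by key. The -AsString switch is my own addition: it turns the keys into the strings 'True' and 'False', so the lookup does not depend on how a particular PowerShell version stores the Boolean key.

```powershell
# Synthetic stand-ins for Get-Stat samples (assumption).
$stats = @(
    [pscustomobject]@{ Instance = '';  Value = 18.0 }   # aggregate, must be ignored
    [pscustomobject]@{ Instance = '0'; Value = 22.5 }
    [pscustomobject]@{ Instance = '1'; Value = 4.0 }
    [pscustomobject]@{ Instance = '2'; Value = 16.0 }
)

# The script block returns $true only for per-core samples above 15%.
$groups = $stats |
    Group-Object -Property { $_.Instance -ne '' -and $_.Value -gt 15 } -AsHashTable -AsString

# With -AsHashTable each value is the collection of samples for that key.
$busy = $groups['True']
$busy | Select-Object Instance, Value
```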

The view with the Variable Inspector.

And the resulting CSV file.

Nested groups

And for the “Pièce de résistance” we now do this for multiple ESX hosts and we want to produce separate CSV files for each host.

Annotations

Line 6: The method for the first grouping is identical to that of the previous script.

Line 7: The second grouping is done with all the values where the previous Boolean expression returned $true. On those values we now group per Entity, which allows us to create a separate CSV file for each host.

Line 8: The script loops through all the groups (i.e. hosts).

Line 9: The script creates an entry for each statistical value.

Lines 16-17: Finally, a CSV file is created for each host.

In my test environment there were two hosts that had cores where the average CPU usage was higher than 15%.
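The nested grouping can be sketched like this on synthetic samples (host names and values are made up). The first grouping applies the Boolean expression, the second groups the matching samples per host. To keep the sketch free of side effects, the CSV text is collected in a hash table; the Export-Csv line that would write one file per host is shown as a comment. As before, -AsString is my own addition.

```powershell
# Synthetic stand-ins for Get-Stat samples from three hosts (assumption).
$stats = @(
    [pscustomobject]@{ Entity = 'esx1'; Instance = '';  Value = 20.0 }
    [pscustomobject]@{ Entity = 'esx1'; Instance = '0'; Value = 25.0 }
    [pscustomobject]@{ Entity = 'esx1'; Instance = '1'; Value = 5.0 }
    [pscustomobject]@{ Entity = 'esx2'; Instance = '0'; Value = 3.0 }
    [pscustomobject]@{ Entity = 'esx3'; Instance = '2'; Value = 17.0 }
)

# First grouping: per-core samples above 15% versus everything else.
$byThreshold = $stats |
    Group-Object -Property { $_.Instance -ne '' -and $_.Value -gt 15 } -AsHashTable -AsString

# Second grouping: the matching samples, per host.
$csvPerHost = @{}
$byThreshold['True'] | Group-Object -Property Entity | ForEach-Object {
    $rows = $_.Group | Select-Object Entity, Instance, Value
    $csvPerHost[$_.Name] = $rows | ConvertTo-Csv -NoTypeInformation
    # To write one file per host instead:
    # $rows | Export-Csv -Path ".\$($_.Name).csv" -NoTypeInformation -UseCulture
}

$csvPerHost.Keys   # the hosts that had at least one core above the threshold
```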

Conclusion

I hope this post made you see the usefulness of the Group-Object cmdlet.

Feel free to post any questions or remarks in the Comments section.

10 Comments

    Pugazhendhi

    Hi Lucd,

    I found this script from online and trying to get the output of average CPU and Memory utilization for a week in a CSV file. But am not able to get the results in CSV format. Can you Please help me out with an idea to get the output in CSV format.

    $start = (Get-Date).AddDays(-30)
    $finish = Get-Date
    $metrics = "mem.usage.average"

    Get-Datacenter | ForEach-Object {
        Get-VMHost -Location $_ | Sort-Object Name | ForEach-Object {
            Write-Host $_.Name
            $stats = ""
            $stats = Get-Stat -Entity $_ -Stat $metrics -Start $start -Finish $finish
            Write-Host "$($_) average Memory Usage over the last month is: $("{0:N2}" -f ($stats | Measure-Object -Property Value -Average).Average)%"
        }
    }

      LucD

      When you display data on the console (Write-Host), nothing is placed in the pipeline or stored in a variable, so there is nothing to export.
      The easiest way to do this is to set up a pipeline construction.
      Something like this:

      $start = (Get-Date).AddDays(-30)
      $finish = Get-Date
      $metrics = "mem.usage.average"

      $esx = Get-Datacenter | Get-VMHost
      Get-Stat -Entity $esx -Stat $metrics -Start $start -Finish $finish |
      Group-Object -Property {$_.Entity} |
      ForEach-Object -Process {
          New-Object -TypeName PSObject -Property @{
              VMHost = $_.Name
              AvgMemPerc = ($_.Group | Measure-Object -Property Value -Average).Average
          }
      } | Export-Csv -Path .\report.csv -NoTypeInformation -UseCulture

      I hope that helps.

        Pugazhendhi

        Thanks a lot. This worked for me but in the csv output the value is like decimal (65.15085397) how can get csv output in percentage itself.

          LucD

          That is a percentage.
          If you want to have the ‘%’ sign on there, you could change that line like this:

          AvgMemPerc = "{0:P1}" -f (($_.Group | Measure-Object -Property Value -Average).Average/100)

            Pugazhendhi

            Thanks a lot!!!!!. This worked perfect.

              Pugazhendhi

              I tried to script this code for getting the output (csv file) to my mail id but am receiving mail without the attachment. Please help me!

              $start = (Get-Date).AddDays(-7)
              $finish = Get-Date
              $metrics = “cpu.usage.average”

              $esx = Get-Datacenter | Get-VMHost
              Get-Stat -Entity $esx -Stat $metrics -Start $start -Finish $finish |
              Group-Object -Property {$_.Entity} |
              ForEach-Object -Process {
              New-Object -TypeName PSObject -Property @{
              VMHost = $_.Name
              AvgCPUPerc = ($_.Group | Measure-Object -Property Value -Average).Average
              }

              } | Export-Csv -Path F:\Test\report.csv -NoTypeInformation -UseCulture | Start-Sleep -s 20 | Send-MailMessage -SmtpServer “mail.contoso.com” -To “Me ” -From “Contoso ” -Subject “Host utilization” -Body “The attached spreadsheet contains Host utilization Report” -Attachment = “F:\Test\report.csv”

                LucD

                That will not work with a pipeline construction.
                The Send-MailMessage needs to be separate.
                Btw, I used splatting to organise the Send-MailMessage parameters.

                $start = (Get-Date).AddDays(-7)
                $finish = Get-Date
                $metrics = "cpu.usage.average"

                $esx = Get-Datacenter | Get-VMHost
                Get-Stat -Entity $esx -Stat $metrics -Start $start -Finish $finish |
                Group-Object -Property {$_.Entity} |
                ForEach-Object -Process {
                    New-Object -TypeName PSObject -Property @{
                        VMHost = $_.Name
                        AvgCPUPerc = ($_.Group | Measure-Object -Property Value -Average).Average
                    }
                } | Export-Csv -Path F:\Test\report.csv -NoTypeInformation -UseCulture

                # Send the mail only after the CSV file has been written
                $sMail = @{
                    SmtpServer = 'mail.contoso.com'
                    To = 'Me'
                    From = 'Contoso'
                    Subject = 'Host utilization'
                    Body = 'The attached spreadsheet contains Host utilization Report'
                    Attachments = 'F:\Test\report.csv'
                }
                Send-MailMessage @sMail

    mike

    Hi Luc, great post. Question for you: what is the fastest way to get data from vCenter? PowerShell, Get-Stat, takes a very long time. I was curious whether you have explored anything else?

    Thanks,

    Mike

      LucD

      Hi Mike,
      Yes, Get-Stat can take quite some time before it returns data.
      Depends of course on the size of your vSphere environment, how far back in time you want to go, the size of the vCenter database…

      I wrote a Get-Stat2 function some time ago, that is normally faster than the Get-Stat cmdlet.
      Give it a try.

        mike

        Hi Luc,

        I was hoping to create a script and run it every 10 or 20 minutes to retrieve 5-minute granular data from vCenter for all of the hosts, for CPU and memory utilization, and graph this in real time. Do you think retrieving data from the cluster may be faster? I modified this script, but it looks like those metrics are not available at the cluster level. I’d really appreciate any insight/help.

        $midnight = Get-Date -Hour 0 -Minute 0 -Second 0
        $metrics_rp = “cpu.usage.average”,”mem.usage.average”
        $rp = Get-Cluster | Get-ResourcePool

        Get-Stat -Entity $rp -Stat $metrics_rp -Start $midnight.addminutes(-20) -Finish $midnight.addminutes(-1) -IntervalMins 5 |
        Group-Object -Property {$_.EntityId} | %{
        $cluster = $parent
        }
        Select-Object -InputObject $_ -Property @{N=”Interval start”;E={$midnight.addminutes(-20)}},
        @{N=”Resource Pool”;E={$_.Group[0].Entity.Name}},
        @{N=”Cluster”;E={$cluster.Name}},
        @{N=”CPU”;E={$_.Group | Where {$_.MetricId -eq “cpu.usage.average”} |
        Measure-Object -Property Value -Average | Select -ExpandProperty Average}},
        @{N=”MEM”;E={$_.Group | Where {$_.MetricId -eq “mem.usage.average”} |
        Measure-Object -Property Value -Maximum | Select -ExpandProperty Maximum}} |
        Export-Csv “C:\output\cluster_stats_v2.csv”
        }
