Datastore usage statistics

An interesting question came up in the PowerCLI Community: can you use PowerCLI to extract the datastore statistics that are used for the space utilization graphs in the vSphere Client? The graph in question, which you find in the Datastores Inventory view under the Performance tab, looks something like this.

A quick browse through the available metrics, on the SDK Reference PerformanceManager page, showed that these metrics are indeed available.

With the disk.capacity.latest, disk.provisioned.latest and disk.used.latest metrics this should be a simple script. But was it? As it turned out, there are a few gotchas!

Getting there

The SDK Reference, on the Storage Capacity page, mentions that these metrics are available for virtual machines and for datastores. In this case we obviously want the metrics for the datastore.

Get-Stat doesn’t do datastores.

Which brings us immediately to the first problem: the current Get-Stat cmdlet doesn’t accept datastore entities. No problem, I have an older post, called Get-Stat2 : another way of getting at the statistical data, which I should be able to adapt to retrieve datastore statistics.

The script uses a number of methods on the PerformanceManager object, to retrieve the counterId and available instances for the metrics. Once it has these it calls the QueryPerf method to get the actual statistical data.

Next problem!

The “optional value not set” error

Each call returned an error stating “optional value not set”. A quick search through the VMTN posts showed that I was not the first one to see that error when using datastore metrics. Unfortunately, none of these posts provided a workable solution.

After some trial and error, I discovered that the problem was caused by the intervalId property. Although the SDK Reference states that you can use either beginTime/endTime or intervalId, it seems that for datastores the PerformanceManager methods only accept beginTime and endTime.

This required some fundamental changes to the Get-Stat2 script, but in the end I got it working.
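The idea behind that change can be sketched as follows. This is a minimal illustration, not the code from the script: the function name Get-IntervalWindow and its parameters are hypothetical, and the interval length and sampling period (both in seconds) are assumed to be known from the historical interval definition. The sketch turns an interval into an explicit start and finish time, so that no intervalId needs to be passed.

```powershell
# Hypothetical helper: convert a historical interval into an explicit time window.
function Get-IntervalWindow {
    param(
        [Parameter(Mandatory = $true)][int]$LengthSeconds,          # total span covered by the interval
        [Parameter(Mandatory = $true)][int]$SamplingPeriodSeconds,  # time between two samples
        [datetime]$Finish = (Get-Date)
    )
    # Go back the full interval length, minus one sampling period
    $start = $Finish.AddSeconds(-1 * ($LengthSeconds - $SamplingPeriodSeconds))
    New-Object PSObject -Property @{
        Start  = $start
        Finish = $Finish
    }
}

# Example with assumed values: a one-week interval sampled every 30 minutes
Get-IntervalWindow -LengthSeconds 604800 -SamplingPeriodSeconds 1800
```

The resulting Start and Finish values would then be used wherever the intervalId was used before.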

Instances. What instances?

As with most metrics, you have the option to use the instance property, to identify which specific statistical data you want to retrieve. According to the SDK Reference page, the Storage Capacity metrics support the following instances.

Counter                  Instance      Result
disk.capacity.latest     <empty>       Capacity, in KB, for the complete datastore
disk.provisioned.latest  <empty>       Provisioned space, in KB, for the complete datastore
                         VMid          Provisioned space, in KB, for a specific virtual machine
disk.unshared.latest     VMid          Unshared space, in KB, per virtual machine on the datastore
disk.used.latest         <empty>       Actually used space, in KB, on the datastore
                         VMid          Actually used space, in KB, for a specific virtual machine
                         “DISKFILE”    Actually used space, in KB, for the virtual disk files
                         “DELTAFILE”   Actually used space, in KB, for snapshot files
                         “SWAPFILE”    Actually used space, in KB, for the swap files
                         “OTHERFILE”   Actually used space, in KB, for all other virtual machine related files

The VMid is the value property of the MoRef of a specific virtual machine.
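To translate VMid instances back to virtual machine names, a lookup table can help. The sketch below is only an illustration: the function name Get-VMidMap is hypothetical, and it assumes objects shaped like those returned by Get-VM, with the MoRef available under ExtensionData.

```powershell
# Hypothetical helper: build a lookup table from VMid (MoRef value) to VM name.
function Get-VMidMap {
    param([Parameter(Mandatory = $true)][object[]]$VM)
    $map = @{}
    foreach ($v in $VM) {
        # The MoRef has a Type and a Value; the Value is used as the key
        $map[$v.ExtensionData.MoRef.Value] = $v.Name
    }
    $map
}

# Usage against a connected vCenter (requires Connect-VIServer first):
# $vmIds = Get-VMidMap -VM (Get-VM)
# $vmIds['vm-123']    # 'vm-123' is a made-up id, used only for illustration
```

Check the instance names returned in your own environment against the keys of this table before relying on the mapping.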

The script

Annotations

Line 49: Determine the type of entity that was passed to the function

Line 62-71: Create a hash table with all the available metrics and their metricId

Line 79: Datastore statistics are only available from Historical Interval 2 (HI2) onwards.

Line 82-103: Datastore metrics don’t seem to like the intervalId property. To avoid this problem, the script transforms the interval to a Start and Finish time, provided there were no explicit Start or Finish parameters passed.

Line 94: For the Start time, the script subtracts 1 interval duration from the length of the historical interval.

Line 106-122: These lines handle the QueryMetrics switch

Line 148-150: Test if a valid metric was passed on the Stat parameter

Line 156-166: These lines handle the QueryInstances switch

Line 167-181: Check if a valid Instance was passed

Line 187-207: Construct the PerfQuerySpec object

Line 208: The actual call of the QueryPerf method

Line 230-235: Handle the presence of the MaxSamples parameter

Sample runs

Let’s start by investigating which metrics are available for a datastore.

The result of this call is a list with the available metrics.

Next let’s see what instances are available.

As we already learned from the table above, this metric only has an empty (“”) instance.

But the disk.used.latest metric has 3 different types of instances.

Note that the VMid numbers that you see in this list do not necessarily mean that these virtual machines currently have files on the specific datastore. Remember that we are looking at historical data! The only way to know what is currently on the datastore, is to get the metrics and check if there are actual values returned for a specific VMid. If the value is -1, there were no files from that virtual machine on the datastore at that specific time.
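Filtering out those placeholder values could look like the sketch below. The function name Select-ValidSample is hypothetical, and the sample objects are made up for illustration; the only assumption about the real output is that each sample carries a Value property.

```powershell
# Hypothetical helper: keep only samples with real data (a value of -1 means "no files").
function Select-ValidSample {
    param([Parameter(ValueFromPipeline = $true)]$Sample)
    process {
        if ($Sample.Value -ne -1) { $Sample }
    }
}

# Made-up samples: the second VM had no files on the datastore at that time
$samples = @(
    New-Object PSObject -Property @{ Instance = 'vm-101'; Value = 8388608 }
    New-Object PSObject -Property @{ Instance = 'vm-102'; Value = -1 }
)
$samples | Select-ValidSample
```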

Retrieving the actual statistical data is easy now.

As we expected, this gives us the statistical data for a specific virtual machine on the datastore.

The original question, how to reproduce the specific graph that is available in the vSphere Client, becomes quite straightforward.

This produces a nice CSV file with the values from the graph.
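The reshaping step behind such a report can be tried without a vCenter connection. The sketch below follows the same grouping approach as the loop posted in the comments, but runs on made-up samples; the property names Timestamp, CounterName and Value are assumed to match the Get-Stat2 output. Since the metrics are reported in KB, dividing by 1MB gives GB.

```powershell
# Made-up samples in the shape assumed for the Get-Stat2 output (values in KB)
$ts = [datetime]'2011-01-01'
$stats = @(
    New-Object PSObject -Property @{ Timestamp = $ts; CounterName = 'disk.capacity.latest';    Value = 104857600 }
    New-Object PSObject -Property @{ Timestamp = $ts; CounterName = 'disk.provisioned.latest'; Value = 52428800 }
    New-Object PSObject -Property @{ Timestamp = $ts; CounterName = 'disk.used.latest';        Value = 26214400 }
)

# One output row per timestamp, with one column per metric, converted from KB to GB
$report = $stats | Group-Object -Property Timestamp | ForEach-Object {
    $group = $_.Group
    New-Object PSObject -Property @{
        Timestamp        = $_.Name
        'Capacity (GB)'  = [Math]::Round(($group | Where-Object { $_.CounterName -eq 'disk.capacity.latest' }).Value / 1MB, 2)
        'Allocated (GB)' = [Math]::Round(($group | Where-Object { $_.CounterName -eq 'disk.provisioned.latest' }).Value / 1MB, 2)
        'Used (GB)'      = [Math]::Round(($group | Where-Object { $_.CounterName -eq 'disk.used.latest' }).Value / 1MB, 2)
    }
}
$report
```

Piping $report to Export-Csv with -NoTypeInformation then writes the rows to a file.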

Notice how you can clearly see the addition of an 8 GB virtual disk, which we also saw in the screenshot at the top of this post.

32 Comments

    Acekeeper

    Great script LucD. I was wondering how can I get VMs datastore usage statistics. For example I have VM1 and VM2, so can I get output like that?:
    VM1 DataStore01 Capacity 200GB Free space 50GB
    VM2 DataStore02 Capacity 100GB Free space 20GB

    Thank you

    Thefluffyadmin

    Tried to run this on powercli 6.0R3 against vCenter 5.5 and unfortunately running up against this:

    Exception calling “QueryAvailablePerfMetric” with “4” argument(s): “A specified parameter was not correct.
    entity”
    At D:\scripts_robert\get-datastoreusagestats.ps1:107 char:5
    + $metrics = $perfMgr.QueryAvailablePerfMetric($Entity.MoRef,$null, …
    + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo : NotSpecified: (:) [], MethodInvocationException
    + FullyQualifiedErrorId : VimException

      LucD

      I’ll check if I can reproduce the error.

    Kids_Kol

    Hi Luc,

    We often found free space in a datastore has dramatically reduced. Is there any way we can get the VM name and the amount it contributes to this free space utilization?

    – Kids

    AdrianH

    Hi LucD
    I’m new in Script and Little bit lost.. 🙂

    Try to find a Script that gives me Cluster Performance values for a “custom Time-Schedule” (CPU MHz usage max/average and Memory consumed max/average) for all Clusters

    -Adrian

    Ravi

    For example someone forget reclaim the data store , how we can identify unused lun details through powercli

    Jesse

    I added “StoragePod” to the list of acceptable entities to be able to get Datastore Cluster stats (just used and provisioned), and it worked just fine!

    Great function.

    tequesta

    @LucD

    Hi Luc, you´re right, some datastores don´t have HI4 statistics. When I change to HI3 or HI2, all works fine.
    But, after a few runs of the script, I see that some VMs don´t show on the list.
    And after a time looking, I found that the VMs with disks bigger than 2Tb was the trouble.
    I cut the script to look like this (for diagnose purposes):

    *-*-*-*

    $metrics = “disk.used.latest”
    $report = @()
    foreach($vm in Get-VM ){
    $report += Get-Stat -entity $vm.ExtensionData -stat $metrics -interval “HI2” |
    Sort-Object -Property Timestamp |
    Group-Object -Property Timestamp | %{
    New-Object PSObject -Property @{
    “VM Name” = $vm.Name
    Timestamp = $_.Name
    “Used (GB)” = [Math]::Round(($_.Group |
    where {$_.CounterName -eq “disk.used.latest”}).Value/1GB,2)

    }
    }
    }
    $report | Export-Csv “z:\DS-stats_sata.csv” -NoTypeInformation -UseCulture

    *-*-*-*-*

    I think the problem is to handle the numbers, because a VM got a disk of 3Tb and the “disk.used.latest” is under 2Tb. If i cut the “disk.provisioned.latest” line on the original script, the VM is showed on the list.

    What do you think about this?

    I hope you can understand my idea because my english is too poor.

    Thanks in advance.

    tequesta

    tequesta

    LucD :
    @tequesta, in the code you specify Historical Interval 4 (HI4). That means data from more than 1 month ago with an interval of 1 day.
    If the Get-Stat cmdlet doesn’t find any data that corresponds, it will not return anything at all.
    Like I said in the other reply, are you sure that there is data for HI4 available for these specific datastores? You can check by looking under the Performance tab in the vSphere client.
    You could also try to get the report for a more recent interval (HI3, HI2, …).
    Let me know if that brings something?

    Thanks LucD.

    Thats the problem, I have less than 1 month of data of some datastores.

    Thank you very much!

    tequesta

    @LucD

    When I use this add to get the information of all datastores, some datastores are not reported, i got a message error saying something about null pointer, but if i run “get-datastore”, the list is complete.

    What can i do?

      LucD

      @tequesta, it could be that there are no statistics collected for some datastores. A quick check would be to look in the vSphere client under the Performance tab and verify if there is data for these datastores that seem to fail. Can you check ?

      LucD

      @tequesta, in the code you specify Historical Interval 4 (HI4). That means data from more than 1 month ago with an interval of 1 day.
      If the Get-Stat cmdlet doesn’t find any data that corresponds, it will not return anything at all.
      Like I said in the other reply, are you sure that there is data for HI4 available for these specific datastores? You can check by looking under the Performance tab in the vSphere client.
      You could also try to get the report for a more recent interval (HI3, HI2, …).
      Let me know if that brings something?

    tequesta

    Hi LucD

    I add this code on your function:

    $metrics = “disk.capacity.latest”,”disk.provisioned.latest”,”disk.used.latest”
    $report = @()
    foreach($ds in Get-Datastore){
    $report += Get-Stat -entity $ds.ExtensionData -stat $metrics -interval “HI4” |
    Sort-Object -Property Timestamp |
    Group-Object -Property Timestamp | %{
    New-Object PSObject -Property @{
    “Datastore Name” = $ds.Name
    Timestamp = $_.Name
    “Capacity (GB)” = [Math]::Round(($_.Group |
    where {$_.CounterName -eq “disk.capacity.latest”}).Value/1MB,2)
    “Allocated (GB)” = [Math]::Round(($_.Group |
    where {$_.CounterName -eq “disk.provisioned.latest”}).Value/1MB,2)
    “Used (GB)” = [Math]::Round(($_.Group |
    where {$_.CounterName -eq “disk.used.latest”}).Value/1MB,2)
    }
    }
    }
    $report | Export-Csv “z:\DS-stats.csv” -NoTypeInformation -UseCulture

    To get the historical data of all my datastores, but for some strange reason, the script got an error on some datastores:

    This is the error message:

    Export-Csv : No se puede enlazar el argumento al parámetro ‘InputObject’ porque es nulo.
    En C:\Program Files (x86)\VMware\Infrastructure\vSphere PowerCLI\Scripts\estadisticas.ps1: 280 Carácter: 21
    + $report | Export-Csv <<<< "z:\DS-stats.csv" -NoTypeInformation -UseCulture
    + CategoryInfo : InvalidData: (:) [Export-Csv], ParameterBindingValidationException
    + FullyQualifiedErrorId : ParameterArgumentValidationErrorNullNotAllowed,Microsoft.PowerShell.Commands.ExportCsvCommand

    If i try the command Get-Datastore, the list is complete.

    What do you think about this?

    Thanks in advance!

    tequesta

    (sorry, my english is a little poor)

    K

    If we want to get the datastore counters from this link, what needs to change in the script
    https://pubs.vmware.com/vsphere-50/index.jsp?topic=%2Fcom.vmware.wssdk.apiref.doc_50%2Fdatastore_counters.html
    These metrics are only available in realtime in vcenter as mentioned in this article below
    https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2031594

    Matt

    Amazing work Luc.
    Worked the first time I tried it, and your step by step examples helped immensely.

      LucD

      Thank you Matt, glad it was useful for you.

    Yeral

    @LucD
    Hello people from the past, (I know this is an old post). Is just that I ran this looped script and the vCenter service went down. :S
    I have tested the first script and works like a charm fro one single datastore, but the loop version which seems to be very improved is having my entire vCenter service down every time I run it, do you have any suggestions?

    Greg

    @LucD
    LucD,

    Do I need to create a separate .ps1 script that contains the code you mentioned in your reply to Michael that calls the original Get-Stat2 script? Sorry, I’m a little bit at a lost as to the general process and have been struggling to get this working.

    Look forward to our feedback,

    Greg

    Michael Porzio

    Can this be used or modified to give output like:

    Datastore Name Capacity GB Provisioned Space GB Free Space GB

    ???
    I have been trying to figure out how to get a CSV with those 4 pieces of information. Thanks.

    Mike P

      LucD

      Hi Michael, in the last example the datastorename is in the $ds variable. You could loop over all your datastores and then add the datastorename in the output.
      Something like this for example

      $metrics = "disk.capacity.latest","disk.provisioned.latest",
                 "disk.used.latest"
      $report = @()
      foreach($ds in Get-Datastore){
          $report += Get-Stat2 -entity $ds.ExtensionData -stat $metrics -interval "HI2" |
              Sort-Object -Property Timestamp |
              Group-Object -Property Timestamp | %{
                  New-Object PSObject -Property @{
                      "Datastore Name" = $ds.Name
                      Timestamp = $_.Name
                      "Capacity (GB)" = [Math]::Round(($_.Group |
                          where {$_.CounterName -eq "disk.capacity.latest"}).Value/1MB,2)
                      "Allocated (GB)" = [Math]::Round(($_.Group |
                          where {$_.CounterName -eq "disk.provisioned.latest"}).Value/1MB,2)
                      "Used (GB)" = [Math]::Round(($_.Group |
                          where {$_.CounterName -eq "disk.used.latest"}).Value/1MB,2)
                  }
              }
      }

      $report | Export-Csv "C:\DS-stats.csv" -NoTypeInformation -UseCulture

      Let me know if that produces what you are looking for ?

    Greg

    @LucD
    Hi LucD,

    I was wondering if you have had a chance to help create the .ps1 script that can help make a “report on all the datastores in vCenter”

    Many thanks,

    Greg

    Greg

    @LucD
    LucD,

    Yes, a report of all datastores in vCenter for the three metrics mentioned, “disk.capacity.latest”,”disk.provisioned.latest”, “disk.used.latest” is a good enough start for me. I assume it’s simply adding more metrics into variable definition for $metrics. If not, please include some “how-to” comments on adding more so we fine tune it for future use. We also have mutiple vCenter servers in different sites so if the script can include functionality to account for mutiple vCenter servers, it would be awesome.

    Thanks again for your help.

    Greg

    Greg

    @Shannon
    LucD,

    I’m on the same boat as Shannon. I’m a little bit at a lost on what do with these sniplets. Can you provide some more generic, step by step, instructions on how to run this successfully for PS dummies like myself. Or if there is a single .ps1 file that I can use, that would be awesome.

    Thanks in advance for your help.

      LucD

      @Greg & @Shannon, I’ll be more than glad to help, but first tell me what you want to achieve with the script ?
      A report on all the datastores in your vCenter ?
      And what metrics do you want in the report ? There are many datastore metrics available.

    Shannon

    This looks brilliant and exactly what I want, however I am not a scripting person and am struggling to make these snippets work as one complete script applied to our vcentre instance.

    Has someone converted this to a single script that will run against a designated vcentre server?

    Thanks!
    Shannon

    Allan

    Amazing!

    i was looking for the max queuedepth counter. but no luck.

    lucD can you enlighthen me 😛

      LucD

      Thanks Allan.
      Afaik there is no metric for queuedepth in the PerformanceManager.
      But esxtop has a counter called AQLEN that gives the queuedepth per physical adapter.
      In my Hitchhiker’s Guide to Get-EsxTop – Part 2 – The wrapper post you see this counter in row 24 of the included spreadsheet.
      With the Get-EsxTopValues function from that same post you should be able to retrieve that counter.

      I think such a practical question would be a good subject for a Part 3 post in the Hitchhikers series 😉

    Marcin Rybak

    amazed! great job!

    Steve Jin

    Hi Luc,

    Another solid post! The scripts does not leverage cmdlets much, therefore can be easily ported to Java using vSphere Java API. Please let me know if anyone wants to give a try and we would like to include the samples.

    -Steve

      LucD

      Thanks Steve. Indeed there isn’t a lot of PowerCLI in there, mostly APIs.
      Feel free to use the logic and improve it 🙂

    rob

    This get-stat2 script will come in very handy. Thanks!

    AureusStone

    Great post!

    I tried to work out how to do this after seeing the post on the communities, but ran into issues. 🙂
