RDM & vMotion: inaccessible direct-access LUN

When you try to migrate a guest that is using one or more RDM disks, you might see this message.

[Screenshot: the vMotion compatibility check reports the mapped direct-access LUN as inaccessible]

The most probable reason is that the LUN IDs are different on the source and the destination ESX server.

One solution is:

  • stop the guest
  • write down the Physical LUN ID
  • remove the RDM disk(s)
  • vMotion the guest
  • add the RDM disk(s) to the guest based on the Physical LUN ID
  • start the guest

But why do this the hard (manual) way when we have PowerCLI?

Automating the above scenario doesn’t look too difficult, but there are some pitfalls along the way.

Gathering the information

In this step I collect extensive information about the guest and the RDM disk(s) it has connected.

As a safety measure, the collected information is dumped to a CSV file. Should the script fail, this allows you to restore the RDM mappings quite easily.

The collected information also includes the controller each RDM disk is connected to and the RDM’s unit number. This is important because otherwise the guest’s OS might mix up the drive lettering.
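
In a recent PowerCLI build, that gathering step could look something like this. This is a sketch only, not the script below; the guest name and CSV path are placeholders, and you should verify the output against your own environment.

    # Sketch: collect the RDM details for one guest and dump them to a CSV file.
    # "MyGuest" and the CSV path are placeholders.
    $vmName  = "MyGuest"
    $csvPath = "C:\Temp\MyGuest-rdm.csv"

    $vm = Get-VM -Name $vmName
    $rdmInfo = Get-HardDisk -VM $vm -DiskType RawPhysical,RawVirtual | ForEach-Object {
        $disk = $_
        # The controller's bus number plus the disk's unit number let us re-attach
        # the RDM at the same SCSI position later on.
        $ctrl = $vm.ExtensionData.Config.Hardware.Device |
                Where-Object { $_.Key -eq $disk.ExtensionData.ControllerKey }
        New-Object PSObject -Property @{
            VMName        = $vm.Name
            DiskName      = $disk.Name
            CanonicalName = $disk.ScsiCanonicalName            # naa.* name of the physical LUN
            CompatMode    = $disk.ExtensionData.Backing.CompatibilityMode
            BusNumber     = $ctrl.BusNumber
            UnitNumber    = $disk.ExtensionData.UnitNumber
            Filename      = $disk.Filename                     # datastore path of the mapping file
        }
    }
    $rdmInfo | Export-Csv -Path $csvPath -NoTypeInformation -UseCulture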

Removing the RDM

For this step I use a filter called Remove-HD. This makes it easy to combine it in a pipeline with the Get-HardDisk cmdlet.

I could have used PowerCLI’s Remove-HardDisk cmdlet, but it leaves the mapping files on the datastore, which means your guest keeps consuming disk space it no longer uses.

The filter has two steps: the first removes the RDM from the guest’s configuration (ReconfigVM_Task) and the second removes the mapping file(s) from the datastore (DeleteDatastoreFile_Task).
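
The shape of such a filter is roughly as follows. This is a sketch, not the exact Remove-HD from the script below; test it before relying on it.

    # Sketch of a Remove-HD style filter: expects HardDisk objects on the pipeline.
    filter Remove-HD {
        $disk   = $_
        $vmView = $disk.Parent.ExtensionData

        # Step 1: remove the virtual device from the guest's configuration
        $spec = New-Object VMware.Vim.VirtualMachineConfigSpec
        $devChange = New-Object VMware.Vim.VirtualDeviceConfigSpec
        $devChange.Operation = "remove"
        $devChange.Device = $disk.ExtensionData
        $spec.DeviceChange = @($devChange)
        $task = $vmView.ReconfigVM_Task($spec)

        # Wait for the reconfigure task before touching the mapping file
        while ("queued","running" -contains (Get-View $task).Info.State) {
            Start-Sleep -Seconds 1
        }

        # Step 2: delete the mapping file from the datastore
        $fileMgr = Get-View (Get-View ServiceInstance).Content.FileManager
        $dcMoRef = (Get-Datacenter -VM $disk.Parent).ExtensionData.MoRef
        $fileMgr.DeleteDatastoreFile_Task($disk.Filename, $dcMoRef) | Out-Null
    }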

Connecting the RDM

This function (New-RawHardDisk) already appeared in the PowerCLI Community thread Adding an existing hard disk. I just added some logic to support both physical and virtual mode RDMs.
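
With a recent build, New-HardDisk can attach an RDM directly. A rough equivalent of what the function does, with the guest name and the canonical LUN name as placeholders:

    # Sketch: attach an existing LUN as an RDM. "MyGuest" and the naa. name are placeholders.
    $vm  = Get-VM -Name "MyGuest"
    $lun = Get-ScsiLun -VmHost $vm.VMHost -CanonicalName "naa.xxxxxxxxxxxxxxxx"

    # Use -DiskType RawVirtual instead for a virtual mode RDM
    New-HardDisk -VM $vm -DiskType RawPhysical -DeviceName $lun.ConsoleDeviceName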

Main function

This is the driving part of the script. It controls the logic and the order in which the different functions and filters are called.
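
Condensed, the flow per guest looks like this. The host names are placeholders, and the gathering and re-attach steps are the ones sketched above; this is not the full script.

    # Sketch of the driving logic: per guest with RDMs, gather, detach, vMotion, re-attach.
    $srcHost = Get-VMHost -Name "esx-source"        # placeholder
    $dstHost = Get-VMHost -Name "esx-destination"   # placeholder

    foreach ($vm in Get-VM -Location $srcHost) {
        $rdmDisks = Get-HardDisk -VM $vm -DiskType RawPhysical,RawVirtual
        if (-not $rdmDisks) { continue }

        # 1. gather and dump the RDM details to CSV (see "Gathering the information")
        # 2. detach the RDMs and delete the mapping files
        $rdmDisks | Remove-HD
        # 3. migrate the guest to the destination host
        Move-VM -VM $vm -Destination $dstHost -Confirm:$false | Out-Null
        # 4. re-attach each RDM from the saved details (see "Connecting the RDM")
    }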

The script

Annotation

Line 2: destination ESX server

Line 18-23: support for physical and virtual mode RDM

Line 118: dump the RDM info to a CSV file

Line 120-141: this is the part where the RDM disk(s) are removed, the VM is vMotioned and the RDM disk(s) are re-attached.

Line 136: because a guest can have more than one RDM, I use the Group-Object cmdlet.

20 Comments

    Robdogaz

    Great script, it works for what I need, but one question: it is removing my NIC. Can you think of anything that would cause that?

    I have 21 physical RDMs and it removes them and re-adds them, but it also removes the NIC (“Flexible”).

      LucD

      @Robdogaz, I can’t immediately see a reason why the script would remove the NIC from the VM.
      One reason I know of why this could happen is when there aren’t any free ports on the switch on the destination ESXi host.
      It seems that vMotion doesn’t check whether there are free ports on the destination switch.
      But that shouldn’t remove the NIC; the NIC will still be there, but it will be disconnected.
      Is that the case with your VMs? Are the NICs disconnected?

    Scott

    I am having an issue with this script and I am wondering if you could help. I am trying to remove a physical RDM and remap it as a virtual RDM. I am having trouble remapping. In the VI console it states “Incompatible device backing specified for device ‘0’”. Any ideas what might be going wrong? The script works if I try to remove and remap a physical RDM as a test. Thanks for any pointers.

      LucD

      Hi Scott, these functions are quite old and the PowerCLI build at that time couldn’t handle RDMs correctly.
      In the current build you should be able to do the same thing with the Remove-Harddisk and the New-Harddisk cmdlets.
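      Something along these lines (a rough sketch with placeholder VM and disk names):

        # Sketch: swap a physical mode RDM for a virtual mode RDM on the same LUN.
        $vm   = Get-VM -Name "MyGuest"                       # placeholder
        $disk = Get-HardDisk -VM $vm -Name "Hard disk 2"     # placeholder
        $lun  = Get-ScsiLun -VmHost $vm.VMHost -CanonicalName $disk.ScsiCanonicalName

        # -DeletePermanently also removes the mapping file from the datastore
        Remove-HardDisk -HardDisk $disk -DeletePermanently -Confirm:$false
        New-HardDisk -VM $vm -DiskType RawVirtual -DeviceName $lun.ConsoleDeviceName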

    Josh Feierman

    Thank you, thank you, thank you for this script. I recently had to move a half dozen VMs with over three dozen RDMs in total. With a little bit of tweaking it saved me many hours of mindless GUI clicking, not to mention a very sizable risk of fat-fingering something. Nice work!

    Michael

    Hello Luc,

    Thank you for the great script. I’m trying to use it in two steps (two separate scripts) for our SRDF failover. The first step (one script) creates the RDM report and the second step imports the CSV report and uses that data to attach the RDM. The disk order is important as we have VMs with many RDMs.

    The first part works great (export the report, then remove the RDM and backing file), but I’m having difficulty importing the CSV and connecting the RDM back to its VM.

    It goes through importing the CSV and I can see the variables populate with the data, but at the end I get errors about null arguments and methods.

    Can you help please?

      LucD

      @Michael, can you send me the scripts you are using, so I can have a look.
      Send it to lucd (at) lucd (dot) info

    Gert Van Gorp

    Hi Luc,

    Hope you had a nice trip back from SF.

    As I told you in SF, I am writing a PS script with a GUI to move/clone VMs with RDMs connected. Thanks to your code I have come a long way.

    The only thing I want to do with the above script is to clone VM-A (with VMDK & RDM) to another location. The RDM is removed and the clone to VM-B is done, but the script does not wait until the end of the clone operation to re-add the RDM to VM-A. Is there a way to have the New-VM cmdlet wait until the clone is done before moving on to the next step in the script?

      LucD

      Thanks Gert, I did.
      I’m afraid you have to look for another way to wait till the new VM-B is ready.
      The New-VM cmdlet creates the new guest, creates the sysprep input files, copies the machine over and powers on VM-B. At that point the task is finished for New-VM.
      On VM-B, in the meantime, the sysprep process is running, which will require at least one reboot of VM-B.
      If you can find a way for VM-B to inform your script that the sysprep process is finished, you have a solution.
      In a similar situation, I created a token file on a share at the end of the sysprep process and let my script run in a loop till the token file was present. But this requires network connectivity and membership.
      Like I said at the beginning, this is not a New-VM problem but a sysprep problem. How do you find out, externally, that the sysprep process is finished?
      Sorry I don’t have a clear solution for this.

    John House

    updated my email address.

    John House

    Hi Luc – thanks for doing that, it does make it a lot clearer – although I still can’t seem to hack it for what I need.

    Essentially, we are cloning VMs using the storage array and I’ve “written” (pinched) scripts to do that for normal VMs (without RDMs), but we have some clusters that use RDMs and they do not clone well. So I need to 1) remove the RDMs from the cloned VMs (because these are invalid), and 2) re-add the cloned RDM LUNs (which have different LUN IDs to the originals).

    Long shot, but do you know of any existing scripts (and I initially thought I could use this one) so I can pass it some parameters (“VM Name”, “SCSI Controller ID”, “LUN ID”, etc.) and it will add the RDM in physical compatibility mode to the VM?

    Thanks again – you’re a king in the community!

    LucD

    John, there are no stupid questions, just stupid answers 😉

    The script runs against all the VMs on a specific ESX host (which you specify in line 2).

    I have added a few comment lines to the script and indented the lines where appropriate.
    Let me know if that helps.
    Luc.

      Amit

      Hi Luc,

      Do you have a script for VM-level migration?

      We have to migrate a lot of VMs with physical RDMs from one data center
      to another data center with new compute and storage (HDS),
      and I want to be able to run the script per VM and not at the ESX level.

      Thanks,
      Amit

    John House

    Hi – stupid question, but how do you actually run this? I can’t see anywhere where the VM is specified. Pasting the above code into PS doesn’t do anything.

    Thanks

    justasimpledude

    So much typing!!
    Why not copy and paste your script 😉 It must have taken ages to type. Instead, why not do the following… much less effort!

    * stop the guest
    * write down the Physical LUN ID
    * remove the RDM disk(s)
    * vMotion the guest
    * add the RDM disk(s) to the guest based on the Physical LUN ID
    * start the guest

      LucD

      Thanks for the tip.
      While I agree this would surely be faster than me typing the script, the script has its advantage when you have to migrate substantial numbers of guests with RDMs.
      And you avoid making “human errors” 😉
      Automation proves its worth when the task becomes repetitive and when you have to tackle bigger numbers.

    shatztal

    Come on man, couldn’t you have posted this 3 weeks ago?
    I wanted to write a script, but because I didn’t have the time, it didn’t happen.

    Nice one.
    BTW: The problem also happens if the LUN ID is the same on both hosts.

      LucD

      Sorry that my timing was not optimal 😉

      Thanks for the info on identical LUN ids, didn’t know that one.

    nate

    Even if the LUN ID is the same, under certain circumstances the error can still pop up – https://www.techopsguys.com/2009/08/18/its-not-a-bug-its-a-feature/

    It worked fine in ESX 3.0, 3.0.2 and 3.5 (when the LUN ID is the same). I call it a bug, they call it a feature.

      LucD

      Thanks for the feedback.
      At least you can now automate the tedious process of re-mapping all the RDM volumes 😉
