Bug Catcher: SRA/SRM testFailoverStart

I decided to write this post because I spent quite a bit of time troubleshooting this problem, only to find out that it was a bug. Grrrr. Hopefully this will save someone else some time and hair tearing.

I was recently implementing SRM 6.1 on a NetApp cluster running clustered Data ONTAP 8.3.1. Configuration went flawlessly, and I was happy that I might complete a project early on a Friday. We decided to run a failover test to the DR site, and this is where the issue came about.

The job would fail almost immediately after starting, with the error “Storage ports not found”. Even after double-checking all the settings, I had no luck finding any resources describing this issue or a resolution. I checked the SRA ontap_config file to make sure the IPv4 option (isipv4) was set to match the IP format of the NFS configuration within SRM. I also checked that the firewall on the NetApp was set properly to allow communication. Everything looked correct.

I learned later on that SRA 2.1 cannot detect NetApp interfaces that are set to mgmt. With NetApp LIFs, you have the option to “bundle” data and mgmt access on the same LIF (NFS, CIFS). If this is the case, those interfaces will have the mgmt firewall-policy rather than the data firewall-policy, which is OK unless you are trying to use SRM/SRA in that setup.

Resolution:

  • Create a separate, dedicated mgmt LIF for each NFS SVM
    • Otherwise you are removing all mgmt interfaces for that SVM without a replacement
  • Remove the mgmt option for the NFS data LIFs
  • Change the firewall-policy for the NFS data LIFs from mgmt to data
    • You can use this command to do so:

network interface modify -vserver [vserver_name] -lif [data_lif_name] -firewall-policy data
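If an SVM has several NFS data LIFs, it can be handy to generate the command above once per LIF and paste the results into the cluster shell. Here is a minimal Python sketch of that idea. The function name, the SVM name, and the LIF names are all made up for illustration; only the command string itself comes from this post.

```python
def firewall_policy_commands(vserver, data_lifs, policy="data"):
    """Build one 'network interface modify' command per data LIF.

    This only formats strings; it does not connect to the cluster.
    The vserver and LIF names passed in are whatever your environment uses.
    """
    commands = []
    for lif in data_lifs:
        commands.append(
            f"network interface modify -vserver {vserver} "
            f"-lif {lif} -firewall-policy {policy}"
        )
    return commands


if __name__ == "__main__":
    # Hypothetical SVM and LIF names, replace with your own.
    for cmd in firewall_policy_commands("svm_nfs01", ["nfs_data_lif1", "nfs_data_lif2"]):
        print(cmd)
```

Review the generated lines before running them, and make sure the dedicated mgmt LIF already exists for that SVM first.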

Also make sure to check the ontap_config file for SRA. It is located under C:\Program Files\VMware\VMware vCenter Site Recovery Manager\storage\sra\CMODE_ONTAP. If the IP addresses for the data LIFs are IPv4, the isipv4 option needs to be set to YES.
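As a quick sanity check before touching the config file, you can confirm that every data LIF address really is IPv4. The sketch below uses Python's standard ipaddress module. The function name and the sample addresses are hypothetical, and it does not read or write ontap_config itself, since I am not assuming anything about that file's exact layout here.

```python
import ipaddress


def all_ipv4(addresses):
    """Return True only if every address in the list parses as IPv4."""
    if not addresses:
        raise ValueError("no LIF addresses supplied")
    return all(ipaddress.ip_address(addr).version == 4 for addr in addresses)


if __name__ == "__main__":
    # Hypothetical data LIF addresses, replace with your own.
    lif_addresses = ["192.168.10.21", "192.168.10.22"]
    if all_ipv4(lif_addresses):
        print("All data LIFs are IPv4: isipv4 should be set to YES")
    else:
        print("At least one data LIF is not IPv4: review the isipv4 setting")
```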

Bug should be fixed with SRA 3…. coming soon to a datacenter near you.

One thought on “Bug Catcher: SRA/SRM testFailoverStart”

  1. Great find! fyi, the exact OPPOSITE is required for Commvault file archiving on cDOT, where you NEED mgmt on the data lifs enabled. So hopefully you don’t need both solutions at the same time on the same clusters!
