vSAN on HPE Synergy Composable Infrastructure – Part 2

Firstly, apologies for the delay in getting the follow-up to this series posted. I am getting these together now to post in quicker succession, and hopefully they will all be published around VMworld US!

So, this blog post is going to dive into the configuration and components of our vSAN setup using HPE Synergy, based on the overview in Part 1 where I mentioned this would be supporting VDI workloads.

Firstly, a little terminology to help you follow along with the rest of the blog articles:
Frame – Enclosure chassis which can hold up to 12 compute modules, and 8 interconnect modules.
Interconnect – Links the blade connectivity out to the datacentre, for example Fibre Channel, Ethernet, and SAS.
Compute Module – Blade server containing CPU, Memory, PCI Expansion cards, and optionally Disk.
Storage Module – Storage chassis within above frame which can hold up to 40 SFF drives (SAS / SATA / SSD). Each storage module occupies 2 compute slots.
Stack – Between 1 and 3 Synergy frames combined to form a single logical enclosure. Allows sharing of Ethernet uplinks to the datacentre with Virtual Connect.
Virtual Connect – HPE technology that allows multiple compute nodes to share a smaller set of uplinks to the datacentre networking infrastructure. Acts similarly to a switch internal to the Synergy stack.

All of the vSAN nodes are contained within a single Synergy frame or chassis. The main reason behind this is that, today, HPE do not support SAS connectivity across multiple frames within a stack, so the compute nodes accessing storage must be in the same frame as the storage module. You can mix the density of storage vs. compute within Synergy however you like; using a single storage module leaves 10 bays for compute.

Our vSAN configuration is set out as follows:
1 x 12000 Synergy Frame with:

2 x SAS Interconnect Modules
2 x Synergy 20Gb Interconnect Link Modules
1 x D3940 Storage module with:

10 x 800GB SAS SSDs for Cache
30 x 3.84TB SATA SSDs for Capacity
2 x I/O Modules

10 x SY480 Gen10 compute modules with:

768GB RAM (24 x 32GB DDR4 Sticks)
2 x Intel Xeon Gold 6140 CPUs (2.3GHz, 18 cores)
2 x 300GB SSDs (for ESXi Installation)
Synergy 3820C CNA / Network Adapter
P416ie-m SmartArray Controller

The above frame is actually housed within a logical enclosure, or stack, containing 3 frames. This means the entire stack shares 2 redundant Virtual Connect interconnects out to the physical switching infrastructure – but in our configuration these sit in a different frame to the one containing the vSAN nodes. The stack is interconnected with 20Gb Interconnect Link Modules to a pair of master Virtual Connects. For our environment, we have 4 x 40GbE uplinks to the physical switching infrastructure per stack (2 per redundant interconnect).

We keep our datacentre networking relatively simple, so all VLANs are trunked through the Virtual Connect switches directly to ESXi. We decided not to have any separation of networking or internal networks configured within Virtual Connect. Therefore, vSAN replication traffic and vMotion traffic traverse out to the physical switching infrastructure and hairpin back in; however, this is of little concern given the bandwidth available to the stack.
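For context on the ESXi side of that networking, here is a minimal, hypothetical PowerCLI sketch of tagging VMkernel adapters for vSAN and vMotion traffic on a standard switch. The host name, switch name, port group names and IP addressing are placeholders rather than our actual configuration:

# Hypothetical example: create VMkernel adapters tagged for vSAN and vMotion traffic.
# Names and addressing below are illustrative only.
$vmhost = Get-VMHost -Name 'syn-esx01.example.local'
$vss    = Get-VirtualSwitch -VMHost $vmhost -Name 'vSwitch0' -Standard

# vSAN traffic VMkernel adapter
New-VMHostNetworkAdapter -VMHost $vmhost -VirtualSwitch $vss -PortGroup 'vSAN' `
    -IP '192.168.10.11' -SubnetMask '255.255.255.0' -VsanTrafficEnabled $true

# vMotion traffic VMkernel adapter
New-VMHostNetworkAdapter -VMHost $vmhost -VirtualSwitch $vss -PortGroup 'vMotion' `
    -IP '192.168.20.11' -SubnetMask '255.255.255.0' -VMotionEnabled $true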

That’s all for an overview of the hardware. But do let me know if there is any other detail you would like to see surrounding this topic! The next post will detail how a blade is ‘cabled’ to use the storage and network in a profile.

vSAN on HPE Synergy Composable Infrastructure – Part 1

It’s been a while since I have posted; I have been pulled in many different directions by work priorities, so blogging was temporarily side-lined! I am now back and going to blog about a project I am currently working on to build out an extended VDI vSAN environment.

Our existing vSAN environment is running on DL380 Gen9 rackmounts, which, whilst we had some initial teething issues, have been particularly solid and reliable of late!

We are almost at the point of exhausting our CPU and memory resources for this environment, and we are at about 60% utilization on the vSAN datastores across the 3 clusters. With that, it felt a natural fit to expand our vSAN environment as we continue the migration to Windows 10 and manage the explosive growth of the environment – growth aided by recently implementing Microsoft Azure MFA authentication rather than 2-factor over a VPN connection.

As an organization, we are about to refresh a number of HP Gen8 blades in our datacentre, and with Gen10 likely to be the last generation to support the C7000 chassis, we thought it would be a good time to look at other solutions. This is where HPE Synergy composable infrastructure came in! After an initial purchase of 4 frames, and a business requirement causing us to expand this further, we felt that expanding vSAN could be a good fit on Synergy with the D3940 storage module.

Now we have the hardware in the datacentre and finally racked up, I am going to be going through a series of blogs on how vSAN looks in HPE Synergy composable infrastructure, our configuration, and some of the Synergy features / automation capabilities which make this easier to implement vs the traditional DL380 Gen9 rackmount hardware we have in place today. Stay tuned or follow my twitter handle for notifications for more on this series.

New vSAN cmdlet: Get-VSANObjectHealth

Been a while since I posted, so about time I caught up with some things I have been working on!

I will be posting a series of PowerCLI / PowerShell cmdlets to assist with vSAN Day-2 operations. There is plenty out there for creating new clusters, but not much around to help troubleshoot / support production environments. Some of these have come from the VMworld Europe Hackathon I recently attended, which was a blast! I’ll do my best to credit others who helped contribute to the creation of these cmdlets too!

The first one I created, Get-VSANObjectHealth, helps obtain the state of the objects in a vSAN cluster. This is a great way to validate how things are in the environment, as it gives you the different statuses of objects, particularly if they are degraded or rebuilding. The idea of this cmdlet was to integrate it into our host remediation script so I could verify that all the objects are healthy prior to moving on to the next host. I trust this verification in PowerCLI more than I trust VUM attempting to put a host into maintenance mode and then getting stuck.

The code is below and also posted on GitHub Here

Function Get-VSANObjectHealth {
    <#
        .SYNOPSIS
        Obtain object health for vSAN Cluster
        .DESCRIPTION
        This function performs an object health report for a vSAN Cluster
        .PARAMETER VsanCluster
        Specifies a vSAN Cluster object, returned by Get-Cluster cmdlet.
        .EXAMPLE
        PS C:\> Get-Cluster | Get-VSANObjectHealth
    #>
  
    [CmdletBinding()]
    Param (
      [Parameter(Mandatory, ValueFromPipeline)]
      [Alias("Cluster")]
      [VMware.VimAutomation.ViCore.Types.V1.Inventory.Cluster]$VsanCluster
      ,
      [Parameter(Mandatory = $false)]
      [switch]$ShowObjectUUIDs
      ,
      [Parameter(Mandatory = $false)]
      [switch]$UseCachedInfo
      ,
      [Parameter(Mandatory = $false)]
      [switch]$HealthyOnly    
    )
  
    Begin {
      # vSAN cluster health system view - used for both the version lookup and the health query
      $vchs = Get-VsanView -Id "VsanVcClusterHealthSystem-vsan-cluster-health-system"
    }
  
    Process {
      # Query the health system version for the cluster passed down the pipeline
      $VsanVersion = $vchs.VsanVcClusterQueryVerifyHealthSystemVersions($VsanCluster.Id) | Select-Object VcVersion

      if($VsanCluster.VsanEnabled -and $VsanVersion.VcVersion -lt '6.6') {
        $result = $vchs.VsanQueryVcClusterHealthSummary($VsanCluster.Id, $null, $null, $ShowObjectUUIDs, $null, $UseCachedInfo)
        if($result) {
          if($HealthyOnly) {
            # Any category that is not 'healthy' and contains objects means the cluster is not clean
            $Health = $result.ObjectHealth.ObjectHealthDetail | Where-Object {$_.Health -notlike 'healthy' -and $_.NumObjects -gt 0}
            if($Health) {
              return $false
            } else {
              return $true
            }
          } else {
            return $result.ObjectHealth.ObjectHealthDetail
          }
        }
      }
      elseif ($VsanCluster.VsanEnabled -and $VsanVersion.VcVersion -ge '6.6') {
          Write-Warning -Message 'vSAN cluster is currently at a version newer than this call allows. Please open an issue: https://github.com/mitsumaui/PowerTheVSAN/issues'
      }
    }
  }

How to use:

Get-Cluster VSAN-Cluster-1 | Get-VSANObjectHealth

Get-Cluster VSAN-Cluster-1 | Get-VSANObjectHealth -HealthyOnly

Get-Cluster VSAN-Cluster-1 | Get-VSANObjectHealth -ShowObjectUUIDs

Details on the Switches:

– HealthyOnly
Will only return True if all objects are healthy, else will return False

– ShowObjectUUIDs
Will extend the query to include an array of ObjectUUIDs for each health category. Good if you need to investigate specific objects

– UseCachedInfo
Will use the cached vSAN Health data from the vCenter server rather than forcing an update from the cluster. Great if you need a quick check, but not recommended if you need a current picture of object health (such as just after exiting from Maintenance Mode)
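
As an example of the Day-2 use case mentioned above, here is a rough sketch of how -HealthyOnly could gate a host remediation loop. The cluster name, timeout and sleep interval are placeholders, and the actual patching step is left out:

# Hypothetical remediation loop: only move on to the next host once all vSAN objects report healthy.
$Cluster = Get-Cluster -Name 'VSAN-Cluster-1'

foreach ($VMHost in ($Cluster | Get-VMHost)) {
    # ... remediate / reboot $VMHost here ...

    # Wait (up to 20 minutes) for vSAN object health to return to healthy before continuing
    $Timeout = (Get-Date).AddMinutes(20)
    while (-not ($Cluster | Get-VSANObjectHealth -HealthyOnly)) {
        if ((Get-Date) -gt $Timeout) {
            throw "vSAN objects still unhealthy after remediating $($VMHost.Name)"
        }
        Start-Sleep -Seconds 60
    }
}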

Enjoy, and please do let me know (either via comments here or GitHub) if there are any other enhancements you would like to see! More cmdlets to come soon.

Get vSAN Invalid State Metadata using PowerShell

We have had some ‘fun’ with our All-Flash vSAN clusters recently, after updating to 6.0 U3 and then VMware certifying new HBA firmware / drivers for our HPE DL380 servers. Every time I have updated / rebooted, we end up with invalid metadata:

We’ve had a couple of tickets open with VMware on this issue now, and a fix is still outstanding. It was scheduled for 6.0 U3 but failed a regression test. So for now, when we patch / reboot, we have to go and fix these!

The vSAN health page shown above only shows the affected component. In a Hybrid environment, you can remove the capacity disk from the diskgroup and re-add it to resolve this, but for All-Flash you need to remove the entire diskgroup. Our servers have 2 diskgroups per host, so we need to identify which diskgroup needs destroying.

To discover the diskgroup, you have to identify which capacity disk the affected component resides on. There is a VMware KB article for this, but it never worked on our vSAN nodes, so VMware support provided us with a different set of commands to obtain this information. The VMware KB is HERE.

I’ve ended up doing this several times now, and decided to pull it into a PowerShell function to make life easier. It will return an object showing the host, disk UUID, disk name & affected components:

The script does require Posh-SSH, as it connects to the host over SSH to obtain the information. You can download this over at the PowerShell Gallery.
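If you don’t already have Posh-SSH, it can be installed straight from the Gallery (assuming PowerShell 5.0 / PowerShellGet is available):

Install-Module -Name Posh-SSH -Scope CurrentUser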

Here’s the code I put together:

# Requires the Posh-SSH module (New-SSHSession / Invoke-SSHCommand)
function Get-vSANBadMetadataDisks {
  [cmdletbinding()]
  param(
    $vmhostname,
    $rootcred
  )

  if (!$vmhostname) {
    $vmhostname = Read-Host "Enter FQDN of esxi host"
  }
  if (!$rootcred) {
    $rootcred = Get-Credential -UserName 'root' -Message 'Enter root password'
  }

  Write-Host "[$vmhostname : Gathering Metadata information]" -ForegroundColor Cyan
  $session = New-SSHSession -ComputerName $vmhostname -Credential $rootcred -AcceptKey
  if (!$session) {
    Write-Warning "Error connecting to $vmhostname - exiting"
    return
  }

  # Enumerate the vSAN disk UUIDs known to LSOM on this host (trailing slash stripped)
  $Disks = (Invoke-SSHCommand -SSHSession $session -Command 'vsish -e ls /vmkModules/lsom/disks/ | sed "s/.$//"').Output

  $Report = @()
  foreach ($Disk in $Disks) {
    # Map the vSAN disk UUID back to its NAA identifier, then to the ESXi display name
    $NAA = (Invoke-SSHCommand -SSHSession $session -Command "localcli vsan storage list | grep -B 2 $Disk | grep Displ").Output.split(':')[1].trim()
    $Display = (Invoke-SSHCommand -SSHSession $session -Command "esxcli storage core device list | grep -A 1 $NAA | grep Displ").Output.split(':')[1].trim()

    # List any recovered (invalid metadata) components sitting on this disk
    $Components = (Invoke-SSHCommand -SSHSession $session -Command "vsish -e ls /vmkModules/lsom/disks/$Disk/recoveredComponents/ 2>/dev/null | grep -v 626 | sed 's/.$//'").Output

    $DiskGroupMetadata = '' | Select-Object VMHost, Disk_UUID, Disk_Name, Components
    $DiskGroupMetadata.VMHost = $vmhostname.split('.')[0]
    $DiskGroupMetadata.Disk_UUID = $Disk
    $DiskGroupMetadata.Disk_Name = $Display
    $DiskGroupMetadata.Components = $Components

    $Report += $DiskGroupMetadata
  }

  # Tidy up the SSH session before returning the report
  $null = Remove-SSHSession -SSHSession $session

  return $Report | Sort-Object Disk_Name
}

To use this, pass in the VMHostName and root credential (using Get-Credential), or just call the function and it will prompt for those and run. Of course, for this to work you will need SSH access enabled on your hosts, and as far as I know it will not work in lockdown mode.
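For example (the host name below is a placeholder):

# Gather the credential once and run against a specific host
$rootcred = Get-Credential -UserName 'root' -Message 'Enter root password'
Get-vSANBadMetadataDisks -vmhostname 'esx-vsan01.example.local' -rootcred $rootcred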

With the output you can see which capacity disks / diskgroups are affected, so you know which to delete and re-create. As you can see from the first screenshot, I’ve had quite a few of these to do today! 🙁

vSAN is pretty darn good when it’s running, but we do have these challenges whenever maintenance is required, which is a little more involved than it seems. That said, I think some of our challenges stem from using All-Flash; our small Hybrid test cluster, on the other hand, has always been solid!

Happy Hyper-converging!

iovDisableIR change in ESXi 6.0 U2 P04 causing PSOD on HPE Servers – Updated

So we have had an ‘interesting’ issue at work over the past few weeks!

We have had Gen8 / Gen9 blades in our environment randomly crashing over the last month. We had originally been sent down what seems to have been an incorrect path; however, it seems we are on the right track now!

Symptoms

HP BL460c Gen8 and Gen9 blades with v2 / v3 processors would randomly crash. There was no specific trigger, but it seemed to be more prevalent during higher I/O periods, such as when backups were running. Initially, the PSODs looked like this:

After logging a call with VMware, we were led down a path suggesting the mlx4_core error in the above screenshot was causing the issue. After further investigation, it turned out that after upgrading from vSphere 5.5 to vSphere 6.0 (using VUM) there were mlx4 drivers left behind – which is what was causing the ‘jumpstart dependency error’. Once we removed the bad 5.5 VIBs all was well.

The root cause of why the 5.5 drivers remained after the 6.0 upgrade is that, whilst the driver was present in the HP utilities bundle for 6.0, the driver version was not revised, so VUM just ignored it! We have run into this before, and fed back to HP that even if the driver is the ‘same’, the version should be revised (specifically incremented) to ensure the driver gets updated. This is not an issue if you do a fresh install of vSphere 6.0.

So – we fixed this across the environment (along with some other VIBs – more on this later) and hoped this would be the end of it. Two days later, we got further PSODs, but this time without the mlx4 dependency error!

Back to square one. We uploaded fresh logs to VMware, and this time opened a case with HP as well, as the crashes technically report LINT1 / NMI hardware errors. Three days later, we finally got some solid information back – a very interesting discovery!

HP pointed us to this customer advisory – one I have seen before, a long time back. This was strange to me, as we have never seen this issue before and it’s not a setting we change as standard. There was also a specific error called out in the advisory, which we asked HP to confirm was present in our logs:

ALERT: APIC: 1823: APICID 0x00000000 – ESR = 0x40.

Anyhow – the most crucial piece of information the HP L2 tech gave us is that ESXi 6.0 patch ESXi600-201611401-BG changed the setting referenced in the HP customer advisory from a default of false to true.

After running a script in PowerCLI, it appears that is certainly the case: all the hosts we had running ESXi 6.0 Build 4600944 had the iovDisableIR setting set to TRUE (so interrupt remapping is disabled). According to HP, this is what is causing the PSODs.

Digging a little further, iovDisableIR is a parameter that controls IRQ interrupt remapping, a feature developed by Intel to improve performance. According to VMware, this feature originally had its issues, particularly with certain Intel chipsets, so they recommend disabling it in certain circumstances. However, HP do support it, and in fact per their advisory recommend it is enabled to prevent PSODs. The interesting piece is where the VMware KB (1030265) linked from the HP customer advisory states that the error may occur with QLogic HBAs. This is the HBA we use for our FC storage in the environment, which also explains why we have not seen PSODs in our rackmounts (where they use direct or SAS attached storage). On our Gen9 hardware, though, we have not seen PSODs; instead the QLogic HBA fails, or the host just reboots, so I believe these are related to the same setting.

So – to resolve this, we need to run the following across all our hosts running ESXi 6.0 Build 4600944 (I have also had word that the same applies to the ESXi 6.5 release):

esxcli system settings kernel set --setting=iovDisableIR -v FALSE

This requires a reboot to take effect. To determine if the host is affected, you can use the following PowerCLI script to gather a report of the current setting.

# Report the configured and runtime value of iovDisableIR for every host in the connected vCenter
$Hosts = Get-VMHost
$Report = $Hosts | ForEach-Object {
    $iovdata = ($_ | Get-EsxCli).system.settings.kernel.list() | Where-Object {$_.Name -like 'iovDisableIR'}
    $_ | Select-Object Name, Parent, Model, ProcessorType, Version, Build,
        @{n='iovDisableIR_Conf';e={$iovdata.Configured}},
        @{n='iovDisableIR_Runtime';e={$iovdata.Runtime}}
}
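
If you would rather flip the setting from PowerCLI than SSH to each affected host, something along these lines should work. This is a sketch using the Get-EsxCli -V2 interface (the argument names follow the esxcli long options), and each host still needs a reboot for the change to apply:

# Set iovDisableIR back to FALSE on any host currently configured with TRUE
foreach ($VMHost in Get-VMHost) {
    $esxcli  = Get-EsxCli -VMHost $VMHost -V2
    $setting = $esxcli.system.settings.kernel.list.Invoke() | Where-Object { $_.Name -eq 'iovDisableIR' }
    if ($setting.Configured -eq 'TRUE') {
        $esxcli.system.settings.kernel.set.Invoke(@{ setting = 'iovDisableIR'; value = 'FALSE' })
        Write-Host "$($VMHost.Name): iovDisableIR set to FALSE - reboot required"
    }
}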

We are awaiting further information from HP/VMware, who are now collaborating on our cases to determine the root cause, why this was changed, and what it is attributed to. In the meantime, we have rolled this setting out across our blade environment and will continue to monitor. I will update this post when we know more!

*** Update 15th Feb ***

VMware have now released a KB article on this issue.
VMware KB (2149043)

Word of Warning:

We did some digging on this setting and found that iovDisableIR is set to TRUE (interrupt remapping disabled) in the initial ESXi 6.5 release. This does not appear to be unique to the HP Custom ISOs.