Subnet Manager (SM) High Availability (HA) on Mellanox InfiniBand Switches

NADDOD Claire Optical Module Engineer Dec 29, 2023

This post explains the InfiniBand SM High Availability (HA) synchronization functionality on Mellanox InfiniBand switches.

 

 

High Availability in InfiniBand

In InfiniBand, only one SM manages a subnet at any given time. However, multiple SMs can be enabled on the same subnet. In that case, one of the SMs is elected as the master SM and the rest remain operationally disabled (standby). If the master SM fails for any reason, one of the standby SMs is elected to manage the network.

 

Could there be an issue with this?

 

The SM configuration files may not be in sync. For example, assume that SM is enabled on two IB nodes (A and B), and that the user configures an SM parameter on node A but not on node B. If the SM running on node A fails, the SM elected on node B will not have that configuration, so the network may not operate as before.
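
The drift described above can be made concrete with a minimal sketch. It assumes each node's SM settings are available as a simple key/value mapping; the parameter names and values are illustrative only and are not read from a real switch.

```python
def config_drift(config_a, config_b):
    """Return {parameter: (value_on_a, value_on_b)} for every setting that differs.

    A parameter that is absent on one node is reported with None on that side.
    """
    drift = {}
    for key in sorted(set(config_a) | set(config_b)):
        a = config_a.get(key)
        b = config_b.get(key)
        if a != b:
            drift[key] = (a, b)
    return drift


# Hypothetical settings: node A was given an extra parameter, node B was not.
node_a = {"sm_priority": 15, "routing_engine": "ftree"}
node_b = {"sm_priority": 15}

print(config_drift(node_a, node_b))
# {'routing_engine': ('ftree', None)}
```

Any non-empty result means that a failover from A to B would change how the subnet is managed.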

 

Mellanox SM HA Solution (Mellanox InfiniBand Switches)

  • When enabling SM HA (configuration synchronization) on Mellanox IB switches, the SM database is synchronized with all the switches enabled with SM.

 

  • The synchronization is done out-of-band over an Ethernet management network. All switches participating in the SM HA should be connected to the same management subnet (same network), with no router between them, because the switches send multicast control frames that normally do not cross routers.

 

  • All the switches that participate in the Mellanox SM HA are joined to the InfiniBand subnet ID. Once joined, the synchronized SMs are launched. One of the nodes is elected as SM Master and the others are Slaves.

 

  • The SM HA allows the system manager to enter and modify the entire InfiniBand SM configuration of the different subnet managers from a single location, using a Virtual IP (VIP). All subnet managers can be controlled, started, or stopped from this VIP address. The user is expected to use the VIP address for SM configuration; configuring SM parameters through the individual IP address of a master or slave node is disabled.

 

Setup

  • InfiniBand network with several switches (at least two). The SM HA will be enabled on the switches. To test the feature, a minimum setup of two switches connected together suffices.

 

  • All switches participating in the SM HA should have the same CPU type (either all x86 or all PPC).

 

  • All switches should have the same MLNX-OS version.

 

  • All switches participating in the SM HA should be connected to the same management subnet (same network), with no router between them.
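
The requirements above can be checked before any switch is configured. The following is a minimal sketch, assuming the inventory is kept by hand as a list of dictionaries; the field names, the CPU type, and the `version-A` string are placeholders and are not values reported by the switches in this post.

```python
import ipaddress


def preflight(switches, mgmt_network):
    """Return a list of problems that violate the SM HA setup requirements."""
    problems = []
    net = ipaddress.ip_network(mgmt_network)
    if len(switches) < 2:
        problems.append("need at least two switches")
    if len({s["cpu"] for s in switches}) > 1:
        problems.append("mixed CPU types")
    if len({s["os_version"] for s in switches}) > 1:
        problems.append("mixed MLNX-OS versions")
    for s in switches:
        if ipaddress.ip_address(s["mgmt_ip"]) not in net:
            problems.append(f"{s['name']} is outside {net}")
    return problems


# Placeholder inventory; only the names and management IPs come from this post.
switches = [
    {"name": "sx21", "cpu": "ppc", "os_version": "version-A", "mgmt_ip": "10.20.2.21"},
    {"name": "sx22", "cpu": "ppc", "os_version": "version-A", "mgmt_ip": "10.20.2.22"},
]

print(preflight(switches, "10.20.0.0/16"))
# []
```

Sharing an IP network is a necessary check only. It does not prove that the switches sit on the same Layer 2 segment, which is what the multicast control frames actually require.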

 

For this post, two Mellanox SX6036 FDR switches (36 ports, 56Gb/s each), named sx21 and sx22, are used. They are connected to each other on ports 1/1 and 1/2.

 

Planning

The plan is to enable SM HA on both switches.

 

The SM HA needs a Virtual IP (VIP) address allocated from the management network.

 

Switch                  IP
sx21                    10.20.2.21
sx22                    10.20.2.22
SM HA cluster (VIP)     10.20.2.160/16

 

Configuration

  1. On the first switch (sx21), create the SM HA cluster by specifying the cluster name and the planned VIP.

 

sx21 [standalone: master] (config) # ib ha my-sm-cluster ip 10.20.2.160 /16

sx21 [my-sm-cluster: master] (config) #

 

  2. Add the second switch (sx22) to the cluster. Only the cluster name (the same name) needs to be specified.

 

sx22 [standalone: master] (config) # ib ha my-sm-cluster

sx22 [my-sm-cluster: standby] (config) #

 

  3. Enable SM on both switches (applicable only from the master).

 

sx21 [my-sm-cluster: master] (config) # ib smnode sx21 enable

sx21 [my-sm-cluster: master] (config) # ib smnode sx22 enable

 

  4. (Optional) Specify the SM priority (range: 0-15; a higher number means a higher priority) to control the order in which SMs are elected (applicable only from the master).

 

sx21 [my-sm-cluster: master] (config) # ib smnode sx21 sm-priority 1

sx21 [my-sm-cluster: master] (config) # ib smnode sx22 sm-priority 2
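
To illustrate how the priority influences the election, here is a simplified model and not the switch's actual implementation. It assumes that the enabled SM with the highest priority wins and that ties go to the lowest port GUID, which is the tie-break defined by the InfiniBand specification. The GUID values are made up for the example.

```python
def elect_master(sm_nodes):
    """Pick the enabled SM with the highest priority; ties go to the lowest GUID."""
    candidates = [n for n in sm_nodes if n["enabled"]]
    if not candidates:
        return None
    return min(candidates, key=lambda n: (-n["priority"], n["guid"]))


# Priorities match step 4 above; the GUIDs are hypothetical.
nodes = [
    {"name": "sx21", "priority": 1, "guid": 0x0002C90300A1B2C0, "enabled": True},
    {"name": "sx22", "priority": 2, "guid": 0x0002C90300A1B2D0, "enabled": True},
]

print(elect_master(nodes)["name"])
# sx22

# Simulate a failure of the SM on sx22: the remaining SM takes over.
nodes[1]["enabled"] = False
print(elect_master(nodes)["name"])
# sx21
```

This models the election of the subnet's SM only. The master/standby role shown in the CLI prompt appears to be the HA cluster role, which this sketch does not model.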

 

Verification

  1. Check the IB HA status.

 

sx21 [my-sm-cluster: master] (config) # show ib ha

Global HA state

==================

IB Subnet HA name: my-sm-cluster

HA IP address: 10.20.2.160/16

Active HA nodes: 2

HA node local information

Name: sx21 (active) <--- (local node)

SM-HA state: master

IP: 10.20.2.21

Virtual switch membership: infiniband-default

HA node local information

Name: sx22 (active)

SM-HA state: standby

IP: 10.20.2.22

Virtual switch membership: infiniband-default

 

  2. Check a brief status of HA.

 

sx21 [my-sm-cluster: master] (config) # show ib ha brief

Global HA state

==================

IB Subnet HA name: my-sm-cluster

HA IP address: 10.20.2.160/16

Active HA nodes: 2

ID SM-HA state IP Virtual switch membership

--------------------------------------------------------------------------------

*sx21 master 10.20.2.21 infiniband-default

sx22 standby 10.20.2.22 infiniband-default
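
When this check is scripted, the brief output can be parsed to confirm that exactly one node is master. The sketch below works on the text shown above, pasted in as a string. It assumes whitespace-separated columns, only the two states seen in this post, and that the leading `*` marks the local node; how the text is collected from the switch is left out.

```python
def parse_ha_brief(text):
    """Extract the node rows from 'show ib ha brief' output as dictionaries."""
    nodes = []
    for line in text.splitlines():
        fields = line.split()
        # Node rows have four columns, and the second one is the SM-HA state.
        if len(fields) == 4 and fields[1] in ("master", "standby"):
            nodes.append({
                "name": fields[0].lstrip("*"),
                "state": fields[1],
                "ip": fields[2],
                "local": fields[0].startswith("*"),
            })
    return nodes


sample = """\
Global HA state
==================
IB Subnet HA name: my-sm-cluster
HA IP address: 10.20.2.160/16
Active HA nodes: 2
ID SM-HA state IP Virtual switch membership
--------------------------------------------------------------------------------
*sx21 master 10.20.2.21 infiniband-default
sx22 standby 10.20.2.22 infiniband-default
"""

nodes = parse_ha_brief(sample)
masters = [n["name"] for n in nodes if n["state"] == "master"]
print(masters)
# ['sx21']
```

A result with zero or several masters would suggest that the cluster has not converged or that the nodes cannot reach each other over the management network.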

 

  3. Show IB SM nodes status (per switch).

 

sx21 [my-sm-cluster: master] (config) # show ib smnodes

HA state of switch infiniband-default

========================================

IB Subnet HA name: my-sm-cluster

HA IP address: 10.20.2.160/16

Active HA nodes: 2

HA node local information

Name: sx21 (active) <--- (local node)

SM-HA state: master

SM Licensed: yes

SM Running: running

SM Enabled: enabled

SM Priority: 1

IP: 10.20.2.21

HA node local information

Name: sx22 (active)

SM-HA state: standby

SM Licensed: yes

SM Running: running

SM Enabled: enabled

SM Priority: 2

IP: 10.20.2.22

 

MLNX-OS WebUI

In the WebUI, use the VIP address to change the SM configuration.

 

  1. Log in to 10.20.2.160 (the VIP address).

 

  2. Go to System > HA to configure and change the HA cluster name and VIP.

 

[Figure: System - HA]

 

  3. Go to IB SM Mgmt > Base SM to change the SM node parameters.

 

[Figure: IB SM Mgmt - Base SM]