Advantages and Applications of 2x200G HDR Splitter Cable in Large-Scale Server Clusters - NADDOD Blog

InfiniBand 2X200G HDR Splitter Cable

NADDOD Abel InfiniBand Expert Aug 14, 2023

In scenarios such as high-performance computing and the training and inference of large AI models, servers commonly adopt an eight-GPU design to maximize the computational power of the cluster. Each server is equipped with eight 200G IB cards to provide communication and bandwidth support for the computing network. As server clusters grow, the cost of deploying high-speed, low-latency, lossless network equipment (IB HDR switches and IB HDR cables) rises. Once the number of server nodes exceeds 100, the network architecture must move from a two-tier Fat Tree to a three-tier Fat Tree, and the network takes a noticeably larger share of the total investment.

 

The HDR 200G IB Splitter H-Cable offers a better solution for the port connectivity between spine switches and leaf switches, significantly enhancing the networking scalability of IB switches. It enables a maximum access capacity of 200 server nodes in a two-tier architecture, compared to the previous limit of 100 server nodes. Now let's briefly introduce this cable and its networking approach:


1. What is a 2x200G HDR Splitter Cable?

The 2x200G HDR Splitter Cable offered by NADDOD, specifically the 2Q2Q56-200G-A3H cable, is a cost-effective active optical splitter cable based on QSFP56 VCSEL (Vertical Cavity Surface Emitting Laser) technology. It supports 2x 200Gb/s to 2x 200Gb/s data transmission. Each QSFP56 end of the cable is equipped with an EEPROM that provides product and status monitoring information to the host system. The cable complies with the SFF-8665, SFF-8636, and RoHS standards.

MFS1S90-H003E

2. What is the application of the 2x200G HDR Splitter Cable?

The main application of the 2x200G HDR Splitter Cable is to cross-connect 200G leaf switches and spine switches in a Fat Tree topology. This allows each port of an HDR InfiniBand QSFP56 switch to operate as 2x HDR100, as shown in the diagram below:

Spine-Leaf Full Interconnection

 

3. The advantages of the 2x200G HDR Splitter Cable

<1>Increase port access capacity

A single 2x200G HDR Splitter Cable (consisting of four modules) can fully interconnect two spine switches and two leaf switches while occupying only four switch ports. With HDR100 direct cables, the same interconnection would require four cables and eight modules and occupy eight switch ports, which wastes a significant share of switch port resources. With HDR 200G direct cables, it would still require at least four cables and eight switch ports. This raises the uplink and downlink bandwidth, but the port occupancy limits network scalability. As shown in Figure 2, with traditional HDR 200G direct cables and HDR IB switches, a two-tier Fat Tree network supports at most 20 spine switches and 40 leaf switches, and each leaf switch provides at most 20 200G access ports. The total access capacity is 800 200G ports, which meets the networking requirements of 100 eight-GPU servers.


Figure 2: Fat Tree Architecture - 200G HDR Direct Cables
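The Figure 2 numbers can be reproduced with a short calculation. The sketch below assumes a 40-port HDR switch (the radix implied by the 20 spine and 40 leaf figures above) and a non-blocking design in which each leaf splits its ports evenly between servers and spines. The function and parameter names are illustrative only.

```python
def two_tier_direct_capacity(radix=40, nics_per_server=8):
    """Size a non-blocking two-tier fat tree built with one-to-one cables."""
    downlinks = radix // 2          # leaf ports facing servers
    uplinks = radix - downlinks     # leaf ports facing spines
    spines = uplinks                # each leaf uplink goes to a different spine
    leaves = radix                  # each spine port goes to a different leaf
    access_ports = leaves * downlinks
    return {
        "spines": spines,
        "leaves": leaves,
        "access_ports": access_ports,
        "servers": access_ports // nics_per_server,
    }


direct = two_tier_direct_capacity()
print(direct)
```

With the assumed defaults this gives 20 spines, 40 leaves, 800 access ports, and 100 eight-GPU servers, matching the limits stated above.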

 

However, by using 2x200G HDR Splitter Cable and HDR IB switches for networking, under the same two-tier Fat Tree architecture, the maximum access capacity can be doubled to 1600 200G ports, which can meet the networking requirements of 200 eight-GPU servers. As shown in Figure 3:


Figure 3: Two-tier Fat Tree Architecture - 2x200G HDR Splitter Cable

 

<2>Expand network scalability

As shown in Figure 3, a single 2x200G HDR Splitter Cable is used between spine1-2 and leaf1-2. Leaf1 reaches both spines through a single port, so 20 uplink ports are enough to connect it to 40 spines. This doubles the network scalability.
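The wiring pattern described here can be sketched as a cable plan. The example below is an illustration under stated assumptions, not a vendor tool: it assumes 40-port switches, pairs leaves and spines in the same way as leaf1-2 and spine1-2, and treats each H-cable as one port on each of its two leaves and two spines. The switch names and the `h_cable_plan` helper are hypothetical.

```python
from collections import Counter
from itertools import product


def h_cable_plan(radix=40):
    """Return (spines, leaves, cables) for a two-tier fat tree wired with H-cables."""
    uplinks = radix // 2            # leaf ports facing spines
    spines = 2 * uplinks            # each uplink port splits toward 2 spines
    leaves = 2 * radix              # each spine port splits toward 2 leaves
    cables = []
    # One H-cable joins one pair of leaves to one pair of spines.
    for lp, sp in product(range(leaves // 2), range(spines // 2)):
        leaf_pair = (f"leaf{2 * lp + 1}", f"leaf{2 * lp + 2}")
        spine_pair = (f"spine{2 * sp + 1}", f"spine{2 * sp + 2}")
        cables.append((leaf_pair, spine_pair))
    return spines, leaves, cables


RADIX = 40
spines, leaves, cables = h_cable_plan(RADIX)

# Each cable end plugs into exactly one switch port.
ports_used = Counter(
    switch for leaf_pair, spine_pair in cables for switch in leaf_pair + spine_pair
)

# Spines that leaf1 can reach through its uplink ports.
leaf1_spines = {
    spine
    for leaf_pair, spine_pair in cables
    if "leaf1" in leaf_pair
    for spine in spine_pair
}

access_ports = leaves * (RADIX - RADIX // 2)
print(len(cables), ports_used["leaf1"], len(leaf1_spines), access_ports)
```

Under these assumptions leaf1 uses 20 ports to reach 40 spines, each spine uses all 40 of its ports, and the plan needs 800 splitter cables for 1600 access ports.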

 

By contrast, as depicted in Figure 4, meeting the requirements of 200 eight-GPU servers with HDR 200G direct cables requires a three-tier Fat Tree architecture, with more switches and more AOC cables.


Figure 4: Three-tier Fat Tree Architecture
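To see how much equipment the three-tier design needs, the switch and cable counts can be estimated as follows. This is a rough sketch that assumes 40-port switches and a non-blocking design in which leaf and spine switches use half their ports downward and core switches use all of their ports downward. The function name and the dictionary keys are illustrative.

```python
import math


def three_tier_direct_counts(endpoints, radix=40):
    """Estimate switches and cables for a non-blocking three-tier fat tree."""
    half = radix // 2
    leaf = math.ceil(endpoints / half)          # half of each leaf faces servers
    leaf_uplinks = leaf * half
    spine = math.ceil(leaf_uplinks / half)      # half of each spine faces leaves
    spine_uplinks = spine * half
    core = math.ceil(spine_uplinks / radix)     # every core port faces a spine
    return {
        "core": core,
        "spine": spine,
        "leaf": leaf,
        "cables_core_spine": spine_uplinks,
        "cables_spine_leaf": leaf_uplinks,
        "cables_leaf_hca": endpoints,
    }


# 200 eight-GPU servers, one 200G port per GPU.
counts = three_tier_direct_counts(endpoints=200 * 8)
print(counts)
```

For 1600 endpoints this yields 40 core, 80 spine, and 80 leaf switches with 1600 cables per layer, or 200 switches and 4800 cables in total.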

 

<3>Reduce Cost

The table below compares the required number of devices for networking 200 eight-GPU servers using HDR 200G direct cables and 2x200G HDR Splitter Cable:

 

| Device Name/Description | Quantity (200G HDR Direct Connection Network) | Quantity (2x200G HDR Splitter Cable Network) |
| --- | --- | --- |
| QM87xx (HDR IB Switch), core | 40 | 0 |
| QM87xx (HDR IB Switch), spine | 80 | 40 |
| QM87xx (HDR IB Switch), leaf | 80 | 80 |
| AOC (core-spine) | 1600 | 0 |
| AOC (spine-leaf) | 1600 | 800 (2x200G HDR Splitter Cable) |
| AOC (leaf-HCA) | 1600 | 1600 (200G HDR Direct Connection) |

 

From the above table, it can be observed that:

With 200G HDR direct cables between switches, the network requires 200 IB switches and 4800 HDR cables. With 2x200G HDR Splitter Cables, the switch count drops to 120 (80 fewer) and the cable count drops to 2400 (2400 fewer). Although a 2x200G HDR Splitter Cable costs approximately 2.5 times as much as a 200G HDR direct cable, the overall cost falls by approximately 30%~40%.
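The savings estimate depends on how a switch is priced relative to a cable, which the article does not state. The sketch below takes the device counts from the table and the 2.5x splitter cable price ratio from the text, and treats the switch price (expressed in units of one 200G HDR direct cable) as a hypothetical input. All function and parameter names are illustrative.

```python
SPLITTER_PRICE_RATIO = 2.5  # splitter cable price / direct cable price (from the text)


def network_cost(switches, direct_cables, splitter_cables, switch_price):
    """Total cost in units of one 200G HDR direct cable."""
    return (
        switches * switch_price
        + direct_cables
        + splitter_cables * SPLITTER_PRICE_RATIO
    )


def savings(switch_price):
    """Fractional saving of the splitter design over the direct-cable design."""
    direct = network_cost(200, 4800, 0, switch_price)
    splitter = network_cost(120, 1600, 800, switch_price)
    return 1 - splitter / direct


# Hypothetical switch prices, in multiples of a direct cable price.
for switch_price in (12, 20, 50):
    print(switch_price, f"{savings(switch_price):.1%}")
```

Under this simple model the saving is 30% when a switch costs as much as 12 direct cables and approaches 40% as switches dominate the budget, which is consistent with the 30%~40% range quoted above.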

4. Summary

As server clusters grow, especially beyond 100 servers, adopting the 2x200G HDR Splitter Cable significantly reduces the number of switches and cables, lowers costs, increases port access capacity, and makes network upgrades smoother and more cost-effective. NADDOD can provide 2x200G HDR Splitter Cables in different lengths to meet the usage and deployment requirements of various data center scenarios.