Introducing InfiniBand for AI Clusters Infrastructure: High Bandwidth and Low Latency Networking Solution - NADDOD Blog

Unleashing InfiniBand High-Speed Networking for AI Clusters

NADDOD Gavin InfiniBand Network Engineer Jan 19, 2024

InfiniBand (IB), a name suggesting "infinite bandwidth," is a network communication protocol designed for large-scale, scalable clusters. It can be used for data interconnects within or between computers, for direct or switched interconnects between servers and storage systems, and for interconnects between storage systems.

 

One of InfiniBand's most significant features is its combination of high bandwidth and low latency, which is why it is widely used in high-performance computing (HPC), high-performance cluster application servers, and high-performance storage.

 

InfiniBand Link Rates

InfiniBand defines multiple link widths at the physical layer, such as 1X, 4X, and 12X. Each individual 1X link is a four-wire serial differential connection (two wires in each direction).

 

InfiniBand Link Rates

 

For example, in the early Single Data Rate (SDR) specification, the raw signaling rate of a 1X link was 2.5 Gbps, a 4X link 10 Gbps, and a 12X link 30 Gbps. The actual data rate of a 1X link is 2.0 Gbps because of 8b/10b encoding, and since the link is bidirectional, the total data bandwidth is 4 Gbps.
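The 8b/10b arithmetic above can be sketched in a few lines. This is an illustrative helper (not a real API); the raw rates and lane counts come from the SDR figures in the text.

```python
# Effective data rate of an early InfiniBand SDR link under 8b/10b
# encoding, which carries 8 data bits in every 10 line bits.

def effective_gbps(raw_gbps: float, lanes: int = 1) -> float:
    """Per-direction data rate after 8b/10b encoding overhead."""
    return raw_gbps * lanes * 8 / 10

print(effective_gbps(2.5, lanes=1))   # 1X: 2.0 Gb/s per direction
print(effective_gbps(2.5, lanes=4))   # 4X: 8.0 Gb/s per direction
print(effective_gbps(2.5, lanes=12))  # 12X: 24.0 Gb/s per direction
```

Doubling the 1X result for the two directions gives the 4 Gbps total mentioned above.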

 

Over time, InfiniBand's network bandwidth has continuously evolved. The following diagram illustrates the progression from SDR, DDR, QDR, FDR, and EDR to HDR and NDR, with speeds based on 4X links.

 

InfiniBand Bandwidth

SDR (Single Data Rate): 2.5 Gb/s (10 Gb/s for 4X).

DDR (Double Data Rate): 5 Gb/s (20 Gb/s for 4X).

QDR (Quad Data Rate): 10 Gb/s (40 Gb/s for 4X).

FDR (Fourteen Data Rate): 14 Gb/s (56 Gb/s for 4X).

EDR (Enhanced Data Rate): 25 Gb/s (100 Gb/s for 4X).

HDR (High Data Rate): 50 Gb/s (200 Gb/s for 4X).

NDR (Next Data Rate): 100 Gb/s (400 Gb/s for 4X).
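The 4X figures in the list are simply the per-lane (1X) rates multiplied by the lane count. A minimal sketch, using the nominal per-lane rates listed above (the function name and table are illustrative):

```python
# Nominal per-lane (1X) data rates per InfiniBand generation, in Gb/s.
LANE_RATE_GBPS = {
    "SDR": 2.5, "DDR": 5, "QDR": 10, "FDR": 14,
    "EDR": 25, "HDR": 50, "NDR": 100,
}

def link_rate(generation: str, lanes: int = 4) -> float:
    """Aggregate link rate = per-lane rate x lane count."""
    return LANE_RATE_GBPS[generation] * lanes

for gen in LANE_RATE_GBPS:
    print(f"{gen}: {link_rate(gen)} Gb/s (4X)")
```

For example, `link_rate("NDR")` gives the 400 Gb/s figure quoted for 4X NDR links.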

 

InfiniBand Link Rates Sheet

InfiniBand Network Interconnect Products

The cables used in InfiniBand networks differ from traditional Ethernet cables and optical fiber cables. Specialized InfiniBand cables are required for different connection scenarios.

 

InfiniBand network interconnect products include Direct Attach Copper (DAC) cables, Active Optical Cables (AOC), and optical modules.

 

DAC cables and AOC are both used for interconnecting high-capacity storage devices in data centers and high-performance computing systems.

 

DAC (Direct Attach Copper) cables transmit signals over copper wires using low-voltage pulses. Their power consumption, transmission distance, and price vary with the materials used. DAC cables have low power consumption but relatively short reach, typically under 10 meters, and are comparatively affordable.

 

Direct Attach Copper cables

Active Optical Cables (AOC) are optical cables used for data transmission in InfiniBand networks. Unlike DAC cables, AOCs use optical fibers for signal transmission, performing electrical-to-optical conversion at one end and optical-to-electrical conversion at the other. AOCs consume more power than DAC cables but can reach transmission distances of up to 100 meters; they are also priced higher because of the optical components and technology involved.

 

Active Optical Cables

Optical modules convert between optical and electrical signals and are primarily used as the transmission medium between switches and devices. Their operating principle is similar to that of optical transceivers, but optical modules are more efficient. Optical modules are categorized by form factor, with common types including SFP, SFP+, XFP, SFP28, QSFP+, QSFP28, and so on.

 

Optical transceivers are devices that convert short-distance electrical signals to long-distance optical signals and vice versa. They are typically used in long-distance transmissions, utilizing optical fibers to transmit the converted optical signals while converting received optical signals back into electrical signals at the receiving end. In many cases, they are also referred to as fiber converters. Optical transceivers provide an affordable solution for users who need to upgrade their systems from copper to fiber but have limited funds, manpower, or time.

 

Optical transceivers

 

How to pair optical modules with optical transceivers:

 

  1. The wavelength and transmission distance must match. For example, a 1310 nm device is typically rated for 10 km or 20 km.

  2. Pay attention to the connector type of the fiber jumper. Optical transceivers typically use SC connectors, while optical modules use LC connectors.

  3. The data rate must be the same. For instance, a gigabit transceiver pairs with a 1.25G optical module, 100 Mbps connects to 100 Mbps, and gigabit connects to gigabit.

  4. The module types must match. Single-mode fiber requires a single-mode module, and multi-mode fiber requires a multi-mode module.
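The four pairing rules above can be expressed as a simple compatibility check. The field names and example values below are illustrative, not a real vendor API; rule 2 (SC vs. LC connectors) is a cabling choice rather than a device property, so it is noted in a comment.

```python
# A sketch of the pairing rules: wavelength, data rate, and fiber mode
# must match on both ends of the link (rules 1, 3, and 4). Rule 2 is
# satisfied by picking a jumper with the right connectors (SC vs. LC).
from dataclasses import dataclass

@dataclass
class OpticalDevice:
    wavelength_nm: int   # e.g. 1310
    rate_gbps: float     # e.g. 1.25 for a gigabit module
    fiber_mode: str      # "single-mode" or "multi-mode"

def compatible(a: OpticalDevice, b: OpticalDevice) -> bool:
    return (a.wavelength_nm == b.wavelength_nm
            and a.rate_gbps == b.rate_gbps
            and a.fiber_mode == b.fiber_mode)

transceiver = OpticalDevice(1310, 1.25, "single-mode")
module = OpticalDevice(1310, 1.25, "single-mode")
print(compatible(transceiver, module))  # True
```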

 

InfiniBand Networking

 

InfiniBand networking differs from ordinary Ethernet switching and incurs higher networking costs. To achieve lossless communication between the network cards of any two compute nodes, InfiniBand networks use a topology called Fat Tree. The following diagram illustrates a typical Fat Tree topology, where squares represent switches and ellipses represent compute nodes.

 

Fat Tree consists of two main layers. The upper layer is the core layer, which does not connect to any compute nodes and solely functions as a traffic forwarder. The lower layer is the access layer, connecting various compute nodes.

 

The high cost of implementing a Fat Tree topology in InfiniBand networks stems mainly from the following: suppose an access switch has 36 ports. To achieve non-blocking bandwidth, only half of those ports (18) can connect to compute nodes, while the other half must connect up to the core switches. Each cable alone can cost thousands of dollars, and this redundancy of uplinks is required to keep the fabric lossless.
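The half-down/half-up rule also fixes how large a two-tier Fat Tree can grow for a given switch radix. A minimal sizing sketch, assuming a standard two-tier non-blocking layout (the function is illustrative):

```python
# Sizing a two-tier non-blocking Fat Tree from the switch port count:
# each access switch uses half its ports for compute nodes and half
# for uplinks, and the core layer must absorb every uplink.

def fat_tree_two_tier(radix: int):
    down_ports = radix // 2        # node-facing ports per access switch
    access = radix                 # max access switches the core can serve
    core = radix // 2              # core switches needed for all uplinks
    hosts = access * down_ports    # maximum compute nodes
    return hosts, access, core

hosts, access, core = fat_tree_two_tier(36)
print(hosts, access, core)  # 648 nodes, 36 access switches, 18 core switches
```

With the 36-port switches from the example above, every access switch dedicates 18 ports to uplinks, so a full build needs 648 uplink cables on top of the 648 node cables, which is where the cabling cost multiplies.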

 

InfiniBand Nodes

 

NVIDIA InfiniBand Commercial Products

 

Mellanox long dominated the global InfiniBand market, and after NVIDIA acquired Mellanox, NVIDIA introduced its seventh-generation InfiniBand architecture, NVIDIA Quantum-2, in 2021.

 

The NVIDIA Quantum-2 platform includes the NVIDIA Quantum-2 series switches, NVIDIA ConnectX-7 InfiniBand adapters, BlueField-3 InfiniBand DPUs, and cables.

 

NVIDIA NDR 400G InfiniBand

The NVIDIA Quantum-2 series switches feature a compact 1U design and are available in both air-cooled and liquid-cooled versions. They are built on a 7 nm manufacturing process; a single chip contains 57 billion transistors (more than an A100 GPU). A single switch offers flexible configurations of either 64 ports at 400 Gb/s or 128 ports at 200 Gb/s, for a total bidirectional throughput of 51.2 Tb/s. The NVIDIA NDR 400 Gb/s InfiniBand switch is shown in the following image:
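The quoted 51.2 Tb/s figure follows directly from the port configurations: ports times per-port rate times two directions. A quick cross-check:

```python
# Both Quantum-2 port configurations yield the same bidirectional total.

def bidirectional_tbps(ports: int, gbps_per_port: int) -> float:
    return ports * gbps_per_port * 2 / 1000

print(bidirectional_tbps(64, 400))   # 51.2
print(bidirectional_tbps(128, 200))  # 51.2
```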

 

NVIDIA Quantum-2

The NVIDIA ConnectX-7 InfiniBand adapters support both PCIe Gen4 and Gen5 and come in various form factors, delivering 400Gb/s throughput.

 

NVIDIA ConnectX-7 InfiniBand Adapter

NADDOD: An Experienced Third-Party InfiniBand Product Provider

NADDOD provides a one-stop solution for InfiniBand optical modules and high-speed cables!

 

NADDOD InfiniBand NDR Product

Delivery Time: We maintain abundant, stable inventory to ensure fast delivery. After you place an order, we commit to delivering within two weeks, so your project can move forward quickly without being held up by lead times.

 

Product Performance:

① Our products undergo 100% real device testing to ensure quality and reliability, and we can provide you with professional test reports.

② Our testing scenarios involve the simultaneous application of tens of thousands of cables to ensure that the products can operate smoothly under real application requirements without packet loss or errors.

 

Product Delivery:

① We have successfully cooperated with multiple enterprises and delivered products that have been running stably, gaining trust from our customers.

② We provide fast and responsive technical services to ensure after-sales support throughout your product usage process.

 

Multiple successful deliveries and real-world application cases are the best endorsement of our quality assurance. You don't need to worry about product quality and inventory issues as we always maintain sufficient stock to ensure your needs are met promptly.

 

In addition to providing third-party high-quality optical modules, we also have a large inventory of original NVIDIA products, offering you more choices at any time. Contact NADDOD now to learn more details!