How Many InfiniBand HDR Products Are Used for AI?

NADDOD Abel InfiniBand Expert Jul 6, 2023

Since 2023, ChatGPT has successfully “gone global” outside of the tech industry. As ChatGPT and AI develop, data center construction will accelerate worldwide, which will inevitably change the scale and pattern of optical module procurement.

Because of their large internal data flows, AI data centers use fat-tree network topologies. Each switch node has an equal number of upstream and downstream ports, so these networks need more switches than traditional data centers.
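To make the "equal upstream and downstream ports" idea concrete, here is a minimal sketch that sizes an idealized two-tier (leaf and spine) fat tree. It is an illustration only, not NVIDIA's actual SuperPOD layout: the function name `two_tier_fat_tree` and the two-tier assumption are hypothetical, and the 40-port figure is the per-switch port count quoted in this article.

```python
def two_tier_fat_tree(ports_per_switch: int) -> dict:
    """Size an idealized non-blocking two-tier fat tree (assumed model)."""
    if ports_per_switch <= 0 or ports_per_switch % 2:
        raise ValueError("ports_per_switch must be a positive even number")

    # Each leaf switch splits its ports 1:1 between hosts and uplinks.
    down_ports = ports_per_switch // 2
    up_ports = ports_per_switch // 2

    # One uplink from every leaf to every spine, so spines == uplinks per leaf.
    spines = up_ports
    # Each spine port connects to one leaf, so leaves == ports per spine.
    leaves = ports_per_switch

    return {
        "down_ports_per_leaf": down_ports,
        "up_ports_per_leaf": up_ports,
        "leaf_switches": leaves,
        "spine_switches": spines,
        "total_switches": leaves + spines,
        "max_hosts": leaves * down_ports,
    }


if __name__ == "__main__":
    print(two_tier_fat_tree(40))
```

Because half of every leaf switch's ports face upward, the switch count grows faster than in an oversubscribed design, which is the trade-off the paragraph above describes.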

For example, NVIDIA’s AI cluster model uses the SuperPOD as its basic unit. A standard SuperPOD consists of 140 DGX A100 GPU servers, HDR InfiniBand 200G network cards, and 170 NVIDIA Quantum QM8790 switches, each with 40 ports running at 200G. With 70 servers connected to each switch above and below (a 1:1 connection), and each cable occupying two ports, the cable demand for interconnecting the switch ports is 40 × 170 / 2 = 3,400, which is rounded up to 4,000 to account for actual deployment scenarios.
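The cable arithmetic above can be sketched in a few lines of Python. This is a rough illustration under the article's own assumptions (every switch port is in use and each cable occupies two ports); the function name `estimate_cables` is hypothetical.

```python
def estimate_cables(switches: int, ports_per_switch: int) -> int:
    """Estimate cable count, assuming every port is used and
    each cable occupies exactly two switch ports."""
    total_ports = switches * ports_per_switch
    return total_ports // 2


SWITCHES = 170          # QM8790 switches per SuperPOD (from the article)
PORTS_PER_SWITCH = 40   # ports per switch (from the article)
PLANNING_FIGURE = 4_000  # the article's rounded-up deployment figure

cables = estimate_cables(SWITCHES, PORTS_PER_SWITCH)
print(cables)                    # 3400
print(PLANNING_FIGURE - cables)  # 600 extra links of headroom
```

The gap between 3,400 and 4,000 is the allowance for real deployment scenarios mentioned in the text; the article does not break that margin down further.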

For a SuperPOD, the ratio of servers to switches to optical transceivers is 140:170:4,000, or approximately 1:1.2:28.6. At this ratio, an entry-level GPT4.0-like demand of about 3,750 NVIDIA DGX A100 servers requires roughly 110,000 optical transceivers.
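As a quick check of these figures, the sketch below normalizes the SuperPOD counts per server and scales them to 3,750 servers. It assumes demand scales linearly with server count, which is a simplification; the helper names `per_server_ratio` and `scale_to` are hypothetical.

```python
SERVERS = 140         # DGX A100 servers per SuperPOD (from the article)
SWITCHES = 170        # switches per SuperPOD (from the article)
TRANSCEIVERS = 4_000  # rounded-up transceiver count (from the article)


def per_server_ratio() -> tuple:
    """Return (switches per server, transceivers per server)."""
    return SWITCHES / SERVERS, TRANSCEIVERS / SERVERS


def scale_to(target_servers: int) -> tuple:
    """Linearly scale switch and transceiver counts to a larger cluster."""
    switches = round(target_servers * SWITCHES / SERVERS)
    transceivers = round(target_servers * TRANSCEIVERS / SERVERS)
    return switches, transceivers


sw_ratio, tr_ratio = per_server_ratio()
print(round(sw_ratio, 1), round(tr_ratio, 1))  # 1.2 28.6
print(scale_to(3_750))                         # (4554, 107143)
```

Linear scaling gives about 107,000 transceivers, which the article rounds up to 110,000.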

Network Topology

According to NVIDIA's scheme, SuperPODs use HDR InfiniBand 200G network interconnections. NADDOD provides InfiniBand optical transceivers, including HDR 200G QSFP56 SR4 and HDR 200G QSFP56 FR4, to support high-speed and low-latency data transmission, as well as low power consumption and high stability. The following is the basic information for the two modules:

InfiniBand HDR 200G QSFP56 Transceiver

| Speed | PN | Price | Wavelength | Fiber | Connector | Max Transmission Distance | Key Features |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 200G QSFP56 SR4 | MMA1T00-HS | $390 | 850nm | Multi-mode | MPO/MTP | 100m | High-speed, low-cost, short-reach connectivity solution designed for data center applications. |
| 200G QSFP56 FR4 | MMS1W50-HM | $836 | 1310nm | Single-mode | LC duplex | 2km | Uses WDM technology to transmit 4 x 50 Gbps signals over a single fiber, which enables high-density and low-power interconnects. |

200G QSFP56 Transceiver

In addition, NVIDIA's scheme mentions AOC and DAC cables, which are usually used for short-distance data transmission, such as within servers or between servers and switches, and provide high bandwidth, low latency, and low power consumption. HDR AOC and DAC cables may therefore also be used in SuperPODs to cover different distances and application scenarios. NADDOD provides both straight (direct-connect) and breakout (branch) HDR AOC and DAC cables under the InfiniBand HDR scheme:

200G QSFP56 AOC & DAC

Conclusion

Overall, with the rapid development of AI and data centers, a large number of optical transceivers and cables are required to achieve high-speed, low-latency data transmission, as NVIDIA’s SuperPOD cluster model illustrates. NADDOD’s InfiniBand optical transceivers and cables offer high speed, low latency, and low power consumption, and they cover a range of distances and application scenarios. In future data center construction, we believe that high-speed, efficient, and stable optical transceivers and cables will be widely used and will become an important part of the build-out.
