The Future of High-Speed Interconnection: The 400G QSFP-DD SR4 Optical Module

NADDOD | Claire, Optical Module Engineer | Oct 9, 2023

With the rapid development of technologies such as large models, cloud computing, and big data, and especially the growing demand for compute and memory in large-scale model training, network connectivity has taken on renewed importance. Take GPT-3 as an example: training its 175-billion-parameter model requires roughly 2TB of memory, far beyond the capacity of any single GPU. Even if one card could hold the model, training on it would take an estimated 32 years, which is clearly impractical. To accelerate training, distributed training techniques have emerged: the model and data are partitioned across many machines and cards, bringing training time down to weeks or even days.

Traditional training vs Large Model training
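To make the idea of partitioning work across machines and cards concrete, below is a minimal data-parallel sketch using PyTorch DistributedDataParallel. The model, synthetic dataset, and hyperparameters are placeholders chosen purely for illustration; a real large-model job would layer tensor and pipeline parallelism on top of this.

```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, TensorDataset, DistributedSampler

def main():
    # torchrun sets RANK, LOCAL_RANK and WORLD_SIZE for each worker process.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Placeholder model and synthetic data; a real job would load a large
    # model and shard a real dataset here.
    model = torch.nn.Linear(1024, 1024).cuda(local_rank)
    model = DDP(model, device_ids=[local_rank])

    dataset = TensorDataset(torch.randn(4096, 1024), torch.randn(4096, 1024))
    sampler = DistributedSampler(dataset)           # each rank sees a distinct shard
    loader = DataLoader(dataset, batch_size=64, sampler=sampler)

    opt = torch.optim.SGD(model.parameters(), lr=1e-3)
    loss_fn = torch.nn.MSELoss()

    for epoch in range(2):
        sampler.set_epoch(epoch)
        for x, y in loader:
            x, y = x.cuda(local_rank), y.cuda(local_rank)
            opt.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()        # gradients are all-reduced across ranks here
            opt.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

Launched with, for example, `torchrun --nproc_per_node=8 train.py` on each node, every rank trains on its own data shard while gradient synchronization traverses the cluster network on each backward pass, which is exactly where network latency and bandwidth become the bottleneck.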

1. The Rise of Distributed Training Technology

Distributed training builds a cluster with large compute and storage capacity, and the high-performance network connecting that cluster directly determines inter-node communication efficiency, and with it the throughput and performance of the entire cluster. Low-latency, high-speed networks have therefore become a key requirement for data center development.


The core technology for reducing end-to-end communication latency across multiple machines and cards is Remote Direct Memory Access (RDMA). RDMA allows one host to read and write the memory of another host directly, bypassing the operating system kernel and its protocol stack. Compared with traditional TCP/IP networking, this delivers a dramatic latency improvement, often by an order of magnitude or more.
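In GPU clusters, application code rarely programs RDMA verbs directly; collective libraries such as NCCL use the InfiniBand/RoCE transport underneath. The sketch below is illustrative rather than a tuning recommendation: it sets a few commonly used NCCL environment variables to steer traffic onto the RDMA fabric and runs a simple all-reduce that exercises it. The device name `mlx5_0` and interface name `eth0` are placeholders for your own hardware.

```python
import os

# Illustrative NCCL settings; device and interface names are placeholders.
os.environ.setdefault("NCCL_IB_DISABLE", "0")        # keep the IB/RoCE transport enabled
os.environ.setdefault("NCCL_IB_HCA", "mlx5_0")       # pick the RDMA-capable NIC (e.g. a CX7 port)
os.environ.setdefault("NCCL_SOCKET_IFNAME", "eth0")  # interface used for bootstrap traffic
os.environ.setdefault("NCCL_DEBUG", "INFO")          # log which transport NCCL actually selects

import torch
import torch.distributed as dist

def main():
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # A single all-reduce; with the IB/RoCE transport active this traffic runs
    # over RDMA rather than the kernel TCP/IP stack.
    x = torch.ones(1 << 20, device="cuda")
    dist.all_reduce(x)
    torch.cuda.synchronize()

    if dist.get_rank() == 0:
        print("all_reduce done, first element =", x[0].item())

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

With `NCCL_DEBUG=INFO`, the startup log indicates whether the IB/RoCE transport or the plain socket transport was chosen, which is a quick way to confirm the RDMA path is actually in use.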


Currently, RDMA is implemented mainly in two ways: InfiniBand and RoCE (RDMA over Converged Ethernet). InfiniBand networks offer higher performance and more mature technology, but they come at a higher cost and are supplied mainly by NVIDIA. The RoCE ecosystem, by contrast, is more open, with many vendors offering a range of devices, which makes product selection more diverse but also more complex.

2. Existing 400G Network Solutions

The high-speed network in a GPU cluster consists primarily of high-speed network cards, optical modules, and high-speed switches. Servers equipped with H800 GPUs currently use mainly NVIDIA ConnectX-7 (CX7) 400G network cards, and there are several switch options on the market: 64-port 400G QSFP-DD switches based on the Broadcom 25.6T Tomahawk 4 (TH4) chip, 64-port 800G switches based on the Broadcom 51.2T chip, Cisco Nexus series switches, and 400G/800G switches based on NVIDIA Spectrum chips. From a market perspective, 400G QSFP-DD switches are currently the dominant choice. The following diagram illustrates a commonly used cluster network architecture:

Common network architecture for clusters

It is worth noting that mainstream 400G switches on the market use 56G SerDes, while the CX7 network card uses 112G SerDes. This difference must be taken into account when selecting optical modules to ensure the components are compatible and work properly. The following diagram illustrates this, and a rough lane-count calculation follows it:

56G SerDes working process
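The mismatch is easier to see with nominal payload rates: a 56G PAM4 SerDes carries roughly 50 Gb/s of payload per lane, and a 112G SerDes roughly 100 Gb/s, so the same 400G port is built from a different number of electrical lanes on each side. A back-of-the-envelope sketch (nominal numbers only, for illustration):

```python
# Rough, nominal numbers: 56G PAM4 SerDes ~= 50 Gb/s payload, 112G ~= 100 Gb/s.
PORT_RATE_GBPS = 400

switch_lane_gbps = 50    # typical 56G-SerDes switch (e.g. TH4-based 400G QSFP-DD)
nic_lane_gbps = 100      # 112G-SerDes NIC (e.g. CX7)

switch_lanes = PORT_RATE_GBPS // switch_lane_gbps   # 8 electrical lanes of 50G
nic_lanes = PORT_RATE_GBPS // nic_lane_gbps          # 4 electrical lanes of 100G

print(f"Switch side: {switch_lanes} x {switch_lane_gbps}G electrical lanes")
print(f"NIC side:    {nic_lanes} x {nic_lane_gbps}G electrical lanes")
# A DR4/SR4 module presents 4 x 100G optical lanes, so on the 8 x 50G switch
# side the module's internal gearbox/DSP typically bridges the two lane formats.
```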

In most cases, customers choose DR4 optical modules on the switch side and OSFP 400G DR4 modules on the server side. The connection is illustrated in the following diagram:

OSFP 400G DR4 working process

3. QSFP-DD 400G SR4 Module: A New Choice for High-Performance Networks

DR4 modules are expensive, and the market urgently needs cost-effective alternatives. This is where NADDOD's latest release, the QSFP-DD 400G SR4 optical module, comes into play, as shown in the following diagram. The 400G QSFP-DD SR4 module supports transmission distances of up to 100 meters, operates at a wavelength of 850nm, and runs over multimode fiber with an MPO/MTP-12 connector, making it an ideal choice for high-performance network interconnection within data centers.

NADDOD 400G QSFP-DD SR4 Module

4. Why Choose the 400G QSFP-DD SR4 Optical Module?

Compared to DR4 modules, SR4 optical modules use VCSEL (vertical-cavity surface-emitting laser) technology. This significantly reduces cost and can lower overall power consumption by 2-4 watts per module, a clear advantage over the long run. Current SR4 pricing is relatively high, mainly because of constrained device supply, but as supply improves, the cost advantage of SR4 modules will become evident. In data center scenarios where link lengths do not exceed 100 meters, the SR4 optical module is the ideal choice.
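To put the 2-4 W per-module saving in perspective, here is a back-of-the-envelope estimate for a hypothetical cluster; only the 2-4 W figure comes from the text above, and the module count is a made-up input for illustration.

```python
# Hypothetical inputs -- only the 2-4 W per-module saving comes from the text above.
modules = 1000                 # assumed number of 400G modules in the cluster
saving_w_low, saving_w_high = 2, 4
hours_per_year = 24 * 365

kwh_low = modules * saving_w_low * hours_per_year / 1000
kwh_high = modules * saving_w_high * hours_per_year / 1000

print(f"Estimated energy saved per year: {kwh_low:,.0f} - {kwh_high:,.0f} kWh")
# For 1000 modules this works out to roughly 17,500 - 35,000 kWh per year,
# before counting the knock-on savings in cooling.
```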

5. Where to Buy 400G QSFP-DD SR4 Optical Module?

With the rapid development of technologies such as cloud computing, big data, and the Internet of Things, data transmission rates continue to rise. Traditional copper-cable transmission faces challenges such as bandwidth limits and signal attenuation, while optical fiber has become the primary choice for modern communication thanks to its high bandwidth and low loss. NADDOD's 400G QSFP-DD SR4 is a high-performance optical transmission solution that meets these high-speed demands while providing stability and reliability.


Technology keeps evolving. 400G multimode optical modules, AOCs, and DACs are expected to continue leading development in the networking field, providing strong support for the network requirements of the digital era. As a professional optical module manufacturer, NADDOD produces modules ranging from 1G to 400G, and we welcome you to explore and purchase our products.