Key Technologies of NVIDIA Network Solutions

NADDOD Quinn InfiniBand Network Architect Mar 8, 2024

This article covers the key technologies used to build cable and transceiver solutions for NVIDIA networks. Cables and transceivers provide the fundamental interconnect for 100G-PAM4, 50G-PAM4, and 25G-NRZ modulation. With these devices, total data rates of 800G, 400G, 200G, 100G, and 25Gb/s can be achieved. The technologies below are combined in various ways to create cables and transceivers that differ in modulation rate, copper wire, connector shell, protocol, optical connector, and fiber.

 

This topic will be discussed in the following sections:

 

  1. Modulation Rates
  2. Protocol Support for InfiniBand and Ethernet
  3. Connector Cages and Plugs
  4. Optical Connectors
  5. Optical Fibers
  6. Straight and Splitter Fiber Crossover Cables
  7. Optical Patch Panels

 

Before we proceed, it's important to understand that total data rates can be achieved through various technology combinations. For example:

 

  • 400Gb/s can be achieved through 4x100G-PAM4 or 8x50G-PAM4.

 

  • 200Gb/s can be achieved through 2x100G-PAM4, 4x50G-PAM4, or 8x25G-NRZ.

 

  • 100Gb/s can be achieved through 1x100G-PAM4, 2x50G-PAM4, or 4x25G-NRZ.

 

These combinations demonstrate the flexibility in achieving different data rates by utilizing different modulation schemes and configurations.
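The combinations listed above are plain arithmetic: lane count multiplied by per-lane rate. The following is a minimal sketch of that calculation; the names `LANE_RATES`, `LANE_COUNTS`, and `lane_combinations` are illustrative and not part of any NVIDIA tool, and the set of allowed lane counts (1, 2, 4, 8) is an assumption taken from the examples in this article.

```python
# Per-lane rates in Gb/s for the three modulation schemes named in the article.
LANE_RATES = {"100G-PAM4": 100, "50G-PAM4": 50, "25G-NRZ": 25}

# Assumed lane counts per port (from the examples in the text).
LANE_COUNTS = (1, 2, 4, 8)


def lane_combinations(total_gbps):
    """Return (lanes, modulation) pairs whose product equals total_gbps."""
    combos = []
    for modulation, rate in LANE_RATES.items():
        for lanes in LANE_COUNTS:
            if lanes * rate == total_gbps:
                combos.append((lanes, modulation))
    return combos


for total in (400, 200, 100):
    labels = [f"{lanes}x{mod}" for lanes, mod in lane_combinations(total)]
    print(f"{total}Gb/s:", ", ".join(labels))
```

Running the loop reproduces the three bullet points above, for example 400Gb/s as 4x100G-PAM4 or 8x50G-PAM4.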

 

1. Modulation Rates

High-speed digital signals utilize various voltage modulation types. Changing voltages generate digital pulses with varying voltage amplitudes or levels. In modern data centers, NRZ (Non-Return to Zero) modulation is typically used for slower speeds, while PAM4 (Pulse Amplitude Modulation with 4 levels) modulation is employed for higher speeds.

 

NRZ Modulation

In the early days of digital signaling, a digital zero was inserted between each data bit so that the receiving clock could synchronize with the data signal. This was known as "return to zero" modulation. As electronic devices became faster, the inserted zeros were eliminated, and the receiving clock was synchronized to the edges of the data signal instead. This is known as "non-return to zero" or NRZ modulation. NRZ became the industry standard for per-channel data rates of 1G, 10G, and 25Gb/s, and it has been used for total data rates of 1G, 10G, 25G, 40G, and 100G.

 

Continuing to use NRZ for 50G became problematic because the wire circuitry becomes like a radio antenna at high frequencies, resulting in signal power loss. This significantly increased the cost of controlling the signal on circuit boards and wires.

 

PAM4 Modulation

Industry standards organizations introduced a modulation technique called four-level Pulse Amplitude Modulation (PAM4). It encodes two data bits within a single pulse by varying the voltage amplitude during a clock period. With four distinct voltage levels, each clock period carries one of four two-bit values (00, 01, 10, or 11), which doubles the data carried per clock period compared with NRZ.
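The difference between the two schemes can be sketched in a few lines of code. This is an illustration only: the level numbers 0 to 3 stand in for the four voltage amplitudes, and the Gray-coded bit-pair mapping is an assumption made here for the example (the article does not say which mapping is used). The function names are hypothetical.

```python
# Assumed Gray-coded mapping from bit pairs to four amplitude levels (0..3).
PAM4_LEVELS = {(0, 0): 0, (0, 1): 1, (1, 1): 2, (1, 0): 3}


def nrz_encode(bits):
    """NRZ: one symbol (two levels) per bit."""
    return list(bits)


def pam4_encode(bits):
    """PAM4: one symbol (four levels) per pair of bits."""
    if len(bits) % 2 != 0:
        raise ValueError("PAM4 needs an even number of bits")
    return [PAM4_LEVELS[(bits[i], bits[i + 1])] for i in range(0, len(bits), 2)]


bits = [0, 0, 0, 1, 1, 1, 1, 0]
print("NRZ symbols: ", nrz_encode(bits))   # 8 symbols for 8 bits
print("PAM4 symbols:", pam4_encode(bits))  # 4 symbols for 8 bits
```

The same eight bits need eight clock periods with NRZ but only four with PAM4, which is the doubling described above.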

 

In the case of 50G-PAM4, the clock speed stays at roughly 25GHz, as in 25G-NRZ, but each clock period carries two data bits, which doubles the data rate while keeping costs relatively low. As electronics have advanced, 100G-PAM4 has emerged with a 50GHz clock and two data bits per clock period. The industry is expected to move next to 200G-PAM4, with a 100GHz clock and two data bits per clock period, which will raise data rates further.
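The relationship between clock speed, number of voltage levels, and per-channel data rate can be written as rate = clock x log2(levels). The sketch below uses the nominal round numbers from this article (25, 50, and 100GHz); real links run slightly faster to carry encoding overhead, which is ignored here. The function name `lane_rate_gbps` is illustrative.

```python
import math


def lane_rate_gbps(clock_ghz, levels):
    """Nominal per-channel data rate: symbols per second times bits per symbol."""
    bits_per_symbol = math.log2(levels)  # 1 bit for NRZ (2 levels), 2 for PAM4
    return clock_ghz * bits_per_symbol


print("25G-NRZ:  ", lane_rate_gbps(25, 2))
print("50G-PAM4: ", lane_rate_gbps(25, 4))
print("100G-PAM4:", lane_rate_gbps(50, 4))
print("200G-PAM4:", lane_rate_gbps(100, 4))
```

With the same 25GHz clock, moving from two levels to four takes the channel from 25Gb/s to 50Gb/s; further gains come from raising the clock.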

 

  • 100G-PAM4 modulation is used for 400Gb/s NDR InfiniBand and Spectrum-4 400Gb/s Ethernet systems:

      • 800Gb/s = 8x100G-PAM4
      • 400Gb/s = 4x100G-PAM4
      • 200Gb/s = 2x100G-PAM4

  • 50G-PAM4 modulation is used for 400G and 200G Spectrum-2/Spectrum-3 Ethernet and HDR InfiniBand:

      • 400Gb/s = 8x50G-PAM4, used with QSFP-DD devices for systems supporting only Spectrum-3 Ethernet
      • 200Gb/s = 4x50G-PAM4, used for 200GbE and HDR InfiniBand

  • 25G-NRZ modulation is used for 25G/100G Spectrum/Spectrum-2 Ethernet systems and 100Gb/s EDR InfiniBand:

      • 100Gb/s = 4x25G-NRZ
      • 50Gb/s = 2x25G-NRZ
      • 25Gb/s = 1x25G-NRZ

 

[Figure: NRZ vs. PAM4]

2. InfiniBand and Ethernet Protocol Support

NVIDIA is unique in its ability to provide both InfiniBand and Ethernet networking technologies simultaneously. These two technologies share similar electrical and optical physical characteristics, allowing NVIDIA to integrate support for both InfiniBand and Ethernet protocols in the firmware of its network adapters, DPUs, cables, and transceivers. This means that when these cables and transceivers are inserted into switches that support specific Ethernet or InfiniBand protocols, the corresponding protocol is activated.

 

NVIDIA's dual-protocol capability allows customers to use adapters and network interconnect devices more flexibly, migrating or combining protocols across systems according to their computing needs. For example, users of the DGX-H100 8-GPU system can use InfiniBand for GPU-to-GPU communication while using InfiniBand, Ethernet, or both for storage networking and other cluster communication. This capability applies to the 100G-PAM4 product line, but there may be limitations for older 50G-PAM4 and 25G-NRZ products, as some of those devices are designed specifically for either InfiniBand or Ethernet.

 

The 100G-PAM4 LinkX cables and transceivers, ConnectX-7 adapters, and BlueField-3 DPU all support both InfiniBand and Ethernet protocols on the same device and use the same part numbers. The protocol is determined when these network adapters and interconnect devices are inserted into Quantum-2 NDR InfiniBand or Spectrum-4 Ethernet switches.

 

However, NVIDIA's dual-protocol capability does not apply uniformly across all speed generations (800GbE/400GbE and NDR, 200GbE and HDR, 100GbE and EDR). For example, 400GbE QSFP-DD switches based on 8x50G-PAM4 support only Ethernet because InfiniBand does not use the QSFP-DD form factor. Most 200Gb/s cables and transceivers based on 4x50G-PAM4 are dual-protocol products, but some parts are designed specifically for either Ethernet or InfiniBand. The 100Gb/s cables and transceivers based on 4x25G-NRZ typically come with mixed protocol support, and InfiniBand EDR supports only the 100Gb/s data rate. The SFP28 interface is also not typically used for InfiniBand.

 

3. Connector Cages and Plugs

Electronic components, optical devices, and copper wires are assembled into a special metal plug called a form-factor connector. Each connector has a corresponding cage that is installed on the front panel of network switches and the top of network adapters. These metal plugs have various designations, all derived from the original single-channel small form-factor pluggable (SFP) design.

 

The SFP designation is extended with additional letters to indicate the channel count. The prefix "Q" denotes four channels (QSFP), the suffix "-DD" denotes double density (QSFP-DD, eight channels), and the "O" in OSFP (octal) denotes eight channels. NVIDIA has designed an eight-channel transceiver called dual-port OSFP, which has eight electrical channels and two four-channel optical ports. The number in a connector name, such as QSFP56 or QSFP112, denotes the maximum speed in Gb/s that each channel can support, with some headroom to accommodate electromagnetic interference (EMI) noise.

 

Although the connectors are rated for these maximum speeds, the actual operating speeds of InfiniBand and Ethernet are lower. For example, a QSFP56 connector operates at 50Gb/s per channel, and a QSFP112 connector operates at 100Gb/s per channel.
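The naming convention can be illustrated with a small parser. This is a sketch under stated assumptions: it handles only SFP and QSFP names with a numeric suffix (for example SFP28, QSFP56, QSFP112), and the table that maps the rating in the name to the operating rate per channel is an assumption introduced here for illustration. The names `CHANNELS`, `OPERATING_GBPS`, and `describe_connector` are hypothetical.

```python
import re

# Channels per form factor, as described in the text.
CHANNELS = {"SFP": 1, "QSFP": 4}

# Assumed mapping: rating in the connector name -> operating Gb/s per channel.
OPERATING_GBPS = {28: 25, 56: 50, 112: 100}


def describe_connector(name):
    """Split a name such as 'QSFP112' into form factor, channels, and rates."""
    match = re.fullmatch(r"(Q?SFP)(\d+)", name)
    if match is None:
        raise ValueError(f"unrecognized connector name: {name!r}")
    form_factor = match.group(1)
    rating = int(match.group(2))
    if rating not in OPERATING_GBPS:
        raise ValueError(f"no operating rate known for rating {rating}")
    channels = CHANNELS[form_factor]
    per_channel = OPERATING_GBPS[rating]
    return {
        "form_factor": form_factor,
        "channels": channels,
        "rated_gbps_per_channel": rating,
        "operating_gbps_per_channel": per_channel,
        "total_gbps": channels * per_channel,
    }


for name in ("SFP28", "QSFP56", "QSFP112"):
    print(name, describe_connector(name))
```

For QSFP112, the sketch reports four channels at 100Gb/s each, which gives the 400Gb/s total discussed earlier.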

 

[Figure: LinkX 100G, LinkX 50G, and LinkX 25G (InfiniBand and Ethernet)]

 

4. Optical Connectors

An optical connector is a device used to connect the ends of optical fibers to transceivers. A straight-through cable has one optical connector on each end, while a splitter cable has one connector on the combined end and one on each branch (three in total for a 1:2 splitter, five for a 1:4 splitter).

 

There are four commonly used optical connectors in data centers.

Connector     Fibers   Modulation
MPO-12/APC    8        50G-PAM4, 100G-PAM4
LC duplex     2        25G-NRZ, 50G-PAM4
MPO-12/UPC    8        25G-NRZ, 50G-PAM4
MPO-16/APC    16       50G-PAM4

[Figure: MPO connector]

 

5. Single-Mode and Multi-Mode Fibers (SMF & MMF)

An optical fiber is a strand of glass whose inner core is surrounded by an outer glass cladding with a lower refractive index. Light traveling along the core is reflected back when it meets the cladding boundary, which keeps it confined along the length of the fiber.

 

There are two types of fiber optic cables: single-mode and multimode.

 

Single-Mode Fiber:

 

Single-mode fiber has a tiny 9-μm optical transmission core. The core is small enough that light bends at a very shallow angle inside, keeping the data pulse photons as a group or "single mode" together, allowing them to propagate over long distances.

 

  • It is typically used for distances ranging from 50 meters to 40 kilometers but can also be used for shorter lengths of 1 meter.

 

  • Single-mode fiber used in data centers is most transparent at a wavelength of 1310 nm.

 

  • The cable jacket for single-mode fiber is usually yellow, and the labels on transceivers are also yellow.

 

Multi-Mode Fiber:

 

Multimode fiber has a larger 50-μm optical transmission core, and some of the photons in a single data pulse enter the fiber at steeper angles. This causes the light pulse to spread out into multiple paths or "modes" inside the fiber, with parts of the pulse taking longer to reach the photodetector end. The pulse loses intensity and spreads out in time until it overlaps the next data pulse, which limits the maximum transmission distance.

 

  • Multimode fiber is most transparent at a wavelength of 850 nm.

 

  • Multimode fiber and transceivers based on 850 nm cannot be used together with single-mode fiber and transceivers based on 1310 nm.

 

  • The cable jacket for multimode fiber is usually green, and the labels on transceivers are brown.

 

Fiber types are optimized for specific modulation rates:

 

  • For 25G-NRZ and 50G-PAM4: OM4 fiber type achieves a distance of up to 100 meters, while OM3 can reach up to 70 meters.

 

  • For 100G-PAM4: OM4 fiber type achieves a distance of up to 50 meters, while OM3 can reach up to 30 meters.

 

  • NVIDIA's 100G-PAM4 series fiber optic cables are limited to multimode straight-through fiber up to 50 meters and single-mode straight-through and multimode 1:2 splitter fiber up to 100 meters.

 

  • NVIDIA's fiber optic cables are limited to MPO-12/APC and use green connector housings, suitable for both single-mode and multimode optics.
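The multimode reach figures above lend themselves to a simple lookup when planning a link. The sketch below encodes only the OM3 and OM4 distances stated in this section; the names `REACH_M`, `max_reach_m`, and `link_fits` are illustrative, and actual reach depends on the specific transceiver and cable, so vendor datasheets take precedence.

```python
# Maximum multimode reach in meters, taken from the figures in this section.
REACH_M = {
    ("25G-NRZ", "OM3"): 70,
    ("25G-NRZ", "OM4"): 100,
    ("50G-PAM4", "OM3"): 70,
    ("50G-PAM4", "OM4"): 100,
    ("100G-PAM4", "OM3"): 30,
    ("100G-PAM4", "OM4"): 50,
}


def max_reach_m(modulation, fiber_type):
    """Look up the stated maximum reach for a modulation and fiber type."""
    try:
        return REACH_M[(modulation, fiber_type)]
    except KeyError:
        raise ValueError(f"no reach figure for {modulation} over {fiber_type}")


def link_fits(modulation, fiber_type, length_m):
    """True if a link of length_m is within the stated reach."""
    return length_m <= max_reach_m(modulation, fiber_type)


print(link_fits("100G-PAM4", "OM4", 40))  # within the 50 m limit
print(link_fits("100G-PAM4", "OM3", 40))  # exceeds the 30 m limit
```

A 40-meter run therefore works at 100G-PAM4 over OM4 but not over OM3, while the same run is within reach for 50G-PAM4 on either fiber type.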

 

Large-diameter multimode fiber is cost-effective and can be used for connecting lasers and photodetectors, but it is limited to 50 meters at 100G-PAM4 speeds. Single-mode fiber interfaces are more expensive but can extend beyond 2 kilometers, spanning across the entire data center. Multimode optics are the most commonly used optical devices as most data center interconnect distances are less than 50 meters.

 

NADDOD - Revolutionary Network Connectivity Solutions to Boost Your Data Transfer Speed

 

NADDOD is a leading provider of network connectivity solutions dedicated to offering innovative products that enhance data transfer speed and network performance. Our cutting-edge technology and high-quality products provide a significant competitive advantage for your business.

 

  1. High-Speed Data Transfer: NADDOD's products leverage state-of-the-art modulation rate technology, enabling exceptional data transfer speeds. Our solutions support various modulation rates, including NRZ and PAM4 modulation, ensuring your data is transmitted at the fastest speeds possible.

  2. Protocol Compatibility: NADDOD's products are compatible with multiple communication protocols, such as InfiniBand and Ethernet. Regardless of your network environment, our products deliver reliable connections, ensuring fast and stable data transmission.

  3. Flexible Connectivity Options: Our products offer a range of connectivity options to meet diverse networking needs. Our connector cages and plugs are designed for flexibility, adapting to various wiring environments and devices. Whether you're in a data center, office, or any other location, NADDOD provides the optimal connectivity solution.

  4. High-Quality Optical Fibers: NADDOD utilizes high-quality optical fibers, including single-mode and multimode fibers, to ensure stable and reliable data transmission. Our fiber products undergo rigorous testing and quality control to deliver excellent performance over long distances.

  5. Customized Solutions: We understand that each customer's needs are unique. Therefore, we provide customizable solutions to meet your specific requirements. Whether you need specific connector types, customized fiber lengths, or other tailored demands, NADDOD offers the best solution for you.

 

By choosing NADDOD, you gain access to top-notch network connectivity solutions that enhance data transfer speed, improve network performance, and elevate your business competitiveness. Contact our sales team to learn more about our products and how to customize a solution that fits your needs. Join NADDOD and embrace a new era of high-speed and efficient network connectivity!

 
