
InfiniBand: Unlocking the Power of HPC Networking

NADDOD Abel InfiniBand Expert May 19, 2023

NVIDIA/Mellanox Quantum InfiniBand is an In-Network Computing platform with offload capabilities that provides high-performance computing (HPC), AI, and hyperscale cloud infrastructures with unmatched performance. InfiniBand is designed mainly for server-side connections: communication between servers, storage devices, and networks. Promoted by the InfiniBand Trade Association, it has become the most common interconnect technology in the TOP500 list, with 44.4% of systems using InfiniBand for interconnection and 40.4% using Ethernet. So what is the difference between InfiniBand and Ethernet?

What is the difference between InfiniBand and Ethernet?

What is InfiniBand?

InfiniBand, developed by the InfiniBand Trade Association (IBTA), is an industry-standard communications specification that defines a switched fabric architecture for interconnecting servers, communications infrastructure equipment, storage, and embedded systems in the data center.

Known for its high bandwidth and low latency, InfiniBand delivers 56Gb/s (FDR), 100Gb/s (EDR), 200Gb/s (HDR), and 400Gb/s (NDR, with 800Gb/s twin-port options) of throughput over a 4x link width connection, and faster speeds are on the horizon. An IB network uses Mellanox IB network adapters, specialized IB switches, and the UFM controller software for network communication and management.
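
To make the relationship between link width and port speed concrete, here is a small Python sketch (illustrative only; the per-lane figures are rounded nominal data rates, not exact signaling rates, which include encoding overhead):

```python
# Illustrative sketch: InfiniBand port speeds are the product of the link width
# (number of lanes, typically 4x) and the nominal per-lane data rate.
# The per-lane figures below are rounded marketing rates, not exact line rates.

NOMINAL_LANE_RATE_GBPS = {
    "SDR": 2.5, "DDR": 5, "QDR": 10, "FDR": 14,
    "EDR": 25, "HDR": 50, "NDR": 100,
}

def port_speed_gbps(generation: str, lanes: int = 4) -> float:
    """Nominal throughput of an InfiniBand port with the given link width."""
    return NOMINAL_LANE_RATE_GBPS[generation] * lanes

for gen in NOMINAL_LANE_RATE_GBPS:
    print(f"{gen}: 4x link ≈ {port_speed_gbps(gen):g} Gb/s")
# An NDR twin-port OSFP cage carries two 4x ports, i.e. 2 x 400 = 800 Gb/s per cage.
```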

In addition, InfiniBand is highly scalable and can support tens of thousands of nodes in a single subnet, making it an ideal choice for high-performance computing (HPC) environments. With quality of service (QoS) and failover capabilities, InfiniBand is also used as one of the network fabrics for the NVM Express over Fabrics (NVMe-oF) storage protocol, which also runs over Ethernet, Fibre Channel (FC), and TCP/IP networks. Choose InfiniBand for your data center needs and experience unparalleled performance and scalability.

What is Ethernet?

Ethernet is a LAN specification standard originally created by Xerox, Intel, and DEC, and it remains the most widely adopted communication protocol for LANs, transmitting data over cables. Developed by Xerox in the 1970s, Ethernet is a wired communication technology that links devices in a LAN or WAN. It connects a wide range of devices, including printers and laptops, across buildings, residences, and small communities. It is simple to deploy: switches, routers, and PCs can be combined into a LAN with nothing more than a router and Ethernet cabling. Ethernet devices are backward compatible with slower ones, but link speed is limited by the weakest component.

Although wireless networks have replaced Ethernet in many locations, Ethernet remains prevalent for wired networking as it’s more reliable and less susceptible to interference. Ethernet has undergone several revisions and is continually redesigned to expand and evolve. Today, it’s one of the most widely used network technologies worldwide.

At present, the IEEE 802.3 standards body has issued Ethernet interface standards for 100GE, 200GE, 400GE, and 800GE.

InfiniBand vs Ethernet

InfiniBand was designed to address the bottleneck of cluster data transmission in high-performance computing scenarios and has since become a widely used interconnection standard that meets modern requirements. The main differences between InfiniBand and Ethernet lie in their bandwidth, latency, network reliability, and networking methods.

Bandwidth

In terms of bandwidth, InfiniBand has developed faster than Ethernet, driven by its use in high-performance computing and its need to reduce CPU load. Ethernet, on the other hand, is oriented more toward terminal device interconnection and does not have the same high bandwidth demands. The average bandwidth of NADDOD's InfiniBand products can reach 197Gb/s.

Latency

When it comes to network latency, InfiniBand and Ethernet differ greatly in their processing flows. InfiniBand switches use cut-through forwarding to reduce forwarding delay to less than 100 ns, while Ethernet switches have longer processing pipelines due to complex services such as IP, MPLS, and QinQ. NADDOD's InfiniBand products achieve latency below 97ns with power consumption below 4.5W.
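
As a rough back-of-the-envelope illustration of why cut-through matters, the Python sketch below compares the serialization delay a store-and-forward switch must absorb with the much smaller header-only delay a cut-through switch waits for; the packet size, header size, and line rate are assumed values, not measurements of any particular switch:

```python
# Back-of-the-envelope sketch (illustrative numbers): a store-and-forward switch
# must receive the whole frame before forwarding it, so it pays the full
# serialization delay; a cut-through switch starts forwarding once the header
# has been parsed.

def serialization_delay_ns(num_bytes: int, line_rate_gbps: float) -> float:
    """Time to clock num_bytes onto a link at the given line rate, in ns."""
    return num_bytes * 8 / line_rate_gbps  # bits / (Gb/s) -> ns

LINE_RATE_GBPS = 200        # HDR port (assumed)
MTU_BYTES = 4096            # common InfiniBand MTU (assumed)
HEADER_BYTES = 64           # assumed amount needed before a forwarding decision

store_and_forward = serialization_delay_ns(MTU_BYTES, LINE_RATE_GBPS)
cut_through = serialization_delay_ns(HEADER_BYTES, LINE_RATE_GBPS)

print(f"store-and-forward penalty per hop: ~{store_and_forward:.0f} ns")  # ~164 ns
print(f"cut-through decision point:        ~{cut_through:.1f} ns")        # ~2.6 ns
```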

Reliability

Network reliability is crucial for high-performance computing, and InfiniBand's formats defined from layer 1 to layer 4 ensure a lossless network through end-to-end flow control. Ethernet, by contrast, lacks a scheduling-based flow control mechanism and requires a larger chip area to buffer messages temporarily, resulting in higher cost and power consumption.
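
The sketch below is a minimal, purely illustrative Python model of credit-based link-level flow control, the kind of mechanism that keeps an InfiniBand link lossless: the sender transmits only while the receiver has advertised free buffer credits, so congestion produces back-pressure rather than packet drops. The class and buffer sizes are invented for illustration:

```python
from collections import deque

class CreditedLink:
    """Toy model of a lossless, credit-based link: the sender may only transmit
    while the receiver has advertised free buffer space (credits)."""

    def __init__(self, rx_buffer_slots: int):
        self.credits = rx_buffer_slots      # credits advertised by the receiver
        self.rx_queue = deque()

    def try_send(self, packet) -> bool:
        if self.credits == 0:
            return False                    # back-pressure: wait, never drop
        self.credits -= 1
        self.rx_queue.append(packet)
        return True

    def receiver_drain(self) -> None:
        """Receiver consumes one packet and returns a credit to the sender."""
        if self.rx_queue:
            self.rx_queue.popleft()
            self.credits += 1

link = CreditedLink(rx_buffer_slots=2)
sent = [link.try_send(f"pkt{i}") for i in range(4)]
print(sent)                   # [True, True, False, False]: the sender stalls, nothing is lost
link.receiver_drain()
print(link.try_send("pkt4"))  # True again once a credit is returned
```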

Network Deploying Method

Finally, InfiniBand's networking mode is simpler to manage than Ethernet's, because the idea of SDN is built into its design: a subnet manager on each InfiniBand layer-2 network configures the nodes and calculates forwarding path information, whereas Ethernet requires MAC entries and the IP and ARP protocols. Furthermore, Ethernet must send packets regularly to refresh its entries and uses the VLAN mechanism to divide virtual networks and limit their scale, which increases complexity and can produce loops that require additional protocols such as STP.
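
The following Python sketch is loosely analogous to what a subnet manager (for example OpenSM with min-hop routing) does: from its global view of the topology it computes shortest-path forwarding entries for every switch and programs them into the fabric, with no MAC learning, ARP, or STP involved. The topology and data structures here are invented for illustration:

```python
from collections import deque

# Invented toy topology: switches S1/S2 and hosts H1..H4 (adjacency list).
TOPOLOGY = {
    "S1": ["S2", "H1", "H2"],
    "S2": ["S1", "H3", "H4"],
    "H1": ["S1"], "H2": ["S1"],
    "H3": ["S2"], "H4": ["S2"],
}

def min_hop_next_hops(switch: str) -> dict:
    """BFS from one switch: for every destination, record the neighbor
    (output port) that lies on a shortest path. Roughly what a subnet
    manager computes centrally for the whole fabric."""
    table, visited, queue = {}, {switch}, deque()
    for neighbor in TOPOLOGY[switch]:
        queue.append((neighbor, neighbor))   # (current node, first hop used)
        visited.add(neighbor)
        table[neighbor] = neighbor
    while queue:
        node, first_hop = queue.popleft()
        for nxt in TOPOLOGY[node]:
            if nxt not in visited:
                visited.add(nxt)
                table[nxt] = first_hop
                queue.append((nxt, first_hop))
    return table

for sw in ("S1", "S2"):
    print(sw, min_hop_next_hops(sw))
# S1 {'S2': 'S2', 'H1': 'H1', 'H2': 'H2', 'H3': 'S2', 'H4': 'S2'}
# S2 {'S1': 'S1', 'H3': 'H3', 'H4': 'H4', 'H1': 'S1', 'H2': 'S1'}
```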

Exploring InfiniBand Products

InfiniBand Switches & NICs

Based on the comparison between InfiniBand and Ethernet above, it is clear that InfiniBand networks possess notable advantages. For those interested in deploying InfiniBand switches in a high-performance data center, additional information is available. The InfiniBand network has undergone rapid iteration, progressing from SDR 10Gbps, DDR 20Gbps, QDR 40Gbps, FDR 56Gbps, and EDR 100Gbps to today's HDR 200Gbps and NDR 400Gbps/800Gbps InfiniBand, all made possible by RDMA (remote direct memory access) technology.
Figure: Traditional mode vs RDMA mode
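
For a taste of what RDMA looks like from software, here is a minimal sketch using the pyverbs bindings that ship with rdma-core (the Python wrapper around libibverbs). It only opens an adapter and registers a buffer, which is the step that allows the NIC to move data directly to and from application memory without CPU copies; it assumes an RDMA-capable HCA (for example an mlx5 device) is present, and exact class or flag names may vary between rdma-core versions:

```python
# Minimal pyverbs sketch (assumes rdma-core's Python bindings and an RDMA-capable
# NIC such as an mlx5 device; names/flags may differ slightly between versions).
import pyverbs.device as d
import pyverbs.enums as e
from pyverbs.pd import PD
from pyverbs.mr import MR

# Enumerate RDMA devices and open the first one.
devices = d.get_device_list()
if not devices:
    raise SystemExit("no RDMA devices found")
ctx = d.Context(name=devices[0].name.decode())

# A protection domain groups resources; a memory region pins and registers a
# buffer with the HCA so the hardware can DMA into it directly (zero-copy).
pd = PD(ctx)
mr = MR(pd, 4096,
        e.IBV_ACCESS_LOCAL_WRITE | e.IBV_ACCESS_REMOTE_READ | e.IBV_ACCESS_REMOTE_WRITE)

print(f"registered 4 KiB buffer, lkey={mr.lkey}, rkey={mr.rkey}")
# A real transfer would additionally create completion queues and queue pairs,
# exchange QP/rkey information out of band, and post RDMA read/write work requests.
```
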
NADDOD provides NVIDIA/Mellanox Quantum-2 QM9700 NDR series InfiniBand switches, NVIDIA/Mellanox Quantum QM8700 HDR 200G series InfiniBand switches, and NVIDIA/Mellanox MSB7800 series EDR 100G InfiniBand switches. The QM8700 HDR series, for example, delivers port-to-port latency below 130ns, 16Tb/s of aggregate switching throughput, 40 HDR 200Gb/s ports (or 80 HDR100 100Gb/s ports), and 274W power consumption. These switches use NVIDIA Quantum™ IC chips and superior process materials to enable higher performance and more reliable networks.


NVIDIA Quantum InfiniBand Switches

NVIDIA Quantum-2 QM9700 NVIDIA Quantum QM8700 NVIDIA SB7800 NVIDIA ConnectX-6 NICs
MQM9700-NS2F MQM8700-HS2F MSB7800-ES2F MCX653105A-HDAT
MQM9700-NS2R MQM8700-HS2R MSB7800-ES2R MCX653106A-HDAT
MQM9790-NS2F MQM8790-HS2F MSB7890-ES2F MCX653106A-HCAT
MQM9790-NS2R MQM8790-HS2R MSB7890-ES2R MCX654105A-HCAT

InfiniBand Cables

LinkX optics offer a wide range of products in the QSFP form factor, covering quad data rate (QDR) and FDR10 (40G), FDR (56G), and EDR (100G) at reaches up to 100m, as well as HDR (200G) and NDR (400G) at reaches up to 150m. LinkX AOC cables are designed for use in supercomputers, whose requirements are more stringent than Ethernet AOC industry standards.

NADDOD end-to-end Ethernet and InfiniBand intelligent interconnect products fully meet the technical and performance requirements of the original NVIDIA/Mellanox LinkX cables, providing the highest throughput and lowest latency, delivering data faster to applications and unlocking system performance.

InfiniBand AOCs

Product (Speed/Type) PN Length
800G InfiniBand NDR OSFP ACC MCA4J80-N003 3m, 4m, 5m
800G InfiniBand NDR OSFP ACC Splitter MCA7J65-N004 4m, 5m
800G InfiniBand NDR OSFP ACC Splitter MCA7J75-N004 4m, 5m
800G InfiniBand NDR OSFP ACC Splitter MCA7J60-N004 4m, 5m
800G InfiniBand NDR OSFP ACC Splitter MCA7J70-N004 4m, 5m
200G HDR QSFP56 to QSFP56 AOC MFS1S00-H003E/MFS1S00-H003V 3m, 5m, 7m, 10m, 15m, 20m, 25m, 30m, 35m, 40m, 50m, 60m, 70m, 80m, 90m, 100m, 130m, 150m
200G HDR QSFP56 to 2xQSFP56 AOC MFS1S50-H001E/MFS1S50-H001V 1m, 3m, 5m, 7m, 10m, 15m, 20m, 30m
100G EDR QSFP28 to QSFP28 AOC MFA1A00-E001 1m, 1.5m, 2m, 3m, 4m, 5m, 7m, 10m, 15m, 20m, 30m, 50m, 100m
56G QSFP+ to QSFP+ AOC MC220731V-001 1m, 3m, 5m, 7m, 10m, 15m, 20m, 25m, 30m, 50m, 100m

InfiniBand DACs

Product (Speed/Type) PN Length
800G InfiniBand NDR OSFP DAC MCP4Y10-N001 0.5m, 1m, 1.5m, 2m
800G InfiniBand NDR OSFP DAC Splitter MCP7Y10-N001 1m, 1.5m, 2m, 2.5m, 3m
800G InfiniBand NDR OSFP DAC Splitter MCP7Y40-N001 1m, 1.5m, 2m, 2.5m, 3m
800G InfiniBand NDR OSFP DAC Splitter MCP7Y50-N001 1m, 1.5m, 2m, 2.5m, 3m
800G InfiniBand NDR OSFP DAC Splitter MCP7Y00-N001 1m, 1.5m, 2m, 2.5m, 3m
200G HDR QSFP56 to QSFP56 DAC MCP1650-H001E30 0.5m, 1m, 1.5m, 2m, 3m, 4m
200G HDR QSFP56 to 2xQSFP56 DAC MCP7H50-H001R30 1m, 1.5m, 2m, 3m, 4m
100G EDR QSFP28 to QSFP28 DAC MCP1600-E001E30 0.5m, 1m, 1.5m, 2m, 3m, 4m, 5m
56G QSFP+ to QSFP+ DAC MC2207130-001 0.5m, 1m, 1.5m, 2m, 3m, 4m, 5m

The power consumption of typical third-party IB optics is around 6W, while NADDOD's can be as low as 4.5W. In addition, for more than 98% of vendors of Mellanox-compatible InfiniBand cables, it is almost impossible to achieve both low latency and low power. NADDOD InfiniBand cables, however, combine low latency and low power consumption with ultra-low BER and high performance. They adapt perfectly to NVIDIA/Mellanox switch and NIC products, providing optimal transmission efficiency in supercomputers and hyperscale systems with stringent requirements.

With NVIDIA/Mellanox InfiniBand NDR/HDR/EDR switches and NICs across all series, the NADDOD Test Center fully recreates the original environment and tests every part to guarantee performance and ensure 100% compatibility of our InfiniBand networking products.

InfiniBand Transceivers

Product (Speed/Type) PN Description
800G InfiniBand NDR OSFP Transceiver MMA4Z00-NS NVIDIA/Mellanox MMA4Z00-NS twin port 800Gb/s NDR OSFP 2xMPO12 APC 850nm 50m Finned-top MMF Transceiver Module
800G InfiniBand NDR OSFP Transceiver MMS4X00-NL NVIDIA/Mellanox MMS4X00-NL twin port 800Gb/s 2xNDR OSFP DR8 2xMPO 1310nm up to 30m Transceiver for SMF
800G InfiniBand NDR OSFP Transceiver MMS4X00-NS NVIDIA/Mellanox MMS4X00-NS twin port 800Gb/s 2x 400Gb/s OSFP DR8 2xMPO 1310nm up to 100m Transceiver for SMF
800G InfiniBand NDR OSFP Transceiver MMS4X00-NM NVIDIA/Mellanox MMS4X00-NM twin port 800Gb/s 2xNDR 2xDR4 finned-top OSFP 2xMPO12 APC 1310nm up to 500m Transceiver for SMF
400G InfiniBand NDR OSFP Transceivers MMS4X00-NS400 NVIDIA/Mellanox MMS4X00-NS400 single port 400Gb/s OSFP MPO DR4 1310nm up to 150m Transceiver for SMF
400G InfiniBand NDR OSFP Transceivers MMA1Z00-NS400 NVIDIA/Mellanox MMA1Z00-NS400 single port 400Gb/s OSFP112 MPO SR4 850nm up to 30m Flat Top Transceiver for MMF
400G InfiniBand NDR OSFP Transceivers MMS4X00-NL400 NVIDIA/Mellanox MMS4X00-NL400 single port 400Gbps NDR OSFP MPO12 APC DR4 1310nm up to 30m Flat Top Transceiver for SMF
400G InfiniBand NDR OSFP Transceivers MMA4Z00-NS400 NVIDIA/Mellanox Single Port 400Gb/s NDR OSFP SR4 MPO12 APC 850nm 50m Flat Top MMF Transceiver Module
200G QSFP56 Optical Transceivers MMS1W50-HM Mellanox MMS1W50-HM Compatible 200GBASE-FR4 QSFP56 1310nm 2km with FEC CWDM4 PAM4 DOM LC InfiniBand HDR Optical Transceiver Module for SMF
200G QSFP56 Optical Transceivers MMA1T00-HS Mellanox MMA1T00-HS Compatible 200G SR4 QSFP56 PAM4 850nm 100m DOM MTP/MPO-12 InfiniBand HDR Transceiver Module for MMF
100G QSFP28 Optical Transceivers MMA1B00-E100 Mellanox MMA1B00-E100 Compatible 100GBASE-SR4H QSFP28 850nm 70m (OM3)/100m (OM4) DOM MPO/MTP-12 InfiniBand EDR Transceiver Module for MMF, MCX556A-ECAT, MCX653106A-HDAT
100G QSFP28 Optical Transceivers MMS1C10-CM Mellanox Compatible MMS1C10-CM 100GBase-PSM4 QSFP28 1310nm 500m DOM MPO/MTP-12 InfiniBand EDR Transceiver Module for SMF, MCX556A-ECAT, MCX653106A-HDAT
100G QSFP28 Optical Transceivers MMA1L10-CR Mellanox MMA1L10-CR Compatible 100Gb/s QSFP28 LR4 1310nm 10km DOM Duplex LC InfiniBand EDR Optical Transceiver Module for SMF, MCX556A-ECAT, MCX653106A-HDAT
100G QSFP28 Optical Transceivers MMA1L30-CM Mellanox MMA1L30-CM Compatible 100GBASE-CWDM4H QSFP28 1310nm 2km DOM Duplex LC InfiniBand EDR Transceiver Module for SMF, MCX556A-ECAT, MCX653106A-HDAT
100G QSFP28 Optical Transceivers QSFP28-ER4-100G Mellanox QSFP28-ER4-100G Compatible 100GBASE-ER4H QSFP28 1310nm 30km (no FEC) 40km(FEC) DOM Duplex LC InfiniBand EDR Transceiver Module for SMF, MCX556A-ECAT, MCX653106A-HDAT
100G QSFP28 Optical Transceivers QSFP-100G-ZR4 Mellanox QSFP-100G-ZR4 Compatible 100GBASE-ZR4H QSFP28 1310nm 80km (FEC) DOM Duplex LC InfiniBand EDR Transceiver Module for SMF, MCX556A-ECAT, MCX653106A-HDAT
56G InfiniBand FDR QSFP+ Transceivers MMA1B00-F030D NVIDIA MMA1B00-F030D Optical Transceiver FDR QSFP+ VCSEL-Based LSZH MPO 850nm SR4 up to 30m DDMI

InfiniBand Standards

InfiniBand HDR (High Data Rate)

The InfiniBand HDR connectivity product line provides 200Gb/s QSFP56 IB HDR MMF active optical cables (AOCs), active optical splitter cables, passive direct attach copper cables (DACs), passive copper hybrid cables, and fiber optic transceivers. 200Gb/s QSFP56 InfiniBand HDR cables and transceivers are commonly used throughout the InfiniBand network infrastructure to connect top-of-rack switches (QM8700/QM8790, etc.) to NVIDIA GPU (A100/H100/A30, etc.) and CPU server and storage network adapters (ConnectX-5/6/7 VPI, etc.), as well as in switch-to-switch applications, saving up to 50% for GPU-accelerated high-performance computing (HPC) cluster applications such as model rendering, artificial intelligence (AI), deep learning (DL), and NVIDIA Omniverse in InfiniBand HDR networks.

InfiniBand EDR (Enhanced Data Rate)

The InfiniBand EDR line contains InfiniBand 100Gbase QSFP28 to QSFP28 EDR AOCs, EDR DACs, and EDR transceivers, perfectly compatible with Mellanox EDR 100Gb switch SB7800/SB7890 connectivity, saving up to 50% on GPU-accelerated computing for high-performance connectivity when running HPC, cloud, model rendering, storage, and NVIDIA Omniverse applications in InfiniBand 100Gb networks.

InfiniBand NDR (Next-Generation Data Rate)

The InfiniBand NDR line contains InfiniBand 400Gbase/800Gbase OSFP AOCs, DACs, and transceivers, perfectly compatible with Mellanox NDR 400Gb/800Gb switch MQM9700/MQM9790 series connectivity, saving up to 50% on GPU-accelerated computing for high-performance connectivity when running HPC, cloud, model rendering, storage, and NVIDIA Omniverse applications in InfiniBand 400Gb/800Gb networks.

InfiniBand FDR (Fourteen Data Rate)

The InfiniBand FDR line contains InfiniBand 56Gbase QSFP+ to QSFP+ FDR AOCs, FDR DACs, and transceivers, perfectly compatible with Mellanox FDR switch connectivity, saving up to 50% on GPU-accelerated computing for high-performance connectivity when running HPC, cloud, model rendering, artificial intelligence, and NVIDIA Omniverse applications in InfiniBand 56Gb networks.

Key Benefits of InfiniBand in HPC Networking

The continued evolution of data communication technology, internet technology, and visual presentation is made possible by advancements in computing power, storage capacity, and network efficiency. The InfiniBand network offers high-bandwidth network services and low latency, and it reduces the consumption of computing resources by offloading protocol processing and data movement from the CPU to the interconnect. These unique advantages make InfiniBand an ideal option for HPC data centers, enabling significant performance improvements across a variety of applications, including Web 2.0, cloud computing, big data, financial services, virtualized data centers, and storage applications.

In terms of speed, InfiniBand has kept pace with Ethernet at 100G and now offers 100G/200G up to 400G/800G InfiniBand switches that meet the high-performance requirements of HPC architectures. The high bandwidth, high speed, and low latency of InfiniBand switches enable high server efficiency and application productivity.

Scalability is another key advantage of InfiniBand: a single subnet can support up to 48,000 nodes at network layer 2. Unlike Ethernet, InfiniBand does not rely on broadcast mechanisms such as ARP, which eliminates broadcast storms and avoids wasting additional bandwidth. Additionally, multiple subnets can be interconnected through InfiniBand routers and switches.
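
As a rough, purely illustrative estimate of why broadcast-based address resolution scales poorly in a single flat layer-2 domain (all the numbers below are assumptions, not measurements):

```python
# Rough, illustrative estimate (all numbers are assumptions): in a flat Ethernet
# broadcast domain every ARP request is delivered to every node, so the total
# broadcast load grows with the square of the node count.

def arp_broadcast_load(nodes: int,
                       arp_per_node_per_sec: float = 0.5,
                       frame_bytes: int = 64):
    broadcasts_per_sec = nodes * arp_per_node_per_sec
    deliveries_per_sec = broadcasts_per_sec * nodes   # replicated to every node
    aggregate_gbps = deliveries_per_sec * frame_bytes * 8 / 1e9
    return broadcasts_per_sec, aggregate_gbps

for n in (1_000, 10_000, 48_000):
    bc, gbps = arp_broadcast_load(n)
    print(f"{n:>6} nodes: {bc:>8.0f} broadcasts/s, ~{gbps:7.1f} Gb/s aggregate delivered")
# An InfiniBand subnet sidesteps this: the subnet manager assigns LIDs and answers
# path queries directly, so address resolution never floods the fabric.
```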

NADDOD offers InfiniBand products built with Quantum InfiniBand switch devices that support non-blocking bandwidth of up to 16Tb/s and port-to-port latency below 130ns, providing high availability and multi-service support for HPC data centers. Ethernet networks can also handle data transmission when the workload can be spread across multiple devices, and NADDOD provides Ethernet switches at multiple speeds to assist with network construction.

Conclusion

InfiniBand and Ethernet each have their own suitable application scenarios. The InfiniBand network significantly increases the rate of data transfer, improving network utilization and reducing the CPU resources needed to process network data, which is why it is becoming the main network solution for the high-performance computing industry. Beyond today's 400Gbps NDR generation, 800Gbps XDR switches are also on the roadmap. However, if communication latency between data center nodes is not a high priority and flexible access and expansion matter more, Ethernet networks can remain in service for extended periods of time.

The InfiniBand network’s extreme performance, innovative technical architecture, and simplified high-performance network architecture can help HPC data center users maximize business performance. InfiniBand technology reduces the delay caused by multi-level architecture layers and provides strong support for the smooth upgrade of the access bandwidth of key computing nodes. It is expected that InfiniBand networks will continue to enter more and more usage scenarios as their popularity grows.


Related Products:
200G QSFP56 Optical Transceivers MMA1T00-HS
200G QSFP56 Optical Transceivers MMS1W50-HM
200G HDR QSFP56 to QSFP56 AOC MFS1S00-H003E/MFS1S00-H003V
200G HDR QSFP56 to QSFP56 DAC MCP1650-H001E30


Related Resources:
NVIDIA Quantum-2 InfiniBand NDR 400Gb/s
HPC Case Study: Nebulae Supercomputing Center
200G HDR Optics Products Application Scenarios
Why Is InfiniBand Used in HPC?
InfiniBand NDR: The Future of High-Speed Data Transfer in HPC and AI