RoCE v2 Deployment and Application

NADDOD Neo Switch Specialist Feb 28, 2024

1. RoCE v2 Configuration Steps


The configuration of RoCE v2 involves setting up both the adapters (network interface cards) and the switches.


Note: Before proceeding with the configuration, please ensure that all hardware and drivers support RoCE v2, and that the network infrastructure is ready.


Adapter Configuration:


  1. Install the Adapter: Insert the RoCE v2-compatible adapter into a PCIe slot of the server.


  2. Install Adapter Drivers: Install the RoCE v2 drivers provided by the adapter manufacturer. Use the latest driver version for optimal performance and stability.


  3. Enable RDMA Functionality: Enable RDMA on the server, typically through the operating system's network settings or the adapter configuration tool, and verify that it is active.


  4. Configure IP Address and Subnet Mask: Assign an IP address and subnet mask to the RoCE v2 adapter, either through the operating system's network settings or the adapter management tool.
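On a Linux host with a Mellanox adapter, the adapter-side steps above might look roughly as follows. The interface name (ens1f0) and address are illustrative assumptions, and exact commands depend on your distribution and driver stack:

```shell
# Verify the adapter is visible on the PCIe bus
lspci | grep -i mellanox

# Load the RDMA-capable driver modules (mlx5 family for ConnectX-4 and later)
modprobe mlx5_core
modprobe mlx5_ib

# Confirm an RDMA link is exposed for the Ethernet port
rdma link show

# Assign an IP address and subnet mask to the RoCE interface
# (interface name and address are placeholders for your environment)
ip addr add 192.168.10.11/24 dev ens1f0
ip link set dev ens1f0 up
```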


Switch Configuration:


  1. Choose a RoCE v2-Compatible Switch: Ensure that the network switch supports RoCE v2, and refer to the manufacturer's documentation for detailed information before selecting a model.


  2. Enable PFC (Priority Flow Control): RoCE v2 relies on PFC for lossless transmission of RDMA flows. Enable PFC on the switch and map RoCE traffic to the appropriate priority.


  3. Configure DCB (Data Center Bridging): Configure DCB so that RoCE v2 traffic receives the appropriate bandwidth and priority. Make sure to allocate sufficient bandwidth for RoCE v2 flows to meet performance requirements.


  4. Enable ECN (Explicit Congestion Notification): RoCE v2 uses ECN for congestion control during network congestion. Enable ECN marking on the switch and configure the corresponding congestion response (e.g., DCQCN) on the adapters.
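On a Cisco Nexus-style switch, the PFC/ECN portion of this configuration might be sketched as below. This is an illustrative fragment only: the class names, CoS value (3), and thresholds are assumptions, and exact syntax varies by vendor and software release, so consult your switch documentation.

```
! Classify RoCE traffic by CoS (value 3 is a common but site-specific choice)
class-map type qos match-any ROCEv2
  match cos 3

! Enable lossless behavior (PFC) for that class in the network-qos policy
policy-map type network-qos ROCE-NQ
  class type network-qos c-8q-nq3
    pause pfc-cos 3
    mtu 9216

! Enable ECN marking via WRED on the RoCE queue
policy-map type queuing ROCE-QUEUING
  class type queuing c-out-8q-q3
    random-detect minimum-threshold 150 kbytes maximum-threshold 1500 kbytes ecn

! Turn PFC on at the host-facing interface
interface Ethernet1/1
  priority-flow-control mode on
```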


Network Parameter Settings:

  1. Configure Subnet: RoCE v2 traffic typically operates within a subnet, so ensure that the subnet configuration on the adapters and switches is consistent.


  2. Configure MTU (Maximum Transmission Unit): RoCE v2 often benefits from a larger MTU. Configure the same MTU value on the adapters and switches to maintain consistency across the network links.


  3. Enable IPv6: RoCE v2 can use IPv6 for communication. If your network supports IPv6, ensure that it is enabled on the adapters and switches, and configure the addresses accordingly.


  4. Validate the Connection: After completing the configuration, validate the RoCE v2 connection using RDMA tools or other testing utilities, and confirm that data is transmitted correctly over the RoCE v2 network.
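On Linux hosts, the MTU setting and a basic end-to-end validation might look like the following. The interface name, addresses, and MTU value are placeholders; rping ships with librdmacm-utils and ib_write_bw with the perftest package:

```shell
# Set a jumbo MTU on the RoCE interface (must match the switches and peers)
ip link set dev ens1f0 mtu 9000

# Quick RDMA connectivity check with rping
rping -s -a 192.168.10.11 -v          # on the server
rping -c -a 192.168.10.11 -v -C 10    # on the client, 10 iterations

# Bandwidth sanity test with perftest
ib_write_bw                  # on the server
ib_write_bw 192.168.10.11    # on the client (server address is a placeholder)
```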

[Image: RoCE v2 hardware]

2. Typical Deployment Scenarios


RoCE v2 offers various deployment scenarios to meet the requirements of different applications in different network environments.


1. Data Center Networks:


  • Large-scale distributed storage


  • Virtualized environments


  • High-performance computing clusters


Deploy a leaf-spine topology to ensure low latency and high throughput. Use RoCE v2 adapters and switches to build a high-performance, RDMA-enabled network.


Enable PFC and DCB on the leaf and spine switches to support lossless transmission of RoCE flows and bandwidth allocation.


Configure Jumbo Frames to support larger MTUs for improved RoCE performance.


Run RoCE v2 within different subnets to ensure network isolation and performance optimization.


Configure RoCE v2 on virtual machine hosts to enable high-performance communication between virtual machines, enhancing the performance of virtualized workloads.


2. Enterprise Networks:


  • Database applications


  • Large-scale file sharing


  • Video streaming


In enterprise environments, deploy RoCE v2 at the critical nodes of the traditional three-tier network where high-performance communication is required.


Enable PFC and DCB on critical switches to ensure high-performance support for mission-critical applications.


Increase the MTU as needed to improve performance, particularly in scenarios like large-scale file sharing or video streaming.


Run RoCE v2 within different subnets to ensure network isolation for critical applications.


Configure necessary network security measures to ensure secure data transmission over RoCE v2.


3. High-Performance Computing:


  • Scientific computing


  • Simulation


  • Rendering


Deploy a high-performance network topology such as Fat-Tree or Dragonfly that meets the requirements of high-performance computing clusters.


Enable PFC and DCB in the high-performance computing network to support RDMA communication in large-scale computing clusters.


Configure larger MTUs to enhance data transfer efficiency.


Run RoCE v2 within different subnets to enable high-performance communication between compute nodes.


Configure a distributed file system that supports RDMA to enhance file access performance.


3. Hardware and Software Requirements


Deployment of a RoCE v2 network involves hardware and software requirements across multiple aspects. Here is a detailed checklist, including adapter models, switch specifications, and operating system support:


Adapter (Network Interface Card) Requirements:


1. Manufacturer and Model:


Choose RoCE v2-compatible adapters, such as the Mellanox ConnectX-6 or ConnectX-6 Dx (adapters that support both InfiniBand and Ethernet operation).


2. RoCE v2 Support:


The adapter must explicitly support RoCE v2. Ensure that the chosen adapter's driver and firmware versions are compatible with the latest RoCE v2 specifications.


3. Performance Features:


Select an adapter that suits your performance requirements, considering bandwidth, end-to-end latency, and other performance features.


4. PCIe Compatibility:


The adapter must be compatible with the server's PCIe slot to ensure proper installation and performance.


Switch Requirements:


1. Manufacturer and Model:


Select RoCE v2-compatible switches, such as Mellanox Spectrum switch series, Cisco Nexus 9000 series, etc.


2. RoCE v2 Support:


The switch must support RoCE v2, and its documentation should clearly and comprehensively describe the relevant configuration options.


3. PFC and DCB Support:


The switch must support Priority Flow Control (PFC) and Data Center Bridging (DCB) to ensure lossless transmission of RoCE traffic and appropriate bandwidth allocation.


4. High-Bandwidth Ports:


Choose switches with sufficient port bandwidth to meet the demands of a high-performance network.


Operating System and Driver Requirements:


1. Operating System Support:


Ensure that the selected adapters and switches support the operating system you are using, such as Linux (especially RDMA-supported Linux distributions like RHEL, Ubuntu), Windows Server, etc.


2. Driver and Firmware:


Install the latest driver and firmware versions provided by the adapter manufacturer to ensure compatibility and optimal performance.
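On a Linux host running the NVIDIA/Mellanox OFED stack, driver and firmware versions can be checked roughly as follows; ofed_info ships with MLNX_OFED, and the interface name is an example:

```shell
# Report the installed MLNX_OFED version (if the Mellanox stack is in use)
ofed_info -s

# Show the firmware version and capabilities of each RDMA device
ibv_devinfo

# Inbox-kernel alternative: check the loaded driver and firmware via ethtool
ethtool -i ens1f0
```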

[Image: RoCE v2 software]

Other Requirements:


1. MTU Settings:


Configure Jumbo Frames on the network links to support larger MTUs and improve RoCE performance.


2. Subnet Partitioning:


Run RoCE v2 within different subnets to ensure network isolation and performance optimization.


3. Security Configuration:


Configure appropriate network security measures based on specific environment security requirements.


4. Network Management Tools:


Configuring and managing a RoCE v2 network may require the use of suitable network management tools to ensure comprehensive visibility into network status.
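As part of ongoing management, PFC and congestion behavior can be spot-checked from the host with standard counters. Counter names and sysfs paths vary by driver, so the patterns and device names below are assumptions for an mlx5-based setup:

```shell
# Look for PFC pause and ECN/CNP-related counters on the RoCE interface
ethtool -S ens1f0 | grep -Ei 'pause|ecn|cnp'

# Hardware RoCE counters exposed by mlx5 devices (path varies by device/port)
grep -H . /sys/class/infiniband/mlx5_0/ports/1/hw_counters/* 2>/dev/null
```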


Note: Before deploying a RoCE v2 network, carefully review the documentation for all relevant hardware and software to ensure that the chosen devices and configurations meet the specific requirements of your environment. Additionally, perform appropriate performance testing and validation to ensure the stability and performance of the RoCE v2 network.


4. RoCE v2 Application Cases


1. Optimization of Data Center Networks


The application of RoCE v2 in data center networks can significantly improve performance and efficiency, especially in virtualized environments.


By implementing RDMA, RoCE v2 eliminates the dependency on the host CPU for data transfers between virtual machines, reducing processing overhead and improving performance. One of the design goals of RoCE v2 is to provide low latency and high throughput, which is crucial for high-performance communication between multiple virtual machines in virtualized environments. RoCE v2 supports network isolation in virtualized environments, ensuring secure communication between virtual machines, which is particularly important in multi-tenant data centers.


RoCE v2 typically supports Jumbo Frames and is suitable for large-scale concurrent workloads, especially in data center environments where there are numerous simultaneous data transfers, improving overall performance.


In distributed computing clusters, the low latency and high throughput of RoCE v2 enable more efficient communication between nodes, which is essential for collaborative processing of distributed computing tasks. RoCE v2 is widely used to optimize distributed storage systems, accelerating storage access and improving storage performance, which is critical for large-scale data storage and processing.


2. Improving Storage System Performance


The application of RoCE v2 in storage systems can significantly enhance storage performance and reduce access latency, especially in large-scale data storage and high-frequency read/write operations.


For example, a research institution with large scientific datasets needs efficient data transfer and access in a distributed storage system, and adopts RoCE v2 to improve storage performance. RoCE v2-compatible network adapters (a common choice is the Mellanox ConnectX-6) are installed so that every storage node supports RoCE v2. RoCE v2-compatible switches are deployed, with PFC and DCB enabled in the storage network to provide lossless transmission and bandwidth allocation for RoCE flows. A distributed storage system that supports RDMA is deployed, enabling direct memory access between storage nodes over RoCE v2 and reducing CPU overhead. Jumbo Frames are configured to increase packet size, reducing header overhead and improving data transfer efficiency. Finally, the storage system is tuned for high-frequency read/write operations so that it fully leverages the low latency and high throughput of RoCE v2.


The low latency of RoCE v2 allows for faster communication between storage nodes, reducing data access latency, particularly evident in large-scale data storage. RoCE v2 supports high-throughput data transfers, enabling storage systems to handle more requests and improve overall storage performance. In scenarios involving large-scale data transfers, RoCE v2's superior performance allows data to be transmitted between storage nodes more efficiently, accelerating data backup, recovery, and migration processes. For high-frequency read/write operations, RoCE v2's performance advantages enable storage systems to respond to requests more quickly, providing better response times.


3. Applications in Ultra-Scale Clusters


RoCE v2 has a significant impact on network performance and effectively supports large-scale parallel computing in ultra-scale clusters such as cloud computing environments and massively parallel computing clusters. In cloud computing environments, RoCE v2 can be used to enhance communication performance between virtual machines, especially in scenarios that require low latency and high throughput. RoCE v2 can accelerate distributed storage systems, improving storage performance in cloud environments. RoCE v2's characteristics enable low latency and high throughput communication in cloud environments, providing higher-performance services for cloud service providers. In virtualized environments, RoCE v2 optimizes data transfers between virtual machines, reducing CPU overhead and improving overall virtualization performance. RoCE v2's network isolation feature allows for effective support of multi-tenancy in cloud environments, ensuring secure and isolated communication between each tenant.


RoCE v2 can be used in High-Performance Computing (HPC) clusters to support communication requirements for large-scale distributed computing tasks. In scenarios such as scientific research and weather simulations, RoCE v2 accelerates large-scale data transfers, improving data processing efficiency. RoCE v2 supports highly concurrent communication, making it suitable for communication needs among thousands of nodes in massively parallel computing clusters. For large-scale computing tasks that require collaboration, RoCE v2's low latency characteristics help reduce communication delays and improve overall computing efficiency. RoCE v2 can be used to optimize distributed file systems, accelerating read/write operations for large-scale data.


RoCE v2's low latency characteristics are crucial for task collaboration and data transfers in ultra-scale clusters, helping to reduce communication latency and improve response speed. RoCE v2 provides high throughput in the network, supporting fast data transfers for large-scale data, which is essential for high-performance computing and large-scale storage systems. RoCE v2 supports network isolation, ensuring secure and isolated communication among multiple tenants or tasks in ultra-scale clusters.


RoCE v2 relies on optical connectivity technology to achieve high-performance network communication over Ethernet. As a supplier of optical connectivity solutions, NADDOD can provide high-quality and highly reliable optical modules and fiber optic products to meet the requirements of RoCE v2 deployments. Multiple successful deliveries and real-world application cases serve as the best endorsement of our quality assurance. Inquire now for more details!