How Do Data Centers Support HPC?

NADDOD Quinn InfiniBand Network Architect Jul 24, 2023

Data centers have existed since the 1940s, with the first dedicated computer rooms used for military purposes. As computing and storage needs grew exponentially over the following decades and applications expanded into various areas of life, organizations increasingly sought dedicated data centers to accommodate their infrastructure.


To reduce costs and remain competitive, many organizations now outsource their data center infrastructure, and this has become almost essential since the emergence of high-performance computing (HPC). HPC is powerful, but it demands high rack density and high bandwidth, and it generates a great deal of heat. Data centers that host it must handle the heat and power density of many high-performance computers running simultaneously.

1. Applications of High-Performance Computing

As HPC enables faster integration of data analytics and artificial intelligence, it is no surprise that top companies using HPC data centers are in the cloud computing and IT industries. However, companies from other industries can also harness the power of HPC.

These include:

● Research laboratories
● Fintech
● Weather forecasting
● Media and entertainment
● Healthcare
● Government and defense

Data centers supporting HPC can meet customers' growing demands for fast networks while keeping up with the increasingly digital landscape.

2. Three Key Systems of High-Performance Computing

To build infrastructure that is suitable for HPC, it is essential to understand the three key systems of an HPC cluster: computing, storage, and networking.

(1) Computing

An efficient HPC system requires a set of compute servers and software that work together to run algorithmic programs. Each module needs to keep pace with the other modules in the cluster; otherwise, the entire HPC system falls behind. The goal of HPC is to perform computations at high speed, which requires aggregating computing power from different types of hardware. Data centers have the space and capacity to accommodate the computer systems and hardware that HPC operations need, whereas the power and cooling coordination that HPC computing alone requires is more than most enterprises can handle in-house.
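One way to see why every module in the cluster has to stay in step with the others is a simple model of a synchronized computation step, in which all nodes must finish before the next step starts. The sketch below is illustrative only: the function names and the per-node timings are hypothetical, and real schedulers and workloads are more complicated.

```python
def step_time(node_times_s):
    """Time for one synchronized step: every node waits for the slowest one."""
    return max(node_times_s)


def cluster_efficiency(node_times_s):
    """Fraction of total node-time spent computing rather than waiting."""
    busy = sum(node_times_s)
    total = step_time(node_times_s) * len(node_times_s)
    return busy / total


# Hypothetical per-node compute times (seconds) for one step.
balanced = [1.0, 1.0, 1.0, 1.0]
one_slow_node = [1.0, 1.0, 1.0, 2.0]

print(step_time(balanced), cluster_efficiency(balanced))
print(step_time(one_slow_node), cluster_efficiency(one_slow_node))
```

In this model a single node that takes twice as long doubles the step time for the whole cluster, even though the other three nodes are unchanged.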

(2) Storage

To accommodate the massive amounts of data processed by HPC, the storage system should move data to and from the CPUs as quickly as it is processed, without interrupting the computation. According to Weka, an HPC storage system needs to meet the following requirements:

● Data from any node must be available at all times
● The available data must be the most up-to-date
● It must be able to handle data requests of any size
● It must support performance-oriented protocols
● It must use the latest storage technologies (such as SSDs)
● It must scale while keeping latency consistently low, at the millisecond level

(3) Network

The topology of an HPC network is very different from the internal network in your office. Beyond the extreme demand for continuous data transfer between CPUs and storage, the many computing components that make up an HPC environment are treated as a single computer, joined together by a "fabric". The key idea of an HPC fabric is to provide a large amount of scalable bandwidth (throughput) while maintaining ultra-low latency.
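To illustrate why both bandwidth and latency matter, a common first-order model estimates the time to deliver a message as a fixed latency plus the message size divided by the bandwidth. The sketch below uses that model; the function name and the latency figures are assumptions chosen for illustration, not measurements of any particular product.

```python
def transfer_time_s(message_bytes, latency_s, bandwidth_gbps):
    """Estimated delivery time: fixed latency plus time to push the bytes out."""
    bytes_per_second = bandwidth_gbps * 1e9 / 8  # convert Gb/s to bytes/s
    return latency_s + message_bytes / bytes_per_second


# Hypothetical links with the same 200 Gb/s bandwidth but different latency.
LOW_LATENCY_S = 1e-6    # assumed low-latency link (1 microsecond)
HIGH_LATENCY_S = 50e-6  # assumed general-purpose network (50 microseconds)

for size in (4 * 1024, 1024 ** 3):  # a 4 KiB message and a 1 GiB transfer
    fast = transfer_time_s(size, LOW_LATENCY_S, 200)
    slow = transfer_time_s(size, HIGH_LATENCY_S, 200)
    print(f"{size:>12} bytes: {fast * 1e6:12.2f} us vs {slow * 1e6:12.2f} us")
```

Under these assumptions, small messages are dominated by latency and large transfers by bandwidth, which is why a network that exchanges many small messages between nodes needs both.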

3. Cooling Facilities in High-Performance Computing

Given the density of HPC infrastructure and the heat it generates, cooling can be a significant challenge. The hot aisle containment systems traditionally used in modern data centers can effectively cool today's 50 kW HPC racks. Looking ahead, denser HPC clusters may drive wider adoption of liquid cooling in data centers. According to the National Renewable Energy Laboratory, liquid cooling can provide 1,000 times the cooling capacity of air cooling while taking up less physical space. Immersion liquid-cooled deployments offer greater flexibility and are more future-proof for customers.
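As a rough illustration of the gap between air and liquid, the coolant flow needed to carry away a given heat load can be estimated from the heat balance Q = ρ · V̇ · c_p · ΔT. The sketch below applies it to a 50 kW rack. The fluid properties are approximate textbook values near room temperature, water stands in for the liquid coolant, and the 10 K temperature rise is an assumption; immersion fluids have different properties, so the numbers are indicative only.

```python
def volumetric_flow_m3_s(heat_w, density_kg_m3, cp_j_per_kg_k, delta_t_k):
    """Coolant flow (m^3/s) needed to absorb heat_w with a delta_t_k temperature rise."""
    return heat_w / (density_kg_m3 * cp_j_per_kg_k * delta_t_k)


# Approximate properties near room temperature (assumed values).
AIR = {"density_kg_m3": 1.2, "cp_j_per_kg_k": 1005.0}
WATER = {"density_kg_m3": 997.0, "cp_j_per_kg_k": 4180.0}

RACK_HEAT_W = 50_000.0  # the 50 kW rack mentioned in the text
DELTA_T_K = 10.0        # assumed coolant temperature rise

air_flow = volumetric_flow_m3_s(RACK_HEAT_W, delta_t_k=DELTA_T_K, **AIR)
water_flow = volumetric_flow_m3_s(RACK_HEAT_W, delta_t_k=DELTA_T_K, **WATER)

print(f"air:   {air_flow:.2f} m^3/s")
print(f"water: {water_flow * 1000:.2f} L/s")
```

With these assumed values the rack needs several cubic meters of air per second but only on the order of a liter of water per second, which is the basic reason liquid cooling takes up less space at high densities.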

 

NADDOD's innovative liquid-cooled optical module and interconnect solution is one example. The liquid-cooled high-speed module operates stably in fluorinated liquid and mineral oil at a depth of 1 meter, and customers have already certified it for long-term use. Compared with traditional cooling solutions, it offers higher heat dissipation efficiency and lower energy consumption, and it can take the computing power of high-performance computing to a new level.

5. Components of High-Performance Computing

Connecting the equipment in an HPC cluster requires high-performance parallel interconnect components. NADDOD is a pioneer in parallel optical interconnect computing and has focused on developing high-performance parallel optical modules and interconnect cables since 2017. Its product series covers rates of 10G, 25G, 40G, 100G, 200G, and 400G, and supports the InfiniBand and RoCE protocols.

NADDOD HPC Components

1. Server optical network cards based on Intel and NVIDIA chip designs that support parallel interconnect components, ranging from 10G to 200G and extending to 400G/800G.

2. High-speed parallel optical modules designed based on VCSEL lasers, DML lasers, or silicon photonics platforms, such as 100G QSFP28 SR4/PSM4, 200G QSFP56 SR4/DR4, 200G QSFP-DD SR8/PSM8, and 400G QSFP-DD SR8/DR4.

3. Short-distance parallel DAC and AOC interconnect cables designed for low power consumption, such as InfiniBand HDR 200G QSFP56 AOC/DAC and 400G QSFP-DD DAC/AOC.

4. Electrical loopback modules that support self-loopback testing of system equipment.

5. Innovative liquid-cooled optical modules and interconnect solutions.
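The "parallel" in the parallel optical modules listed above refers to splitting the aggregate rate across several lanes. By common naming convention, the trailing digit in variants such as SR4, DR4, PSM4, and SR8 gives the number of parallel optical lanes, so the nominal per-lane rate is the total rate divided by that number. The sketch below applies this to the module names from the list; the helper function is hypothetical, and the results are nominal data rates that ignore encoding and error-correction overhead.

```python
import re


def per_lane_gbps(rate_label, variant):
    """Return (lane_count, nominal Gb/s per lane) for labels like '400G' and 'DR4'."""
    total_gbps = int(rate_label.rstrip("G"))
    lanes = int(re.search(r"(\d+)$", variant).group(1))  # trailing digits = lane count
    return lanes, total_gbps / lanes


# Module types taken from the list above.
modules = [
    ("100G", "SR4"), ("100G", "PSM4"),
    ("200G", "SR4"), ("200G", "DR4"),
    ("200G", "SR8"), ("200G", "PSM8"),
    ("400G", "SR8"), ("400G", "DR4"),
]

for rate, variant in modules:
    lanes, lane_rate = per_lane_gbps(rate, variant)
    print(f"{rate} {variant}: {lanes} lanes x {lane_rate:g} Gb/s")
```

This shows, for example, that a 400G module can reach its total rate either with eight slower lanes or with four faster ones.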

6. Summary - Purchasing Products for HPC

Building a high-performance computing data center takes high-quality systems, components, and facilities, along with the affordable power, networking, scalability, redundancy, and security that HPC requires.

When building a data center, it is crucial to procure reliable networking components. As a leading provider of optical network solutions, NADDOD is dedicated to building a smart world of interconnected things with innovative computing and network solutions. We continuously provide customers with innovative, efficient, and reliable products, solutions, and services, including high-quality integrated solutions that combine switches, AOC/DAC cables and optical modules, NICs, DPUs, and GPUs for applications such as data centers, HPC, edge computing, and AI. These solutions significantly accelerate customers' business at low cost and with outstanding performance.

We put customers at the center of everything we do and continuously create outstanding value for them across industries. NADDOD has a professional technical team and rich experience in implementing and servicing a wide range of application scenarios. Its products and solutions have won customers' trust with their high quality and outstanding performance, and they are widely used in industries and key areas such as HPC, data centers, education and research, biomedicine, finance, energy, autonomous driving, internet, manufacturing, and telecom.

Choose NADDOD, and let's build high-performance computing data centers together to achieve your business acceleration and innovation dreams!