These servers are designed for demanding AI applications where low latency and high application performance are essential. The 2U NVIDIA HGX™ A100 4-GPU system is suited for deploying modern AI training clusters at scale, with high-speed CPU-GPU and GPU-GPU interconnects. The Supermicro 2U 2-Node system shares power supplies and cooling fans between its nodes, which reduces energy use, cost, and carbon emissions, and it supports a range of discrete GPU accelerators that can be matched to the workload. Both systems include advanced hardware security features enabled by the latest Intel Software Guard Extensions (Intel SGX).
"Supermicro engineers have created another extensive portfolio of high-performance GPU-based systems that reduce costs, space, and power consumption compared to other designs in the market," said Charles Liang, president and CEO, Supermicro. "With our innovative design, we can offer customers NVIDIA HGX A100 (code name Redstone) 4-GPU accelerators for AI and HPC workloads in dense 2U form factors. Also, our 2U 2-Node system is uniquely designed to share power and cooling components which reduce OPEX and the impact on the environment."
The 2U NVIDIA HGX A100 server is based on 3rd Gen Intel Xeon Scalable processors with Intel Deep Learning Boost technology and is optimized for analytics, training, and inference workloads. The system can deliver up to 2.5 petaflops of AI performance, with four A100 GPUs fully interconnected with NVIDIA NVLink®, providing up to 320GB of GPU memory to speed breakthroughs in enterprise data science and AI. Compared with previous-generation GPUs, the system is up to 4x faster for complex conversational AI models such as BERT Large inference and delivers up to a 3x performance boost for BERT Large training.
In addition, the advanced thermal and cooling designs make these systems ideal for high-performance clusters where node density and power efficiency are priorities. Liquid cooling is available for these systems and can yield further OPEX savings. The platform also supports Intel Optane™ Persistent Memory (PMem), which allows significantly larger models to be held in memory, close to the CPU, before they are processed on the GPUs. For applications that require multi-system interaction, the system can be equipped with four NVIDIA ConnectX®-6 200Gb/s InfiniBand cards to support GPUDirect RDMA with a 1:1 GPU-to-DPU ratio.
The new 2U 2-Node system is an energy-efficient, resource-saving architecture in which each node supports up to three double-width GPUs. Each node also features a single 3rd Gen Intel Xeon Scalable processor with up to 40 cores and built-in AI and HPC acceleration. A wide range of AI, rendering, and VDI applications will benefit from this balance of CPUs and GPUs. Equipped with Supermicro's Advanced I/O Module (AIOM) expansion slots for fast and flexible networking, the system can also process massive data flows for demanding AI/ML applications, deep learning training, and inferencing while securing the workload and learning models. It is also ideal for multi-instance high-end cloud gaming and many other compute-intensive VDI applications. In addition, Virtual Content Delivery Networks (vCDNs) will be able to satisfy increasing demand for streaming services. Power supply redundancy is built in, as either node can use the adjacent node's power supply in the event of a failure.