To assist in this effort, the Laboratory, Penguin Computing and AMD have reached an agreement to upgrade the Lab’s unclassified, Penguin Computing-built Corona high-performance computing (HPC) cluster with an in-kind contribution of cutting-edge AMD Instinct™ accelerators, which is expected to nearly double the peak performance of the machine.
Under the agreement, AMD will supply its Radeon Instinct MI50 accelerators for the Corona system, which are expected to enable the system to exceed 4.5 petaFLOPS (quadrillion floating-point operations per second) of peak compute power. The system will be used by the COVID-19 HPC Consortium, a nationwide public-private partnership that is providing free computing time and resources to scientists around the country engaged in the fight against COVID-19, and by LLNL researchers, who are working on discovering potential antibodies and antiviral compounds for SARS-CoV-2, the virus that causes COVID-19. The Corona system is supported by AMD EPYC CPUs, working side by side with AMD Radeon Instinct accelerators, and uses Penguin Computing’s Tundra Extreme Scale platform.
Delivered to LLNL in 2018 under a contract with Penguin Computing, the Corona system, named for the total solar eclipse of 2017, is used for unclassified open science applications. The upgrade comes at no cost to the National Nuclear Security Administration (NNSA); AMD intends it to support research into COVID-19 while furthering the company’s partnership and collaboration with LLNL in software and tools development. In exchange for the upgraded GPUs, AMD is securing compute cycles that will be used for a variety of purposes, including providing time for LLNL COVID-19 research and proposals approved by the COVID-19 HPC Consortium, as well as supporting development efforts by AMD software engineers and application specialists.
“It is well known that AMD is a key partner in the upcoming delivery of the first NNSA exascale-class system, the Hewlett Packard Enterprise El Capitan supercomputer,” said Michel McCoy, director of LLNL’s Advanced Simulation and Computing program. “But an enduring partnership involves multiple collaborations, in each of which we pursue common goals. We are delighted that AMD made this generous offer, particularly given the need for a determined pace in mitigating and, ultimately, in defeating this pathogen.”
The AMD Instinct MI50 server accelerator is optimized for large-scale deep learning. Each accelerator delivers up to 26.5 teraFLOPS of native half-precision or up to 13.3 teraFLOPS of single-precision peak floating-point performance, combined with 32 GB of high-bandwidth memory.
“An effective COVID-19 response requires the best and brightest minds working together. By leveraging the massive compute capabilities of the world’s most powerful supercomputers, we can help accelerate critical modeling and research to help fight the virus,” said Forrest Norrod, senior vice president and general manager, AMD Datacenter and Embedded Systems Group. “AMD is proud to assist in that effort with a contribution of processors well-suited for the science now underway through the COVID-19 HPC Consortium and Lawrence Livermore National Laboratory.”
“Penguin Computing is committed to helping in the worldwide research efforts to fight against the COVID-19 virus,” said Sid Mair, president of Penguin Computing. “Increasing the capabilities of Corona for both HPC and AI computing could greatly enhance research into the nature of COVID-19, possible vaccines, treatments and contagion pathways.”
Corona will be one of the most capable of the seven LLNL supercomputers made available to researchers through the COVID-19 HPC Consortium, which involves more than a dozen member institutions in government, industry and academia and is spearheaded by the White House Office of Science and Technology Policy, the U.S. Department of Energy and IBM. The consortium aims to accelerate development of detection methods and treatments for COVID-19. AMD officially joined the consortium on April 6.
AMD software engineers will provide support in porting certain applications critical to the COVID-19 effort to Corona and optimizing the performance of the GPUs on relevant applications.
The Corona system also will aid LLNL researchers in their hunt for potential antibodies and antiviral drugs to combat the virus. COVID-19 research has become a top priority for the Corona system, which is being used to virtually screen, design and validate antibody candidates for SARS-CoV-2 and to simulate the interaction of small molecules with the virus’ proteins to discover possible antiviral compounds. The upgrade will allow LLNL researchers to speed up the modeling of molecular interactions vital to the effort and run a wider and more diverse set of applications on the system.
“The addition of these new state-of-the-art GPUs on Corona will boost the capability of the teams working on COVID-19,” said Jim Brase, LLNL’s deputy associate director for Programs. “It’s going to allow us to go faster, with more throughput. We’ll have more resources, so we can run more cases and potentially get to new designs for both antibodies and small molecules faster, that may lead to better treatments. They’ll also enable some of our new software, both for simulation and machine learning applications, to run more efficiently and better.”
Employing a first-of-its-kind virtual screening platform combining experimental data with machine learning, structural biology, bioinformatic modeling and high-fidelity molecular simulations, a team of LLNL researchers has used the Corona system to evaluate therapeutic antibody designs that could have improved binding interactions with the SARS-CoV-2 antigen protein. The team has narrowed the list of antibody candidates from a nearly infinite set to about 20 possibilities and has begun exploring additional antibody designs. The researchers believe the upgrade will double the number of computationally expensive simulations they are performing, making it more likely they’ll discover an effective antibody design.
LLNL computer scientists and computational biologists also are using the Corona system to examine millions of small molecules that could have antiviral properties against SARS-CoV-2. Increasing the speed and performance of Corona will allow researchers to perform additional, highly detailed molecular dynamics calculations to better evaluate possible SARS-CoV-2 target sites for small-molecule inhibitors that could prevent infection or treat COVID-19.