HPE to Build Two Systems for Oak Ridge National Laboratory: Next-generation Exascale Supercomputer “Discovery” and AI Cluster “Lux”

Introducing HPE Cray Supercomputing GX5000 featuring the HPE Cray Supercomputing Storage Systems K3000 to enable supercomputing breakthroughs in the converged AI and HPC era

In this article

  • HPE-built Discovery will boost productivity by up to 10X and unlock new scientific horizons in precision medicine, cancer research, nuclear energy and aerospace
  • The Lux system, to be built by HPE, will offer a flexible, multi-tenant AI cloud platform to support training and inference
  • Discovery will feature the new HPE Cray Supercomputing GX5000, a next-generation supercomputer built for the converged AI and HPC era, to drive greater performance and productivity for the lab’s physics-based modeling, simulation and data-driven AI models, and to provide testbed capabilities for quantum computing
  • The new HPE Cray Supercomputing Storage Systems K3000, an option for the HPE Cray Supercomputing GX5000, is the industry’s first factory-built storage system with embedded Distributed Asynchronous Object Storage (DAOS) open-source software

HPE (NYSE: HPE) today announced that it has been selected to build two systems for the U.S. Department of Energy’s (DOE) Oak Ridge National Laboratory (ORNL) as part of the DOE’s mission to advance American leadership in artificial intelligence (AI) and supercomputing in support of science, energy and national security. The new systems are a second-generation exascale supercomputer, “Discovery,” which is the successor to ORNL’s Frontier – an HPE-built system that broke the exascale speed barrier – and a new AI cluster, “Lux,” which will support the DOE’s initiatives in advancing AI and machine learning with a multi-tenant, cloud-like platform.

Discovery will be based on the new HPE Cray Supercomputing GX5000, HPE’s next-generation supercomputing platform for leadership-class systems, which uses a unified AI and high-performance computing (HPC) architecture to streamline operations site-wide and across distributed clusters. It will be augmented by the new DAOS-based HPE Cray Supercomputing Storage Systems K3000, a storage option for the HPE Cray Supercomputing GX5000. Discovery will deliver new capabilities for AI, HPC and quantum computing and is expected to increase select application productivity tenfold¹, enabling scientists to accelerate breakthroughs in areas such as precision medicine, cancer research, nuclear energy and aerospace.

“When we built Frontier for Oak Ridge National Laboratory and ushered in exascale, we achieved the pinnacle in supercomputing history and a triumph for the U.S.,” said Antonio Neri, president and CEO at HPE. “We are proud to build on that leadership innovation and strong public-private partnership with the U.S. Department of Energy, ORNL and AMD, to build Discovery and Lux, accelerating the next era of scientific discovery and AI innovation.”

Lux will be a dedicated AI system based on the direct liquid-cooled HPE ProLiant Compute XD685 and feature AMD Instinct MI355X GPUs, AMD EPYC™ CPUs and AMD Pensando™ networking. Designed to bolster access to AI resources, Lux will provide researchers across the U.S. with cloud-like access to a sovereign AI factory specifically resourced for training and inference.

Discovery will extend the exascale computing capabilities first developed for the HPE-built Frontier supercomputer at ORNL, opening new horizons across scientific fields while advancing the lab’s mission of innovation and security.

“We are excited for Discovery and Lux to expand the science that researchers are able to do at Oak Ridge,” said Bronson Messer, Director of Science for the Oak Ridge Leadership Computing Facility. “Discovery will set the stage for a new level of converged HPC, AI and quantum computing capabilities, providing additional insight in connection with other systems, while Lux greatly expands researcher access to dedicated AI resources. As a result, we expect both systems will contribute to a paradigm shift in our productivity, reaching unparalleled gains in various, critical areas of scientific research and leadership.”

“For more than a decade, AMD and HPE have partnered to push the limits of high-performance computing, delivering solutions that enable discoveries and change the world,” said Dr. Lisa Su, chair and CEO, AMD. “Together with Oak Ridge National Laboratory, we are advancing the next generation of AI systems with Discovery and Lux—empowering researchers to accelerate innovation and strengthen America’s leadership in science and technology.”

Inside Discovery: the next-generation exascale supercomputer

Discovery's scientific advancements will stem from utilizing the HPE Cray Supercomputing GX5000 unveiled today. Building upon 50 years of supercomputing innovation dating back to the Cray-1 announced in 1975, HPE has designed its next-generation infrastructure for supercomputing in the converged AI and HPC era.

The HPE Cray Supercomputing GX5000 is purpose-built for exascale and features state-of-the-art end-to-end capabilities across CPUs, GPUs, accelerators, networking, software, storage and liquid cooling. By leveraging the new architecture, Discovery will deliver:

  • Greater performance with optimized space – The new platform is purpose-built to scale to exascale performance with greater density than the previous generation², using 25 percent less data center space per rack.
  • High-performance interconnect with HPE Slingshot – The next-generation HPE Slingshot gives Discovery a modern, high-performance interconnect that delivers high bandwidth and low latency for HPC, machine learning and analytics applications.
  • Industry-first HPC DAOS storage performance³ – Augmented by the new HPE Cray Supercomputing Storage Systems K3000, Discovery will have 300 percent⁴ more input/output operations per second (IOPS) per storage rack than Frontier, enabling AI applications to run with higher productivity. As the industry’s first factory-built storage system with embedded Distributed Asynchronous Object Storage (DAOS) open-source software, the HPE Cray Supercomputing Storage Systems K3000 is a cost-effective, all-flash storage system that complements the Lustre file system-based HPE Cray Supercomputing Storage Systems E2000, which will also be featured in Discovery.
  • Next-generation, liquid-cooled and accelerated compute – Discovery will feature next-generation AMD EPYC processors, codenamed “Venice,” with AMD Instinct MI430X GPUs, which offer advanced performance and accuracy for modeling, simulation and AI projects. Drawing on HPE’s 50 years⁵ of liquid cooling innovation, Discovery’s compute infrastructure will be fully liquid-cooled to optimize energy efficiency and cost-effectiveness in supercomputing environments.
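
The space and IOPS percentages quoted above can be roughly cross-checked against the figures given in footnotes 2 and 4. The short Python sketch below is only an illustrative calculation, not an HPE method: the constant and function names are made up for this example, it assumes the 900 mm and 1,200 mm dimensions are the only quantities behind the space comparison, and it treats the "up to" IOPS figures as directly comparable peak values.

```python
# Figures taken from footnotes 2 and 4 of this release.
# All names below are illustrative, not HPE terminology.
GX5000_RACK_MM = 900            # single GX5000 rack (footnote 2)
EX4000_CABINET_MM = 1200        # EX4000 cabinet (footnote 2)
E1000_IOPS_PER_RACK = 18_000_000   # "up to" figure for Frontier storage (footnote 4)
K3000_IOPS_PER_RACK = 75_000_000   # "up to" figure for Discovery storage (footnote 4)


def percent_less(new: float, old: float) -> float:
    """Percentage by which `new` is smaller than `old`."""
    return (old - new) / old * 100


def percent_more(new: float, old: float) -> float:
    """Percentage by which `new` is larger than `old`."""
    return (new - old) / old * 100


space_saving = percent_less(GX5000_RACK_MM, EX4000_CABINET_MM)
iops_gain = percent_more(K3000_IOPS_PER_RACK, E1000_IOPS_PER_RACK)

print(f"Rack dimension reduction: {space_saving:.0f}%")
print(f"IOPS per storage rack increase: {iops_gain:.0f}%")
```

Under these assumptions the rack dimension works out to 25 percent smaller, and the IOPS figure to a little over 300 percent more, which is in line with the rounded numbers in the bullets.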

As a world leader in supercomputing⁶, HPE delivers end-to-end solutions and services to customers with best-in-class AI and HPC expertise. As an integral partner, HPE supercomputing services help enhance outcomes through a fully unified approach to managing an organization’s infrastructure and applications, with a key focus on core business needs and continuous innovation.

About HPE

HPE (NYSE: HPE) is a leader in essential enterprise technology, bringing together the power of AI, cloud, and networking to help organizations achieve more. As pioneers of possibility, our innovation and expertise advance the way people live and work. We empower our customers across industries to optimize operational performance, transform data into foresight, and maximize their impact. Unlock your boldest ambitions with HPE. Discover more at www.hpe.com.

____________________

1 In comparison to application performance on the predecessor Frontier supercomputer at Oak Ridge National Laboratory

2 Dimension of a single 900 mm GX5000 rack as compared to the 1,200 mm EX4000 cabinet

3 DAOS-based storage systems are ranked #1 and #2 on the global IO500 storage benchmark and together have four times the storage benchmark score of the next 30 storage systems

4 The Cray ClusterStor E1000 Storage Systems deployed with Frontier can deliver up to 18 million IOPS per storage rack, whereas the HPE Cray Supercomputing Storage Systems K3000 deployed with Discovery can deliver up to 75 million IOPS per storage rack

5 The Cray-1 supercomputer was announced in 1975

6 Hyperion Research Q4 2023 HPC Market Data Report reflecting CY2023, Supercomputer Segment (May 29, 2024), Hyperion Research

 
