E4S: An HPC-AI Software Ecosystem for Science

E4S 24.11 has been released!

See Downloads for more information on the latest E4S release.

What is E4S?

The Extreme-scale Scientific Software Stack (E4S) is a community effort, sponsored by the US Department of Energy (DOE) Office of Advanced Scientific Computing Research, to provide open source software packages for developing, deploying, and running scientific applications on high-performance computing (HPC) and AI platforms. E4S provides from-source builds, containers, and pre-installed versions of a broad collection of HPC and AI software packages (E4S 24.11 release notes). E4S includes contributions from many organizations, including national laboratories, universities, and industry. E4S is one of the key legacies of the US Exascale Computing Project (ECP), a collaborative effort of the DOE Office of Advanced Scientific Computing Research and the National Nuclear Security Administration.

Purpose

E4S exists to accelerate the development, deployment, and use of HPC-AI software, lowering the barriers for HPC-AI users. E4S represents one of the largest collections of performance-portable, GPU-enabled libraries and tools, supporting users of NVIDIA, AMD, and Intel GPUs in addition to Intel, AMD, and Arm CPUs. E4S provides containers and turn-key, from-source builds of more than 120 popular HPC-AI products. E4S products include programming models such as MPI and Kokkos; development tools such as HPCToolkit, TAU, and PAPI; math libraries such as PETSc and Trilinos; data and visualization tools such as HDF5 and ParaView; and AI products such as JAX, PyTorch, TensorFlow, and Horovod. The entire portfolio is tested and validated on a variety of platforms, from laptops to supercomputers, giving users confidence to upgrade with each E4S release.

Approach

E4S relies on Spack, a powerful package management platform widely used in the HPC-AI community. By using Spack as the package manager and providing containers of pre-built binaries for Docker, Singularity, Shifter, and Charliecloud, E4S enables the flexible use and testing of a large collection of reusable HPC-AI software packages. E4S also provides a set of Software Development Kits (SDKs) to promote interoperability between products. Finally, E4S products provide performance portability across a wide range of architectures, including Intel, AMD, and Arm CPUs and NVIDIA, AMD, and Intel GPUs. They achieve this through the Kokkos programming model and similar approaches, the MPI programming model via multiple MPI implementations, and emerging parallel programming language support in the LLVM ecosystem.

Platforms

E4S packages build on most computer systems, from laptops to supercomputers. E4S is available on all major leadership platforms at US Department of Energy facilities, including the exascale systems (capable of a billion-billion operations per second) Frontier at Oak Ridge National Laboratory and Aurora at Argonne National Laboratory. E4S is also available in containers from Docker Hub and on cloud platforms such as AWS, Azure, and Google Cloud. E4S Pro is a commercial version of E4S that provides additional support and services for E4S users, including availability on AWS.

Testing

The E4S software distribution is tested regularly on a variety of platforms, from Linux clusters to leadership platforms. E4S is tested on all major leadership platforms at US Department of Energy facilities, including the exascale systems Frontier at Oak Ridge National Laboratory and Aurora at Argonne National Laboratory. E4S is also tested on cloud platforms such as AWS, Azure, and Google Cloud. Finally, E4S is ported to and tested on the Frank system at the University of Oregon.

Interoperability Approach

While porting individual scientific software products is challenging, achieving interoperability between packages is even more difficult. E4S uses a two-pronged approach to achieve software interoperability: Spack and SDKs.

  • Spack: E4S uses the Spack package manager for software delivery. Spack provides the ability to specify which versions of software packages are and are not interoperable. It is also a common build layer not only for E4S software, but also for the enormous number of software packages that satisfy dependencies of the primary E4S products.
  • Software Development Kits (SDKs): A Software Development Kit is a collection of related software products where coordination across package teams will improve usability and practices and foster community growth among teams that develop similar and complementary capabilities. An SDK is more of a project than a product, although it involves several products. It can also be considered an association of products and product teams. The activities that take place inside an SDK promote interoperability (where appropriate and logical) between products. One key SDK is the Extreme-Scale Scientific Software Development Kit (xSDK), a collection of GPU-enabled scientific libraries for HPC-AI applications.
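
The idea of recording which package versions are and are not interoperable can be illustrated with a small, self-contained sketch. The code below does not use Spack's API. It is a hypothetical Python model in which every name (`parse_version`, `in_range`, `COMPATIBLE`, `interoperable`, the packages `solver_a`, `solver_b`, and `io_lib`) and every version bound is invented for illustration. It shows one way a package manager might store the dependency versions each product is known to work with and reject a version that any product cannot accept.

```python
# Hypothetical model of version-compatibility constraints; NOT Spack's API.
# All package names and version bounds below are invented for illustration.


def parse_version(text):
    """Turn '1.12.2' into a tuple of ints, (1, 12, 2), for ordered comparison."""
    return tuple(int(part) for part in text.split("."))


def in_range(version, low, high):
    """Return True if low <= version <= high (inclusive on both ends)."""
    return parse_version(low) <= parse_version(version) <= parse_version(high)


# product -> {dependency: (lowest supported version, highest supported version)}
COMPATIBLE = {
    "solver_a": {"io_lib": ("1.10.0", "1.14.99")},
    "solver_b": {"io_lib": ("1.12.0", "1.14.99")},
}


def interoperable(dependency, version, products):
    """Check that one dependency version satisfies every product's declared range."""
    for product in products:
        low, high = COMPATIBLE[product][dependency]
        if not in_range(version, low, high):
            return False
    return True


if __name__ == "__main__":
    for candidate in ("1.10.5", "1.12.1"):
        ok = interoperable("io_lib", candidate, ["solver_a", "solver_b"])
        print(candidate, "works for both products" if ok else "is rejected")
```

With these made-up bounds, `io_lib` 1.12.1 is accepted by both products, while 1.10.5 is rejected because `solver_b` requires at least 1.12.0. A real package manager such as Spack states comparable constraints declaratively in its package recipes and resolves them across the whole dependency graph rather than one dependency at a time.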

Distribution

E4S is open source software published under the MIT License and can be redistributed and modified under the terms of that license. Each package distributed in E4S has its own open source license.

Contacts

Michael A. Heroux

Project Leader

Sameer Shende

E4S Lead

James Willenbring

SDK Lead