
Introduction

In this course, you will explore OpenACC, OpenMP Offloading, and HIP, three programming models designed to facilitate development on modern processors and accelerators, with a focus on GPU computing. OpenACC and OpenMP Offloading are directive-based programming models that simplify development on heterogeneous systems, allowing you to efficiently utilize both CPUs and GPUs within the same application. HIP (Heterogeneous-computing Interface for Portability) is a versatile programming framework that mirrors CUDA: if you already know CUDA, you can easily write HIP code that runs across multiple GPU architectures, including NVIDIA and AMD GPUs.
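To give a first flavour of the directive-based models, below is a minimal sketch (not taken from the course material) of the same vector-addition loop offloaded once with OpenACC and once with OpenMP; HIP kernels, covered later in the course, are written in CUDA-style C++ rather than with directives.

```c
#include <stdio.h>
#include <stdlib.h>

/* Vector addition c = a + b. The OpenACC and OpenMP variants are the
 * same loop; only the directive differs. Compilers without the
 * corresponding offload support ignore the unknown pragma and run the
 * loop serially on the CPU. */

void vecadd_openacc(const double *a, const double *b, double *c, int n)
{
    /* copyin/copyout manage data movement between host and device */
    #pragma acc parallel loop copyin(a[0:n], b[0:n]) copyout(c[0:n])
    for (int i = 0; i < n; ++i)
        c[i] = a[i] + b[i];
}

void vecadd_openmp(const double *a, const double *b, double *c, int n)
{
    /* target offloads the region; map() handles data movement */
    #pragma omp target teams distribute parallel for \
        map(to: a[0:n], b[0:n]) map(from: c[0:n])
    for (int i = 0; i < n; ++i)
        c[i] = a[i] + b[i];
}

int main(void)
{
    enum { N = 1 << 20 };
    double *a = malloc(N * sizeof *a);
    double *b = malloc(N * sizeof *b);
    double *c = malloc(N * sizeof *c);

    for (int i = 0; i < N; ++i) { a[i] = i; b[i] = 2.0 * i; }

    vecadd_openacc(a, b, c, N);
    vecadd_openmp(a, b, c, N);

    printf("c[42] = %f\n", c[42]); /* expect 126.0 */
    free(a); free(b); free(c);
    return 0;
}
```

In both variants the loop body is unchanged; the directive tells the compiler which data to move to the device and how to parallelize the iterations, which is what makes these models comparatively easy to add to existing CPU code.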

Figure: Example application areas, clockwise: (a) astrophysics; (b) CFD: turbulence; (c) bioinformatics; (d) materials science.

Programming on GPUs is often referred to as General-purpose GPU (GPGPU) programming. GPUs excel in high-throughput, data-intensive tasks due to their massively parallel architecture and higher memory bandwidth compared to CPUs, making them ideal for scientific computing, machine learning, and other computationally demanding applications. Today, GPUs drive a diverse range of use cases, from rendering graphics in video games to powering advanced scientific simulations and artificial intelligence.

Accelerators in Exascale and Pre-Exascale Supercomputers

Modern exascale and pre-exascale supercomputing projects increasingly rely on accelerators to reach unprecedented computational performance. Table 1 highlights some leading exascale supercomputers and their hardware configurations, emphasizing the importance of GPUs (from AMD, Intel, and NVIDIA) in high-performance computing (HPC). This trend signifies that scientific applications need to leverage heterogeneous computing to fully utilize available resources, or they risk underusing the hardware's potential.

Figure: The Frontier and LUMI supercomputers.

The following table highlights the top four supercomputers as of June 2024, according to the TOP500 list. These systems demonstrate the cutting-edge integration of CPUs and accelerators, achieving remarkable computational performance.

| Rank | System Name | Location | Performance (Rmax) | Processor | Accelerator | Vendor |
|------|-------------|----------|--------------------|-----------|-------------|--------|
| 1 | Frontier | Oak Ridge National Laboratory, USA | 1.206 exaflops | AMD EPYC 64C 2 GHz | AMD Instinct MI250X | HPE Cray EX235a |
| 2 | Aurora | Argonne National Laboratory, USA | 1.012 exaflops | Intel Xeon CPU Max Series | Intel Data Center GPU Max Series | HPE Cray EX |
| 3 | Eagle | Microsoft Azure, USA | 561.2 petaflops | Intel Xeon Platinum 8480C | NVIDIA H100 | Microsoft NDv5 |
| 4 | Fugaku | RIKEN Center for Computational Science, Japan | 442 petaflops | Fujitsu A64FX 48C 2.2 GHz | None | Fujitsu |

Table 1: Top four supercomputers as of June 2024, according to the TOP500 list.

Figure: Computing power (left); share of accelerators (right).

These systems pair advanced CPUs with accelerators in different ways. For example, Frontier utilizes AMD’s EPYC processors alongside Instinct MI250X GPUs, while Aurora combines Intel’s Xeon CPUs with Intel’s Data Center GPUs. Eagle, a cloud-based system, leverages NVIDIA’s H100 GPUs, and Fugaku operates solely on Fujitsu’s A64FX processors without additional accelerators. This diversity in architectures underscores the importance of heterogeneous computing in modern HPC environments. The figure below illustrates the primary application areas of HPC and the distribution of GPU vendors within HPC.

Figure: Application areas using HPC resources (left); GPU vendors supporting HPC (right).

For more detailed information, refer to the official TOP500 June 2024 list at top500.org.

The Role of HPC in Science, Engineering, and AI

Figure: Numerical computation arising from computational science and artificial intelligence.

In scientific computing, engineering, and artificial intelligence, computational workloads often involve solving large systems of numerical equations. For instance, in science and engineering, many problems are defined by partial differential equations (PDEs), which are converted into systems of algebraic equations through numerical methods such as the finite difference or finite element method. Solving these systems means finding the values of the unknown variables, which requires significant computational power that GPUs provide effectively. The figure above symbolically illustrates how a large set of arithmetic operations is used in science, engineering, and AI.
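As a small worked illustration of how a PDE turns into a system of equations (a standard textbook example, not taken from the course material), consider the one-dimensional Poisson equation discretized with central finite differences on a uniform grid:

```latex
% 1D Poisson problem: -u''(x) = f(x) on (0,1), with u(0) = u(1) = 0.
% On a uniform grid x_i = i*h, h = 1/(n+1), central differences give
%   ( -u_{i-1} + 2 u_i - u_{i+1} ) / h^2 = f_i ,   i = 1, ..., n,
% i.e. a tridiagonal linear system A u = f for the n unknowns u_i:
\[
\frac{1}{h^{2}}
\begin{pmatrix}
 2 & -1 &        &        \\
-1 &  2 & \ddots &        \\
   & \ddots & \ddots & -1  \\
   &        & -1     &  2
\end{pmatrix}
\begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{pmatrix}
=
\begin{pmatrix} f_1 \\ f_2 \\ \vdots \\ f_n \end{pmatrix}
\]
```

Each unknown couples only to its neighbours, so the system is large but sparse and highly regular; assembling and solving (or iterating on) such systems is exactly the kind of data-parallel arithmetic that maps well onto GPUs.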