
Introduction

In this course, you will explore OpenACC, OpenMP Offloading, and HIP, three programming models designed to facilitate development on modern processors and accelerators, with a focus on GPU computing. OpenACC and OpenMP Offloading are directive-based programming models that simplify development on heterogeneous systems, allowing you to efficiently utilize both CPUs and GPUs within the same application. HIP (Heterogeneous-computing Interface for Portability) is a versatile programming framework that mirrors CUDA: if you already know CUDA, you can easily write HIP code that runs across multiple GPU architectures, including NVIDIA and AMD GPUs.
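To give a first flavour of the directive-based models, below is a minimal sketch (not taken from the course material) of the same vector-addition loop offloaded once with OpenACC and once with OpenMP; HIP kernels, covered later in the course, are written in CUDA-style C++ rather than with directives.

```c
#include <stdio.h>
#include <stdlib.h>

/* Vector addition c = a + b. The OpenACC and OpenMP variants are the
 * same loop; only the directive differs. Compilers without the
 * corresponding offload support ignore the unknown pragma and run the
 * loop serially on the CPU. */

void vecadd_openacc(const double *a, const double *b, double *c, int n)
{
    /* copyin/copyout manage data movement between host and device */
    #pragma acc parallel loop copyin(a[0:n], b[0:n]) copyout(c[0:n])
    for (int i = 0; i < n; ++i)
        c[i] = a[i] + b[i];
}

void vecadd_openmp(const double *a, const double *b, double *c, int n)
{
    /* target offloads the region; map() handles data movement */
    #pragma omp target teams distribute parallel for \
        map(to: a[0:n], b[0:n]) map(from: c[0:n])
    for (int i = 0; i < n; ++i)
        c[i] = a[i] + b[i];
}

int main(void)
{
    enum { N = 1 << 20 };
    double *a = malloc(N * sizeof *a);
    double *b = malloc(N * sizeof *b);
    double *c = malloc(N * sizeof *c);

    for (int i = 0; i < N; ++i) { a[i] = i; b[i] = 2.0 * i; }

    vecadd_openacc(a, b, c, N);
    vecadd_openmp(a, b, c, N);

    printf("c[42] = %f\n", c[42]); /* expect 126.0 */
    free(a); free(b); free(c);
    return 0;
}
```

In both variants the loop body is unchanged; the directive tells the compiler which data to move to the device and how to parallelize the iterations, which is what makes these models comparatively easy to add to existing CPU code.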

Figure: Example application areas, clockwise: (a) astrophysics; (b) CFD: turbulence; (c) bioinformatics; (d) materials science.

Programming on GPUs is often referred to as General-purpose GPU (GPGPU) programming. GPUs excel in high-throughput, data-intensive tasks due to their massively parallel architecture and higher memory bandwidth compared to CPUs, making them ideal for scientific computing, machine learning, and other computationally demanding applications. Today, GPUs drive a diverse range of use cases, from rendering graphics in video games to powering advanced scientific simulations and artificial intelligence.

Accelerators in Exascale and Pre-Exascale Supercomputers

Modern exascale and pre-exascale supercomputing projects increasingly rely on accelerators to reach unprecedented computational performance. Table 1 highlights some leading exascale supercomputers and their hardware configurations, emphasizing the importance of GPUs (from AMD, Intel, and NVIDIA) in high-performance computing (HPC). This trend signifies that scientific applications need to leverage heterogeneous computing to fully utilize available resources, or they risk underusing the hardware's potential.

Figure: The Frontier and LUMI supercomputers.

The following table highlights the top four supercomputers as of June 2024, according to the TOP500 list. These systems demonstrate the cutting-edge integration of CPUs and accelerators, achieving remarkable computational performance.

| Rank | System Name | Location | Performance (Rmax) | Processor | Accelerator | Vendor |
|------|-------------|----------|--------------------|-----------|-------------|--------|
| 1 | Frontier | Oak Ridge National Laboratory, USA | 1.206 exaflops | AMD EPYC 64C 2 GHz | AMD Instinct MI250X | HPE Cray EX235a |
| 2 | Aurora | Argonne National Laboratory, USA | 1.012 exaflops | Intel Xeon CPU Max Series | Intel Data Center GPU Max Series | HPE Cray EX |
| 3 | Eagle | Microsoft Azure, USA | 561.2 petaflops | Intel Xeon Platinum 8480C | NVIDIA H100 | Microsoft NDv5 |
| 4 | Fugaku | RIKEN Center for Computational Science, Japan | 442 petaflops | Fujitsu A64FX 48C 2.2 GHz | None | Fujitsu |

Table 1: Top four supercomputers as of June 2024, according to the TOP500 list.

Figure: Computing power (left); share of accelerators (right).

These systems pair advanced CPUs with accelerators in different ways. For example, Frontier utilizes AMD’s EPYC processors alongside Instinct MI250X GPUs, while Aurora combines Intel’s Xeon CPUs with Intel’s Data Center GPUs. Eagle, a cloud-based system, leverages NVIDIA’s H100 GPUs, and Fugaku operates solely on Fujitsu’s A64FX processors without additional accelerators. This diversity in architectures underscores the importance of heterogeneous computing in modern HPC environments. The figure below illustrates the primary application areas of HPC and the distribution of GPU vendors within HPC.

Figure: Application areas using HPC resources (left); GPU vendors supporting HPC (right).

For more detailed information, refer to the official TOP500 June 2024 list at top500.org.

The Role of HPC in Science, Engineering, and AI

Figure: Numerical computation arising from computational science and artificial intelligence.

In scientific computing, engineering, and artificial intelligence, computational workloads often involve solving large systems of numerical equations. For instance, in science and engineering, many problems are defined by partial differential equations (PDEs), which are converted into systems of algebraic equations through numerical methods such as the finite difference or finite element method. Solving these systems means finding the values of the unknown variables, which requires significant computational power that GPUs provide effectively. The figure above symbolically illustrates how a large set of arithmetic operations is used in science, engineering, and AI.
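As a small worked illustration of how a PDE turns into a system of equations (a standard textbook example, not taken from the course material), consider the one-dimensional Poisson equation discretized with central finite differences on a uniform grid:

```latex
% 1D Poisson problem: -u''(x) = f(x) on (0,1), with u(0) = u(1) = 0.
% On a uniform grid x_i = i*h, h = 1/(n+1), central differences give
%   ( -u_{i-1} + 2 u_i - u_{i+1} ) / h^2 = f_i ,   i = 1, ..., n,
% i.e. a tridiagonal linear system A u = f for the n unknowns u_i:
\[
\frac{1}{h^{2}}
\begin{pmatrix}
 2 & -1 &        &        \\
-1 &  2 & \ddots &        \\
   & \ddots & \ddots & -1  \\
   &        & -1     &  2
\end{pmatrix}
\begin{pmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{pmatrix}
=
\begin{pmatrix} f_1 \\ f_2 \\ \vdots \\ f_n \end{pmatrix}
\]
```

Each unknown couples only to its neighbours, so the system is large but sparse and highly regular; assembling and solving (or iterating on) such systems is exactly the kind of data-parallel arithmetic that maps well onto GPUs.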