
GPU Architecture and NVIDIA & AMD Technologies Overview

CPU vs. GPU: Architectural Differences

1. What is a primary advantage of GPUs over CPUs for high-performance computing?

  • A. Higher clock speeds
  • B. Optimized for sequential tasks
  • C. Large number of cores optimized for parallel processing
  • D. Smaller physical size
Answer: C. Large number of cores optimized for parallel processing

2. True or False: CPUs generally have thousands of cores optimized for handling many tasks in parallel.

Answer: False

GPU Core Organization: GPCs, SMs, and TPCs

3. In NVIDIA GPUs, what does SM stand for?

  • A. Shared Memory
  • B. Single Matrix
  • C. Streaming Multiprocessor
  • D. Spatial Memory
Answer: C. Streaming Multiprocessor

4. True or False: NVIDIA GPUs operate in a Single Instruction, Multiple Threads (SIMT) fashion, allowing each SM to manage thousands of threads simultaneously.

Answer: True
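
The SIMT model is easiest to see in a kernel: every thread executes the same instruction stream, each on its own element of the data. A minimal sketch (the kernel name scale and its parameters are illustrative, not from any particular codebase):

    __global__ void scale(float *data, float factor, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;  // this thread's global index
        if (i < n)                                      // guard threads past the end
            data[i] *= factor;                          // same instruction, different data
    }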

5. What is the primary purpose of a Texture Processor Cluster (TPC) in NVIDIA GPUs?

  • A. To manage global memory
  • B. To process texture data
  • C. To handle thread scheduling
  • D. To control data flow between CPUs and GPUs
Answer: B. To process texture data

GPU Memory Hierarchy

6. Which type of memory in NVIDIA GPUs is cached on-chip and optimized for frequently accessed constants?

  • A. Global Memory
  • B. Local Memory
  • C. Constant Memory
  • D. Texture Memory
Answer: C. Constant Memory
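
A sketch of how constant memory is declared and used in CUDA, assuming a hypothetical 16-entry coefficient table that every thread reads repeatedly:

    #include <cuda_runtime.h>

    // Cached on-chip; reads are served efficiently when all threads
    // in a warp access the same entry.
    __constant__ float coeffs[16];

    __global__ void apply(float *out, const float *in, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            out[i] = in[i] * coeffs[i % 16];  // frequently reused constants
    }

    // Host side, before the launch: upload the table once.
    //   cudaMemcpyToSymbol(coeffs, host_coeffs, sizeof(coeffs));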

7. True or False: Shared memory in NVIDIA GPUs is on-chip memory within each SM, allowing for faster access times and efficient data reuse.

Answer: True
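
A minimal sketch of that pattern: each block stages a tile in on-chip shared memory, synchronizes, then reuses it (assumes 256-thread blocks and n a multiple of 256):

    __global__ void reverse_tiles(float *d, int n) {
        __shared__ float tile[256];          // on-chip, shared by one block
        int i = blockIdx.x * 256 + threadIdx.x;
        tile[threadIdx.x] = d[i];            // cooperative load into shared memory
        __syncthreads();                     // all loads finish before any read
        d[i] = tile[255 - threadIdx.x];      // fast on-chip data reuse
    }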

8. In GPU memory hierarchy, what type of memory is primarily used for handling large datasets and is accessible by both host and device?

  • A. Local Memory
  • B. Global Memory
  • C. Texture Memory
  • D. Register Memory
Answer: B. Global Memory
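
The usual round trip through global memory with the CUDA runtime API, as a sketch (buffer names are illustrative):

    #include <cuda_runtime.h>
    #include <stdlib.h>

    int main(void) {
        const int n = 1 << 20;
        const size_t bytes = n * sizeof(float);

        float *h = (float *)malloc(bytes);   // host (CPU) buffer
        for (int i = 0; i < n; ++i) h[i] = 1.0f;

        float *d;
        cudaMalloc(&d, bytes);               // global memory on the device

        cudaMemcpy(d, h, bytes, cudaMemcpyHostToDevice);  // host -> device
        // ... kernels read and write d here ...
        cudaMemcpy(h, d, bytes, cudaMemcpyDeviceToHost);  // device -> host

        cudaFree(d);
        free(h);
        return 0;
    }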

SM Architecture and Execution Model

9. How many threads are grouped together in a warp on NVIDIA GPUs?

  • A. 8
  • B. 16
  • C. 32
  • D. 64
Answer: C. 32
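
The warp width shows up directly in CUDA's warp-level primitives. A sketch of a warp-wide sum using shuffles (the full mask 0xffffffff assumes all 32 lanes are active):

    __device__ float warp_sum(float v) {
        // Halve the stride each step: 16, 8, 4, 2, 1.
        for (int offset = 16; offset > 0; offset >>= 1)
            v += __shfl_down_sync(0xffffffffu, v, offset);  // add the value from lane + offset
        return v;  // lane 0 now holds the sum of all 32 lanes
    }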

10. True or False: Cooperative Thread Arrays (CTAs) organize threads into blocks that execute in parallel on the GPU.

Answer: True
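
In CUDA source a CTA is simply a thread block. The sketch below launches the illustrative scale kernel from the SIMT example with ceil(n/256) blocks of 256 threads; each block is scheduled onto an SM:

    #include <cuda_runtime.h>

    __global__ void scale(float *data, float factor, int n);  // defined in the sketch above

    int main(void) {
        const int n = 1 << 20;
        float *d_data;
        cudaMalloc(&d_data, n * sizeof(float));

        int threadsPerBlock = 256;                                // threads per CTA
        int blocks = (n + threadsPerBlock - 1) / threadsPerBlock; // CTAs in the grid
        scale<<<blocks, threadsPerBlock>>>(d_data, 2.0f, n);
        cudaDeviceSynchronize();

        cudaFree(d_data);
        return 0;
    }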

NVIDIA Microarchitectures

11. Which NVIDIA microarchitecture introduced tensor cores specifically designed for AI workloads?

  • A. Maxwell
  • B. Volta
  • C. Kepler
  • D. Fermi
Answer: B. Volta
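
One way to drive tensor cores from CUDA C++ is the warp-level wmma API introduced with Volta (compute capability 7.0+). A sketch in which one warp multiplies a single 16x16x16 half-precision tile, accumulating in float; the pointers are assumed to reference suitably laid-out device arrays:

    #include <mma.h>
    #include <cuda_fp16.h>
    using namespace nvcuda;

    __global__ void wmma_tile(const half *a, const half *b, float *c) {
        wmma::fragment<wmma::matrix_a, 16, 16, 16, half, wmma::row_major> fa;
        wmma::fragment<wmma::matrix_b, 16, 16, 16, half, wmma::col_major> fb;
        wmma::fragment<wmma::accumulator, 16, 16, 16, float> fc;

        wmma::fill_fragment(fc, 0.0f);      // zero the accumulator tile
        wmma::load_matrix_sync(fa, a, 16);  // leading dimension 16
        wmma::load_matrix_sync(fb, b, 16);
        wmma::mma_sync(fc, fa, fb, fc);     // D = A * B + C on the tensor cores
        wmma::store_matrix_sync(c, fc, 16, wmma::mem_row_major);
    }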

12. True or False: Compute capability defines the feature set available for each NVIDIA GPU architecture.

Answer: True
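
The compute capability can be queried at runtime; a short sketch:

    #include <cuda_runtime.h>
    #include <stdio.h>

    int main(void) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, 0);  // query device 0
        // major.minor is the compute capability, e.g. 7.0 for Volta.
        printf("%s: compute capability %d.%d\n",
               prop.name, prop.major, prop.minor);
        return 0;
    }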

Unified Memory

13. What benefit does NVIDIA's Unified Memory provide?

  • A. Increases GPU clock speeds
  • B. Reduces memory transfer bottlenecks between CPU and GPU
  • C. Optimizes GPU temperature control
  • D. Improves network communication
Answer: B. Reduces memory transfer bottlenecks between CPU and GPU

14. True or False: Unified Memory is particularly beneficial for applications where data must be shared between the CPU and GPU.

Answer: True
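
A minimal sketch of Unified Memory in use: one allocation is reachable from both sides, and the runtime migrates pages on demand instead of requiring explicit copies:

    #include <cuda_runtime.h>

    int main(void) {
        const int n = 1 << 20;
        float *data;
        cudaMallocManaged(&data, n * sizeof(float));  // visible to host and device

        for (int i = 0; i < n; ++i) data[i] = 1.0f;   // CPU writes directly
        // ... launch kernels that read and write data ...
        cudaDeviceSynchronize();  // let GPU work finish before the CPU reads again

        cudaFree(data);
        return 0;
    }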

15. NVLink and NVSwitch are designed to improve data transfer speed between:

  • A. CPU and RAM
  • B. GPU and Storage
  • C. GPU and Network
  • D. GPU and GPU
Answer: D. GPU and GPU

16. True or False: NVLink outperforms traditional PCIe interconnects in multi-GPU configurations, improving data throughput.

Answer: True
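
With the CUDA runtime, direct GPU-to-GPU access is enabled via peer access; whether the traffic then travels over NVLink or PCIe depends on the system topology. A sketch for a hypothetical two-GPU machine:

    #include <cuda_runtime.h>
    #include <stdio.h>

    int main(void) {
        int canAccess = 0;
        cudaDeviceCanAccessPeer(&canAccess, 0, 1);  // can GPU 0 read GPU 1's memory?
        if (canAccess) {
            cudaSetDevice(0);
            cudaDeviceEnablePeerAccess(1, 0);       // second argument must be 0
            printf("peer access from GPU 0 to GPU 1 enabled\n");
        }
        return 0;
    }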

AMD GPU Architecture and Technologies Overview

17. In AMD GPUs, cores are organized into groups called:

  • A. Compute Units (CUs)
  • B. Stream Multiprocessors (SMs)
  • C. Processing Arrays (PAs)
  • D. Warp Clusters
Answer: A. Compute Units (CUs)

18. True or False: AMD's wavefronts contain 64 threads, comparable to NVIDIA's warps which contain 32 threads.

Answer: True. (64 threads is the wavefront width on AMD's GCN and CDNA architectures; newer RDNA parts also support 32-wide wavefronts.)

AMD ROCm Platform

19. What is HIP in the context of AMD's ROCm platform?

  • A. A graphics library for rendering
  • B. A parallel processing model exclusive to CPUs
  • C. A C++ runtime API allowing code portability between AMD and NVIDIA GPUs
  • D. A type of memory in AMD GPUs
Answer: C. A C++ runtime API allowing code portability between AMD and NVIDIA GPUs
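
HIP mirrors the CUDA runtime API almost one-to-one, which is what makes porting largely a mechanical renaming (tools such as hipify automate it). A sketch of the correspondence, with the HIP spelling of each call in the comments:

    #include <cuda_runtime.h>       // HIP: #include <hip/hip_runtime.h>

    int main(void) {
        float *d;
        cudaMalloc(&d, 1024);       // HIP: hipMalloc(&d, 1024);
        cudaMemset(d, 0, 1024);     // HIP: hipMemset(d, 0, 1024);
        cudaFree(d);                // HIP: hipFree(d);
        return 0;
    }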

20. True or False: MIOpen is an AMD library optimized for deep learning, similar to cuDNN on NVIDIA GPUs.

Answer: True

Infinity Fabric and Multi-GPU Scaling

21. Infinity Fabric is a high-speed interconnect technology developed by:

  • A. Intel
  • B. NVIDIA
  • C. AMD
  • D. ARM
Answer: C. AMD

22. True or False: Infinity Fabric supports data transfer between GPUs and CPUs, improving multi-GPU performance.

Answer: True

GPU Memory Management and AI-Specific Enhancements

23. Which of the following is a specialized feature in AMD's RDNA 2 architecture designed for real-time graphics rendering?

  • A. Tensor Cores
  • B. Ray Accelerators
  • C. NVLink
  • D. L3 Cache
Answer: B. Ray Accelerators

24. True or False: Matrix Cores in AMD GPUs are specifically designed to accelerate matrix operations in AI workloads.

Answer: True