Matrix Multiplication with OpenMP Offloading Quiz
1. Which OpenMP directive should be used to offload matrix multiplication to the GPU while collapsing both outer loops (row and col)?
- A. `#pragma omp target teams collapse(2)`
- B. `#pragma omp target parallel for collapse(2)`
- C. `#pragma omp target loop collapse(2)`
- D. `#pragma omp target parallel collapse(2)`
Click to reveal the answer
Answer: B. `#pragma omp target parallel for collapse(2)`

2. In matrix multiplication with `#pragma omp target teams distribute parallel for`, what is the main advantage of using `teams`?
- A. It specifies that the code should run on the CPU.
- B. It creates multiple teams on the GPU, each responsible for a portion of the work, improving workload distribution.
- C. It forces all computations to be sequential.
- D. It prevents data from being transferred to the GPU.
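The directive from question 1's answer can be seen in context in a short kernel. This is a hedged sketch rather than code from the quiz: the function name and the row-major `n`×`n` arrays `a`, `b`, and `c` are assumed, and a compiler without offloading support simply ignores the pragma and runs the loops on the host.

```c
/* Matrix multiply offloaded with the directive from question 1's answer.
   collapse(2) merges the row and col loops into one iteration space that
   the device threads share. */
void matmul_collapse(const double *a, const double *b, double *c, int n)
{
    #pragma omp target parallel for collapse(2) \
        map(to: a[0:n*n], b[0:n*n]) map(from: c[0:n*n])
    for (int row = 0; row < n; row++)
        for (int col = 0; col < n; col++) {
            double sum = 0.0;
            for (int i = 0; i < n; i++)
                sum += a[row * n + i] * b[i * n + col];
            c[row * n + col] = sum;
        }
}
```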
Click to reveal the answer
Answer: B. It creates multiple teams on the GPU, each responsible for a portion of the work, improving workload distribution.

3. True or False: The `collapse(2)` clause is used to merge two loops (e.g., the row and col loops) so they can be executed in parallel as a single loop.
Click to reveal the answer
Answer: True

4. Which clause would you use to ensure each thread has a private copy of `row`, `col`, and `i` within a parallelized matrix multiplication loop?
- A. `collapse`
- B. `num_teams`
- C. `private(row, col, i)`
- D. `map(tofrom: row, col, i)`
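The `teams` behaviour from answers 2 and 3 can be made concrete with a variant of the same kernel. This is a sketch under the same assumptions (invented function name, row-major `n`×`n` arrays `a`, `b`, `c`):

```c
/* teams creates a league of teams on the device, distribute hands each
   team a chunk of the collapsed row/col iteration space, and parallel for
   splits that chunk among the threads of each team. */
void matmul_teams(const double *a, const double *b, double *c, int n)
{
    #pragma omp target teams distribute parallel for collapse(2) \
        map(to: a[0:n*n], b[0:n*n]) map(from: c[0:n*n])
    for (int row = 0; row < n; row++)
        for (int col = 0; col < n; col++) {
            double sum = 0.0;
            for (int i = 0; i < n; i++)
                sum += a[row * n + i] * b[i * n + col];
            c[row * n + col] = sum;
        }
}
```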
Click to reveal the answer
Answer: C. `private(row, col, i)`

5. In the option `#pragma omp target teams distribute parallel for num_teams(5) collapse(2)`, what does `num_teams(5)` do?
- A. Specifies that each team should have 5 threads.
- B. Creates up to 5 teams to distribute the work on the GPU.
- C. Allocates 5 memory spaces on the GPU.
- D. Limits the number of iterations each thread can execute to 5.
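The `private` clause from answer 4 matters when the loop indices are declared outside the loops. A hedged sketch, with the same assumed names as before:

```c
/* row, col, and i live outside the loops here, so private(row, col, i)
   gives each thread its own copy instead of sharing one set of indices.
   (Indices declared in the for-init statement are private automatically.) */
void matmul_private(const double *a, const double *b, double *c, int n)
{
    int row, col, i;
    #pragma omp target teams distribute parallel for collapse(2) \
        private(row, col, i) \
        map(to: a[0:n*n], b[0:n*n]) map(from: c[0:n*n])
    for (row = 0; row < n; row++)
        for (col = 0; col < n; col++) {
            double sum = 0.0;
            for (i = 0; i < n; i++)
                sum += a[row * n + i] * b[i * n + col];
            c[row * n + col] = sum;
        }
}
```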
Click to reveal the answer
Answer: B. Creates up to 5 teams to distribute the work on the GPU. (Strictly, `num_teams(5)` sets an upper bound; the runtime may create fewer teams.)

6. In the C/C++ example, what does the `map(to: a[0:n*n], b[0:n*n]) map(from: c[0:n*n])` clause do?
- A. Copies `a` and `b` from the device to the host and `c` from the host to the device.
- B. Allocates memory for `a`, `b`, and `c` on the device but does not initialize them.
- C. Copies `a` and `b` to the device and brings `c` back from the device to the host after computation.
- D. Allocates memory for `c` on the host but initializes `a` and `b` on the device.
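The `num_teams(5)` clause from answer 5 slots into the same directive. A sketch under the usual assumed names (and, as noted above, the runtime may deliver fewer teams than requested):

```c
/* num_teams(5) requests 5 teams on the device; distribute then spreads
   the collapsed row/col iterations across those teams. */
void matmul_five_teams(const double *a, const double *b, double *c, int n)
{
    #pragma omp target teams distribute parallel for num_teams(5) collapse(2) \
        map(to: a[0:n*n], b[0:n*n]) map(from: c[0:n*n])
    for (int row = 0; row < n; row++)
        for (int col = 0; col < n; col++) {
            double sum = 0.0;
            for (int i = 0; i < n; i++)
                sum += a[row * n + i] * b[i * n + col];
            c[row * n + col] = sum;
        }
}
```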
Click to reveal the answer
Answer: C. Copies `a` and `b` to the device and brings `c` back from the device to the host after computation.

7. Which of the following is a benefit of using `teams distribute parallel for` over `parallel for` alone in GPU offloading?
- A. It prevents memory access conflicts by limiting memory access.
- B. It supports multi-level parallelism by creating teams and allowing parallel execution within each team.
- C. It enables execution on the CPU instead of the GPU.
- D. It disables the need for memory mapping.
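The mapping behaviour from answer 6 is easiest to read with the transfer directions written out. The same sketch as before, annotated (function name and array layout are still assumptions):

```c
/* map(to: ...)   copies a and b host -> device before the kernel runs.
   map(from: ...) copies c device -> host after the kernel finishes.
   a and b are not copied back, and c is not copied in. */
void matmul_mapped(const double *a, const double *b, double *c, int n)
{
    #pragma omp target teams distribute parallel for collapse(2) \
        map(to: a[0:n*n], b[0:n*n]) map(from: c[0:n*n])
    for (int row = 0; row < n; row++)
        for (int col = 0; col < n; col++) {
            double sum = 0.0;
            for (int i = 0; i < n; i++)
                sum += a[row * n + i] * b[i * n + col];
            c[row * n + col] = sum;
        }
}
```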
Click to reveal the answer
Answer: B. It supports multi-level parallelism by creating teams and allowing parallel execution within each team.

8. What is the purpose of `collapse(2)` in `#pragma omp target teams distribute parallel for collapse(2)` when performing matrix multiplication?
- A. It divides the workload into two halves.
- B. It forces two copies of each variable to be created.
- C. It merges the row and col loops to treat them as a single loop, improving parallelism.
- D. It prevents any data races by creating private copies of variables.
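The two levels of parallelism from answer 7 stand out when the combined directive is written as nested constructs. A hedged sketch with the usual assumed names, deliberately without `collapse`:

```c
/* Outer level: distribute spreads the row loop across the teams.
   Inner level: parallel for splits the col loop among each team's threads.
   This is the nested form of target teams distribute parallel for. */
void matmul_two_level(const double *a, const double *b, double *c, int n)
{
    #pragma omp target teams distribute \
        map(to: a[0:n*n], b[0:n*n]) map(from: c[0:n*n])
    for (int row = 0; row < n; row++) {
        #pragma omp parallel for
        for (int col = 0; col < n; col++) {
            double sum = 0.0;
            for (int i = 0; i < n; i++)
                sum += a[row * n + i] * b[i * n + col];
            c[row * n + col] = sum;
        }
    }
}
```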
Click to reveal the answer
Answer: C. It merges the row and col loops to treat them as a single loop, improving parallelism.

9. True or False: `#pragma omp target teams distribute parallel for` can create both teams and threads to handle a distributed workload on the GPU.
Click to reveal the answer
Answer: True

10. Which construct would you use if you want to control the number of teams created on the GPU during matrix multiplication?
- A. `collapse(2)`
- B. `private`
- C. `num_teams(5)`
- D. `map(to: a[0:n*n], b[0:n*n])`