Matrix Multiplication with OpenMP Offloading Quiz
1. Which OpenMP directive should be used to offload matrix multiplication to the GPU while collapsing both outer loops (row and col)?
- A. `#pragma omp target teams collapse(2)`
- B. `#pragma omp target parallel for collapse(2)`
- C. `#pragma omp target loop collapse(2)`
- D. `#pragma omp target parallel collapse(2)`
Click to reveal the answer
Answer: B. `#pragma omp target parallel for collapse(2)`

2. In matrix multiplication with `#pragma omp target teams distribute parallel for`, what is the main advantage of using `teams`?
- A. It specifies that the code should run on the CPU.
- B. It creates multiple teams on the GPU, each responsible for a portion of the work, improving workload distribution.
- C. It forces all computations to be sequential.
- D. It prevents data from being transferred to the GPU.
Click to reveal the answer
Answer: B. It creates multiple teams on the GPU, each responsible for a portion of the work, improving workload distribution.

3. True or False: The `collapse(2)` clause is used to merge two loops (e.g., the row and col loops) so they can be executed in parallel as a single loop.
Click to reveal the answer
Answer: True

4. Which clause would you use to ensure each thread has a private copy of `row`, `col`, and `i` within a parallelized matrix multiplication loop?
- A. `collapse`
- B. `num_teams`
- C. `private(row, col, i)`
- D. `map(tofrom: row, col, i)`
Click to reveal the answer
Answer: C. `private(row, col, i)`

5. In the option `#pragma omp target teams distribute parallel for num_teams(5) collapse(2)`, what does `num_teams(5)` do?
- A. Specifies that each team should have 5 threads.
- B. Creates exactly 5 teams to distribute the work on the GPU.
- C. Allocates 5 memory spaces on the GPU.
- D. Limits the number of iterations each thread can execute to 5.
Click to reveal the answer
Answer: B. Creates exactly 5 teams to distribute the work on the GPU.

6. In the C/C++ example, what does the `map(to: a[0:n*n], b[0:n*n]) map(from: c[0:n*n])` clause do?
- A. Copies `a` and `b` from the device to the host and `c` from the host to the device.
- B. Allocates memory for `a`, `b`, and `c` on the device but does not initialize them.
- C. Copies `a` and `b` to the device and brings `c` back from the device to the host after computation.
- D. Allocates memory for `c` on the host but initializes `a` and `b` on the device.
Click to reveal the answer
Answer: C. Copies `a` and `b` to the device and brings `c` back from the device to the host after computation.

7. Which of the following is a benefit of using `teams distribute parallel for` over `parallel for` alone in GPU offloading?
- A. It prevents memory access conflicts by limiting memory access.
- B. It supports multi-level parallelism by creating teams and allowing parallel execution within each team.
- C. It enables execution on the CPU instead of the GPU.
- D. It disables the need for memory mapping.
Click to reveal the answer
Answer: B. It supports multi-level parallelism by creating teams and allowing parallel execution within each team.

8. What is the purpose of `collapse(2)` in `#pragma omp target teams distribute parallel for collapse(2)` when performing matrix multiplication?
- A. It divides the workload into two halves.
- B. It forces two copies of each variable to be created.
- C. It merges the row and col loops to treat them as a single loop, improving parallelism.
- D. It prevents any data races by creating private copies of variables.
Click to reveal the answer
Answer: C. It merges the row and col loops to treat them as a single loop, improving parallelism.

9. True or False: `#pragma omp target teams distribute parallel for` can create both teams and threads to handle a distributed workload on the GPU.
Click to reveal the answer
Answer: True

10. Which construct would you use if you want to control the number of teams created on the GPU during matrix multiplication?
- A. `collapse(2)`
- B. `private`
- C. `num_teams(5)`
- D. `map(to: a[0:n*n], b[0:n*n])`
Click to reveal the answer
Answer: C. `num_teams(5)`