Data Transfer with map Clause in OpenMP Offloading
The map clause in OpenMP offloading controls data transfers between the host (CPU) and the device (GPU). It lets developers specify how variables are allocated, initialized, and transferred between the host and device data environments. The clause supports several map types, which help avoid unnecessary transfers and the overhead they carry.
Overview of map Clause Types
The map type specifies the direction of the transfer: whether the data is copied to the device on entry, copied back to the host on exit, both, or neither:
- to: Copies the original host variable to the device at the start of the target region. Ideal for input data that does not need to return to the host.
- from: Copies the variable from the device back to the host when the target region ends, useful for variables updated on the device.
- tofrom: Copies the original host variable to the device at the start and back to the host at the end of the target region.
- alloc: Allocates memory for the variable on the device without initializing it from the host. Used for temporary variables within the target region.
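The four map types can be combined on a single directive. The following is a minimal sketch rather than code from the original material: the function map_types_demo and all of its parameter names are hypothetical, and scratch is assumed to be a caller-provided buffer of at least n elements. If no device is available, OpenMP falls back to running the region on the host, and the visible results are the same.

```c
#include <assert.h>

/* Hypothetical helper illustrating all four map types in one target region.
 * in      : input only                 -> map(to)
 * out     : output only                -> map(from)
 * scratch : device-side temporary      -> map(alloc)
 * sum     : read and updated on device -> map(tofrom)
 */
void map_types_demo(const float *in, float *out, float *scratch,
                    float *running_sum, int n)
{
    assert(n >= 0);
    float sum = *running_sum;

    /* Scalars such as n are firstprivate by default in a target region,
     * so sum must be mapped explicitly for its new value to come back. */
    #pragma omp target map(to: in[0:n]) map(from: out[0:n]) \
                       map(alloc: scratch[0:n]) map(tofrom: sum)
    {
        for (int i = 0; i < n; i++) {
            scratch[i] = in[i] * in[i];   /* temporary, never copied back */
            out[i] = scratch[i] + 1.0f;
            sum += out[i];
        }
    }

    *running_sum = sum;
}
```

After the call, the host should rely only on out and the running sum. The contents of scratch on the host are unspecified from the caller's point of view, because alloc never copies them back from the device.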
Syntax for map Clause
For C/C++:
#pragma omp target map([to | from | tofrom | alloc] : variable_list)
{
    // Code that operates on mapped variables
}
For Fortran:
! Fortran
!$omp target map([to | from | tofrom | alloc] : variable_list)
! Code to be executed on device
!$omp end target
Example: Vector Addition with map Clause
In this example, arrays A, B, and C are mapped to the device to perform vector addition. The to and from map types handle data movement between host and device.
C/C++ Version
void vector_add(float *A, float *B, float *C, int n) {
    #pragma omp target map(to: A[0:n], B[0:n]) map(from: C[0:n])
    {
        for (int i = 0; i < n; i++) {
            C[i] = A[i] + B[i];
        }
    }
}
Fortran Version
subroutine vector_add(A, B, C, n)
    real(8), intent(in) :: A(:), B(:)
    real(8), intent(out) :: C(:)
    integer :: n, i
    !$omp target map(to: A, B) map(from: C)
    do i = 1, n
        C(i) = A(i) + B(i)
    end do
    !$omp end target
end subroutine vector_add
In both examples:
- to (A and B): Moves input arrays A and B to the device at the start of the region.
- from (C): Moves the result array C back to the host after the target region completes.
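In the C/C++ version, A[0:n] is an array section: the first number is the starting index and the second is the number of elements, not the last index. A section does not have to cover the whole array. The sketch below, with the hypothetical function name double_range, maps and updates only part of a buffer.

```c
#include <assert.h>

/* Hypothetical helper: doubles the elements data[start] .. data[start+len-1]
 * and leaves the rest of the array alone. */
void double_range(float *data, int start, int len)
{
    assert(start >= 0 && len >= 0);

    /* data[start:len] means "len elements beginning at index start".
     * Only that slice is copied to the device and back. */
    #pragma omp target map(tofrom: data[start:len])
    {
        for (int i = start; i < start + len; i++) {
            data[i] *= 2.0f;
        }
    }
}
```

Inside the region the code still indexes data with the original host indices. Only the elements in the mapped section may be accessed there.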
Advanced Data Management with target data and target update
The map clause can be combined with constructs like target data and target update to manage data efficiently in complex applications.
target data Construct
The target data construct maps data to the device for the duration of a structured block. The data stays resident across all target regions inside that block instead of being transferred separately for each one.
C/C++ Version
#pragma omp target data map(to: A[0:n], B[0:n]) map(alloc: C[0:n])
{
    #pragma omp target
    {
        for (int i = 0; i < n; i++) {
            C[i] = A[i] + B[i];
        }
    }
    #pragma omp target
    {
        for (int i = 0; i < n; i++) {
            C[i] += B[i];
        }
    }
}
Fortran Version
subroutine vector_add(A, B, C, n)
    real(8), intent(in) :: A(:), B(:)
    real(8), intent(out) :: C(:)
    integer :: n, i
    !$omp target data map(to: A, B) map(alloc: C)
    !$omp target
    do i = 1, n
        C(i) = A(i) + B(i)
    end do
    !$omp end target
    !$omp target
    do i = 1, n
        C(i) = C(i) + B(i)
    end do
    !$omp end target
    !$omp end target data
end subroutine vector_add
In these examples:
- map(to: A, B): Transfers arrays A and B to the device.
- map(alloc: C): Allocates memory for C on the device without initializing it. With alloc, C is not copied back to the host when the region ends.
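In the target data example above, C is mapped with alloc, so the result stays on the device. When the host needs the result, one option is to map C with from on the enclosing target data construct. The sketch below shows that variation; the function name add_then_accumulate is hypothetical.

```c
#include <assert.h>

/* Hypothetical variation of the example above: C is mapped with "from",
 * so it is copied to the host once, when the target data region ends. */
void add_then_accumulate(const float *A, const float *B, float *C, int n)
{
    assert(n >= 0);

    #pragma omp target data map(to: A[0:n], B[0:n]) map(from: C[0:n])
    {
        /* First kernel: C = A + B. A, B, and C are already on the device. */
        #pragma omp target
        {
            for (int i = 0; i < n; i++) {
                C[i] = A[i] + B[i];
            }
        }

        /* Second kernel reuses the resident data; nothing is transferred
         * between the two target regions. */
        #pragma omp target
        {
            for (int i = 0; i < n; i++) {
                C[i] += B[i];
            }
        }
    }   /* C is copied back to the host here. */
}
```

Compared with putting map clauses on each target region, this moves A and B to the device once and C back once, regardless of how many kernels run in between.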
target update Construct
The target update directive provides finer control over data synchronization between host and device. It makes the host and device copies of already-mapped data consistent at any chosen point, for example in the middle of a target data region.
C/C++ Version
#pragma omp target data map(to: A[0:n]) map(alloc: B[0:n])
{
    #pragma omp target
    {
        for (int i = 0; i < n; i++) {
            B[i] = 2 * A[i];
        }
    }
    // Update B from device to host
    #pragma omp target update from(B[0:n])
}
Fortran Version
subroutine update_vector(A, B, n)
    real(8), intent(in) :: A(:)
    real(8), intent(out) :: B(:)
    integer :: n, i
    !$omp target data map(to: A) map(alloc: B)
    !$omp target
    do i = 1, n
        B(i) = 2 * A(i)
    end do
    !$omp end target
    !$omp target update from(B)
    !$omp end target data
end subroutine update_vector
- target update from(B): Brings the updated array B back from the device to the host.
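The update also works in the other direction: target update to(...) pushes host-side changes to the device copy. The sketch below combines both directions inside one target data region. The function two_pass and its return value are hypothetical and exist only to make the intermediate state observable.

```c
#include <assert.h>

/* Hypothetical example: the host reads an intermediate result, changes the
 * input, and pushes the change to the device between two kernels.
 * Returns the last element of B as seen by the host after the first kernel. */
float two_pass(float *A, float *B, int n)
{
    assert(n > 0);
    float peek = 0.0f;

    #pragma omp target data map(to: A[0:n]) map(from: B[0:n])
    {
        #pragma omp target
        {
            for (int i = 0; i < n; i++) {
                B[i] = 2.0f * A[i];
            }
        }

        /* Device -> host: make the intermediate B visible on the host. */
        #pragma omp target update from(B[0:n])
        peek = B[n - 1];

        /* The host modifies the input, then host -> device refreshes the copy. */
        for (int i = 0; i < n; i++) {
            A[i] += 1.0f;
        }
        #pragma omp target update to(A[0:n])

        #pragma omp target
        {
            for (int i = 0; i < n; i++) {
                B[i] += A[i];
            }
        }
    }   /* The final B is copied back here because of map(from). */

    return peek;
}
```

Without the target update to(A[0:n]) line, the second kernel would still see the old values of A on a real device, because map(to) copies the data only when the target data region is entered.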
Further Data Management with target enter data and target exit data
The target enter data and target exit data constructs give explicit control over when data is mapped to and unmapped from the device. They are standalone directives that execute no code on the device, so the mapped data can stay resident across multiple target regions without being tied to a single structured block.
C/C++ Version
#pragma omp target enter data map(to: A[0:n], B[0:n]) map(alloc: C[0:n])
// Execute a target region
#pragma omp target
{
    for (int i = 0; i < n; i++) {
        C[i] = A[i] + B[i];
    }
}
// Deallocate device memory
#pragma omp target exit data map(from: C[0:n]) map(delete: A[0:n], B[0:n])
Fortran Version
subroutine manage_data(A, B, C, n)
    real(8), intent(in) :: A(:), B(:)
    real(8), intent(out) :: C(:)
    integer :: n, i
    !$omp target enter data map(to: A, B) map(alloc: C)
    !$omp target
    do i = 1, n
        C(i) = A(i) + B(i)
    end do
    !$omp end target
    !$omp target exit data map(from: C) map(delete: A, B)
end subroutine manage_data
In this example:
- target enter data: Allocates and maps A, B, and C to the device without executing any code there.
- target exit data: Brings C back to the host and deallocates A, B, and C on the device.
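Because the two directives are not tied to a structured block, the mapping can begin in one function and end in another. The sketch below illustrates that pattern. The three function names are hypothetical, and the caller is assumed to pass the same pointer and length to each of them.

```c
#include <assert.h>

/* Hypothetical helper: start the device lifetime of v[0:n]. */
void vec_enter_device(float *v, int n)
{
    assert(n >= 0);
    #pragma omp target enter data map(to: v[0:n])
}

/* Hypothetical helper: scale v on the device. If v[0:n] is already present
 * on the device, this map only adjusts its reference count and no data is
 * transferred. If it is not present, the map falls back to a normal
 * copy-in/copy-out, so the function is correct either way. */
void vec_scale(float *v, int n, float factor)
{
    assert(n >= 0);
    #pragma omp target map(tofrom: v[0:n])
    {
        for (int i = 0; i < n; i++) {
            v[i] *= factor;
        }
    }
}

/* Hypothetical helper: copy the result back and end the device lifetime. */
void vec_exit_device(float *v, int n)
{
    assert(n >= 0);
    #pragma omp target exit data map(from: v[0:n])
}
```

A caller would invoke vec_enter_device once, call vec_scale as many times as needed, and finish with vec_exit_device. On a real device, the host copy of v is up to date only after that last call.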
Summary of OpenMP Offloading Data Management API
| OpenMP API | Description |
|---|---|
| #pragma omp target data | Maps data to a device for the duration of a structured code block. |
| #pragma omp target update to(list) | Updates already-mapped data from host to device, for example within a target data region. |
| #pragma omp target update from(list) | Updates already-mapped data from device to host, for example within a target data region. |
| #pragma omp target enter data map(to: list) | Moves data into the device data environment without executing code. |
| #pragma omp target exit data map(from: list) | Moves data from the device back to the host and releases the device memory. |
Used together, the map clause and the data management constructs let developers control exactly when data moves between host and device, which reduces transfer overhead in heterogeneous systems.