Data Transfer with map Clause in OpenMP Offloading

Script
  • "The map clause in OpenMP Offloading is a powerful tool for managing data transfer between the host and the device. It optimizes memory usage and minimizes transfer overhead, ensuring that data is available where it is needed without unnecessary duplication or delays. In this section, we will explore the key types of map clauses—to, from, tofrom, and alloc—and demonstrate their practical applications."
  • "Each type of map clause serves a specific purpose in managing data transfer. The to clause transfers data from the host to the device, while the from clause moves data back from the device to the host. The tofrom clause supports bidirectional data transfer, and the alloc clause allocates memory on the device without initializing it. These clauses can be strategically combined to optimize data management for different scenarios."
  • "This slide illustrates the syntax for applying the map clause in OpenMP Offloading. In C and C++, the clause is used within the target directive, specifying options like to, from, tofrom, or alloc, followed by the variables to be mapped. The same structure applies in Fortran, ensuring consistent usability across supported languages."
  • "In this example of vector addition, we use the map clause to handle data transfers efficiently. The input arrays, A and B, are mapped to the device using the to clause, while the result, array C, is mapped back to the host with the from clause after computation. This approach ensures efficient use of memory and minimizes transfer overhead."
  • "OpenMP provides advanced constructs for data management, such as target data and target update. The target data construct allows data to persist on the device across multiple target regions, reducing redundant transfers. The target update directive enables fine-grained synchronization of data between the host and the device, allowing updates at specific points within a target data scope."
  • "This example illustrates the use of the target data construct to manage persistent data on the device. Arrays A and B are mapped to the device using the to clause, while array C is allocated with the alloc clause. This setup enables multiple target regions to access the same data without remapping, enhancing performance and efficiency."
  • "The target update directive provides a mechanism to selectively synchronize data between the device and the host. In this example, only the updated data in array B is brought back to the host. By controlling when and what data is synchronized, this approach optimizes memory usage and reduces unnecessary transfers."
  • "The directives target enter data and target exit data provide explicit control over data mapping and deallocation without executing any computations. target enter data maps data to the device and allocates memory, while target exit data transfers data back to the host and deallocates it from the device. These constructs are particularly useful in scenarios that demand precise control over memory management."
  • "This table summarizes key data management directives in OpenMP Offloading. Each directive offers distinct capabilities for handling data on the device. For example, target data manages data across multiple target regions, while target update and target enter/exit data provide fine-grained synchronization and memory control. Together, these directives enable developers to efficiently manage data flow and memory allocation."
  • "To optimize data transfer in OpenMP Offloading, it’s essential to use the appropriate directive for the task. The map clauses allow selective transfers, while target data provides persistent device memory. target update facilitates selective synchronization, and target enter/exit data ensures explicit memory management. By combining these strategies, developers can reduce transfer overhead and maximize the efficiency of parallel computations."

The map clause in OpenMP Offloading controls data transfers between the host (CPU) and the device (typically a GPU). It lets developers specify how variables are allocated, initialized, and copied between the host and device data environments. The clause supports several map types, so that only the data a target region actually needs is transferred, which reduces memory transfer overhead.


Overview of map Clause Types

The map clause types specify the data transfer direction and whether the data should be initialized or copied back from the device:

  • to: Copies the original host variable to the device at the start of the target region. Ideal for input data that does not need to return to the host.
  • from: Allocates the variable on the device without copying the host value, then copies it back to the host when the target region ends. Useful for results produced on the device.
  • tofrom: Copies the original host variable to the device at the start and back to the host at the end of the target region.
  • alloc: Allocates memory for the variable on the device without initializing it from the host. Used for temporary variables within the target region.
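The four map types can be seen side by side in one target region. The following is a minimal sketch, not a definitive implementation: the function name square_and_accumulate and its parameters are hypothetical and chosen only for illustration. It assumes the caller supplies a scratch buffer of n floats. Note that the scalar sum is mapped explicitly with tofrom, because scalars referenced in a target region are firstprivate by default and would otherwise not be copied back.

```c
/* Hypothetical helper (illustration only): squares each input element,
 * stores the squares in out, and adds them to the running total. */
void square_and_accumulate(const float *in, float *out, float *scratch,
                           float *total, int n) {
    float sum = *total;  /* running total carried into and out of the region */

    #pragma omp target map(to: in[0:n]) map(from: out[0:n]) \
                       map(alloc: scratch[0:n]) map(tofrom: sum)
    {
        /* scratch holds an intermediate result that only the device needs */
        for (int i = 0; i < n; i++) {
            scratch[i] = in[i] * in[i];
        }
        /* out is written on the device and copied back at region end */
        for (int i = 0; i < n; i++) {
            out[i] = scratch[i];
            sum += scratch[i];
        }
    }

    *total = sum;
}
```

Because scratch is mapped with alloc, its host contents should be treated as unspecified after the call; only out and the total are meant to be read on the host.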

Syntax for map Clause

For C/C++:

#pragma omp target map([to | from | tofrom | alloc] : variable_list)
{
    // Code that operates on mapped variables
}
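In C and C++, pointer-based arrays are mapped with array sections written as [lower-bound : length], where the second value is a length and not an end index. The sketch below maps only part of an array; the function name double_range and its parameters are hypothetical and used only for illustration.

```c
/* Hypothetical helper (illustration only): doubles the elements
 * x[first] .. x[first + count - 1] on the device. Only that part of
 * the array is mapped, so only count elements are transferred. */
void double_range(float *x, int first, int count) {
    #pragma omp target map(tofrom: x[first:count])
    {
        for (int i = first; i < first + count; i++) {
            x[i] *= 2.0f;
        }
    }
}
```

Elements outside the mapped section must not be accessed inside the target region, since they may not exist on the device.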

For Fortran:

! Fortran
!$omp target map([to | from | tofrom | alloc] : variable_list)
    ! Code to be executed on device
!$omp end target

Example: Vector Addition with map Clause

In this example, arrays A, B, and C are mapped to the device to perform vector addition. The to and from clauses handle data movement between host and device.

C/C++ Version

void vector_add(float *A, float *B, float *C, int n) {
    #pragma omp target map(to: A[0:n], B[0:n]) map(from: C[0:n])
    {
        for (int i = 0; i < n; i++) {
            C[i] = A[i] + B[i];
        }
    }
}

Fortran Version

subroutine vector_add(A, B, C, n)
    real(8), intent(in) :: A(:), B(:)
    real(8), intent(out) :: C(:)
    integer :: n, i

    !$omp target map(to: A, B) map(from: C)
    do i = 1, n
        C(i) = A(i) + B(i)
    end do
    !$omp end target
end subroutine vector_add

In both examples:

  • to (A and B): Moves input arrays A and B to the device at the start of the region.
  • from (C): Moves the result array C back to the host after the target region completes.
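A bare target construct only moves execution to the device; the loop inside it still runs on a single device thread. As a hedged variation on the C example above, the sketch below keeps the same map clauses but uses the combined target teams distribute parallel for construct so the iterations can be spread across the device. The function name vector_add_parallel is hypothetical.

```c
/* Variation on the vector addition example (illustration only): the same
 * to/from mapping, with the loop iterations distributed across the
 * device's teams and threads. */
void vector_add_parallel(const float *A, const float *B, float *C, int n) {
    #pragma omp target teams distribute parallel for \
            map(to: A[0:n], B[0:n]) map(from: C[0:n])
    for (int i = 0; i < n; i++) {
        C[i] = A[i] + B[i];
    }
}
```

The data movement is unchanged: A and B are copied in once, and C is copied out once when the region ends.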

Advanced Data Management with target data and target update

The map clause can be combined with constructs like target data and target update to manage data efficiently in complex applications.

target data Construct

The target data construct maps data to the device for the duration of a structured block, so the data persists across all target regions inside that block.

C/C++ Version

#pragma omp target data map(to: A[0:n], B[0:n]) map(alloc: C[0:n])
{
    #pragma omp target
    {
        for (int i = 0; i < n; i++) {
            C[i] = A[i] + B[i];
        }
    }
    #pragma omp target
    {
        for (int i = 0; i < n; i++) {
            C[i] += B[i];
        }
    }
}

Fortran Version

subroutine vector_add(A, B, C, n)
    real(8), intent(in) :: A(:), B(:)
    real(8), intent(out) :: C(:)
    integer :: n, i

    !$omp target data map(to: A, B) map(alloc: C)
    !$omp target
    do i = 1, n
        C(i) = A(i) + B(i)
    end do
    !$omp end target

    !$omp target
    do i = 1, n
        C(i) = C(i) + B(i)
    end do
    !$omp end target
    !$omp end target data
end subroutine vector_add

In these examples:

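  • map(to: A, B): Transfers arrays A and B to the device once, when the target data region begins.
  • map(alloc: C): Allocates memory for C on the device without initializing it. Because alloc never copies data, the values computed in C stay on the device and are discarded when the region ends; map C with from or tofrom if the host needs the result.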
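When the host does need the result, one option is to map C with from on the target data construct. The following sketch assumes that is the desired behavior; the function name add_then_accumulate is hypothetical. C is still allocated without initialization at the start of the block, but it is copied to the host once, when the block ends, rather than after each target region.

```c
/* Illustration only: two target regions share the same device copies of
 * A, B, and C. C is copied to the host once, at the end of the block. */
void add_then_accumulate(const float *A, const float *B, float *C, int n) {
    #pragma omp target data map(to: A[0:n], B[0:n]) map(from: C[0:n])
    {
        /* First region: C = A + B (result stays on the device) */
        #pragma omp target
        for (int i = 0; i < n; i++) {
            C[i] = A[i] + B[i];
        }

        /* Second region: reuses the device copy of C without remapping */
        #pragma omp target
        for (int i = 0; i < n; i++) {
            C[i] += B[i];
        }
    }   /* C is copied back to the host here */
}
```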

target update Directive

The target update directive provides finer control over data synchronization between host and device. It allows data updates at any point within a target data region.

C/C++ Version

#pragma omp target data map(to: A[0:n]) map(alloc: B[0:n])
{
    #pragma omp target
    {
        for (int i = 0; i < n; i++) {
            B[i] = 2 * A[i];
        }
    }
    // Update B from device to host
    #pragma omp target update from(B[0:n])
}

Fortran Version

subroutine update_vector(A, B, n)
    real(8), intent(in) :: A(:)
    real(8), intent(out) :: B(:)
    integer :: n, i

    !$omp target data map(to: A) map(alloc: B)
    !$omp target
    do i = 1, n
        B(i) = 2 * A(i)
    end do
    !$omp end target

    !$omp target update from(B)
    !$omp end target data
end subroutine update_vector

Here:

  • target update from(B): Brings the updated array B back from the device to the host.
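The directive also works in the other direction: target update to(list) refreshes the device copy after the host has changed a variable. The sketch below combines both directions inside one target data region. The function name scale_twice and the host-side adjustment step are hypothetical and serve only to illustrate the synchronization points.

```c
/* Illustration only: the host inspects and modifies data between two
 * target regions, using target update to keep both copies consistent. */
void scale_twice(float *A, float *B, int n) {
    #pragma omp target data map(to: A[0:n]) map(from: B[0:n])
    {
        #pragma omp target
        for (int i = 0; i < n; i++) {
            B[i] = 2.0f * A[i];
        }

        /* Bring the intermediate result to the host */
        #pragma omp target update from(B[0:n])

        /* Host-side step: derive new input values from the result */
        for (int i = 0; i < n; i++) {
            A[i] = B[i] + 1.0f;
        }

        /* Push the modified input back to the device */
        #pragma omp target update to(A[0:n])

        #pragma omp target
        for (int i = 0; i < n; i++) {
            B[i] = 2.0f * A[i];
        }
    }   /* B is copied back to the host here */
}
```

Without the update to(A[0:n]) step, the second target region would still read the values of A that were copied at the start of the target data region.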

Further Data Management with target enter data and target exit data

The target enter data and target exit data directives provide explicit control over when data is mapped to and removed from the device. Unlike target data, they are standalone directives that are not tied to a structured block, so device data can stay resident across multiple target regions, even when those regions are in different functions.

C/C++ Version

#pragma omp target enter data map(to: A[0:n], B[0:n]) map(alloc: C[0:n])

// Execute a target region
#pragma omp target
{
    for (int i = 0; i < n; i++) {
        C[i] = A[i] + B[i];
    }
}

// Copy C back to the host and release the device memory for A, B, and C
#pragma omp target exit data map(from: C[0:n]) map(delete: A[0:n], B[0:n])

Fortran Version

subroutine manage_data(A, B, C, n)
    real(8), intent(in) :: A(:), B(:)
    real(8), intent(out) :: C(:)
    integer :: n, i

    !$omp target enter data map(to: A, B) map(alloc: C)

    !$omp target
    do i = 1, n
        C(i) = A(i) + B(i)
    end do
    !$omp end target

    !$omp target exit data map(from: C) map(delete: A, B)
end subroutine manage_data

In this example:

  • target enter data: Allocates and maps A, B, and C to the device without immediate execution.
  • target exit data: Brings C back to the host and deallocates A, B, and C on the device.
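Because these directives are not bound to a block, the mapping and the computation can live in separate functions. The sketch below splits the C example into three steps under that assumption; the function names device_buffers_create, device_add, and device_buffers_release are hypothetical and not part of any OpenMP API.

```c
/* Illustration only: map the inputs and allocate the output on the device. */
void device_buffers_create(const float *A, const float *B, float *C, int n) {
    #pragma omp target enter data map(to: A[0:n], B[0:n]) map(alloc: C[0:n])
}

/* Illustration only: runs on data that is expected to be on the device
 * already, so no map clauses are given here. */
void device_add(const float *A, const float *B, float *C, int n) {
    #pragma omp target
    for (int i = 0; i < n; i++) {
        C[i] = A[i] + B[i];
    }
}

/* Illustration only: copy the result back and release device memory. */
void device_buffers_release(const float *A, const float *B, float *C, int n) {
    #pragma omp target exit data map(from: C[0:n]) map(delete: A[0:n], B[0:n])
}
```

A caller would invoke device_buffers_create once, call device_add as many times as needed, and finish with device_buffers_release; the host copy of C should be read only after the release step.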

Summary of OpenMP Offloading Data Management API

OpenMP API                                      Description
----------------------------------------------  ------------------------------------------------------------
#pragma omp target data                         Maps data to a device for the duration of a structured code block.
#pragma omp target update to(list)              Copies data from host to device for variables already mapped to the device.
#pragma omp target update from(list)            Copies data from device to host for variables already mapped to the device.
#pragma omp target enter data map(to: list)     Maps data into the device data environment without executing code.
#pragma omp target exit data map(from: list)    Copies data from the device back to the host and deallocates device memory.

By using the map clause and data management constructs effectively, developers can control data transfers and optimize performance in heterogeneous systems, reducing overhead and maximizing parallel computation efficiency.