Introduction to OpenCL

OpenCL at NERSC

OpenCL is an open standard for programming heterogeneous computers composed of CPUs, GPUs and other processors. OpenCL consists of a framework that defines the platform as a host (typically a CPU) and one or more compute devices (e.g. a GPU), plus a C-based programming language for writing programs for the compute devices. Using OpenCL, a programmer can write parallel programs that use all the resources of the heterogeneous computer. We give an example using the C++ API, but the concepts are relevant to OpenCL as a whole. OpenCL is currently available on NERSC's DIRAC Testbed.

From: SC13 Tutorial -- OpenCL: A Hands-on Introduction

Tim Mattson, Intel Corp.; Alice Koniges, Berkeley Lab; Simon McIntosh-Smith, University of Bristol.

SC13 attendees are shown working on Dirac. Tutorial Credits: This content is based on slides produced by Tom Deakin and Simon McIntosh-Smith, which were based on slides by Tim Mattson and Simon McIntosh-Smith with help from Ben Gaster (Qualcomm).

Dirac Instructions (comments are denoted with a #):

# Log in to a Carver login node, which provides access to Dirac
ssh -Y username@carver.nersc.gov
# Request an interactive session on one Dirac node
qsub -I -V -q dirac_int -l nodes=1:ppn=8
# Wait to be placed on a node.
# You are there when your prompt reads, for example, [username@dirac37]

Note: The module setup below is subject to change as the default modules change.

module unload cuda
module load cuda/5.5
module unload pgi
module load gcc-sl6

Go to previous working directory if in a newly started PBS shell:

cd $PBS_O_WORKDIR

Make a directory for your exercises and download them if you have not already:

mkdir OpenCL_exercises
cd OpenCL_exercises
svn export http://portal.nersc.gov/svn/training/SC13/opencl

Compilation and first execution:

make; ./vadd

Example: vector addition

The "hello world" program of data parallel programming is to add two vectors:

C[i] = A[i] + B[i] for i=0 to N-1
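
As a point of comparison before moving to OpenCL, the same computation can be written as an ordinary serial loop. The sketch below is plain C++ with no OpenCL calls; the function name vadd_serial is an illustrative choice and is not part of the tutorial code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Serial reference: C[i] = A[i] + B[i] for i = 0 to N-1.
// vadd_serial is a hypothetical name used only for this sketch.
std::vector<float> vadd_serial(const std::vector<float>& a,
                               const std::vector<float>& b)
{
    // Both inputs must hold the same number of elements N.
    assert(a.size() == b.size());

    std::vector<float> c(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        c[i] = a[i] + b[i];
    }
    return c;
}
```

Every iteration of this loop is independent of the others, which is what makes the problem a natural fit for a data parallel device.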

The OpenCL solution has two parts:

– Kernel code

– Host code

Vector Addition – Kernel

// Each work-item adds one pair of elements.
__kernel void vadd(__global const float *a,
                   __global const float *b,
                   __global       float *c)
{
    int gid = get_global_id(0);   // index of this work-item in dimension 0
    c[gid]  = a[gid] + b[gid];
}
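
One way to read the kernel above: the OpenCL runtime runs the kernel body once per work-item, and get_global_id(0) tells each instance which element it owns. The following plain C++ sketch imitates that model on the host with an explicit loop. It is not OpenCL code, and the names vadd_work_item, launch_1d and vadd_emulated are hypothetical helpers invented for this illustration. On a real device the work-items may execute in parallel and in any order.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for the kernel body: one call handles the single element
// selected by gid, the value get_global_id(0) would return in OpenCL.
void vadd_work_item(std::size_t gid, const float* a, const float* b, float* c)
{
    c[gid] = a[gid] + b[gid];
}

// Hypothetical host-side emulation of a 1-D launch: call the kernel body
// once for every global id in [0, global_size).
void launch_1d(std::size_t global_size, const float* a, const float* b, float* c)
{
    for (std::size_t gid = 0; gid < global_size; ++gid) {
        vadd_work_item(gid, a, b, c);
    }
}

// Convenience wrapper: the global size is set to the vector length
// because the kernel body performs no bounds check.
std::vector<float> vadd_emulated(const std::vector<float>& a,
                                 const std::vector<float>& b)
{
    assert(a.size() == b.size());
    std::vector<float> c(a.size());
    launch_1d(c.size(), a.data(), b.data(), c.data());
    return c;
}
```

Because the kernel body has no bounds check, the number of work-items launched should match the number of elements N.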

– Take the vadd program we provide. It runs a simple kernel that adds two vectors together.

– Look at the host code and identify the API calls in the host code. Compare them against the API descriptions on the OpenCL C++ reference card.

• Expected output:

– A message verifying that the program completed successfully

Vector Addition – Host

• The host program is the code that runs on the host to:

– Set up the environment for the OpenCL program

– Create and manage kernels

• Five simple steps in a basic host program:

– Define the platform … platform = devices+context+queues

– Create and build the program (dynamic library for kernels)

– Set up memory objects

– Define the kernel (attach arguments to kernel function)

– Submit commands … transfer memory objects and execute kernels

The C++ Interface

• Khronos has defined a common C++ header file, cl.hpp, containing a high-level interface to OpenCL

• This interface is dramatically easier to work with than the underlying C API¹

• Key features:

– Uses common defaults for the platform and command-queue, saving the programmer from extra coding for the most common use cases

– Simplifies the basic API by bundling key parameters with the objects rather than requiring verbose and repetitive argument lists

– Ability to “call” a kernel from the host, like a regular function

– Error checking can be performed with C++ exceptions

¹ especially for C++ programmers…
