

Link           Code       Version   Machine     Date
ORNL website   Download   1.0.1     Keeneland   December 2011

Instrumenting SHOC (the Scalable HeterOgeneous Computing benchmark suite) with TAU demonstrates many features of TAU's GPU measurement system. These experiments were run on the Keeneland cluster.

Building SHOC

Setting up environment

Set up your environment as follows:

   module load tau/2.21
   export TAU_MAKEFILE=$TAUROOT/lib/Makefile.tau-cupti-pdt
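Before building, it can help to confirm that TAU_MAKEFILE really points at a makefile. The following is a minimal sketch; check_tau_makefile is a hypothetical helper written for this page, not a command shipped with TAU.

```shell
# Hypothetical helper (not part of TAU): verify that TAU_MAKEFILE is set
# and points at a readable file before starting the SHOC build.
check_tau_makefile() {
    if [ -z "${TAU_MAKEFILE:-}" ]; then
        echo "TAU_MAKEFILE is not set"
        return 1
    fi
    if [ ! -r "$TAU_MAKEFILE" ]; then
        echo "TAU_MAKEFILE does not point to a readable file: $TAU_MAKEFILE"
        return 1
    fi
    echo "Using TAU makefile: $TAU_MAKEFILE"
}
```

Calling check_tau_makefile after the export above prints the makefile path on success and returns a non-zero status otherwise.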

Compiling SHOC 1.0.1 with TAU

After configuring SHOC, edit the config/ to read:

   # === Basics ===
   CC       =
   CXX      =
   LD       =
   AR       = /usr/bin/ar
   RANLIB   = ranlib
   CPPFLAGS += -I$(SHOC_ROOT)/src/common -I${SHOC_ROOT}/config
   CFLAGS   += -m64 -g -O2
   CXXFLAGS += -m64 -g -O2
   ARFLAGS  = rcv
   LIBS     = -L$(SHOC_ROOT)/lib  -lrt -L/sw/keeneland/cuda/3.2RC/lib64/ -lcudart
   USE_MPI         = no
   OCL_CPPFLAGS    += -I${SHOC_ROOT}/src/opencl/common
   OCL_LIBS        =
   NVCC            = /sw/keeneland/cuda/3.2/bin/nvcc
   CUDA_CXX        =
   CUDA_INC        = -I/sw/keeneland/cuda/3.2/include
   CUDA_CPPFLAGS   += -gencode=arch=compute_10,code=sm_10 \
   -gencode=arch=compute_11,code=sm_11  -gencode=arch=compute_13,code=sm_13 \
   -gencode=arch=compute_20,code=sm_20  -gencode=arch=compute_20,code=compute_20 \
   -I${SHOC_ROOT}/src/cuda/include $(TAU_LIBS)

Then build and install as you normally would.

More information is available in TAU's user guide.

Running CUDA applications

Both CUDA and OpenCL are instrumented dynamically through library preloading. Use the tau_exec script to run the CUDA application:

   %> tau_exec -T serial -cuda ./Stencil2D

The -T serial option specifies which TAU configuration to use. For MPI applications, change it to -T mpi and run:

   %> mpirun -np 4 tau_exec -T mpi -cuda ./SGEMM

This works with executables built with or without TAU.
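The choice between the serial and MPI launch lines can be wrapped in a small function. This is only a sketch: tau_cuda_cmd is a hypothetical name, and it prints the command line rather than running it, using only the options shown on this page.

```shell
# Hypothetical helper: print the tau_exec command line for a CUDA binary.
# Usage: tau_cuda_cmd <binary> [number of MPI ranks]
tau_cuda_cmd() {
    binary=$1
    nranks=${2:-1}   # default to a single, non-MPI process
    if [ "$nranks" -gt 1 ]; then
        # MPI run: select the MPI configuration of TAU
        echo "mpirun -np $nranks tau_exec -T mpi -cuda $binary"
    else
        # Serial run: select the serial configuration of TAU
        echo "tau_exec -T serial -cuda $binary"
    fi
}

tau_cuda_cmd ./Stencil2D
tau_cuda_cmd ./SGEMM 4
```

The two calls at the end print the serial and four-rank MPI command lines used in this section.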


To record traces, set TAU_TRACE=1 before the run, then merge the trace files, convert the result to SLOG2, and view it in Jumpshot:

   %> export TAU_TRACE=1
   %> tau_exec -T serial -cuda ./Stencil2D
   %> tau_multimerge
   %> tau2slog2 tau.trc tau.edf -o stencil2d.slog2
   %> jumpshot
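The tracing steps above can be collected into one function that stops at the first failing step. This is a sketch under assumptions: trace_cuda_app is a hypothetical name, and it assumes tau_multimerge writes tau.trc and tau.edf into the current directory.

```shell
# Sketch: record a trace of one CUDA benchmark and convert it to SLOG2.
# trace_cuda_app is a hypothetical helper, not a TAU command.
# Usage: trace_cuda_app <binary> <output.slog2>
trace_cuda_app() {
    app=$1   # e.g. ./Stencil2D
    out=$2   # e.g. stencil2d.slog2
    # Fail early if any of the TAU tools is not on the PATH.
    for tool in tau_exec tau_multimerge tau2slog2; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing tool: $tool" >&2
            return 1
        fi
    done
    # Enable tracing for the run, then merge and convert;
    # each step runs only if the previous one succeeded.
    TAU_TRACE=1 tau_exec -T serial -cuda "$app" &&
        tau_multimerge &&
        tau2slog2 tau.trc tau.edf -o "$out"
}
```

For example, trace_cuda_app ./Stencil2D stencil2d.slog2 produces the same file as the manual steps, which can then be opened in Jumpshot.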

Running OpenCL applications

OpenCL applications are also run through tau_exec, using the -opencl option:

   %> tau_exec -T serial -opencl ./SGEMM 

Performance Data

Some example performance data from S3D:



File:S3D-cuda.ppk and File:S3D-cuda.slog2
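The two files are opened with different TAU viewers: packed profiles (.ppk) with ParaProf and SLOG2 traces with Jumpshot. The sketch below maps a file name to the matching command; viewer_cmd is a hypothetical helper, and it only prints the command so it can be checked before running it.

```shell
# Hypothetical helper: print the TAU viewer command for a performance data file.
viewer_cmd() {
    case "$1" in
        *.ppk)   echo "paraprof $1" ;;   # packed profile, opened in ParaProf
        *.slog2) echo "jumpshot $1" ;;   # SLOG2 trace, opened in Jumpshot
        *)       echo "unknown file type: $1" >&2; return 1 ;;
    esac
}

viewer_cmd S3D-cuda.ppk
viewer_cmd S3D-cuda.slog2
```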