Difference between revisions of "Cruft"

From Tau Wiki
Latest revision as of 23:08, 27 December 2012


Background

Link | Code Version | Machine | Date
LLNL website | git repo, Kyle Spafford fork | Keeneland | March 2012

These instructions can also be used for CoMD

Building Cruft

For OpenCL:

export OPENCL_INCLUDE_DIR=<path to OpenCL include dir>

Modify CMakeLists.txt and add these lines so that the build uses the TAU compiler wrappers:

set (CMAKE_CXX_COMPILER tau_cxx.sh)
set (CMAKE_C_COMPILER tau_cc.sh)

Then run:

cmake .

You can safely proceed when you encounter reversions.

Selective instrumentation of loops (these directives go in the select.tau file that TAU_OPTIONS points to below):

BEGIN_INSTRUMENT_SECTION

loops file="eam.c" routine="eamForce#"
loops file="ljForce.c" routine="LJ#"

END_INSTRUMENT_SECTION
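
The directives above have to be saved in a file that the TAU compiler wrappers can find. A minimal sketch, assuming the file is called select.tau and is created in the directory where make will be run (this matches the -optTauSelectFile=`pwd`/select.tau value used in TAU_OPTIONS on this page):

```shell
# Write the selective instrumentation file read by tau_cc.sh / tau_cxx.sh.
# Assumption: this is run from the build directory, so that `pwd`/select.tau
# resolves to the file created here.
cat > select.tau <<'EOF'
BEGIN_INSTRUMENT_SECTION
loops file="eam.c" routine="eamForce#"
loops file="ljForce.c" routine="LJ#"
END_INSTRUMENT_SECTION
EOF

# Sanity check: count the loop directives that were written.
grep -c '^loops ' select.tau
```

The quoted heredoc delimiter ('EOF') keeps the shell from expanding the # and " characters, so the file contains the directives exactly as listed.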

For the OpenCL binary, edit src-ocl/eam_kernels.c and move this section above the line typedef CL_REAL_T real_t;

#if defined(cl_khr_fp64)  // Khronos extension available?
#pragma OPENCL EXTENSION cl_khr_fp64 : enable
#elif defined(cl_amd_fp64)  // AMD extension available?
#pragma OPENCL EXTENSION cl_amd_fp64 : enable
#endif
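
After editing, it can be worth confirming the order of the two pieces. The following is only a sketch: it assumes the intent is for the fp64 pragma block to sit above the real_t typedef, and check_fp64_order is a hypothetical helper name, not part of Cruft or TAU.

```shell
# Hypothetical helper: succeeds if the cl_khr_fp64 pragma appears on an
# earlier line than the real_t typedef in the given kernel source file.
check_fp64_order() {
    file="$1"
    # grep -n prefixes each match with "<line number>:"; keep the first match.
    pragma_line=$(grep -n 'cl_khr_fp64 : enable' "$file" | head -n 1 | cut -d: -f1)
    typedef_line=$(grep -n 'typedef CL_REAL_T real_t;' "$file" | head -n 1 | cut -d: -f1)
    # Both markers must exist, and the pragma must come first.
    [ -n "$pragma_line" ] && [ -n "$typedef_line" ] && [ "$pragma_line" -lt "$typedef_line" ]
}

# Usage, with the path taken from the text above (skipped if the file is absent).
if [ -f src-ocl/eam_kernels.c ]; then
    if check_fp64_order src-ocl/eam_kernels.c; then
        echo "fp64 pragma precedes the real_t typedef"
    else
        echo "fp64 pragma is missing or still below the real_t typedef"
    fi
fi
```

The check only looks at the Khronos pragma line; the AMD branch sits in the same #if block, so it moves together with it.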

Then set:

export TAU_OPTIONS="-optShared -optVerbose -optTauSelectFile=`pwd`/select.tau"
export TAU_MAKEFILE=<path to TAU>/x86_64/lib/Makefile.tau-icpc-pdt
make

Running Cruft

./cruft -p ag -e -f data/8k.inp.gz

or

./cruft -f data/8k.inp.gz


And for the OpenCL-accelerated version:

tau_exec -T serial -opencl ./cruftOCL -p ag -e -f data/8k.inp.gz
tau_exec -T serial -opencl ./cruftOCL -f data/8k.inp.gz
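
Once a run finishes, the profiles can be inspected and packed into a .ppk file like the ones linked under Performance Data. This is a rough sketch under a few assumptions: TAU wrote its profile.* files into the current directory (its default when PROFILEDIR is not set), the TAU tools pprof and paraprof are on the PATH, and summarize_profiles is a hypothetical helper name.

```shell
# Hypothetical helper: print a text summary of the TAU profiles found in a
# directory and pack them into a single .ppk file.
summarize_profiles() {
    dir="${1:-.}"
    packed="${2:-run.ppk}"
    # TAU names its output files profile.<node>.<context>.<thread>.
    if ! ls "$dir"/profile.* >/dev/null 2>&1; then
        echo "no profile.* files in $dir; run the application first"
        return 1
    fi
    (
        cd "$dir" || exit 1
        pprof | head -n 40          # text summary of the profile
        paraprof --pack "$packed"   # packed profile for sharing or archiving
    )
}

# Usage: summarize the current directory; the .ppk name is only an example.
summarize_profiles . cruftOCL-EAM.ppk || true
```

Running paraprof with no arguments in the same directory opens the profiles in the graphical browser instead.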

Performance Data

EAM method:

First, the profile of the serial version of Cruft shows that two loops in eam.c consume most of the time.

Cruft-EAM-profile.png

In comparison, in the OpenCL-accelerated version two kernels dominate the runtime.

CruftOCL-eam-profile.png

One thing you can check with an OpenCL application is the time spent in the command queue. Here is the table for each kernel:

CruftOCL-eam-queue.png

Profile Data:

File:Cruft-EAM.ppk, File:CruftOCL-EAM.ppk

LJ method:

First, the profile of the serial version of Cruft shows that a single loop accounts for the runtime.

Cruft-LJ-profile.png

In comparison, in the OpenCL-accelerated version the LJ_Force kernel dominates the runtime.

CruftOCL-lj-profile.png

Once again, here is the time spent in the queue for these kernels.

CruftOCL-lj-queue.png

Profile Data:

File:Cruft-LJ.ppk, File:CruftOCL-LJ.ppk