Cruft

From Tau Wiki

Latest revision as of 23:08, 27 December 2012


Background

Link | Code | Version | Machine | Date
LLNL website | git repo | Kyle Spafford fork | Keeneland | March 2012

These instructions can also be used for CoMD

Building Cruft

For OpenCL:

export OPENCL_INCLUDE_DIR=<path to OpenCL include dir>

Modify CMakeLists.txt and add these lines:

set (CMAKE_CXX_COMPILER tau_cxx.sh)
set (CMAKE_C_COMPILER tau_cc.sh)

Then run:

cmake .

You can safely proceed when you encounter reversions.

Selective instrumentation of loops (save this as select.tau, the file that TAU_OPTIONS points to):

BEGIN_INSTRUMENT_SECTION

loops file="eam.c" routine="eamForce#"
loops file="ljForce.c" routine="LJ#"

END_INSTRUMENT_SECTION
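
The TAU_OPTIONS setting used for the build points at `pwd`/select.tau, so the section above presumably lives in a file of that name in the directory where make is run. A minimal sketch for creating it from the shell, under that assumption:

```shell
# Write the selective instrumentation file that TAU_OPTIONS refers to.
# Assumption: the file is called select.tau and sits in the build directory.
cat > select.tau <<'EOF'
BEGIN_INSTRUMENT_SECTION
loops file="eam.c" routine="eamForce#"
loops file="ljForce.c" routine="LJ#"
END_INSTRUMENT_SECTION
EOF

# Quick check: count the loop directives that were written (prints 2).
grep -c '^loops ' select.tau
```

The quoted 'EOF' keeps the shell from expanding anything inside the here-document, so the # characters and double quotes reach the file unchanged.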

For the OpenCL binary, edit src-ocl/eam_kernels.c and move this section above the line typedef CL_REAL_T real_t;

#if defined(cl_khr_fp64)  // Khronos extension available?
#pragma OPENCL EXTENSION cl_khr_fp64 : enable
#elif defined(cl_amd_fp64)  // AMD extension available?
#pragma OPENCL EXTENSION cl_amd_fp64 : enable
#endif
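
The likely reason for this edit is that the fp64 extension has to be enabled before the kernel source first uses a double-precision type. A small sketch for checking the ordering after editing; pragma_before_typedef is a hypothetical helper name, not part of Cruft or TAU:

```shell
# Hypothetical helper: succeeds if the first fp64 pragma appears on an
# earlier line than the real_t typedef in the given kernel source file.
pragma_before_typedef() {
    file=$1
    pragma_line=$(grep -n '#pragma OPENCL EXTENSION cl_.*_fp64' "$file" | head -n 1 | cut -d: -f1)
    typedef_line=$(grep -n 'typedef CL_REAL_T real_t' "$file" | head -n 1 | cut -d: -f1)
    # Both lines must exist, and the pragma must come first.
    [ -n "$pragma_line" ] && [ -n "$typedef_line" ] && [ "$pragma_line" -lt "$typedef_line" ]
}

# Only run the check when the kernel source is present in this directory.
if [ -f src-ocl/eam_kernels.c ]; then
    if pragma_before_typedef src-ocl/eam_kernels.c; then
        echo "fp64 pragma precedes the real_t typedef"
    else
        echo "fp64 pragma is missing or comes after the real_t typedef"
    fi
fi
```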

Then set:

export TAU_OPTIONS="-optShared -optVerbose -optTauSelectFile=`pwd`/select.tau"
export TAU_MAKEFILE=<path to TAU>/x86_64/lib/Makefile.tau-icpc-pdt
make
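
If make fails right away, a common cause is that the TAU compiler wrappers or the TAU makefile cannot be found. The following sketch checks both before building; check_tau_env is a hypothetical helper name, and it assumes tau_cc.sh and tau_cxx.sh are expected on PATH, as the CMake settings above imply:

```shell
# Hypothetical pre-build check. Run it in the same shell session in which
# TAU_MAKEFILE was exported.
check_tau_env() {
    status=0
    # TAU_MAKEFILE must name an existing file.
    if [ ! -f "${TAU_MAKEFILE:-}" ]; then
        echo "TAU_MAKEFILE does not point to a file: ${TAU_MAKEFILE:-<unset>}" >&2
        status=1
    fi
    # The compiler wrappers named in CMakeLists.txt must be on PATH.
    for wrapper in tau_cc.sh tau_cxx.sh; do
        if ! command -v "$wrapper" >/dev/null 2>&1; then
            echo "$wrapper not found on PATH" >&2
            status=1
        fi
    done
    return "$status"
}

# Usage: check_tau_env && make
```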

Running Cruft

./cruft -p ag -e -f data/8k.inp.gz

or

./cruft -f data/8k.inp.gz


And for the OpenCL-accelerated version:

tau_exec -T serial -opencl ./cruftOCL -p ag -e -f data/8k.inp.gz
tau_exec -T serial -opencl ./cruftOCL -f data/8k.inp.gz

Performance Data

EAM method:

First, the profile of the serial version of Cruft shows that two loops in eam.c consume most of the time.

Cruft-EAM-profile.png

In comparison, in the OpenCL-accelerated version two kernels dominate the runtime.

CruftOCL-eam-profile.png

One thing you can check with an OpenCL application is the time spent in the command queue. Here is the table for each kernel:

CruftOCL-eam-queue.png

Profile Data:

File:Cruft-EAM.ppk, File:CruftOCL-EAM.ppk

LJ method:

First, the profile of the serial version of Cruft shows that a single loop accounts for the runtime.

Cruft-LJ-profile.png

In comparison, in the OpenCL-accelerated version the LJ_Force kernel dominates the runtime.

CruftOCL-lj-profile.png

Once again, here is the time spent in the queue for this kernel:

CruftOCL-lj-queue.png

Profile Data:

File:Cruft-LJ.ppk, File:CruftOCL-LJ.ppk