Difference between revisions of "Cray"
Latest revision as of 18:21, 25 October 2012
Cray XK6/XK7
GPU performance tracking
1. Configuring TAU:
module load cudatoolkit
./configure -arch=craycnl -cuda="$CRAY_CUDATOOLKIT_DIR" -cudalibrary="$CRAY_CUDATOOLKIT_POST_LINK_OPTS" -bfd=download
Set up your environment:
export PATH=<path to tau2>/craycnl/bin:$PATH
export LD_LIBRARY_PATH=<path to tau2>/craycnl/lib:$LD_LIBRARY_PATH
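One way to confirm that the PATH change took effect is to check that the TAU commands used on this page resolve. This is only a minimal sketch: the function name check_tau_tools is hypothetical and not part of TAU, and the tool list is simply the set of commands mentioned in these instructions.

```shell
# Hypothetical helper (not shipped with TAU): report which of the TAU
# commands used on this page can be found on the current PATH.
check_tau_tools() {
    for tool in tau_exec pprof paraprof tau_multimerge tau2slog2 jumpshot; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
        fi
    done
}

check_tau_tools
```

If any tool is reported as missing, recheck the <path to tau2> portion of the export lines.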
2. CUDA
Build as you normally would, and change your run command to:
aprun -N 1 tau_exec -T serial,cupti -cupti ./matmult
3. OpenACC
Remember to load the Cray accelerator module:
module load craype-accel-nvidia20
Both the PGI and Cray compilers use the CUDA driver API to interact with the GPU, so set up TAU to collect those calls:
export TAU_CUPTI_API=driver
Compile as you normally would and run with tau_exec:
aprun -N 1 tau_exec -T serial,cupti -cupti ./matmult
4. Viewing profiles
TAU profiles are written to disk as profile.* (you may have several files). You can view them with either pprof (text-based) or paraprof (GUI).
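Before opening a viewer, it can help to confirm that the run actually produced profile files. This is a rough sketch: count_profiles is a hypothetical helper name, not a TAU command, and it only counts files matching profile.* in a directory.

```shell
# Hypothetical helper: count the profile.* files in a directory
# (default: the current directory).
count_profiles() {
    dir="${1:-.}"
    count=0
    for f in "$dir"/profile.*; do
        # If nothing matches, the unexpanded pattern is left in $f,
        # so only count entries that really exist.
        if [ -e "$f" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}

echo "profile files in current directory: $(count_profiles)"
```

If the count is nonzero, running pprof or paraprof from that same directory should pick the files up, assuming the usual behavior of reading profile.* from the current directory.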
5. Tracing
Traces can be captured by setting:
export TAU_TRACE=1
before running your application. The traces also need to be post-processed; issue these commands:
aprun -N 1 tau_exec -T serial,cupti -cupti ./matmult
tau_multimerge
tau2slog2 tau.trc tau.edf -o matmult.slog2
jumpshot matmult.slog2
Jumpshot is a common trace visualizer bundled with TAU.
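The tracing steps can be collected into one shell function so that a failed step stops the sequence. This is a sketch under assumptions: the names run and trace_matmult and the DRY_RUN switch are hypothetical conveniences, not TAU features; the commands themselves are the ones given in the tracing step.

```shell
# Print a command, and execute it only when DRY_RUN is unset or empty.
run() {
    echo "+ $*"
    if [ -z "${DRY_RUN:-}" ]; then
        "$@"
    fi
}

# Hypothetical wrapper around the tracing steps; stops at the first failure.
trace_matmult() {
    export TAU_TRACE=1    # enable tracing for the run
    run aprun -N 1 tau_exec -T serial,cupti -cupti ./matmult || return 1
    run tau_multimerge || return 1    # produces the tau.trc and tau.edf used next
    run tau2slog2 tau.trc tau.edf -o matmult.slog2 || return 1
    run jumpshot matmult.slog2
}

# Preview the commands without executing them.
DRY_RUN=1 trace_matmult
```

Calling trace_matmult without DRY_RUN set runs the real commands in order.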