Difference between revisions of "Openacc"
Revision as of 19:53, 15 March 2012
Matrix Multiply
TAU has support for the OpenACC directives available in PGI 12.3 and greater. Configure TAU:
./configure -c++=pgCC -cc=pgcc -fortran=pgi -cuda=<path to CUDA 4.1 or greater>
make install
export TAU_MAKEFILE=<path to TAU>/x86_64/lib/Makefile.tau-cupti-pgi
export TAU_OPTIONS='-optVerbose -optShared'
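Before compiling, it can help to confirm that TAU_MAKEFILE really points at the makefile produced by the install step. The following is a minimal sketch; check_tau_env is a hypothetical helper written for this page, not part of TAU, and it only tests that the variable is set and names an existing file.

```shell
# Hypothetical helper (not shipped with TAU): verify that TAU_MAKEFILE
# is set and refers to an existing file before invoking tau_f90.sh.
check_tau_env() {
    if [ -z "${TAU_MAKEFILE:-}" ]; then
        echo "TAU_MAKEFILE is not set"
        return 1
    fi
    if [ ! -f "$TAU_MAKEFILE" ]; then
        echo "TAU_MAKEFILE does not point to a file: $TAU_MAKEFILE"
        return 1
    fi
    echo "TAU_MAKEFILE ok: $TAU_MAKEFILE"
}
```

Calling check_tau_env after the export lines prints one status line and returns non-zero when the makefile is missing, so it can guard the compile step, for example: check_tau_env && tau_f90.sh -ta=nvidia,time -Minfo matmult.f90 -o matmult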
Compile:
tau_f90.sh -ta=nvidia,time -Minfo matmult.f90 -o matmult
Run:
tau_exec -T serial,cupti -cupti ./matmult
module
Use TAU analysis tool to view performance data:
pprof
paraprof