WRF

From Tau Wiki

Installing netcdf

Example (neuronic):

export NETCDF=$HOME/packages/i386/netcdf-3.6.0-p1
tar -xzf netcdf.tar.gz
cd netcdf-3.6.0-p1/src
CC=icc CXX=icpc FC=ifort CPPFLAGS=-DpgiFortran ./configure --prefix=$NETCDF
make
mkdir -p $NETCDF
make install
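
Before moving on to WRF, it can help to confirm that the install step put the library and the Fortran include file under $NETCDF. The following is a minimal sketch and not part of the original recipe: the function name check_netcdf is hypothetical, and the file list assumes the usual netCDF 3.x layout.

```shell
# Hypothetical helper: report whether a netCDF 3.x install prefix looks complete.
# Assumes the usual layout: lib/libnetcdf.a, include/netcdf.h, include/netcdf.inc.
check_netcdf() {
    prefix="${1:-${NETCDF:-}}"
    status=0
    for f in lib/libnetcdf.a include/netcdf.h include/netcdf.inc; do
        if [ -f "$prefix/$f" ]; then
            echo "ok:      $prefix/$f"
        else
            echo "missing: $prefix/$f"
            status=1
        fi
    done
    return $status
}
```

Calling check_netcdf "$NETCDF" prints one line per file and returns non-zero if any of them is missing.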

Building WRF (uninstrumented)

Example (neuronic):

export NETCDF=$HOME/packages/i386/netcdf-3.6.0-p1
./configure

Select option 11 (for neuronic, MPI), then compile:

./compile em_real

On neuronic, the Intel Fortran compiler fails with the following error:

fortcom: Severe: **Internal compiler error: segmentation violation signal raised** Please report this error along with the circumstances in which it occurred in a Software Problem Report.  Note: File and line given may not be explicit cause of this error.

As a workaround, change to external/esmf_time_f90 and run the failing make command again by hand, e.g.:

make FC="mpif90 -f90=ifort  -FR -cm -w -I.  -convert big_endian -mp" \
CPP="/lib/cpp -traditional -I../../inc -I. -DNO_RRTM_PHYSICS  -DEM_CORE=1 \
-DNMM_CORE=0 -DNMM_MAX_DIM=1250 -DCOAMPS_CORE=0 -DEXP_CORE=0 -DNONSTANDARD_SYSTEM \
-DF90_STANDALONE -DCONFIG_BUF_LEN=8192 -DMAX_DOMAINS_F=21 -DYYY"

Note: the MPI implementation used is whichever mpicc and mpif90 come first in your PATH.
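
Since the build picks up the MPI wrappers from PATH, a quick check of which ones the shell resolves can avoid mixing MPI installations. This is a small sketch; the function name show_mpi_wrappers is hypothetical and not part of WRF or TAU.

```shell
# Hypothetical helper: print the mpicc and mpif90 that the shell would run.
show_mpi_wrappers() {
    status=0
    for tool in mpicc mpif90; do
        if found=$(command -v "$tool"); then
            echo "$tool -> $found"
        else
            echo "$tool not found in PATH" >&2
            status=1
        fi
    done
    return $status
}
```

Run show_mpi_wrappers before ./configure; both lines should point into the same MPI installation.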

Instrumenting WRF

Configure TAU with PDT, Fortran 90, and MPI support. Then edit configure.wrf so that the TAU compiler wrapper replaces the MPI Fortran compiler and linker:

< FC              =       mpif90 -f90=ifort
< LD              =       mpif90 -f90=ifort
---
> FC              =       tau_f90.sh
> LD              =       tau_f90.sh
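
The same edit can be scripted rather than made by hand. The sketch below assumes the FC and LD lines in configure.wrf look exactly like the "<" lines in the diff above; the function name use_tau_wrapper is hypothetical, and the original file is kept as configure.wrf.orig.

```shell
# Hypothetical helper: swap "mpif90 -f90=ifort" for tau_f90.sh on the FC and LD
# lines of configure.wrf. Assumes those lines match the diff shown above.
use_tau_wrapper() {
    file="${1:-configure.wrf}"
    cp "$file" "$file.orig"    # keep a backup of the uninstrumented settings
    sed -e 's|^\(FC[[:space:]]*=[[:space:]]*\)mpif90 -f90=ifort|\1tau_f90.sh|' \
        -e 's|^\(LD[[:space:]]*=[[:space:]]*\)mpif90 -f90=ifort|\1tau_f90.sh|' \
        "$file.orig" > "$file"
}
```

Running diff configure.wrf.orig configure.wrf afterwards should show only the FC and LD lines changed.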

Then set TAU_MAKEFILE, TAU_OPTIONS, and compile:

export TAU_MAKEFILE=$HOME/tau2/include/Makefile
export TAU_OPTIONS='-optVerbose -optKeepFiles -optPdtF95Opts="-R free"'
./clean
./compile em_real

Running WRF (simple)

cd test/em_real
mpirun -np 4 ./wrf.exe

Running WRF (real)

tar -xzvf wrfinput.tar.gz
mpirun -np 3 ./wrf.exe

For our runs, we modified namelist.input to run only 8 hours of simulation time. Edit namelist.input: ...
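
After a run of the instrumented wrf.exe, TAU normally leaves one profile file per MPI rank in the run directory. The sketch below checks for them; it assumes TAU's default profile naming (profile.<node>.<context>.<thread>) with a single context and thread per rank, and the function name check_profiles is hypothetical.

```shell
# Hypothetical helper: check that a run left one TAU profile file per MPI rank.
# Assumes TAU's default naming, profile.<node>.<context>.<thread>, with a
# single context and thread per rank (profile.0.0.0, profile.1.0.0, ...).
check_profiles() {
    expected="$1"
    dir="${2:-.}"
    found=0
    rank=0
    while [ "$rank" -lt "$expected" ]; do
        if [ -f "$dir/profile.$rank.0.0" ]; then
            found=$((found + 1))
        else
            echo "missing: $dir/profile.$rank.0.0"
        fi
        rank=$((rank + 1))
    done
    echo "found $found of $expected profile files"
    [ "$found" -eq "$expected" ]
}
```

For the 3-process run above, check_profiles 3 should report all three files; the profiles can then be viewed with TAU's pprof or paraprof from the same directory.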

Performance Data

The following runs are from LLNL's MCR P4 Xeon (2.4 GHz) Linux cluster. They were run during the week of 2005-07-13.

  • Flat Profile

8 CPUs, 16 CPUs, 32 CPUs, 64 CPUs, 128 CPUs, 256 CPUs, 512 CPUs, 1024 CPUs

  • Callpath Profile (Full depth)

8 CPUs, 16 CPUs, 32 CPUs, 64 CPUs, 128 CPUs, 256 CPUs, 512 CPUs

  • Flat Profile with PAPI (Metrics: P_WALL_CLOCK_TIME, PAPI_FP_INS, and PAPI_L1_DCM)

8 CPUs, 16 CPUs, 32 CPUs, 64 CPUs, 128 CPUs, 256 CPUs, 512 CPUs