
Using and Compiling with YAKL

Matt Norman edited this page May 25, 2022 · 14 revisions

Source files

To use YAKL in a source file, you must include the YAKL.h header file. This includes nearly all of the YAKL library.

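#include <mpi.h>
#include "YAKL.h"
int main(int argc, char **argv) {
  MPI_Init(&argc , &argv);  // Initialize MPI first so YAKL can query the rank
  yakl::init();             // Must precede any YAKL operation
  {
    // Inner scope: Array objects created here are destroyed at the closing brace
    ...
  }
  yakl::finalize();         // All Array objects must be freed by this point
  MPI_Finalize();
}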

yakl::init() must be called before any YAKL operation is performed (creating Array objects, invoking parallel_for, etc.). Likewise, all operations must be complete and all Array objects must be freed before calling yakl::finalize(). If you are using MPI, initialize MPI before initializing YAKL, because YAKL needs the MPI rank to ensure that only rank 0 writes to stdout when informing the user.
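The inner braces in the main() above are what guarantee that objects are destroyed before finalize runs. The following is a minimal sketch of that C++ scoping behavior using a hypothetical stand-in runtime. The `demo` namespace, `Buffer`, `run_scoped`, and `run_unscoped` are illustrative assumptions and not part of the YAKL API.

```cpp
#include <cassert>
#include <string>
#include <vector>

namespace demo {  // hypothetical stand-in for a runtime with init/finalize (NOT YAKL)

  bool runtime_alive = false;            // true between init() and finalize()
  std::vector<std::string> event_log;    // records the order of events

  void init()     { runtime_alive = true;  event_log.push_back("init"); }
  void finalize() { runtime_alive = false; event_log.push_back("finalize"); }

  // Stand-in for an object that owns runtime-managed memory.
  struct Buffer {
    Buffer()  { assert(runtime_alive); event_log.push_back("alloc"); }
    ~Buffer() { event_log.push_back(runtime_alive ? "free" : "free-after-finalize"); }
  };

  // Same structure as the main() above: the inner braces end the Buffer's
  // lifetime before finalize() is called.
  void scoped_body() {
    init();
    {
      Buffer b;
    }              // b's destructor runs here, while the runtime is still alive
    finalize();
  }

  // Without the inner braces, b lives until the end of the function,
  // so its destructor runs after finalize().
  void unscoped_body() {
    init();
    Buffer b;
    finalize();
  }                // b's destructor runs here, too late

  std::vector<std::string> run_scoped()   { event_log.clear(); scoped_body();   return event_log; }
  std::vector<std::string> run_unscoped() { event_log.clear(); unscoped_body(); return event_log; }
}
```

In this sketch, `demo::run_scoped()` records the free before the finalize, while `demo::run_unscoped()` records it afterwards. The second ordering is the one the text above warns against.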

The YAKL.h file excludes the following, which you will need to include separately if you want to use that functionality:

  • #include "YAKL_netcdf.h": For NetCDF I/O
  • #include "YAKL_pnetcdf.h": For Parallel NetCDF I/O
  • #include "YAKL_fft.h": For simple and small FFTs performed on SArray and FSArray objects
  • #include "YAKL_tridiagonal.h": For small tridiagonal solves on SArray and FSArray objects
  • #include "YAKL_pentadiagonal.h": For small pentadiagonal solves on SArray and FSArray objects

Building with YAKL

It is highly recommended to use the YAKL CMake integration via add_subdirectory(/path/to/yakl /path/to/binary/location).

CMake

To use YAKL in another CMake project, the following is a template workflow to use in your CMakeLists.txt, assuming a target called TARGET and a yakl repo in the directory ${YAKL_HOME}.

# YAKL_ARCH can be CUDA, HIP, SYCL, OPENMP, or empty (for serial CPU backend)
set(YAKL_ARCH "CUDA")  
# Set YAKL_[ARCH_NAME]_FLAGS where ARCH_NAME is CUDA, HIP, SYCL, OPENMP, or CXX (for CPU targets)
set(YAKL_CUDA_FLAGS "-O3 -arch sm_70 -ccbin mpic++")
# YAKL_F90_FLAGS will be used for compiling internal YAKL Fortran 90 files
#    Currently only for gator_mod.F90, Fortran hooks to the YAKL pool allocator
#        and YAKL init() and finalize()
set(YAKL_F90_FLAGS "-O3")
# Add the YAKL library and perform other needed target tasks
# This provides a CMake status message giving the flags YAKL is using for all YAKL source files
# This will also set ${YAKL_COMPILER_FLAGS} in the parent scope for the user to check
add_subdirectory(${YAKL_HOME} ./yakl)
# Set YAKL properties on all C++ source files in a target, 
#    link yakl into that target, and set the appropriate C++ standard
include(${YAKL_HOME}/yakl_utils.cmake)
yakl_process_target(TARGET)

YAKL's yakl_process_target() macro processes the target's C++ source files, automatically links the yakl library target into the TARGET you pass in, and sets the appropriate C++ standard. For the CUDA backend, YAKL will set all C++ source files as CUDA files for CMake compilation.

Important:

  • yakl_process_target() applies YAKL's flags and other tasks to all of the target's C++ files. If this is not desirable, and you'd rather deal with a list of C++ source files, you can use yakl_process_cxx_source_files("${files_list}"), where ${files_list} is a list of C++ source files you want to process with YAKL flags and CMake attributes. Be sure not to forget the quotation marks around the variable, or the list may not properly make its way to the macro. When using yakl_process_cxx_source_files, you also need to call target_link_libraries() and set the appropriate C++ standard yourself.
  • The only flags that make their way into YAKL source files come from the CMake variable YAKL_<LANG>_FLAGS (where <LANG> is the same as YAKL_ARCH, except when YAKL_ARCH isn't defined, in which case <LANG> is CXX). If you have set the CXXFLAGS environment variable, those flags will also be included for all backends except CUDA, for which the CUDAFLAGS environment variable will be included instead. YAKL does not use the CMAKE_CXX_FLAGS flags.

The following compiler flags will also affect YAKL behavior:

  • -DYAKL_DEBUG: Turns on YAKL's debugging, including checks for array index bounds, invalid host and device data accesses, use of uninitialized Array objects, and more. This is valid for all hardware backends, and it does significantly slow down the code.
  • -DYAKL_MANAGED_MEMORY: Leads to CUDA, HIP, and SYCL backends using Managed memory allocations. If -D_OPENACC is also specified, then managed memory allocations will also tell the OpenACC runtime to ignore memory addresses in the ranges allocated. If -D_OPENMP45 is also specified, then the same thing is done for the OpenMP runtime. This causes those runtimes to avoid moving the data, since the Managed memory runtime will automatically do this.
  • -DHAVE_MPI: Highly recommended for all MPI applications to allow YAKL to only inform the user from the master task (rank 0).
  • -DMEMORY_DEBUG: Turns on memory debugging printing to help the user identify memory issues. This produces a lot of output, and it is recommended to only do this on runs with one MPI task.
  • -DYAKL_PROFILE: Turn on the YAKL timers so that calls to yakl::timer_start() and yakl::timer_stop() will actually call the timers.
  • -DYAKL_AUTO_PROFILE: Turn on the YAKL timers, and automatically add timers around all parallel_for calls that provide a label.
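As an illustration of how a flag such as -DYAKL_PROFILE can switch behavior on without changing the calling code, here is a small sketch of compile-time gating. It is only a stand-in: the `demo` namespace and its `timer_start`, `timer_stop`, and `timed_sum` functions are hypothetical, and how YAKL implements its timers internally is not described on this page.

```cpp
#include <cassert>
#include <string>
#include <vector>

namespace demo {  // hypothetical stand-in, not part of YAKL

  // Events an active timer would record.
  std::vector<std::string> timer_events;

  // The macro name comes from the flag list above; everything else is assumed.
#ifdef YAKL_PROFILE
  constexpr bool profiling_enabled = true;
#else
  constexpr bool profiling_enabled = false;
#endif

  // These calls record nothing unless the code is compiled with -DYAKL_PROFILE.
  void timer_start(const char *label) {
    if (profiling_enabled) timer_events.push_back(std::string("start:") + label);
  }
  void timer_stop(const char *label) {
    if (profiling_enabled) timer_events.push_back(std::string("stop:") + label);
  }

  // The timer calls stay in the source either way; only the build flag
  // decides whether they do any work.
  int timed_sum(int n) {
    assert(n >= 0);
    timer_start("sum");
    int sum = 0;
    for (int i = 1; i <= n; i++) sum += i;
    timer_stop("sum");
    return sum;
  }
}
```

Compiled without the flag, `demo::timed_sum(10)` returns 55 and `demo::timer_events` stays empty. Compiled with -DYAKL_PROFILE, the same call also records a start and a stop event.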

Table of backend options

Here is a table describing the CMake options to enable a given backend. YAKL automatically sets the C++ standard to 17 inside CMake for all YAKL source files.

| Backend | -DYAKL_ARCH | Setting flags | Notes |
|---|---|---|---|
| None / CPU | -DYAKL_ARCH="" | -DYAKL_CXX_FLAGS="..." | You must add all flags yourself. |
| OpenMP 3.5 | -DYAKL_ARCH="OPENMP" | -DYAKL_OPENMP_FLAGS="..." | You must add all flags yourself, including OpenMP-enabling flags. |
| CUDA | -DYAKL_ARCH="CUDA" | -DYAKL_CUDA_FLAGS="..." | YAKL automatically adds -x cu --expt-extended-lambda --expt-relaxed-constexpr -Wno-deprecated-gpu-targets -DTHRUST_IGNORE_CUB_VERSION_CHECK -std=c++17. You must add -arch sm_?? yourself along with any other CUDA specific options you want. Please do not specify -std=c++?? in your YAKL_CUDA_FLAGS. CUDA doesn't like multiple definitions of the standard. If -std=c++?? is added through other avenues in CMake, this may also cause a problem. CUDA also does not like multiple optimization level flags -O[0-4]. |
| HIP | -DYAKL_ARCH="HIP" | -DYAKL_HIP_FLAGS="..." | You must add all flags yourself. |
| SYCL | -DYAKL_ARCH="SYCL" | -DYAKL_SYCL_FLAGS="..." | You must add all flags yourself. |

Traditional Makefile

If you want to use a Makefile approach, select a YAKL backend by passing -DYAKL_ARCH_<LANG> to the compiler, where <LANG> is one of the previously mentioned backend labels (CXX, CUDA, HIP, SYCL, or OPENMP). You're on your own regarding any other compilation issues, but you can use the CMakeLists.txt file for guidance. It is kept as simple as possible for this purpose.
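When wiring up a Makefile by hand, it can help to confirm that the intended -DYAKL_ARCH_<LANG> definition actually reached the compiler. The sketch below is one way to check this. Only the macro names are taken from the text above; the helper functions `yakl_arch_macro_seen` and `require_yakl_arch` are hypothetical and not part of YAKL.

```cpp
#include <cassert>
#include <string>

// Reports which YAKL_ARCH_<LANG> macro, if any, was defined on the compile line.
// Hypothetical helper for checking a hand-written build.
std::string yakl_arch_macro_seen() {
#if defined(YAKL_ARCH_CUDA)
  return "CUDA";
#elif defined(YAKL_ARCH_HIP)
  return "HIP";
#elif defined(YAKL_ARCH_SYCL)
  return "SYCL";
#elif defined(YAKL_ARCH_OPENMP)
  return "OPENMP";
#elif defined(YAKL_ARCH_CXX)
  return "CXX";
#else
  return "none";   // no YAKL_ARCH_<LANG> macro was passed
#endif
}

// Aborts (in builds without NDEBUG) if the build did not select the expected backend.
void require_yakl_arch(const std::string &expected) {
  assert(yakl_arch_macro_seen() == expected);
}
```

For example, calling `require_yakl_arch("CUDA")` early in a test program would catch a Makefile rule that silently dropped the -DYAKL_ARCH_CUDA flag.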
