===
HSA
===

Note: pocl's HSA support is currently experimental.

The experimental HSA driver works only with AMD Kaveri or Carrizo APUs
and requires an LLVM and Clang built with HSAIL support. In addition, you
will need a recent Linux kernel (4.0+) and the software listed below.

Installing prerequisite software
---------------------------------

1) Install the HSA AMD runtime library
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  Pre-built binaries can be found here:

  https://github.com/HSAFoundation/HSA-Runtime-AMD

  This usually installs into /opt/hsa. Make sure to read the Q&A in README.md, which
  lists some common issues (like /dev/kfd permissions), and run sample/vector_copy
  to verify you have a working runtime.
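A quick pre-flight check for the two requirements mentioned so far (kernel 4.0+ and access to /dev/kfd) could look like the sketch below. The helper names `kernel_at_least` and `kfd_accessible` are made up for this example and are not part of pocl or the HSA runtime.

```shell
#!/bin/sh
# Pre-flight sketch: is the kernel new enough, and can we open /dev/kfd?

# kernel_at_least MAJOR MINOR [RELEASE]
# Succeeds if RELEASE (default: output of `uname -r`) is >= MAJOR.MINOR.
kernel_at_least() {
    want_major=$1
    want_minor=$2
    release=${3:-$(uname -r)}
    have_major=${release%%.*}        # text before the first dot
    rest=${release#*.}               # text after the first dot
    have_minor=${rest%%[!0-9]*}      # leading digits of the remainder
    [ "$have_major" -gt "$want_major" ] ||
        { [ "$have_major" -eq "$want_major" ] &&
          [ "${have_minor:-0}" -ge "$want_minor" ]; }
}

# kfd_accessible [DEVICE]
# Succeeds if DEVICE (default: /dev/kfd) is readable and writable by the
# current user.
kfd_accessible() {
    dev=${1:-/dev/kfd}
    [ -r "$dev" ] && [ -w "$dev" ]
}

if kernel_at_least 4 0; then
    echo "kernel: OK ($(uname -r))"
else
    echo "kernel: older than 4.0 ($(uname -r))"
fi

if kfd_accessible; then
    echo "/dev/kfd: OK"
else
    echo "/dev/kfd: missing, or not readable/writable by $(id -un)"
fi
```

If the second check fails, the Q&A in the runtime's README.md is the place to look for the recommended permission setup.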

2) Build & install the LLVM with HSAIL support
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  Fetch the HSAIL branch of LLVM 3.7:

  `git clone https://github.com/HSAFoundation/HLC-HSAIL-Development-LLVM/ -b hsail-stable-3.7`

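  Patch it (run from the root of the checkout; note that patch reads the patch file from standard input):

  `cd HLC-HSAIL-Development-LLVM; patch -p1 < PATHTO/pocl/tools/patches/llvm-3.7-hsail-branch.patch`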

  Fetch the upstream Clang's 3.7 branch:

  `cd tools; svn co http://llvm.org/svn/llvm-project/cfe/branches/release_37 clang`

  Patch it as well:

  `cd clang; patch -p0 < PATHTO/pocl/tools/patches/clang-3.7-hsail-branch.patch`

  An LLVM cmake configuration command like this worked for me:

  `cd ../../ ; mkdir build; cd build; cmake .. -DLLVM_EXPERIMENTAL_TARGETS_TO_BUILD=HSAIL \
  -DBUILD_SHARED_LIBS=OFF -DCMAKE_INSTALL_PREFIX=INSTALL_DIR -DLLVM_ENABLE_RTTI=ON \
  -DLLVM_BUILD_LLVM_DYLIB=ON -DLLVM_ENABLE_EH=ON -DHSAIL_USE_LIBHSAIL=OFF`

  HSAIL_USE_LIBHSAIL=OFF is only for safety. If you accidentally build clang with libHSAIL,
  it will cause mysterious link errors later when building pocl.

  Change INSTALL_DIR to your target prefix of choice. Note that these options are **required**:

  `-DLLVM_ENABLE_RTTI=ON -DLLVM_ENABLE_EH=ON -DLLVM_EXPERIMENTAL_TARGETS_TO_BUILD=HSAIL`

  Also, if you restrict the set of targets instead of building all the default
  ones, make sure AMDGPU is still among them.

  Then build and install Clang/LLVM:

  `make -j4 && make install`
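Before moving on, it may be worth confirming that the installed LLVM really contains the HSAIL and AMDGPU targets. The sketch below uses `llvm-config --targets-built`; the `has_target` helper is hypothetical, and the assumption that the HSAIL branch reports the experimental target under the name `HSAIL` has not been verified here.

```shell
#!/bin/sh
# Sketch: check which targets an installed LLVM was built with.

# has_target NAME "LIST"
# Succeeds if NAME appears as a whole word in the space-separated LIST.
has_target() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# INSTALL_DIR is the placeholder used in the cmake command above;
# override LLVM_CONFIG in the environment to point at the real binary.
LLVM_CONFIG=${LLVM_CONFIG:-INSTALL_DIR/bin/llvm-config}

if [ -x "$LLVM_CONFIG" ]; then
    targets=$("$LLVM_CONFIG" --targets-built)
    for t in HSAIL AMDGPU; do
        if has_target "$t" "$targets"; then
            echo "$t: built"
        else
            echo "$t: MISSING"
        fi
    done
else
    echo "llvm-config not found at $LLVM_CONFIG"
fi
```

Run it as, for example, `LLVM_CONFIG=$HOME/llvm-3.7-hsa/bin/llvm-config sh check-llvm.sh` (the script name is only illustrative).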


3) Build HSAIL-Tools
~~~~~~~~~~~~~~~~~~~~~

   `git clone https://github.com/HSAFoundation/HSAIL-Tools`

   Build it (check CMAKE_INSTALL_PREFIX):

   `mkdir -p build/lnx64
    cd build/lnx64
    cmake ../.. -DCMAKE_INSTALL_PREFIX=$HOME/bin
    make -j`

   You might need to add

   `-DCMAKE_CXX_FLAGS=-I$HOME/llvm-3.7-hsa/include` or similar to the cmake command line
   if it doesn't find your LLVM headers.

   In particular, the **HSAILasm** executable is required by pocl.
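Since the pocl build needs the path to HSAILasm, a small helper can locate the binary in the build tree. This is only a sketch: `find_hsailasm` is a made-up name, and where exactly the HSAIL-Tools build places the executable under build/lnx64 is not assumed, which is why the tree is searched.

```shell
#!/bin/sh
# Sketch: locate the HSAILasm executable produced by the HSAIL-Tools build.

# find_hsailasm [DIR]
# Prints the path of the first user-executable file named HSAILasm found
# under DIR (default: build/lnx64), or nothing if there is none.
find_hsailasm() {
    find "${1:-build/lnx64}" -type f -name HSAILasm -perm -u+x 2>/dev/null |
        head -n 1
}

# Run from the top of the HSAIL-Tools checkout.
HSAILASM=$(find_hsailasm build/lnx64)
if [ -n "$HSAILASM" ]; then
    echo "HSAILASM=$HSAILASM"
else
    echo "HSAILasm not found under build/lnx64; did the build finish?"
fi
```

The printed path is what the pocl configuration step expects as the HSAILasm location.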


4) Build pocl
~~~~~~~~~~~~~~~

  Using autotools:

    `./configure --with-hsa-runtime-dir=\</opt/hsa\>
    LLVM_CONFIG=<hsail-built-llvm-dir>/bin/llvm-config
    HSAILASM=\<path/to/HSAILasm\>`

  Or using cmake:

    `cmake -DENABLE_HSA=ON -DWITH_HSA_RUNTIME_DIR=\</opt/hsa\>
    -DWITH_HSAILASM_PATH=\<path/to/HSAILasm\>`

  Both should result in "hsa" appearing in pocl's targets to build ("OCL_TARGETS"
  in cmake output, "Enabled device drivers:" in autoconf output).
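If the configure output was saved to a file, the check described above can be scripted. The sketch below assumes that "hsa" appears as a separate word on the same line as "OCL_TARGETS" or "Enabled device drivers:"; the exact layout of those lines is not confirmed here, and `hsa_enabled` and `configure.log` are names invented for the example.

```shell
#!/bin/sh
# Sketch: look for "hsa" on the target/driver summary line of a saved
# configure log (e.g. produced with `cmake ... 2>&1 | tee configure.log`).

# hsa_enabled LOGFILE
# Succeeds if a line mentioning OCL_TARGETS or "Enabled device drivers:"
# also contains the word "hsa".
hsa_enabled() {
    grep -E 'OCL_TARGETS|Enabled device drivers:' "$1" 2>/dev/null |
        grep -qw hsa
}

if [ -f configure.log ]; then
    if hsa_enabled configure.log; then
        echo "HSA driver: enabled"
    else
        echo "HSA driver: not enabled, check the HSA-related configure options"
    fi
fi
```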

5) Run tests & play around
~~~~~~~~~~~~~~~~~~~~~~~~~~~

  After building pocl, you can smoke test the HSA driver by executing the HSA
  tests of the pocl testsuite:

  `make check TESTSUITEFLAGS="-k hsa"`


HSA Support notes
------------------

Note that the support is still experimental and very much unfinished. You're
welcome to try it out and report any issues, though.

What's implemented:
 * global/local/private memory
 * atomics, barriers
 * most of the OpenCL kernel library builtins

What's missing:
 * printf() is not implemented
 * several builtins are not implemented yet (erf(c), lgamma, tgamma,
   logb, remainder, nextafter), and some are suboptimal or may give incorrect
   results with under/overflows (e.g. hypot, length, distance). We're working on
   this; if you find any problems, please let us know.
 * image support is not implemented
 * OpenCL 2.0 features (SVM) are unimplemented
 * Performance is suboptimal in many cases
