From 94bc968f7d097b075fc1be79c4337fe2b39719a0 Mon Sep 17 00:00:00 2001
From: fxzjshm
Date: Mon, 3 Feb 2025 19:41:42 +0800
Subject: [PATCH] HIP: add doc on small default launch bounds

Related: #10610

Signed-off-by: fxzjshm
---
 docs/build.md | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/docs/build.md b/docs/build.md
index dd6495028..45e0e7b8b 100644
--- a/docs/build.md
+++ b/docs/build.md
@@ -197,6 +197,19 @@ You can download it from your Linux distro's package manager or from here: [ROCm
       && cmake --build build -- -j 16
   ```
 
+  If you get the following error during execution (kernel name might vary):
+  ```
+  Launch params (1024, 1, 1) are larger than launch bounds (256) for kernel _ZL12rms_norm_f32ILi1024EEvPKfPfif please add launch_bounds to kernel define or use --gpu-max-threads-per-block recompile program !
+  ```
+  this occurs because the compiler uses a smaller default launch bound value.
+  Try reconfigure with `HIPFLAGS="--gpu-max-threads-per-block=1024"` and rebuild, e.g.
+  ```bash
+  HIPCXX="$(hipconfig -l)/clang" HIP_PATH="$(hipconfig -p)" \
+  HIPFLAGS="--gpu-max-threads-per-block=1024" \
+      cmake -S . -B build -DGGML_HIP=ON -DAMDGPU_TARGETS=gfx906 -DCMAKE_BUILD_TYPE=Release \
+      && cmake --build build -- -j 16
+  ```
+
 - Using `CMake` for Windows (using x64 Native Tools Command Prompt for VS, and assuming a gfx1100-compatible AMD GPU):
   ```bash
   set PATH=%HIP_PATH%\bin;%PATH%