clarify build process for ROCm on linux with cmake
This commit is contained in:
parent
d59c0b3a56
commit
e754a83a40
1 changed files with 6 additions and 6 deletions
12
README.md
12
README.md
|
@ -421,12 +421,11 @@ Building the program with BLAS support may lead to some performance improvements
|
|||
```bash
|
||||
make LLAMA_HIPBLAS=1
|
||||
```
|
||||
- Using `CMake` for Linux:
|
||||
- Using `CMake` for Linux.
|
||||
```bash
|
||||
mkdir build
|
||||
cd build
|
||||
CC=/opt/rocm/llvm/bin/clang CXX=/opt/rocm/llvm/bin/clang++ cmake .. -DLLAMA_HIPBLAS=ON
|
||||
cmake --build .
|
||||
CC=/opt/rocm/llvm/bin/clang CXX=/opt/rocm/llvm/bin/clang++ \
|
||||
cmake -H. -Bbuild -DLLAMA_HIPBLAS=ON -DAMDGPU_TARGETS=gfx1030 -DCMAKE_BUILD_TYPE=Release \
|
||||
&& cmake --build build -- -j 16
|
||||
```
|
||||
- Using `CMake` for Windows (using x64 Native Tools Command Prompt for VS):
|
||||
```bash
|
||||
|
@ -437,10 +436,11 @@ Building the program with BLAS support may lead to some performance improvements
|
|||
cmake --build .
|
||||
```
|
||||
Make sure that `AMDGPU_TARGETS` is set to the GPU arch you want to compile for. The above example uses `gfx1100` that corresponds to Radeon RX 7900XTX/XT/GRE. You can find a list of targets [here](https://llvm.org/docs/AMDGPUUsage.html#processors)
|
||||
Find your gpu version string by matching the most significant version information from `rocminfo | grep gfx | head -1 | awk '{print $2}'` with the list of processors, e.g. `gfx1035` maps to `gfx1030`.
|
||||
|
||||
|
||||
The environment variable [`HIP_VISIBLE_DEVICES`](https://rocm.docs.amd.com/en/latest/understand/gpu_isolation.html#hip-visible-devices) can be used to specify which GPU(s) will be used.
|
||||
If your GPU is not officially supported you can use the environment variable [`HSA_OVERRIDE_GFX_VERSION`] set to a similar GPU, for example 10.3.0 on RDNA2 or 11.0.0 on RDNA3.
|
||||
If your GPU is not officially supported you can use the environment variable [`HSA_OVERRIDE_GFX_VERSION`] set to a similar GPU, for example 10.3.0 on RDNA2 (e.g. gfx1030, gfx1031, or gfx1035) or 11.0.0 on RDNA3.
|
||||
The following compilation options are also available to tweak performance (yes, they refer to CUDA, not HIP, because it uses the same code as the cuBLAS version above):
|
||||
|
||||
| Option | Legal values | Default | Description |
|
||||
|
|
Loading…
Add table
Add a link
Reference in a new issue