test anchor link

Yuval Peled 2023-06-02 14:47:00 +03:00
parent ffb06a345e
commit 89b377d7f8
3 changed files with 9 additions and 0 deletions


@@ -293,6 +293,8 @@ Building the program with BLAS support may lead to some performance improvements
cmake --build . --config Release
```
<a name="cublas"></a>
- **cuBLAS**
This provides BLAS acceleration using the CUDA cores of your Nvidia GPU. Make sure to have the CUDA toolkit installed. You can download it from your Linux distro's package manager or from here: [CUDA Toolkit](https://developer.nvidia.com/cuda-downloads).
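The cuBLAS build described above can be sketched as follows. This is a minimal example, assuming a CMake-based build from the repository root; the `LLAMA_CUBLAS` option name reflects the build flag in use at the time of this commit and may differ in later versions.

```shell
# Configure an out-of-tree build with cuBLAS acceleration enabled
# (requires the CUDA toolkit to be installed and on the PATH)
mkdir build
cd build
cmake .. -DLLAMA_CUBLAS=ON

# Build in Release mode
cmake --build . --config Release
```

After building, running the resulting binary should report CUDA-related initialization in its stderr diagnostics if the GPU backend was compiled in.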

@@ -0,0 +1,7 @@
# Token generation performance tips
## Verifying that the model is running on the GPU
Make sure you compiled `llama.cpp` with the correct environment variables according to [this guide](../README.md#cublas).
When running, `llama.cpp` outputs some helpful diagnostic information to stderr.
To verify that the workload is