test anchor link

2023-06-02 14:47:00 +03:00 · 2023-06-02 14:47:00 +03:00 · 89b377d7f8
commit 89b377d7f8
parent ffb06a345e
3 changed files with 9 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -293,6 +293,8 @@ Building the program with BLAS support may lead to some performance improvements
  cmake --build . --config Release
  ```

+<a name="cublas"></a>
+
 - **cuBLAS**

  This provides BLAS acceleration using the CUDA cores of your Nvidia GPU. Make sure to have the CUDA toolkit installed. You can download it from your Linux distro's package manager or from here: [CUDA Toolkit](https://developer.nvidia.com/cuda-downloads).
--- a/docs/BLIS.md
+++ b/docs/BLIS.md
--- a/docs/token_generation_performance_tips.md
+++ b/docs/token_generation_performance_tips.md
@@ -0,0 +1,7 @@
+# Token generation performance tips
+
+## Verifying that the model is running on the GPU
+Make sure you compiled llama with the correct environment variables according to [this guide](../README.md#cublas).
+
+When running, `llama.cpp` outputs some helpful diagnostic information to stderr.
+To verify that the workload is