more build.md updates

2024-12-02 18:34:11 +01:00
commit 0ec5b62f27
parent bd8c5a81ac
1 changed file with 15 additions and 11 deletions
--- a/docs/build.md
+++ b/docs/build.md
@@ -63,11 +63,11 @@ When built with Metal support, you can explicitly disable GPU inference with the
 Building the program with BLAS support may lead to some performance improvements in prompt processing using batch sizes higher than 32 (the default is 512). Using BLAS doesn't affect the generation performance. There are currently several different BLAS implementations available for build and use:
-### Accelerate Framework:
+### Accelerate Framework
 This is only available on Mac PCs and it's enabled by default. You can just build using the normal instructions.
-### OpenBLAS:
+### OpenBLAS
 This provides BLAS acceleration using only the CPU. Make sure to have OpenBLAS installed on your machine.
@@ -82,15 +82,7 @@ This provides BLAS acceleration using only the CPU. Make sure to have OpenBLAS i
 Check [BLIS.md](./backend/BLIS.md) for more information.
-## SYCL
-SYCL is a higher-level programming model to improve programming productivity on various hardware accelerators.
-llama.cpp based on SYCL is used to **support Intel GPU** (Data Center Max series, Flex series, Arc series, Built-in GPU and iGPU).
-For detailed info, please refer to [llama.cpp for SYCL](./backend/SYCL.md).
-## Intel oneMKL
+### Intel oneMKL
 Building through oneAPI compilers will make avx_vnni instruction set available for intel processors that do not support avx512 and avx512_vnni. Please note that this build config **does not support Intel GPU**. For Intel GPU support, please refer to [llama.cpp for SYCL](./backend/SYCL.md).
@@ -107,6 +99,18 @@ Building through oneAPI compilers will make avx_vnni instruction set available f
 Check [Optimizing and Running LLaMA2 on Intel® CPU](https://www.intel.com/content/www/us/en/content-details/791610/optimizing-and-running-llama2-on-intel-cpu.html) for more information.
+### Other BLAS libraries
+Any other BLAS library can be used by setting the `GGML_BLAS_VENDOR` option. See the [CMake documentation](https://cmake.org/cmake/help/latest/module/FindBLAS.html#blas-lapack-vendors) for a list of supported vendors.
+## SYCL
+SYCL is a higher-level programming model to improve programming productivity on various hardware accelerators.
+llama.cpp based on SYCL is used to **support Intel GPU** (Data Center Max series, Flex series, Arc series, Built-in GPU and iGPU).
+For detailed info, please refer to [llama.cpp for SYCL](./backend/SYCL.md).
 ## CUDA
 This provides GPU acceleration using an NVIDIA GPU. Make sure to have the CUDA toolkit installed. You can download it from your Linux distro's package manager (e.g. `apt install nvidia-cuda-toolkit`) or from here: [CUDA Toolkit](https://developer.nvidia.com/cuda-downloads).
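
Reviewer note on the "Other BLAS libraries" text in this commit: the paragraph names the `GGML_BLAS_VENDOR` option but shows no invocation. The sketch below only prints the configure command it would use, so it can be inspected before running. The helper name `blas_configure_cmd`, the `build` directory, the `Generic` fallback, and the companion `-DGGML_BLAS=ON` switch are assumptions for illustration and are not taken from this diff.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) a CMake configure command
# for the given BLAS vendor. Only GGML_BLAS_VENDOR appears in the diff;
# -DGGML_BLAS=ON and the "build" directory are assumptions.
blas_configure_cmd() {
    # Fall back to CMake's "Generic" vendor when no argument is given.
    vendor="${1:-Generic}"
    printf 'cmake -B build -DGGML_BLAS=ON -DGGML_BLAS_VENDOR=%s\n' "$vendor"
}

# Example: show the command for OpenBLAS.
blas_configure_cmd OpenBLAS
```

Vendor names should be checked against the FindBLAS vendor list linked in the paragraph before running the printed command.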