build : on Mac OS enable Metal by default (#2901)

* build : on Mac OS enable Metal by default
* make : try to fix build on Linux
* make : move targets back to the top
* make : fix target clean
* llama : enable GPU inference by default with Metal
* llama : fix vocab_only logic when GPU is enabled
* common : better `n_gpu_layers` assignment
* readme : update Metal instructions
* make : fix merge conflict remnants
* gitignore : metal
2023-09-04 22:26:24 +03:00
commit e36ecdccc8
parent bd33e5ab92
9 changed files with 143 additions and 133 deletions
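
Note on usage: the README change in this commit names two compile-time switches (`LLAMA_NO_METAL=1` for `make`, `LLAMA_METAL=OFF` for CMake) and one run-time switch (`-ngl 0`). The sketch below shows how they might be invoked. It is not part of the commit. The `run` helper and the `build` directory name are assumptions made for illustration, and the binary and model paths are taken from the example the diff removes. By default it only prints the commands.

```bash
#!/usr/bin/env bash
# Sketch only: flags come from the README diff; `run` and the "build"
# directory are hypothetical. Commands are printed (dry run) unless RUN=1.
set -eu

run() {
  echo "+ $*"
  if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

# make: build without Metal
run env LLAMA_NO_METAL=1 make

# CMake: build without Metal ("build" is an assumed directory name)
run cmake -B build -DLLAMA_METAL=OFF
run cmake --build build --config Release

# Metal-enabled binary: disable GPU inference at run time
run ./main -m ./models/7B/ggml-model-q4_0.gguf -n 128 -ngl 0
```

Running it with `RUN=1` from the llama.cpp source tree executes the commands instead of only printing them. In practice one would pick either the `make` or the CMake variant, not both.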
--- a/README.md
+++ b/README.md
@@ -280,29 +280,11 @@ In order to build llama.cpp you have three different options.

 ### Metal Build

-Using Metal allows the computation to be executed on the GPU for Apple devices:
+On MacOS, Metal is enabled by default. Using Metal makes the computation run on the GPU.
+To disable the Metal build at compile time use the `LLAMA_NO_METAL=1` flag or the `LLAMA_METAL=OFF` cmake option.

- Using `make`:
-
-  ```bash
-  LLAMA_METAL=1 make
-  ```
-
- Using `CMake`:
-
-    ```bash
-    mkdir build-metal
-    cd build-metal
-    cmake -DLLAMA_METAL=ON ..
-    cmake --build . --config Release
-    ```
-
-When built with Metal support, you can enable GPU inference with the `--gpu-layers|-ngl` command-line argument.
-Any value larger than 0 will offload the computation to the GPU. For example:
-
-```bash
-./main -m ./models/7B/ggml-model-q4_0.gguf -n 128 -ngl 1
-```
+When built with Metal support, you can explicitly disable GPU inference with the `--gpu-layers|-ngl 0` command-line
+argument.

 ### MPI Build