tune: update readme

2023-06-19 13:50:35 +08:00 · 2023-06-19 13:50:35 +08:00 · 65fd65e0c1
commit 65fd65e0c1
parent 6609c229e8
1 changed file with 50 additions and 27 deletions
--- a/examples/mulmat-tune/README.md
+++ b/examples/mulmat-tune/README.md
@@ -19,21 +19,60 @@ run bench ahead of time (saving tens of seconds), but there are two shortcomings
  outdated format. So I integrated mulmat tune into `main` and `perplexity` as
  a complementary solution.
-## Build into main and perplexity
+The `load` mode tries to validate at least the following fields:
 - version
 - model
 - ftype
 - n_threads
 - n_profiles
 - profiles
 `n_threads` is critical to performance, so it is worth choosing carefully.
 When you run `main` or `perplexity`, n_threads is set automatically, and the
 default generally works well. Example:
 ```
 system_info: n_threads = 4 / 12
 ```
 This reads as: use 4 of 12 total cores (on a machine with 6 physical cores).
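 To make the reading of that line concrete, here is a minimal sketch that pulls
 the two numbers out of a `system_info` line. The helper name `parse_n_threads`
 is hypothetical and not part of `main` or `perplexity`; it assumes only the
 `n_threads = <used> / <total>` layout shown in the example above.
 ```sh
 # Hypothetical helper (an illustration, not part of llama.cpp):
 # reads log lines on stdin and prints "<used> <total>" for each
 # line of the form "system_info: n_threads = <used> / <total> ...".
 parse_n_threads() {
   sed -n 's|^system_info: n_threads = \([0-9][0-9]*\) / \([0-9][0-9]*\).*$|\1 \2|p'
 }

 # Usage with the example line from above:
 echo 'system_info: n_threads = 4 / 12' | parse_n_threads
 ```
 The first number is the thread count in use and the second is the total core
 count reported, so a line that does not match the layout produces no output.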
 ## Build
 Compile options:
 - `LLAMA_TUNE` for CMake (default ON)
 - `LLAMA_NO_TUNE` for Make (default undefined)
 `GGML_USE_TUNE` and `GGML_TUNE_NDEBUG` are defined when llama tune is enabled.
 When `GGML_USE_TUNE` is defined, mulmat_tune functionalities are compiled into
 main and perplexity:
 - CLI args `--tune` and `--tune-file` are available.
 - mul_mat tries to select the fastest task profile according to the tune result.
 The standalone tool `mulmat-tune` is always built; it has no compile options.
 **Makefile**
 To use tune, at least one of the vendors has to be built:
 - BLAS (ACCELERATE, OpenBLAS, BLIS)
 - CLBlast
 - CUDA (may not run)
 To enable debugging, comment out `-DGGML_TUNE_NDEBUG` in the Makefile.
 Makefile:
 ```
 make clean && make
 ```
-CMake (with BLAS):
+**CMake**
 ```
-cmake --build . --target clean
+rm -rf build/*
-cmake .. -DLLAMA_BLAS=ON
+cd build
 cmake ..
 cmake --build . --config Release
 ```
-Run examples:
+## Run main or perplexity
 ```
 # bench and run:
@@ -48,21 +87,7 @@ Run examples:
 ./main -m ./models/3B/open-llama-3b-q4-0.bin -c 512 -b 1024 -n 256 --keep 48 --repeat_penalty 1.0 --color -i -r "User:" -f prompts/chat-with-bob.txt -t 4 --tune-file <FILE>
 ```
-# Build the standalone `mulmat-tune`
+## Run mulmat-tune tool
 Makefile:
 ```
 make clean && make
 ```
 CMake (with BLAS)
 ```
 cmake --build . --target clean
 cmake .. -DLLAMA_BLAS=ON
 cmake --build . --config Release
 ```
 Run examples:
 ```
 ./mulmat-tune -h
@@ -82,8 +107,8 @@ Run examples:
 # customized n_pass: run 1 pass only instead of the default 3.
 ./mulmat-tune --n_pass 1
-# customized n_threads instead of the default 1.
+# customized n_threads instead of the default 4.
-./mulmat-tune --n_threads 4
+./mulmat-tune --n_threads 6
 # save to file
 ./mulmat-tune --file <FILE>
@@ -93,9 +118,7 @@ Run examples:
 ```
-# End to End Test
+## Example: Compare With Master
 ## Compare With Master
 You may want to run the following commands. Make sure the tune result file is
 set up properly.
@@ -103,7 +126,7 @@ setup properly.
 General steps:
 1. Run `./mulmat-tune -h` to see how to build for the various vendors.
-   To enable the debug, comment out `-DGGML_TUNE_NDEBUG` from makefile then run:
+   Then run:
   ```
   make clean; make