tune: update readme
parent 6609c229e8
commit 65fd65e0c1
1 changed files with 50 additions and 27 deletions
|
@ -19,21 +19,60 @@ run bench ahead of time (saving tens of seconds), but there are two shortcomings
|
||||||
outdated format. So I integrated mulmat tune into `main` and `perplexity` as
|
outdated format. So I integrated mulmat tune into `main` and `perplexity` as
|
||||||
a complementary solution.
|
a complementary solution.
|
||||||
|
|
||||||
## Build into main and perplexity
|
The `load` mode tries to validate at least the following fields:
|
||||||
|
- version
|
||||||
|
- model
|
||||||
|
- ftype
|
||||||
|
- n_threads
|
||||||
|
- n_profiles
|
||||||
|
- profiles
|
||||||
|
|
||||||
|
`n_threads` is critical to performance, so it is worth selecting the best value.
|
||||||
|
When you run `main` or `perplexity`, n_threads is set automatically, and the default
|
||||||
|
n_threads generally works well. Example:
|
||||||
|
```
|
||||||
|
system_info: n_threads = 4 / 12
|
||||||
|
```
|
||||||
|
This is read as: use 4 of the 12 total cores (6 physical cores).
|
||||||
|
|
||||||
|
## Build
|
||||||
|
|
||||||
|
Compile options:
|
||||||
|
- `LLAMA_TUNE` for CMake (default ON)
|
||||||
|
- `LLAMA_NO_TUNE` for Make (default undefined)
|
||||||
|
|
||||||
|
`GGML_USE_TUNE` and `GGML_TUNE_NDEBUG` are defined when llama tune is enabled.
|
||||||
|
|
||||||
|
When `GGML_USE_TUNE` is defined, mulmat_tune functionalities are compiled into
|
||||||
|
main and perplexity:
|
||||||
|
- CLI args `--tune` and `--tune-file` are visible.
|
||||||
|
- try to select the fastest task profile for mul_mat according to the tune result.
|
||||||
|
|
||||||
|
The standalone tool `mulmat-tune` is always built; it needs no compile options.
|
||||||
|
|
||||||
|
**Makefile**
|
||||||
|
|
||||||
|
To use tune, at least one of the vendors has to be built:
|
||||||
|
- BLAS (Accelerate, OpenBLAS, BLIS)
|
||||||
|
- CLBlast
|
||||||
|
- CUDA (may not run)
|
||||||
|
|
||||||
|
To enable debug output, comment out `-DGGML_TUNE_NDEBUG` in the Makefile.
|
||||||
|
|
||||||
Makefile:
|
|
||||||
```
|
```
|
||||||
make clean && make
|
make clean && make
|
||||||
```
|
```
|
||||||
|
|
||||||
CMake (with BLAS):
|
**CMake**
|
||||||
|
|
||||||
```
|
```
|
||||||
cmake --build . --target clean
|
rm -rf build/*
|
||||||
cmake .. -DLLAMA_BLAS=ON
|
cd build
|
||||||
|
cmake ..
|
||||||
cmake --build . --config Release
|
cmake --build . --config Release
|
||||||
```
|
```
|
||||||
|
|
||||||
Run examples:
|
## Run main or perplexity
|
||||||
|
|
||||||
```
|
```
|
||||||
# bench and run:
|
# bench and run:
|
||||||
|
@ -48,21 +87,7 @@ Run examples:
|
||||||
./main -m ./models/3B/open-llama-3b-q4-0.bin -c 512 -b 1024 -n 256 --keep 48 --repeat_penalty 1.0 --color -i -r "User:" -f prompts/chat-with-bob.txt -t 4 --tune-file <FILE>
|
./main -m ./models/3B/open-llama-3b-q4-0.bin -c 512 -b 1024 -n 256 --keep 48 --repeat_penalty 1.0 --color -i -r "User:" -f prompts/chat-with-bob.txt -t 4 --tune-file <FILE>
|
||||||
```
|
```
|
||||||
|
|
||||||
# Build the standalone `mulmat-tune`
|
## Run mulmat-tune tool
|
||||||
|
|
||||||
Makefile:
|
|
||||||
```
|
|
||||||
make clean && make
|
|
||||||
```
|
|
||||||
|
|
||||||
CMake (with BLAS)
|
|
||||||
```
|
|
||||||
cmake --build . --target clean
|
|
||||||
cmake .. -DLLAMA_BLAS=ON
|
|
||||||
cmake --build . --config Release
|
|
||||||
```
|
|
||||||
|
|
||||||
Run examples:
|
|
||||||
|
|
||||||
```
|
```
|
||||||
./mulmat-tune -h
|
./mulmat-tune -h
|
||||||
|
@ -82,8 +107,8 @@ Run examples:
|
||||||
# customized n_pass: run 1 pass only instead of the default 3.
|
# customized n_pass: run 1 pass only instead of the default 3.
|
||||||
./mulmat-tune --n_pass 1
|
./mulmat-tune --n_pass 1
|
||||||
|
|
||||||
# customized n_threads instead of the default 1.
|
# customized n_threads instead of the default 4.
|
||||||
./mulmat-tune --n_threads 4
|
./mulmat-tune --n_threads 6
|
||||||
|
|
||||||
# save to file
|
# save to file
|
||||||
./mulmat-tune --file <FILE>
|
./mulmat-tune --file <FILE>
|
||||||
|
@ -93,9 +118,7 @@ Run examples:
|
||||||
|
|
||||||
```
|
```
|
||||||
|
|
||||||
# End to End Test
|
## Example: Compare With Master
|
||||||
|
|
||||||
## Compare With Master
|
|
||||||
|
|
||||||
You may want to run the following commands. Make sure the tune result file is
|
You may want to run the following commands. Make sure the tune result file is
|
||||||
set up properly.
|
set up properly.
|
||||||
|
@ -103,7 +126,7 @@ setup properly.
|
||||||
General steps:
|
General steps:
|
||||||
|
|
||||||
1. run `./mulmat-tune -h` to see how to build for misc vendors.
|
1. run `./mulmat-tune -h` to see how to build for misc vendors.
|
||||||
To enable the debug, comment out `-DGGML_TUNE_NDEBUG` from makefile then run:
|
then run:
|
||||||
|
|
||||||
```
|
```
|
||||||
make clean; make
|
make clean; make
|
||||||
|
|