build
: rename main → llama-cli, server → llama-server, llava-cli → llama-llava-cli, etc... (#7809)
* `main`/`server`: rename to `llama` / `llama-server` for consistency w/ homebrew
* server: update refs -> llama-server
gitignore llama-server
* server: simplify nix package
* main: update refs -> llama
fix examples/main ref
* main/server: fix targets
* update more names
* Update build.yml
* rm accidentally checked in bins
* update straggling refs
* Update .gitignore
* Update server-llm.sh
* main: target name -> llama-cli
* Prefix all example bins w/ llama-
* fix main refs
* rename {main->llama}-cmake-pkg binary
* prefix more cmake targets w/ llama-
* add/fix gbnf-validator subfolder to cmake
* sort cmake example subdirs
* rm bin files
* fix llama-lookup-* Makefile rules
* gitignore /llama-*
* rename Dockerfiles
* rename llama|main -> llama-cli; consistent RPM bin prefixes
* fix some missing -cli suffixes
* rename dockerfile w/ llama-cli
* rename(make): llama-baby-llama
* update dockerfile refs
* more llama-cli(.exe)
* fix test-eval-callback
* rename: llama-cli-cmake-pkg(.exe)
* address gbnf-validator unused fread warning (switched to C++ / ifstream)
* add two missing llama- prefixes
* Updating docs for eval-callback binary to use new `llama-` prefix.
* Updating a few lingering doc references for rename of main to llama-cli
* Updating `run-with-preset.py` to use new binary names.
Updating docs around `perplexity` binary rename.
* Updating documentation references for lookup-merge and export-lora
* Updating two small `main` references missed earlier in the finetune docs.
* Update apps.nix
* update grammar/README.md w/ new llama-* names
* update llama-rpc-server bin name + doc
* Revert "update llama-rpc-server bin name + doc"
This reverts commit e474ef1df4
.
* add hot topic notice to README.md
* Update README.md
* Update README.md
* rename gguf-split & quantize bins refs in **/tests.sh
---------
Co-authored-by: HanClinto <hanclinto@gmail.com>
This commit is contained in:
parent
963552903f
commit
1c641e6aac
128 changed files with 578 additions and 578 deletions
|
@ -1,4 +1,4 @@
|
|||
set(TARGET finetune)
|
||||
set(TARGET llama-finetune)
|
||||
add_executable(${TARGET} finetune.cpp)
|
||||
install(TARGETS ${TARGET} RUNTIME)
|
||||
target_link_libraries(${TARGET} PRIVATE common llama ${CMAKE_THREAD_LIBS_INIT})
|
||||
|
|
|
@ -7,7 +7,7 @@ Basic usage instructions:
|
|||
wget https://raw.githubusercontent.com/brunoklein99/deep-learning-notes/master/shakespeare.txt
|
||||
|
||||
# finetune LORA adapter
|
||||
./bin/finetune \
|
||||
./bin/llama-finetune \
|
||||
--model-base open-llama-3b-v2-q8_0.gguf \
|
||||
--checkpoint-in chk-lora-open-llama-3b-v2-q8_0-shakespeare-LATEST.gguf \
|
||||
--checkpoint-out chk-lora-open-llama-3b-v2-q8_0-shakespeare-ITERATION.gguf \
|
||||
|
@ -18,7 +18,7 @@ wget https://raw.githubusercontent.com/brunoklein99/deep-learning-notes/master/s
|
|||
--use-checkpointing
|
||||
|
||||
# predict
|
||||
./bin/main -m open-llama-3b-v2-q8_0.gguf --lora lora-open-llama-3b-v2-q8_0-shakespeare-LATEST.bin
|
||||
./bin/llama-cli -m open-llama-3b-v2-q8_0.gguf --lora lora-open-llama-3b-v2-q8_0-shakespeare-LATEST.bin
|
||||
```
|
||||
|
||||
**Only llama based models are supported!** The output files will be saved every N iterations (config with `--save-every N`).
|
||||
|
@ -38,14 +38,14 @@ After 10 more iterations:
|
|||
Checkpoint files (`--checkpoint-in FN`, `--checkpoint-out FN`) store the training process. When the input checkpoint file does not exist, it will begin finetuning a new randomly initialized adapter.
|
||||
|
||||
llama.cpp compatible LORA adapters will be saved with filename specified by `--lora-out FN`.
|
||||
These LORA adapters can then be used by `main` together with the base model, like in the 'predict' example command above.
|
||||
These LORA adapters can then be used by `llama-cli` together with the base model, like in the 'predict' example command above.
|
||||
|
||||
In `main` you can also load multiple LORA adapters, which will then be mixed together.
|
||||
In `llama-cli` you can also load multiple LORA adapters, which will then be mixed together.
|
||||
|
||||
For example if you have two LORA adapters `lora-open-llama-3b-v2-q8_0-shakespeare-LATEST.bin` and `lora-open-llama-3b-v2-q8_0-bible-LATEST.bin`, you can mix them together like this:
|
||||
|
||||
```bash
|
||||
./bin/main -m open-llama-3b-v2-q8_0.gguf \
|
||||
./bin/llama-cli -m open-llama-3b-v2-q8_0.gguf \
|
||||
--lora lora-open-llama-3b-v2-q8_0-shakespeare-LATEST.bin \
|
||||
--lora lora-open-llama-3b-v2-q8_0-bible-LATEST.bin
|
||||
```
|
||||
|
@ -55,7 +55,7 @@ You can change how strong each LORA adapter is applied to the base model by usin
|
|||
For example to apply 40% of the 'shakespeare' LORA adapter, 80% of the 'bible' LORA adapter and 100% of yet another one:
|
||||
|
||||
```bash
|
||||
./bin/main -m open-llama-3b-v2-q8_0.gguf \
|
||||
./bin/llama-cli -m open-llama-3b-v2-q8_0.gguf \
|
||||
--lora-scaled lora-open-llama-3b-v2-q8_0-shakespeare-LATEST.bin 0.4 \
|
||||
--lora-scaled lora-open-llama-3b-v2-q8_0-bible-LATEST.bin 0.8 \
|
||||
--lora lora-open-llama-3b-v2-q8_0-yet-another-one-LATEST.bin
|
||||
|
|
|
@ -2,7 +2,7 @@
|
|||
cd `dirname $0`
|
||||
cd ../..
|
||||
|
||||
EXE="./finetune"
|
||||
EXE="./llama-finetune"
|
||||
|
||||
if [[ ! $LLAMA_MODEL_DIR ]]; then LLAMA_MODEL_DIR="./models"; fi
|
||||
if [[ ! $LLAMA_TRAINING_DIR ]]; then LLAMA_TRAINING_DIR="."; fi
|
||||
|
|
Loading…
Add table
Add a link
Reference in a new issue