Improve usability of --model-url & related flags (#6930)

* args: default --model to models/ + filename from --model-url or --hf-file (or else legacy models/7B/ggml-model-f16.gguf) * args: main & server now call gpt_params_handle_model_default * args: define DEFAULT_MODEL_PATH + update cli docs * curl: check url of previous download (.json metadata w/ url, etag & lastModified) * args: fix update to quantize-stats.cpp * curl: support legacy .etag / .lastModified companion files * curl: rm legacy .etag file support * curl: reuse regex across headers callback calls * curl: unique_ptr to manage lifecycle of curl & outfile * curl: nit: no need for multiline regex flag * curl: update failed test (model file collision) + gitignore *.gguf.json
2024-04-30 00:52:50 +01:00 · 2024-04-30 00:52:50 +01:00 · 8843a98c2b
commit 8843a98c2b
parent b8c1476e44
7 changed files with 143 additions and 132 deletions
--- a/examples/server/tests/features/embeddings.feature
+++ b/examples/server/tests/features/embeddings.feature
@ -5,7 +5,7 @@ Feature: llama.cpp server
  Background: Server startup
    Given a server listening on localhost:8080
    And   a model url https://huggingface.co/ggml-org/models/resolve/main/bert-bge-small/ggml-model-f16.gguf
-    And   a model file ggml-model-f16.gguf
+    And   a model file bert-bge-small.gguf
    And   a model alias bert-bge-small
    And   42 as server seed
    And   2 slots