common: llama_load_model_from_url using --model-url (#6098)

* common: llama_load_model_from_url with libcurl dependency

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Pierrick Hymbert 2024-03-17 19:12:37 +01:00 committed by GitHub
parent cd776c37c9
commit d01b3c4c32
16 changed files with 397 additions and 55 deletions
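
The scenario in the diff below pairs a model url with a local model file, which suggests a "download once, then reuse the local copy" flow. The C++ and libcurl code from this commit is not part of this excerpt, so the following is only a minimal Python sketch of that idea, assuming the URL is fetched into the local path when the file is missing. `fetch_model` is a hypothetical helper, not a llama.cpp function, and it uses `urllib` rather than libcurl.

```python
import os
import shutil
import tempfile
import urllib.request


def fetch_model(url: str, path: str) -> bool:
    """Download `url` to `path` unless `path` already exists.

    Returns True if a download happened, False if the local file was reused.
    Hypothetical helper for illustration; not the commit's C++ implementation.
    """
    if os.path.exists(path):
        return False  # reuse the local copy, no network access

    directory = os.path.dirname(os.path.abspath(path))
    os.makedirs(directory, exist_ok=True)

    # Stream into a temporary file in the same directory and rename it at the
    # end, so an interrupted download never leaves a partial file at `path`.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".part")
    try:
        with os.fdopen(fd, "wb") as out, urllib.request.urlopen(url) as resp:
            shutil.copyfileobj(resp, out)
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the partial file, then let the caller see the error.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return True
```

With the values from the scenario, a call would look like `fetch_model("https://huggingface.co/ggml-org/models/resolve/main/tinyllamas/stories260K.gguf", "stories260K.gguf")`. Writing to a temporary file first is a design choice of this sketch, not something the commit message states.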

@@ -4,7 +4,8 @@ Feature: llama.cpp server
 Background: Server startup
 Given a server listening on localhost:8080
-And a model file tinyllamas/stories260K.gguf from HF repo ggml-org/models
+And a model url https://huggingface.co/ggml-org/models/resolve/main/tinyllamas/stories260K.gguf
+And a model file stories260K.gguf
 And a model alias tinyllama-2
 And 42 as server seed
 # KV Cache corresponds to the total amount of tokens