server : add --no-context-shift option (#9607)
* server : add --no-context-shift option
* small fix
* Update examples/server/tests/features/embeddings.feature

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* tests : minor fix
* revert usage of GGML_ASSERT
* update server documentation

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
parent f0c7b5edf8
commit 0b3bf966f4
6 changed files with 139 additions and 22 deletions
@@ -10,11 +10,11 @@ Feature: llama.cpp server
     And 42 as server seed
     And 2 slots
     # the bert-bge-small model has context size of 512
-    # since the generated prompts are as big as the batch size, we need to set the batch size to 512
+    # since the generated prompts are as big as the batch size, we need to set the batch size to <= 512
     # ref: https://huggingface.co/BAAI/bge-small-en-v1.5/blob/5c38ec7c405ec4b44b94cc5a9bb96e735b38267a/config.json#L20
-    And 512 as batch size
-    And 512 as ubatch size
-    And 2048 KV cache size
+    And 128 as batch size
+    And 128 as ubatch size
+    And 512 KV cache size
     And embeddings extraction
     Then the server is starting
     Then the server is healthy
@@ -26,6 +26,20 @@ Feature: llama.cpp server
     """
     Then embeddings are generated
 
+  Scenario: Embedding (error: prompt too long)
+    When embeddings are computed for:
+    """
+    Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
+    Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.
+    Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur.
+    Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
+    Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
+    Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.
+    Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur.
+    Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
+    """
+    And embeddings request with 500 api error
+
   Scenario: OAI Embeddings compatibility
     Given a model bert-bge-small
     When an OAI compatible embeddings computation request for: