tests : fix batch size of bert model

ggml-ci
This commit is contained in:
Georgi Gerganov 2024-09-07 23:19:07 +03:00
parent 6726e3f29a
commit 748d516e34


@@ -9,8 +9,11 @@ Feature: llama.cpp server
     And a model alias bert-bge-small
     And 42 as server seed
     And 2 slots
-    And 1024 as batch size
-    And 1024 as ubatch size
+    # the bert-bge-small model has context size of 512
+    # since the generated prompts are as big as the batch size, we need to set the batch size to 512
+    # ref: https://huggingface.co/BAAI/bge-small-en-v1.5/blob/5c38ec7c405ec4b44b94cc5a9bb96e735b38267a/config.json#L20
+    And 512 as batch size
+    And 512 as ubatch size
     And 2048 KV cache size
     And embeddings extraction
     Then the server is starting
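The constraint behind the change: the test generates prompts as large as the batch size, and an encoder model like bert-bge-small cannot process a prompt longer than its context window, which is given by `max_position_embeddings` in the model's `config.json` (512 for bge-small-en-v1.5). A minimal sketch of that check, with a hypothetical helper (`clamp_batch_size` is illustrative, not llama.cpp code):

```python
import json

def clamp_batch_size(requested: int, config_json: str) -> int:
    """Clamp a requested batch size to the model's context length.

    Prompts for an encoder model cannot exceed its context window,
    so a batch size above it would produce oversized test prompts.
    """
    config = json.loads(config_json)
    # 512 for bge-small-en-v1.5, per its config.json
    n_ctx = config["max_position_embeddings"]
    return min(requested, n_ctx)

# Mirrors the diff: 1024 was requested, context is 512
cfg = '{"max_position_embeddings": 512}'
print(clamp_batch_size(1024, cfg))  # -> 512
```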