diff --git a/examples/server/tests/features/embeddings.feature b/examples/server/tests/features/embeddings.feature
index e34ea5e59..818ea3beb 100644
--- a/examples/server/tests/features/embeddings.feature
+++ b/examples/server/tests/features/embeddings.feature
@@ -10,7 +10,7 @@ Feature: llama.cpp server
     And  42 as server seed
     And  2 slots
     # the bert-bge-small model has context size of 512
-    # since the generated prompts are as big as the batch size, we need to set the batch size to 512
+    # since the generated prompts are as big as the batch size, we need to set the batch size to <= 512
     # ref: https://huggingface.co/BAAI/bge-small-en-v1.5/blob/5c38ec7c405ec4b44b94cc5a9bb96e735b38267a/config.json#L20
     And  128 as batch size
     And  128 as ubatch size