server: tests - add explanation about KV Cache.
parent 482eb30f89
commit 60781f0a2b
1 changed file with 3 additions and 0 deletions
@@ -6,6 +6,9 @@ Feature: llama.cpp server
     And a model file stories260K.gguf
     And a model alias tinyllama-2
     And 42 as server seed
+    # KV Cache corresponds to the total amount of tokens
+    # that can be stored across all independent sequences: #4130
+    # see --ctx-size and #5568
     And 32 KV cache size
     And 1 slots
     And embeddings extraction
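
The added comment says the KV cache size is a total token budget shared by all independent sequences. A minimal Python sketch of what that implies for the values in this Background section follows. It assumes the budget is divided evenly among slots, which this commit does not state; `tokens_per_slot` is a hypothetical helper for illustration and not part of the test suite or the server.

```python
def tokens_per_slot(kv_cache_size: int, n_slots: int) -> int:
    """Hypothetical helper: tokens available to each slot.

    Assumes the shared KV cache is split evenly among slots
    (an assumption for illustration, not confirmed by this commit).
    """
    if n_slots < 1:
        raise ValueError("n_slots must be at least 1")
    return kv_cache_size // n_slots


if __name__ == "__main__":
    # Values from the Background steps: 32 KV cache size, 1 slots.
    print(tokens_per_slot(32, 1))
    # With the same cache and two slots, each sequence would get half.
    print(tokens_per_slot(32, 2))
```

Under that assumption, raising the slot count without raising the KV cache size shrinks the room each sequence has, which is why the comment points readers at `--ctx-size`.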