server: tests: add passkey test
parent 319ded7dde
commit 18e739d61d
2 changed files with 51 additions and 5 deletions
examples/server/tests/features/passkey.feature
Normal file, 51 additions
@@ -0,0 +1,51 @@
+#@llama.cpp
+@passkey
+@wip
+@slow
+@bug
+Feature: Passkey / Self-extend with context shift
+
+  Background: Server startup
+    Given a server listening on localhost:8080
+
+  # Generates a long text of junk and inserts a secret passkey number inside it.
+  # We process the entire prompt using batches of n_batch and shifting the cache
+  # when it is full and then we query the LLM for the secret passkey.
+  # see #3856 and #4810
+  Scenario Outline: Passkey
+    Given a model file <hf_file> from HF repo <hf_repo>
+    And   <n_batch> as batch size
+    And   <n_junk> as number of junk
+    And   a self-extend context with a factor of <n_grp>
+    And   <seed> as seed
+    And   a KV cache size based on the model trained context <n_ctx_train> extended by <n_grp> with additional <n_keep> tokens
+    And   1 slots
+    # Can be override with N_GPU_LAYERS
+    And   <ngl> GPU offloaded layers
+    Then  the server is starting
+    Then  the server is healthy
+    Given available models
+    Then  model 0 is trained on <n_ctx_train> tokens context
+    Given a prefix prompt:
+    """
+      here is an important info hidden inside a lot of irrelevant text. Find it and memorize them. I will quiz you about the important information there.
+    """
+    And a passkey prompt template:
+    """
+      The pass key is <passkey> Remember it. <passkey> is the pass key.
+    """
+    And a junk suffix prompt:
+    """
+      The grass is green. The sky is blue. The sun is yellow. Here we go. There and back again.
+    """
+    And a suffix prompt:
+    """
+      What is the pass key? The pass key is
+    """
+    Given a "<passkey>" passkey challenge prompt with the passkey inserted every <i_pos> junk
+    And  a completion request with no api error
+    Then <n_predicted> tokens are predicted matching <re_content>
+
+    Examples:
+      | hf_repo             | hf_file           | n_ctx_train | ngl | n_batch | n_junk | n_grp | i_pos | seed | n_keep | passkey | n_predicted | re_content |
+      | TheBloke/phi-2-GGUF | phi-2.Q4_K_M.gguf | 2048        | 5   | 512     | 250    | 4     | 50    | 86   | 32     | 42      | 4           | .*42.*     |
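The comments in the scenario describe a prompt made of junk text with a passkey hidden inside it, and a KV cache sized from the trained context, the self-extend factor, and the kept tokens. A minimal Python sketch of both ideas follows. The helper names `build_passkey_prompt` and `kv_cache_size` are hypothetical and do not come from this commit. Two things are assumptions here: that the passkey sentence is inserted once, before the junk repetition with index `i_pos`, and that "extended by <n_grp> with additional <n_keep> tokens" means `n_ctx_train * n_grp + n_keep`. The real step definitions may differ on both points.

```python
def build_passkey_prompt(prefix, passkey_template, junk, suffix,
                         passkey, n_junk, i_pos):
    """Assemble prefix + junk (with the passkey hidden inside) + suffix.

    Assumption: the passkey sentence appears once, right before the
    junk repetition with index i_pos.
    """
    passkey_sentence = passkey_template.replace("<passkey>", str(passkey))
    parts = [prefix]
    for i in range(n_junk):
        if i == i_pos:
            parts.append(passkey_sentence)
        parts.append(junk)
    parts.append(suffix)
    return " ".join(parts)


def kv_cache_size(n_ctx_train, n_grp, n_keep):
    """Assumed formula: trained context times the self-extend factor,
    plus the tokens that are always kept."""
    return n_ctx_train * n_grp + n_keep


# Values taken from the Examples row of the scenario.
prompt = build_passkey_prompt(
    prefix="here is an important info hidden inside a lot of irrelevant text.",
    passkey_template="The pass key is <passkey> Remember it. <passkey> is the pass key.",
    junk="The grass is green. The sky is blue. The sun is yellow.",
    suffix="What is the pass key? The pass key is",
    passkey=42,
    n_junk=250,
    i_pos=50,
)
n_ctx = kv_cache_size(n_ctx_train=2048, n_grp=4, n_keep=32)  # 2048 * 4 + 32 = 8224
```

With the Examples row, the junk sentence is repeated 250 times, so under these assumptions the prompt is designed to outgrow the 2048-token trained context and to exercise self-extend and context shifting.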
@@ -298,11 +298,6 @@ def step_prompt_passkey(context, passkey, i_pos):
     context.prompts.append(prompt)
 
 
-@step(u'The passkey is found')
-def step_passkey_found(context):
-    raise NotImplementedError(u'STEP: Then The passkey is found')
-
-
 @step(u'an OAI compatible chat completions request with {api_error} api error')
 @async_run_until_complete
 async def step_oai_chat_completions(context, api_error):
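The removed `step_passkey_found` stub only raised `NotImplementedError`, while the scenario in passkey.feature verifies the answer with `Then <n_predicted> tokens are predicted matching <re_content>`. As a rough illustration of that kind of check, here is a minimal sketch. The helper name `completion_matches` is hypothetical, and it assumes the predicted token count is reported by the server alongside the generated text; it is not the step implementation from this repository.

```python
import re


def completion_matches(content, n_predicted, expected_n_predicted, re_content):
    """Return True when the token count matches and the generated text
    matches the regular expression.

    Assumption: n_predicted is a count reported by the server, since the
    text alone does not reveal how many tokens were generated.
    """
    if n_predicted != expected_n_predicted:
        return False
    # DOTALL lets '.' span newlines in multi-line completions.
    return re.search(re_content, content, flags=re.DOTALL) is not None


# Values from the Examples row: 4 predicted tokens matching '.*42.*'.
ok = completion_matches(" 42.\nThe", 4, 4, r".*42.*")
```

Using a regular expression rather than an exact string comparison tolerates whitespace or punctuation around the passkey in the model's answer.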