server: tests: add passkey test

Pierrick HYMBERT 2024-03-02 13:02:05 +01:00
parent 319ded7dde
commit 18e739d61d
2 changed files with 51 additions and 5 deletions

@@ -0,0 +1,51 @@
#@llama.cpp
@passkey
@wip
@slow
@bug
Feature: Passkey / Self-extend with context shift
Background: Server startup
Given a server listening on localhost:8080
# Generates a long text of junk and inserts a secret passkey number inside it.
# We process the entire prompt using batches of n_batch and shifting the cache
# when it is full and then we query the LLM for the secret passkey.
# see #3856 and #4810
Scenario Outline: Passkey
Given a model file <hf_file> from HF repo <hf_repo>
And <n_batch> as batch size
And <n_junk> as number of junk
And a self-extend context with a factor of <n_grp>
And <seed> as seed
And a KV cache size based on the model trained context <n_ctx_train> extended by <n_grp> with additional <n_keep> tokens
And 1 slots
# Can be overridden with N_GPU_LAYERS
And <ngl> GPU offloaded layers
Then the server is starting
Then the server is healthy
Given available models
Then model 0 is trained on <n_ctx_train> tokens context
Given a prefix prompt:
"""
here is an important info hidden inside a lot of irrelevant text. Find it and memorize them. I will quiz you about the important information there.
"""
And a passkey prompt template:
"""
The pass key is <passkey> Remember it. <passkey> is the pass key.
"""
And a junk suffix prompt:
"""
The grass is green. The sky is blue. The sun is yellow. Here we go. There and back again.
"""
And a suffix prompt:
"""
What is the pass key? The pass key is
"""
Given a "<passkey>" passkey challenge prompt with the passkey inserted every <i_pos> junk
And a completion request with no api error
Then <n_predicted> tokens are predicted matching <re_content>
Examples:
| hf_repo | hf_file | n_ctx_train | ngl | n_batch | n_junk | n_grp | i_pos | seed | n_keep | passkey | n_predicted | re_content |
| TheBloke/phi-2-GGUF | phi-2.Q4_K_M.gguf | 2048 | 5 | 512 | 250 | 4 | 50 | 86 | 32 | 42 | 4 | .*42.* |
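
The prompt assembly and cache sizing described in the scenario can be sketched in Python. This is a minimal illustration under stated assumptions, not the step code from the test suite: `kv_cache_size` and `build_passkey_prompt` are hypothetical names, the formula `n_ctx_train * n_grp + n_keep` is inferred from the wording of the KV cache step, and the insertion rule (passkey sentence after every `i_pos`-th junk sentence) is one possible reading of "inserted every <i_pos> junk".

```python
# Hypothetical helpers for illustration only; names, the cache formula, and
# the insertion rule are assumptions, not the test suite's implementation.


def kv_cache_size(n_ctx_train: int, n_grp: int, n_keep: int) -> int:
    """Trained context extended by the self-extend factor, plus kept tokens."""
    return n_ctx_train * n_grp + n_keep


def build_passkey_prompt(
    prefix: str,
    passkey_template: str,
    junk: str,
    suffix: str,
    passkey: str,
    n_junk: int,
    i_pos: int,
) -> str:
    """Repeat the junk sentence n_junk times, adding the passkey sentence
    after every i_pos-th repetition, between the prefix and the suffix."""
    filled = passkey_template.replace("<passkey>", passkey)
    parts = [prefix]
    for i in range(1, n_junk + 1):
        parts.append(junk)
        if i % i_pos == 0:
            parts.append(filled)
    parts.append(suffix)
    return " ".join(parts)


# Values taken from the Examples row and the prompt blocks of the feature file.
prompt = build_passkey_prompt(
    prefix="here is an important info hidden inside a lot of irrelevant text.",
    passkey_template="The pass key is <passkey> Remember it. <passkey> is the pass key.",
    junk="The grass is green. The sky is blue. The sun is yellow.",
    suffix="What is the pass key? The pass key is",
    passkey="42",
    n_junk=250,
    i_pos=50,
)
n_ctx = kv_cache_size(n_ctx_train=2048, n_grp=4, n_keep=32)  # 8224
```

Under these assumptions the Examples row yields a cache of 2048 * 4 + 32 = 8224 tokens, and the passkey sentence appears 5 times among the 250 junk sentences.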

@@ -298,11 +298,6 @@ def step_prompt_passkey(context, passkey, i_pos):
    context.prompts.append(prompt)
@step(u'The passkey is found')
def step_passkey_found(context):
    raise NotImplementedError(u'STEP: Then The passkey is found')
@step(u'an OAI compatible chat completions request with {api_error} api error')
@async_run_until_complete
async def step_oai_chat_completions(context, api_error):
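
The feature file checks the result with the step `<n_predicted> tokens are predicted matching <re_content>`, not with the unimplemented `The passkey is found` placeholder shown in this hunk. A regex check of that kind could look like the sketch below. `content_matches` is a hypothetical helper, and the use of `re.DOTALL` is an assumption about the desired matching behavior, not a flag confirmed by this commit.

```python
import re


def content_matches(content: str, re_content: str) -> bool:
    """Return True when the completion text matches the expected pattern.

    re.DOTALL lets ".*" span newlines in multi-line completions; this flag
    choice is an assumption made for the sketch.
    """
    return re.search(re_content, content, flags=re.DOTALL) is not None


# Pattern from the Examples row: the completion must contain the passkey 42.
found = content_matches(" 42.\nThe grass is green.", ".*42.*")
missed = content_matches(" 41.", ".*42.*")
```

Note that a pattern like `.*42.*` also accepts any completion that contains 42 as part of a longer number, so a stricter pattern may be preferable when the passkey must match exactly.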