server: tests: keep only the PHI-2 test

Author: Pierrick HYMBERT
Date:   2024-03-02 20:53:00 +01:00
Parent: 2cdd21e26b
Commit: a6ea72541f


@@ -7,8 +7,7 @@ Feature: Passkey / Self-extend with context shift
     Given a server listening on localhost:8080
 
   # Generates a long text of junk and inserts a secret passkey number inside it.
-  # We process the entire prompt using batches of n_batch and shifting the cache
-  # when it is full and then we query the LLM for the secret passkey.
+  # Then we query the LLM for the secret passkey.
   # see #3856 and #4810
   Scenario Outline: Passkey
     Given a model file <hf_file> from HF repo <hf_repo>
@@ -17,6 +16,7 @@ Feature: Passkey / Self-extend with context shift
     And <n_predicted> server max tokens to predict
     And 42 as seed
     And <n_ctx> KV cache size
+    And 1 slots
     And <n_ga> group attention factor to extend context size through self-extend
     And <n_ga_w> group attention width to extend context size through self-extend
     # Can be override with N_GPU_LAYERS
@@ -47,7 +47,7 @@ Feature: Passkey / Self-extend with context shift
     Examples:
       | hf_repo                         | hf_file                     | n_ctx_train | ngl | n_ctx | n_batch | n_ga | n_ga_w | n_junk | i_pos | passkey | n_predicted | re_content |
-      | TheBloke/phi-2-GGUF             | phi-2.Q4_K_M.gguf           | 2048        | 5   | 8192  | 512     | 16   | 512    | 250    | 50    | 42      | 1           | 42         |
-      | TheBloke/Llama-2-7B-GGUF        | llama-2-7b.Q2_K.gguf        | 4096        | 3   | 16384 | 512     | 4    | 512    | 500    | 300   | 1234    | 5           | 1234       |
-      | TheBloke/Mixtral-8x7B-v0.1-GGUF | mixtral-8x7b-v0.1.Q2_K.gguf | 4096        | 2   | 16384 | 512     | 4    | 512    | 500    | 100   | 0987    | 5           | 0987       |
+      | TheBloke/phi-2-GGUF             | phi-2.Q4_K_M.gguf           | 2048        | 5   | 8192  | 512     | 4    | 512    | 250    | 50    | 42      | 1           | 42         |
+      #| TheBloke/Llama-2-7B-GGUF        | llama-2-7b.Q2_K.gguf        | 4096        | 3   | 16384 | 512     | 4    | 512    | 500    | 300   | 1234    | 5           | 1234       |
+      #| TheBloke/Mixtral-8x7B-v0.1-GGUF | mixtral-8x7b-v0.1.Q2_K.gguf | 32768       | 2   | 16384 | 512     | 4    | 512    | 500    | 100   | 0987    | 5           | 0987       |
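The PHI-2 row also drops `n_ga` from 16 to 4, which is consistent with the rule of thumb that self-extend stretches the usable window to roughly `n_ga * n_ctx_train`. A minimal sketch of that arithmetic, assuming this scaling relationship holds; `min_group_attention_factor` is a hypothetical helper for illustration, not part of the test suite:

```python
# Assumption: with self-extend, effective context ~= n_ga * n_ctx_train,
# so the smallest sufficient group-attention factor is ceil(n_ctx / n_ctx_train).

def min_group_attention_factor(n_ctx: int, n_ctx_train: int) -> int:
    """Smallest integer factor that stretches the trained context to n_ctx."""
    return -(-n_ctx // n_ctx_train)  # ceiling division

# Values from the PHI-2 example row above.
n_ctx_train = 2048   # training context of phi-2
n_ctx = 8192         # requested KV cache size in the test

print(min_group_attention_factor(n_ctx, n_ctx_train))  # -> 4
```

Under this reading, the previous factor of 16 was larger than needed for an 8192-token cache, and 4 is the minimum that covers it.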