From b509b8b3de4bb89eb3803d0d939a6519a41c929a Mon Sep 17 00:00:00 2001
From: Jan Boon
Date: Sat, 30 Mar 2024 23:57:38 +0800
Subject: [PATCH] add special

---
 examples/server/tests/features/slotsave.feature | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/examples/server/tests/features/slotsave.feature b/examples/server/tests/features/slotsave.feature
index 9f1e58d23..1c281c074 100644
--- a/examples/server/tests/features/slotsave.feature
+++ b/examples/server/tests/features/slotsave.feature
@@ -26,7 +26,7 @@ Feature: llama.cpp server slot management
     # Since we have cache, this should only process the last tokens
     Given a user prompt "What is the capital of Germany?"
     And a completion request with no api error
-    Then 24 tokens are predicted matching Thank
+    Then 24 tokens are predicted matching (Thank|special)
     And 7 prompt tokens are processed
     # Loading the original cache into slot 0,
     # we should only be processing 1 prompt token and get the same output
@@ -41,7 +41,7 @@ Feature: llama.cpp server slot management
     Given a user prompt "What is the capital of Germany?"
     And using slot id 1
     And a completion request with no api error
-    Then 24 tokens are predicted matching Thank
+    Then 24 tokens are predicted matching (Thank|special)
     And 1 prompt tokens are processed
 
   Scenario: Erase Slot