diff --git a/examples/server/tests/features/slotsave.feature b/examples/server/tests/features/slotsave.feature
index 37eefd5c0..9f1e58d23 100644
--- a/examples/server/tests/features/slotsave.feature
+++ b/examples/server/tests/features/slotsave.feature
@@ -1,5 +1,5 @@
 @llama.cpp
-@server
+@slotsave
 Feature: llama.cpp server slot management
 
   Background: Server startup
@@ -15,34 +15,44 @@ Feature: llama.cpp server slot management
     Then the server is healthy
 
   Scenario: Save and Restore Slot
+    # First prompt in slot 1 should be fully processed
     Given a user prompt "What is the capital of France?"
     And using slot id 1
     And a completion request with no api error
-    Then 24 tokens are predicted matching Lily
+    Then 24 tokens are predicted matching (Lily|cake)
     And 22 prompt tokens are processed
     When the slot 1 is saved with filename "slot1.bin"
     Then the server responds with status code 200
+    # Since we have cache, this should only process the last tokens
     Given a user prompt "What is the capital of Germany?"
     And a completion request with no api error
     Then 24 tokens are predicted matching Thank
     And 7 prompt tokens are processed
-    When the slot 2 is restored with filename "slot1.bin"
+    # Loading the original cache into slot 0,
+    # we should only be processing 1 prompt token and get the same output
+    When the slot 0 is restored with filename "slot1.bin"
     Then the server responds with status code 200
     Given a user prompt "What is the capital of France?"
-    And using slot id 2
+    And using slot id 0
     And a completion request with no api error
-    Then 24 tokens are predicted matching Lily
+    Then 24 tokens are predicted matching (Lily|cake)
+    And 1 prompt tokens are processed
+    # For verification that slot 1 was not corrupted during slot 0 load, same thing
+    Given a user prompt "What is the capital of Germany?"
+    And using slot id 1
+    And a completion request with no api error
+    Then 24 tokens are predicted matching Thank
     And 1 prompt tokens are processed
 
   Scenario: Erase Slot
     Given a user prompt "What is the capital of France?"
     And using slot id 1
     And a completion request with no api error
-    Then 24 tokens are predicted matching Lily
+    Then 24 tokens are predicted matching (Lily|cake)
     And 22 prompt tokens are processed
     When the slot 1 is erased
     Then the server responds with status code 200
     Given a user prompt "What is the capital of France?"
     And a completion request with no api error
-    Then 24 tokens are predicted matching Lily
+    Then 24 tokens are predicted matching (Lily|cake)
     And 22 prompt tokens are processed