server: tests: disable issue 3969 scenario

Pierrick HYMBERT 2024-02-20 23:35:44 +01:00
parent b0b6d83c76
commit 1ecda0d13e
2 changed files with 8 additions and 3 deletions

@@ -5,11 +5,13 @@ Feature: llama.cpp server
     Then the server is starting
     Then the server is healthy
 
+  @llama.cpp
   Scenario: Health
     When the server is healthy
     Then the server is ready
     And all slots are idle
 
+  @llama.cpp
   Scenario Outline: Completion
     Given a <prompt> completion request with maximum <n_predict> tokens
     Then <predicted_n> tokens are predicted
@@ -19,6 +21,7 @@ Feature: llama.cpp server
       | I believe the meaning of life is | 128 | 128 |
       | Write a joke about AI | 512 | 512 |
 
+  @llama.cpp
   Scenario Outline: OAI Compatibility
     Given a system prompt <system_prompt>
     And a user prompt <user_prompt>
@@ -33,6 +36,7 @@ Feature: llama.cpp server
       | llama-2 | You are ChatGPT. | Say hello. | 64 | false | 64 |
       | codellama70b | You are a coding assistant. | Write the fibonacci function in c++. | 512 | true | 512 |
 
+  @llama.cpp
   Scenario: Multi users
     Given a prompt:
       """
@@ -50,7 +54,7 @@ Feature: llama.cpp server
     And all slots are idle
     Then all prompts are predicted
 
+  @llama.cpp
   Scenario: Multi users OAI Compatibility
     Given a system prompt "You are an AI assistant."
     And a model tinyllama-2
@@ -71,7 +75,8 @@ Feature: llama.cpp server
     And all slots are idle
     Then all prompts are predicted
 
-  # FIXME: infinite loop on the CI, not locally, if n_prompt * n_predict > kv_size
+  # FIXME: #3969 infinite loop on the CI, not locally, if n_prompt * n_predict > kv_size
+  @bug
   Scenario: Multi users with total number of tokens to predict exceeds the KV Cache size
     Given a prompt:
       """

@@ -32,4 +32,4 @@ set -eu
 "$@" &
 
 # Start tests
-behave --summary --stop
+behave --summary --stop --tags llama.cpp
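For context on how this disables the scenario: behave runs only scenarios matching the `--tags` filter, so tagging the stable scenarios `@llama.cpp` and the issue-3969 scenario `@bug` lets the CI skip the latter. A minimal sketch of the resulting invocations (assuming behave is installed and run from the tests directory; behave accepts tag names with or without the leading `@`):

```shell
# Run only the scenarios tagged @llama.cpp; the @bug scenario is skipped
behave --summary --stop --tags llama.cpp

# Conversely, run only the disabled issue-3969 scenario, e.g. to reproduce it locally
behave --summary --stop --tags bug
```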