server : refactor multitask handling (#9274)

* server : remove multitask from server_task

* refactor completions handler

* fix embeddings

* use res_ok everywhere

* small change for handle_slots_action

* use unordered_set everywhere

* (try) fix test

* no more "mutable" lambda

* Apply suggestions from code review

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* use deque

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
This commit is contained in:
Xuan Son Nguyen 2024-09-02 17:11:51 +02:00 committed by GitHub
parent b60074f1c2
commit 6e7d133a5f
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
5 changed files with 365 additions and 462 deletions

View file

@ -818,7 +818,7 @@ async def concurrent_requests(context, f_completion, *args, **kwargs):
for prompt_no in range(context.n_prompts):
shifted_args = [context.prompts.pop(), seeds[prompt_no], *args]
context.concurrent_tasks.append(asyncio.create_task(f_completion(*shifted_args, **kwargs)))
await asyncio.sleep(0.1)
await asyncio.sleep(0.01)
@step('the slot {slot_id:d} is saved with filename "{filename}"')