From bf7df95798bd2101ed46ce7868b446e65333f302 Mon Sep 17 00:00:00 2001
From: Xuan Son Nguyen
Date: Wed, 1 Jan 2025 19:44:00 +0100
Subject: [PATCH] update docs

---
 examples/server/README.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/examples/server/README.md b/examples/server/README.md
index bcef81946..91b5c9424 100644
--- a/examples/server/README.md
+++ b/examples/server/README.md
@@ -452,6 +452,8 @@ These words will not be included in the completion, so make sure to add them to
 
 `response_fields`: A list of response fields, for example: `"response_fields": ["content", "generation_settings/n_predict"]`. If the specified field is missing, it will simply be omitted from the response without triggering an error. Note that fields with a slash will be unnested; for example, `generation_settings/n_predict` will move the field `n_predict` from the `generation_settings` object to the root of the response and give it a new name.
 
+`lora`: A list of LoRA adapters to be applied to this specific request. Each object in the list must contain `id` and `scale` fields. For example: `[{"id": 0, "scale": 0.5}, {"id": 1, "scale": 1.1}]`. If a LoRA adapter is not specified in the list, its scale will default to `0.0`. Please note that requests with different LoRA configurations will not be batched together, which may result in performance degradation.
+
 **Response format**
 
 - Note: In streaming mode (`stream`), only `content`, `tokens` and `stop` will be returned until end of completion. Responses are sent using the [Server-sent events](https://html.spec.whatwg.org/multipage/server-sent-events.html) standard. Note: the browser's `EventSource` interface cannot be used due to its lack of `POST` request support.
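
As an illustration of the `lora` field documented in this patch, the sketch below builds a request body whose `lora` list has the `id`/`scale` shape described above. It is a minimal sketch, not part of the patch: the helper names (`build_lora_field`, `build_completion_payload`, `post_completion`), the `prompt` field, and the URL `http://localhost:8080/completion` are assumptions chosen for illustration and should be adjusted to the actual server setup.

```python
import json
import urllib.request


def build_lora_field(scales):
    """Turn a {adapter_id: scale} mapping into the list of objects `lora` expects."""
    return [
        {"id": adapter_id, "scale": scale}
        for adapter_id, scale in sorted(scales.items())
    ]


def build_completion_payload(prompt, scales):
    """Build a request body (hypothetical helper; `prompt` is an assumed field)."""
    # Adapters left out of `scales` are simply not sent; per the documentation
    # above, the server then treats their scale as 0.0 for this request.
    return {"prompt": prompt, "lora": build_lora_field(scales)}


def post_completion(payload, url="http://localhost:8080/completion"):
    """POST the payload as JSON. The URL is an assumption, not taken from the patch."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Build (but do not send) a payload matching the example in the documentation.
payload = build_completion_payload("Hello", {0: 0.5, 1: 1.1})
print(json.dumps(payload["lora"]))
# prints [{"id": 0, "scale": 0.5}, {"id": 1, "scale": 1.1}]
```

Because requests with different LoRA configurations are not batched together, a client that cares about throughput may want to reuse the same `scales` mapping across concurrent requests where possible.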