Added description to server readme.

2024-01-27 00:37:53 +01:00 · 2024-01-27 00:37:53 +01:00 · 4df0e88aed
commit 4df0e88aed
parent 960cfb003f
2 changed files with 5 additions and 1 deletions
--- a/examples/server/README.md
+++ b/examples/server/README.md
@@ -30,7 +30,9 @@ Command line options:
 -   `-cb`, `--cont-batching`: enable continuous batching (a.k.a dynamic batching) (default: disabled)
 -   `-spf FNAME`, `--system-prompt-file FNAME` Set a file to load "a system prompt (initial prompt of all slots), this is useful for chat applications. [See more](#change-system-prompt-on-runtime)
 -   `--mmproj MMPROJ_FILE`: Path to a multimodal projector file for LLaVA.
-
+-   `--grp-attn-n`: Extend the context size n times through self-extend (default: 1). Used together with `--grp-attn-w`.
+-   `--grp-attn-w`: Width of the self-extend context size extension (default: 512). Should not be greater than the original context size.
+- 
 ## Build

 server is build alongside everything else from the root of the project
--- a/examples/server/server.cpp
+++ b/examples/server/server.cpp
@@ -1810,6 +1810,8 @@ static void server_print_usage(const char *argv0, const gpt_params &params,
    printf("  --override-kv KEY=TYPE:VALUE\n");
    printf("                        advanced option to override model metadata by key. may be specified multiple times.\n");
    printf("                        types: int, float, bool. example: --override-kv tokenizer.ggml.add_bos_token=bool:false\n");
+    printf("  --grp-attn-n N    Extend the context size n times through self-extend (default: 1). Used together with `--grp-attn-w`.\n");
+    printf("  --grp-attn-w N    Width of the self-extend context size extension (default: 512). Should not be greater than the original context size.\n");
    printf("\n");
 }