From 62ef858d0059f9b0ad2754536d89aa324e2abd04 Mon Sep 17 00:00:00 2001
From: pudepiedj
Date: Fri, 23 Feb 2024 10:01:24 +0000
Subject: [PATCH] Update README.md

---
 examples/server/README.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/examples/server/README.md b/examples/server/README.md
index 9e6433bf1..9ebfc9fba 100644
--- a/examples/server/README.md
+++ b/examples/server/README.md
@@ -41,8 +41,8 @@ see https://github.com/ggerganov/llama.cpp/issues/1437
 - `-cb`, `--cont-batching`: enable continuous batching (a.k.a dynamic batching) (default: disabled)
 - `-spf FNAME`, `--system-prompt-file FNAME` Set a file to load "a system prompt (initial prompt of all slots), this is useful for chat applications. [See more](#change-system-prompt-on-runtime)
 - `--mmproj MMPROJ_FILE`: Path to a multimodal projector file for LLaVA.
-- '-skvg' or '--show-graphics': display a dynamic graphic of kvcache occupancy per slot.
-- '-skvi' or '--show-interactive-graphics': display a dynamic graphic of kvcache that requires user intervention to move on after each request
+- `-skvg` or `--show-graphics`: display a dynamic graphic of kvcache occupancy per slot.
+- `-skvi` or `--show-interactive-graphics`: display a dynamic graphic of kvcache that requires user intervention to move on after each request
 - `--grp-attn-n`: Set the group attention factor to extend context size through self-extend(default: 1=disabled), used together with group attention width `--grp-attn-w`
 - `--grp-attn-w`: Set the group attention width to extend context size through self-extend(default: 512), used together with group attention factor `--grp-attn-n`
 - `-n, --n-predict`: Set the maximum tokens to predict (default: -1)