doc: add doc for ubatch-size

Ting Sun 2024-03-27 13:34:26 +07:00
parent c8d4b6b54e
commit 62397b7757


@@ -296,7 +296,9 @@ These options help improve the performance and memory usage of the LLaMA models.
 ### Batch Size
-- `-b N, --batch-size N`: Set the batch size for prompt processing (default: 2048). This large batch size benefits users who have BLAS installed and enabled it during the build. If you don't have BLAS enabled ("BLAS=0"), you can use a smaller number, such as 8, to see the prompt progress as it's evaluated in some situations.
+- `-b N, --batch-size N`: Set the batch size for prompt processing (default: `2048`). This large batch size benefits users who have BLAS installed and enabled it during the build. If you don't have BLAS enabled ("BLAS=0"), you can use a smaller number, such as 8, to see the prompt progress as it's evaluated in some situations.
+- `-ub N`, `--ubatch-size N`: physical maximum batch size. This is for pipeline parallelization. Default: `512`.
 ### Prompt Caching
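
Note on the hunk above: the new `--ubatch-size` entry calls its value the "physical" maximum batch size, which suggests that `--batch-size` acts as the logical maximum. The sketch below illustrates one way to read that relationship, assuming a logical batch is simply cut into consecutive micro-batches of at most `ubatch_size` tokens. The helper names `split` and `plan_micro_batches` are hypothetical and are not part of the project's code; only the two default values come from the documented options.

```python
from typing import Iterator, List


def split(tokens: List[int], size: int) -> Iterator[List[int]]:
    """Yield consecutive chunks of at most `size` tokens."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(tokens), size):
        yield tokens[start:start + size]


def plan_micro_batches(
    n_tokens: int,
    batch_size: int = 2048,   # documented default of -b / --batch-size
    ubatch_size: int = 512,   # documented default of -ub / --ubatch-size
) -> List[List[int]]:
    """Return the micro-batch sizes for each logical batch.

    Assumption for illustration: a prompt is first cut into logical
    batches of at most `batch_size` tokens, and each logical batch is
    then cut into micro-batches of at most `ubatch_size` tokens.
    """
    tokens = list(range(n_tokens))  # stand-in token ids
    plan = []
    for batch in split(tokens, batch_size):
        plan.append([len(micro) for micro in split(batch, ubatch_size)])
    return plan


if __name__ == "__main__":
    # A 3000-token prompt with the default sizes.
    for i, sizes in enumerate(plan_micro_batches(3000)):
        print(f"logical batch {i}: micro-batches {sizes}")
```

Under this reading, a 3000-token prompt with the defaults yields two logical batches (2048 and 952 tokens), which are processed as four micro-batches of 512 tokens followed by micro-batches of 512 and 440 tokens.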