From 554b54152145c30618bac171efb712cf4a7d1e96 Mon Sep 17 00:00:00 2001
From: Pavol Rusnak
Date: Sat, 18 Mar 2023 21:58:46 +0100
Subject: [PATCH 1/4] Add memory/disk requirements to readme

---
 README.md | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 808d54e89..fc8b2fda3 100644
--- a/README.md
+++ b/README.md
@@ -155,7 +155,17 @@ python3 convert-pth-to-ggml.py models/7B/ 1
 
 When running the larger models, make sure you have enough disk space to store all the intermediate files.
 
-TODO: add model disk/mem requirements
+### Memory/Disk Requirements
+
+As the models are currently fully loaded into memory, you will need adequate disk space to save them
+and sufficient RAM to load them. At the moment, memory and disk requirements are the same.
+
+| model | original size | quantized size (4-bit) |
+|-------|---------------|------------------------|
+| 7B    | 13 GB         | 3.9 GB                 |
+| 15B   | 24 GB         | 7.8 GB                 |
+| 30B   | 60 GB         | 19.5 GB                |
+| 65B   | 120 GB        | 38.5 GB                |
 
 ### Interactive mode
 

From 1e5a6d088d0f3a967c6e86298a756daec9e8df12 Mon Sep 17 00:00:00 2001
From: Pavol Rusnak
Date: Sat, 18 Mar 2023 22:20:04 +0100
Subject: [PATCH 2/4] Add note about Python 3.11 to readme

---
 README.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/README.md b/README.md
index fc8b2fda3..187f82f61 100644
--- a/README.md
+++ b/README.md
@@ -153,6 +153,8 @@ python3 convert-pth-to-ggml.py models/7B/ 1
 ./main -m ./models/7B/ggml-model-q4_0.bin -n 128
 ```
 
+Currently, it's best to use Python 3.9 or Python 3.10, as `sentencepiece` has not yet published a wheel for Python 3.11.
+
 When running the larger models, make sure you have enough disk space to store all the intermediate files.
 
 ### Memory/Disk Requirements

From 6f61c18ec9a30416e21ed5abfb1321bdb14979be Mon Sep 17 00:00:00 2001
From: Pavol Rusnak
Date: Sat, 18 Mar 2023 22:39:46 +0100
Subject: [PATCH 3/4] Fix typo in readme

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 187f82f61..1fe5b5426 100644
--- a/README.md
+++ b/README.md
@@ -165,7 +165,7 @@ and sufficient RAM to load them. At the moment, memory and disk requirements are
 | model | original size | quantized size (4-bit) |
 |-------|---------------|------------------------|
 | 7B    | 13 GB         | 3.9 GB                 |
-| 15B   | 24 GB         | 7.8 GB                 |
+| 13B   | 24 GB         | 7.8 GB                 |
 | 30B   | 60 GB         | 19.5 GB                |
 | 65B   | 120 GB        | 38.5 GB                |
 

From d7def1a7524f712e5ebb7cd02bab0f13aa56a7f9 Mon Sep 17 00:00:00 2001
From: Ronsor
Date: Sat, 18 Mar 2023 17:10:47 -0700
Subject: [PATCH 4/4] Warn user if a context size greater than 2048 tokens is
 specified (#274)

LLaMA doesn't support more than 2048 token context sizes, and going above
that produces terrible results.
---
 main.cpp | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/main.cpp b/main.cpp
index c88405b82..105dd91ee 100644
--- a/main.cpp
+++ b/main.cpp
@@ -792,6 +792,11 @@ int main(int argc, char ** argv) {
     if (gpt_params_parse(argc, argv, params) == false) {
         return 1;
     }
+
+    if (params.n_ctx > 2048) {
+        fprintf(stderr, "%s: warning: model does not support context sizes greater than 2048 tokens (%d specified);"
+                " expect poor results\n", __func__, params.n_ctx);
+    }
 
     if (params.seed < 0) {
         params.seed = time(NULL);
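The guard added in PATCH 4 can be sketched as a standalone helper. This is an illustrative reduction of the check, not the upstream code: the function name `warn_if_context_too_large` and the constant `kMaxContext` are invented here, whereas the real patch inlines the comparison against `params.n_ctx` directly in `main()`.

```cpp
#include <cstdio>

// Assumed constant mirroring LLaMA's 2048-token training context;
// in llama.cpp the limit is checked against params.n_ctx directly.
constexpr int kMaxContext = 2048;

// Prints a warning to stderr and returns true when the requested context
// size exceeds the supported maximum; returns false otherwise.
bool warn_if_context_too_large(int n_ctx) {
    if (n_ctx > kMaxContext) {
        std::fprintf(stderr,
                     "warning: model does not support context sizes greater "
                     "than %d tokens (%d specified); expect poor results\n",
                     kMaxContext, n_ctx);
        return true;
    }
    return false;
}
```

Note that the check only warns rather than aborting: generation still runs with the oversized context, it just degrades, which matches the commit message's "expect poor results" framing.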