llama: increase MEM_REQ_EVAL for MODEL_3B
This avoids crashes with quantized weights on CPU. A more principled way to calculate the required buffer size would be preferable.
This commit is contained in:
parent
41819b0bd7
commit
5c6eed39ee
1 changed file with 1 addition and 1 deletion
|
@ -122,7 +122,7 @@ static const std::map<e_model, size_t> & MEM_REQ_KV_SELF()
|
|||
static const std::map<e_model, size_t> & MEM_REQ_EVAL()
|
||||
{
|
||||
static std::map<e_model, size_t> k_sizes = {
|
||||
{ MODEL_3B, 512ull * MB },
|
||||
{ MODEL_3B, 640ull * MB },
|
||||
{ MODEL_7B, 768ull * MB },
|
||||
{ MODEL_13B, 1024ull * MB },
|
||||
{ MODEL_30B, 1280ull * MB },
|
||||
|
|
Loading…
Add table
Add a link
Reference in a new issue