Command line switch to use F16 for memory_k and memory_v (refactor of #154) (#294)

* Use F16 for memory_k and memory_v

* add command line switch to use f16 instead of f32 for memory k+v

---------

Co-authored-by: Ty Everett <ty@tyweb.us>
This commit is contained in:
Erik Scholz 2023-03-19 18:57:00 +01:00 committed by GitHub
parent 160bfb217d
commit 0b366e7357
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23
3 changed files with 11 additions and 6 deletions

View file

@ -18,6 +18,7 @@ struct gpt_params {
int32_t n_predict = 128; // new tokens to predict
int32_t repeat_last_n = 64; // last n tokens to penalize
int32_t n_ctx = 512; //context size
bool memory_f16 = false; // use f16 instead of f32 for memory kv
// sampling parameters
int32_t top_k = 40;