quantize : be able to override metadata by key (#6321)

* quantize: be able to override metadata by key

* minor : spacing

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-03-26 13:09:30 +01:00
commit d25b1c31b0
parent deb7240100
3 changed files with 96 additions and 27 deletions
--- a/llama.h
+++ b/llama.h
@@ -284,6 +284,7 @@ extern "C" {
        bool only_copy;                      // only copy tensors - ftype, allow_requantize and quantize_output_tensor are ignored
        bool pure;                           // quantize all tensors to the default type
        void * imatrix;                      // pointer to importance matrix data
+        void * kv_overrides;                 // pointer to vector containing overrides
    } llama_model_quantize_params;

    // grammar types
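
The new `kv_overrides` field is a type-erased `void *` that, per its comment, points to a vector of overrides. The sketch below shows one way a caller might fill such a field and a consumer might cast it back. It is only an illustration of the pattern: `kv_override_sketch`, `quantize_params_sketch`, `make_override`, `find_override`, and the key `example.key` are hypothetical stand-ins defined here, not the declarations from `llama.h`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for one override entry (NOT the llama.h type).
struct kv_override_sketch {
    char    key[128];   // metadata key to override
    int64_t int_value;  // replacement value (only integers modeled here)
};

// Hypothetical stand-in for the params struct: only the new field is modeled.
struct quantize_params_sketch {
    void * kv_overrides; // pointer to vector containing overrides
};

// Caller side: build one entry, always leaving room for the terminating NUL.
static kv_override_sketch make_override(const char * key, int64_t value) {
    kv_override_sketch o{};                       // zero-initialize key and value
    std::strncpy(o.key, key, sizeof(o.key) - 1);  // last byte stays '\0'
    o.int_value = value;
    return o;
}

// Consumer side: cast the void * back to the vector type and search by key.
// Returns nullptr when no overrides were supplied or the key is absent.
static const kv_override_sketch * find_override(const quantize_params_sketch & params,
                                                const std::string & key) {
    if (params.kv_overrides == nullptr) {
        return nullptr;
    }
    const auto & overrides =
        *static_cast<const std::vector<kv_override_sketch> *>(params.kv_overrides);
    for (const auto & o : overrides) {
        if (key == o.key) {
            return &o;
        }
    }
    return nullptr;
}
```

Because the field is a plain `void *`, nothing checks that both sides agree on the vector's element type, and the vector must outlive the call that reads it. The caller would set the field to the address of a live `std::vector` and leave it null when there is nothing to override.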