ggml : introduce structs for the q4 data blocks (#356)

* Introduce structs for the q4 data blocks
* ggml : rename quant struct variables + fix ARM_NEON

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-03-28 15:56:03 +00:00
commit c1f885067c
parent e0670260fb
6 changed files with 150 additions and 235 deletions
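
For readers unfamiliar with the idea named in the commit title, the following is a minimal sketch of a 4-bit quantization block expressed as a struct rather than as raw byte offsets. It is not the code from this commit: the names `block_q4_sketch`, `quantize_block`, and `dequantize_block` are hypothetical, the block size `QK = 32` is an assumption (the diff below only shows that `qk` stops being a runtime argument), and the rounding scheme is one simple choice among several.

```c
#include <assert.h>
#include <stdint.h>

#define QK 32  /* assumed compile-time block size; not taken from the diff */

/* Hypothetical layout: one scale plus QK 4-bit values, two per byte. */
typedef struct {
    float   d;           /* scale (delta) shared by the whole block */
    uint8_t qs[QK / 2];  /* packed 4-bit quants, low nibble first   */
} block_q4_sketch;

/* Round to nearest integer without pulling in libm. */
static int round_nearest(float v) {
    return (int)(v + (v >= 0.0f ? 0.5f : -0.5f));
}

/* Quantize QK floats into one block: scale so the largest magnitude maps to 7. */
static void quantize_block(const float *x, block_q4_sketch *b) {
    float amax = 0.0f;
    for (int i = 0; i < QK; i++) {
        float a = x[i] < 0.0f ? -x[i] : x[i];
        if (a > amax) amax = a;
    }
    const float d  = amax / 7.0f;
    const float id = d != 0.0f ? 1.0f / d : 0.0f;
    b->d = d;
    for (int i = 0; i < QK; i += 2) {
        /* Values land in [-7, 7]; the +8 offset stores them as [1, 15]. */
        int q0 = round_nearest(x[i]     * id) + 8;
        int q1 = round_nearest(x[i + 1] * id) + 8;
        b->qs[i / 2] = (uint8_t)(q0 | (q1 << 4));
    }
}

/* Recover approximate floats from one block. */
static void dequantize_block(const block_q4_sketch *b, float *y) {
    for (int i = 0; i < QK; i += 2) {
        y[i]     = (float)((b->qs[i / 2] & 0x0F) - 8) * b->d;
        y[i + 1] = (float)((b->qs[i / 2] >> 4)   - 8) * b->d;
    }
}
```

With a struct like this, code can address the scale and the packed quants by field name (`b->d`, `b->qs`) instead of computing pointer offsets into a byte buffer, which is the kind of simplification the commit title describes. Fixing the block size at compile time is also consistent with `qk` no longer being passed to `llama_model_quantize`.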
--- a/llama.h
+++ b/llama.h
@@ -81,8 +81,7 @@ extern "C" {
    LLAMA_API int llama_model_quantize(
            const char * fname_inp,
            const char * fname_out,
-                   int   itype,
-                   int   qk);
+                   int   itype);

    // Run the llama inference to obtain the logits and probabilities for the next token.
    // tokens + n_tokens is the provided batch of new tokens to process