ggml : introduce structs for the q4 data blocks (#356)

* Introduce structs for the q4 data blocks

* ggml : rename quant struct variables + fix ARM_NEON

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
This commit is contained in:
Stephan Walter 2023-03-28 15:56:03 +00:00 committed by GitHub
parent e0670260fb
commit c1f885067c
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23
6 changed files with 150 additions and 235 deletions

View file

@ -81,8 +81,7 @@ extern "C" {
LLAMA_API int llama_model_quantize(
const char * fname_inp,
const char * fname_out,
int itype,
int qk);
int itype);
// Run the llama inference to obtain the logits and probabilities for the next token.
// tokens + n_tokens is the provided batch of new tokens to process