gguf-py : add support for I8, I16 and I32 (#6045)

* Refactor dtype handling to be extensible

The code behaves the same as before, but it is now structured so that
more NumPy dtypes can be added easily.

* Add support for I8, I16 and I32

These types are allowed in the GGUF specification.

* Add support for I8, I16 and I32 to gguf_writer

* Add support for I8, I16 and I32 to gguf_reader
Ondřej Čertík 2024-03-14 04:40:14 -06:00 committed by GitHub
parent 3fe8d7a17f
commit 3ca23481dd
3 changed files with 27 additions and 4 deletions

@@ -248,6 +248,15 @@ class GGUFReader:
elif ggml_type == GGMLQuantizationType.F16:
    item_count = n_elems
    item_type = np.float16
elif ggml_type == GGMLQuantizationType.I8:
    item_count = n_elems
    item_type = np.int8
elif ggml_type == GGMLQuantizationType.I16:
    item_count = n_elems
    item_type = np.int16
elif ggml_type == GGMLQuantizationType.I32:
    item_count = n_elems
    item_type = np.int32
else:
    item_count = n_bytes
    item_type = np.uint8
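
The hunk above applies one rule: types whose elements map one-to-one onto a NumPy dtype are viewed as `n_elems` items of that dtype, and everything else falls back to `n_bytes` items of `np.uint8`. The following is a minimal, self-contained sketch of that rule, not the reader's actual implementation. `TensorType`, `PLAIN_DTYPES`, and `view_tensor` are hypothetical names introduced for illustration, the enum values are placeholders rather than the real `GGMLQuantizationType` values, and the lookup table is an alternative way of writing the same branches, not what the commit does.

```python
from enum import IntEnum

import numpy as np


class TensorType(IntEnum):
    """Hypothetical stand-in for GGMLQuantizationType (values are placeholders)."""
    F32 = 0
    F16 = 1
    I8 = 2
    I16 = 3
    I32 = 4
    Q4_0 = 5  # stands in for any type without a one-to-one NumPy dtype


# Types whose elements map one-to-one onto a NumPy dtype.
PLAIN_DTYPES = {
    TensorType.F32: np.float32,
    TensorType.F16: np.float16,
    TensorType.I8: np.int8,
    TensorType.I16: np.int16,
    TensorType.I32: np.int32,
}


def view_tensor(raw: bytes, tensor_type: TensorType, n_elems: int) -> np.ndarray:
    """Reinterpret raw tensor bytes following the same rule as the elif chain."""
    item_type = PLAIN_DTYPES.get(tensor_type)
    if item_type is None:
        # Fallback branch: one uint8 item per byte.
        return np.frombuffer(raw, dtype=np.uint8)
    # Plain branch: n_elems items of the matching dtype.
    return np.frombuffer(raw, dtype=item_type, count=n_elems)


# Three int16 values occupy six bytes (native byte order on both sides).
raw = np.array([1, -2, 300], dtype=np.int16).tobytes()
as_i16 = view_tensor(raw, TensorType.I16, n_elems=3)
as_bytes = view_tensor(raw, TensorType.Q4_0, n_elems=3)
```

With a table like this, supporting another plain dtype would mean adding one entry instead of another `elif` branch. The sketch ignores byte-order handling and memory mapping, which a real reader has to deal with.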