gguf
This is a Python package for writing binary files in the GGUF (GGML Universal File) format.
See convert-llama-hf-to-gguf.py for an example of its usage.
Installation
pip install gguf
API Examples/Simple Tools
examples/writer.py — Generates example.gguf in the current directory to demonstrate generating a GGUF file. Note that this file cannot be used as a model.
scripts/gguf-dump.py — Dumps a GGUF file's metadata to the console.
scripts/gguf-set-metadata.py — Allows changing simple metadata values in a GGUF file by key.
scripts/gguf-convert-endian.py — Allows converting the endianness of GGUF files.
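To illustrate what the dump and endianness tools have to deal with, here is a minimal standard-library sketch that reads the fixed-size header of a GGUF file (version 2 or later) and guesses its byte order. It does not use this package's own reader. The function names `read_gguf_header` and `make_header` are hypothetical, and the byte-order check is a heuristic that assumes version numbers stay small.

```python
import io
import struct

GGUF_MAGIC = b"GGUF"  # the first four bytes of every GGUF file


def read_gguf_header(stream):
    """Return (byte_order, version, tensor_count, kv_count) for a GGUF v2+ header."""
    magic = stream.read(4)
    if magic != GGUF_MAGIC:
        raise ValueError(f"not a GGUF file: magic={magic!r}")

    raw_version = stream.read(4)
    (version,) = struct.unpack("<I", raw_version)

    # Heuristic (assumption): real version numbers are small, so if the low
    # 16 bits are zero when read as little-endian, treat the file as big-endian.
    order = "<"
    if version & 0xFFFF == 0:
        order = ">"
        (version,) = struct.unpack(">I", raw_version)

    # Two unsigned 64-bit counters follow: tensor count, metadata key-value count.
    tensor_count, kv_count = struct.unpack(order + "QQ", stream.read(16))
    return ("little" if order == "<" else "big"), version, tensor_count, kv_count


def make_header(order, version, tensor_count, kv_count):
    """Build a header in memory for demonstration; order is '<' or '>'."""
    return GGUF_MAGIC + struct.pack(order + "IQQ", version, tensor_count, kv_count)


if __name__ == "__main__":
    for order in ("<", ">"):
        header = make_header(order, 3, 2, 5)
        print(read_gguf_header(io.BytesIO(header)))
```

To inspect a real file, pass a handle opened in binary mode, for example `read_gguf_header(open("example.gguf", "rb"))`.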
Development
Maintainers who participate in development of this package are advised to install it in editable mode:
cd /path/to/llama.cpp/gguf-py
pip install --editable .
Note: This may require upgrading your Pip installation. An outdated Pip fails with a message saying that editable installation currently requires setup.py.
In this case, upgrade Pip to the latest version:
pip install --upgrade pip
Automatic publishing with CI
There's a GitHub workflow to make a release automatically upon creation of tags in a specified format.
- Bump the version in pyproject.toml.
- Create a tag named gguf-vx.x.x, where x.x.x is the semantic version number.
git tag -a gguf-v1.0.0 -m "Version 1.0 release"
- Push the tags.
git push origin --tags
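Before pushing, it can help to confirm that the tag name and the version in pyproject.toml agree. The sketch below is one way to do that check locally. The regular expression is an assumption based on the gguf-vx.x.x format described above, not the pattern the workflow itself uses, and both function names are hypothetical.

```python
import re

# Assumed tag format: "gguf-v" followed by MAJOR.MINOR.PATCH, digits only.
TAG_PATTERN = re.compile(r"gguf-v(\d+)\.(\d+)\.(\d+)")


def version_from_tag(tag):
    """Return 'MAJOR.MINOR.PATCH' from a release tag, or None if it does not match."""
    match = TAG_PATTERN.fullmatch(tag)
    if match is None:
        return None
    return ".".join(match.groups())


def tag_matches_version(tag, pyproject_version):
    """True only if the tag is well formed and names the given version."""
    version = version_from_tag(tag)
    return version is not None and version == pyproject_version


if __name__ == "__main__":
    print(version_from_tag("gguf-v1.0.0"))
    print(tag_matches_version("gguf-v1.0.0", "1.0.0"))
    print(tag_matches_version("v1.0.0", "1.0.0"))
```

Tags with a missing prefix or an incomplete version number are rejected, not partially matched, because the check uses `fullmatch`.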
Manual publishing
If you want to publish the package manually for any reason, you need to have twine and build installed:
pip install build twine
Then, follow these steps to release a new version:
- Bump the version in pyproject.toml.
- Build the package:
python -m build
- Upload the generated distribution archives:
python -m twine upload dist/*
TODO
- Add tests
- Include conversion scripts as command line entry points in this package.