* convert.py: Outfile default name change and additional metadata support
* convert.py: don't stringify Metadata load method output
* convert.py: typo fix
* convert.py: fix metadata format to sync with LLM_KV_NAMES in llama.cpp
Reuse the code already moved into GroupKV.
Add explicit get and set helpers for int32_t, which was added to the
basic MapOfMapOfVariant logic after the move to GroupKV.
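For illustration, a minimal sketch of that shape, assuming a
group->key->variant layout (GroupKV's real API may differ):

```cpp
// Sketch only: explicit int32_t get/set layered over generic
// group->key->variant storage.
#include <cstdint>
#include <map>
#include <string>
#include <variant>

using KVVariant         = std::variant<std::string, bool, int32_t, double>;
using MapOfMapOfVariant = std::map<std::string, std::map<std::string, KVVariant>>;

struct GroupKVSketch {
    MapOfMapOfVariant mKV;

    void set_int32(const std::string &group, const std::string &key, int32_t v) {
        mKV[group][key] = v;
    }

    int32_t get_int32(const std::string &group, const std::string &key, int32_t defaultVal) const {
        auto g = mKV.find(group);
        if (g == mKV.end()) return defaultVal;
        auto k = g->second.find(key);
        if (k == g->second.end()) return defaultVal;
        return std::get<int32_t>(k->second);
    }
};
```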
The json library generates a less informative exception message, which
doesn't help one identify which key is missing, so switch to the new
json_get_str helper added in the last commit; it generates a more
informative exception message.
Also, the dump was using get_value calls with a fallback to a default,
so it wasn't identifying the missing field.
Have fixed both of those. Also reconverted the meta json file.
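For illustration, a json_get_str-style helper could look like this; a
sketch assuming nlohmann::json, not necessarily the actual helper from
that commit:

```cpp
#include <stdexcept>
#include <string>
#include <nlohmann/json.hpp>

// Throw a message that names the missing key, instead of the json
// library's generic out_of_range text.
static std::string json_get_str(const nlohmann::json &j, const std::string &key) {
    if (!j.contains(key)) {
        throw std::runtime_error("json is missing key: " + key);
    }
    return j.at(key).get<std::string>();
}
```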
Misc: interesting avesham and aattam
This should be ok, given that a version of the chat template metadata
is already included with the library.
So only if the user wants to change the chat template info for an
existing model/template-standard, or add a new one, is there a need to
pass a json file with info for that model/standard.
This should allow using this generic chat templating code flow along
with the included chat template data, without needing to load any json
file at runtime.
However, if the user wants to change the already included chat template
data, or add new chat-template-standard/model related data, one can
explicitly load a json file.
TODO: Need to cross-check this flow once, but logically it should work.
Rename the kv helpers to match their semantics (see the sketch below):
* whether they work with a string or a bool value
* whether they take two keys or a single key
Add support for kv pairs with a bool value; in turn, add the kv boolean
pairs used in the chaton_meta.json file.
Add the closing bracket.
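A sketch of the helper shapes implied above (names and signatures are
assumptions, not the actual code):

```cpp
#include <map>
#include <string>

// Helper names encode both the value type (str vs bool) and the key arity.
struct KVHelpersSketch {
    std::map<std::string, std::string> mStr;
    std::map<std::string, bool>        mBool;

    std::string get_str (const std::string &key) { return mStr[key];  }
    bool        get_bool(const std::string &key) { return mBool[key]; }

    // two-key variants: e.g. a template id plus a field name
    std::string get_str (const std::string &tmpl, const std::string &key) {
        return mStr[tmpl + "-" + key];
    }
    bool        get_bool(const std::string &tmpl, const std::string &key) {
        return mBool[tmpl + "-" + key];
    }
};
```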
Use repr to retain the escape sequences in the read string, and at the
same time strip the single quotes that repr places around strings.
Bring in more k-v pairs from chaton_meta.json.
WIP NOTE:
Initial pass converting from the json-driven flow to the
ChatTemplatesGroupKV-related flow is done. It needs to be tested.
An optional helper was added to load ChatTemplates from a specified
json file.
Need to add a compile-time-initialized MapOfMapOfVariants holding the
chat template details of the models/standards already known to the
program, so that one can use llama.cpp and this new chat template
logic without the json dependency, if one doesn't want it. A sketch of
the idea follows.
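A sketch of what such a compiled-in table could look like (the type
aliases and field names here are assumptions; the token strings are the
usual chatml/llama3 markers):

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <variant>

using KVVariant          = std::variant<std::string, bool, int32_t>;
using MapOfMapOfVariants = std::map<std::string, std::map<std::string, KVVariant>>;

// Compile-time-seeded data for templates the program already knows
// about, so no json file needs to be read at runtime.
static const MapOfMapOfVariants gKnownChatTemplates = {
    { "chatml", {
        { "user-prefix", std::string("<|im_start|>user\n") },
        { "user-suffix", std::string("<|im_end|>\n")       },
    }},
    { "llama3", {
        { "user-prefix", std::string("<|start_header_id|>user<|end_header_id|>\n\n") },
        { "user-suffix", std::string("<|eot_id|>") },
    }},
};
```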
Manually iterate the json object items using begin/end explicitly,
because the json library's implicit range-for iteration yields only the
values, not key-value pairs.
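Concretely, assuming nlohmann::json, the explicit-iterator form exposes
the key where a plain range-for would not:

```cpp
#include <iostream>
#include <nlohmann/json.hpp>

int main() {
    nlohmann::json meta = { { "global-begin", "<s>" }, { "global-end", "</s>" } };
    // A plain `for (auto &v : meta)` would yield only the values; the
    // explicit iterator also gives access to the key.
    for (auto it = meta.begin(); it != meta.end(); ++it) {
        std::cout << it.key() << " => " << it.value() << "\n";
    }
    return 0;
}
```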
* A little documentation that shares my quick tips for working in the repository.
* Update startup-testing-debugging.md
* script that shows a menu of tests to pick from & run the debugger on
* debug-test.sh: Refactor CLI help message
* debug-test.sh: documentation update
* debug-test.sh: CLI Help output corrections
* debug-test.sh: minor doc fix
---------
authored-by: Josh Ramer <ubuntu@ip-172-31-32-53.ec2.internal>
Assisted-by: brian khuu <mofosyne@gmail.com>
* convert-hf : support bfloat16 conversion
* gguf-py : flake8 fixes
* convert-hf : add missing space after comma
* convert-hf : get bit-exact same output as ./quantize
The quantization version was missing.
* convert-hf : don't round bf16 NANs
* convert-hf : save some memory with np.int16 intermediate bf16 weights
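The script itself does this with numpy bit views; for illustration, the
same fp32-to-bf16 bit technique in C++ (my reading of the approach, so
treat the details as an assumption):

```cpp
#include <cstdint>
#include <cstring>

// Round to nearest, ties to even -- except NaNs, which are not rounded
// (rounding could carry out of the mantissa and corrupt them); they are
// truncated with the quiet bit forced instead.
static uint16_t fp32_to_bf16(float f) {
    uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    if ((u & 0x7fffffff) > 0x7f800000) {
        return (uint16_t)((u >> 16) | 64); // NaN: keep high bits, set quiet bit
    }
    return (uint16_t)((u + (0x7fff + ((u >> 16) & 1))) >> 16);
}
```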
* convert-hf : more closely match llama.cpp with which weights to keep in f32
* convert-hf : add --outtype auto-f16
One reason for this to exist is model quantizers who want an initial
GGUF with the most fidelity to the original model, while still using
a 16-bit float type instead of 32-bit floats.
* convert-hf : remove a semicolon because flake8 doesn't like it
It's a reflex from when programming in C/C++, I guess.
* convert-hf : support outtype templating in outfile name
* convert-hf : rename --outtype auto-f16 to --outtype auto