llama.cpp

Pierrick Hymbert 4bd0f93e4a model: support arch `DbrxForCausalLM` (#6515)	2024-04-13 11:33:52 +02:00

* model: dbrx convert to gguf #6344
* llama: support dbrx #6344
* doc: dbrx: add the model as supported
* scripts: get-wikitext-2 add unzip
* llama: increase maximum experts allowed
* llama: factorize moe graph implementation between grok, mixtral and dbrx

---------

Co-authored-by: Megha Agarwal <16129366+megha95@users.noreply.github.com>

__init__.py	gguf-py: Refactor and allow reading/modifying existing GGUF files (#3981)	2023-11-11 08:04:50 +03:00
constants.py	model: support arch `DbrxForCausalLM` (#6515)	2024-04-13 11:33:52 +02:00
gguf.py	gguf-py: Refactor and allow reading/modifying existing GGUF files (#3981)	2023-11-11 08:04:50 +03:00
gguf_reader.py	gguf : add support for I64 and F64 arrays (#6062)	2024-03-15 10:46:51 +02:00
gguf_writer.py	gguf.py : add licence and version to gguf writer (#6504)	2024-04-05 21:41:38 +03:00
py.typed	convert : various script cleanups/fixes + merges and special token handling (#2842)	2023-08-30 11:25:50 +03:00
tensor_mapping.py	model: support arch `DbrxForCausalLM` (#6515)	2024-04-13 11:33:52 +02:00
vocab.py	fix(gguf-py): special tokens are no longer skipped when add_<token>_token is set to false (#5487)	2024-02-15 14:14:37 +01:00