llama : add support for SmolLM pre-tokenizer (#8609)

* Adding SmolLM Pre Tokenizer

* Update convert_hf_to_gguf_update.py

Co-authored-by: compilade <git@compilade.net>

* Update src/llama.cpp

Co-authored-by: compilade <git@compilade.net>

* handle regex

* removed .inp and .out ggufs

---------

Co-authored-by: compilade <git@compilade.net>
Jason Stillerman 2024-07-22 10:43:01 -04:00 committed by GitHub
parent 566daa5a5b
commit d94c6e0ccb
4 changed files with 10 additions and 0 deletions
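
As a rough illustration of what registering a pre-tokenizer type involves, the sketch below maps a pre-tokenizer name string to an enum value. It is a minimal sketch under stated assumptions, not the code from this commit: the `sketch_` enum is a local mirror of the values visible in the hunk, and both the helper `sketch_pre_type_from_name` and the name strings ("viking", "jais", "tekken", "smollm") are hypothetical.

```cpp
#include <string>

// Local mirror of the enum values shown in the diff; NOT the real header.
enum sketch_vocab_pre_type {
    SKETCH_VOCAB_PRE_TYPE_VIKING = 18,
    SKETCH_VOCAB_PRE_TYPE_JAIS   = 19,
    SKETCH_VOCAB_PRE_TYPE_TEKKEN = 20,
    SKETCH_VOCAB_PRE_TYPE_SMOLLM = 21, // the value this commit adds
};

// Hypothetical helper: resolve a pre-tokenizer name to its enum value.
// The name strings are assumptions made for illustration.
// Returns false and leaves `out` untouched when the name is not recognized.
inline bool sketch_pre_type_from_name(const std::string & name, sketch_vocab_pre_type & out) {
    if (name == "viking") { out = SKETCH_VOCAB_PRE_TYPE_VIKING; return true; }
    if (name == "jais")   { out = SKETCH_VOCAB_PRE_TYPE_JAIS;   return true; }
    if (name == "tekken") { out = SKETCH_VOCAB_PRE_TYPE_TEKKEN; return true; }
    if (name == "smollm") { out = SKETCH_VOCAB_PRE_TYPE_SMOLLM; return true; }
    return false;
}
```

Returning a flag instead of a sentinel value lets a caller tell an unknown name apart from a valid type, which is the situation a newly added pre-tokenizer is in until it has been registered.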

@@ -93,6 +93,7 @@ extern "C" {
         LLAMA_VOCAB_PRE_TYPE_VIKING = 18,
         LLAMA_VOCAB_PRE_TYPE_JAIS = 19,
         LLAMA_VOCAB_PRE_TYPE_TEKKEN = 20,
+        LLAMA_VOCAB_PRE_TYPE_SMOLLM = 21,
     };
     // note: these values should be synchronized with ggml_rope