llama : add support for SmolLm pre-tokenizer (#8609)

* Adding SmolLM Pre Tokenizer

* Update convert_hf_to_gguf_update.py

Co-authored-by: compilade <git@compilade.net>

* Update src/llama.cpp

Co-authored-by: compilade <git@compilade.net>

* handle regex

* removed .inp and out .out ggufs

---------

Co-authored-by: compilade <git@compilade.net>
2024-07-22 10:43:01 -04:00
commit d94c6e0ccb
parent 566daa5a5b
4 changed files with 10 additions and 0 deletions
--- a/include/llama.h
+++ b/include/llama.h
@@ -93,6 +93,7 @@ extern "C" {
        LLAMA_VOCAB_PRE_TYPE_VIKING         = 18,
        LLAMA_VOCAB_PRE_TYPE_JAIS           = 19,
        LLAMA_VOCAB_PRE_TYPE_TEKKEN         = 20,
+        LLAMA_VOCAB_PRE_TYPE_SMOLLM         = 21,
    };

    // note: these values should be synchronized with ggml_rope