llama : infill sampling handle very long tokens (#9924)

* llama : infill sampling handle very long tokens

ggml-ci

* cont : better indices

ggml-ci
Georgi Gerganov 2024-10-17 22:32:47 +03:00 committed by GitHub
parent 3752217ed5
commit 99bd4ac28c
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
4 changed files with 35 additions and 43 deletions

@@ -953,12 +953,6 @@ extern "C" {
                              int32_t   lstrip,
                                 bool   special);
 
-    // check if token0 is contained as a prefix in token1
-    LLAMA_API bool llama_token_is_prefix(
-              const struct llama_model * model,
-                         llama_token token0,
-                         llama_token token1);
-
     /// @details Convert the provided tokens into text (inverse of llama_tokenize()).
     /// @param text The char pointer must be large enough to hold the resulting text.
     /// @return Returns the number of chars/bytes on success, no more than text_len_max.