squash! llama : add early return for empty range

Remove the setting of cache.head to 0 when the range is empty.

Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>
This commit is contained in:
Daniel Bevenius 2024-07-05 18:36:55 +02:00
parent 4eb8073c54
commit eb572f9ac6
Failed to extract signature

View file

@ -3259,10 +3259,7 @@ static void llama_kv_cache_seq_add(
if (p0 < 0) p0 = 0;
if (p1 < 0) p1 = std::numeric_limits<llama_pos>::max();
// If there is no range then return early to avoid looping over the cache.
if (p0 == p1) {
cache.head = 0;
return;
}
if (p0 == p1) return;
if (cache.recurrent) {
// for Mamba-like models, only the pos needs to be shifted