Commit graph

11 commits

Author SHA1 Message Date
Georgi Gerganov
f0713498fd
context : add get_ctx_padding()
ggml-ci
2025-01-26 20:16:22 +02:00
Georgi Gerganov
b4ec1d4429
cont : move kv_self update to llama_context
ggml-ci
2025-01-26 20:16:21 +02:00
Georgi Gerganov
f2524c0e41
llama : remove references to llama_kv_cache (wip)
Intermediate step necessary to abstract the `llama_context` and
`llama_kv_cache`.

ggml-ci
2025-01-26 20:16:21 +02:00
Georgi Gerganov
a19f671fe0
context : minor
ggml-ci
2025-01-26 20:16:21 +02:00
Georgi Gerganov
17b363afd3
llama : update llama_kv_self API
ggml-ci
2025-01-26 20:16:20 +02:00
Georgi Gerganov
fd05ab87aa
kv_cache : move state read/write to llama_kv_cache
ggml-ci
2025-01-26 20:14:36 +02:00
Georgi Gerganov
4cd1b6fa4c
context : prepare kv_cache_read/write to be moved to kv_cache
ggml-ci
2025-01-26 20:14:36 +02:00
Georgi Gerganov
4d7bd03e65
kv_cache : functions -> members
ggml-ci
2025-01-26 20:14:36 +02:00
Georgi Gerganov
f78b396ee7
llama : add struct llama_kv_cache (wip) [no ci] 2025-01-26 20:12:06 +02:00
Georgi Gerganov
afa8a9ec9b
llama : add llama_vocab, functions -> methods, naming (#11110)
* llama : functions -> methods (#11110)

* llama : add struct llama_vocab to the API (#11156)

ggml-ci

* hparams : move vocab params to llama_vocab (#11159)

ggml-ci

* vocab : more pimpl (#11165)

ggml-ci

* vocab : minor tokenization optimizations (#11160)

ggml-ci

Co-authored-by: Diego Devesa <slarengh@gmail.com>

* lora : update API names (#11167)

ggml-ci

* llama : update API names to use correct prefix (#11174)

* llama : update API names to use correct prefix

ggml-ci

* cont

ggml-ci

* cont

ggml-ci

* minor [no ci]

* vocab : llama_vocab_add_[be]os -> llama_vocab_get_add_[be]os (#11174)

ggml-ci

* vocab : llama_vocab_n_vocab -> llama_vocab_n_tokens (#11174)

ggml-ci

---------

Co-authored-by: Diego Devesa <slarengh@gmail.com>
2025-01-12 11:32:42 +02:00
Georgi Gerganov
f66f582927
llama : refactor src/llama.cpp (#10902)
* llama : scatter llama.cpp into multiple modules (wip)

* llama : control-vector -> adapter

* llama : arch

* llama : mmap

ggml-ci

* ci : remove BUILD_SHARED_LIBS=OFF

ggml-ci

* llama : arch (cont)

ggml-ci

* llama : chat

ggml-ci

* llama : model

ggml-ci

* llama : hparams

ggml-ci

* llama : adapter

ggml-ci

* examples : fix

ggml-ci

* rebase

ggml-ci

* minor

* llama : kv cache

ggml-ci

* llama : impl

ggml-ci

* llama : batch

ggml-ci

* cont

ggml-ci

* llama : context

ggml-ci

* minor

* llama : context (cont)

ggml-ci

* llama : model loader

ggml-ci

* common : update lora

ggml-ci

* llama : quant

ggml-ci

* llama : quant (cont)

ggml-ci

* minor [no ci]
2025-01-03 10:18:53 +02:00