update doc
parent bbcbf47b6d
commit 8b5ae299ec
2 changed files with 4 additions and 1 deletions
@@ -10,6 +10,7 @@ Inference of Meta's [LLaMA](https://arxiv.org/abs/2302.13971) model (and others)
 
 ### Recent API changes
 
+- [2024 Mar 30] State and session file functions reorganized under `llama_state_*` https://github.com/ggerganov/llama.cpp/pull/6341
 - [2024 Mar 26] Logits and embeddings API updated for compactness https://github.com/ggerganov/llama.cpp/pull/6122
 - [2024 Mar 13] Add `llama_synchronize()` + `llama_context_params.n_ubatch` https://github.com/ggerganov/llama.cpp/pull/6017
 - [2024 Mar 8] `llama_kv_cache_seq_rm()` returns a `bool` instead of `void`, and new `llama_n_seq_max()` returns the upper limit of acceptable `seq_id` in batches (relevant when dealing with multiple sequences) https://github.com/ggerganov/llama.cpp/pull/5328
llama.h (4 lines changed)
@@ -646,16 +646,18 @@ extern "C" {
             size_t n_token_count),
         "use llama_state_save_file instead");
 
+    // Get the exact size needed to copy the KV cache of a single sequence
     LLAMA_API size_t llama_state_seq_get_size(
             struct llama_context * ctx,
             llama_seq_id seq_id);
 
+    // Copy the KV cache of a single sequence into the specified buffer
     LLAMA_API size_t llama_state_seq_get_data(
             struct llama_context * ctx,
             uint8_t * dst,
             llama_seq_id seq_id);
 
-    // Copy the sequence data (originally copied with `llama_state_seq_get_data`) into a sequence.
+    // Copy the sequence data (originally copied with `llama_state_seq_get_data`) into the specified sequence
     // Returns:
     //  - Positive: Ok
     //  - Zero: Failed to load
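The comments added in this hunk imply a query-size, copy-out, copy-in pattern. The following is a minimal sketch of that pattern, not code from this commit. The helper `copy_seq_state` and everything in the stand-in section (`DEMO_MAX_SEQ`, `DEMO_SEQ_BYTES`, the toy `struct llama_context`, the stub bodies) are hypothetical and exist only so the sketch compiles without llama.cpp. The signature of `llama_state_seq_set_data` is an assumption, since the diff shows only its comment.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* --- Hypothetical stand-ins so the sketch compiles without llama.cpp. ---
 * In real code, include "llama.h" and delete this section. The stubs mimic
 * the signatures shown in the diff and keep a few bytes per sequence. */
typedef int32_t llama_seq_id;
#define DEMO_MAX_SEQ   4
#define DEMO_SEQ_BYTES 16
struct llama_context { uint8_t kv[DEMO_MAX_SEQ][DEMO_SEQ_BYTES]; };

static size_t llama_state_seq_get_size(struct llama_context * ctx, llama_seq_id seq_id) {
    (void) ctx;
    return (seq_id >= 0 && seq_id < DEMO_MAX_SEQ) ? DEMO_SEQ_BYTES : 0;
}

static size_t llama_state_seq_get_data(struct llama_context * ctx, uint8_t * dst, llama_seq_id seq_id) {
    size_t n = llama_state_seq_get_size(ctx, seq_id);
    if (n > 0) memcpy(dst, ctx->kv[seq_id], n);
    return n;
}

/* Assumed signature: the diff shows only the comment for this function. */
static size_t llama_state_seq_set_data(struct llama_context * ctx, const uint8_t * src, llama_seq_id dest_seq_id) {
    if (dest_seq_id < 0 || dest_seq_id >= DEMO_MAX_SEQ) return 0; /* Zero: failed to load */
    memcpy(ctx->kv[dest_seq_id], src, DEMO_SEQ_BYTES);
    return DEMO_SEQ_BYTES;                                        /* Positive: ok */
}
/* --- End of stand-ins. --- */

/* Copy the state of sequence `src` into sequence `dst`.
 * Returns 0 on success, -1 on failure. */
static int copy_seq_state(struct llama_context * ctx, llama_seq_id src, llama_seq_id dst) {
    assert(ctx != NULL);

    /* 1. Ask how many bytes the source sequence needs. */
    size_t size = llama_state_seq_get_size(ctx, src);
    if (size == 0) return -1;

    uint8_t * buf = malloc(size);
    if (buf == NULL) return -1;

    /* 2. Copy the sequence state out, then 3. load it into the target. */
    size_t written = llama_state_seq_get_data(ctx, buf, src);
    size_t loaded  = written > 0 ? llama_state_seq_set_data(ctx, buf, dst) : 0;

    free(buf);
    return loaded > 0 ? 0 : -1; /* zero from set_data means "failed to load" */
}
```

Sizing the buffer from `llama_state_seq_get_size` follows the "exact size needed" wording of the new comment. Treating a zero return as failure follows the documented `Returns:` convention.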