llama.cpp

Xuan Son Nguyen	57bb2c40cd	server : fix logprobs, make it OAI-compatible (#10783)	2024-12-19 15:40:08 +01:00
* server : fix logprobs, make it openai-compatible
* update docs
* add std::log
* return pre-sampling p
* sort before apply softmax
* add comment
* fix test
* set p for sampled token
* update docs
* add --multi-token-probs
* update docs
* add `post_sampling_probs` option
* update docs [no ci]
* remove --multi-token-probs
* "top_probs" with "post_sampling_probs"
* resolve review comments
* rename struct token_prob to prob_info
* correct comment placement
* fix setting prob for sampled token
..
test_basic.py	server : add flag to disable the web-ui (#10762) (#10751)	2024-12-10 18:22:34 +01:00
test_chat_completion.py	server : fix logprobs, make it OAI-compatible (#10783)	2024-12-19 15:40:08 +01:00
test_completion.py	server : fix logprobs, make it OAI-compatible (#10783)	2024-12-19 15:40:08 +01:00
test_ctx_shift.py	server : replace behave with pytest (#10416)	2024-11-26 16:20:18 +01:00
test_embedding.py	server : fix logprobs, make it OAI-compatible (#10783)	2024-12-19 15:40:08 +01:00
test_infill.py	server : fix format_infill (#10724)	2024-12-08 23:04:29 +01:00
test_lora.py	server : replace behave with pytest (#10416)	2024-11-26 16:20:18 +01:00
test_rerank.py	server : fill usage info in embeddings and rerank responses (#10852)	2024-12-17 18:00:24 +02:00
test_security.py	server : replace behave with pytest (#10416)	2024-11-26 16:20:18 +01:00
test_slot_save.py	server : replace behave with pytest (#10416)	2024-11-26 16:20:18 +01:00
test_speculative.py	server : fix speculative decoding with context shift (#10641)	2024-12-04 22:38:20 +02:00
test_tokenize.py	server : replace behave with pytest (#10416)	2024-11-26 16:20:18 +01:00