llama.cpp

vbatts/llama.cpp

Commit graph

Author: Xuan Son Nguyen
SHA1: 958367bf53
Date: 2024-10-24 21:51:22 +02:00

server : refactor slot input data, move tokenizer to HTTP thread (#10023)

* server : refactor slot input data, move tokenizer to HTTP thread
* move prompt_tokens.empty() check
* fix incorrect if branch
* fix infinite generation loop
* bring back infill validation
* add infill test
* try fixing format_infill
* fix test
* remove redundant code
* rename completion to inference
* update docs
* use llama_tokens everywhere

1 commit