| Xuan Son Nguyen | 958367bf53 | server : refactor slot input data, move tokenizer to HTTP thread (#10023)
* server : refactor slot input data, move tokenizer to HTTP thread
* move prompt_tokens.empty() check
* fix incorrect if branch
* fix infinite generation loop
* bring back infill validation
* add infill test
* try fixing format_infill
* fix test
* remove redundant code
* rename completion to inference
* update docs
* use llama_tokens everywhere | 2024-10-24 21:51:22 +02:00 |  |