https://github.com/ggerganov/llama.cpp/pull/9418
build
Simplified simulation of serving incoming requests in parallel
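
As a rough illustration of what such a simulation involves, here is a toy sketch. It is not the llama.cpp code and does not use its API: the function `simulate`, its parameters, and the assumption that each request needs a fixed number of decode steps are all hypothetical. It models a fixed number of slots that each generate one token per batched step, with a waiting request taking over a slot as soon as it frees up.

```python
from collections import deque


def simulate(request_lengths, n_slots):
    """Toy model of serving requests in parallel (hypothetical, not llama.cpp).

    request_lengths: number of tokens each incoming request must generate.
    n_slots: how many requests can be decoded in the same batched step.
    Returns (total_steps, finished_at), where finished_at maps a request id
    to the step at which it completed.
    """
    pending = deque(enumerate(request_lengths))  # (request id, tokens needed)
    slots = [None] * n_slots  # each slot is [request id, tokens left] or None
    finished_at = {}
    step = 0

    while pending or any(slot is not None for slot in slots):
        # Hand waiting requests to any free slots.
        for i in range(n_slots):
            if slots[i] is None and pending:
                rid, length = pending.popleft()
                slots[i] = [rid, length]

        # One batched decode step: every active slot produces one token.
        step += 1
        for i in range(n_slots):
            if slots[i] is not None:
                slots[i][1] -= 1
                if slots[i][1] <= 0:
                    finished_at[slots[i][0]] = step
                    slots[i] = None  # slot becomes free for the next request

    return step, finished_at


if __name__ == "__main__":
    total, done = simulate([3, 1, 2, 2], n_slots=2)
    print(total)  # 5
    print(done)
```

With a single slot the same four requests take 8 steps, so the sketch also shows how much the batching saves in this toy setting.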