rpc : early register backend devices (#11262)

Register RPC devices early and do not propagate RPC specifics into the llama model structures. ref: #10609
2025-01-17 10:57:09 +02:00
commit 667d72846c
parent a133566d34
10 changed files with 61 additions and 55 deletions
--- a/include/llama.h
+++ b/include/llama.h
@@ -288,9 +288,6 @@ extern "C" {
        // proportion of the model (layers or rows) to offload to each GPU, size: llama_max_devices()
        const float * tensor_split;

-        // comma separated list of RPC servers to use for offloading
-        const char * rpc_servers;
-
        // Called with a progress value between 0.0 and 1.0. Pass NULL to disable.
        // If the provided progress_callback returns true, model loading continues.
        // If it returns false, model loading is immediately aborted.