llama : accept a list of devices to use to offload a model (#10497)

* llama : accept a list of devices to use to offload a model

* accept `--dev none` to completely disable offloading

* fix dev list with dl backends

* rename env parameter to LLAMA_ARG_DEVICE for consistency
This commit is contained in:
Diego Devesa 2024-11-25 19:30:06 +01:00 committed by GitHub
parent 1f922254f0
commit 10bce0450f
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
9 changed files with 104 additions and 27 deletions

View file

@ -76,6 +76,7 @@ int main(int argc, char ** argv) {
ctx_tgt = llama_init_tgt.context;
// load the draft model
params.devices = params.speculative.devices;
params.model = params.speculative.model;
params.n_gpu_layers = params.speculative.n_gpu_layers;
if (params.speculative.cpuparams.n_threads > 0) {