Support multiple GPUs (split mode) on SYCL backend (#5806)

* support multiple cards: split-mode - layer|row
* rm warning
* rebase with master, support two new OPs, close feature for -sm=row, fix for unit test
* update news
* fix merge error
* update according to review comments
2024-03-02 19:49:30 +08:00 · 2024-03-02 19:49:30 +08:00 · 715641391d
commit 715641391d
parent 9bf297a02b
8 changed files with 1506 additions and 814 deletions
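
The options this commit documents can be combined roughly as follows. This is a minimal sketch, not the project's official invocation: the binary path, the model path, and the `-ngl 33` layer count are assumptions chosen for illustration, while `--split-mode`, `--main-gpu`, and `ZES_ENABLE_SYSMAN` come from the README changes below.

```shell
# Assumed locations; adjust to your own build and model (both are hypothetical).
LLAMA_BIN="${LLAMA_BIN:-./build/bin/main}"
MODEL="${MODEL:-models/model.gguf}"

# Collect the arguments as positional parameters so they stay correctly quoted.
#   --split-mode layer : spread layers across the detected GPUs
#   --main-gpu 0       : use device 0 as the main GPU
#   -ngl 33            : number of layers to offload (assumed value)
set -- "$LLAMA_BIN" -m "$MODEL" -ngl 33 --split-mode layer --main-gpu 0

# Print the command for review. ZES_ENABLE_SYSMAN=1 lets the backend query
# free GPU memory, which the README recommends for layer split mode.
CMD="ZES_ENABLE_SYSMAN=1 $*"
echo "$CMD"

# To actually launch it, uncomment the next line:
# ZES_ENABLE_SYSMAN=1 "$@"
```

Printing the command first makes it easy to check the device and split settings before starting a long-running load.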
--- a/README-sycl.md
+++ b/README-sycl.md
@@ -1,6 +1,7 @@
 # llama.cpp for SYCL

 - [Background](#background)
+- [News](#news)
 - [OS](#os)
 - [Intel GPU](#intel-gpu)
 - [Docker](#docker)
@@ -25,6 +26,21 @@ The llama.cpp for SYCL is used to support Intel GPUs.

 For Intel CPUs, llama.cpp for X86 (built with Intel MKL) is recommended.

+## News
+
+- 2024.3
+  - Support multiple cards via **--split-mode**: [none|layer]. [row] is not supported yet; it is under development.
+  - Support selecting the main GPU with **--main-gpu**, which replaces $GGML_SYCL_DEVICE.
+  - Support detecting all Level Zero GPUs that share the highest **Max compute units** value.
+  - Support new OPs:
+    - hardsigmoid
+    - hardswish
+    - pool2d
+
+- 2024.1
+  - Create SYCL backend for Intel GPU.
+  - Support Windows build.
+
 ## OS

 |OS|Status|Verified|
@@ -449,6 +465,7 @@ Using device **0** (Intel(R) Arc(TM) A770 Graphics) as main device
 |-|-|-|
 |GGML_SYCL_DEVICE|0 (default) or 1|Set the device id used. Check the device ids by default running output|
 |GGML_SYCL_DEBUG|0 (default) or 1|Enable log function by macro: GGML_SYCL_DEBUG|
+|ZES_ENABLE_SYSMAN|0 (default) or 1|Enable querying the GPU's free memory through sycl::aspect::ext_intel_free_memory.<br>Recommended when --split-mode=layer|

 ## Known Issue

@@ -458,6 +475,10 @@ Using device **0** (Intel(R) Arc(TM) A770 Graphics) as main device

  Solution: add **--no-mmap** or **--mmap 0**.

+- Split-mode: [row] is not supported
+
+  It is under development.
+
 ## Q&A

 - Error:  `error while loading shared libraries: libsycl.so.7: cannot open shared object file: No such file or directory`.