From 1b2d22f3de5f01be853f56912f7d93252c5281a2 Mon Sep 17 00:00:00 2001 From: ngxson Date: Wed, 31 Jan 2024 00:10:44 +0100 Subject: [PATCH] docs: sycl build in docker --- README_sycl.md | 64 ++++++++++++++++++++++++++++++++++++-------------- 1 file changed, 46 insertions(+), 18 deletions(-) diff --git a/README_sycl.md b/README_sycl.md index d5a1818f5..94722489c 100644 --- a/README_sycl.md +++ b/README_sycl.md @@ -44,7 +44,7 @@ For Intel CPU, recommend to use llama.cpp for X86 (Intel MKL building). |Intel Data Center Flex Series| Support| Flex 170| |Intel Arc Series| Support| Arc 770| |Intel built-in Arc GPU| Support| built-in Arc GPU in Meteor Lake| -|Intel iGPU| Support| iGPU in i5-1250P, i7-1165G7| +|Intel iGPU| Support| iGPU in i5-1250P, i7-1260P, i7-1165G7| ## Linux @@ -59,7 +59,7 @@ Note: for iGPU, please install the client GPU driver. b. Add user to group: video, render. -``` +```sh sudo usermod -aG render username sudo usermod -aG video username ``` @@ -68,7 +68,7 @@ Note: re-login to enable it. c. Check -``` +```sh sudo apt install clinfo sudo clinfo -l ``` @@ -86,6 +86,7 @@ Platform #0: Intel(R) OpenCL HD Graphics 2. Install IntelĀ® oneAPI Base toolkit. +Note: You can skip step this if you want to build inside docker container a. Please follow the procedure in [Get the IntelĀ® oneAPI Base Toolkit ](https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit.html). @@ -95,7 +96,7 @@ Following guide use the default folder as example. If you use other folder, plea b. Check -``` +```sh source /opt/intel/oneapi/setvars.sh sycl-ls @@ -114,35 +115,48 @@ Output (example): 2. Build locally: +Note: +- You can choose between **F16** and **F32** build. F16 is faster for long-prompt inference. +- By default, it will build for all binary files. It will take more time. To reduce the time, we recommend to build for **example/main** only. + +Method using **docker**: + +```sh +# For F16: +#docker build -t llama-cpp-sycl:latest --build-arg="LLAMA_SYCL_F16=ON" -f .devops/main-intel.Dockerfile . + +# Or, for F32: +docker build -t llama-cpp-sycl -f .devops/main-intel.Dockerfile . + +# Note: you can also use the ".devops/main-server.Dockerfile", which compiles the "server" example ``` + +or, without docker: + +```sh mkdir -p build cd build source /opt/intel/oneapi/setvars.sh -#for FP16 -#cmake .. -DLLAMA_SYCL=ON -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx -DLLAMA_SYCL_F16=ON # faster for long-prompt inference +# For FP16: +#cmake .. -DLLAMA_SYCL=ON -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx -DLLAMA_SYCL_F16=ON -#for FP32 +# Or, for FP32: cmake .. -DLLAMA_SYCL=ON -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx -#build example/main only +# Build example/main only #cmake --build . --config Release --target main -#build all binary +# Or, build all binary cmake --build . --config Release -v - ``` or -``` +```sh ./examples/sycl/build.sh ``` -Note: - -- By default, it will build for all binary files. It will take more time. To reduce the time, we recommend to build for **example/main** only. - ### Run 1. Put model file to folder **models** @@ -155,12 +169,14 @@ source /opt/intel/oneapi/setvars.sh 3. List device ID +(Skip this step if you're using docker) + Run without parameter: -``` +```sh ./build/bin/ls-sycl-device -or +# or running the "main" executable and look at the output log: ./build/bin/main ``` @@ -189,12 +205,24 @@ found 4 SYCL devices: Set device ID = 0 by **GGML_SYCL_DEVICE=0** +Using docker image built from step 2: + +```sh +# Firstly, find all the DRI cards: +ls -la /dev/dri + +# Then, pick the card that you want to use. For example "/dev/dri/card1" +docker run -it --rm -v "$(pwd):/app:Z" --device /dev/dri/renderD128:/dev/dri/renderD128 --device /dev/dri/card1:/dev/dri/card1 llama-cpp-sycl -m "/app/models/YOUR_MODEL_FILE" -p "Building a website can be done in 10 simple steps:" -n 400 -e -ngl 33 ``` + +or, without docker: + +```sh GGML_SYCL_DEVICE=0 ./build/bin/main -m models/llama-2-7b.Q4_0.gguf -p "Building a website can be done in 10 simple steps:" -n 400 -e -ngl 33 ``` or run by script: -``` +```sh ./examples/sycl/run_llama2.sh ```