Commit graph

  • 97cbe18dd2 rename macro to intel hardware jianyuzh 2024-01-23 14:35:33 +08:00
  • 27c08c0429 Merge branch 'sycl' of https://github.com/abhilash1910/llama.cpp into sycl jianyuzh 2024-01-23 14:16:24 +08:00
  • a0a1304b0c add build&run script, clean CMakefile, update guide by review comments jianyuzh 2024-01-23 14:16:01 +08:00
  • b403784228 remove extra endif Meng, Hengyu 2024-01-23 06:09:19 +00:00
  • dd7f1396f9 cleanup 1 abhilash1910 2024-01-22 21:37:16 -08:00
  • 533c647d0e check for sycl blas, better performance jianyuzh 2024-01-23 13:34:05 +08:00
  • ffcd8e7fbd ws John 2024-01-23 05:52:12 +01:00
  • 8f39716d8e Support for Yi-VL, templating fix for mobileVLM John 2024-01-23 05:40:13 +01:00
  • 67e6b3cb7d align pr4766 Meng, Hengyu 2024-01-23 03:32:09 +00:00
  • f008cc7b68 enable SYCL_F16 support luoyu-intel 2024-01-22 13:41:09 +08:00
  • f396a3b65e add know issue for pvc hang issue jianyuzh 2024-01-20 14:54:20 +08:00
  • 623d8031cb fix code err luoyu-intel 2024-01-19 10:26:19 +08:00
  • e3481faa2f rm original sycl code before refactor jianyuzh 2024-01-19 09:53:48 +08:00
  • ae941b1b57 add syc and link for sycl readme jianyuzh 2024-01-19 09:52:04 +08:00
  • 35a0daaaa1 restore rm code to fix hang issue jianyuzh 2024-01-18 20:13:58 +08:00
  • d5f7d364f6 remove sycl version from include path luoyu-intel 2024-01-18 16:37:25 +08:00
  • 57e9fbadb2 fix return type luoyu-intel 2024-01-18 15:23:27 +08:00
  • 593ce001e2 Update README_sycl.md Neo Zhang Jianyu 2024-01-18 10:58:56 +08:00
  • d80dd65f42 dos2unix jianyuzh 2024-01-15 16:07:31 +08:00
  • 09b5619df4 rm rear space jianyuzh 2024-01-15 16:05:47 +08:00
  • 7350fd48ef add ls-sycl-device, rm unused files jianyuzh 2024-01-15 16:03:38 +08:00
  • 0d6e7219b6 add ls-sycl-device tool jianyuzh 2024-01-15 15:59:43 +08:00
  • 79d30d7713 add run script, comment debug code jianyuzh 2024-01-15 15:18:03 +08:00
  • a8936f4902 set nthread=1 when sycl, increase performance jianyuzh 2024-01-15 14:33:52 +08:00
  • 95daece908 fix build with sycl jianyuzh 2024-01-15 14:18:19 +08:00
  • ca2cb6982a update readme, refactor build script jianyuzh 2024-01-15 13:42:24 +08:00
  • c3c5b20ac5 mv dpct definition from folder dpct to ggml-sycl.h jianyuzh 2024-01-15 10:01:32 +08:00
  • c67c2ab228 refactor device log jianyuzh 2024-01-13 21:14:46 +08:00
  • a47f5ec42e summary dpct definition in one header file to replace folder:dpct jianyuzh 2024-01-13 20:33:42 +08:00
  • 5b5389941e fix error: wrong result in 658746bb26702e50f2c59c0e4ada8e9da6010481 jianyuzh 2024-01-13 19:55:30 +08:00
  • bd38129aeb add print tensor function to debug jianyuzh 2024-01-12 10:15:06 +08:00
  • 3645f25d74 correct queue: rm dtct:get_queue jianyuzh 2024-01-10 22:27:26 +08:00
  • fa3a58605b clear CMAKE to rm unused lib and options jianyuzh 2024-01-09 09:37:54 +08:00
  • c709c3cb37 ren ggml-sycl.hpp -> ggml-sycl.h jianyuzh 2024-01-09 08:48:18 +08:00
  • 69d76c8b58 fix error of select non-zero device, format device list jianyuzh 2024-01-08 14:23:55 +08:00
  • c2ef7a9cb9 step 8, rename all macro & func from cuda by sycl jianyuzh 2024-01-07 16:55:55 +08:00
  • 3b1a743e82 step7 add debug for code path, rm log jianyuzh 2024-01-06 20:01:29 +08:00
  • 65f895d41b support main device is non-zero jianyuzh 2024-01-04 23:09:56 +08:00
  • 3a9d2c54ba step6, enhance error check, remove CUDA macro, enhance device id to fix none-zero id issue jianyuzh 2024-01-04 14:26:36 +08:00
  • 6dd32789b4 step 5 format device and print jianyuzh 2023-12-31 15:48:00 +08:00
  • da752edaf5 add GGML_LIST_DEVICE function jianyuzh 2023-12-31 14:40:51 +08:00
  • 43f2c35859 step3 add fp16, slower 31->28 jianyuzh 2023-12-31 12:38:57 +08:00
  • 02dffb68b8 step 2 jianyuzh 2023-12-29 17:43:22 +08:00
  • ff83711055 step 1 jianyuzh 2023-12-29 17:25:40 +08:00
  • 0c00b4f654 add debug functio, commit all help code jianyuzh 2023-12-29 14:58:07 +08:00
  • 233876936b update init_cublas jianyuzh 2023-12-28 16:40:42 +08:00
  • 7a4343df61 first update for migration jianyuzh 2023-12-27 11:19:46 +08:00
  • 0177431cb2 ws John 2024-01-23 01:46:49 +01:00
  • 582bddc37a changed formating to adaptive exponential format John 2024-01-23 01:45:27 +01:00
  • 5122c82869 bugfix John 2024-01-23 01:28:02 +01:00
  • 8690363183 added inttypes.h John 2024-01-23 01:13:39 +01:00
  • 8ccb0d69cd trailing ws John 2024-01-23 01:09:19 +01:00
  • 0fa71d1760 moved from ggml to ggml-backend - as backend retrieval needed but header not available in ggml.c John 2024-01-23 01:07:19 +01:00
  • 0360217e8f nix-shell: use addToSearchPath Michael Hueschen 2024-01-22 16:44:10 -07:00
  • 3dc6b95761 llama.vim: added api key support Michael Coppola 2024-01-22 18:34:28 -05:00
  • 31bfd4a52b Update ggml.c John 2024-01-23 00:26:48 +01:00
  • 54f825a61c CUDA: more info when no device code JohannesGaessler 2024-01-22 22:47:04 +01:00
  • f2c364a574 Disable unsupported ops to fix tests 0cc4m 2024-01-22 23:52:13 +01:00
  • 011e8ec577 llama : fix not enough space in buffer with Qwen (#5086) b1954 slaren 2024-01-22 23:42:41 +01:00
  • 607fbe99c7 Update ggml.h John 2024-01-22 23:31:24 +01:00
  • d85a629a6c Update ggml.c John 2024-01-22 23:28:52 +01:00
  • 6b97c71834 refactor multi buf slaren 2024-01-22 22:44:46 +01:00
  • d083c81761 server: only add back deferred tasks when one slot is available ngxson 2024-01-22 23:06:12 +01:00
  • 1bd867894d server: correct multitask response ngxson 2024-01-22 22:42:59 +01:00
  • d87b48fd55 server: move all mutexes away from server.cpp ngxson 2024-01-22 22:32:06 +01:00
  • 73fbbd1526 Update llama.cpp John 2024-01-22 21:57:25 +01:00
  • bcf2a4488c Use min of maxMemoryAllocationSize and maxBufferSize for device max allocation size 0cc4m 2024-01-22 21:57:17 +01:00
  • 58fe9cf572 Merge branch 'master' into xsn/clean_up_server ngxson 2024-01-22 21:41:01 +01:00
  • f0bb1052c6 llama : fix not enough space in buffer with Qwen slaren 2024-01-22 21:05:25 +01:00
  • 84aa8899fb top-k sort speedup John 2024-01-22 20:46:13 +01:00
  • f652ebfd54 Implement max_size for backend buffer types to limit the size of a single allocation 0cc4m 2024-01-22 18:39:04 +01:00
  • 150af7ecf7 perplexity: add additional KL-divergence statistics Iwan Kawrakow 2024-01-22 18:02:50 +02:00
  • c05ee5bf4f Made hpp files easier to read and added helper titles in UI. John Boero 2024-01-22 15:50:08 +00:00
  • 8c9d953f2c Fixed segfault on invalid grammar. John Boero 2024-01-22 15:49:36 +00:00
  • 5785faff47 Merge branch 'master' into xsn/intel-oneapi Xuan Son Nguyen 2024-01-22 16:34:11 +01:00
  • 8128fa0bd3 perplexity: add top-token probability Iwan Kawrakow 2024-01-22 17:11:57 +02:00
  • 6f9939d119 KL-divergence (#5076) b1953 Kawrakow 2024-01-22 16:10:14 +02:00
  • f7fbeb5ece Merge branch 'dynamic-temp' of https://github.com/l3utterfly/llama.cpp into dynamic-temp l3utterfly 2024-01-22 22:16:30 +09:00
  • 4e97bdb497 return earlier if there is only 1 candiate (i.e. max_entropy == 0) l3utterfly 2024-01-22 22:16:05 +09:00
  • 780e24a22e ggml : parallelize FP32 conversion when using BLAS (#5045) b1952 Reinforce-II 2024-01-22 21:15:08 +08:00
  • 4817827a60 simplify code Reinforce-II 2024-01-22 21:13:57 +08:00
  • a98a49836c use nullptr in llama_sample_softmax call during llama_sample_entropy l3utterfly 2024-01-22 22:12:33 +09:00
  • 3ce7e8f8e7 llava : MobileVLM support (#4954) b1951 XiaotaoChen 2024-01-22 21:09:35 +08:00
  • b2d80e105a flake.nix: add a comment about flakes vs nix b1950 Someone Serge 2024-01-21 03:41:37 +00:00
  • 28603cd283 nix: add a comment on the many nixpkgs-with-cuda instances Someone Serge 2024-01-21 03:29:38 +00:00
  • 5e97ec91ae nix: add a comment about makeScope Someone Serge 2024-01-21 03:15:13 +00:00
  • 7251870780 nix: refactor the cleanSource rules Someone Serge 2024-01-13 17:45:01 +00:00
  • fe8b3c0d4b workflows: nix-ci: drop the redundant "paths" filter Someone Serge 2024-01-13 17:38:32 +00:00
  • f4dd059259 workflows: nix-build-aarch64: rate limit Someone Serge 2024-01-13 17:16:54 +00:00
  • f7276f7500 workflows: nix-ci: rebuild on flake.lock updates Someone Serge 2024-01-13 17:10:19 +00:00
  • 15bceec2d7 imatrix : keep intermediate imatrix results (#5077) b1943 Kawrakow 2024-01-22 14:18:43 +02:00
  • 21deba4921 nix: add cc to devShell LD_LIBRARY_PATH Michael Hueschen 2024-01-22 03:17:05 -07:00
  • 0e57eb875e Fix the editor config checks Chenxiaotao03 2024-01-22 19:51:14 +08:00
  • c53a10cac4 Be able to keep intermediate imatrix results Iwan Kawrakow 2024-01-22 13:38:19 +02:00
  • bbb578b09d Capture softmax operations for sampler profiling kalomaze 2024-01-22 05:24:54 -06:00
  • d6bd4d46dd llama : support StableLM 2 1.6B (#5052) b1942 compilade 2024-01-22 06:21:52 -05:00
  • 3cb8e5b7ce flake.nix: add a comment about flakes vs nix Someone Serge 2024-01-21 03:41:37 +00:00
  • 2aebd7a47a nix: add a comment on the many nixpkgs-with-cuda instances Someone Serge 2024-01-21 03:29:38 +00:00
  • 3c9dcb7249 nix: add a comment about makeScope Someone Serge 2024-01-21 03:15:13 +00:00
  • e72a0a78c7 nix: refactor the cleanSource rules Someone Serge 2024-01-13 17:45:01 +00:00