Commit graph

  • 2eb76b2a5e flake.lock: Update (#10346) Georgi Gerganov 2024-11-18 16:08:20 +02:00
  • fbb2513704 sycl: Revert MUL_MAT_OP support changes Alberto Cabrera 2024-11-18 13:12:41 +00:00
  • 03acc8d39b Merge branch 'ggerganov:master' into fix-cmake-pkg-cross-compile bandoti 2024-11-18 09:05:16 -04:00
  • 3df375e1bd [fix] Modify the status of finish_reason if the stream value is False SeongBeomLEE 2024-11-18 20:26:19 +09:00
  • ac350df7b1 Merge branch 'ggerganov:master' into vulkan-initialize-value FirstTimeEZ 2024-11-18 23:14:41 +13:00
  • bdc7b40273 Merge branch 'ggerganov:master' into fix-sycl-ci FirstTimeEZ 2024-11-18 23:14:29 +13:00
  • 62cc60d3e4 Merge a6fe84abac into 9b75f03cd2 FirstTimeEZ 2024-11-18 10:10:16 +00:00
  • a6fe84abac Merge branch 'ggerganov:master' into vulkan-assertion FirstTimeEZ 2024-11-18 23:10:13 +13:00
  • 9b75f03cd2 Vulkan: Fix device info output format specifiers (#10366) b4122 0cc4m 2024-11-18 11:02:43 +01:00
  • c103f2f4ef GGML_OP_SOFT_MAX FirstTimeEZ 2024-11-18 21:35:01 +13:00
  • a8cab0c08a GGML_OP_SOFT_MAX FirstTimeEZ 2024-11-18 21:21:32 +13:00
  • e4c2c33206 CI: fix windows-latest-cmake-sycl FirstTimeEZ 2024-11-18 21:14:36 +13:00
  • 97cf81a037 CI: fix windows-latest-cmake-sycl FirstTimeEZ 2024-11-18 20:16:55 +13:00
  • 29081bb864 CI: fix windows-latest-cmake-sycl FirstTimeEZ 2024-11-18 19:54:33 +13:00
  • ac59e10696 cmake: force MSVC compiler charset to utf-8 蕭澧邦 2024-11-18 14:01:07 +08:00
  • 940bdd8bc3 Vulkan: Use zu printf specifier for size_t instead of ld 0cc4m 2024-11-18 06:09:34 +00:00
  • 85fc2974f2 vulkan: Further soft_max optimizations Jeff Bolz 2024-11-17 18:48:33 -06:00
  • 907aef9132 Merge branch 'ggerganov:master' into vulkan-initialize-value FirstTimeEZ 2024-11-18 17:28:17 +13:00
  • 50efb5886d vulkan: remove use of null initializer Jeff Bolz 2024-11-17 21:28:00 -06:00
  • ae7e0b009a CI: fix windows-latest-cmake-sycl FirstTimeEZ 2024-11-18 16:02:07 +13:00
  • 2906ee0240 Merge branch 'ggerganov:master' into vulkan-assertion FirstTimeEZ 2024-11-18 15:29:54 +13:00
  • 408bffb751 CI: fix windows-latest-cmake-sycl FirstTimeEZ 2024-11-18 13:17:56 +13:00
  • 75207b3a88 docker: use GGML_NATIVE=OFF (#10368) Johannes Gäßler 2024-11-18 00:21:53 +01:00
  • 7f54f8e555 docker: use GGML_NATIVE=OFF Johannes Gäßler 2024-11-18 00:02:13 +01:00
  • 9d1ab28aed Merge branch 'vulkan-assertion' of https://github.com/FirstTimeEZ/llama.cpp into vulkan-assertion FirstTimeEZ 2024-11-18 11:56:39 +13:00
  • b75785d5b4 vulkan: change an assertion and minify others FirstTimeEZ 2024-11-18 11:56:33 +13:00
  • c100e21a78 Merge branch 'ggerganov:master' into vulkan-assertion FirstTimeEZ 2024-11-18 11:24:17 +13:00
  • 76e9e58b78 CUDA: fix MMV kernel being used for FP16 src1 (#10357) b4120 Johannes Gäßler 2024-11-17 23:20:42 +01:00
  • 2ed70d8c8d vulkan: change an assertion and minify others FirstTimeEZ 2024-11-18 11:19:19 +13:00
  • b5d5af4cdb Merge branch 'vulkan-assertion' of https://github.com/FirstTimeEZ/llama.cpp into vulkan-assertion FirstTimeEZ 2024-11-18 11:08:28 +13:00
  • 4629b76d75 vulkan: change an assertion and minify others FirstTimeEZ 2024-11-18 11:06:38 +13:00
  • 0c74e097da Vulkan: Fix device info output format specifiers 0cc4m 2024-11-17 19:51:43 +00:00
  • 883dc22d44 Update test-tokenizer-random.py Robert 2024-11-17 08:35:07 -08:00
  • 08c977e37a Skip searching root path for cross-compile builds Mason M 2024-11-17 10:20:53 -04:00
  • da1aab0d4a Merge branch 'ggerganov:master' into vulkan-initialize-value FirstTimeEZ 2024-11-18 01:04:12 +13:00
  • a1e88f0bcc Merge branch 'ggerganov:master' into vulkan-assertion FirstTimeEZ 2024-11-18 01:04:05 +13:00
  • ce2e59ba10 CMake: fix typo in comment [no ci] (#10360) Johannes Gäßler 2024-11-17 12:59:38 +01:00
  • cd62c5851a CMake: fix typo in comment [no ci] Johannes Gäßler 2024-11-17 12:42:17 +01:00
  • be5caccef9 llama : only use default buffer types for the KV cache (#10358) b4118 Diego Devesa 2024-11-17 12:25:45 +01:00
  • 20a780c7b6 gitignore : ignore local run scripts [no ci] Georgi Gerganov 2024-11-17 13:12:22 +02:00
  • 5c9e20beea CUDA: fix MMV kernel being used for FP16 src1 Johannes Gäßler 2024-11-17 11:10:53 +01:00
  • b7904dd728 Merge branch 'vulkan-assertion' of https://github.com/FirstTimeEZ/llama.cpp into vulkan-assertion FirstTimeEZ 2024-11-17 23:56:45 +13:00
  • 855a685cc0 vulkan-assertions FirstTimeEZ 2024-11-17 23:56:38 +13:00
  • 7345c2cccb Merge branch 'ggerganov:master' into vulkan-assertion FirstTimeEZ 2024-11-17 23:54:25 +13:00
  • 3db18a765f Merge branch 'vulkan-assertion' of https://github.com/FirstTimeEZ/llama.cpp into vulkan-assertion FirstTimeEZ 2024-11-17 23:52:36 +13:00
  • 281d629380 vulkan: less assertions FirstTimeEZ 2024-11-17 23:52:30 +13:00
  • d2750fdf25 llama : only use default buffer types for the KV cache slaren 2024-11-17 11:13:24 +01:00
  • 3427a47259 Merge def47780cc into cf32a9b93a Junil Kim 2024-11-17 23:04:07 +13:00
  • cf32a9b93a metal : refactor kernel args into structs (#10238) Georgi Gerganov 2024-11-17 11:23:01 +02:00
  • a43178299c ggml : fix undefined reference to 'getcpu' (#10354) b4115 FirstTimeEZ 2024-11-17 21:39:22 +13:00
  • c3ea58aca4 CUDA: remove DMMV, consolidate F16 mult mat vec (#10318) b4114 Johannes Gäßler 2024-11-17 09:09:55 +01:00
  • 467576b6cc CMake: default to -arch=native for CUDA build (#10320) b4113 Johannes Gäßler 2024-11-17 09:06:34 +01:00
  • e31f5464c3 ggml error: undefined reference to 'getcpu' FirstTimeEZ 2024-11-17 21:06:05 +13:00
  • a112eb45c4 ggml : add ggml-metal-impl.h Georgi Gerganov 2024-11-10 18:29:09 +02:00
  • 1c603023ed metal : add TODOs for rest of ops Georgi Gerganov 2024-11-10 17:56:12 +02:00
  • f018669cf5 metal : GGML_OP_NORM Georgi Gerganov 2024-11-10 17:17:18 +02:00
  • b438ff7e7c metal : GGML_OP_RMS_NORM Georgi Gerganov 2024-11-10 15:31:43 +02:00
  • 2b86f84839 metal : GGML_OP_CPY Georgi Gerganov 2024-11-10 13:55:26 +02:00
  • d7488ba09c metal : GGML_OP_REPEAT Georgi Gerganov 2024-11-10 13:21:59 +02:00
  • 281fa05e83 metal : GGML_OP_ADD, GGML_OP_SUB, GGML_OP_MUL, GGML_OP_DIV Georgi Gerganov 2024-11-10 13:16:54 +02:00
  • 4c1c7213e2 metal : GGML_OP_CONCAT Georgi Gerganov 2024-11-10 13:03:25 +02:00
  • 1a8f8df35d cont : int safety + register optimizations Georgi Gerganov 2024-11-10 11:05:10 +02:00
  • ec18f96891 cont : mul mm id Georgi Gerganov 2024-11-10 10:32:15 +02:00
  • cd89d1a877 cont : thread counters style Georgi Gerganov 2024-11-10 09:57:41 +02:00
  • f759814c66 cont : shmem style Georgi Gerganov 2024-11-10 09:45:06 +02:00
  • d2a055059e cont : use char ptr Georgi Gerganov 2024-11-10 09:26:53 +02:00
  • 481b05df22 cont : args is first argument Georgi Gerganov 2024-11-10 08:47:30 +02:00
  • 4af3a87962 cont : pass by reference Georgi Gerganov 2024-11-10 08:10:22 +02:00
  • 07bc7610ad cont : mul mat vec Georgi Gerganov 2024-11-09 22:56:39 +02:00
  • 0d0c54fc5a metal : mul mat struct (wip) Georgi Gerganov 2024-11-09 17:54:40 +02:00
  • cbae088721 metal : cont + avoid potential int overflow [no ci] Georgi Gerganov 2024-11-09 16:39:36 +02:00
  • 362a3f3433 metal : fattn args Georgi Gerganov 2024-11-09 16:09:31 +02:00
  • 051ff11140 metal : add kernel arg structs (wip) Georgi Gerganov 2024-11-09 15:28:55 +02:00
  • eda7e1d4f5 ggml : fix possible buffer use after free in sched reserve (#9930) b4112 Diego Devesa 2024-11-17 07:31:17 +01:00
  • 24203e9dd7 ggml : inttypes.h -> cinttypes (#0) b4111 Georgi Gerganov 2024-11-16 23:40:39 +02:00
  • 5d9e59979c ggml : adapt AMX to tensor->grad removal (#0) Georgi Gerganov 2024-11-16 21:38:01 +02:00
  • a4200cafad make : add ggml-opt (#0) Georgi Gerganov 2024-11-16 21:35:31 +02:00
  • 84274a10c3 tests : remove test-grad0 Georgi Gerganov 2024-11-16 21:34:03 +02:00
  • 68fcb4759c ggml : fix compile warnings (#0) Georgi Gerganov 2024-11-16 21:32:41 +02:00
  • 8a43e940ab ggml: new optimization interface (ggml/988) Johannes Gäßler 2024-11-16 22:17:59 +02:00
  • 5c9a8b22b1 scripts : update sync Georgi Gerganov 2024-11-16 22:16:04 +02:00
  • f9e9792f1d common: compile shared lib, and export some c functions KenForever1 2024-11-17 13:36:14 +08:00
  • bc8648fbbe Update test-tokenizer-random.py Robert 2024-11-16 21:01:38 -08:00
  • 1d24ee94f8 flake.lock: Update github-actions[bot] 2024-11-17 00:23:32 +00:00
  • a3822fb59b update Makefile Djip007 2024-11-17 00:49:52 +01:00
  • 6b27075768 Merge 413a19e25c into 0fff7fd798 Robert Collins 2024-11-17 00:34:52 +01:00
  • 0fff7fd798 docs : vulkan build instructions to use git bash mingw64 (#10303) FirstTimeEZ 2024-11-17 12:29:18 +13:00
  • 82efaafe9d Apply suggestions from the PR: refactor test-vanilla-pca and remove unecessary allocations Lucas Nogueira 2024-11-16 20:16:05 -03:00
  • 62751a89e6 CMake: default to -arch=native for CUDA build Johannes Gäßler 2024-11-15 21:17:45 +01:00
  • dda8847636 some cleanup with tinyblas backend Djip007 2024-11-16 22:30:02 +01:00
  • 7dd261f3e9 extract llamafile in new tinyblas backend Djip007 2024-11-16 20:58:30 +01:00
  • 4e54be0ec6 llama/ex: remove --logdir argument (#10339) b4103 Johannes Gäßler 2024-11-16 23:00:41 +01:00
  • f338cb0de7 Merge branch 'ggerganov:master' into patch-4 FirstTimeEZ 2024-11-17 10:51:08 +13:00
  • 245f5d49e9 Merge branch 'ggerganov:master' into vulkan-initialize-value FirstTimeEZ 2024-11-17 10:50:56 +13:00
  • 7a4ac544f6 Merge branch 'ggerganov:master' into vulkan-assertion FirstTimeEZ 2024-11-17 10:50:46 +13:00
  • cd3b8db4e5 ggml : inttypes.h -> cinttypes (#0) Georgi Gerganov 2024-11-16 23:40:39 +02:00
  • ce65dfe251 ggml : adapt AMX to tensor->grad removal (#0) Georgi Gerganov 2024-11-16 21:38:01 +02:00
  • d6c7f2a669 make : add ggml-opt (#0) Georgi Gerganov 2024-11-16 21:35:31 +02:00
  • eeb7c148e0 tests : remove test-grad0 Georgi Gerganov 2024-11-16 21:34:03 +02:00
  • 1e95f0c58d ggml : fix compile warnings (#0) Georgi Gerganov 2024-11-16 21:32:41 +02:00