Commit graph

  • f6e6fc4d47 Address PR comments to increase readibility nscipione 2024-12-02 14:58:30 +00:00
  • 64ed2091b2 server: Add "tokens per second" information in the backend (#10548) b4240 haopeng 2024-12-02 21:45:54 +08:00
  • 7bd044045f arg: print list of built-in templates Xuan Son Nguyen 2024-12-02 13:53:03 +01:00
  • 7f6e7570db use "built-in" instead of "supported" Xuan Son Nguyen 2024-12-02 13:41:09 +01:00
  • 28d8c91741 add test Xuan Son Nguyen 2024-12-02 13:34:31 +01:00
  • 47b0528ce9 llama : add enum for supported chat templates Xuan Son Nguyen 2024-12-02 12:33:11 +01:00
  • 591f515246 cmake: clean up generated files pre build Xuan Son Nguyen 2024-12-02 10:38:34 +01:00
  • f45c40e31c metal : small-batch mat-mul kernels Georgi Gerganov 2024-11-11 13:16:12 +02:00
  • dca39b08b0 Avoid using __fp16 on ARM with old nvcc Frankie Robertson 2024-12-01 18:57:55 +02:00
  • 328ded353b docs : remove obsolete make references, scripts, examples Georgi Gerganov 2024-12-02 10:24:54 +02:00
  • c536c07e1e ci : disable swift build Georgi Gerganov 2024-12-02 10:17:33 +02:00
  • ce4bab86b2 docs : remove make references [no ci] Georgi Gerganov 2024-11-29 23:41:02 +02:00
  • 60849b6204 ci : disable Makefile builds Georgi Gerganov 2024-11-28 17:01:51 +02:00
  • f8b9645dc8 make : deprecate Georgi Gerganov 2024-11-26 12:45:52 +02:00
  • 0b15d2d745 fix conficts (#32) T 2024-12-02 15:41:12 +08:00
  • 991f8aabee SYCL: Fix and switch to GGML_LOG system instead of fprintf (#10579) b4239 Akarshan Biswas 2024-12-02 12:34:11 +05:30
  • 4cb003dd8d contrib : refresh (#10593) Georgi Gerganov 2024-12-02 08:53:27 +02:00
  • 661b3f718c Merge pull request #31 from NexaAI/teliu/dev Zack Li 2024-12-01 22:40:13 -08:00
  • a2c53052bd merge from master Te993 2024-12-02 14:38:20 +08:00
  • 809db95990 ugrade to llama.cpp 74d73dc Te993 2024-12-02 14:24:50 +08:00
  • 26a8801ce0 Merge branch 'master' into log_switch Akarshan Biswas 2024-12-02 10:15:40 +05:30
  • 496d4efeee Update deprecation-warning.cpp aryantandon01 2024-12-02 10:02:24 +05:30
  • a3b893c6c2 Merge 90efb346a4 into 917786f43d aryantandon01 2024-12-02 04:08:05 +00:00
  • 90efb346a4 Merge branch 'ggerganov:master' into master aryantandon01 2024-12-02 09:38:03 +05:30
  • 550f4f0b0e Update deprecation-warning.cpp aryantandon01 2024-12-02 09:36:05 +05:30
  • 8545425976 Merge branch 'ggerganov:master' into master Wang Qin 2024-12-01 15:42:42 -08:00
  • 917786f43d Add mistral-v1, mistral-v3, mistral-v3-tekken and mistral-v7 chat template types (#10572) Juk Armstrong 2024-12-01 22:09:49 +00:00
  • e116f59c53 force rebuild .hpp files Xuan Son Nguyen 2024-12-01 22:47:18 +01:00
  • 8ec8060b13 Refactor projector_type enum to enum class for type safety and clarity - Added explicit handling for ProjectorType::UNKNOWN for robustness. This should solve issue #7073. Wang Qin 2024-12-01 12:23:36 -08:00
  • 5e1ed95583 grammars : add English-only grammar (#10612) Georgi Gerganov 2024-12-01 21:37:54 +02:00
  • ae9818e06c Merge branch 'ggerganov:master' into master Wang Qin 2024-12-01 10:54:44 -08:00
  • 5c7a5aa0c3 ci: add error handling for Python venv creation in run.sh (#10608) Wang Qin 2024-12-01 10:11:42 -08:00
  • 3420909dff ggml : automatic selection of best CPU backend (#10606) b4234 Diego Devesa 2024-12-01 16:12:41 +01:00
  • 854eff8da5 add GGML_AVX_VNNI to enable avx-vnni, fix checks slaren 2024-12-01 15:51:25 +01:00
  • 6d78e0f335 add cpuid check for avx-vnni slaren 2024-12-01 14:57:12 +01:00
  • 905810f91a Merge aa6f413f43 into 86dc11c5bc Djip007 2024-12-01 16:38:13 +03:00
  • b14b9bf692 amx : minor opt slaren 2024-12-01 14:11:53 +01:00
  • 86dc11c5bc server : bind to any port when specified (#10590) b4233 alek3y 2024-12-01 12:33:12 +01:00
  • 74069c4a55 prs : update template to not have checkbox [no ci] Georgi Gerganov 2024-12-01 12:22:02 +02:00
  • 869ea28b5d contrib : add CODEOWNERS Georgi Gerganov 2024-12-01 12:15:30 +02:00
  • f9bd569a9a grammars : add English-only grammar Georgi Gerganov 2024-12-01 11:38:35 +02:00
  • 6acce39710 readme : update the usage section with examples (#10596) Georgi Gerganov 2024-12-01 11:25:17 +02:00
  • ec1f6079f9 readme : more examples Georgi Gerganov 2024-12-01 11:24:12 +02:00
  • 43957ef203 build: update Makefile comments for C++ version change (#10598) b4231 Wang Qin 2024-11-30 19:19:44 -08:00
  • f4b8ab0900 ci: add error handling for Python venv creation in run.sh Wang Qin 2024-11-30 16:05:38 -08:00
  • aa6f413f43 move to c++17 Djip007 2024-11-30 19:18:06 +01:00
  • 8bfef91b8b ggml : automatic selection of best CPU backend slaren 2024-11-30 20:45:03 +01:00
  • aa5453de94 Update llama.cpp Juk Armstrong 2024-11-30 19:51:13 +00:00
  • 8759da4202 fix typo of README.md Wang Ran (汪然) 2024-12-01 03:00:04 +08:00
  • 0c39f44d70 ggml-cpu: replace AArch64 NEON assembly with intrinsics in ggml_gemv_q4_0_4x4_q8_0() (#10567) b4230 Adrien Gallouët 2024-11-30 18:13:18 +01:00
  • 1b301dbec3 remove empty line lhpqaq 2024-12-01 00:38:59 +08:00
  • 21f8b73d60 fix code lhpqaq 2024-12-01 00:35:41 +08:00
  • 038b5fa860 Add FP8 support to gguf/llama: Djip007 2024-11-05 01:20:30 +01:00
  • 43b5d9e838 some correction: Djip007 2024-11-05 01:02:11 +01:00
  • 753bccee33 fix ci Xuan Son Nguyen 2024-11-30 15:17:47 +01:00
  • b940cc8188 (test) add CI step for verifying build Xuan Son Nguyen 2024-11-30 15:12:04 +01:00
  • 6c4305fca1 sync build Xuan Son Nguyen 2024-11-30 15:02:19 +01:00
  • c26cf38439 fix more problems on mobile Xuan Son Nguyen 2024-11-30 14:27:18 +01:00
  • 9e7af6470a fix responsive on mobile Xuan Son Nguyen 2024-11-30 14:03:14 +01:00
  • 2eda9b434d fix build (2) Xuan Son Nguyen 2024-11-30 13:18:55 +01:00
  • c74d1d8438 fix build Xuan Son Nguyen 2024-11-30 13:13:10 +01:00
  • 059a755935 use npm as deps manager and vite as bundler Xuan Son Nguyen 2024-11-30 13:08:01 +01:00
  • a448fc071e Update deprecation-warning.cpp aryantandon01 2024-11-30 17:35:37 +05:30
  • 734bd82342 build: update Makefile comments for C++ version change Wang Qin 2024-11-30 03:33:43 -08:00
  • 116fc4ef1e hide buttons in dropdown menu Xuan Son Nguyen 2024-11-30 11:20:37 +01:00
  • c47c41cd35 Merge branch 'ggerganov:master' into token haopeng 2024-11-30 17:50:54 +08:00
  • c66bc3e4fb contrib : expand test-backend-ops instructions Georgi Gerganov 2024-11-30 11:09:26 +02:00
  • fad62ea17a readme : update the usage section with examples Georgi Gerganov 2024-11-30 10:53:40 +02:00
  • 3e0ba0e604 readme : remove old badge Georgi Gerganov 2024-11-30 10:09:21 +02:00
  • 04e26bc6cf contrib : expand [no ci] Georgi Gerganov 2024-11-30 09:55:38 +02:00
  • e8f2a2dac1 contrib : refresh Georgi Gerganov 2024-11-30 09:50:54 +02:00
  • abadba05be readme : refresh (#10587) Georgi Gerganov 2024-11-30 09:47:07 +02:00
  • 3b4c551a25 readme : clarify GGUF Georgi Gerganov 2024-11-30 09:41:23 +02:00
  • 0533e7fb38 vulkan: Dynamic subgroup size support for Q6_K mat_vec (#10536) b4227 Eve 2024-11-30 07:00:02 +00:00
  • 5ff563257c Update src/llama.cpp for not contain <|end|> or </s> piDack 2024-11-30 10:29:58 +08:00
  • b65961bfba make 16 subgroup size a constant Eve 2024-11-29 20:02:59 -05:00
  • 6b84093643 Server (front): Improve mobile UI Stéphane du Hamel 2024-11-29 23:45:32 +01:00
  • b01096e903 server : bind to any port when specified alek3y 2024-11-29 23:31:18 +01:00
  • b223e7b097 readme : simplify [no ci] Georgi Gerganov 2024-11-29 23:54:26 +02:00
  • 4b8ce77828 readme : more fixes [no ci] Georgi Gerganov 2024-11-29 23:51:42 +02:00
  • 308c04130c readme : fixes [no ci] Georgi Gerganov 2024-11-29 23:35:32 +02:00
  • 7cc2d2c889 ggml : move AMX to the CPU backend (#10570) b4226 Diego Devesa 2024-11-29 21:54:58 +01:00
  • b782e5c7d4 server : add more test cases (#10569) Xuan Son Nguyen 2024-11-29 21:48:56 +01:00
  • 12115e2e0c Update ggml/src/ggml-cpu/amx/common.h Diego Devesa 2024-11-29 21:43:42 +01:00
  • e3c7b4f95c readme : clarify [no ci] Georgi Gerganov 2024-11-29 22:18:53 +02:00
  • e8338b3b4d readme : move section [no ci] Georgi Gerganov 2024-11-29 22:12:21 +02:00
  • 4ba28761e7 readme : refresh Georgi Gerganov 2024-11-29 22:08:00 +02:00
  • 150d6e9232 server : force F16 KV cache for the draft model Georgi Gerganov 2024-11-29 19:33:49 +02:00
  • 3a8e9af402 imatrix : support combine-only (#10492) b4224 Robert Collins 2024-11-29 12:21:37 -05:00
  • a3a3048e7a cleanup UI link list (#10577) Diego Devesa 2024-11-29 17:45:08 +01:00
  • 98e6651e2f add more function into llama-cpp.h Xuan Son Nguyen 2024-11-29 16:48:52 +01:00
  • 715682d21a add support T5 in swift example zhao.lu 2024-11-29 23:21:06 +08:00
  • aaa6682ab4 ggml-cpu: replace AArch64 NEON assembly with intrinsics in ggml_gemv_q4_0_4x4_q8_0() Adrien Gallouët 2024-11-26 19:03:22 +01:00
  • 5d11848f36 Merge 61d34f1911 into f0678c5ff4 l3utterfly 2024-11-29 15:51:51 +01:00
  • 0ae587acd6 Merge 5667b2acc8 into f0678c5ff4 GPTLocalhost (Word Add-in) 2024-11-29 22:49:03 +08:00
  • ffd0a998c7 Formatting nscipione 2024-11-29 14:40:35 +00:00
  • 1f7a0c12b2 Removed all references to 'v2' template from comments juk 2024-11-29 14:39:59 +00:00
  • 0427e6563b Removed 'mistral-v2' option as no (open) models ever used it juk 2024-11-29 14:30:01 +00:00
  • f0678c5ff4 ggml : fix I8MM Q4_1 scaling factor conversion (#10562) b4222 Georgi Gerganov 2024-11-29 16:25:39 +02:00
  • cbd08b4204 resolve linter, test errors HimariO 2024-11-29 22:18:15 +08:00