Commit graph

  • 163d50adaf fixes #7999 (adds control vectors to all build_XXX() functions in llama.cpp [needs testing]) (#8060) b3230 jukofyork 2024-06-25 21:47:40 +01:00
  • ae6a22b2a6 Merge branch 'embed_files' of https://github.com/katsu560/llama.cpp into embed_files katsu560 2024-06-26 05:03:18 +09:00
  • fba3a86474 Merge remote-tracking branch 'origin/master' into json-additional ochafik 2024-06-25 21:02:44 +01:00
  • e4562042a4 sync ggml katsu560 2024-06-26 04:55:13 +09:00
  • b096272f75 sync ggml katsu560 2024-06-26 04:54:01 +09:00
  • 3c1532e062 make,cmake : fix LLAMA_CUDA + replace GGML_CDEF_PRIVATE Georgi Gerganov 2024-06-25 22:33:47 +03:00
  • af421cab3e cmake : fix kompute build Georgi Gerganov 2024-06-25 15:13:03 +03:00
  • 15c1c79449 cmake : build normal ggml library (not object library) [no ci] Georgi Gerganov 2024-06-25 15:09:18 +03:00
  • ec0a609067 cmake : link math library [no ci] Georgi Gerganov 2024-06-25 12:02:26 +03:00
  • c80108fce7 cmake : minor [no ci] Georgi Gerganov 2024-06-25 11:30:47 +03:00
  • 202e0d515b server : fix mingw build Georgi Gerganov 2024-06-25 09:56:47 +03:00
  • 8627cd1307 cmake : fixes [no ci] Georgi Gerganov 2024-06-22 16:51:59 +03:00
  • c36fb3aac4 ci : disable kompute build [no ci] Georgi Gerganov 2024-06-24 10:07:11 +03:00
  • be08c5af8a files : relocate [no ci] Georgi Gerganov 2024-06-24 19:21:23 +03:00
  • 40593025aa scripts : update sync [no ci] Georgi Gerganov 2024-06-21 12:29:55 +03:00
  • b1a70fc5d1 Merge branch 'ggerganov:master' into vulkan-build-integration bandoti 2024-06-25 16:28:07 -03:00
  • 37bcad7d6d Update README.md bandoti 2024-06-25 16:27:24 -03:00
  • 6fcbf68235 llama : implement Unigram tokenizer needed by T5 and FLAN-T5 model families (#5763) b3229 fairydreaming 2024-06-25 21:14:35 +02:00
  • cb3ec8887d Merge branch 'ggerganov:master' into vulkan-build-integration bandoti 2024-06-25 16:11:35 -03:00
  • e6bf007744 llama : return nullptr from llama_grammar_init (#8093) b3228 Daniel Bevenius 2024-06-25 21:07:28 +02:00
  • 84631fe150 json: support integer minimum, maximum, exclusiveMinimum, exclusiveMaximum (#7797) b3227 Olivier Chafik 2024-06-25 20:06:20 +01:00
  • 1b4759f9cf Add CMake target for Vulkan shaders Mason M 2024-06-25 15:38:10 -03:00
  • 89f764555e fix ci sasha0552 2024-06-25 18:26:23 +00:00
  • a7e1725e8f fix ci sasha0552 2024-06-25 17:58:57 +00:00
  • 450eafc7b8 llama : NvAPI performance state change support sasha0552 2024-06-25 16:31:06 +00:00
  • dd047b476c disable docker CI on pull requests (#8110) b3226 slaren 2024-06-25 19:20:06 +02:00
  • b6cd699dc5 Add message about int8 support Isaac McFadyen 2024-06-25 12:54:50 -04:00
  • a5a53194ff Moved control vector logic to llama_control_vector::apply_to() jukofyork 2024-06-25 17:22:05 +01:00
  • f23ff913d0 llama : fix whitespace formatting Stanisław Szymczyk 2024-06-25 18:08:14 +02:00
  • 21d36842e8 Merge branch 'ggerganov:master' into t5-clean-2 fairydreaming 2024-06-25 17:37:59 +02:00
  • 68220feaf8 Update bruteforce test jaime-m-p 2024-06-25 17:36:44 +02:00
  • 87b7dd2322 llama : replace allocated precompiled_charsmap buffer with std::vector to avoid memory leak Stanisław Szymczyk 2024-06-25 17:36:42 +02:00
  • 107923cdd2 Better leading space removal jaime-m-p 2024-06-25 17:33:56 +02:00
  • 9854a9cde9 Symmetric params for llama_tokenize() and llama_detokenize() jaime-m-p 2024-06-25 17:28:53 +02:00
  • 925c30956d Add healthchecks to llama-server containers (#8081) joecryptotoo 2024-06-25 08:13:27 -07:00
  • ed90e43c70 Merge branch 'ggerganov:master' into jukofyork-command_r-control-vector-fix jukofyork 2024-06-25 15:47:38 +01:00
  • 29ba5a83d7 Split generated shader file into separate translation unit Mason M 2024-06-25 11:40:59 -03:00
  • 3b957221a1 disable docker CI on pull requests slaren 2024-06-25 15:23:18 +02:00
  • 36bf00369a defensive code against string out of bounds (apparently different behaviour of libstdc++ vs. clang's libc++, can't read final NULL char w/ former) Olivier Chafik 2024-06-25 14:09:22 +01:00
  • 4c67d7cef5 add space in "-1" caitianchi 2024-06-25 20:06:55 +08:00
  • e68c8bc1e3 change n_layer caitianchi 2024-06-25 20:05:52 +08:00
  • c8ad35955a Gguf dump start data offset via --data-offset and some extra refactor (#8054) Brian 2024-06-25 22:03:25 +10:00
  • 49c03c79cd cvector: better prompt handling, add "mean vector" method (#8069) b3223 Xuan Son Nguyen 2024-06-25 13:59:54 +02:00
  • 48e6b92cc3 Add chat template support for llama-cli (#8068) b3222 Xuan Son Nguyen 2024-06-25 13:56:49 +02:00
  • 3791ad2193 SimpleChat v3.1: Boolean chat request options in Settings UI, cache_prompt (#7950) HanishKVC 2024-06-25 16:57:35 +05:30
  • 8f0350578d fix quality problem in pr code caitianchi 2024-06-25 18:51:06 +08:00
  • bc9c9a8a82 squash! clip : suppress unused variable warnings Daniel Bevenius 2024-06-25 11:50:16 +01:00
  • 89e8aaf960 Revert "add eos_id_list to llama.cpp" toyer 2024-06-25 09:23:57 +00:00
  • 8bffa853aa Merge branch 'master' into xsn/cvector-better-prompt ngxson 2024-06-25 10:57:24 +02:00
  • 0d4ecfd9f4 remove inverted pca hotfix ngxson 2024-06-25 10:54:17 +02:00
  • 895bb2a697 Update examples/main/main.cpp Xuan Son Nguyen 2024-06-25 10:50:25 +02:00
  • f702a90e24 Update control vector help (#8104) b3220 HatsuneMikuUwU33 2024-06-25 10:44:48 +02:00
  • a2b46fbda6 Merge branch 'ggerganov-master' Xiang 2024-06-25 07:05:25 +00:00
  • 176a2454ce Add YX UI for llama-server Aliebc 2024-06-15 17:50:00 +08:00
  • 74b020b068 Merge with conflict Aliebc 2024-06-15 10:45:01 +08:00
  • 3557944893 Merge branch 'glm_support' toyer 2024-06-25 06:26:49 +00:00
  • 1702a61ba5 Merge d674812474 into 083bacce14 Sigbjørn Skjæret 2024-06-25 09:08:01 +03:00
  • a67bc8f5a8 fix conflicts toyer 2024-06-25 06:00:43 +00:00
  • c70787d841 Merge branch 'master' into ffn_change Eddie-Wang 2024-06-25 13:07:04 +08:00
  • c32bad7e65 clip : suppress unused variable warnings Daniel Bevenius 2024-06-24 14:10:23 +02:00
  • 611238ceee Update control vector help HatsuneMikuUwU33 2024-06-25 07:01:37 +02:00
  • 35d96805f7 squash! llama : return nullptr from llama_grammar_init Daniel Bevenius 2024-06-25 05:35:39 +02:00
  • f8d4fc987e fix conflicts toyer 2024-06-25 03:09:49 +00:00
  • 6189bce423 Merge branch 'master' into grammar-init-return-null Clint Herron 2024-06-24 22:58:38 -04:00
  • 5f8f465d0d fix code style toyer 2024-06-25 02:29:09 +00:00
  • 3b67ff808a fix code style toyer 2024-06-25 02:22:55 +00:00
  • 083bacce14 [SYCL] Re-enabled mul_mat_batched_sycl (#8095) b3219 Meng, Hengyu 2024-06-25 10:19:20 +08:00
  • 95708067c9 fix conflicts toyer 2024-06-25 02:15:34 +00:00
  • c2512ce39a json: update grammars/README ochafik 2024-06-25 03:04:30 +01:00
  • a671d56e22 change llm_build_ffn Eddie-Wang1120 2024-06-25 09:37:25 +08:00
  • 48f417db32 Merge remote-tracking branch 'origin/master' into json-bounds2 ochafik 2024-06-25 02:11:09 +01:00
  • ed762f8163 SimpleChat:ReadMe: Switch to webp screen image to reduce size HanishKVC 2024-06-22 18:21:14 +05:30
  • 5bca19783f SimpleChat: Update image included with readme wrt settings ui HanishKVC 2024-06-16 21:13:27 +05:30
  • bc336248bc SimpleChat: Rename to apiRequestOptions from chatRequestOptions HanishKVC 2024-06-16 20:15:15 +05:30
  • e4aeafc54f SimpleChat: RePosition contents of the Info and Settings UI HanishKVC 2024-06-16 20:10:14 +05:30
  • 030c09d56a SimpleChat:Readme: Add quickstart block, title to image, cleanup HanishKVC 2024-06-16 00:11:48 +05:30
  • daafaefaf1 SimpleChat: Add sample GUI images to readme file HanishKVC 2024-06-15 23:07:46 +05:30
  • e3e786f434 SimpleChat: Allow user to control cache_prompt flag in request HanishKVC 2024-06-15 21:20:59 +05:30
  • abe6c54c4c SimpleChat: Allow for chat req bool options to be user controlled HanishKVC 2024-06-15 20:34:35 +05:30
  • c6df6ceea6 Merge remote-tracking branch 'origin/master' into json-additional ochafik 2024-06-25 01:42:51 +01:00
  • de26543afb Re-enabled mul_mat_batched_sycl Meng, Hengyu 2024-06-24 12:54:35 +00:00
  • 2df373ac40 CUDA: fix matrix multiplication algorithm choice (#8102) b3218 Johannes Gäßler 2024-06-25 01:22:33 +02:00
  • 09a9b7565e nits / cleanups ochafik 2024-06-24 23:44:02 +01:00
  • 36b081ae49 Add OpenMP to CMake pkg Mason M 2024-06-24 17:51:41 -03:00
  • 85f60a0fe9 CUDA: fix matrix multiplication algorithm choice Johannes Gäßler 2024-06-24 22:33:33 +02:00
  • 35bbac1e45 Merge remote-tracking branch 'origin/master' into json-additional ochafik 2024-06-24 21:31:03 +01:00
  • 3a80d1e1b3 reshuffle/merge min/max integ test cases ochafik 2024-06-24 21:28:58 +01:00
  • d7d957dbe9 Merge remote-tracking branch 'origin/master' into json-bounds2 ochafik 2024-06-24 21:21:57 +01:00
  • 72282e7b13 Merge remote-tracking branch 'origin/master' into json-type ochafik 2024-06-24 21:19:49 +01:00
  • 3b099bcd9c CUDA: fix MMQ writeback for int8 tensor cores (#8100) Johannes Gäßler 2024-06-24 22:15:33 +02:00
  • c46a789d51 Add Sycl to CMake pkg Mason M 2024-06-24 17:11:00 -03:00
  • f4c03c0966 llama : add handling of byte tokens in UGM tokenizer (same as in SPM) Stanisław Szymczyk 2024-06-24 17:39:41 +02:00
  • 4a28063b1f Update brute force test: jaime-m-p 2024-06-24 20:56:26 +02:00
  • 0402d4f8a0 CUDA: fix MMQ writeback for int8 tensor cores Johannes Gäßler 2024-06-24 20:50:41 +02:00
  • 95a0df5578 Bugfix: custom regexs splits undefined unicode codepoints jaime-m-p 2024-06-24 20:47:28 +02:00
  • 4eadfb11ee Add Vulkan to CMake pkg Mason M 2024-06-24 15:42:49 -03:00
  • 12e2c317c8 style: remove trailing whitespace jaime-m-p 2024-06-24 20:39:54 +02:00
  • 9eb0fca027 Do not remove space when decoding special tokens jaime-m-p 2024-06-24 20:37:48 +02:00
  • a818f3028d CUDA: use MMQ instead of cuBLAS by default (#8075) b3216 Johannes Gäßler 2024-06-24 17:43:42 +02:00
  • c9e99bd603 split qnn ops into file hongruichen 2024-06-24 22:11:28 +08:00