Commit graph

  • 437e05f714
    server : (UI) Support for RTL text as models input or output (#11208) ebraminio 2025-01-13 17:16:39 +03:30
  • ca001f6656
    contrib : add naming guidelines (cont) (#11177) Georgi Gerganov 2025-01-13 15:08:44 +02:00
  • 00b4c3da62
    common : support tag-based --hf-repo like on ollama (#11195) Xuan Son Nguyen 2025-01-13 13:56:23 +01:00
  • 5b1f710865
    add warn on bad template Xuan Son Nguyen 2025-01-13 13:54:53 +01:00
  • 257917d98d
    Merge a97b3621cf into 7426a26b24 Georgi Gerganov 2025-01-13 14:46:39 +02:00
  • 7426a26b24
    contrib : add naming guidelines (#11177) Georgi Gerganov 2025-01-13 14:46:36 +02:00
  • 8f70fc3d1b
    llama : remove 'd' from bad special token log (#11212) b4468 Daniel Bevenius 2025-01-13 13:38:20 +01:00
  • 34223a21bc
    contrib : fix notes [no ci] Georgi Gerganov 2025-01-13 14:32:55 +02:00
  • db3ded2c8a
    gla: Put the barrier inside the main logic loop Akarshan Biswas 2025-01-13 17:44:07 +05:30
  • 8bd5b18ce1
    fix complain with noreturn Xuan Son Nguyen 2025-01-13 12:45:32 +01:00
  • 9d7d5f21f8
    cli : auto activate conversation mode if chat template is detected Xuan Son Nguyen 2025-01-13 12:42:47 +01:00
  • 9f4f6e2ab0
    llama : remove 'd' from bad special token log Daniel Bevenius 2025-01-13 12:30:26 +01:00
  • 1244cdcf14
    ggml : do not define GGML_USE_CUDA when building with GGML_BACKEND_DL (#11211) b4467 Radoslav Gerganov 2025-01-13 13:31:41 +02:00
  • 22927b1c0a
    move common_get_hf_file to common.cpp Xuan Son Nguyen 2025-01-13 12:08:18 +01:00
  • 6ffb590e15
    fix windows build? Xuan Son Nguyen 2025-01-13 12:03:19 +01:00
  • c03d5cc11a
    Merge branch 'master' into xsn/tag_based_hf_repo Xuan Son Nguyen 2025-01-13 11:55:20 +01:00
  • ff484f77e3
    fix style Xuan Son Nguyen 2025-01-13 11:46:42 +01:00
  • d7b5bf8e94
    small fixes Xuan Son Nguyen 2025-01-13 11:44:38 +01:00
  • 62f2f62453
    Refactor: Makes 'cuda_graph_update_required' a local variable Andreas Kieslinger 2025-01-13 09:35:36 +00:00
  • be0e7b68ba
    ggml : do not define GGML_USE_CUDA when building with GGML_BACKEND_DL Radoslav Gerganov 2025-01-13 11:24:08 +02:00
  • fcd62d9de6
    Style: Consolidates several neighboring '#ifdef USE_CUDA_GRAPH' into a single one Andreas Kieslinger 2025-01-13 09:19:07 +00:00
  • 03c843fc8b
    Support for RTL text as models input or output Ebrahim Byagowi 2025-01-13 02:03:41 +03:30
  • 37d0cb6e84
    fix: Remove unnecessary Vulkan library linkage in CMakeLists.txt Junil Kim 2025-01-13 16:09:11 +09:00
  • efe4b14e60
    refactor: Simplify CMake function for detecting host compiler Junil Kim 2025-01-13 13:51:15 +09:00
  • 9201de2b49
    [swift] add module omnivlm (#41) T 2025-01-13 11:39:17 +08:00
  • 21999481ea
    cuda: Use common gfx8 value for GCN4 Jon Haus 2025-01-12 17:42:24 -05:00
  • 4ae3fc0155
    more barriers Eve 2025-01-12 17:21:57 -05:00
  • 61e5d8d560
    vulkan: optimize coopmat2 q4_k/q5_k dequant functions. Jeff Bolz 2025-01-12 10:40:01 -06:00
  • 924518e2e5
    Reset color before we exit (#11205) b4466 Eric Curtin 2025-01-12 18:23:10 +00:00
  • df65154415
    contrib : filename guidelines [no ci] Georgi Gerganov 2025-01-12 19:12:18 +02:00
  • f9dae61ace
    Reset color before we exit Eric Curtin 2025-01-12 16:55:49 +00:00
  • a97b3621cf
    ggml : ggml_backend_graph_copy -> ggml_backend_graph_copy_state gg/llama-shadow-on Georgi Gerganov 2025-01-12 17:57:51 +02:00
  • d974cae286
    contrib : clarify _context suffix usage [no ci] Georgi Gerganov 2025-01-12 17:37:36 +02:00
  • afd40ea206
    minor : better names Georgi Gerganov 2025-01-12 17:22:16 +02:00
  • 36803b1902
    common : cont Georgi Gerganov 2025-01-12 16:53:44 +02:00
  • a59ee7c4eb
    common : cont Georgi Gerganov 2025-01-12 16:19:18 +02:00
  • 10eb87409e
    shadow : cont gcc Georgi Gerganov 2025-01-12 16:09:49 +02:00
  • 9af90481d0
    Vulkan: Add renderdoc tracing support 0cc4m/vulkan-renderdoc 0cc4m 2025-01-12 13:47:36 +00:00
  • f65e3d324d
    ggml : ggml_backend_graph_copy -> ggml_backend_graph_copy_init Georgi Gerganov 2025-01-12 15:34:48 +02:00
  • 439e68c1e5
    cmake : re-enable GCC -Wshadow Georgi Gerganov 2025-01-12 15:29:33 +02:00
  • 34889bf810
    cmake : cont Georgi Gerganov 2025-01-12 15:11:52 +02:00
  • c2b26000c3
    remove reference to params.conversation in main Xuan Son Nguyen 2025-01-12 13:54:25 +01:00
  • 43bc40ca71
    Merge branch 'master' into xsn/chat_cli Xuan Son Nguyen 2025-01-12 13:47:10 +01:00
  • 9a483999a6
    llama : fix chat template gguf key (#11201) b4465 Xuan Son Nguyen 2025-01-12 13:45:14 +01:00
  • e159e7751c
    cmake : disable -Wshadow for GCC Georgi Gerganov 2025-01-12 14:35:29 +02:00
  • 492daa0d8e
    llama : fix chat template gguf key Xuan Son Nguyen 2025-01-12 13:30:14 +01:00
  • d07c9f6a7a
    adapt Xuan Son Nguyen 2025-01-12 13:27:27 +01:00
  • 9a735ae6d8
    examples : de-shadow Georgi Gerganov 2025-01-12 14:25:32 +02:00
  • 6f56b57f0a
    Merge branch 'master' into xsn/chat_cli Xuan Son Nguyen 2025-01-12 13:16:47 +01:00
  • 95e0afb977
    wip: chat cli Xuan Son Nguyen 2025-01-12 13:16:37 +01:00
  • 82caffa74e
    llama : de-shadow libllama [no ci] Georgi Gerganov 2025-01-12 13:22:16 +02:00
  • 32e7b9dc99
    llama : de-shadow (cont) [no ci] Georgi Gerganov 2025-01-12 12:30:54 +02:00
  • 0127774ae4
    llama : remove unused mutable n_tokens [no ci] Georgi Gerganov 2025-01-12 12:17:24 +02:00
  • 0bebe45a25
    llama : de-shadow (wip) [no ci] Georgi Gerganov 2025-01-12 12:15:19 +02:00
  • 168324a388
    cmake : enable -Wshadow for C++ code [no ci] Georgi Gerganov 2025-01-11 17:52:45 +02:00
  • 08f10f69c3
    llama : remove notion of CLS token (#11064) b4464 Georgi Gerganov 2025-01-12 12:15:53 +02:00
  • 00f2b4c5b2
    llama : remove notion of CLS token Georgi Gerganov 2025-01-03 14:50:49 +02:00
  • afa8a9ec9b
    llama : add llama_vocab, functions -> methods, naming (#11110) Georgi Gerganov 2025-01-12 11:32:42 +02:00
  • cbea4ba102
    vocab : llama_vocab_n_vocab -> llama_vocab_n_tokens (#11174) Georgi Gerganov 2025-01-12 10:52:17 +02:00
  • 7e1950d0bc
    minor [no ci] Georgi Gerganov 2025-01-12 10:08:46 +02:00
  • 95d87cbf65
    contrib : expand [no ci] Georgi Gerganov 2025-01-12 10:05:22 +02:00
  • 71de9f4046
    Merge 7006dd784c into c05e8c9934 Sumandora 2025-01-12 00:00:53 -08:00
  • ed1ad94c84
    q2_k optimize scale calculation Eve 2025-01-11 21:46:53 -05:00
  • 46b4c8da44
    fix: improve host compiler detection in vulkan shader build Junil Kim 2025-01-12 11:22:58 +09:00
  • fbddb26250
    ggml-cuda : use i and j instead of i0 and i in vec_dot_tq2_0_q8_1 compilade/cuda-tq2_0 Francis Couture-Harpin 2025-01-11 20:02:08 -05:00
  • b6fc9f03ab
    ggml-metal : supports_op returns false for ternary types Francis Couture-Harpin 2025-01-11 19:50:08 -05:00
  • 946796fcec
    ggml-cuda : slight optimizations for TQ2_0 Francis Couture-Harpin 2025-01-11 19:48:08 -05:00
  • 30eacad290
    use calc_superblock everywhere Eve 2025-01-11 17:14:22 -05:00
  • d63497b3a3
    merge master Eve 2025-01-11 16:17:24 -05:00
  • 481d57f7c7
    Merge pull request #3 from sparkleholic/master_fix2 Junil Kim 2025-01-12 05:43:48 +09:00
  • ce14d9b7cb
    fix: vulkan-shaders-gen build and path handling Junil Kim 2025-01-12 03:42:44 +09:00
  • 242135eca4
    various fixes Xuan Son Nguyen 2025-01-11 21:35:10 +01:00
  • ef089ca105
    fix build Xuan Son Nguyen 2025-01-11 20:35:10 +01:00
  • 803031665a
    common : support tag-based hf_repo like on ollama Xuan Son Nguyen 2025-01-11 19:44:12 +01:00
  • 6540935bca
    vocab : llama_vocab_add_[be]os -> llama_vocab_get_add_[be]os (#11174) Georgi Gerganov 2025-01-11 17:43:46 +02:00
  • b6f9640157
    contrib : add TODO for preprocessor directives [no ci] Georgi Gerganov 2025-01-11 17:00:01 +02:00
  • 0a982a414e
    ggml: copy q->f32 assumes some contiguity in the destination Jeff Bolz 2025-01-11 08:57:32 -06:00
  • 6df37bc28b
    llama : update API names to use correct prefix (#11174) Georgi Gerganov 2025-01-11 16:41:56 +02:00
  • 6efee8cb88
    lora : update API names (#11167) Georgi Gerganov 2025-01-09 22:23:27 +02:00
  • 31a44094ad
    contrib : minor reword coding guidelines [no ci] Georgi Gerganov 2025-01-11 16:32:53 +02:00
  • 10ef6c1853
    contrib : move coding guidelines to correct section [no ci] Georgi Gerganov 2025-01-11 16:28:27 +02:00
  • 7637216d3f
    minor [no ci] Georgi Gerganov 2025-01-11 16:23:58 +02:00
  • f44939a6eb
    contrib : cont [no ci] Georgi Gerganov 2025-01-11 16:22:58 +02:00
  • da47eb0650
    contrib : add _t suffix guideline [no ci] Georgi Gerganov 2025-01-11 16:17:18 +02:00
  • 7fd17ba7cc
    contrib : cont [no ci] Georgi Gerganov 2025-01-11 16:01:51 +02:00
  • e7bc61bc53
    contrib : expand naming guidelines [no ci] Georgi Gerganov 2025-01-11 15:50:59 +02:00
  • ea2a022c42
    Merge f75349c27d into c05e8c9934 Xuan Son Nguyen 2025-01-11 12:49:10 +02:00
  • c05e8c9934
    gguf-py: fixed local detection of gguf package (#11180) Vinesh Janarthanan 2025-01-11 03:42:31 -06:00
  • 8dfe3d8e97
    Allow compiling ggml-cuda without mmq or flash attention Milot Mirdita 2025-01-11 15:28:57 +09:00
  • 6acdb265fc
    CUDA op getrows fails for long sequences Milot Mirdita 2025-01-11 15:26:23 +09:00
  • 894f260d30
    Fix ggml-cuda using a driver symbol in NO_VMM mode Milot Mirdita 2025-01-11 15:25:03 +09:00
  • bd9c319515
    ggml doesn't use sse42, specify only up to sse4.1 Milot Mirdita 2025-01-11 15:24:17 +09:00
  • c9d1eb3a06
    Added the ability to use guide tokens for OuteTTS, greatly improving TTS recitation accuracy over long input sequences. Concedo 2025-01-11 16:53:20 +08:00
  • 3453401cfe
    Fix GGML not compiling on macOS with GCC Milot Mirdita 2025-01-11 15:22:44 +09:00
  • 2739a71e4b
    convert : sort print supported models [no ci] (#11179) Daniel Bevenius 2025-01-11 05:50:33 +01:00
  • f5fddb6d24
    ggml-cuda : remove some superfluous comments for TQ2_0 tile loading Francis Couture-Harpin 2025-01-10 14:52:49 -05:00
  • 893cf69783
    Bumped gguf version to 0.15.0 VJHack 2025-01-10 12:26:21 -06:00
  • 70031b1cbf
    added reader.py to readme VJHack 2025-01-10 12:25:02 -06:00
  • 6ddf49f5d1
    updated path to gguf package for non-installed setups VJHack 2025-01-10 11:55:12 -06:00
  • e7ef3941fe
    convert : sort print supported models [no ci] Daniel Bevenius 2025-01-10 18:07:33 +01:00