Commit graph

  • 580111d42b llama : add gemma model (#5631) b2223 postmasters 2024-02-21 05:08:22 -08:00
  • 87ceb1a42b server: health: fix race condition on slots data using tasks queue Pierrick HYMBERT 2024-02-21 13:38:18 +01:00
  • bcd9530960 ggml : fix conv_2d batch mode (ggml/737) bssrdf 2024-02-20 14:17:09 -05:00
  • 7dcb75d91e llava: add --skip-unknown to 1.6 convert.py Daniel Bevenius 2024-02-21 13:42:11 +01:00
  • c82f66acca Add gemma model Nam Nguyen 2024-02-21 04:17:47 -08:00
  • 21f8046e15 readme: add LocalAI to the available UI Ettore Di Giacinto 2024-02-21 12:20:32 +01:00
  • f7e29e5248 More requests and threads pudepiedj 2024-02-21 11:39:48 +00:00
  • f1d4138c13 server : fix initialization thread issues Georgi Gerganov 2024-02-21 13:08:57 +02:00
  • b599545e67 add new chat template ngxson 2024-02-21 11:12:58 +01:00
  • 9576df906a server: fallback to chatml ngxson 2024-02-21 11:11:29 +01:00
  • 88c46cbdac [SYCL] conext add name (#5624) b2222 Meng, Hengyu 2024-02-21 17:52:06 +08:00
  • a14679cc30 IQ4_NL: 4-bit non-linear quants with blocks of 32 (#5590) b2221 Kawrakow 2024-02-21 11:39:52 +02:00
  • 760b6d639b minor change pudepiedj 2024-02-21 09:32:29 +00:00
  • c1bad4a549 (WIP) Implement stochastic speculative decoding Minsoo Cheong 2024-02-21 16:49:27 +09:00
  • d464e4bfbe fix indent Meng, Hengyu 2024-02-21 06:44:30 +00:00
  • b565737660 name should start with SYCL* Meng, Hengyu 2024-02-20 22:40:16 -08:00
  • df43b27582 [SYCL] conext add name Meng, Hengyu 2024-02-20 22:22:23 -08:00
  • f921fc3ecd examples : do not assume BOS when shifting context Jared Van Bortel 2024-02-20 22:48:15 -05:00
  • 2a37bd6b86 server: tests: fix the multi users infinite loop test Pierrick HYMBERT 2024-02-21 02:29:50 +01:00
  • 469af4b4ec server: tests: change CI workflow trigger Pierrick HYMBERT 2024-02-21 02:20:44 +01:00
  • 3322bfa980 server: tests: add a small check to be sure all started threads have generated response Pierrick HYMBERT 2024-02-21 02:04:59 +01:00
  • 672d98f6f0 server: tests: CORS and api key checks scenario Pierrick HYMBERT 2024-02-21 01:49:39 +01:00
  • 6dcbcfe6ba server: tests: simplify completion scenario Pierrick HYMBERT 2024-02-21 00:43:50 +01:00
  • 19664b9f01 server: tests: detokenize endpoint issue reference added Pierrick HYMBERT 2024-02-21 00:17:38 +01:00
  • 1065f6d41b server: tests: add tokenize/detokenize scenario Pierrick HYMBERT 2024-02-21 00:13:53 +01:00
  • e6d482088d server: tests: add embeddings scenario Pierrick HYMBERT 2024-02-21 00:02:30 +01:00
  • 1ecda0d13e server: tests: disable issue 3969 scenario Pierrick HYMBERT 2024-02-20 23:35:44 +01:00
  • b0b6d83c76 server: tests: add infinite loop scenario Pierrick HYMBERT 2024-02-20 23:17:00 +01:00
  • 68574c6f98 server: tests: add infinite loop scenario Pierrick HYMBERT 2024-02-20 23:11:59 +01:00
  • 6b9dc4f291 server: tests: add infinite loop Pierrick HYMBERT 2024-02-20 23:05:27 +01:00
  • 0772884b06 server: tests: add a constant seed in completion request Pierrick HYMBERT 2024-02-20 22:55:29 +01:00
  • b9f8390d28 server: tests: check for infinite loops Pierrick HYMBERT 2024-02-20 22:49:36 +01:00
  • 367b59a15c server: tests: check for infinite loops Pierrick HYMBERT 2024-02-20 22:45:30 +01:00
  • c355f76427 server: tests: slots endpoint checks Pierrick HYMBERT 2024-02-20 22:32:11 +01:00
  • 230fbbe9ed Merge pull request #1 from ashokgelal/ashokgelal-msty Ashok Gelal 2024-02-20 16:26:40 -05:00
  • 5a8d15bb8d readme: add Msty to UI list Ashok Gelal 2024-02-20 16:07:08 -05:00
  • 11adf1d864 server: tests: add OAI multi user scenario Pierrick HYMBERT 2024-02-20 22:00:09 +01:00
  • 9b7ea97979 server: tests: add OAI stream test, fix file end of line, fast fail behave Pierrick HYMBERT 2024-02-20 21:34:35 +01:00
  • 56583bee41 server: tests: refactor steps and vocabulary Pierrick HYMBERT 2024-02-20 20:52:24 +01:00
  • 6c95ec6587 server: tests: change model to: @karpathy's tinyllamas Pierrick HYMBERT 2024-02-20 20:50:14 +01:00
  • e500a14ab0 Merge branch 'server_branch' of https://github.com/pudepiedj/llama.cpp into server_branch pudepiedj 2024-02-20 19:31:33 +00:00
  • 4904b0a06e kvgraphics with interactio pudepiedj 2024-02-20 19:31:29 +00:00
  • ecbb531b1f Merge branch 'ggerganov:master' into server_branch pudepiedj 2024-02-20 19:23:44 +00:00
  • 6560bed3f0 server : support llava 1.6 (#5553) b2220 CJ Pais 2024-02-20 11:07:22 -08:00
  • 06bf2cf8c4 make : fix debug build with CUDA (#5616) b2219 slaren 2024-02-20 20:06:17 +01:00
  • 1b8da8e0a6 Fix punctuation split bobqianic 2024-02-20 19:03:51 +00:00
  • d21d11a8c9 make : fix debug build with CUDA slaren 2024-02-20 19:54:54 +01:00
  • 8bb586bf06 server: tests: add health check and concurrent request example Pierrick HYMBERT 2024-02-20 01:15:31 +01:00
  • 1680599b01 server: tests: build only the server Pierrick HYMBERT 2024-02-19 23:10:39 +01:00
  • fe9866a52d server: tests: use ngxson llama_xs_q4.bin Pierrick HYMBERT 2024-02-19 23:05:06 +01:00
  • 30aa323fb9 server: tests: fix ci workflow Pierrick HYMBERT 2024-02-19 23:01:13 +01:00
  • 4e5245e6b8 server: tests: fix ci workflow Pierrick HYMBERT 2024-02-19 22:52:56 +01:00
  • 6497755de5 server: tests: fix ci workflow Pierrick HYMBERT 2024-02-19 22:46:36 +01:00
  • 9b63d7057a server: tests: reduce number of files, all in one tests shell script Pierrick HYMBERT 2024-02-19 21:50:56 +01:00
  • 157bcf2286 server: init functional test Pierrick HYMBERT 2024-02-18 17:13:04 +01:00
  • 4ed8e4fbef llava : add explicit instructions for llava-1.6 (#5611) Daniel Bevenius 2024-02-20 18:30:27 +01:00
  • 93e2c73dba Fix parentheses error bobqianic 2024-02-20 17:08:19 +00:00
  • 5eebbf030e Add files via upload bobqianic 2024-02-20 16:47:17 +00:00
  • 54930024ce cabelo@opensuse.org - Build in openSUSE:compatible with gcc7 Alessandro de Oliveira Faria (A.K.A. CABELO) 2024-02-20 13:47:11 -03:00
  • 941de11759 convert : get general.name from model dir, not its parent Jared Van Bortel 2024-02-20 11:16:54 -05:00
  • ad60bece9c fix zig build CJ Pais 2024-02-20 07:50:43 -08:00
  • 9c405c9f9a Server: use llama_chat_apply_template (#5593) b2217 Xuan Son Nguyen 2024-02-20 15:58:27 +01:00
  • d990f6d785 llava: add explicit instructions for llava-1.6 Daniel Bevenius 2024-02-20 15:43:39 +01:00
  • 235736b176 server: fix formatted_chat ngxson 2024-02-20 15:20:46 +01:00
  • 5912bb50bb server: fix help message Xuan Son Nguyen 2024-02-20 12:13:51 +01:00
  • c53b34d457 server: fix format_chat ngxson 2024-02-20 11:07:38 +01:00
  • 5207b3fbc5 readme : update UI list (#5605) Dane Madsen 2024-02-20 21:00:23 +11:00
  • 8dbbd75754 metal : add build system support for embedded metal library (#5604) b2215 Haoxiang Fei 2024-02-19 22:58:36 -11:00
  • 33d2df5aca Update Makefile Georgi Gerganov 2024-02-20 11:58:29 +02:00
  • 7e637552dd Merge pull request #2 from ReinForce-II/f/sakurallm/ppl sorasoras 2024-02-20 17:38:39 +08:00
  • b647530ad3 cabelo@opensuse.org - Build in openSUSE:compatible with gcc7 Alessandro de Oliveira Faria (A.K.A. CABELO) 2024-02-20 05:38:25 -03:00
  • acdec2fe25 cabelo@opensuse.org - Build in openSUSE:compatible with gcc7 Alessandro de Oliveira Faria (A.K.A. CABELO) 2024-02-20 05:22:38 -03:00
  • c0a8c6db37 server : health endpoint configurable failure on no slot (#5594) b2214 Pierrick Hymbert 2024-02-20 08:48:19 +01:00
  • daacf6ca15 It was the ggml_vdotq thing missed inside the brackets Iwan Kawrakow 2024-02-20 09:37:42 +02:00
  • b376bbb21d Fix typo that makes several tests fail Iwan Kawrakow 2024-02-20 09:21:11 +02:00
  • b9111bd209 Update ggml_sycl_op_mul_mat_vec_q (#5502) b2213 AidanBeltonS 2024-02-20 07:01:25 +00:00
  • b10a1191a3 Specify licence Dane Madsen 2024-02-20 16:13:35 +10:00
  • 56dade0bb4 Add maid to ui list Dane Madsen 2024-02-20 16:09:15 +10:00
  • d455029d3b add build support for embedded metal library Haoxiang Fei 2024-02-20 13:34:02 +08:00
  • 6f139980d4 fix format Abhilash Majumder 2024-02-20 10:56:21 +05:30
  • b19f46a27e server: remove trailing space ngxson 2024-02-20 00:15:50 +01:00
  • 633782b8d9 nix: now that we can do so, allow MacOS to build Vulkan binaries b2212 Mathijs de Bruin 2024-02-13 20:28:02 +00:00
  • 22f83f0c38 Enable Vulkan MacOS CI 0cc4m 2024-02-10 22:18:33 +01:00
  • bb9dcd560a Refactor validation and enumeration platform checks into functions to clean up ggml_vk_instance_init() 0cc4m 2024-02-14 20:57:17 +01:00
  • f50db6ae0b Add check for VK_KHR_portability_enumeration for MoltenVK support 0cc4m 2024-02-10 22:14:52 +01:00
  • d8c054517d Add preprocessor checks for Apple devices. Mathijs de Bruin 2024-02-06 14:39:22 +00:00
  • 42f664a382 Resolve ErrorIncompatibleDriver with Vulkan on MacOS. Mathijs de Bruin 2024-02-03 18:00:11 +00:00
  • 5dde540897 Allow for Vulkan build with Accelerate. Mathijs de Bruin 2024-02-03 17:56:46 +00:00
  • 40c3a6c1e1 cuda : ignore peer access already enabled errors (#5597) b2205 slaren 2024-02-19 23:40:26 +01:00
  • d261e7f8f8 Merge branch 'ggerganov:master' into server_branch pudepiedj 2024-02-19 22:14:25 +00:00
  • b7b44e024f nix: now that we can do so, allow MacOS to build Vulkan binaries Mathijs de Bruin 2024-02-13 20:28:02 +00:00
  • 8c42c319ef Enable Vulkan MacOS CI 0cc4m 2024-02-10 22:18:33 +01:00
  • 790b700c45 Refactor validation and enumeration platform checks into functions to clean up ggml_vk_instance_init() 0cc4m 2024-02-14 20:57:17 +01:00
  • 9c18f8c2f3 Add check for VK_KHR_portability_enumeration for MoltenVK support 0cc4m 2024-02-10 22:14:52 +01:00
  • bdad1f2ec0 Add preprocessor checks for Apple devices. Mathijs de Bruin 2024-02-06 14:39:22 +00:00
  • eecc149411 Resolve ErrorIncompatibleDriver with Vulkan on MacOS. Mathijs de Bruin 2024-02-03 18:00:11 +00:00
  • f3db6c8bb6 Allow for Vulkan build with Accelerate. Mathijs de Bruin 2024-02-03 17:56:46 +00:00
  • f24ed14ee0 make : pass CPPFLAGS directly to nvcc, not via -Xcompiler (#5598) b2204 Jared Van Bortel 2024-02-19 15:54:12 -05:00
  • d8c5d619de make : pass CPPFLAGS directly to nvcc, not via -Xcompiler Jared Van Bortel 2024-02-19 15:45:51 -05:00
  • 62d3263fa4 fix hip slaren 2024-02-19 21:40:56 +01:00