Commit graph

  • 7589158595 expose omni_context_params struct 李为 2024-11-21 20:44:49 +08:00
  • a734da71ce Apply suggestions from code review Diego Devesa 2024-11-21 13:32:48 +01:00
  • fd2c58286a remove reference interface from extern C in qwen2audio examples 李为 2024-11-21 20:10:27 +08:00
  • 1bb30bf28c llama : handle KV shift for recurrent models (#10402) b4149 Georgi Gerganov 2024-11-21 10:22:47 +02:00
  • 5667b2acc8 Integrating llama.cpp with Microsoft Word GPTLocalhost (Word Add-in) 2024-11-21 15:27:27 +08:00
  • 87a533be57 sync : ggml b4148 Georgi Gerganov 2024-11-21 09:22:11 +02:00
  • 59b9172822 ggml/sched : do not skip views in pre-assignments slaren 2024-11-20 13:25:08 +01:00
  • 02e4eaf22f ggml-opt: fix data corruption (ggml/1022) Johannes Gäßler 2024-11-20 14:56:04 +01:00
  • 1e9447a00b fixup : use full warps slaren 2024-11-21 02:55:22 +01:00
  • 0a737d213c remove unused parameter slaren 2024-11-21 01:59:38 +01:00
  • 35386e8904 cuda : optimize argmax slaren 2024-11-21 01:08:49 +01:00
  • 58cbcd2371 remove feature files Xuan Son Nguyen 2024-11-20 22:18:22 +01:00
  • c432a82295 fix parallel test Xuan Son Nguyen 2024-11-20 21:51:24 +01:00
  • 78e3cb3cf2 add parallel completion test Xuan Son Nguyen 2024-11-20 21:35:31 +01:00
  • 81e0dad03e vulkan: define all quant data structures in types.comp Jeff Bolz 2024-11-20 14:15:57 -06:00
  • 9abe9eeae9 vulkan: predicate max operation in soft_max shaders/soft_max (#10437) Jeff Bolz 2024-11-20 13:47:36 -06:00
  • 1bc896fede server : (proposal) allow user to customize chat template Xuan Son Nguyen 2024-11-20 20:36:58 +01:00
  • de972f45d2 vulkan: predicate max operation in soft_max shaders/soft_max Jeff Bolz 2024-11-20 13:20:37 -06:00
  • 1c2f0f708c fix save slot test Xuan Son Nguyen 2024-11-20 19:24:24 +01:00
  • a0e27c1cd0 Preliminary work for UI and logging MaggotHATE 2024-11-20 22:51:26 +05:00
  • 6af3f95f6f fix coding style Xuan Son Nguyen 2024-11-20 17:58:14 +01:00
  • 472e128c0b added all sequential tests Xuan Son Nguyen 2024-11-20 17:57:20 +01:00
  • f95caa7954 cmake: add link dependencies to cmake find pkg (#10433) bandoti 2024-11-20 12:22:19 -04:00
  • 0e69f27dfb Merge branch 'ggerganov:master' into fix-cmake-pkg-missing-deps bandoti 2024-11-20 11:57:40 -04:00
  • eb02373f76 log less, fix embd test Xuan Son Nguyen 2024-11-20 16:49:35 +01:00
  • 20a701343e Add more link deps. and set GGML_ vars Mason M 2024-11-20 11:44:43 -04:00
  • e34c9d78a4 styling Xuan Son Nguyen 2024-11-20 15:01:09 +01:00
  • f09a9b68e1 more tests Xuan Son Nguyen 2024-11-20 15:00:36 +01:00
  • 8f38ad6f5b refactor issue templates to be component-specific Johannes Gäßler 2024-11-20 13:13:01 +01:00
  • f4c4ce3767 sycl : offload of get_rows set to 0 Alberto Cabrera 2024-11-20 13:31:27 +00:00
  • 17a800be68 metadata: use char* under the hood to avoid conversion round trips Karl-Johan Alm 2024-11-20 22:26:45 +09:00
  • dbc057b8e5 ci: Update oneAPI runtime dll packaging 蕭澧邦 2024-11-20 20:08:31 +08:00
  • ec6212ee64 Reverted to a simple solution withing server only. MaggotHATE 2024-11-20 17:06:56 +05:00
  • fab5d30ff6 llama : add .clang-format file (#10415) b4143 Diego Devesa 2024-11-20 12:57:53 +01:00
  • 3249aabc0b add more tests Xuan Son Nguyen 2024-11-20 12:49:18 +01:00
  • fc050381ef Fixed test-chat-template MaggotHATE 2024-11-20 16:12:25 +05:00
  • 9543d01a8a GitHub: ask for more info in issues [no ci] Johannes Gäßler 2024-11-20 11:43:49 +01:00
  • b3e343eae7 Fix for simple-chat MaggotHATE 2024-11-20 16:04:55 +05:00
  • dbe531ea9f Merge branch 'server-chat-templates-custom' of https://github.com/MaggotHATE/llama.cpp-greedy-rework into server-chat-templates-custom MaggotHATE 2024-11-20 15:41:10 +05:00
  • 9b58edf2d8 Fix trailing whitespace, reverted enable_chat_template in arg MaggotHATE 2024-11-20 15:40:59 +05:00
  • f4bd7cdd2b allocate c strings in metadata functions Karl-Johan Alm 2024-11-20 15:23:59 +09:00
  • d7de41302b misc Xuan Son Nguyen 2024-11-20 11:09:16 +01:00
  • 25835610e3 add comment about extra null-term byte requirement Karl-Johan Alm 2024-11-20 19:03:32 +09:00
  • 84da80c4e0 Merge branch 'ggerganov:master' into server-chat-templates-custom MaggotHATE 2024-11-20 14:42:39 +05:00
  • 53e0215053 Initial "prefix+suffix" chat template MaggotHATE 2024-11-20 14:42:03 +05:00
  • 4201656f87 Remove unused code leo-pony 2024-11-20 15:58:10 +08:00
  • 8fd4b7fa29 vulkan: copy iq4_nl LUT into shared memory (#10409) b4142 Jeff Bolz 2024-11-20 01:40:18 -06:00
  • 1bacb9f625 vulkan: further optimize mul_mat_vec using larger loads (#10387) b4141 Jeff Bolz 2024-11-20 01:11:00 -06:00
  • daab141670 Merge branch 'ggerganov:master' into master haopeng 2024-11-20 15:07:29 +08:00
  • 1ee8d721bd Add compile option soc type macro ASCEND_310P to ggml-cann lib leo-pony 2024-11-20 14:34:09 +08:00
  • 07a64b9fb6 bug-fix: snprintf prints NULL in place of the last character Karl-Johan Alm 2024-11-20 15:23:59 +09:00
  • ad21c9e1f1 update rel to 4040 (#10395) Neo Zhang Jianyu 2024-11-20 13:54:25 +08:00
  • 63273690b5 CANN Support Ascend310P to accelerate F32 and F16 LLM Model leo-pony 2024-11-20 10:45:48 +08:00
  • 3f6406f9a2 Merge branch 'ggerganov:master' into master haopeng 2024-11-20 10:23:37 +08:00
  • 49cdfd3fc2 fix test on windows Xuan Son Nguyen 2024-11-20 00:19:07 +01:00
  • 907b72d7aa Add BLAS link opts Mason M 2024-11-19 18:31:02 -04:00
  • 3acaf58e38 server : replace behave with pytest Xuan Son Nguyen 2024-11-19 23:29:46 +01:00
  • 3952a221af Fix missing file renames in Makefile due to changes in commit ae8de6d50a (#10413) b4139 Anthony Van de Gejuchte 2024-11-19 23:18:17 +01:00
  • 218b24b27e llama : add .clang-format file slaren 2024-11-19 23:14:43 +01:00
  • 19505ff776 try BLAS_LIBRARIES instead Mason M 2024-11-19 18:01:12 -04:00
  • 170786c20a cmake pkg: find BLAS libs Mason M 2024-11-19 17:43:57 -04:00
  • 77aceb885c cmake pkg: find accelerate, openmp, memkind libs Mason M 2024-11-19 17:13:32 -04:00
  • 98129ccde8 Fix missing file renames in Makefile due to changes in commit ae8de6d50a Anthony Van de Gejuchte 2024-11-19 22:09:14 +01:00
  • 42ae10bbcd add cmake rvv support (#10411) b4138 haopeng 2024-11-20 04:10:31 +08:00
  • 9fe0fb0626 sync : ggml b4137 Georgi Gerganov 2024-11-19 19:15:50 +02:00
  • 611fabd792 metal : fox offset integer overflows in im2col (ggml/1015) Plamen Minev 2024-11-18 15:02:27 +02:00
  • 12b0ad953a metal : add GGML_UNARY_OP_ELU kernel (ggml/1018) PAB 2024-11-18 10:02:49 +01:00
  • 342397dc7e cmake: force MSVC compiler charset to utf-8 (#9989) b4134 蕭澧邦 2024-11-20 01:42:00 +08:00
  • f87ba13336 sync : ggml Georgi Gerganov 2024-11-19 19:15:50 +02:00
  • 3bc6ca1a46 metal : fox offset integer overflows in im2col (ggml/1015) Plamen Minev 2024-11-18 15:02:27 +02:00
  • 4ac059838d metal : add GGML_UNARY_OP_ELU kernel (ggml/1018) PAB 2024-11-18 10:02:49 +01:00
  • 2a11b6b094 Add required ggml-base and backend libs to cmake pkg (#10407) b4133 bandoti 2024-11-19 12:10:30 -04:00
  • 4278480a47 add cmake rvv support lhpqaq 2024-11-19 23:52:04 +08:00
  • 005c267481 sycl: Reroute permuted mul_mats through oneMKL Alberto Cabrera 2024-11-19 14:15:41 +00:00
  • 2a2f8e9754 Add required ggml-base and backend libs to cmake pkg Mason M 2024-11-19 10:49:09 -04:00
  • aa7f6a8e52 vulkan: copy iq4_nl LUT into shared memory Jeff Bolz 2024-11-19 08:41:26 -06:00
  • 3ee6382d48 cuda : fix CUDA_FLAGS not being applied (#10403) b4132 Diego Devesa 2024-11-19 14:29:38 +01:00
  • 39dda977e4 cuda : fix CUDA_FLAGS not being applied slaren 2024-11-19 13:25:58 +01:00
  • bbeb6f0f57 llama : handle KV shift for recurrent models Georgi Gerganov 2024-11-19 14:19:19 +02:00
  • 8e752a777b llama : add check for KV cache shifts (#10401) b4131 Georgi Gerganov 2024-11-19 13:29:26 +02:00
  • 029e60932f llama : restore comment [no ci] Georgi Gerganov 2024-11-19 13:29:06 +02:00
  • c0f1bb3942 llama : add check for KV cache shifts Georgi Gerganov 2024-11-19 12:00:27 +02:00
  • a88ad007de llama : add OLMo November 2024 support (#10394) b4130 Shane A 2024-11-19 01:04:08 -08:00
  • 2a1507c162 sycl : Add option to set the SYCL architecture for all targets (#10266) b4129 Romain Biessy 2024-11-19 09:02:23 +01:00
  • b3e585988f vulkan: Optimize soft_max (#10301) b4128 Jeff Bolz 2024-11-19 01:25:17 -06:00
  • e32da6f163 CANN: Add Ascend CANN build ci jiahao su 2024-11-08 17:13:54 +08:00
  • 55be6a772c update rel to 4040 arthw 2024-11-19 08:51:25 +08:00
  • 557924f222 sycl: Revert MUL_MAT_OP support changes (#10385) b4127 Alberto Cabrera Pérez 2024-11-19 00:50:04 +00:00
  • ce2e4ff5c5 Add building of OLMo November 2024 model Shane A 2024-11-18 10:39:41 -08:00
  • 25a44153b8 Add loading of OLMo November 2024 tensors and hyper parameters Shane A 2024-11-18 10:39:16 -08:00
  • dc3ae59335 Add OLMo November 2024 converter Shane A 2024-11-18 10:37:01 -08:00
  • 8c4a9137f3 Add OLMo November 2024 constants Shane A 2024-11-18 10:34:50 -08:00
  • d3481e6316 cuda : only use native when supported by cmake (#10389) b4126 Diego Devesa 2024-11-18 18:43:40 +01:00
  • b4b92ace47 cuda : only use native when supported by cmake slaren 2024-11-18 17:47:44 +01:00
  • 531cb1c233 Skip searching root path for cross-compile builds (#10383) bandoti 2024-11-18 11:23:58 -04:00
  • 57b2bf23cc vulkan: use larger K step per iteration in mul_mat_vec. Jeff Bolz 2024-11-17 23:41:51 -06:00
  • 55f477b114 vulkan: use larger loads in q5_k and q6_k shaders. Jeff Bolz 2024-11-17 23:39:31 -06:00
  • 6c3ad9342d vulkan: Add GLSL structure aliases for quant types to allow larger loads Jeff Bolz 2024-11-17 23:34:45 -06:00
  • 000a03bb5b vulkan: Use pipeline_robustness to disable robustness in mul_mat_vec. Jeff Bolz 2024-11-17 23:11:59 -06:00
  • f139d2ea61 vulkan: remove use of null initializer (#10372) Jeff Bolz 2024-11-18 08:28:42 -06:00