Commit graph

  • 16c22471e8 remove redundant omni-vlm-v2/ folder, all omni-vlm examples will be added to omni-vlm/ folder. 李为 2024-11-08 20:59:23 +08:00
  • 841f27abdb metal : optimize FA kernels (#10171) Georgi Gerganov 2024-11-08 13:47:22 +02:00
  • a2385da59c make : clean-up [no ci] gg/metal-fa-f16 Georgi Gerganov 2024-11-08 13:46:20 +02:00
  • d05b3127bd swift : exclude ggml-metal-embed.metal (#10211) b4050 Jhen-Jie Hong 2024-11-08 17:34:06 +08:00
  • b89e71b195 metal : fix BF16 requirement for FA kernels Georgi Gerganov 2024-11-08 11:28:04 +02:00
  • 03914340cf swift : exclude build/ Jhen-Jie Hong 2024-11-08 16:53:51 +08:00
  • bc143ecf81 cuda : disable BF16 FA Georgi Gerganov 2024-11-08 10:27:43 +02:00
  • 54ab7282e5 ggml: fix zero division in ‘dne’ calculation in CUDA COUNT_EQUAL operator when ‘ne’ is small SXX 2024-11-08 16:09:07 +08:00
  • 5d1a10d275 metal : prevent int overflows [no ci] Georgi Gerganov 2024-11-07 22:11:24 +02:00
  • 486a5eb8c1 build : remove obsolete compile flag [no ci] Georgi Gerganov 2024-11-07 21:51:28 +02:00
  • 120d51285c metal : compile-guard bf16 FA kernels Georgi Gerganov 2024-11-07 21:38:37 +02:00
  • 2fccc8ac2d metal : minor clean-up Georgi Gerganov 2024-11-07 21:29:22 +02:00
  • 7facc29d69 metal : use F16 precision in FA kernels Georgi Gerganov 2024-11-06 15:33:30 +02:00
  • 25e877309a ggml : add ggml_flash_attn_ext_get_prec Georgi Gerganov 2024-11-06 15:09:47 +02:00
  • b17684efb3 add include llava.h liute110 2024-11-08 16:07:50 +08:00
  • 400fc2a4b0 add one more model liute110 2024-11-08 16:06:37 +08:00
  • 86c2233a38 add submodule llava for android liute110 2024-11-08 16:02:45 +08:00
  • 877a495245 Fix to guarantee K-Shift on the first step only MaggotHATE 2024-11-08 11:55:10 +05:00
  • def47780cc fix: add a new line to the end of the file Junil Kim 2024-11-08 15:41:26 +09:00
  • a249dc0fbb Merge branch 'master' of https://github.com/piDack/llama.cpp into support_glm_edge_model liyuhang 2024-11-08 03:53:05 +00:00
  • 677058f470 add glm edge chat model liyuhang 2024-11-08 03:33:43 +00:00
  • 2516e8c4a7 llama.swift : exclude ggml-metal-embed.metal Jhen 2024-11-08 09:57:46 +08:00
  • 13dfe631cb Merge https://github.com/ggerganov/llama.cpp into avx_opt Eve 2024-11-07 20:37:46 -05:00
  • 54e6c887ac Merge branch 'avx_opt' of https://github.com/netrunnereve/llama.cpp into avx_opt Eve 2024-11-07 20:37:28 -05:00
  • 05365761b9 docs: add doxygen documentation Junil Kim 2024-11-08 09:09:20 +09:00
  • 76c6e7f105 server : minor UI fix (#10207) Xuan Son Nguyen 2024-11-07 18:44:38 -04:00
  • c7a37e93c9 server : minor UI fix Xuan Son Nguyen 2024-11-07 17:36:11 -04:00
  • a71d81cf8c server : revamp chat UI with vuejs and daisyui (#10175) b4048 Xuan Son Nguyen 2024-11-07 17:31:10 -04:00
  • eec4d71737 scripts : add amx to sync-ggml.sh [no ci] Georgi Gerganov 2024-11-07 23:11:36 +02:00
  • 3b08828674 sync : ggml Georgi Gerganov 2024-11-07 23:08:24 +02:00
  • a2c6fd747c scripts : sync update Georgi Gerganov 2024-11-07 23:07:55 +02:00
  • 94accca4c2 vec move mask to shmem gg/metal-fa-f16-save Georgi Gerganov 2024-11-07 20:58:10 +02:00
  • 3b9625032c f16 vec Georgi Gerganov 2024-11-07 20:34:16 +02:00
  • 09b40828ea Merge 840a2b1c61 into 97404c4a03 MaggotHATE 2024-11-07 18:32:42 +00:00
  • 840a2b1c61 Merge branch 'ggerganov:master' into k-shift2 MaggotHATE 2024-11-07 23:32:39 +05:00
  • 64ac28984d lighter bubble color (less distract when reading) Xuan Son Nguyen 2024-11-07 14:20:22 -04:00
  • 8f0ef15265 clean-up Georgi Gerganov 2024-11-07 20:02:31 +02:00
  • 022e5e90e9 remove compile flag Georgi Gerganov 2024-11-07 19:18:31 +02:00
  • 97404c4a03 ggml : add ggml-cpu.h to the public headers (#10204) b4044 Diego Devesa 2024-11-07 18:16:08 +01:00
  • 0deabb0d72 ggml : add ggml-cpu.h to the public headers slaren 2024-11-07 18:00:32 +01:00
  • 60e17ce23c Remove identical wte/etw logic for jais (#10203) Faisal Zaghloul 2024-11-07 11:46:12 -05:00
  • 86ff9e4303 Remove identical wte/etw logic for jais fmz 2024-11-07 08:30:25 -08:00
  • a6c8dbfa5d wip Georgi Gerganov 2024-11-07 18:20:25 +02:00
  • 5107e8cea3 DRY: Fixes clone functionality (#10192) b4042 wwoodsTM 2024-11-07 08:20:25 -07:00
  • 4abeb60a1a int64 dst Georgi Gerganov 2024-11-07 17:17:29 +02:00
  • 3ab47eb746 float -> half regs Georgi Gerganov 2024-11-07 17:06:34 +02:00
  • e121d82f6a 64-bit -> 32-bit Georgi Gerganov 2024-11-07 17:00:06 +02:00
  • a75cdcca60 remove inner if mask Georgi Gerganov 2024-11-07 16:40:29 +02:00
  • 09086b60fb Merge branch 'ggerganov:master' into master momonga 2024-11-07 22:02:02 +09:00
  • 4a0d28efb7 fix style for <pre> element Xuan Son Nguyen 2024-11-07 07:36:40 -04:00
  • 61d05b57d9 remove ms array Georgi Gerganov 2024-11-07 13:35:33 +02:00
  • 85987af7e8 remove console.log Xuan Son Nguyen 2024-11-07 07:22:24 -04:00
  • f51c78bbfa small fix Xuan Son Nguyen 2024-11-07 07:22:09 -04:00
  • f2268fad81 fix closeAndSaveConfigDialog Xuan Son Nguyen 2024-11-07 07:15:19 -04:00
  • 984928109c move mask to shared mem Georgi Gerganov 2024-11-07 13:12:10 +02:00
  • 2aedbb354e wip 5 Georgi Gerganov 2024-11-07 12:32:59 +02:00
  • df5841b6b8 Merge pull request #13 from NexaAI/weili/master-release Zack Li 2024-11-07 00:48:21 -08:00
  • dc2a27f2a2 wip 4 Georgi Gerganov 2024-11-07 09:26:10 +02:00
  • 3dfac7817f add returned string type (const char*) for nexa-omni-audio 李为 2024-11-07 11:19:50 +08:00
  • 2319126a70 fix q4_0_8_8 format for corrupted tokens issue (#10198) b4041 snadampal 2024-11-07 02:02:08 -06:00
  • 3bcd40b3c5 Optimize RWKV6 Operator Naming and Implement Multi-core CPU/ SYCL Acceleration (#10133) b4040 Zhiyuan Li 2024-11-07 18:19:10 +11:00
  • f7d3fe151c Merge branch 'ggerganov:master' into k-shift2 MaggotHATE 2024-11-07 12:06:45 +05:00
  • 26f3174c8f fix q4_0_8_8 format for corrupted tokens issue EC2 Default User 2024-11-07 05:57:52 +00:00
  • 20b9f02cee Merge pull request #12 from NexaAI/weili/master-release Zack Li 2024-11-06 19:28:46 -08:00
  • 5edadffd88 add returned string type (const char*) for nexa-omni-audio 李为 2024-11-07 11:19:50 +08:00
  • dc1b0775fa use collapse-arrow Xuan Son Nguyen 2024-11-06 22:03:57 -04:00
  • 9bd5ae09ae wip 3 Georgi Gerganov 2024-11-06 22:52:33 +02:00
  • 2335086fd3 wip2 Georgi Gerganov 2024-11-06 22:04:07 +02:00
  • 01c7f11224 wip Georgi Gerganov 2024-11-06 21:06:56 +02:00
  • 0f7e8f389d metal : add GGML_METAL_FORCE_FATTN_PREC_F16 Georgi Gerganov 2024-11-06 16:21:37 +02:00
  • eefc132bb7 metal : use F16 precision in FA kernel Georgi Gerganov 2024-11-06 15:33:30 +02:00
  • 22a9311a1a ggml : add ggml_flash_attn_ext_get_prec Georgi Gerganov 2024-11-06 15:09:47 +02:00
  • 5c333e0140 metal : add BF16 support (#8439) Georgi Gerganov 2024-11-06 19:53:51 +02:00
  • 670b8dbe7c metal : this should correctly check bfloat support Georgi Gerganov 2024-11-06 19:31:45 +02:00
  • 69698299ee metal : try to fix BF16 support check Georgi Gerganov 2024-11-06 19:19:06 +02:00
  • ad1226982f metal : do not build bfloat kernels when not supported Georgi Gerganov 2024-11-06 19:02:28 +02:00
  • a408f51906 metal : better var names [no ci] Georgi Gerganov 2024-11-06 18:53:22 +02:00
  • 3ee077a7c8 metal : check for bfloat support on the Metal device Georgi Gerganov 2024-11-06 18:48:34 +02:00
  • c915d0add5 metal : add mul_mat_id BF16 support Georgi Gerganov 2024-11-06 18:29:24 +02:00
  • b97259e69b small fixes Xuan Son Nguyen 2024-11-06 16:24:54 +01:00
  • 6a21c4d020 better auto scroll Xuan Son Nguyen 2024-11-06 15:14:58 +01:00
  • e25be408a6 clean up a bit Xuan Son Nguyen 2024-11-06 14:58:08 +01:00
  • 6109cf151e ggml : add initial BF16 support Georgi Gerganov 2024-11-06 14:14:47 +02:00
  • b11f9ba9b8 server : remove hack for extra parallel slot (#10187) b4038 Georgi Gerganov 2024-11-06 13:29:01 +02:00
  • 94d8cb8be1 metal : fix from ptr buffer name (#10189) b4037 Diego Devesa 2024-11-06 12:10:07 +01:00
  • c3beb9b9dc server : remove hack for extra parallel slot Georgi Gerganov 2024-11-05 22:31:11 +02:00
  • 5ed18f9e78 Merge branch 'ggerganov:master' into k-shift2 MaggotHATE 2024-11-06 14:53:39 +05:00
  • 34c01b9c51 fix tests Xuan Son Nguyen 2024-11-06 10:34:47 +01:00
  • 1dc04b2dee ggml : adjust is_first_call init value (#10193) b4036 Georgi Gerganov 2024-11-06 11:20:10 +02:00
  • 076c793e82 Merge 3f7d85da1e into a1eaf6a960 Qingtao Li 2024-11-06 16:26:18 +08:00
  • a1eaf6a960 metal : add quantized FA support (#10149) Georgi Gerganov 2024-11-06 10:24:23 +02:00
  • 2edbdc8ad4 ggml : adjust is_first_call init Georgi Gerganov 2024-11-06 10:12:09 +02:00
  • 0483b6daa4 DRY: Fixes clone functionality wwoodsTM 2024-11-06 01:06:54 -07:00
  • c5d8bb5a81 leave only basic functions for SYCL CI fix_sycl_ci Meng, Hengyu 2024-11-06 07:47:50 +00:00
  • 6a4cf0b983 Merge pull request #11 from NexaAI/weili/master-release Zack Li 2024-11-05 23:27:47 -08:00
  • b24a409e22 add returned string (const char*) for qwen2 audio 李为 2024-11-06 15:23:59 +08:00
  • b99e7f977f Clean up and add more correctness Jason Flax 2024-11-06 01:04:09 -05:00
  • 3f7d85da1e [fix] Put ggml_tmac_init at correct place. Qingtao Li 2024-11-06 13:39:12 +08:00
  • 5574bda471 Merge pull request #10 from NexaAI/weili/master-release Zack Li 2024-11-05 19:41:03 -08:00
  • 22da7bc379 add returned string (pure c const char* type) for omni-vlm inference api 李为 2024-11-06 11:20:36 +08:00