Commit graph

  • 8007d0665f
    Merge cf59a8d50f into b60074f1c2 Henry Kroll III 2024-09-02 16:59:45 +02:00
  • 843d97b1c7
    Merge d25cd7f9e4 into b60074f1c2 Xuan Son Nguyen 2024-09-02 16:59:45 +02:00
  • b59daad6c2 docker : fix missing binaries in full-cuda image slaren 2024-09-02 16:48:58 +02:00
  • 3ba9d04c2a Fixed dmmv dequant for k<= GGML_SYCL_DMMV_X OuadiElfarouki 2024-09-02 14:06:58 +01:00
  • 86caa35343 use deque Xuan Son Nguyen 2024-09-02 14:58:46 +02:00
  • b60074f1c2
    llama-cli : remove duplicated log message (#9275) b3654 Guoliang Hua 2024-09-02 20:36:43 +08:00
  • 27f2c14aa9
    Apply suggestions from code review Xuan Son Nguyen 2024-09-02 14:16:14 +02:00
  • fa2210652a no more "mutable" lambda Xuan Son Nguyen 2024-09-02 13:58:56 +02:00
  • b56f321d47
    remove duplicated log message Guoliang Hua 2024-09-02 19:56:42 +08:00
  • 31a2d4a4a8 (try) fix test Xuan Son Nguyen 2024-09-02 13:36:18 +02:00
  • 9c1ba55733
    build(nix): Package gguf-py (#5664) Tushar 2024-09-02 16:51:01 +05:30
  • 954567dde8
    Merge 48507397d0 into c6d4cb4655 Someone 2024-09-02 10:46:59 +00:00
  • bd574ba487
    dev: Add tiktoken to -extra devShells ditsuke 2024-08-30 16:39:43 +05:30
  • fa0e7c9d5b
    chore: Remove some unused bindings ditsuke 2024-08-30 16:26:10 +05:30
  • dcee4754e5
    build(nix): Add pyyaml for gguf-py ditsuke 2024-08-20 03:05:48 +05:30
  • 06a8547a20
    dev: Simplify devShells, restore the -extra devShell ditsuke 2024-08-20 02:57:02 +05:30
  • cd8da29059
    revert: Bad changes ditsuke 2024-07-09 01:43:44 +05:30
  • 03b2afea24
    chore: Suggestions from review ditsuke 2024-07-08 23:33:43 +05:30
  • 4a71d86d85
    cleanup: Remove unncessary __init__.py ditsuke 2024-07-08 17:29:02 +05:30
  • 8e911d79fb
    style: nix fmt ditsuke 2024-07-08 17:27:06 +05:30
  • b7611bfb51
    fmt: Reconcile formatting with rebase ditsuke 2024-03-10 21:58:10 +05:30
  • a62a2c21b3
    chore: Move cmake to nativeBuildInputs for devShell ditsuke 2024-02-27 11:47:51 +05:30
  • d7b5776619
    build(python): Relax pytorch version constraint ditsuke 2024-02-27 00:52:51 +05:30
  • ef2dae9249
    dev(nix): Break up python/C devShells ditsuke 2024-02-27 00:51:12 +05:30
  • 51056d932c
    chore: Cleanup ditsuke 2024-02-27 00:50:01 +05:30
  • 10a49b9999
    build(python): Package python scripts with pyproject.toml ditsuke 2024-02-25 18:38:36 +05:30
  • 4b124fb2a4
    build(nix): Enable pytestCheckHook and pythonImportsCheck for gguf-py ditsuke 2024-02-25 17:14:33 +05:30
  • 11e581b45d
    build(nix): Refactor gguf-py derivation to take in exact deps ditsuke 2024-02-23 23:01:19 +05:30
  • 0b8ddf8694
    build(nix): Exclude gguf-py from devShells ditsuke 2024-02-23 15:12:13 +05:30
  • f363d308a4
    build(nix): Refactor to new scope for gguf-py ditsuke 2024-02-22 22:44:35 +05:30
  • c3bc2f6ddf
    build(nix): Package gguf-py ditsuke 2024-02-22 21:02:44 +05:30
  • 0126788271
    style: format with nixfmt/rfc101-style ditsuke 2024-02-22 23:55:55 +05:30
  • 24329aac1e use unordered_set everywhere Xuan Son Nguyen 2024-09-02 11:18:39 +02:00
  • 0b082d1346
    Merge 28670bfbc8 into c6d4cb4655 Jesse Noller 2024-09-02 09:53:30 +01:00
  • c6d4cb4655
    llama : minor style b3652 Georgi Gerganov 2024-09-02 11:52:04 +03:00
  • 086e7f6ebc
    llama : disambiguate API Georgi Gerganov 2024-09-02 10:06:42 +03:00
  • 4e8f9a04b5 src: make tail invalid when kv cell is intersection for mamba zhenweijin 2024-08-30 15:59:25 +08:00
  • 375de5b1f8 llama : use unused n_embd_k_gqa in k_shift compilade/refactor-kv-cache Francis Couture-Harpin 2024-09-01 21:59:24 -04:00
  • 5f62db790b llama : fix mixed signedness comparison Francis Couture-Harpin 2024-09-01 21:50:27 -04:00
  • 9d3f44dad4 convert_hf : fix Jamba conversion Francis Couture-Harpin 2024-09-01 21:46:27 -04:00
  • a03e32a3c9 Merge branch 'master' into compilade/refactor-kv-cache Francis Couture-Harpin 2024-09-01 20:47:59 -04:00
  • fcb889cf7f llama : session saving and reloading for hybrid models Francis Couture-Harpin 2024-09-01 20:31:30 -04:00
  • 83249aae0c small change for handle_slots_action Xuan Son Nguyen 2024-09-02 00:43:51 +02:00
  • 588b4bbad6 use res_ok everywhere Xuan Son Nguyen 2024-09-02 00:36:16 +02:00
  • 9f56c17669 fix embeddings Xuan Son Nguyen 2024-09-02 00:31:40 +02:00
  • 4a5dbd85b5 refactor completions handler Xuan Son Nguyen 2024-09-02 00:09:05 +02:00
  • 012d8d8cc0 server : remove multitask from server_task Xuan Son Nguyen 2024-09-01 18:09:58 +02:00
  • 71086fefba
    Merge b27f87d6da into 8f1d81a0b6 Marko Hostnik 2024-09-01 17:50:40 +02:00
  • 8f1d81a0b6
    llama : support RWKV v6 models (#8980) b3651 Molly Sophia 2024-09-01 22:38:17 +08:00
  • cf4fd9c61b ggml: fix build break for the vulkan-debug. Changyeon Kim 2024-09-01 22:50:53 +09:00
  • bd59dd9832 Add missing pthread_np.h include on FreeBSD Yuri Victorovich 2024-08-30 23:34:15 -07:00
  • bc320ef66d Merge branch 'master' into compilade/refactor-kv-cache Francis Couture-Harpin 2024-08-31 21:06:32 -04:00
  • 3cb2753f6b flake.lock: Update github-actions[bot] 2024-09-01 00:23:25 +00:00
  • 192d4dfa60 apply comments farbod 2024-08-31 17:15:31 +03:30
  • 23eba9bf55
    Update common/common.cpp Farbod Bijary 2024-08-31 17:09:29 +03:30
  • 3a957ec149
    Update common/common.cpp Farbod Bijary 2024-08-31 17:09:20 +03:30
  • 4badab2b6c
    SYCL: do not skip CPU or FPGA if explicitly selected Salvatore Mesoraca 2024-08-31 13:26:16 +02:00
  • a47667cff4 nix: fix CUDA build - replace deprecated autoAddOpenGLRunpathHook Echo Nolan 2024-08-22 17:19:14 -04:00
  • ea5d7478b1
    sgemm : improved Q4_0 and Q8_0 performance via 4xN and Mx4 gemm (#8908) b3649 Srihari-mcw 2024-08-31 13:50:35 +05:30
  • 49271efbaf
    llama : fix typo in xcda_array_view comment [no ci] (#9132) Daniel Bevenius 2024-08-31 09:50:22 +02:00
  • 846358d358 ggml: rwkv_wkv: Avoid copying the state Molly Sophia 2024-08-31 12:17:08 +08:00
  • 5175375715
    llama: rwkv6: Avoid division by zero Molly Sophia 2024-08-31 11:59:30 +08:00
  • 408c8402b6
    Merge 8c9784c65d into 0ab30f8d82 Johannes Gäßler 2024-08-30 21:08:15 -05:00
  • 388f15d36b nix: fix CUDA build - replace deprecated autoAddOpenGLRunpathHook Echo Nolan 2024-08-22 17:19:14 -04:00
  • cf59a8d50f
    Merge branch 'ggerganov:master' into hk Henry Kroll III 2024-08-30 14:36:27 -08:00
  • 951084cb9f feat: Implements retrying logic for downloading models using --model-url flag farbod 2024-08-31 00:20:43 +03:30
  • d25cd7f9e4 refactor Xuan Son Nguyen 2024-08-30 21:51:12 +02:00
  • 5f06d37baf add --tool-call argument Xuan Son Nguyen 2024-08-30 21:40:49 +02:00
  • 0ab30f8d82
    llama : fix llama_split_mode enum values in main_gpu document (#9057) b3647 Sutou Kouhei 2024-08-31 03:08:10 +09:00
  • 7e017cfbc8 server : add Hermes-3 tool call support Xuan Son Nguyen 2024-08-30 18:02:28 +02:00
  • 2e15e0fa21 Merge branch 'master' into xsn/full_image_less Xuan Son Nguyen 2024-08-30 14:15:42 +02:00
  • cddae4884c
    Correct typo run_llama2.sh > run-llama2.sh (#9149) 蕭澧邦 2024-08-30 20:10:01 +08:00
  • 59dc2e7099
    minor : style + indentation Georgi Gerganov 2024-08-30 13:30:52 +03:00
  • 7004323ecd
    rwkv : speed-up tokenization using trie Georgi Gerganov 2024-08-30 13:19:14 +03:00
  • 06e3e3bf51 Enable use to the rebar feature to upload buffers to the device. Markus Tavenrath 2024-08-30 11:36:18 +02:00
  • fe8b5d4e75
    chore: add llama.js to readme Matt 2024-08-30 09:29:50 +01:00
  • 7ea8d80d53
    llava : the function "clip" should be int (#9237) b3645 tc-mb 2024-08-30 13:21:57 +08:00
  • 7f2ef56639 llama: rwkv6: Add lora for some supported tensors Molly Sophia 2024-08-30 12:11:31 +08:00
  • c9be2ba124 Merge branch 'master' into dev-refactoring hongruichen 2024-08-30 10:41:04 +08:00
  • 42c76d1358
    Threadpool: take 2 (#8672) b3644 Faisal Zaghloul 2024-08-29 19:20:53 -04:00
  • 9f7d4bcf5c server : fix crash when error handler dumps invalid utf-8 json (#9195) b3643 Jan Boon 2024-08-27 18:28:06 +08:00
  • 4e16578393 remove not required self dependency Markus Tavenrath 2024-08-29 16:05:06 +02:00
  • cbf67e6b3d Improve Vulkan shader builds system Markus Tavenrath 2024-08-29 15:49:58 +02:00
  • 86ad82325c Merge branch 'the-function-"clip"-should-be-int' into support-video-understanding caitianchi 2024-08-29 19:41:06 +08:00
  • 8eefb1e413 the function "clip" should be int caitianchi 2024-08-29 19:32:38 +08:00
  • 52aa67723a build: set _GNU_SOURCE for Adroid Max Krasnyansky 2024-08-28 22:24:30 -07:00
  • 3b5f7c2a9b fix two more public APIs to use int32_t for n_threads Max Krasnyansky 2024-08-28 21:56:53 -07:00
  • c49d634071 threadpool: use _new and _free instead of _create and _release Max Krasnyansky 2024-08-28 21:34:16 -07:00
  • 1d1ccce676
    flake.lock: Update (#9162) Georgi Gerganov 2024-08-29 07:28:14 +03:00
  • cae35b9fb9 use int32_t for n_thread type in public llama.cpp API Max Krasnyansky 2024-08-28 21:17:11 -07:00
  • b97bd67e2b threadpool: fix indent in set_threadpool call Max Krasnyansky 2024-08-28 21:04:02 -07:00
  • c6c27b140a
    Update examples/llama-bench/llama-bench.cpp Max Krasnyansky 2024-08-28 20:54:42 -07:00
  • 63b6e73500 recommit for ci pass pidack 2024-08-29 11:17:12 +08:00
  • 99f2ac1a9d Merge branch 'master' of github.com:ggerganov/llama.cpp into mfalcon_mamba_cuda pidack 2024-08-29 10:36:51 +08:00
  • 316a049533 add restrict for dst pidack 2024-08-29 10:36:33 +08:00
  • 6c1f137ba5 another push Yutong Dai 2024-08-28 23:36:09 +00:00
  • f70fdf5a86 add merge type Yutong Dai 2024-08-28 22:50:16 +00:00
  • eb357d0822 Fix compilation w/o curl ochafik 2024-08-28 20:27:11 +01:00
  • 2cfba5a4d7 Merge remote-tracking branch 'origin/master' into json-refs ochafik 2024-08-28 20:25:27 +01:00
  • f645b0bc8c vit v_tokenizer integration root 2024-08-28 19:21:01 +00:00