Commit graph

  • 55ed36b3e4 finetune: fix typo in README.md Daniel Bevenius 2024-01-02 08:59:28 +01:00
  • 9e02214e98 fix: update torch version namtranase 2024-01-02 14:58:40 +07:00
  • 6c46cb1da4 Merge branch 'master' of https://github.com/ggerganov/llama.cpp namtranase 2024-01-02 14:54:05 +07:00
  • 94e68fe474 added field to show recent seed Concedo 2024-01-02 15:35:04 +08:00
  • 1484483880 Merge branch 'ggerganov:master' into sized-ints Marcus Dunn 2024-01-01 15:56:00 -08:00
  • 76f9d41dd6 metal : optimizing ggml_mul_mat_id (wip) Georgi Gerganov 2023-12-31 18:14:02 +02:00
  • 6e9cacd692 devcontainer Liora Milbaum 2024-01-01 17:32:08 +02:00
  • 64b9e5f39d Merge 8f83ca592d into edd1ab7bc3 John 2024-01-01 00:52:33 -08:00
  • 9e0dee769b Merge branch 'master' into concedo_experimental Concedo 2024-01-01 16:04:17 +08:00
  • abecbfc6b5 Add Conda Runtime support for -lcuda (#593) henk717 2024-01-01 05:15:10 +01:00
  • 50384689fb Merge b38a1e6b02 into edd1ab7bc3 Deepak Seth 2024-01-01 01:35:05 +00:00
  • 0ffe33f824 Merge branch 'ggerganov:master' into server-features minarchist 2023-12-31 18:54:27 -06:00
  • 4c1c0d68f2 Fix llm_load_tensors: the asserts were not backcompat Nam Nguyen 2023-12-31 16:38:58 -08:00
  • e2e9f32be5 flake.lock: update Someone Serge 2023-12-31 17:42:22 +00:00
  • 9321fd7fd2 flake.nix: suggest the binary caches Someone Serge 2023-12-30 18:25:25 +00:00
  • 453c34930b workflows: nix-ci: add a qemu job for jetsons Someone Serge 2023-12-30 18:01:07 +00:00
  • e84fbed569 workflows: nix-flakestry: drop tag filters Someone Serge 2023-12-30 17:36:08 +00:00
  • 9c2b03618a workflows: weekly nix flake update Someone Serge 2023-12-30 16:38:36 +00:00
  • 422ad677d5 workflows: nix-ci: add a job for eval Someone Serge 2023-12-30 17:19:11 +00:00
  • 4482c708bb workflows: nix-ci: init; build flake outputs Someone Serge 2023-12-26 19:17:26 +00:00
  • 6867651329 flake.nix: expose checks Someone Serge 2023-12-29 16:21:50 +00:00
  • 2bae3b44f3 flake.nix: rocm not yet supported on aarch64, so hide the output Someone Serge 2023-12-26 23:34:40 +00:00
  • 05a2e1c34c flake.nix: expose full scope in legacyPackages Someone Serge 2023-12-29 16:15:37 +00:00
  • 4cb6855b97 documentation John 2023-12-31 18:26:26 -06:00
  • edd1ab7bc3 flake.lock: update b1742 Someone Serge 2023-12-31 17:42:22 +00:00
  • 198ed7ebfc flake.nix: suggest the binary caches Someone Serge 2023-12-30 18:25:25 +00:00
  • d836174731 workflows: nix-ci: add a qemu job for jetsons Someone Serge 2023-12-30 18:01:07 +00:00
  • 06f2a5d190 workflows: nix-flakestry: drop tag filters Someone Serge 2023-12-30 17:36:08 +00:00
  • c5239944ba workflows: weekly nix flake update Someone Serge 2023-12-30 16:38:36 +00:00
  • 1e9ae54cf2 workflows: nix-ci: add a job for eval Someone Serge 2023-12-30 17:19:11 +00:00
  • 7adedecbe3 workflows: nix-ci: init; build flake outputs Someone Serge 2023-12-26 19:17:26 +00:00
  • 356ea17e0f flake.nix: expose checks Someone Serge 2023-12-29 16:21:50 +00:00
  • a5c088d8c6 flake.nix: rocm not yet supported on aarch64, so hide the output Someone Serge 2023-12-26 23:34:40 +00:00
  • 1e3900ebac flake.nix: expose full scope in legacyPackages Someone Serge 2023-12-29 16:15:37 +00:00
  • 522e534903 Update default values for n_embd_head_k and n_embd_head_v postmasters 2023-12-31 12:55:33 -08:00
  • acbc223b3a flake.lock: update Someone Serge 2023-12-31 17:42:22 +00:00
  • 8dd90131d3 try HIP fix JohannesGaessler 2023-12-31 17:51:53 +01:00
  • 44d626239f try HIP fix JohannesGaessler 2023-12-31 17:49:00 +01:00
  • e53e34a8b4 flake.nix: suggest the binary caches Someone Serge 2023-12-30 18:25:25 +00:00
  • 74106bd842 workflows: nix-ci: add a qemu job for jetsons Someone Serge 2023-12-30 18:01:07 +00:00
  • 1c9a6c5c4b fixup JohannesGaessler 2023-12-31 17:16:39 +01:00
  • c6047a0db5 4 cuda streams JohannesGaessler 2023-12-31 16:37:26 +01:00
  • 24a06aff14 workflows: nix-flakestry: drop tag filters Someone Serge 2023-12-30 17:36:08 +00:00
  • 452749e792 try fix JohannesGaessler 2023-12-31 15:47:35 +01:00
  • f0a176bd43 multiple streams JohannesGaessler 2023-12-29 23:15:02 +01:00
  • 4b5dc6750d workflows: weekly nix flake update Someone Serge 2023-12-30 16:38:36 +00:00
  • d79b98eee5 workflows: nix-ci: add a job for eval Someone Serge 2023-12-30 17:19:11 +00:00
  • 71d4c29c43 workflows: nix-ci: init; build flake outputs Someone Serge 2023-12-26 19:17:26 +00:00
  • 7e5c925b0a Merge branch 'ggerganov:master' into server-features minarchist 2023-12-31 06:47:44 -06:00
  • 5865b18eeb metal : fix mat-vec Q4_K kernel for QK_K == 64 Georgi Gerganov 2023-12-31 13:52:34 +02:00
  • a8b9bb4566 cmake : respect LLAMA_QKK_64 option Georgi Gerganov 2023-12-31 13:34:07 +02:00
  • 049a32fffa metal : normalize mat-vec kernel signatures Georgi Gerganov 2023-12-31 12:31:26 +02:00
  • ad7cf37fe8 metal : fix mat-vec Q8_0 kernel for BS > 1 Georgi Gerganov 2023-12-31 12:26:21 +02:00
  • 6435a3de31 cmake : rename option to LLAMA_METAL_SHADER_DEBUG Georgi Gerganov 2023-12-31 12:18:48 +02:00
  • 4c054d98d4 metal : use uint64_t for strides Georgi Gerganov 2023-12-31 12:07:58 +02:00
  • b14b5a9eb3 metal : fix compile warnings Georgi Gerganov 2023-12-31 12:04:05 +02:00
  • e39106c055 ggml : add ggml_vdotq_s32 alias (#4715) b1732 Georgi Gerganov 2023-12-31 11:43:31 +02:00
  • 453ae052c3 ggml : add ggml_vdotq_s32 alias Georgi Gerganov 2023-12-31 11:04:31 +02:00
  • 82033c9b18 server : send token probs for "stream == false" Georgi Gerganov 2023-12-31 10:21:34 +02:00
  • 76362ae3c1 fix makefile for linux cuda Concedo 2023-12-31 11:45:36 +08:00
  • cead207888 add missing dependency for linux cuda Concedo 2023-12-31 11:10:40 +08:00
  • bd5ecc356e updated readme Concedo 2023-12-31 10:53:06 +08:00
  • f56cbce6b7 Fix llm_build_kqv to be more generic wrt n_embd_head_k Nam Nguyen 2023-12-30 14:26:46 -08:00
  • e147e1b76f Merge branch 'ggerganov:master' into server-features minarchist 2023-12-30 21:29:02 +00:00
  • c3b1d51d40 Changes to server to allow metadata override Minarchist 2023-12-30 15:26:27 -06:00
  • 9fbda719de clip : refactor + bug fixes (#4696) b1731 Georgi Gerganov 2023-12-30 23:24:42 +02:00
  • 5c77ce9e2d server : add log message Georgi Gerganov 2023-12-30 23:24:25 +02:00
  • 1580805fc6 metal : fix API debug warnings Georgi Gerganov 2023-12-30 21:10:32 +02:00
  • a184e1050c cmake : add -fno-inline for Metal build (#4545) Georgi Gerganov 2023-12-30 21:10:13 +02:00
  • 515cfec44f metal : fix Metal API debug warnings Georgi Gerganov 2023-12-30 20:34:53 +02:00
  • ca0578fcab flake.nix: expose checks Someone Serge 2023-12-29 16:21:50 +00:00
  • 91d219f45c flake.nix: rocm not yet supported on aarch64, so hide the output Someone Serge 2023-12-26 23:34:40 +00:00
  • 9a0664b60c flake.nix: expose full scope in legacyPackages Someone Serge 2023-12-29 16:15:37 +00:00
  • 75c14f2608 ggml : disable fast-math for Metal (cmake build only) Georgi Gerganov 2023-12-30 19:33:01 +02:00
  • f64e4f04e7 ggml : testing GPU FP precision via quantized CPY gg/gpu-prec-tests Georgi Gerganov 2023-12-30 13:22:57 +02:00
  • eee674045e use native cl if found Concedo 2023-12-31 00:53:22 +08:00
  • fe7c200610 Merge branch 'master' into concedo_experimental Concedo 2023-12-31 00:42:59 +08:00
  • 5a02328d1f No second least squares pass Henrik Forstén 2023-12-30 18:31:46 +02:00
  • 24c3f3283a fixed numerical parsing for steps Concedo 2023-12-31 00:17:17 +08:00
  • 7ce32a151e updated lite, added background images and image gen support for custom step counts and cfg scales (+1 squashed commits) Concedo 2023-12-30 23:59:30 +08:00
  • 4c5da24ddb flake.nix: fix typo Ikko Eltociear Ashimine 2023-12-31 00:25:15 +09:00
  • 39d8bc71ed CUDA: fixed tensor cores not being used on RDNA3 (#4697) b1730 Johannes Gäßler 2023-12-30 13:52:01 +01:00
  • 7b36cea8a3 Add matmul shader support for running multiple calculations in parallel 0cc4m 2023-12-30 12:36:24 +01:00
  • e221adb3c4 CUDA: fixed tensor cores not being used on RDNA3 JohannesGaessler 2023-12-30 11:06:40 +01:00
  • 24a447e20a ggml : add ggml_cpu_has_avx_vnni() (#4589) b1729 automaticcat 2023-12-30 15:07:48 +07:00
  • ec92e78165 clip : refactor + bug fixes Georgi Gerganov 2023-12-30 09:30:29 +02:00
  • d00d70cd3f Update clip.cpp John 2023-12-30 06:22:35 +01:00
  • f57b1d0f50 offbyone ? John 2023-12-30 06:01:47 +01:00
  • bb13098206 add missing completion params to chat Peter Nagymathe 2023-12-30 03:35:12 +00:00
  • 6177196052 tweak tooltips Concedo 2023-12-30 11:02:30 +08:00
  • 9c3e7e0c77 apply custom stop tokens Peter Nagymathe 2023-12-30 02:57:23 +00:00
  • fe6630c23e remove prints Peter Nagymathe 2023-12-30 02:43:24 +00:00
  • 7ad92dbf4a cleaned up the quick tab based on the suggested removals from discord members. Concedo 2023-12-30 10:41:46 +08:00
  • 51e251a83c Rename variables Nam Nguyen 2023-12-29 16:54:12 -08:00
  • 6200da58fc Rebase Nam Nguyen 2023-12-29 08:48:02 -08:00
  • e96fad12c5 Fix llm_build_kqv to use n_value_gqa Nam Nguyen 2023-12-28 13:29:57 -08:00
  • 94d170b7e9 Add n_key_dim and n_value_dim Nam Nguyen 2023-12-27 10:01:34 -08:00
  • e92acebdf3 Update ggml.c automaticcat 2023-12-30 07:13:39 +07:00
  • a20f3c7465 CUDA: fix tensor core logic for Pascal and HIP (#4682) b1728 Johannes Gäßler 2023-12-29 23:12:53 +01:00
  • 5f12e26899 changes the ggml package url src to ggerganov Ashraful Islam 2023-12-29 14:21:34 -06:00