Commit graph

  • 894ed8d7b6 py : include imatrix converter requirements in toplevel requirements Francis Couture-Harpin 2024-09-09 22:20:18 -04:00
  • 9e6b0e9419 perplexity : revert changes Francis Couture-Harpin 2024-09-09 22:00:37 -04:00
  • 503630e88a py : add requirements for legacy imatrix convert script Francis Couture-Harpin 2024-09-09 21:56:04 -04:00
  • c50293c348 make : do not run llama-gen-docs when building slaren 2024-09-10 03:46:38 +02:00
  • 1c57c3a54a llama : move random seed generation to the samplers slaren 2024-09-10 03:41:16 +02:00
  • 94596be679 convert : identify missing model files Francis Couture-Harpin 2024-09-09 19:36:31 -04:00
  • 250df0e909 llama_sampler_penalties : clamp penalty_last_n to zero slaren 2024-09-10 02:49:51 +02:00
  • 141dd55e53 convert : refactor rope_freqs generation Francis Couture-Harpin 2024-09-08 20:01:13 -04:00
  • f07cb4a73d initial integration of VIT+Projector root 2024-09-09 23:28:31 +00:00
  • 1d5a2df1ef sycl : update support condition to im2col Alberto Cabrera 2024-09-09 22:52:53 +01:00
  • bfe76d4a17 common : move arg parser code to arg.cpp (#9388) b3716 Xuan Son Nguyen 2024-09-09 23:36:09 +02:00
  • b3a4218aa9 Merge c882647e7c into 293bebe077 Pavel Fatin 2024-09-09 21:34:09 +01:00
  • decde48be7 fix test Xuan Son Nguyen 2024-09-09 20:31:52 +02:00
  • 42d5fc1986 update server readme Xuan Son Nguyen 2024-09-09 20:25:53 +02:00
  • 96311e3248 refactor gpt_params_parse Xuan Son Nguyen 2024-09-09 20:23:45 +02:00
  • cf2a874142 fix build Xuan Son Nguyen 2024-09-09 19:02:40 +02:00
  • 293bebe077 rpc : fix segfault with nkvo (#9389) b3715 Radoslav Gerganov 2024-09-09 18:40:10 +03:00
  • 5fac4d5764 ggml : vector length agnostic SVE support (#9290) b3714 Prashant Vithule 2024-09-09 21:07:18 +05:30
  • bb689e1d82 Update ggml/src/ggml-quants.c Prashant Vithule 2024-09-09 21:02:43 +05:30
  • 6412a598a1 common : more explicit includes Georgi Gerganov 2024-09-09 18:22:25 +03:00
  • 5fb5e24811 llama : minor sampling refactor (2) (#9386) b3713 slaren 2024-09-09 17:10:46 +02:00
  • 3e03807043 missing cstdarg Xuan Son Nguyen 2024-09-09 16:26:39 +02:00
  • fa00ec0e59 missing climits Xuan Son Nguyen 2024-09-09 16:07:58 +02:00
  • 2fd513a826 match discrete_dist type and function return type slaren 2024-09-09 16:06:05 +02:00
  • fe16c7a8ad fix type specifier in format string slaren 2024-09-09 15:58:03 +02:00
  • 8f5f25cd72 rpc : buf_size must not be static Radoslav Gerganov 2024-09-09 16:57:47 +03:00
  • 30f06726e7 add cmake Xuan Son Nguyen 2024-09-09 15:56:25 +02:00
  • 6e801df136 rpc : fix nkvo slaren 2024-09-07 03:24:47 +02:00
  • 9ea1e93591 better categorize args Xuan Son Nguyen 2024-09-09 15:44:48 +02:00
  • 9444f3fca2 RWKV v6: Add time_mix_decay_w1/w2 in quant exclusion list Molly Sophia 2024-09-09 21:31:04 +08:00
  • 4f7b808ba7 llama : minor sampling refactor (2) slaren 2024-09-09 15:27:07 +02:00
  • 5d399f5689 common : move arg parser to arg.cpp Xuan Son Nguyen 2024-09-09 15:17:58 +02:00
  • bfeb2f5b3b llama : update llm_build_copy_mask_state comment Daniel Bevenius 2024-09-09 15:04:34 +02:00
  • 2bed2542ba Merge pull request #1 from ggerganov/SVE-vector-length-agnostic-VLA-gg Prashant Vithule 2024-09-09 18:22:16 +05:30
  • 38ca6f644b readme : update hot topics Georgi Gerganov 2024-09-09 15:51:37 +03:00
  • 8e6e2fbe14 CUDA: fix variable name conflict for Windows build (#9382) b3711 Johannes Gäßler 2024-09-09 14:22:53 +02:00
  • 8d954a8629 CUDA: fix variable name conflict for Windows build Johannes Gäßler 2024-09-09 13:28:40 +02:00
  • 5ed087573e readme : add LLMUnity to UI projects (#9381) Antonis Makropoulos 2024-09-09 14:21:38 +03:00
  • 8d8aa81dcc add newline to examples/rpc/README.md to fix editorconfig-checker unit test Antonis Makropoulos 2024-09-09 14:18:46 +03:00
  • cfbf33a705 ggml : style changes + fix 512-bit nb loop check SVE-vector-length-agnostic-VLA-gg Georgi Gerganov 2024-09-09 12:50:35 +03:00
  • 8bd723e5c5 add LLMUnity to UI projects Antonis Makropoulos 2024-09-09 12:34:17 +03:00
  • 54f376d0b9 rpc : update README [no ci] (#9320) Radoslav Gerganov 2024-09-09 11:04:39 +03:00
  • 195a062986 make tokenizer_pre consistent; llama.cpp work hoangdz 2024-09-09 16:20:39 +09:00
  • b2e89a3274 Arm AArch64: Documentation updates (#9321) Dan Johansson 2024-09-09 09:02:45 +02:00
  • e26d17c0bb ci: bump actions/checkout to v4 Trivikram Kamat 2024-09-08 16:23:55 -07:00
  • daa9623ab0 Overlap cmdbuffer creation and cmdbuffer execution in Vulkan backend by submitting smaller cmdbuffers early. (#9118) b3707 Markus Tavenrath 2024-09-08 21:43:48 +02:00
  • e079bffb66 cuda : fix FA Q src index (1 -> 0) (#9374) b3706 Georgi Gerganov 2024-09-08 22:01:02 +03:00
  • f0de0bf28e Merge 39ae18444f into 3f7ccfd649 curvedinf 2024-09-08 18:13:39 +02:00
  • 3f7ccfd649 common : bring back missing args, add env var duplication check (#9375) b3705 Xuan Son Nguyen 2024-09-08 18:08:55 +02:00
  • 374acb9392 correct default values Xuan Son Nguyen 2024-09-08 17:29:58 +02:00
  • 944ea66861 cuda : fix FA Q src index (1 -> 0) Georgi Gerganov 2024-09-08 18:23:30 +03:00
  • 9b04a44325 add check for duplicated env var Xuan Son Nguyen 2024-09-08 17:21:07 +02:00
  • b5dd43555a move duplication check to test-arg-parser Xuan Son Nguyen 2024-09-08 17:18:17 +02:00
  • 056822ec4f common : bring back missing args Xuan Son Nguyen 2024-09-08 17:12:33 +02:00
  • d19101c9a0 imatrix : use FMA and sort tensor names Francis Couture-Harpin 2024-09-08 11:03:59 -04:00
  • a249843d89 common : restore --n-gpu-layers (#9371) b3704 slaren 2024-09-08 16:44:42 +02:00
  • 3ad0603c65 Merge branch 'master' into compilade/imatrix-batched-chunks Francis Couture-Harpin 2024-09-08 10:05:08 -04:00
  • c8ab6a3ba3 imatrix : fix conversion problems Francis Couture-Harpin 2024-09-08 10:04:01 -04:00
  • 19f4a7b296 llama : refactor samplers internal implementation (#9370) b3703 slaren 2024-09-08 15:52:07 +02:00
  • 6c95dfe829 remove outdated comment slaren 2024-09-08 15:17:28 +02:00
  • e1c4fb7f9c fix LLAMA_TOKEN_NULL checks in penalties sampler slaren 2024-09-08 15:16:50 +02:00
  • 80f6666b26 common : restore --n-gpu-layers slaren 2024-09-08 15:07:46 +02:00
  • f3ecf6d740 llama : refactor samplers internal implementation slaren 2024-09-08 15:05:05 +02:00
  • 2a358fb0c4 [SYCL] add check malloc result on device (#9346) b3702 Neo Zhang Jianyu 2024-09-08 19:05:29 +08:00
  • c882647e7c Direct I/O and Transparent HugePages Pavel Fatin 2024-05-20 21:55:33 +02:00
  • eae597182c llama : sanitize tokens in the upper bound (#9359) b3701 slaren 2024-09-08 12:41:51 +02:00
  • 00b02bb249 imatrix : fix arg parser for imatrix (#9366) b3700 Xuan Son Nguyen 2024-09-08 12:12:17 +02:00
  • 08d9acd981 beautify printing first arg Xuan Son Nguyen 2024-09-08 11:38:03 +02:00
  • 01da813dc7 imatrix : fix arg parser Xuan Son Nguyen 2024-09-08 11:28:57 +02:00
  • e106feb048 update for review comments, check all malloc_device() result arthw 2024-09-08 17:23:33 +08:00
  • 9dc0223390 Fix some nodes are not checked with GGML_VULKAN_CHECK_RESULTS enabled. Markus Tavenrath 2024-09-08 11:19:34 +02:00
  • dffb4b1909 Merge db78320b4d into a876861455 jaime-m-p 2024-09-08 10:06:51 +02:00
  • a876861455 metal : update support condition for im2col + fix warning (#0) b3699 Georgi Gerganov 2024-09-08 09:57:57 +03:00
  • 385decbd63 sync : ggml Georgi Gerganov 2024-09-08 09:38:56 +03:00
  • 60a3107ccd scripts : option to increase git patch context Georgi Gerganov 2024-09-08 09:38:42 +03:00
  • 406c1a32a1 vulkan: add dryrun support to sin and cos ops (ggml/947) Salvatore Mesoraca 2024-09-06 14:34:25 +02:00
  • 9cb9260861 vulkan: correctly report support for OP_CONT (ggml/946) Salvatore Mesoraca 2024-09-06 14:34:07 +02:00
  • 202084d31d tests: add gradient tests for all backends (ggml/932) Johannes Gäßler 2024-09-03 17:21:46 +02:00
  • dbbebcab33 ggml: fix ggml_graph_cpy undefined behavior (ggml/943) Johannes Gäßler 2024-08-31 14:35:42 +02:00
  • ba1cf846ed cann : fix doxy (ggml/0) Georgi Gerganov 2024-08-28 18:45:01 +03:00
  • d2d3200b38 cann : add Ascend NPU support (whisper/2336) Mengqing Cao 2024-08-09 20:21:56 +08:00
  • 51d964a4ef cuda : mark BF16 CONT as unsupported Georgi Gerganov 2024-08-28 17:08:03 +03:00
  • efe6a83e30 ggml : fix cont with transposed tensors when one dimension is 1 (ggml/934) Salvatore Mesoraca 2024-08-28 10:23:02 +02:00
  • 99afa0cdb9 metal : update support condition for im2col + fix warning (#0) Georgi Gerganov 2024-09-08 09:57:57 +03:00
  • 365945566a sync : ggml Georgi Gerganov 2024-09-08 09:38:56 +03:00
  • 38209025b1 scripts : option to increase git patch context Georgi Gerganov 2024-09-08 09:38:42 +03:00
  • 5516d614ae vulkan: add dryrun support to sin and cos ops (ggml/947) Salvatore Mesoraca 2024-09-06 14:34:25 +02:00
  • 0274447789 vulkan: correctly report support for OP_CONT (ggml/946) Salvatore Mesoraca 2024-09-06 14:34:07 +02:00
  • 6c7165ff96 tests: add gradient tests for all backends (ggml/932) Johannes Gäßler 2024-09-03 17:21:46 +02:00
  • 22ea8634a4 ggml: fix ggml_graph_cpy undefined behavior (ggml/943) Johannes Gäßler 2024-08-31 14:35:42 +02:00
  • d8ddddf655 cann : fix doxy (ggml/0) Georgi Gerganov 2024-08-28 18:45:01 +03:00
  • 2e97d211c2 cann : add Ascend NPU support (whisper/2336) Mengqing Cao 2024-08-09 20:21:56 +08:00
  • ff0ba802a4 cuda : mark BF16 CONT as unsupported Georgi Gerganov 2024-08-28 17:08:03 +03:00
  • 80df61bdc5 ggml : fix cont with transposed tensors when one dimension is 1 (ggml/934) Salvatore Mesoraca 2024-08-28 10:23:02 +02:00
  • 87d8636e93 add phi name to convert_hf_to_gguf_update.py hoangdz 2024-09-08 15:14:24 +09:00
  • ade52b6cc6 common : add llama_arg Georgi Gerganov 2024-09-08 08:57:56 +03:00
  • 471e7e1e59 llama : llama_perf + option to disable timings during decode Georgi Gerganov 2024-09-07 20:50:23 +03:00
  • fbb7fcffbc llama : set attrs of mislabelled EOT/EOM tokens (#9348) b3688 Kevin Gibbons 2024-09-07 22:51:00 -07:00
  • 30f71e60de flake.lock: Update github-actions[bot] 2024-09-08 00:21:46 +00:00
  • 297ba5c3af llama : sanitize tokens in the upper bound slaren 2024-09-08 00:33:40 +02:00