Commit graph

  • b12fa0d1c1 build : link against build info instead of compiling against it (#3879) b1468 cebtenzzre 2023-11-02 02:50:16 -04:00
  • 4d719a6d4e cuda : check if this fixes Pascal card regression (#3882) b1467 Georgi Gerganov 2023-11-02 08:35:10 +02:00
  • 183b3fac6c metal : fix build errors and kernel sig after #2268 (#3898) b1466 Georgi Gerganov 2023-11-02 08:33:37 +02:00
  • 396412c02b metal : fix build errors and kernel sig after #2268 Georgi Gerganov 2023-11-02 08:15:18 +02:00
  • 82267e5e69 switched back to clinfo since it's possibly more cross-platform and can get memory vals easily Concedo 2023-11-02 14:12:05 +08:00
  • 2fffa0d61f cuda : fix RoPE after #2268 (#3897) b1465 cebtenzzre 2023-11-02 01:49:44 -04:00
  • fd04ac513e cuda : fix RoPE after #2268 cebtenzzre 2023-11-02 00:28:29 -04:00
  • e9abcc9c7c fix linter complaints cebtenzzre 2023-11-02 00:06:32 -04:00
  • 66ccd62102 sort imports cebtenzzre 2023-11-01 23:26:28 -04:00
  • 8f31dc54ec fix mypy errors cebtenzzre 2023-11-01 23:22:04 -04:00
  • d480d2c204 ggml-cuda : compute ptrs for cublasGemmBatchedEx in a kernel (#3891) slaren 2023-11-01 23:10:09 +01:00
  • 1ab18ecb53 Merge commit 'c43c2da8af' into concedo_experimental Concedo 2023-11-02 11:17:59 +08:00
  • 9ea1c9bf99 Fix ROCM build by relaxing constness KerfuffleV2 2023-11-01 19:24:11 -06:00
  • 0eb332a10f llama : fix llama_context_default_params after #2268 (#3893) b1464 cebtenzzre 2023-11-01 19:29:14 -04:00
  • 3536f2048c llama : fix llama_context_default_params after #2268 cebtenzzre 2023-11-01 18:56:20 -04:00
  • d02e98cde0 ggml-cuda : compute ptrs for cublasGemmBatchedEx in a kernel (#3891) b1463 slaren 2023-11-01 23:10:09 +01:00
  • 898aeca90a llama : implement YaRN RoPE scaling (#2268) b1462 cebtenzzre 2023-11-01 18:04:33 -04:00
  • 20787d8c5d Merge upstream changes, fix conflicts 0cc4m 2023-11-01 22:51:17 +01:00
  • 402f428ab2 cmake : revert change to CMP0115 cebtenzzre 2023-11-01 17:47:53 -04:00
  • 081f73815f Merge branch 'master' of https://github.com/ggerganov/llama.cpp into ntkv2 cebtenzzre 2023-11-01 17:28:51 -04:00
  • 4b7eccc7da Add Vulkan to llama-bench 0cc4m 2023-11-01 22:22:57 +01:00
  • c43c2da8af llm : fix llm_build_kqv taking unused tensor (benign, #3837) b1461 Georgi Gerganov 2023-11-01 23:08:30 +02:00
  • 523e49b111 llm : fix falcon norm after refactoring (#3837) b1460 Georgi Gerganov 2023-11-01 23:00:50 +02:00
  • 15f26efdb1 implement YaRN for GPT-NeoX RoPE cebtenzzre 2023-11-01 16:44:49 -04:00
  • 2e01682a56 Reuse timeline semaphores, allow parallel operation with binary semaphores to work around nvidia driver limitations 0cc4m 2023-11-01 21:46:54 +01:00
  • e16b9fa4ba metal : multi-simd softmax (#3710) b1459 Georgi Gerganov 2023-11-01 21:25:00 +02:00
  • 46868a499e metal : multi-simd softmax metal-soft-max Georgi Gerganov 2023-10-21 13:18:26 +03:00
  • ff8f9a88da common : minor (#3715) b1458 Georgi Gerganov 2023-11-01 21:15:55 +02:00
  • 1354122c21 fix warnings slaren 2023-11-01 19:43:27 +01:00
  • a9ab02eb6f ggml-cuda : compute ptrs for cublasGemmBatchedEx in a kernel slaren 2023-11-01 19:26:48 +01:00
  • 50337961a6 llm : add llm_build_context (#3881) b1457 Georgi Gerganov 2023-11-01 20:11:02 +02:00
  • a8796f9609 llm : cleanup + comments llm-build-context Georgi Gerganov 2023-11-01 20:08:02 +02:00
  • 0e40806c1c common : allow caller to handle help/argument exceptions (#3715) b1456 bandoti 2023-11-01 14:42:01 -03:00
  • f4de12b36f Update common/common.cpp bandoti 2023-11-01 14:06:19 -03:00
  • af30448b3e Update common/common.h bandoti 2023-11-01 14:02:37 -03:00
  • bf416a3875 zig : make build info a .cpp source instead of a header cebtenzzre 2023-11-01 12:15:45 -04:00
  • 2807111dcc exit instead of returning false Mason M 2023-11-01 12:47:37 -03:00
  • 21588cefd4 tunnel code done (+1 squashed commit) Concedo 2023-11-01 22:55:37 +08:00
  • a7622c03c4 Merge branch 'ggerganov:master' into parse-args-error-handling bandoti 2023-11-01 12:11:33 -03:00
  • 78186f4009 llm : restore the non-graph llm_build_ functional API Georgi Gerganov 2023-11-01 15:25:50 +02:00
  • a2758d08e4 log : make generating separate log files optional (#3787) b1455 staviq 2023-11-01 15:18:27 +01:00
  • e75dfdd31b sampling : null grammar field after reset (#3885) b1454 l3utterfly 2023-11-01 21:40:43 +08:00
  • 3b227fc704 automatic gpu layer detection Concedo 2023-11-01 20:55:26 +08:00
  • b395dbf6f5 wip layer calculator Concedo 2023-11-01 20:04:10 +08:00
  • 524c6ca8a7 null grammar field after reset l3utterfly 2023-11-01 19:54:18 +08:00
  • 9a3b4f6c86 ggml : fix UNUSED macro (#3762) b1453 Georgi Gerganov 2023-11-01 13:50:45 +02:00
  • 73bdcb395e finetune : add -ngl parameter (#3762) Andrew Godfrey 2023-11-01 04:49:04 -07:00
  • ae2cd56de8 kobold integration of min_p sampler (+1 squashed commit) Concedo 2023-11-01 19:07:26 +08:00
  • bcb397953f Merge remote-tracking branch 'llama.cpp/try-fix-3869' into concedo_experimental Concedo 2023-11-01 18:29:08 +08:00
  • 92d80b94b3 bundle simpleclinfo into pyinstaller except for linux Concedo 2023-11-01 18:26:15 +08:00
  • 9342636408 Merge branch 'master' into concedo_experimental Concedo 2023-11-01 18:24:36 +08:00
  • df7e757d40 windows: added simpleclinfo, which helps determine clblast platform and device on windows Concedo 2023-11-01 18:10:35 +08:00
  • f0e209324a scripts : add server-llm.sh (#3868) Georgi Gerganov 2023-11-01 11:29:07 +02:00
  • ca190bca8e server : re-enable completion and embedding at the same time (#3876) b1450 Adrian Hesketh 2023-11-01 09:28:28 +00:00
  • 995ee0919f llm : deduce norm eps based on type + explicit max_alibi_bias, clamp_kqv Georgi Gerganov 2023-11-01 11:19:58 +02:00
  • 9284aa6a70 llm : add llm_build_context Georgi Gerganov 2023-11-01 08:51:43 +02:00
  • 7420bef83e wip wip wip llm-reuse-constants Georgi Gerganov 2023-11-01 08:51:43 +02:00
  • 71e3718abd llama : refactor graph build code (#3837) b1449 Georgi Gerganov 2023-11-01 08:04:02 +02:00
  • 4fdd7cdf2b Review fixes, persimmon fixes Galunid 2023-11-01 02:32:49 +01:00
  • 56c1fad0b4 build : link against build info instead of compiling against it cebtenzzre 2023-10-30 23:44:23 -04:00
  • 6288f9b91d cmake : add missing dependencies on BUILD_INFO cebtenzzre 2023-10-31 00:44:40 -04:00
  • fc94d4b684 cmake : simplify BUILD_INFO target cebtenzzre 2023-10-31 00:24:13 -04:00
  • 2a8af07381 cmake : fix build when .git does not exist cebtenzzre 2023-10-31 00:16:18 -04:00
  • 3af7756042 Merge branch 'master' of https://github.com/ggerganov/llama.cpp into finetune_enableGpu cebtenzzre 2023-10-31 19:34:54 -04:00
  • 6ce46c5270 Fix test - fix line ending Mihai 2023-11-01 00:58:14 +02:00
  • 82fdee6406 Update index.html.hpp after running deps.sh Mihai 2023-11-01 00:46:28 +02:00
  • 95356519a7 server : re-enable completion and embedding at the same time Adrian Hesketh 2023-10-31 22:31:17 +00:00
  • 3ec89dcc69 Use 'IntEnum' instead of 'Enum' Galunid 2023-10-31 22:23:26 +01:00
  • 77931e7d0a Use spaces instead of tabs Mihai 2023-10-31 22:25:48 +02:00
  • a7449e8fc2 Update server.cpp with min_p after it was introduced in https://github.com/ggerganov/llama.cpp/pull/3841 Mihai 2023-10-31 22:16:36 +02:00
  • 238657db23 samplers : Min-P sampler implementation [alternative to Top P/Top K] (#3841) b1448 kalomaze 2023-10-31 14:44:49 -05:00
  • 2080f24688 Enable sigint handler even when not in interactive mode Jaggzh 2023-10-31 11:50:52 -07:00
  • afb3929279 Merge branch 'master' into llama-refactor llama-refactor Georgi Gerganov 2023-10-31 20:35:31 +02:00
  • 9de3e6c7c3 remove commented out code cebtenzzre 2023-10-31 14:31:51 -04:00
  • 3b58af2648 forgot one small thing! kalomaze 2023-10-31 13:19:49 -05:00
  • 974640ac25 Update README for consistency kalomaze 2023-10-31 13:18:48 -05:00
  • 22cc9bef09 cuda : check if this fixes Pascal card regression Georgi Gerganov 2023-10-31 20:01:47 +02:00
  • 07178c98e1 flake.nix: fix for rocm 5.7 (#3853) Tungsten842 2023-10-31 18:24:03 +01:00
  • 5baefef497 llama : add llm_build helper functions (#3848) Georgi Gerganov 2023-10-31 19:23:12 +02:00
  • 29fe516913 wip test-mmv Georgi Gerganov 2023-10-31 18:36:37 +02:00
  • 512cac630c added a bit more context to the README kalomaze 2023-10-31 11:32:01 -05:00
  • 9248325f82 Update README & set 0.05 default kalomaze 2023-10-31 11:25:23 -05:00
  • d09772a6d0 gguf : prevent adding tensors after header is written cebtenzzre 2023-10-31 11:52:32 -04:00
  • 389d2e6b9e gguf : free tensors as they are written Cebtenzzre 2023-10-01 21:42:42 -04:00
  • d97afcfc02 gguf : track writer state Cebtenzzre 2023-10-01 19:31:08 -04:00
  • 3fcdc9330a gguf : cleanup tensor padding Cebtenzzre 2023-10-01 18:07:43 -04:00
  • 6df988d5f1 gguf : do not store defaults in class vars Cebtenzzre 2023-10-01 17:49:22 -04:00
  • f4b9a7ea02 Remove 'old' conversion scripts Galunid 2023-10-31 16:27:06 +01:00
  • 235acc18cd Small refactor Galunid 2023-10-31 16:23:53 +01:00
  • c94df09732 Rework tokenizer handling Galunid 2023-10-31 16:11:08 +01:00
  • dab42893c9 scripts : working curl pipe deploy Georgi Gerganov 2023-10-31 17:03:56 +02:00
  • 7923b70cb8 llama : add llm_build_inp_embd helper llama-refactor-norm Georgi Gerganov 2023-10-31 16:43:08 +02:00
  • b2ba44eab2 Flake8 fixes Galunid 2023-10-31 15:38:24 +01:00
  • 2073347e3b llama : remove extra ; + deduplicate gate_b logic Georgi Gerganov 2023-10-31 16:28:09 +02:00
  • 43a5143450 added clinfo binary, cleanup unused stuff Concedo 2023-10-31 22:25:25 +08:00
  • f3690ba6d2 shifting enabled by default Concedo 2023-10-31 21:41:57 +08:00
  • 80bfc59692 Remove queue information 0cc4m 2023-10-31 14:29:36 +01:00
  • 2c7fa8de5a Reduce number of used semaphores by utilizing timelines more properly 0cc4m 2023-10-31 14:24:37 +01:00
  • e62f38abd1 Merge branch 'master' into concedo_experimental Concedo 2023-10-31 21:09:49 +08:00
  • cc5b282350 Merge branch 'master' into concedo_experimental Concedo 2023-10-31 20:44:04 +08:00