Commit graph

  • b2130e65b6 build : re-enable some warnings for train-text-from-scratch Cebtenzzre 2023-09-28 16:56:55 -04:00
  • 7b15e8afac make : fix clang version detection Cebtenzzre 2023-09-28 16:54:28 -04:00
  • 39b566393f fix new warnings after merge Cebtenzzre 2023-09-28 16:47:41 -04:00
  • 4b90878b36 Merge branch 'master' of https://github.com/ggerganov/llama.cpp into clang-warnings Cebtenzzre 2023-09-28 16:34:04 -04:00
  • 89cecedf8d Merge branch 'ggerganov:master' into llama_native Eve 2023-09-28 20:13:14 +00:00
  • 0ccfc62a96 ggml_tensor: update the structure comments. (#3283) b1291 Hua Jiang 2023-09-28 13:06:18 -07:00
  • 7f1a0fe709 ggml : release the requested thread pool resource (#3292) b1290 Qu Zongfu 2023-09-29 03:51:52 +08:00
  • 16bc66d947 llama.cpp : split llama_context_params into model and context params (#3301) b1289 slaren 2023-09-28 21:42:38 +02:00
  • 0512d66670 ci : multithreaded builds (#3311) b1288 Eve 2023-09-28 19:31:04 +00:00
  • 5fe3594e70 Update build.yml Eve 2023-09-28 19:24:43 +00:00
  • 287a0e9a81 Merge branch 'master' into ci_threads Eve 2023-09-28 19:21:09 +00:00
  • c8a9658e65 Merge remote-tracking branch 'origin/master' into llama-model-params slaren 2023-09-28 20:50:19 +02:00
  • 17e841ac22 cuda : print total VRAM used slaren 2023-09-28 20:43:04 +02:00
  • 0e76a8992c train : finetune LORA (#2632) b1287 xaedes 2023-09-28 20:40:11 +02:00
  • 546112944a Merge branch 'master' into HEAD Georgi Gerganov 2023-09-28 21:34:26 +03:00
  • 2db94d98ed gguf : basic type checking in gguf_get_* (#3346) b1286 Cebtenzzre 2023-09-28 14:30:31 -04:00
  • ecf90b1a51 gguf : make token scores and types optional (#3347) b1285 Cebtenzzre 2023-09-28 14:30:15 -04:00
  • 720503ba4c Merge branch 'master' of github.com:phillip-kravtsov/llama.cpp into phillip-kravtsov/support-adept-persimmon-8b Phillip Kravtsov 2023-09-28 11:11:14 -07:00
  • 3f3179996d Rename adept->persimmon Phillip Kravtsov 2023-09-28 10:47:44 -07:00
  • 5659391b6a remove duplicated ctx/model functions slaren 2023-09-28 19:35:57 +02:00
  • 65b83f37bd Merge remote-tracking branch 'origin/master' into llama-model-params slaren 2023-09-28 19:06:12 +02:00
  • 2619109ad5 ci : disable freeBSD builds due to lack of VMs (#3381) b1284 Georgi Gerganov 2023-09-28 19:36:36 +03:00
  • 666ca5ae97 ci : disable freeBSD builds due to lack of VMs Georgi Gerganov 2023-09-28 19:10:14 +03:00
  • ec893798b7 llama : custom attention mask + parallel decoding + no context swaps (#3228) b1283 Georgi Gerganov 2023-09-28 19:04:36 +03:00
  • c5650ed470 server : avoid context swaps by shifting the KV cache custom-attention-mask Georgi Gerganov 2023-09-28 19:03:36 +03:00
  • ca8b315202 increase context for gguf to 32k, horde worker stats, fixed glitch in horde launcher ui, oai freq penalty, updated lite Concedo 2023-09-28 23:50:08 +08:00
  • ce2d995af2 server : clear the KV cache beyond n_past before llama_decode Georgi Gerganov 2023-09-28 18:12:39 +03:00
  • 2b8830af71 examples : do not eval prompt 2 times (close #3348) Georgi Gerganov 2023-09-28 17:48:25 +03:00
  • a207561503 examples : add example for batched decoding Georgi Gerganov 2023-09-28 17:32:04 +03:00
  • 45855b3f1c docs : mark code as Bash (#3375) Kevin Ji 2023-09-28 09:11:32 -04:00
  • d008733e6b examples : utilize new llama_get_logits_ith() Georgi Gerganov 2023-09-28 16:05:37 +03:00
  • 4c72ab13b2 metal : use mm kernels for batch size > 2 Georgi Gerganov 2023-09-28 16:02:20 +03:00
  • e9463792d3 llama : simplify returns if/else branches Georgi Gerganov 2023-09-28 16:01:49 +03:00
  • 4ad0676927 parallel : fix crash when -n -1 Georgi Gerganov 2023-09-28 15:48:38 +03:00
  • 25856900db Merge branch 'master' into custom-attention-mask Georgi Gerganov 2023-09-28 15:19:57 +03:00
  • 4aea3b846e readme : add Mistral AI release 0.1 (#3362) Pierre Alexandre SCHEMBRI 2023-09-28 14:13:37 +02:00
  • 159bc313b0 Merge 787bc4af60 into da0400344b chooper1 2023-09-28 12:09:08 +02:00
  • da0400344b ggml-cuda : perform cublas fp16 matrix multiplication as fp16 (#3370) vbatts-gguf-2023-sept b1280 slaren 2023-09-28 12:08:28 +02:00
  • 6a821b268a improved SSE streaming Concedo 2023-09-28 17:33:34 +08:00
  • cb227c2975 docs : mark code as Bash Kevin Ji 2023-09-27 22:42:28 -07:00
  • 07d091298a docker : ignore Git files Kevin Ji 2023-09-22 20:45:46 -07:00
  • 7d5674dd2d restrict fp16 mat mul to volta and up slaren 2023-09-28 00:39:14 +02:00
  • e519621010 convert : remove bug in convert.py permute function (#3364) Zhang Peiyuan 2023-09-28 02:45:20 +08:00
  • 32ada53c8e try to fix rocm build slaren 2023-09-27 20:16:16 +02:00
  • 79fe5a1f62 ggml-cuda : perform cublas fp16 matrix multiplication as fp16 slaren 2023-09-27 17:10:09 +02:00
  • ac43576124 make-ggml.py : compatibility with more models and GGUF (#3290) Richard Roberson 2023-09-27 10:25:12 -06:00
  • 20c7e1e804 gguf : fix a few general keys (#3341) b1277 Cebtenzzre 2023-09-27 12:18:07 -04:00
  • dc6897404e metal : reusing llama.cpp logging (#3152) b1276 Rickard Hallerbäck 2023-09-27 17:48:33 +02:00
  • b6434356ed ggml : minor Georgi Gerganov 2023-09-27 18:47:54 +03:00
  • 527e57cfd8 build : add ACCELERATE_NEW_LAPACK to fix warning on macOS Sonoma (#3342) b1275 Jag Chadha 2023-09-27 11:34:32 -04:00
  • ffe88a36a9 readme : add some recent perplexity and bpw measurements to READMES, link for k-quants (#3340) BarfingLemurs 2023-09-27 11:30:36 -04:00
  • c1596f633f llama : fix kv cache heuristic when context is less than 32 Georgi Gerganov 2023-09-27 18:12:43 +03:00
  • add6fa837f remove bug in convert.py permute function Zhang Peiyuan 2023-09-27 21:36:44 +08:00
  • 724534e9e6 Added the fact that llama.cpp supports Mistral AI release 0.1 Pierre Alexandre SCHEMBRI 2023-09-27 14:34:18 +02:00
  • 38d4c6cedd updated lite Concedo 2023-09-27 16:06:17 +08:00
  • 8484e5f52d gguf : use 'key_id' instead of 'i' for clarity Cebtenzzre 2023-09-26 21:06:53 -04:00
  • 8f5b0eaa8a llama.cpp : add llama_get_model common : add llama_tokenize from model slaren 2023-09-26 23:23:59 +02:00
  • 72e7ef4e53 simple : fixes cam-simple-fix slaren 2023-09-26 23:19:36 +02:00
  • db2181a47b fp32 works Phillip Kravtsov 2023-09-26 13:10:04 -07:00
  • d1b40efcfa Correct outputs through masked & softmax'd KQ Phillip Kravtsov 2023-09-26 11:36:36 -07:00
  • cdd52f6a3c gguf : make token scores and types optional Cebtenzzre 2023-09-26 13:56:57 -04:00
  • cf31658cbf added a flag to keep console in foreground Concedo 2023-09-27 01:53:30 +08:00
  • 74edc401c1 Merge branch 'master' into concedo_experimental Concedo 2023-09-27 01:30:15 +08:00
  • eb86cd4027 bump token limits Concedo 2023-09-27 01:26:00 +08:00
  • 6f72693c1c fix copy-paste error Cebtenzzre 2023-09-26 13:07:05 -04:00
  • 1eb0cf5c15 gguf : basic type checking in gguf_get_* Cebtenzzre 2023-09-26 13:01:47 -04:00
  • 8bf6f7f8b0 added simulated OAI endpoint Concedo 2023-09-27 00:49:24 +08:00
  • 6fc11cadf2 build : add ACCELERATE_NEW_LAPACK to fix warning on macOS Sonoma Jagtesh Chadha 2023-09-26 12:00:24 -04:00
  • 99c5c9a0d8 Upload immediately to device. master-99c5c9a Adam Treat 2023-09-26 11:58:39 -04:00
  • 99920d2649 fix typo Cebtenzzre 2023-09-26 11:48:27 -04:00
  • 7f112e2cd4 support genkeys in polled streaming Concedo 2023-09-26 23:46:07 +08:00
  • 8488d32132 gptneox-wip : fix potential segfault while loading model Cebtenzzre 2023-09-26 11:12:28 -04:00
  • 94573e154d gguf : fix general.source key typos Cebtenzzre 2023-09-26 11:08:51 -04:00
  • 043e5b99f2 Update README.md with k-quants bpw measurements BarfingLemurs 2023-09-26 10:23:45 -04:00
  • 086547ec33 Update README.md BarfingLemurs 2023-09-26 10:08:18 -04:00
  • 537846fce5 avoid leaking ggml_context on failure cleanup slaren 2023-09-26 15:52:59 +02:00
  • c9e1446f52 correct tensors thru RoPE Phillip Kravtsov 2023-09-26 00:07:19 -07:00
  • 4bcf412d86 wip: correct tensors up to RoPE Phillip Kravtsov 2023-09-25 23:49:35 -07:00
  • 8c1828aa6c lora : add support for non-llama models slaren 2023-09-26 01:35:38 +02:00
  • 99115f3fa6 cmake : fix build-info.h on MSVC (#3309) b1273 DAN™ 2023-09-25 18:45:33 -04:00
  • 86db9b12e9 Fix build-info on MSVC. DAN™ 2023-09-22 09:50:41 -04:00
  • 1726f9626f docs: Fix typo CLBlast_DIR var. (#3330) 2f38b454 2023-09-26 02:24:52 +08:00
  • 6c2134a860 improved makefile, allowing building without k quants Concedo 2023-09-25 22:10:47 +08:00
  • a98b1633d5 nix : add cuda, use a symlinked toolkit for cmake (#3202) Erik Scholz 2023-09-25 13:48:30 +02:00
  • 17ee719c56 improved remotelink cmd, fixed lib unload, updated class.py Concedo 2023-09-25 17:50:00 +08:00
  • 72f25da96a Fix CLBlast_DIR var. 2f38b454 2023-09-25 16:40:45 +08:00
  • 71efcc2b61 Adds Bert in LlamaCPP to avoid duplicating GGML in multiple packages Marc Terns 2023-09-24 18:48:06 -05:00
  • fdadbd0fbb updated lite (+1 squashed commits) Concedo 2023-09-24 19:12:53 +08:00
  • 021c11a21a add flush for printf William Wu 2023-09-24 22:59:10 +08:00
  • 166065837e improve handling of not yet supported tensor types xaedes 2023-09-24 14:55:21 +02:00
  • ad64e33aad Fix export-lora.cpp "not enough space in the context's memory pool" (#1) meatbag-18a 2023-09-24 05:48:19 -07:00
  • 2912f17010 improve handling of export-lora arguments xaedes 2023-09-24 14:42:52 +02:00
  • 8ecf505d5d improved embedded horde worker (+2 squashed commit) Concedo 2023-09-24 01:20:09 +08:00
  • b3a6b28622 also enable mmap on Windows Cebtenzzre 2023-09-23 23:39:28 -04:00
  • 0e9ed7f84f main : fix rope freq/scale warning slaren 2023-09-24 01:18:22 +02:00
  • c091cdfb24 llama-bench : add README (#3317) slaren 2023-09-23 21:48:24 +02:00
  • 858d5469c8 minor edit slaren 2023-09-23 21:47:42 +02:00
  • 32cf02487e colab use mmq, update lite and ver Concedo 2023-09-23 23:32:00 +08:00
  • 3f9a483065 llama-bench : add README slaren 2023-09-23 14:58:27 +02:00
  • e5afe4204c llama-bench fix slaren 2023-09-23 14:55:58 +02:00