Commit graph

  • 2f84f5dc84 fix code style kir-gadjello 2023-11-22 02:40:47 -03:00
  • a0a08eedb6 Add openai-compatible POST /v1/chat/completions API endpoint to server example kir-gadjello 2023-11-22 02:16:38 -03:00
  • 87fe183d4d fix format Bingxuan Wang 2023-11-22 11:41:11 +08:00
  • 76f6831fce remove printf Bingxuan Wang 2023-11-22 11:30:52 +08:00
  • 31fbcf6890 Merge branch 'master' into swiftui_metal_update Bailey Chittle 2023-11-21 18:57:11 -08:00
  • f002a2e752 fixed it! Bailey Chittle 2023-11-21 18:47:05 -08:00
  • a22264ac0b update to match latest code, new errors Bailey Chittle 2023-11-21 18:28:19 -08:00
  • 6f8adf99d5 simplify mike dupont 2023-11-21 21:12:56 -05:00
  • 9c33f53e9a working mike dupont 2023-11-21 20:52:29 -05:00
  • 9b7a296452 update mike dupont 2023-11-21 20:38:10 -05:00
  • 0b63b52ece update mike dupont 2023-11-21 20:11:05 -05:00
  • f3f829a0d7 improvement mike dupont 2023-11-21 20:03:25 -05:00
  • f1558ab38f diff mike dupont 2023-11-21 19:57:47 -05:00
  • 3a916678e3 update mike dupont 2023-11-21 19:19:49 -05:00
  • 84adb5412c add n_dims parameter to llm_build_k_shift, default to n_rot via overload slaren 2023-11-22 00:45:43 +01:00
  • 5432cc18da adding debug notes mike dupont 2023-11-21 15:11:18 -05:00
  • 22359f7afe now it might even run mike dupont 2023-11-21 14:48:47 -05:00
  • ee9b0bceeb rebased and trimmed down mike dupont 2023-11-21 11:25:37 -05:00
  • 4a3469f20e remove unused freq_base kernel parameter slaren 2023-11-21 17:53:54 +01:00
  • 58444931dc ggml-cuda : support stablelm rope slaren 2023-11-21 17:48:00 +01:00
  • 8e672efe63 stablelm : simplify + speedup generation (#4153) b1550 Galunid 2023-11-21 16:22:30 +01:00
  • 8c45f1c34a Merge branch 'master' into regex_gpt2_preprocess Bingxuan Wang 2023-11-21 20:43:21 +08:00
  • 3a7f0c4cf3 optimized performance Bingxuan Wang 2023-11-21 20:35:38 +08:00
  • 319e47e703 stablelm : simplify + speedup generation Galunid 2023-11-21 10:57:31 +01:00
  • 566785f560 Address comments Haohui Mai 2023-11-20 22:04:21 -08:00
  • 8fe9c2e2de [github][workflows][docker]: adds jlumbroso/free-disk-space samm81 2023-11-20 21:48:54 +03:00
  • cc07106ab5 [github][workflows][docker]: removes hardcoded ggerganov from ghcr repo samm81 2023-11-19 20:45:56 +03:00
  • 0b871f1a04 finetune - update readme to mention llama support only (#4148) Galunid 2023-11-20 19:30:00 +01:00
  • cbadbfd61d finetune - update readme to mention llama support only Galunid 2023-11-20 18:41:15 +01:00
  • e81dae7ee4 adding the simple name lookup mike dupont 2023-11-20 12:25:36 -05:00
  • a7b816ecdf improving on logging mike dupont 2023-11-20 11:54:23 -05:00
  • dfc7cd48b1 readme : update ROCm Windows instructions (#4122) Aaryaman Vasishta 2023-11-21 00:02:46 +09:00
  • 56a5fa7a60 Merge branch 'master' into concedo_experimental Concedo 2023-11-20 22:37:06 +08:00
  • 4d7c14be73 fix stop seq escaping newline Concedo 2023-11-20 22:35:45 +08:00
  • dc4078c039 fixed segfault with all non-gguf models Concedo 2023-11-20 22:31:56 +08:00
  • 881800d1f0 main : Add ChatML functionality to main example (#4046) b1547 Seb C 2023-11-21 00:26:59 +10:30
  • f23c0359a3 ci : add flake8 to github actions (python linting) (#4129) b1546 Galunid 2023-11-20 11:35:47 +01:00
  • 40a34fe8d0 speculative : fix prompt tokenization in speculative example (#4025) b1545 Branden Butler 2023-11-20 03:50:04 -06:00
  • bd2a828742 Update README.md Aaryaman Vasishta 2023-11-20 14:35:52 +09:00
  • dae06c06e5 Revert "finetune : add --n-gpu-layers flag info to --help (#4128)" b1544 Georgi Gerganov 2023-11-19 19:16:07 +02:00
  • ee66c69dba chore: resolve comments vodkaslime 2023-11-20 01:03:24 +08:00
  • 05e8301e45 finetune : add --n-gpu-layers flag info to --help (#4128) b1543 Clark Saben 2023-11-19 11:56:38 -05:00
  • 936c79b227 server : relay error messages (#4131) b1542 SoftwareRenderer 2023-11-19 11:54:10 -05:00
  • 262005ad9d common : comma should be semicolon (#4137) b1541 kchro3 2023-11-19 08:52:57 -08:00
  • 35985acffa gitignore : tokenize Georgi Gerganov 2023-11-19 18:50:49 +02:00
  • 9de48fdd20 comma should be semicolon kchro3 2023-11-19 08:42:16 -08:00
  • 5d21a82786 chore: resolve comments vodkaslime 2023-11-20 00:40:45 +08:00
  • e937066420 gguf-py : export chat templates (#4125) b1539 slaren 2023-11-19 11:10:52 +01:00
  • 34a9ef6c6c fix: readme vodkaslime 2023-11-19 17:20:18 +08:00
  • d9ca45638f Allow multi-op partial offloading by parsing the graph to preallocate enough between-op buffers 0cc4m 2023-11-19 09:39:46 +01:00
  • cf646fa809 try to scale custom roped models Concedo 2023-11-19 16:24:13 +08:00
  • c885cc9f76 update CMakefiles.txt to check openblas64 annalee 2023-11-19 05:37:28 +00:00
  • c6874f9759 Fix incorrect format strings and uninitialized variables. Haohui Mai 2023-11-18 20:51:09 -08:00
  • 8f83ca592d Update llama.cpp John 2023-11-19 04:54:58 +01:00
  • 2e263ca200 update vocab class wonjun Jang 2023-11-19 10:20:06 +09:00
  • f28b2e26d1 server : relay error messages SoftwareRenderer 2023-11-18 19:21:50 -05:00
  • 28a2e6e7d4 tokenize example: Respect normal add BOS token behavior (#4126) b1538 Kerfuffle 2023-11-18 14:48:17 -07:00
  • 471a1b009d Fix issues with older Vulkan headers on Ubuntu 22.04 0cc4m 2023-11-18 22:17:42 +01:00
  • 66c9cd6d5c E701 - multiple statements on one line (colon) Galunid 2023-11-18 21:46:23 +01:00
  • 12f53d9752 Remove baichuan script from excludes Galunid 2023-11-18 21:18:47 +01:00
  • 1a738a5d1d Merge branch 'master' into python-flake8 Galunid 2023-11-18 21:18:17 +01:00
  • 0b5c3b0457 scripts : Remove missed baichuan convert script (#4127) Galunid 2023-11-18 21:08:33 +01:00
  • 0b1a4f697b Fix linter errors Galunid 2023-11-18 19:37:52 +01:00
  • 1ef22ac4f5 add --n-gpu-layers flag to finetune.cpp for clarity on finetune --help csaben 2023-11-18 14:15:00 -05:00
  • 9cfc5e2160 Ensure tgt and dft have same add_bos setting Branden Butler 2023-11-18 12:49:35 -06:00
  • e7f29df1e0 Merge branch 'master' into delete-baichuan Galunid 2023-11-18 19:45:52 +01:00
  • 246a5a6ad0 Remove missed baichuan convert script Galunid 2023-11-18 19:40:13 +01:00
  • b3ea7ade46 Add flake8 action Galunid 2023-11-18 19:33:47 +01:00
  • e778ce4a4c Adapt to new should_add_bos function Branden Butler 2023-11-18 12:23:33 -06:00
  • c2d0041704 Support special tokens and not adding BOS to prompt in speculative Branden Butler 2023-11-10 09:59:22 -06:00
  • 046a469d11 Fix(ish?) prompt tokenizing KerfuffleV2 2023-11-18 09:37:31 -07:00
  • 89262ded9e Merge branch 'master' into feat-seqrep-sampler-simple KerfuffleV2 2023-11-18 09:37:14 -07:00
  • 4b5cf30b7f Merge 147ab5eb0c into 2923f17f6f trabbart 2023-11-18 18:21:03 +02:00
  • b044ba7aaf gguf-py : initialize chat_template slaren 2023-11-18 17:16:33 +01:00
  • edd98313ca gguf-py : check chat_template type slaren 2023-11-18 16:59:22 +01:00
  • 05a231baa0 tokenize example: Respect normal add BOS token behavior KerfuffleV2 2023-11-18 08:20:41 -07:00
  • 2923f17f6f Clean up ggml-cuda.cu warnings when compiling with clang (for ROCM) (#4124) b1536 Kerfuffle 2023-11-18 08:11:18 -07:00
  • 72c3a5d28a gguf-py : bump version slaren 2023-11-18 16:11:06 +01:00
  • b027868dfd putting the tensor name in the json mike dupont 2023-11-18 10:01:23 -05:00
  • bb8189ae6d Revert "ggml-cuda.cu: Move static items into anonymous namespace" KerfuffleV2 2023-11-18 08:00:19 -07:00
  • 2820883b9b Revert "ggml-cuda.cu: Fix use of namespace start macro" KerfuffleV2 2023-11-18 08:00:03 -07:00
  • 9829b5b38b llama.cpp : escape new lines in gguf kv info prints slaren 2023-11-18 15:59:32 +01:00
  • d8aa96496c now emit the json format mike dupont 2023-11-18 08:55:36 -05:00
  • eaefebb2c9 gguf-py : export chat templates slaren 2023-11-18 14:53:34 +01:00
  • f10956876a Merge branch 'master' into feat-seqrep-sampler-simple KerfuffleV2 2023-11-18 04:38:30 -07:00
  • 26c1149026 ggml-cuda.cu: Fix use of namespace start macro KerfuffleV2 2023-11-18 04:35:02 -07:00
  • e29757e0f7 ggml-cuda.cu: Move static items into anonymous namespace KerfuffleV2 2023-11-18 04:20:19 -07:00
  • e7a65bb7ca ggml-cuda.cu: Clean up warnings when compiling with clang KerfuffleV2 2023-11-18 03:41:45 -07:00
  • aa7cf3143b Fix broken logic for parsing bool KV overrides; Fix issue where overrides didn't apply when key missing in GGUF metadata; Resolve merge changes KerfuffleV2 2023-11-18 03:07:03 -07:00
  • 2147421904 Merge branch 'master' into feat-override-metadata KerfuffleV2 2023-11-18 02:27:56 -07:00
  • cb5bfe0c18 Various cleanups KerfuffleV2 2023-11-18 02:24:52 -07:00
  • 75518890d2 Merge upstream changes, fix conflicts 0cc4m 2023-11-18 09:56:58 +01:00
  • 39cd277073 Fix issues with float16 overflows in shaders 0cc4m 2023-11-18 09:45:50 +01:00
  • 473eb10947 Update README.md Aaryaman Vasishta 2023-11-18 14:55:09 +09:00
  • 22c56f9221 default to multiuser Concedo 2023-11-18 12:55:59 +08:00
  • ce31d955f3 begin sync with master Bailey Chittle 2023-11-17 20:01:41 -08:00
  • f510cc1719 Merge branch 'master' into swiftui_metal_update Bailey Chittle 2023-11-17 19:55:22 -08:00
  • 026eb7cd01 Fix when params.n_vocab < tokenizer vocab size wonjun Jang 2023-11-18 12:55:14 +09:00
  • cd618549dd added O3, now has insufficient memory access Bailey Chittle 2023-11-17 19:43:30 -08:00
  • 6bf8ee4aea Merge branch 'master' into concedo_experimental Concedo 2023-11-18 11:10:45 +08:00