Commit graph

  • eef5ae3898 Merge branch 'master' into fix-respect-use-bos-token KerfuffleV2 2023-11-13 18:12:40 -07:00
  • b46d12f86d convert.py: also look for plain model.safetensors (#4043) afrideva 2023-11-13 17:03:40 -08:00
  • 3804bc8c5d cublas mike dupont 2023-11-13 19:47:59 -05:00
  • 4823ffc3bc Merge branch 'ggerganov:master' into convert-safetensors-fix afrideva 2023-11-13 16:46:26 -08:00
  • a05aa8a83e removing error, seems to be - mike dupont 2023-11-13 17:36:56 -05:00
  • 1dddcdb0a4 adding debug ebnf mike dupont 2023-11-13 17:22:07 -05:00
  • 1f778c1652 segfault mike dupont 2023-11-13 15:47:42 -05:00
  • 930e132c5b Let's try merging master instead of rebasing for a little change of pace KerfuffleV2 2023-11-13 13:19:29 -07:00
  • eda5614f41 gguf-py readme example fixes for keys: general.architecture and general.alignment jay-johnson 2023-11-13 17:44:30 +00:00
  • 63c950c5b7 Update ggml-quants.c Michael Potter 2023-11-13 09:08:11 -08:00
  • bd90eca237 llava : fix regression for square images in #3613 (#4056) b1513 M. Yusuf Sarıgöz 2023-11-13 18:20:52 +03:00
  • 853fe042b9 Fix gguf post merge Galunid 2023-11-13 16:07:14 +01:00
  • beb17a7d94 Merge branch 'master' into stablelm-support Galunid 2023-11-13 16:02:37 +01:00
  • 3d68f364f1 ggml : sync (im2col, GPU conv, 32-bit arm compat) (#4060) b1512 Georgi Gerganov 2023-11-13 16:55:52 +02:00
  • 9f72de7732 ggml : sync (im2col, GPU conv, 32-bit arm compat) Georgi Gerganov 2023-11-13 14:38:22 +02:00
  • c049b37d7b readme : update hot topics Georgi Gerganov 2023-11-13 14:18:08 +02:00
  • 4760e7cc0b sync : ggml (backend v2) (#3912) b1510 Georgi Gerganov 2023-11-13 14:16:23 +02:00
  • 5600bd8cbc add new gpt2 Bingxuan Wang 2023-11-13 18:23:12 +08:00
  • bb50a792ec Add ReLU and SQR CUDA ops to (partially) fix Persimmon offloading (#4041) b1509 Kerfuffle 2023-11-13 01:58:15 -07:00
  • 5925436075 llava : fix regression for square images in #3613 M. Yusuf Sarıgöz 2023-11-13 11:55:57 +03:00
  • 3fad249bf6 Persimmon loader: More helpful error on CUDA/ROCM when offloading too many layers KerfuffleV2 2023-11-13 01:34:28 -07:00
  • f4ee91abbb improved estimation Concedo 2023-11-13 15:45:13 +08:00
  • c442941031 revert convert.py help message change afrideva 2023-11-12 21:43:32 -08:00
  • 08801ac865 Merge branch 'ggerganov:master' into convert-safetensors-fix afrideva 2023-11-12 21:40:54 -08:00
  • 5b0d76f665 increase indentation per 4-spaces rule Michael Potter 2023-11-12 18:17:39 -08:00
  • 287bc68573 Update ggml-quants.c Michael Potter 2023-11-12 16:54:53 -08:00
  • 7a5e92e748 Update ggml-quants.c Michael Potter 2023-11-12 16:52:45 -08:00
  • 21fd874c8d gguf-py: gguf_writer: Use bytearray to build metadata (#4051) Kerfuffle 2023-11-12 16:39:37 -07:00
  • 2393050b6d Use bytearray instead KerfuffleV2 2023-11-12 16:14:49 -07:00
  • bf1c8d761a Update ggml-quants.c Michael Potter 2023-11-12 15:07:24 -08:00
  • 446ee3c79f gguf-py: gguf_writer: Use BytesIO to build metadata KerfuffleV2 2023-11-12 15:43:09 -07:00
  • ff7fc9885f Update convert.py "model" option help message afrideva 2023-11-12 11:34:13 -08:00
  • c70a837d93 Merge branch 'ggerganov:master' into convert-safetensors-fix afrideva 2023-11-12 11:29:15 -08:00
  • 9efc6b94f7 Merge branch 'master' into sync Georgi Gerganov 2023-11-12 16:55:31 +02:00
  • 317bca9876 Add ChatML functionality to main example Sebastian Cramond 2023-11-12 20:59:06 +10:30
  • 2659a180ee Merge branch 'master' into amx Abhilash Majumder 2023-11-12 12:06:16 +05:30
  • 532dd74e38 Fix some documentation typos/grammar mistakes (#4032) Richard Kiss 2023-11-11 22:04:58 -08:00
  • 8c9b38fd38 Update examples/parallel/README.md Richard Kiss 2023-11-11 21:59:05 -08:00
  • c7bae1e125 Check for single-file safetensors model afrideva 2023-11-11 20:20:07 -08:00
  • 4dce910cbc add safetensors to convert.py help message afrideva 2023-11-11 20:13:11 -08:00
  • be2ac38a28 Make qrot, krot contiguous Galunid 2023-11-12 04:30:17 +01:00
  • 047032d689 Duh - add llava in another place Galunid 2023-11-12 03:48:51 +01:00
  • 49e395a66c Merge branch 'ggerganov:master' into feature/save_temps Mike DuPont 2023-11-11 19:31:42 -05:00
  • 8df6fe601a Add ReLU and SQR CUDA ops to fix Persimmon offloading KerfuffleV2 2023-11-11 15:50:28 -07:00
  • c7ff2d5581 Respect add_bos_token GGUF metadata value KerfuffleV2 2023-11-11 13:16:04 -07:00
  • 0c201b9142 gguf-py: gguf-dump: Respect --no-tensor flag in JSON mode. KerfuffleV2 2023-11-11 13:15:27 -07:00
  • 5d77b73ec8 examples : Add tokenize Zakkor 2023-11-11 18:21:20 +02:00
  • dcf372e60e Update convert.py wonjun Jang 2023-11-12 03:26:46 +09:00
  • 9f4dc236a9 Update convert.py wonjun Jang 2023-11-12 03:23:41 +09:00
  • f37a7d7028 Update convert.py wonjun Jang 2023-11-12 02:22:37 +09:00
  • e86fc56f75 Fix gguf-convert-endian script (#4037) M. Yusuf Sarıgöz 2023-11-11 18:35:31 +03:00
  • b226d07d01 Bump version and upd description M. Yusuf Sarıgöz 2023-11-11 13:24:49 +03:00
  • b899e76bb4 Fix gguf-convert-endian script M. Yusuf Sarıgöz 2023-11-11 13:19:44 +03:00
  • 9e035cdab7 Add vision model support Galunid 2023-11-11 08:20:20 +01:00
  • d96ca7ded7 server : fix crash when prompt exceeds context size (#3996) b1505 Alexey Parfenov 2023-11-11 05:48:21 +00:00
  • 018b107f9b typos Richard Kiss 2023-11-11 05:37:07 +00:00
  • 8b4a28bf3f main : Call llama_log_set to use LOG_TEE Andrew Godfrey 2023-11-10 21:27:02 -08:00
  • 34b0a08207 gguf-py: Refactor and allow reading/modifying existing GGUF files (#3981) Kerfuffle 2023-11-10 22:04:50 -07:00
  • a00a32e049 fixed localflag Concedo 2023-11-11 10:21:45 +08:00
  • 027cd8c4aa Merge branch 'master' into concedo_experimental Concedo 2023-11-11 10:07:34 +08:00
  • e08e1bdc68 include opencl dll Concedo 2023-11-11 10:05:42 +08:00
  • 4a4fd3eefa server : allow continue edit on completion mode (#3950) b1503 Jhen-Jie Hong 2023-11-11 06:49:33 +08:00
  • ee76500835 update mike dupont 2023-11-10 16:46:24 -05:00
  • 4814b4bbcd Promote add_X_token to GGUF metadata for BOS and EOS KerfuffleV2 2023-11-10 14:12:55 -07:00
  • 2927cca611 Fix bug where POST /infill doesn't work without prompt argument Jonas Templestein 2023-11-10 21:26:48 +01:00
  • f22b2f2045 cleanup Jared Van Bortel 2023-11-10 14:46:57 -05:00
  • 4b9a685f10 rename file comments to welcome Concedo 2023-11-11 01:15:35 +08:00
  • e571faeb6b Make hipBLAS CMake more similar to cuBLAS ardfork 2023-11-10 15:21:58 +00:00
  • f034effa22 server: fix core dump when input prompt larger than prompt context (n_ctx) ydlme 2023-11-10 22:53:25 +08:00
  • 147ab5eb0c restore flushing stdout Bartosz Podkanowicz 2023-11-10 15:39:26 +01:00
  • a6e6b8b96b Merge branch 'master' into concedo_experimental Concedo 2023-11-10 22:27:11 +08:00
  • 36e860e94d updated docs Concedo 2023-11-10 22:25:11 +08:00
  • df9d1293de Unbreak persimmon after #3837 (#4010) b1502 Galunid 2023-11-10 14:24:54 +01:00
  • 9ce51b69b0 gguf-py: SpecialVocab: Always try available sources for special token ids KerfuffleV2 2023-11-10 05:50:45 -07:00
  • 960f912a14 convert.py: We can't currently support Q8_0 on big endian. KerfuffleV2 2023-11-10 05:50:15 -07:00
  • 0b0e726b2d And include scripts/__init__.py, derp KerfuffleV2 2023-11-10 00:55:15 -07:00
  • eff662d66e Set up gguf- scripts in pyproject.toml KerfuffleV2 2023-11-10 00:53:23 -07:00
  • 4a130ee11c added support for filecomments Concedo 2023-11-10 14:12:06 +08:00
  • e87d709446 Cleanup for review Galunid 2023-11-10 06:53:28 +01:00
  • be92cfa125 added preloadstory Concedo 2023-11-10 13:05:22 +08:00
  • a371a8b611 Use ggml_view_3d Galunid 2023-11-10 05:54:46 +01:00
  • a21e9e7126 fix python 3.8 compat Jared Van Bortel 2023-11-09 21:09:24 -05:00
  • 795dc0f048 constants : remove unneeded type annotations Jared Van Bortel 2023-11-09 21:03:05 -05:00
  • 5608cd8d89 cleanup Jared Van Bortel 2023-11-09 20:59:59 -05:00
  • 7d3580d5b1 Murder accidental tuple in gguf-py/scripts/gguf-dump.py Kerfuffle 2023-11-09 17:50:11 -07:00
  • d0445a2eff better documentation llama-metadata slaren 2023-11-10 01:38:20 +01:00
  • bfcbb5bc32 format -> std::to_string slaren 2023-11-10 01:26:12 +01:00
  • 382f9751fd A few for gguf-dump.py cleanups KerfuffleV2 2023-11-09 17:08:44 -07:00
  • bd241db879 Add JSON dumping support to gguf-dump.py KerfuffleV2 2023-11-09 16:56:27 -07:00
  • a04f0487b0 Make GGUFReader endian detection less arbitrary KerfuffleV2 2023-11-09 16:55:58 -07:00
  • 07352f4950 llama : add functions to get the model's metadata slaren 2023-11-10 00:49:16 +01:00
  • 811c26a1c8 Suggest setting AMDGPU_TARGETS ardfork 2023-11-09 18:24:33 +00:00
  • 52bdc7e946 Reorganize scripts KerfuffleV2 2023-11-09 14:52:44 -07:00
  • 4e23f8a81b added lto Chad Brewbaker 2023-11-09 15:03:58 -06:00
  • 3f8e444d0d fix error in the formula - formula now is similar to formula in the paper. Bartosz Podkanowicz 2023-11-09 21:21:35 +01:00
  • 1cf0b09273 erase flushing stderr, changes in spaces Bartosz Podkanowicz 2023-11-09 21:02:59 +01:00
  • 5738b2f3b6 gguf-py : bump minor version Jared Van Bortel 2023-11-09 12:28:28 -05:00
  • 233cb0741f cleanup Jared Van Bortel 2023-11-09 12:11:41 -05:00
  • 2a2c518125 Unbreak persimmon after #3837 Galunid 2023-11-09 16:37:44 +01:00
  • bca0962575 Add convert-gguf-endian.py script KerfuffleV2 2023-11-09 08:35:35 -07:00