Commit graph

  • 0e1010b67d fix M. Yusuf Sarıgöz 2023-10-07 21:12:28 +03:00
  • 9ccbb2770b Bump version gguf-v0.4.1 M. Yusuf Sarıgöz 2023-10-07 20:51:47 +03:00
  • 68017ef43a Fix CI for publishing GGUF package M. Yusuf Sarıgöz 2023-10-07 20:48:00 +03:00
  • 5d7a3a5c0d allow forcing ext_factor to zero if scaling type is YaRN Cebtenzzre 2023-10-07 13:20:33 -04:00
  • 4f4e94804d improve printing of YaRN parameters Cebtenzzre 2023-10-07 12:59:27 -04:00
  • 746641574a gguf : store scaling type as a string instead of an int Cebtenzzre 2023-10-07 12:57:55 -04:00
  • 9ee8aeccd7 fixed memory leak by freeing temporary graph during session load l3utterfly 2023-10-08 00:36:20 +08:00
  • 88a14fcfef fixed logic error, should not allocate more memory, and instead update existing parts l3utterfly 2023-10-08 00:33:29 +08:00
  • 545b03491c minor Georgi Gerganov 2023-10-07 19:20:40 +03:00
  • e46708eedc updated lite Concedo 2023-10-07 23:33:54 +08:00
  • cd0022f70f added GGML_ALLOCATOR_DEBUG code back l3utterfly 2023-10-07 22:55:16 +08:00
  • 4c3a7cdd46 fixed bad memory access exception on ios 17 l3utterfly 2023-10-07 22:34:33 +08:00
  • 678f31f2fd Merge branch 'master' into concedo_experimental Concedo 2023-10-07 22:00:09 +08:00
  • ef5fae8b46 py: fix 'gguf' has no attribute 'TENSOR_NAMES' #3496 slashapp 2023-10-07 21:53:20 +08:00
  • ca4a8c5dc8 updated lite Concedo 2023-10-07 21:50:24 +08:00
  • 8f6ad68427 metal : indentations Georgi Gerganov 2023-10-07 16:16:23 +03:00
  • c60022488a metal : rename kernels mul_mat_ to mul_mv_ Georgi Gerganov 2023-10-07 15:06:22 +03:00
  • 4a214689e7 Merge branch 'master' of github.com:ggerganov/llama.cpp vvhg1 2023-10-07 13:01:26 +02:00
  • b1b6beff2b rm unnecessary bool check vvhg1 2023-10-07 12:55:59 +02:00
  • c3a7f848f2 fix interactive prompt escaping and fix server infill leading space handling vvhg1 2023-10-07 12:07:07 +02:00
  • 99ed03a24a metal : improve decoding speed for batches of 2-16 Georgi Gerganov 2023-10-07 12:59:24 +03:00
  • c47066d833 py : change version of numpy requirement to 1.24.4 (#3515) Tom C 2023-10-07 02:56:15 -07:00
  • f1782c68de quantize : fail fast on write errors (#3521) b1342 cebtenzzre 2023-10-07 04:41:52 -04:00
  • c26765a0a1 metal : support default.metallib load & reuse code for swift package (#3522) b1341 Jhen-Jie Hong 2023-10-07 03:40:27 -05:00
  • 42833bc7a8 ggml : silu(-inf) should never happen Georgi Gerganov 2023-10-07 11:30:36 +03:00
  • bdbe11719d refact : fix convert script + zero out KV cache to avoid nans Georgi Gerganov 2023-10-07 11:18:04 +03:00
  • fc01dc0ca4 Merge branch 'master' of github.com:ggerganov/llama.cpp vvhg1 2023-10-07 09:59:44 +02:00
  • 003c15bfc5 Revert "only rm when params.escape, rm space if possible which is added back or rm added space token" vvhg1 2023-10-07 09:22:50 +02:00
  • 63ba0b621f only rm when params.escape, rm space if possible which is added back or rm added space token vvhg1 2023-10-07 09:22:36 +02:00
  • 0e797c2fc5 llm : support Adept Persimmon 8B (#3410) b1340 Phillip Kravtsov 2023-10-07 00:12:43 -07:00
  • 0526560759 only rm when params.escape, rm space if possible which is added back or rm added space token vvhg1 2023-10-07 09:08:30 +02:00
  • 3a716b4dae Fix for #3454 (#3455) b1339 goerch 2023-10-07 06:57:01 +02:00
  • f134d64bbd metal : use SWIFT_PACKAGE def instead of define GGML_SWIFT Jhen 2023-10-07 11:20:11 +08:00
  • 87a35470e0 quantize : fail fast on write errors Cebtenzzre 2023-10-06 23:11:49 -04:00
  • 1b88cbadd5 metal : support load default.metallib & reuse code for swift package Jhen 2023-10-07 10:52:05 +08:00
  • 6b282271b1 Merge branch 'master' into concedo_experimental Concedo 2023-10-07 10:24:34 +08:00
  • 07a114de63 force debugmode to be indicated on horde, allow 64k context for gguf Concedo 2023-10-07 10:23:33 +08:00
  • a8435c3e32 improved token gen logic and limits FSSRepo 2023-10-06 18:22:07 -04:00
  • 0d70518220 Update contextual help pudepiedj 2023-10-06 22:19:29 +01:00
  • f53d6681e3 Change version of numpy requirement to 1.24.4. (https://github.com/ggerganov/llama.cpp/issues/3511) Lyjia 2023-10-06 13:44:05 -07:00
  • b4046aabbf removing any leading whitespace from infill suffix and removing leeading space token from suffix when params.escape vvhg1 2023-10-06 21:53:24 +02:00
  • 485a471e93 add overlooked offload code ggml-ci Phillip Kravtsov 2023-10-06 12:39:27 -07:00
  • 0c1a8f67a5 Merge branch 'master' of github.com:ggerganov/llama.cpp into phillip-kravtsov/support-adept-persimmon-8b Phillip Kravtsov 2023-10-06 12:39:15 -07:00
  • 1faaae8c2b readme : update models, cuda + ppl instructions (#3510) BarfingLemurs 2023-10-06 15:13:36 -04:00
  • cb13d73a72 server : docs fix default values and add n_probs (#3506) Mihai 2023-10-06 21:39:33 +03:00
  • 377be2f39d removing any leading whitespace from infill suffix and removing leeading space token from suffix when params.escape vvhg1 2023-10-06 20:34:04 +02:00
  • c1ac53fbdb improve README + more questions FSSRepo 2023-10-06 14:18:03 -04:00
  • 0fe23121dd Update README.md BarfingLemurs 2023-10-06 14:08:36 -04:00
  • d8f7a7077a Merge branch 'master' into concedo_experimental Concedo 2023-10-07 01:36:14 +08:00
  • 120695ddf7 add update link Concedo 2023-10-07 01:33:18 +08:00
  • 6796e7450c serverinfill tokens correction vvhg1 2023-10-06 18:35:50 +02:00
  • 8bd24b2e5c Merge branch 'ggerganov:master' into master vvhg1 2023-10-06 18:28:10 +02:00
  • dfeda32abd infill tokens correction vvhg1 2023-10-06 18:26:18 +02:00
  • 9ca79d5cbb kv cache slot search improvements (#3493) b1336 Kerfuffle 2023-10-06 10:10:13 -06:00
  • 2fdc181dcb example added to makefile FSSRepo 2023-10-06 11:46:51 -04:00
  • 9db21757ef update docs Concedo 2023-10-06 23:40:21 +08:00
  • 777d4a7d0e Server docs: fix default values and add n_probs Mihai 2023-10-06 18:38:56 +03:00
  • 2a36c85558 abort has multiuser support via genkey too Concedo 2023-10-06 23:27:00 +08:00
  • 6a5d6733fc log sys - build info + rnd seed FSSRepo 2023-10-06 11:25:58 -04:00
  • 84eeecb889 updated lite Concedo 2023-10-06 23:15:11 +08:00
  • f0c646f023 fix makefile server build FSSRepo 2023-10-06 10:31:14 -04:00
  • cdceda30c9 added cors middleware FSSRepo 2023-10-06 10:02:37 -04:00
  • c71d933d5b ci: wrong indent style fixed FSSRepo 2023-10-06 09:53:36 -04:00
  • 7a4dcff667 Update contextual help dev pudepiedj 2023-10-06 14:50:17 +01:00
  • f4f9367faa less code duplication, offload k and v separately slaren 2023-10-06 15:44:06 +02:00
  • c12e18f2f1 httplib.h json.hpp -> common lib FSSRepo 2023-10-06 09:40:08 -04:00
  • 0c731ca403 prompts : fix editorconfig checks after #3416 Georgi Gerganov 2023-10-06 16:35:55 +03:00
  • 465b8f4fc0 Ensure kv cache head points to a valid slot in llama_decode internal KerfuffleV2 2023-10-06 07:33:10 -06:00
  • a8777ad84e parallel : add option to load external prompt file (#3416) b1334 pudepiedj 2023-10-06 14:16:38 +01:00
  • 4bded6e23c Update examples/parallel/parallel.cpp Georgi Gerganov 2023-10-06 16:15:57 +03:00
  • defffb6055 Merge branch 'ggerganov:master' into load-parallel-prompt-file pudepiedj 2023-10-06 14:04:16 +01:00
  • bb093eb295 Merge pull request #4 from ggerganov/server-parallel Steward Garcia 2023-10-06 08:54:35 -04:00
  • 97af49fa39 server : reuse llama_sample_token common util (#3494) b1333 Jhen-Jie Hong 2023-10-06 07:44:24 -05:00
  • 3144563db1 Use n_ctx in kv find slot for consistency KerfuffleV2 2023-10-06 06:23:49 -06:00
  • 5ab6c2132a server-parallel : add "--reverse-prompt" + compiler warning fixes server-parallel Georgi Gerganov 2023-10-06 14:32:19 +03:00
  • fbc5582db2 fix comments about k-quant block sizing Johannes Rudolph 2023-10-06 12:50:53 +02:00
  • 16820a5a0d llama : correct hparams comparison (#3446) b1332 l3utterfly 2023-10-06 18:47:59 +08:00
  • 1d1232ffbc show horde job count Concedo 2023-10-06 18:42:59 +08:00
  • 04b2f4386e ci : fix xcodebuild destinations (#3491) b1331 Jhen-Jie Hong 2023-10-06 05:36:43 -05:00
  • b5cd935cdb Merge branch 'master' into concedo_experimental Concedo 2023-10-06 17:58:08 +08:00
  • 84b43bb718 Merge branch 'load-parallel-prompt-file' of https://github.com/pudepiedj/llama.cpp into load-parallel-prompt-file pudepiedj 2023-10-06 09:54:38 +01:00
  • 8b7d88afff Reinstate original jeopardy.sh pudepiedj 2023-10-06 09:54:32 +01:00
  • 739d6d3022 Automatic helper dev pudepiedj 2023-10-06 09:52:33 +01:00
  • 1c4c8cd801 Merge branch 'ggerganov:master' into load-parallel-prompt-file pudepiedj 2023-10-06 09:51:26 +01:00
  • 9d2a25b12b updated lite, fixed fancy quotes Concedo 2023-10-06 15:44:37 +08:00
  • efd0567f10 Merge branch 'concedo' into concedo_experimental Concedo 2023-10-06 11:22:01 +08:00
  • b8f0576c7b updated docs Concedo 2023-10-06 11:19:04 +08:00
  • 9d0dd7ab11 avoid leaving a zombie process for --onready (#462) grawity 2023-10-06 06:06:37 +03:00
  • 090383b21b bug fixes, but now has an invalid memory access :( Bailey Chittle 2023-10-05 17:53:42 -07:00
  • ae6beb4696 initial conversion to new format, utf8 errors? Bailey Chittle 2023-10-05 17:03:49 -07:00
  • 5e97a60ded Merge branch 'master' into swiftui_metal Bailey Chittle 2023-10-05 16:17:46 -07:00
  • 00006da253 common : use n_probs for temperature sampling jhen 2023-10-06 07:11:18 +08:00
  • 898816a34a server : reuse llama_sample_token common function jhen 2023-10-06 06:18:01 +08:00
  • abafd01ec8 kv cache slot search improvements KerfuffleV2 2023-10-05 16:06:55 -06:00
  • f9cd6dc171 ci : add .swift to paths Jhen 2023-10-06 04:14:06 +08:00
  • bb22f2e882 ci : fix xcodebuild destinations Jhen 2023-10-06 03:49:30 +08:00
  • 1d518d65d3 Fix build Phillip Kravtsov 2023-10-05 12:24:06 -07:00
  • afc09db51c fix json format README FSSRepo 2023-10-05 15:23:58 -04:00
  • eb75395b5c remove trail whitespace FSSRepo 2023-10-05 15:18:47 -04:00
  • a7a6ceb7ae server handling multiple clients with cam FSSRepo 2023-10-05 15:12:39 -04:00