Commit graph

  • 79f34abddb ggml : add RISC-V Vector Support for K-Quants and improved the existing intrinsics (#3453) b1317 Tameem 2023-10-03 23:38:19 +05:00
  • 8186242b6d main : consistent prefix/suffix coloring (#3425) b1316 h-h-h-h 2023-10-03 20:16:15 +02:00
  • ac2219fef3 llama : fix session saving/loading (#3400) b1315 Georgi Gerganov 2023-10-03 21:04:01 +03:00
  • 5418932b71 llama : fix comments for llama_kv_cache API fix-sessions Georgi Gerganov 2023-10-03 21:01:45 +03:00
  • e9bcf66a5c per-layer KV slaren 2023-10-03 17:49:36 +02:00
  • bf8c4dfde8 Merge branch 'Update-load-parallel-prompt-file' of https://github.com/pudepiedj/llama.cpp into Update-load-parallel-prompt-file pudepiedj 2023-10-03 18:13:24 +01:00
  • fc1ba35b09 Merge remote-tracking branch 'origin/load-parallel-prompt-file' into Update-load-parallel-prompt-file with requested changes pudepiedj 2023-10-03 18:12:21 +01:00
  • 48be797ffb llama : expose model's rope_freq_scale in the API (#3418) b1314 Alex Klinkhamer 2023-10-03 10:09:28 -07:00
  • f56e1baec3 metal : alibi for arbitrary number of heads (#3426) Jiahao Li 2023-10-04 00:55:21 +08:00
  • 017efe899d cmake : make LLAMA_NATIVE flag actually use the instructions supported by the processor (#3273) b1312 Eve 2023-10-03 16:53:15 +00:00
  • d37ed4750f Add llava inference code, but it's buggy. debugging M. Yusuf Sarıgöz 2023-10-03 19:49:45 +03:00
  • ea726fcffa cleanup threaded horde submit Concedo 2023-10-04 00:34:26 +08:00
  • c249f7dbc5 Merge branch 'master' into concedo_experimental Concedo 2023-10-03 23:51:30 +08:00
  • 0cc740115d updated lite, improve horde worker (+1 squashed commits) Concedo 2023-10-03 23:43:13 +08:00
  • 337120cc0d llama : fix handling of "future" tokens when loading sessions Georgi Gerganov 2023-10-03 18:29:22 +03:00
  • af2fbb82e1 Merge branch 'ggerganov:master' into Update-load-parallel-prompt-file pudepiedj 2023-10-03 16:03:05 +01:00
  • ce10861214 Merge branch 'ggerganov:master' into load-parallel-prompt-file pudepiedj 2023-10-03 16:02:44 +01:00
  • ee8e2b2604 Workaround for #3454 goerch 2023-10-03 16:23:07 +02:00
  • ae8ccdc1be Remove old tkinter gui (+1 squashed commits) Concedo 2023-10-03 17:49:07 +08:00
  • b343833720 Final revision pudepiedj 2023-10-03 14:45:31 +01:00
  • f6883a7809 Added RVV intrinsics support for k_quants Ahmad Tameem 2023-10-02 13:49:30 +05:00
  • d10470a1e3 Breaking Change: Remove deprecated commands Concedo 2023-10-03 17:16:09 +08:00
  • 2e3dad3a9c Interim commit pudepiedj 2023-10-03 10:10:00 +01:00
  • 51196a44dc Interim commit pudepiedj 2023-10-03 09:46:53 +01:00
  • 39bd512dd1 Fix q6_k dequant shader for AMD 0cc4m 2023-10-03 09:31:54 +02:00
  • ff5a3f0c09 Work on the BPE tokenizer (#3252) b1311 goerch 2023-10-03 09:16:26 +02:00
  • 3e518e255b Apply @jploski 's fix for missing tokens goerch 2023-10-03 08:34:27 +02:00
  • 7e9120f7b1 LLaVA image encoder is working. will combine with llama M. Yusuf Sarıgöz 2023-10-03 01:20:07 +03:00
  • 1c84003c08 convert : fix vocab size when not defined in hparams (#3421) cebtenzzre 2023-10-02 18:07:24 -04:00
  • 7a279fe5a8 Remove old script Phillip Kravtsov 2023-10-02 14:25:41 -07:00
  • 5a0990c1c3 Merge branch 'master' of github.com:ggerganov/llama.cpp into phillip-kravtsov/support-adept-persimmon-8b Phillip Kravtsov 2023-10-02 14:00:14 -07:00
  • e293ebd68e Merge branch 'ggerganov:master' into load-parallel-prompt-file pudepiedj 2023-10-02 21:14:15 +01:00
  • 178d0dd78b No --in-prefix coloring h-h-h-h 2023-10-01 11:58:36 +02:00
  • 470801292d mpt : remove tabs, trailing whitespace Jan Ploski 2023-10-02 21:55:22 +02:00
  • e78f0b0d05 cmake : increase minimum version for add_link_options (#3444) b1309 cebtenzzre 2023-10-02 15:38:43 -04:00
  • 66f2063da2 fixed floating point comparison issues l3utterfly 2023-10-03 03:33:52 +08:00
  • 665018c749 CLBlast: Add broadcast support for matrix multiplication (#3402) b1308 shibe2 2023-10-02 23:26:15 +04:00
  • 7d14b33981 fixed bug where kv_self.size is being set wrongly to the buffer size instead of the context size l3utterfly 2023-10-03 03:23:42 +08:00
  • 29a404a951 gguf : add BERT, MPT, and GPT-J arch info (#3408) cebtenzzre 2023-10-02 15:20:28 -04:00
  • 89fa828b23 Merge branch 'master' of https://github.com/ggerganov/llama.cpp into add-gguf-architectures Cebtenzzre 2023-10-02 15:17:39 -04:00
  • 8562be9f30 cmake : increase minimum version for add_link_options Cebtenzzre 2023-10-02 15:10:17 -04:00
  • 0fe321031a gguf : general usability improvements (#3409) gguf-v0.4.0 cebtenzzre 2023-10-02 14:58:46 -04:00
  • bd890ec29e gguf : fix typos Cebtenzzre 2023-10-02 14:57:08 -04:00
  • 64607e409b gguf : bump version Cebtenzzre 2023-10-02 14:52:21 -04:00
  • 0f0e7c6480 rm scratch buf for now, will revert after cleanup M. Yusuf Sarıgöz 2023-10-02 21:38:32 +03:00
  • 422b110841 Minor changes to conversion script Phillip Kravtsov 2023-10-02 10:56:31 -07:00
  • 90e7d6de28 mpt : fixed comment s/gptneox/mpt/ Jan Ploski 2023-10-02 19:55:59 +02:00
  • cd4d3df820 Formatting changes Phillip Kravtsov 2023-10-02 10:26:39 -07:00
  • e6bf87f785 Small changes from review Phillip Kravtsov 2023-10-02 10:21:16 -07:00
  • 0f332a9104 llama : temp fix for clearing "future" tokens from the KV cache Georgi Gerganov 2023-10-02 16:42:14 +03:00
  • 6a9fe3dfac Merge branch 'master' into fix-sessions Georgi Gerganov 2023-10-02 16:36:58 +03:00
  • 9476b01226 cmake : make CUDA flags more similar to the Makefile (#3420) b1305 cebtenzzre 2023-10-02 09:16:50 -04:00
  • a03ce38455 finetune : fix #3404 (#3437) b1304 xaedes 2023-10-02 15:15:45 +02:00
  • f80994b09c fix #3404 xaedes 2023-10-02 14:44:21 +02:00
  • d673691619 Move ParallelQuestions to /proimpts and rename pudepiedj 2023-10-02 13:15:15 +01:00
  • 5d3e142145 use_default_badwordsids defaults to false if the parameter is missing Concedo 2023-10-02 19:41:07 +08:00
  • 2fd71e27d8 Merge branch 'load-parallel-prompt-file' of https://github.com/pudepiedj/llama.cpp into load-parallel-prompt-file pudepiedj 2023-10-02 12:35:43 +01:00
  • 3e41cbabd1 Experiments with jeopardy pudepiedj 2023-10-02 12:33:05 +01:00
  • 3c2d677abd Merge branch 'ggerganov:master' into load-parallel-prompt-file pudepiedj 2023-10-02 12:30:24 +01:00
  • 59aa1acfe9 WIP: start implementing LLaVA M. Yusuf Sarıgöz 2023-10-02 14:12:35 +03:00
  • bb941fcee8 llama : expose model's rope_freq_scale in the API grencez 2023-10-02 04:01:53 -07:00
  • 5aee498d97 Fix coding style goerch 2023-10-02 13:01:46 +02:00
  • a847676984 metal : set log callback before initializing (#3427) b1303 Adrian 2023-10-02 03:49:59 -07:00
  • 1bc01cbcd4 update images (+3 squashed commit) Concedo 2023-10-02 17:36:45 +08:00
  • 095231dfd3 cmake : fix transient definitions in find pkg (#3411) b1302 bandoti 2023-10-02 06:51:49 -03:00
  • 3d162cc8ad Ported Starcoder and added some assertions goerch 2023-10-02 11:14:08 +02:00
  • 5b4cef5a60 archived old unused file Concedo 2023-10-02 16:57:20 +08:00
  • ea55295a74 docker : ignore Git files (#3314) Kevin Ji 2023-10-02 04:53:53 -04:00
  • dd13a1bf2a Added RVV intrinsics support for Q8 quantize row and also improved the existing dot product function for risc-v. Ahmad Tameem 2023-10-02 13:46:56 +05:00
  • 0613562412 check whether platform is 390x if yes->do not import immintrin.h chenqiny 2023-10-02 04:28:22 -04:00
  • c97f01c362 infill : add new example + extend server API (#3296) b1300 vvhg1 2023-10-02 09:42:02 +02:00
  • 02b9ccfd60 Update unicode.h goerch 2023-10-02 09:19:28 +02:00
  • dccd1db48e Update unicode.h goerch 2023-10-02 09:19:06 +02:00
  • a9a2af93ed Update llama.cpp goerch 2023-10-02 09:18:51 +02:00
  • 28778f8ad3 Add scores and token types back, adapt gptneox goerch 2023-10-02 08:15:50 +02:00
  • 23b9d3af49 force oai endpoints to return json Concedo 2023-10-02 12:45:14 +08:00
  • 0c47e79537 updated the API routing path and fixed a bug with threads Concedo 2023-10-02 11:05:19 +08:00
  • 38b01ba136 remove out-commented code xaedes 2023-10-02 02:53:04 +02:00
  • cc5e2eeb49 remove trailing whitespace xaedes 2023-10-01 23:51:32 +02:00
  • 02fbbf9099 fix typo Cebtenzzre 2023-10-01 17:04:01 -04:00
  • f18cfeab62 s/else if/elif/ Cebtenzzre 2023-10-01 17:02:39 -04:00
  • 6e81bc5f8b add multiple functions, decision to send message or use function_call and include sent function results in chat prompt xaedes 2023-10-01 22:41:54 +02:00
  • 0673ed89dd Set metal log callback before initializing metal. Adrian Smith 2023-10-01 12:15:53 -07:00
  • 2117e23f58 Fix initialization of static maps goerch 2023-10-01 20:46:06 +02:00
  • f632a3c15e attempt enabling metal, fails Bailey Chittle 2023-10-01 10:52:55 -07:00
  • c0d710d132 [metal] alibi for arbitrary number of heads lijiahao 2023-10-02 00:48:10 +08:00
  • dffc6bee74 deprecate some launcher arguments. Concedo 2023-10-01 22:30:48 +08:00
  • b49a5bc546 formatting of text Concedo 2023-10-01 18:38:32 +08:00
  • fd6b6b2426 Typo h-h-h-h 2023-10-01 11:06:05 +02:00
  • 3816e93072 Disable some flags in Apple x86_64 Wu Zhenyu 2023-10-01 16:52:55 +08:00
  • 37af613dfc Remove unused code goerch 2023-10-01 10:44:08 +02:00
  • bc841ec302 flag to retain grammar, fix makefile (+2 squashed commit) Concedo 2023-10-01 11:46:50 +08:00
  • b6ff08a291 fix accessing None function_call xaedes 2023-10-01 06:54:55 +02:00
  • 509b4112fa cmake : fix MSVC build Cebtenzzre 2023-10-01 00:12:03 -04:00
  • 9c621afad7 convert : fix vocab size when not defined in hparams Cebtenzzre 2023-09-30 22:58:02 -04:00
  • e950411b8b add basic support for function calls xaedes 2023-10-01 04:42:28 +02:00
  • 7ab01ee3c6 Merge branch 'master' into concedo_experimental Concedo 2023-10-01 10:22:05 +08:00
  • 2fc00fac8c fixed makefile Concedo 2023-10-01 10:17:23 +08:00
  • bc70f21c3a remove -DLLAMA_MPI=ON Eve 2023-10-01 02:12:02 +00:00
  • 757a51e656 cmake : make CUDA flags more similar to the Makefile Cebtenzzre 2023-09-30 21:32:43 -04:00