Commit graph

  • 9035cfcd68 Merge 'origin/master' into hipblas Henri Vasserman 2023-08-25 10:38:48 +03:00
  • 87e3733f24 gguf : make gguf pip-installable M. Yusuf Sarıgöz 2023-08-25 09:26:05 +03:00
  • 0248ca811e gguf : add notes for tests gguf-pip M. Yusuf Sarıgöz 2023-08-25 09:08:05 +03:00
  • 2897926d90 gguf : update readme with build notes M. Yusuf Sarıgöz 2023-08-25 09:06:33 +03:00
  • 8798aea247 gguf : update readme with build notes M. Yusuf Sarıgöz 2023-08-25 09:02:36 +03:00
  • b91ad7f461 ggml-alloc : enlarge size of parse_seq (#2776) b1057 Shouzheng Liu 2023-08-25 01:58:00 -04:00
  • 3e98cbe76e Merge branch 'master' into gguf-pip M. Yusuf Sarıgöz 2023-08-25 08:50:40 +03:00
  • 87338093d6 requirements : add gguf M. Yusuf Sarıgöz 2023-08-25 08:47:19 +03:00
  • b8a777e77b Merge branch 'master' into gguf-pip M. Yusuf Sarıgöz 2023-08-25 08:38:11 +03:00
  • 54e81bac6f Merge branch 'ggerganov:master' into betterlogs staviq 2023-08-25 06:08:45 +02:00
  • 181e8a9902 main: replaced fprintf/LOG_TEE, some trace logging staviq 2023-08-25 05:22:23 +02:00
  • 360b36c921 log tostring helpers, token vectors pretty prints staviq 2023-08-25 05:20:47 +02:00
  • c199936302 ggml-alloc: enlarge size of parse_seq lshzh-ww 2023-08-24 21:49:32 -04:00
  • 8f86eb9762 revert unnecessary change Jhen 2023-08-25 08:55:52 +08:00
  • 607758aed4 trim last new lines on stop Jhen 2023-08-25 08:54:55 +08:00
  • 8cec4409c6 skip -1 tok in loop to avoid send '' on end Jhen 2023-08-25 08:54:44 +08:00
  • 8209b5d6a2 revert llama_eval, create main example netrunnereve 2023-08-24 20:26:19 -04:00
  • 343be7fa14 use <br /> for new line Jhen 2023-08-25 06:47:29 +08:00
  • aa896e790b Rewrite for clarity Nigel Bosch 2023-08-24 17:22:53 -05:00
  • db29d68db5 Merge branch 'master' into server-probs Jhen 2023-08-25 06:20:32 +08:00
  • 2e5f70a25f Added enum to llama_token_get_type return type (#2774) b1056 Marcus Dunn 2023-08-24 14:49:30 -07:00
  • d9de89d8ab Added enum to llama_token_get_type return type Marcus Dunn 2023-08-24 14:37:06 -07:00
  • c3ff062b1e Merge https://github.com/ggerganov/llama.cpp into metal Ravindra Marella 2023-08-25 03:03:49 +05:30
  • 8ac33ce0ff Save rope scale only for linear scaling Nigel Bosch 2023-08-24 16:25:06 -05:00
  • 252647cf55 ggml.c : undefine GGML_GELU_FP16 klosax 2023-08-24 23:02:01 +02:00
  • 06f792597a convert.py : add freq_base when converting CodeLlama from an HF model slaren 2023-08-24 22:13:18 +02:00
  • 8d0d83b391 Get rope scale from HF models Nigel Bosch 2023-08-24 15:09:22 -05:00
  • 53755ed841 llama-bench : add model sizes slaren 2023-08-24 22:03:02 +02:00
  • d0f77b1353 convert.py : try to determine n_ctx automatically for CodeLlama (#2770) slaren 2023-08-24 21:10:39 +02:00
  • 7594540380 convert.py : try to determine n_ctx automatically for CodeLlama slaren 2023-08-24 20:47:13 +02:00
  • 45f1a43d86 mv include to common, params, help msg staviq 2023-08-24 20:44:47 +02:00
  • 8ee186c0aa Massive speed improvement thanks to Cebtenzzre KerfuffleV2 2023-08-24 12:13:00 -06:00
  • 0d3094f0c7 gguf : add rope_freq_base parameter for CodeLlama (#2769) b1054 slaren 2023-08-24 20:04:05 +02:00
  • 21dcd944be gguf : add rope_freq_base parameter for CodeLlama slaren 2023-08-24 19:54:02 +02:00
  • 3efcbb8f59 Add --concurrency option KerfuffleV2 2023-08-23 19:36:55 -06:00
  • f54daa0735 Allow convert.py to convert to q8_0 KerfuffleV2 2023-08-23 15:42:19 -06:00
  • 8054c32260 Merge branch 'master' into betterlogs staviq 2023-08-24 19:52:31 +02:00
  • f5080da474 update .gitignore staviq 2023-08-24 19:47:18 +02:00
  • 19f7b46f98 Merge https://github.com/ggerganov/llama.cpp into metal Ravindra Marella 2023-08-24 23:05:48 +05:30
  • f51c5d7620 implement llama model file saving using gguf xaedes 2023-08-24 19:25:39 +02:00
  • 01f2224682 falcon : write file type Georgi Gerganov 2023-08-24 19:58:30 +03:00
  • 38b16dfca6 metal : bug-fix when enable ggml-alloc (#2757) b1052 Shouzheng Liu 2023-08-24 12:27:25 -04:00
  • 8f8c28e89c convert : auto-determine model name based on dir + scripts update Georgi Gerganov 2023-08-24 19:26:19 +03:00
  • 7694adda8d Fix for main example getting stuck when -n -2 and --interactive (#2767) b1050 Kerfuffle 2023-08-24 10:11:13 -06:00
  • fea95c682d fix convert.py for codellama, add llama 34B to the list of recognized models (#2768) b1049 slaren 2023-08-24 17:44:11 +02:00
  • f06caa3723 fix convert.py for codellama, add llama 34B to the list of recognized models slaren 2023-08-24 17:30:24 +02:00
  • c2f1790be9 Add a comment so future generations may suffer less. KerfuffleV2 2023-08-24 09:21:39 -06:00
  • e979cef2ff Attempted fix for main example getting stuck when -n -2 and --interactive KerfuffleV2 2023-08-24 09:09:49 -06:00
  • 4c4e4358ed fixed linux build error Concedo 2023-08-24 22:12:56 +08:00
  • ef955fbd23 Tag release with build number (#2732) b1048 DannyDaemonic 2023-08-24 06:58:02 -07:00
  • 4072f20bba add missing lctx argument to get_example_targets_batch xaedes 2023-08-24 15:49:34 +02:00
  • 0c52c65d7f Merge branch 'master' into pr-train-mem-usage-improvements xaedes 2023-08-24 15:46:52 +02:00
  • d67777c202 metal : add Q8_0 support (#2763) Georgi Gerganov 2023-08-24 16:19:57 +03:00
  • 661bede62f optimize tokenize method Concedo 2023-08-24 21:16:16 +08:00
  • 84e8da665d ggml.c : use ggml_float for gelu klosax 2023-08-24 15:13:18 +02:00
  • 1202e06c6f metal : add Q8_0 mul_mm kernel Georgi Gerganov 2023-08-24 15:42:29 +03:00
  • b95a4ccb22 added a token counting endpoint, set mmq as default Concedo 2023-08-24 20:41:49 +08:00
  • 61c8259a88 metal : add mul_mat_q8_0_f32 kernel Georgi Gerganov 2023-08-24 15:32:27 +03:00
  • 797312e758 ggml.c : use double precision for tanh klosax 2023-08-24 14:18:57 +02:00
  • 7ec7ef94a9 skip-unused: disable skipping on ROCm / when LLAMA_USE_HIPBLAS ochafik 2023-08-23 21:36:56 +01:00
  • 2cf4f62e12 Skip computation of unused logits during batch prompt eval (drop other batch positions after writing their kv to cache) ochafik 2023-08-18 01:46:20 +01:00
  • d30cb53c9d metal : use metal_printf for debug logging Ravindra Marella 2023-08-24 16:50:05 +05:30
  • 8e2b5abaa4 Fix SAFE_NAME Danny Daemonic 2023-08-24 04:04:53 -07:00
  • 238335f54f fix -nommq help for non CUDA/HIP Henri Vasserman 2023-08-24 14:03:31 +03:00
  • 81ecaa4b6c fix llama-bench Henri Vasserman 2023-08-24 13:52:51 +03:00
  • a9efc5e417 Prefix the build number with b Danny Daemonic 2023-08-24 03:50:49 -07:00
  • a60231f786 Add Dockerfiles Henri Vasserman 2023-08-24 13:45:05 +03:00
  • 46a0881c7f metal : add dequantize_q8_0 kernel Georgi Gerganov 2023-08-24 13:40:34 +03:00
  • 058f905ef9 ignore all build dirs Henri Vasserman 2023-08-24 13:23:23 +03:00
  • 7b842170c4 Merge 'origin/master' into hipblas Henri Vasserman 2023-08-24 13:18:58 +03:00
  • c3e53b421a llama : escape all U+2581 in a string (#2750) b1047 Georgi Gerganov 2023-08-24 12:26:01 +03:00
  • ac4bb6ba02 cuda : add RoPE kernel for mode == 2 (NeoX) Georgi Gerganov 2023-08-24 11:13:56 +03:00
  • 81a0ef342c updated lite, switched to unminified source Concedo 2023-08-24 16:26:38 +08:00
  • 598d4d89ab fix for config file loading. from kcpp settings file Concedo 2023-08-24 15:45:33 +08:00
  • a3b9949626 Merge remote-tracking branch 'pop/add_config_arg' into concedo_experimental Concedo 2023-08-24 15:22:17 +08:00
  • b8372d4466 Merge branch 'master' into concedo_experimental Concedo 2023-08-24 15:21:24 +08:00
  • 0288361b65 gguf : fix line endings M. Yusuf Sarıgöz 2023-08-24 09:26:13 +03:00
  • 344f6e373b gguf: prepare as Pip package M. Yusuf Sarıgöz 2023-08-24 09:09:52 +03:00
  • 5dd870574e gguf: prepare as Pip package M. Yusuf Sarıgöz 2023-08-24 09:08:19 +03:00
  • 050046fa45 gitignore : add dist and rm pyproject.toml M. Yusuf Sarıgöz 2023-08-24 09:07:42 +03:00
  • 0c268a83e8 ggml-alloc: avoid return silently lshzh-ww 2023-08-24 01:34:57 -04:00
  • ee8b2aa75d metal: better memory alloc w/ concurrency dispatch lshzh-ww 2023-08-24 00:55:58 -04:00
  • 6e91a1b070 llama : fix grammar sometimes generating null char (#2756) b1046 Evan Jones 2023-08-24 00:07:13 -04:00
  • 471e469ae2 pre gguf merge netrunnereve 2023-08-23 23:53:06 -04:00
  • d50ccb03a3 manual merge netrunnereve 2023-08-23 23:46:15 -04:00
  • 3bf60c5eb3 llama : fix grammar sometimes generating null char Evan Jones 2023-08-23 20:49:47 -04:00
  • 47b9f2d36f log_enable/disable, LOG_TEE, basic usage doc staviq 2023-08-24 02:00:16 +02:00
  • 463e117820 Simplify vector building logic ochafik 2023-08-23 22:21:07 +01:00
  • 44d5462b5c readme : fix link Georgi Gerganov 2023-08-23 23:44:19 +03:00
  • c7868b0753 minor : fix trailing whitespace Georgi Gerganov 2023-08-23 23:43:00 +03:00
  • 79da24b58c readme : update hot topics Georgi Gerganov 2023-08-23 23:41:16 +03:00
  • 5132130af7 llama2.c: direct gguf output (WIP) ochafik 2023-08-23 21:08:00 +01:00
  • d8beb85c74 Merge branch 'master' into fix-whitespace Georgi Gerganov 2023-08-23 23:09:35 +03:00
  • cf658adc83 llm : add Falcon support (#2717) master-cf658ad Georgi Gerganov 2023-08-23 23:08:04 +03:00
  • fae8faa135 perplexity : add log for start of tokenization Georgi Gerganov 2023-08-23 22:56:50 +03:00
  • 977629a34e Merge branch 'master' into fix-eos fix-eos Georgi Gerganov 2023-08-23 22:40:19 +03:00
  • 680ab3dcb1 Merge 6803aac321 into a192860cfe akawrykow 2023-08-23 21:37:57 +02:00
  • a192860cfe minor : fix trailing whitespace master-a192860 Georgi Gerganov 2023-08-23 22:37:39 +03:00
  • 95385241a9 examples : restore the functionality to import llama2.c models (#2685) master-9538524 Olivier Chafik 2023-08-23 20:33:05 +01:00
  • 5afce7939c llama : escape all U+2581 in a string Georgi Gerganov 2023-08-23 22:06:16 +03:00