Commit graph

  • e1b2bf783e
    tests : add sample usage Georgi Gerganov 2024-04-26 13:43:54 +03:00
  • aeafb43ed7
    tests : remove and rename tokenizer test scripts Georgi Gerganov 2024-04-26 13:39:03 +03:00
  • d999cf65c5
    unicode : remove redundant headers Georgi Gerganov 2024-04-26 13:29:48 +03:00
  • bbe3c6e761
    ci: server: fix python installation (#6925) Pierrick Hymbert 2024-04-26 12:27:25 +02:00
  • d6ceec7c92
    ci: server: fix python installation Pierrick HYMBERT 2024-04-26 12:24:41 +02:00
  • 7a44e44342
    tests : add tokenizer tests for numbers Georgi Gerganov 2024-04-26 13:21:28 +03:00
  • 7f5ff558ee
    server: stop generation at n_ctx_train if n_predict is not set (#6638) Pierrick Hymbert 2024-04-26 12:15:30 +02:00
  • bcbdd28cfd
    kv override: ensure string termination Pierrick HYMBERT 2024-04-26 12:10:39 +02:00
  • b54eede81c
    Merge remote-tracking branch 'refs/remotes/origin/master' into hp/quantize/imatrix-metadata Pierrick HYMBERT 2024-04-26 12:07:06 +02:00
  • c56e19db4b
    lint : fix whitespaces Georgi Gerganov 2024-04-26 12:58:07 +03:00
  • 06d3e693db
    unicode : fix? unicode_wstring_to_utf8 Georgi Gerganov 2024-04-26 12:55:11 +03:00
  • 9e4e077ec5
    ci: server: fix python installation (#6922) Pierrick Hymbert 2024-04-26 11:11:51 +02:00
  • 08c5e35014
    main : don't print special tokens with --grammar Justine Tunney 2024-04-26 02:01:58 -07:00
  • f7d3da5667
    ci: server: fix python installation Pierrick HYMBERT 2024-04-26 10:59:30 +02:00
  • ff2a6e845b
    fixed off by one error when context shifting l3utterfly 2024-04-26 17:53:08 +09:00
  • 36d983262e
    Fixed issue with gpt2 regex custom preprocessor Kazim Abrar Mahi 2024-04-17 07:40:40 +06:00
  • 753580360b
    Fixed issues Kazim Abrar Mahi 2024-04-16 05:53:29 +06:00
  • feeaf4f39c
    Added needed functionality, testing remains Kazim Abrar Mahi 2024-04-16 04:56:35 +06:00
  • 7e308ed212
    Adding unicode regex function Kazim Abrar Mahi 2024-04-16 01:52:33 +06:00
  • a5710a4101
    Adding unicode regex mappings Kazim Abrar Mahi 2024-04-15 23:48:04 +06:00
  • 4c3e882a85
    Refactored code Kazim Abrar Mahi 2024-04-13 19:33:06 +06:00
  • c8e7d9521d
    Updated/merged the deepseek coder pr Jaggzh 2024-02-12 04:18:06 -08:00
  • 4056dc5b1e
    added and refactored unicode_regex_split and related functions Kazim Abrar Mahi 2024-04-01 00:48:49 +06:00
  • 1c924e4b35
    Resolved issues Kazim Abrar Mahi 2024-03-23 14:38:06 +06:00
  • 54f93eb50b
    Moved header files Kazim Abrar Mahi 2024-03-23 01:16:04 +06:00
  • d2cfc2225f
    Moved regex patterns to unicode.cpp and updated unicode.h Kazim Abrar Mahi 2024-03-23 01:13:08 +06:00
  • 6fbab2dbc8
    merged the changes from deepseeker models to main branch Jaggzh 2024-02-12 04:04:34 -08:00
  • 99874e5e34
    support minicpmv Achazwl 2024-04-26 16:37:46 +08:00
  • 18dbe4b8af
    link direct storage to ggml_shared as well. Markus Tavenrath 2024-04-26 10:32:10 +02:00
  • 83b72cb086
    Merge pull request from GHSA-p5mv-gjc5-mwqv Georgi Gerganov 2024-04-26 10:41:53 +03:00
  • d4a9afc100
    ci: server: fix python installation (#6918) b2740 Pierrick Hymbert 2024-04-26 09:27:49 +02:00
  • 7d641c26ac
    ci: fix concurrency for pull_request_target (#6917) Pierrick Hymbert 2024-04-26 09:26:59 +02:00
  • 5790c8dac1
    bench: server add stop word for PHI-2 (#6916) Pierrick Hymbert 2024-04-26 09:26:16 +02:00
  • 10d1e7b4ba
    ci: server: fix python installation Pierrick HYMBERT 2024-04-26 09:20:17 +02:00
  • c14306a2ff
    ci: fix concurrency for pull_request_target Pierrick HYMBERT 2024-04-26 09:10:47 +02:00
  • 5a4978f2ec
    bench: server add stop word for PHI-2 Pierrick HYMBERT 2024-04-26 09:02:40 +02:00
  • a3e75fe481
    Fixes mann1x 2024-04-26 08:56:35 +02:00
  • 55dec7c4a8
    add neon impl slaren 2024-04-26 03:26:39 +02:00
  • 9c0db4dd9d
    args: main & server now call gpt_params_handle_model_default Olivier Chafik 2024-04-26 00:50:29 +01:00
  • 40a961db60
    args: default --model to models/ + filename from --model-url or --hf-file (or else legacy models/7B/ggml-model-f16.gguf) Olivier Chafik 2024-04-26 00:40:45 +01:00
  • cf4fa0c193
    quantize : validate generated data slaren 2024-04-26 00:32:21 +02:00
  • 77d4ca906b
    spacing and capitalization changes. Julia Longtin 2024-04-25 21:23:22 +00:00
  • 63cd3dc251
    Initial support for Linux mann1x 2024-04-25 22:27:50 +02:00
  • 27ace6d2f4
    Merge f9b42b8cd8 into 46e12c4692 ManniX-ITA 2024-04-25 22:50:00 +03:00
  • 46e12c4692
    llava : add support for moondream vision language model (#6899) b2737 vik 2024-04-25 12:38:31 -07:00
  • 3d771207b7
    Update examples/llava/clip.cpp Georgi Gerganov 2024-04-25 22:38:14 +03:00
  • fab99db997
    Increase opacity. JohnnyB 2024-04-25 19:54:21 +01:00
  • dba497e0c1
    cmake : restore LLAMA_LLAMAFILE_DEFAULT b2736 Georgi Gerganov 2024-04-25 21:31:17 +03:00
  • d1d176e7bf
    Merge branch 'master' into master vik 2024-04-25 11:26:49 -07:00
  • ac829932a6
    Increased opacity for contrast JohnnyB 2024-04-25 19:11:30 +01:00
  • 9e3876061c
    llama : add static reminder for llama_state_get_size Georgi Gerganov 2024-04-25 20:33:36 +03:00
  • a124dfad1c
    Merge remote-tracking branch 'origin/master' into 0cc4m/vulkan-moe 0cc4m 2024-04-25 19:32:52 +02:00
  • 4f4c0249bf
    metal : remove tmp log Georgi Gerganov 2024-04-25 20:29:25 +03:00
  • 1e590ac3c9
    llama : update llama_state_get_size after v_trans field Georgi Gerganov 2024-04-25 20:06:23 +03:00
  • 0fc5c5eb74
    llama : disallow incompatible states Georgi Gerganov 2024-04-25 19:53:57 +03:00
  • bab346ba69
    llama : fix copy-paste errors, add TODO Georgi Gerganov 2024-04-25 19:45:36 +03:00
  • 330b3bc5b5
    Merge branch 'ggerganov:master' into sgemm-avx Eve 2024-04-25 16:45:30 +00:00
  • c225609f10
    llama : llama_kv_cache_clear zeroes data + fix save-load seq Georgi Gerganov 2024-04-25 19:37:27 +03:00
  • ac1c6d91de
    ci : add CUDA save-load-state tests Georgi Gerganov 2024-04-25 19:03:59 +03:00
  • 09d0381c58
    Merge branch 'master' into gg/flash-attn Georgi Gerganov 2024-04-25 19:01:52 +03:00
  • fa0b4ad252
    cmake : remove obsolete ANDROID check b2735 Georgi Gerganov 2024-04-25 18:59:51 +03:00
  • d6e1d44f16
    llama : synchronize before get/set session data (#6911) b2734 slaren 2024-04-25 17:59:03 +02:00
  • c7d534572b
    llama : synchronize before get/set session data slaren 2024-04-25 17:48:20 +02:00
  • 00f3fb6bc2
    fix setting clamp_qkv value in OLMo conversion nopperl 2024-04-25 17:34:20 +02:00
  • 1fd5bc3d5e
    llama : support save/load state with FA enabled Georgi Gerganov 2024-04-25 18:18:13 +03:00
  • cb3547ac46
    Merge branch 'master' into gg/flash-attn Georgi Gerganov 2024-04-25 17:06:56 +03:00
  • 853d06ffe2
    ci : tmp disable slow tests Georgi Gerganov 2024-04-25 17:06:27 +03:00
  • 3fe0596c18
    readme : update model list (#6908) BarfingLemurs 2024-04-25 09:52:28 -04:00
  • 145d315127
    add --check-tensors command line argument slaren 2024-04-25 15:41:36 +02:00
  • 4eefd38570
    llama3 ! BarfingLemurs 2024-04-25 09:39:12 -04:00
  • ac3195999d
    missing space BarfingLemurs 2024-04-25 09:35:46 -04:00
  • 0ead1f1072
    llama : check that all the tensor data is in the model file (#6885) b2731 slaren 2024-04-25 15:23:47 +02:00
  • 00ab7dbbc2
    Update README.md BarfingLemurs 2024-04-25 09:23:08 -04:00
  • b57a190e34
    also check for unsigned overflow slaren 2024-04-25 15:17:57 +02:00
  • ff2c64a9f4
    tests : remove TMP_ATTN_BENCH Georgi Gerganov 2024-04-25 15:51:46 +03:00
  • 1f77f49787
    Merge branch 'master' into gg/flash-attn Georgi Gerganov 2024-04-25 15:50:36 +03:00
  • 51543729ff
    ggml : fix redefinition of vaddvq_f32 for 32-bit ARM (#6906) b2730 Georgi Gerganov 2024-04-25 15:48:25 +03:00
  • 7db9fcaf26
    ggml : fix redefinition of vaddvq_f32 for 32-bit ARM Georgi Gerganov 2024-04-25 15:39:55 +03:00
  • 4ab99d8d47
    clip : rename lerp function to avoid conflict (#6894) b2729 Daniel Bevenius 2024-04-25 14:38:14 +02:00
  • 0426bdf5c2
    clip : rename lerp function to avoid conflict Daniel Bevenius 2024-04-25 07:39:52 +02:00
  • 54770413c4
    ggml : fix MIN / MAX macros (#6904) b2728 Georgi Gerganov 2024-04-25 15:12:28 +03:00
  • 8c259f6f3e
    ggml : fix MIN / MAX macros gg/fix-min-max Georgi Gerganov 2024-04-25 14:28:41 +03:00
  • aa750c1ede
    tests : minor bash stuff (#6902) b2727 Georgi Gerganov 2024-04-25 14:27:20 +03:00
  • 3e4a3e4fa8
    tests : fix fname Georgi Gerganov 2024-04-25 13:54:05 +03:00
  • afb3715434
    tests : fix CUR_DIR -> ROOT_DIR Georgi Gerganov 2024-04-25 13:46:34 +03:00
  • 1292b6cf67
    llama : fix build Georgi Gerganov 2024-04-25 13:33:56 +03:00
  • c2ee36ae3b
    tests : minor bash stuff Georgi Gerganov 2024-04-25 13:31:24 +03:00
  • 1966eb2615
    quantize : add '--keep-split' to quantize model into shards (#6688) jiez 2024-04-25 18:29:35 +08:00
  • 9cd09aa79b
    not allow adding duplicated tensor name ngxson 2024-04-25 11:49:19 +02:00
  • 238551ed8c
    parse gmml_type and llama_ftype, allow specifiying cfg file Julia Bruckner 2024-04-25 11:42:09 +02:00
  • 43daf2fe3c
    Trailing whitespace JohnnyB 2024-04-25 10:21:13 +01:00
  • c7c03260f6
    Newline JohnnyB 2024-04-25 10:19:29 +01:00
  • ae2a08114d
    Newline JohnnyB 2024-04-25 10:04:19 +01:00
  • 32a9792275
    Newline JohnnyB 2024-04-25 10:04:01 +01:00
  • 7c50cb090a
    add support for moondream vision language model vik 2024-04-25 01:13:26 -07:00
  • 0640427f7b
    limit to GGML_ALLOW_CUDA_GRAPHS defined in llama.cpp cmake Alan Gray 2024-04-25 00:51:48 -07:00
  • 11a11b0b54
    make github CI happy zhou.weiguo 2024-04-25 15:28:45 +08:00
  • 5bc7b76117
    doc: add README-qnn.md zhou.weiguo 2024-04-25 15:06:59 +08:00
  • 4d603e3520
    added DRY implementation l3utterfly 2024-04-25 15:58:59 +09:00
  • aea4ad0296
    fixed editor config check l3utterfly 2024-04-25 15:57:54 +09:00