Commit graph

  • 16ede02a47 Use sentencepiece tokenizer, or fall back to hfft. Pedro Cuenca 2024-03-28 18:59:26 +01:00
  • 71a08675c0 Simplify tokenize.cpp; by getting rid of handling positional style arguments. Mikko Juola 2024-03-28 10:59:23 -07:00
  • 23ffda00df Merge remote-tracking branch 'origin/master' into mistral-hf-conversion Pedro Cuenca 2024-03-28 18:40:39 +01:00
  • 08e69c5008 cuda : adapt soft_max to F16 mask and pos Georgi Gerganov 2024-03-28 19:40:11 +02:00
  • 3e318e764f Merge branch 'master' into gg/flash-attn Georgi Gerganov 2024-03-28 19:32:51 +02:00
  • 57c03b78b6 metal : improve perf via smaller int registers Georgi Gerganov 2024-03-28 19:29:06 +02:00
  • 5106ef482c [SYCL] Revisited & updated SYCL build documentation (#6141) Ouadie EL FAROUKI 2024-03-28 16:01:47 +00:00
  • b341037d01 Merge 0f8d5aa091 into be55134a53 repo-reviews 2024-03-28 08:46:54 -07:00
  • f746e7074e Merge branch 'master' into sycl_fix_non_intel_fp16 OuadiElfarouki 2024-03-28 15:45:06 +00:00
  • be55134a53 convert : refactor vocab selection logic (#6355) b2568 Jared Van Bortel 2024-03-28 11:44:36 -04:00
  • 4070423210 moved INTEL_MKL guard from gemm_impl to gemm (wrapper) OuadiElfarouki 2024-03-28 15:43:40 +00:00
  • d09e4ac66c convert : appease flake8 Jared Van Bortel 2024-03-28 11:38:02 -04:00
  • 63eaed650b added Layla to supported UIs l3utterfly 2024-03-29 00:05:23 +09:00
  • 66ba560256 llava : fix MobileVLM (#6364) b2567 Ziang Wu 2024-03-28 22:33:10 +08:00
  • 90919809a6 cmake: add explicit metal version options Matt Clayton 2024-03-28 10:22:24 -04:00
  • 4d5356bbbb rename state get set functions Jan Boon 2024-03-28 22:19:57 +08:00
  • 61f0ae73ef llama: remove redundant reshape in build_kv_store Daniel Bevenius 2024-03-28 15:17:43 +01:00
  • c4443d7ad4 rename sequence state functions Jan Boon 2024-03-28 22:10:04 +08:00
  • 72dbd3250b Update MobileVLM-README.md Ziang Wu 2024-03-28 21:42:10 +08:00
  • 1ef3250abd Merge branch 'ggerganov:master' into master Ziang Wu 2024-03-28 21:41:19 +08:00
  • 2a77902a1d Update examples/llava/MobileVLM-README.md Ziang Wu 2024-03-28 21:37:57 +08:00
  • 1440d445db Merge branch 'master' into master MasterYi1024 2024-03-28 21:02:29 +08:00
  • 0308f5e3d7 llama : fix command-r inference when omitting outputs (#6367) b2566 compilade 2024-03-28 08:05:54 -04:00
  • 28cb9a09c4 ci: bench: fix master not schedule, fix commit status failed on external repo (#6365) Pierrick Hymbert 2024-03-28 11:27:56 +01:00
  • 64b7d85891 llama : fix command-r inference compilade/fix-command-r Francis Couture-Harpin 2024-03-28 06:22:24 -04:00
  • 166b807a15 ci: bench: fix master not schedule, fix commit status failed on external repo Pierrick HYMBERT 2024-03-28 11:12:27 +01:00
  • 4ab46218c1 Update MobileVLM-README.md Ziang Wu 2024-03-28 17:58:26 +08:00
  • 1cdd3b0ae3 Update MobileVLM-README.md Ziang Wu 2024-03-28 17:10:21 +08:00
  • 79de0e65e1 Update MobileVLM-README.md Ziang Wu 2024-03-28 17:07:45 +08:00
  • a4527cb16e Update MobileVLM-README.md Ziang Wu 2024-03-28 17:05:12 +08:00
  • 7fc9c777a3 Update MobileVLM-README.md Ziang Wu 2024-03-28 17:04:10 +08:00
  • 5310114cd6 Update MobileVLM-README.md Ziang Wu 2024-03-28 17:02:32 +08:00
  • 741eebf257 Update MobileVLM-README.md Ziang Wu 2024-03-28 17:02:00 +08:00
  • b97e6fc812 Merge branch 'ggerganov:master' into master Ziang Wu 2024-03-28 16:55:03 +08:00
  • cfc4d75df6 doc: fix outdated default value of batch size (#6336) Ting Sun 2024-03-28 16:51:06 +08:00
  • 6902cb7f2e server : stop gracefully on SIGTERM (#6348) b2563 Eric Zhang 2024-03-28 16:50:48 +08:00
  • 7497e1baa4 resolve float abhilash1910 2024-03-28 01:32:08 -07:00
  • 7835299423 resolve float abhilash1910 2024-03-28 01:28:38 -07:00
  • 6d825db8a1 resolve float abhilash1910 2024-03-28 01:24:30 -07:00
  • d6a8653fc8 add vector abhilash1910 2024-03-28 01:20:14 -07:00
  • 9489fc379e add iq4 non linear placeholder abhilash1910 2024-03-28 01:09:28 -07:00
  • d2d8f38996 nix: removed unnessesary indentation hutli 2024-03-27 19:17:30 +01:00
  • d39b308eaf nix: moved blas availability check to package inputs so it is still overridable hutli 2024-03-27 19:14:28 +01:00
  • c873976649 using blas.meta.available to check host platform hutli 2024-03-27 18:10:08 +01:00
  • dbb03e2b9c only using explicit blas if hostPlatform is allowed hutli 2024-03-27 17:25:05 +01:00
  • e9f17dc3bf nix: .#windows: proper cross-compilation set-up Someone Serge 2024-03-26 16:22:42 +00:00
  • 22a462cc1f nix: package: don't introduce the dependency on python Someone Serge 2024-03-26 16:22:07 +00:00
  • f6a0f5c642 nix: .#widnows: init hutli 2024-02-15 14:25:04 +01:00
  • 935eabd917 add condition abhilash1910 2024-03-27 22:56:15 -07:00
  • 619ce80144 Update ggml-sycl.cpp Abhilash Majumder 2024-03-28 11:06:29 +05:30
  • d0e2f6416b doc: fix typo in MobileVLM-README.md (#6181) Ziang Wu 2024-03-28 12:03:30 +08:00
  • 275ccea394 Merge pull request #1 from ibehnam/ibehnam-patch-server-cb Behnam Moh 2024-03-27 22:57:43 -04:00
  • 644456361d allow continuous batching to be disabled Behnam Moh 2024-03-27 22:52:49 -04:00
  • 928164d2ad fix empty bug zane 2024-03-28 09:48:27 +08:00
  • a5b98e99fa Merge branch 'master' of https://github.com/hxer7963/llama.cpp root 2024-03-28 01:08:23 +00:00
  • 4364308210 - Fix format issues - Remove duplicate set kqv_out to llm_build_kv willhe 2024-03-28 08:57:27 +08:00
  • 25f4a613c4 [SYCL] fix set main gpu crash (#6339) b2554 Neo Zhang Jianyu 2024-03-28 08:55:24 +08:00
  • 16a5d0a1bc Merge branch 'ggerganov:master' into master hxer7963 2024-03-28 08:42:29 +08:00
  • 121cf74c04 readme: add Android UI binding zhou.weiguo 2024-03-28 08:14:06 +08:00
  • 38d0e1da9a remove next_metadata_size ngxson 2024-03-28 00:18:11 +01:00
  • 80e9fc7c4d llama : update vocab type descriptions to reflect actual meaning Jared Van Bortel 2024-03-27 17:04:53 -04:00
  • ebad773e9d convert-hf : HfVocab -> LlamaHfVocab Jared Van Bortel 2024-03-27 16:13:09 -04:00
  • 79852ab884 convert : refactor vocab selection logic Jared Van Bortel 2024-03-27 15:43:16 -04:00
  • 2e6fd63b29 convert-hf : fix type of tokens after #3252 Jared Van Bortel 2024-03-27 15:49:05 -04:00
  • 8d2ac2cce0 convert : use appropriate exception types Jared Van Bortel 2024-03-27 14:06:29 -04:00
  • d12a63ca3e convert : fix incorrect added token dedup in BpeVocab Jared Van Bortel 2024-03-27 13:35:33 -04:00
  • b2b63d1350 convert : use context managers with most file handles Jared Van Bortel 2024-03-27 12:57:40 -04:00
  • 911b59d911 Create SECURITY.md Joyce 2024-03-27 17:56:53 -03:00
  • 0cda567931 Clean up merge artifacts 0cc4m 2024-03-27 21:08:00 +01:00
  • a016026a3a server: continuous performance monitoring and PR comment (#6283) Pierrick Hymbert 2024-03-27 20:26:49 +01:00
  • 53c7ec53d5 nix: ci: dont test cuda and rocm (for now) Someone Serge 2024-03-27 16:17:46 +00:00
  • d852c61d5c convert : do not allow "no_vocab" in --vocab-types Jared Van Bortel 2024-03-27 12:49:22 -04:00
  • 03f0c2e8ce convert-persimmon : typing fixup Jared Van Bortel 2024-03-27 12:36:08 -04:00
  • 9803bb7206 convert : vocab inheritance instead of duck typing Jared Van Bortel 2024-03-27 12:30:49 -04:00
  • 72e95e33a9 convert : remove unused vocab attributes Jared Van Bortel 2024-03-27 12:14:40 -04:00
  • 6cb07fb0af Merge upstream changes, fix conflicts 0cc4m 2024-03-27 20:12:14 +01:00
  • 4a6bfa92c5 ci: bench: reduce bullet point size Pierrick HYMBERT 2024-03-27 19:55:13 +01:00
  • d00b11b0b5 Fix Vulkan GGML_OP_GET_ROWS implementation 0cc4m 2024-03-27 19:54:26 +01:00
  • fce86c3a55 ci: bench: move images in a details section Pierrick HYMBERT 2024-03-27 19:23:13 +01:00
  • 7f4d575151 nix: removed unnessesary indentation hutli 2024-03-27 19:17:30 +01:00
  • e469532ddd nix: moved blas availability check to package inputs so it is still overridable hutli 2024-03-27 19:14:28 +01:00
  • dd1a60c536 convert : remove redundant annotations Jared Van Bortel 2024-03-27 11:45:47 -04:00
  • 30195d7307 ci: bench: trailing spaces Pierrick HYMBERT 2024-03-27 18:30:28 +01:00
  • a2b48b95f5 cleanup error cases Jan Boon 2024-03-28 01:11:07 +08:00
  • ae316601c6 using blas.meta.available to check host platform hutli 2024-03-27 18:10:08 +01:00
  • a1968c2e63 sync : ggml Georgi Gerganov 2024-03-27 19:03:43 +02:00
  • b182f8f67f Returning 0 for some cases, instead of asserting. Martin Evans 2024-03-27 16:31:27 +00:00
  • df639e07d7 only using explicit blas if hostPlatform is allowed hutli 2024-03-27 17:25:05 +01:00
  • a7871ca53d nix: ci: dont test cuda and rocm (for now) Someone Serge 2024-03-27 16:17:46 +00:00
  • b8e8facb0e add --slot-save-path arg to enable save restore and restrict save location Jan Boon 2024-03-28 00:05:56 +08:00
  • a46924cf6b server : stop gracefully on SIGTERM EZForever 2024-03-28 00:02:28 +08:00
  • 83f944c22d be positive ngxson 2024-03-27 16:21:29 +01:00
  • b7cb3bb76f error on 0 tensors ngxson 2024-03-27 15:51:38 +01:00
  • 02a184065a add kv seq save restore to test case Jan Boon 2024-03-27 22:39:28 +08:00
  • 8569ba30c3 add dry run option ngxson 2024-03-27 15:25:24 +01:00
  • 583022c5c7 split: ok ngxson 2024-03-27 15:10:44 +01:00
  • e5b89a441a ggml : fix bounds checking of zero size views (#6347) b2551 slaren 2024-03-27 15:07:50 +01:00
  • 6c699be1c9 ggml : fix bounds checking of zero size views slaren 2024-03-27 15:01:21 +01:00
  • 23577d0745 nix: .#windows: proper cross-compilation set-up Someone Serge 2024-03-26 16:22:42 +00:00
  • 07120aed84 nix: package: don't introduce the dependency on python Someone Serge 2024-03-26 16:22:07 +00:00