Commit graph

  • db3ba108e7 code aestheticization Christian Zhou-Zheng 2024-05-31 21:38:02 -04:00
  • 62560367aa add command-line args for num threads, num completions file lines, always reload model Christian Zhou-Zheng 2024-05-31 21:27:14 -04:00
  • 2f5d2b55ae ggml-backend: refine ggml backend subsystem for mixed inference between CPU&GPU / CPU/NPU easily for some special ggml backend zhou.weiguo 2024-06-01 09:09:22 +08:00
  • 4d7d71bc43 fix square_diff matmul index range and CRLF->LF line endings Christian Zhou-Zheng 2024-05-31 21:08:25 -04:00
  • 72197aef96 Merge branch 'ggerganov:master' into refine-ggml-backend-subsystem zhouwg 2024-06-01 09:02:30 +08:00
  • 647d25282b patch: Apply fix for backward compat for source repo teleprint-me 2024-05-31 20:43:00 -04:00
  • c4470108ab refactor: Add prototyped bridge interface for transformers and tokenizers teleprint-me 2024-05-31 20:36:30 -04:00
  • 47ef6157a0 refactor: Add prototyped bridge interface for tokenizers and llama.cpp teleprint-me 2024-05-31 20:35:41 -04:00
  • b18295249e Merge d3179662cd into a323ec60af Sameer Charles 2024-06-01 08:00:21 +10:00
  • a323ec60af server : update js (#7670) Georgi Gerganov 2024-05-31 22:23:04 +03:00
  • 05133280ab remove obsolete code Johannes Gäßler 2024-05-31 20:12:00 +02:00
  • c2e48979e2 Merge branch 'master' into auto-model-support teleprint-me 2024-05-31 14:11:30 -04:00
  • 6a8aa22aaa Merge branch 'master' into threadpool Faisal Zaghloul 2024-05-31 13:01:06 -04:00
  • 9d6675def4 fix n_threads == 1 bug Faisal Zaghloul 2024-05-31 12:30:12 -04:00
  • 719d12bc51 fix server test Faisal Zaghloul 2024-05-31 02:57:56 -04:00
  • 4d88cd1af1 fix zero output & param parsing, functional templating Christian Zhou-Zheng 2024-05-31 12:40:35 -04:00
  • d8a0b87091 error if type_v != FP16 and not flash_attn Johannes Gäßler 2024-05-31 18:34:38 +02:00
  • 0515ad93f4 convert-hf : Handle NotImplementedError in convert-hf-to-gguf (#7660) Galunid 2024-05-31 17:42:33 +02:00
  • 5f8720fb7b add rpc-server to Makefile sl/rpc-backend-cpy slaren 2024-05-31 17:22:05 +02:00
  • a7060dffdd - fix copy_tensor being called on the src buffer instead of the dst buffer slaren 2024-05-31 17:05:14 +02:00
  • 0d358c1328 Update server.cpp Robert Sinclair 2024-05-31 18:01:08 +03:00
  • c8047d538f scripts: update compare_llama_bench.py [no ci] (#7673) Johannes Gäßler 2024-05-31 16:26:21 +02:00
  • 667e27ccd4 fix typo Yazan Agha-Schrader 2024-05-31 16:09:35 +02:00
  • 30e238b246 Improve HIP compatibility (#7672) b3058 Daniele 2024-05-31 14:00:29 +00:00
  • 5e6fbb9cd8 scripts: update compare_llama_bench.py [no ci] Johannes Gäßler 2024-05-31 15:50:40 +02:00
  • 96a6f55222 Merge branch 'master' of https://github.com/JoanFM/llama.cpp into feat-jina-v2-base-code Joan Martinez 2024-05-31 15:22:31 +02:00
  • dca59fc850 Improve HIP compatibility Daniele 2024-05-31 13:40:37 +00:00
  • 9a65c7a273 fix: fix the usage of the code model Joan Martinez 2024-05-31 15:10:43 +02:00
  • 956af1552a server : update js gg/server-update-js Georgi Gerganov 2024-05-31 15:47:19 +03:00
  • 3e26cd0055 Merge branch 'ggerganov:master' into server-ui-pr Yazan Agha-Schrader 2024-05-31 14:45:35 +02:00
  • 21138dd2da add new ui files to makefile Yazan Agha-Schrader 2024-05-31 14:39:43 +02:00
  • c0b154a7a8 use correct indent Yazan Agha-Schrader 2024-05-31 14:24:56 +02:00
  • 02d8b34900 docs: repeat-penalty 1.0 = disabled Brandon Lockaby 2024-05-31 08:24:37 -04:00
  • 16926dff92 readme : link homebrew discussion b3057 Georgi Gerganov 2024-05-31 15:04:58 +03:00
  • 0c27e6f62e ggml : fix loongson compile warnings (#7537) b3056 Georgi Gerganov 2024-05-31 14:17:10 +03:00
  • 77c16ee0d4 tests : disable json test due to lack of python on the CI node gg/ci-loongson Georgi Gerganov 2024-05-31 14:03:45 +03:00
  • 42cbf565f0 Merge branch 'ggerganov:master' into refine-ggml-backend-subsystem zhouwg 2024-05-31 18:24:58 +08:00
  • d32a8f6142 backup sycl-global-variables Meng, Hengyu 2024-05-31 16:51:56 +08:00
  • 50fb3d347f Fix loongarch quantize test fail. junchao-loongson 2024-05-30 21:05:23 +08:00
  • 6c276deb9d llama : offload to RPC in addition to other backends Radoslav Gerganov 2024-05-30 09:45:50 +03:00
  • 46465cdcff merge upstream zhangkaihuo 2024-05-31 16:38:15 +08:00
  • 2e32f874e6 Somehow '**' got lost (#7663) Galunid 2024-05-31 10:24:41 +02:00
  • 828c176561 Merge branch 'master' into quick-fixup Galunid 2024-05-31 10:17:18 +02:00
  • 84866fffd2 support lm_head zhangkaihuo 2024-05-31 16:14:58 +08:00
  • be73420152 Somehow '**' got lost Galunid 2024-05-31 10:14:32 +02:00
  • 1af511fc22 Add convert.py removal to hot topics (#7662) Galunid 2024-05-31 10:09:20 +02:00
  • e3e132515a Add convert.py removal to hot topics Galunid 2024-05-31 09:47:29 +02:00
  • a913ca4cb9 receive review comments and modify caitianchi-mb 2024-05-31 15:06:30 +08:00
  • 2dbe149b05 fix server test Faisal Zaghloul 2024-05-31 02:57:56 -04:00
  • 6545f59e24 Handle NotImplementedError in convert-hf-to-gguf Galunid 2024-05-31 08:55:25 +02:00
  • bc69a1e977 fix typos "prompt-format" -> "prompt-formats" Yazan Agha-Schrader 2024-05-31 06:31:55 +02:00
  • 80888e93cc renaming to ensure consistency Yazan Agha-Schrader 2024-05-31 06:17:40 +02:00
  • fa85ba6ae3 preliminary template/multiprompt support Christian Zhou-Zheng 2024-05-30 23:39:59 -04:00
  • d9742fbf4e fix wrong link to old ui Yazan Agha-Schrader 2024-05-31 05:25:57 +02:00
  • bb9542b54f include new ui in cpp Yazan Agha-Schrader 2024-05-30 06:49:31 +02:00
  • 5b125003ca Merge branch 'ggerganov:master' into master Andrew Ferruolo 2024-05-30 22:13:38 -04:00
  • 31f153fe9c fix matrix transpose multiplication Christian Zhou-Zheng 2024-05-30 21:36:17 -04:00
  • b356daf97e remove arg for testing Mana 2024-05-31 08:38:58 +08:00
  • 0541f06296 [no ci] docs: add aikit to readme (#7650) Sertaç Özercan 2024-05-30 16:57:16 -07:00
  • d446c6d887 add debugs ngxson 2024-05-31 00:41:12 +02:00
  • 287da25f48 fix mem error ngxson 2024-05-31 00:06:45 +02:00
  • 447023fc43 add multi prompts, multi-thread for PCA ngxson 2024-05-30 23:58:32 +02:00
  • 9022c33646 Fixed painfully slow single process builds. (#7326) JohnnyB 2024-05-30 21:32:38 +01:00
  • ddafa03a30 Merge branch 'master' of https://github.com/ggerganov/llama.cpp Adrian Liechti 2024-05-30 21:57:07 +02:00
  • e8c3364387 fixes for non-llvm builds Faisal Zaghloul 2024-05-30 07:58:02 -04:00
  • 42859a583d llama : avoid double token-to-piece cache Georgi Gerganov 2024-05-30 22:12:04 +03:00
  • 40a4139a42 [no ci] docs: add aikit to readme Sertac Ozercan 2024-05-30 17:48:44 +00:00
  • 02c15840bf manually merge branch 😭 Mana 2024-05-31 01:57:53 +08:00
  • dc46264ff0 example template completions Christian Zhou-Zheng 2024-05-30 13:12:54 -04:00
  • 5921b8f089 llama : cache llama_token_to_piece (#7587) b3051 Georgi Gerganov 2024-05-30 19:01:41 +03:00
  • f58f6af133 param parsing, refactor, comments Christian Zhou-Zheng 2024-05-30 11:31:45 -04:00
  • 4f0a128e28 fix: change first msg check Ryan Hua 2024-05-30 11:29:43 -04:00
  • ff0fc6892a Merge branch 'ggerganov:master' into maybe-no-middle-token Sigbjørn Skjæret 2024-05-30 17:13:22 +02:00
  • a96430c164 Only use FIM middle if it exists Sigbjørn Skjæret 2024-05-30 17:13:03 +02:00
  • 5dcdf94676 Fix conan badge display [no ci] (#7645) Martin Delille 2024-05-30 17:07:39 +02:00
  • 7846568123 Fix conan badge display [no ci] Martin Delille 2024-05-30 16:47:57 +02:00
  • 93fb684bef Only use FIM middle if it exists Sigbjørn Skjæret 2024-05-30 17:04:33 +02:00
  • 2e2340de17 Add brew installation instruction to README [no ci] (#7616) Manuel 2024-05-30 16:58:15 +02:00
  • 62855ca3f6 ci : disable openmp with thread sanitizer slaren 2024-05-30 16:55:16 +02:00
  • 2a4884b1d0 Fix loongarch quantize test fail. junchao-loongson 2024-05-30 21:05:23 +08:00
  • 29a98840a0 Merge branch 'ggerganov:master' into non-llama-fim-fix Sigbjørn Skjæret 2024-05-30 16:33:33 +02:00
  • 998d208e14 More checks before assuming FIM tokens for Llama arch Sigbjørn Skjæret 2024-05-30 16:33:06 +02:00
  • 0001ec37b4 Update server.cpp Robert Sinclair 2024-05-30 17:09:15 +03:00
  • 5b36de7ec3 ggml-backend: refine backend subsystem for CPU&GPU / CPU&NPU mixed inference more easily for a specified GGML backend zhou.weiguo 2024-05-30 22:01:17 +08:00
  • 1f80e0e428 seperate DPCT helpers outside remove global variables and pack into context Meng, Hengyu 2024-05-30 20:41:54 +08:00
  • 7846540bd2 readme : add Conan badge (#7638) Martin Delille 2024-05-30 14:52:50 +02:00
  • 82723ccda9 Add conan badge Martin Delille 2024-05-30 14:46:52 +02:00
  • e6157f94c8 github: add contact links to issues and convert question into research [no ci] (#7612) Brian 2024-05-30 21:55:36 +10:00
  • 9c4c9cc83f Move convert.py to examples/convert-legacy-llama.py (#7430) b3046 Galunid 2024-05-30 13:40:00 +02:00
  • 59b0d07766 faster avx512 exp implementation (#7551) b3045 Chris Elrod 2024-05-30 07:32:55 -04:00
  • 5cf27d0b2b Merge 2b2fd541c2 into d5c05821f3 Mohammadreza Hendiani 2024-05-30 18:01:23 +08:00
  • 93c1d26d88 Merge ca61d3e498 into d5c05821f3 Herman Semenov 2024-05-30 18:00:51 +08:00
  • f1772c9973 disable openmp on macos slaren 2024-05-30 11:46:28 +02:00
  • fd5de67bb7 ggml : fix loongson compile warnings Georgi Gerganov 2024-05-25 20:24:12 +03:00
  • d5c05821f3 ggml : fix loongarch build (O2 issue) (#7636) b3044 junchao-loongson 2024-05-30 17:30:10 +08:00
  • 94f4c5ce40 Fix loongarch code dependency on O2 compilation options in ggml-quants junchao-loongson 2024-05-30 17:18:18 +08:00
  • 5970a26d66 fix msvc build slaren 2024-05-30 10:26:02 +02:00
  • 88f5e6ab36 fix bug in bicubic resize when need resize iamge smaller caitianchi 2024-05-30 16:39:42 +08:00
  • 5ddbd1843d enable openmp by default slaren 2024-05-30 10:01:37 +02:00
  • 377bc78341 clear numa affinity for main thread even with openmp slaren 2024-05-30 09:58:08 +02:00