Commit graph

  • 8b916c2f07 updates the package.swift to use ggml as dependency Ashraful Islam 2023-12-29 13:59:09 -06:00
  • f81d2058ae tests : fix trailing whitespace Georgi Gerganov 2023-12-29 20:23:51 +02:00
  • 12ab343c82 gitignore : fix Georgi Gerganov 2023-12-29 19:34:07 +02:00
  • 128c213ab5 llama : minor stuff Georgi Gerganov 2023-12-29 19:32:30 +02:00
  • d24da31d2f Merge branch 'master' into HEAD Georgi Gerganov 2023-12-29 19:24:48 +02:00
  • 0235b9b571 clip : use ggml_backend_buffer_is_host (#4205) b1727 Georgi Gerganov 2023-12-29 18:53:34 +02:00
  • ce18d727a4 clip : enable gpu backend (#4205) b1726 Steward Garcia 2023-12-29 11:52:15 -05:00
  • 44c5f7b1dd llava : fixes Georgi Gerganov 2023-12-29 18:44:31 +02:00
  • 91bb39cec7 cuda: fix vmm oom issue on NVIDIA AGX Orin (#4687) b1725 hydai 2023-12-30 00:31:19 +08:00
  • 8386034e08 Revert quants other than q4_k and q5_k Henrik Forstén 2023-12-29 18:26:29 +02:00
  • b289c24f8a cuda: fix vmm oom issue on NVIDIA AGX Orin hydai 2023-12-29 10:25:05 -06:00
  • 2cf4f37e36 add metal backend FSSRepo 2023-12-29 10:32:40 -05:00
  • a52154d3b3 Merge branch 'master' of https://github.com/ggerganov/llama.cpp FSSRepo 2023-12-29 10:26:25 -05:00
  • 04ac0607e9 python : add check-requirements.sh and GitHub workflow (#4585) b1724 crasm 2023-12-29 09:50:29 -05:00
  • 68eccbdc5b flake.nix : rewrite (#4605) b1723 Philip Taron 2023-12-29 06:42:26 -08:00
  • 97bbca6e85 cmake : fix ld warning duplicate libraries libllama.a (#4671) b1722 Cuong Trinh Manh 2023-12-29 21:39:15 +07:00
  • 4af4801566 llava-cli : refactor to use sampling library (#4669) b1721 Justine Tunney 2023-12-29 06:38:38 -08:00
  • 4f7273c882 Merge bc1b0d5351 into db49ff8ed7 ParetoOptimalDev 2023-12-29 06:33:36 -08:00
  • db49ff8ed7 server : replace sleep with condition variables (#4673) b1720 Justine Tunney 2023-12-29 06:24:12 -08:00
  • 60f55e888c server : fix OpenAI server sampling w.r.t. penalty. (#4675) b1719 SakuraUmi 2023-12-29 22:22:44 +08:00
  • b93edd22f5 server : allow to generate multimodal embeddings (#4681) b1718 Karthik Sethuraman 2023-12-29 06:22:10 -08:00
  • 82d6eab224 main-cmake-pkg : fix build issue (#4665) b1717 andrijdavid 2023-12-29 15:18:20 +01:00
  • 2129e3ef05 cmake : fix trailing whitespace Georgi Gerganov 2023-12-29 16:16:57 +02:00
  • 43e8cc8cf6 CUDA: fix tensor core logic for Pascal and HIP JohannesGaessler 2023-12-29 11:38:56 +01:00
  • afd997ab60 llama.swiftui : fix infinite loop, ouput timings, buff UI (#4674) b1716 Peter Sugihara 2023-12-29 05:58:56 -08:00
  • c8255f8a6b scripts : print list of sync commits b1715 Georgi Gerganov 2023-12-29 15:12:35 +02:00
  • 441f51dca0 ci : build with CLBlast + ggml-opencl use GGML_API (whisper/1576) Tamotsu Takahashi 2023-12-29 19:23:27 +09:00
  • 33b8159ef0 preliminary method for general chat format handling in server Peter Nagymathe 2023-12-29 12:13:42 +00:00
  • 38b3de4658 sync : ggml b1713 Georgi Gerganov 2023-12-29 14:56:41 +02:00
  • afc8c19291 ggml : fix some mul mat cases + add tests for src1 F16 (ggml/669) bssrdf 2023-12-29 03:32:31 -05:00
  • ca38b8d334 scripts : do not sync commits from this repo Georgi Gerganov 2023-12-29 14:41:36 +02:00
  • 3616ed585b Use glob to load common files andrijdavid 2023-12-29 12:56:55 +01:00
  • c653799907 Allow server to generate multimodal embeddings and update README.md Karthik Sethuraman 2023-12-29 00:07:35 -08:00
  • aec7fd528a Merge pull request #1 from X0RSH1FT/windows X0RSH1FT 2023-12-28 22:09:53 -05:00
  • b8d263a08d Changed default model X0RSH1FT 2023-12-28 22:01:21 -05:00
  • ef0e65e5ad Merge remote-tracking branch 'origin/master' into x0rsh1ft X0RSH1FT 2023-12-28 21:56:44 -05:00
  • fc4779d4dc Simplified scripts. Added environment variables. Renamed functions. X0RSH1FT 2023-12-28 21:55:40 -05:00
  • 9273226dcf Fix OpenAI server sampling w.r.t. penalty. SakuraUmi 2023-12-29 03:34:17 +08:00
  • 65e5f6dadb Fix OpenAI server sampling w.r.t. temp and seed (#4668) b1710 Justine Tunney 2023-12-28 11:20:00 -08:00
  • 0f29c0247f Merge branch 'master' of github.com:psugihara/llama.cpp into swiftui-timings psugihara 2023-12-28 10:26:26 -08:00
  • e3a1eb41a0 clearer UI text, add timings to completion log psugihara 2023-12-28 10:25:58 -08:00
  • c9c4e1f077 reorder operations ct-clmsn 2023-12-28 12:53:43 -05:00
  • 2e3d229597 fixed syntax issue ct-clmsn 2023-12-28 12:50:11 -05:00
  • f7e04d43f3 Replace sleep with condition variables in server Justine Tunney 2023-12-28 09:35:29 -08:00
  • beb28d68ab trying to remove mutex/lock from parallel region ct-clmsn 2023-12-28 12:33:58 -05:00
  • cc8cc3face slight UI simplification, clearer UX psugihara 2023-12-28 09:04:54 -08:00
  • b8f325224b fix warning in example. Cuong Trinh 2023-12-29 00:04:41 +07:00
  • 760df0c3af fix "ld: warning: ignoring duplicate libraries: '../libllama.a'" Cuong Trinh 2023-12-28 23:55:50 +07:00
  • eb2a9e6bdf Refactor llava-cli to use sampling library Justine Tunney 2023-12-28 07:38:35 -08:00
  • af5f066280 Fix OpenAI server sampling w.r.t. temp and seed Justine Tunney 2023-12-28 07:28:04 -08:00
  • 2a21bd8ab2 Remove Q_K 0 type changes Henrik Forstén 2023-12-28 17:13:39 +02:00
  • 63b65efb78 added tooltips for all items in the GUI launcher Concedo 2023-12-28 23:08:57 +08:00
  • ea5497df5d gpt2 : Add gpt2 architecture integration (#4555) b1709 manikbhandari 2023-12-28 09:03:57 -05:00
  • 3fafecae9e Weighted least squares Henrik Forstén 2023-12-28 15:55:12 +02:00
  • 35157e4c0e Fix main-cmake-pkg compilation andrijdavid 2023-12-28 14:15:22 +01:00
  • 5eb01d4e64 Move check-requirements.sh into ./scripts/ crasm 2023-12-28 04:41:59 -05:00
  • b6bf2643a7 See if this fixes docker workflow crasm 2023-12-28 04:20:24 -05:00
  • ec46661a32 wip adding tooltips Concedo 2023-12-28 15:54:22 +08:00
  • cf360f3e62 Update expose.cpp '#include <cstdint> (#586) Nexesenex 2023-12-28 08:01:22 +01:00
  • ba77e916ef added missing parameters for United class.py Concedo 2023-12-28 14:07:26 +08:00
  • 5d546a3c98 Merge branch 'master' into phi-1 teleprint-me 2023-12-27 23:37:04 -05:00
  • 417884ea8d Merge branch 'master' into regex_gpt2_preprocess Bingxuan Wang 2023-12-28 12:25:39 +08:00
  • 5e59112de8 prevent other calls when uninitialized Concedo 2023-12-28 12:04:53 +08:00
  • 2d5d82e915 addlocate gpt_params on heap instead to avoid rare segfault Concedo 2023-12-28 11:48:21 +08:00
  • 2ac2edd2b7 Merge branch 'master' of https://github.com/ggerganov/llama.cpp into check-requirements-txt Jared Van Bortel 2023-12-27 21:42:12 -05:00
  • f3a447e1c6 don't remove venvs if nocleanup is passed Jared Van Bortel 2023-12-27 21:38:54 -05:00
  • d0ab7a1dc6 small syntax change Jared Van Bortel 2023-12-27 21:34:46 -05:00
  • ce26f49208 improve check-requirements.sh Jared Van Bortel 2023-12-27 21:27:18 -05:00
  • 599b773654 fix infinite loop psugihara 2023-12-27 16:20:00 -08:00
  • 9174699f84 llama : adapt plamo to new ffn Georgi Gerganov 2023-12-27 17:24:34 +02:00
  • 278f3e99c2 Merge branch 'master' into HEAD Georgi Gerganov 2023-12-27 17:24:10 +02:00
  • 69ab1bf2f8 Merge branch 'master' into concedo_experimental Concedo 2023-12-27 21:43:46 +08:00
  • 5b2d93a1f8 updated lite and colab, added logit bias support to lite Concedo 2023-12-27 21:32:18 +08:00
  • f478136773 Fix other than 4 quants Henrik Forstén 2023-12-27 15:13:23 +02:00
  • 68afe4f71b remove unused code manikbhandari 2023-12-27 07:33:27 -05:00
  • 7970696685 Merge branch 'master' into add_gpt2_support manikbhandari 2023-12-27 07:22:22 -05:00
  • f6793491b5 llama : add AWQ for llama, llama2, mpt, and mistral models (#4593) b1708 Nam D. Tran 2023-12-27 22:39:45 +07:00
  • 879b690a9e finetune : fix output formatting in print_params (#4653) b1707 Daniel Bevenius 2023-12-27 15:16:55 +01:00
  • d6313d8385 Refactor Henrik Forstén 2023-12-27 12:57:43 +02:00
  • 4d6d967c10 silence autoplay for colab Concedo 2023-12-27 19:13:34 +08:00
  • 41f714b695 finetune: fix output formatting in print_params Daniel Bevenius 2023-12-27 11:48:55 +01:00
  • b47879b0dd scripts : add sync-ggml-am.sh Georgi Gerganov 2023-12-27 11:15:31 +02:00
  • 951010fa53 ggml : fix dot product for ARM (#4630) b1705 Georgi Gerganov 2023-12-27 11:02:13 +02:00
  • f56d6077d0 Add byte token type when tokenizer.model is not exists (#4641) wonjun Jang 2023-12-27 17:37:25 +09:00
  • d1af0a3a94 Quantization loop Henrik Forstén 2023-12-27 09:44:38 +02:00
  • cb58775719 Add upper version bound for transformers and protobuf crasm 2023-12-27 02:09:30 -05:00
  • dd0f47060b Merge branch 'master' into check-requirements-txt crasm 2023-12-27 02:05:16 -05:00
  • 0b6207ef61 Better k quantization Henrik Forstén 2023-12-27 07:11:02 +02:00
  • 3f7003b4bb flake.nix: darwin: only expose the default Someone Serge 2023-12-26 22:41:53 +00:00
  • d0adab60d5 nix: explicit jetson support Someone Serge 2023-12-26 20:04:49 +00:00
  • 7bd8d8c6d7 nix: explicit mpi support Someone Serge 2023-12-26 22:23:30 +00:00
  • 82e48e2567 nix: clean sources more thoroughly Someone Serge 2023-12-26 22:20:07 +00:00
  • dc68f0054c cuda : fix vmm pool with multi GPU (#4620) b1703 slaren 2023-12-26 21:23:59 +01:00
  • f097bed543 remove unnecessary check id != g_main_device slaren 2023-12-26 21:23:23 +01:00
  • 433c4f5050 Least squares quantization Henrik Forstén 2023-12-26 17:37:46 +02:00
  • 1efbc6b064 nix: add the impure driver's location to the DT_RUNPATHs Someone Serge 2023-12-26 17:26:22 +00:00
  • e733a9e425 Add logit_bias to the OpenAI api (#577) DebuggingLife46 2023-12-26 21:56:19 +05:30
  • 42f5246eff ggml : fix dot product when missing intrinsics Georgi Gerganov 2023-12-26 17:37:33 +02:00
  • f32f30bc57 test gg/test-arm Georgi Gerganov 2023-12-26 17:37:33 +02:00
  • 6a2a13b51c adapt to recent changes manikbhandari 2023-12-26 07:28:22 -05:00