Commit graph

  • 0610672b19 Rename to llm_build_ffn_mpt_awq Le Hoang Anh 2023-12-20 10:48:00 +07:00
  • 4ab3f47cbc add the parameter --no-display-prompt; combined with --log-disable it will display only the generated tokens Yann Follet 2023-12-20 03:24:26 +00:00
  • c3678ca84f unmap offloaded part of the model slaren 2023-12-20 04:12:11 +01:00
  • bb486b88e1 Online GPU slicing (#11) Holden X 2023-12-20 10:09:43 +08:00
  • d3e7242bdb more accurate mlock slaren 2023-12-20 03:01:33 +01:00
  • 9a056ed708 Remove venv before creation crasm 2023-12-19 20:56:22 -05:00
  • 72a0c96621 disable gpu backends with ngl 0 slaren 2023-12-20 02:45:54 +01:00
  • c8bd5d8b65 add ggml_backend_buffer_is_host, used to avoid copies if possible when accessing tensor data slaren 2023-12-19 23:47:34 +01:00
  • b38a1e6b02 fix deps.sh crasm 2023-12-19 18:44:48 -05:00
  • e4c510441e changed to use uint32_t instead of int in llama_n_ctx marcus 2023-12-19 15:20:28 -08:00
  • 9b57750b69 changed to use uint32_t instead of int marcus 2023-12-19 15:19:26 -08:00
  • d173ddac9e allowed getting n_batch from llama_context in c api marcus 2023-12-19 15:06:49 -08:00
  • 9809314bbf Disable test-model-load-cancel in make crasm 2023-12-19 17:46:36 -05:00
  • a386278304 remove ggml_repeat of clip.cpp FSSRepo 2023-12-19 17:07:07 -05:00
  • ffdb10d276 Merge branch 'master' of https://github.com/ggerganov/llama.cpp FSSRepo 2023-12-19 17:06:11 -05:00
  • 4f88c2c9fa fix mistral prompt Yazan Agha-Schrader 2023-12-19 20:54:40 +01:00
  • 1ac01fbbd1 add ggml_backend_buffer_clear, zero-init KV cache buffer slaren 2023-12-19 18:31:00 +01:00
  • 0c5ee7c417 cuda backend can be used through ggml-backend with LLAMA_GGML_BACKEND_CUDA_TEST; access all tensor data with ggml_backend_tensor_get/set slaren 2023-12-19 17:55:37 +01:00
  • d2e9d00cbc update: readme Trần Đức Nam 2023-12-19 23:31:15 +07:00
  • 8177ad4e37 update: work for both mpt and awqmpt Trần Đức Nam 2023-12-19 23:25:00 +07:00
  • 328b83de23 ggml : fixed check for _MSC_VER (#4535) b1662 Eric Sommerlade 2023-12-19 16:17:01 +00:00
  • 5e5f780b6c fixed check for _MSC_VER Eric Sommerlade 2023-12-19 16:10:37 +00:00
  • 3f863eed72 add presence penalty Concedo 2023-12-19 23:18:56 +08:00
  • a40f6110f0 ggml : force F32 precision for ggml_mul_mat gg/cublas-f32 Georgi Gerganov 2023-12-19 16:23:39 +02:00
  • da2db0302c Added support for ssl cert and key Concedo 2023-12-19 22:23:19 +08:00
  • ded0613bd4 Fix issues in README on feedback (#15) Holden X 2023-12-19 17:09:33 +08:00
  • e3b4b85caa Update LICENSE and TODOs in README (#14) Holden X 2023-12-19 16:23:10 +08:00
  • 49a5dfc604 Merge branch 'master' into concedo_experimental Concedo 2023-12-19 16:07:48 +08:00
  • 8fece75e35 format code Trần Đức Nam 2023-12-19 15:01:42 +07:00
  • 1f77d2ad73 move multiprocessing import into function scope Concedo 2023-12-19 15:56:58 +08:00
  • 6948da5a0d Fix for windows model unloading not releasing memory (#569) ebolam 2023-12-19 02:55:41 -05:00
  • 4c274dc2fd fix tools compilation Concedo 2023-12-19 15:53:22 +08:00
  • 1b300cbd58 black Trần Đức Nam 2023-12-19 14:51:52 +07:00
  • f8cf783935 update: change order import Trần Đức Nam 2023-12-19 14:45:50 +07:00
  • 1e79625910 update requirements.txt crasm 2023-12-19 02:42:07 -05:00
  • ef61a6667b fix: readme Trần Đức Nam 2023-12-19 14:40:12 +07:00
  • 121b04d121 ci : restrict .github/workflows/build.yml ctest to -L main crasm 2023-12-19 02:19:11 -05:00
  • f80ff4dc6a ci : get ci/run.sh working with test-model-load-cancel crasm 2023-12-19 01:43:27 -05:00
  • 7cebaba8e4 Add paper and bibtex to README.md zeyu 2023-12-19 13:08:26 +08:00
  • f97c587639 update: readme Trần Đức Nam 2023-12-19 11:28:18 +07:00
  • 4cad8d7d7a update: ready for PR Trần Đức Nam 2023-12-19 11:19:01 +07:00
  • 94507911bb Merge remote-tracking branch 'origin/master' into sl/ggml-backend-int slaren 2023-12-19 03:52:34 +01:00
  • dc0552b1b4 Update README.md zeyu 2023-12-19 10:52:54 +08:00
  • 0808aa5a42 add ggml-metal slaren 2023-12-19 03:23:00 +01:00
  • 8e6735ec60 llama : initial ggml-backend integration slaren 2023-12-17 21:21:07 +01:00
  • 53268cbb52 Readme reorg (#12) Jeremy Song 2023-12-19 07:48:31 +08:00
  • a7aee47b98 ggml-cuda: Fix HIP build (#4528) b1661 arlo-phoenix 2023-12-18 22:33:45 +01:00
  • 130fa3232e ggml-cuda: Fix HIP build arlo-phoenix 2023-12-18 21:44:41 +01:00
  • 72e99b73df ui: adding local dependency for prism and markdown preview Deepak Seth 2023-12-18 11:33:35 -08:00
  • aa1967f48a ui: Allow rephrasing last prompt Deepak Seth 2023-12-17 16:33:25 -08:00
  • 622975b742 ui: adding zeromd for markdown handling Deepak Seth 2023-12-17 15:43:26 -08:00
  • ea8d513e6e ui: fix issue with stalechat message Deepak Seth 2023-12-16 21:35:51 -08:00
  • de1bcdad17 ui: Adding Custom Prompts Deepak Seth 2023-12-16 21:19:15 -08:00
  • 810a883444 ui: adding logo and spacing fixes Deepak Seth 2023-12-10 13:39:43 -08:00
  • 7cbaabb38f ui: adding logo and spacing fixes Deepak Seth 2023-12-10 13:39:43 -08:00
  • 0b5a5aeadc ui: updated spacing issue and typo Deepak Seth 2023-12-10 10:38:21 -08:00
  • 8cf2f35223 ui: rebase and resolve conflict Deepak Seth 2023-12-15 14:30:30 -08:00
  • e9e2be33fd Use single queue per device to simplify code 0cc4m 2023-12-18 19:25:46 +01:00
  • 0e18b2e7d0 llama.swiftui : add tinyllama 1.1B F16 b1660 Georgi Gerganov 2023-12-18 20:17:43 +02:00
  • 6ff39b129d llama.swiftui : add more models b1659 Georgi Gerganov 2023-12-18 20:05:12 +02:00
  • b9e74f9bca llama : add phi-2 + fix NeoX rope + ggml_mul_mat_set_prec (#4490) b1658 Ebey Abraham 2023-12-18 17:27:47 +00:00
  • 3c734f4941 plamo : testing gg/plamo-test Georgi Gerganov 2023-12-18 17:06:05 +02:00
  • 602cebba7a Merge 21b68f3032 into 3c04bf6da8 Quinten Kock 2023-12-18 14:41:31 +01:00
  • 3c04bf6da8 llama : fix try_override for bool_value which always return true (#4519) b1657 hankcs 2023-12-18 05:14:58 -08:00
  • c02412c383 cuda : remove obsolete comment Georgi Gerganov 2023-12-18 15:10:17 +02:00
  • 7ea427dbfa cuda : ggml_cuda_op_mul_mat_cublas support F32 precision Georgi Gerganov 2023-12-18 14:24:29 +02:00
  • a462159c43 cuda : ggml_cuda_op_mul_mat_cublas support F32 precision gg/phi-2-2 Georgi Gerganov 2023-12-18 14:24:29 +02:00
  • 30338c5643 Update ggml-cuda.cu Georgi Gerganov 2023-12-18 14:21:38 +02:00
  • 3c8d6b160b Update ggml-cuda.cu Georgi Gerganov 2023-12-18 14:21:22 +02:00
  • 18c67bdd84 ggml : add ggml_mul_mat_set_prec Georgi Gerganov 2023-12-18 13:28:10 +02:00
  • aed3cf838c Attempt at writing ctest_with_model crasm 2023-12-18 04:45:39 -05:00
  • 4b63355f45 ci : ctest uses -L main crasm 2023-12-18 04:23:58 -05:00
  • fd9d247dd2 Label all ctest tests crasm 2023-12-18 04:23:20 -05:00
  • 576d28b7f7 fix: Readme Trần Đức Nam 2023-12-18 15:54:39 +07:00
  • eb9a790c11 update: support 4 models Trần Đức Nam 2023-12-18 15:49:15 +07:00
  • 603c771974 Configurable sparse prediction threshold (#7) Holden X 2023-12-18 16:36:24 +08:00
  • a8d2a6f3ef Merge branch 'master' into HEAD Georgi Gerganov 2023-12-18 10:17:55 +02:00
  • 9339ffc96d update README okada 2023-12-18 16:46:51 +09:00
  • 907b92185c remove develop code okada 2023-12-18 16:32:16 +09:00
  • febc63598b update kqv code okada 2023-12-18 00:16:56 +09:00
  • ca8f698638 seems ok okada 2023-12-17 23:28:29 +09:00
  • f76fd39266 use inp_pos okada 2023-12-17 21:53:04 +09:00
  • 86d5348fd0 runnable okada 2023-12-17 18:29:08 +09:00
  • a22040a810 fix norm_rms_eps hparam okada 2023-12-17 18:15:25 +09:00
  • 4a3ef4f2a4 able to compile okada 2023-12-17 17:44:29 +09:00
  • 9d49236570 update norm okada 2023-12-17 15:44:59 +09:00
  • b2330f57e2 plamo convert okada 2023-12-17 15:23:59 +09:00
  • 4c585b4c6c add tensor loading okada 2023-12-16 16:24:54 +09:00
  • feb0966af1 add plamo mock okada 2023-12-16 15:55:58 +09:00
  • 30a1a0a7e0 Fix try_override for bool_value which always return true ignoring override->bool_value hankcs 2023-12-17 22:47:01 -08:00
  • 597ef34ba1 Update README.md zeyu 2023-12-18 14:21:27 +08:00
  • e851199ad3 update: mistral 7b v1 benchmark Trần Đức Nam 2023-12-18 11:37:09 +07:00
  • 6bba3410fa Simplify .gitignore for tests, clang-tidy fixes crasm 2023-12-17 22:33:38 -05:00
  • fe6a6fb6d1 Revert "Revert "Fail test if model file is missing"" crasm 2023-12-17 22:24:17 -05:00
  • 068e7c408f Add test-model-load-cancel to Makefile crasm 2023-12-17 22:22:42 -05:00
  • 2994f0c5a2 decode : fix logits_valid for legacy API (#4516) b1656 Jared Van Bortel 2023-12-17 19:39:02 -05:00
  • 1b05817112 decode : fix logits_valid for old API ceb/fix-logit-check Jared Van Bortel 2023-12-17 18:49:21 -05:00
  • 397eef509b Implement credentialed CORS according to MDN Laura 2023-12-17 21:11:46 +01:00
  • 0876952924 Implement credentialed CORS according to MDN Laura 2023-12-17 21:11:46 +01:00
  • 2796953257 Revert "Fail test if model file is missing" crasm 2023-12-17 14:37:01 -05:00