Commit graph

  • 10151bee2e server : support for saving templates in browser LocalStorage (#2486) master-10151be staviq 2023-08-17 23:34:01 +00:00
  • af2cd7f8be fix test-llama-grammar Evan Jones 2023-08-17 19:17:43 -04:00
  • 306070c896 llama.cpp : print kv general.name klosax 2023-08-18 01:06:27 +02:00
  • e029b50351 Merge remote-tracking branch 'upstream/master' into fix-unicode-2 Evan Jones 2023-08-17 19:01:19 -04:00
  • 0bb897c82a bug fix: actually use result type passed to ggml_add_cast xaedes 2023-08-17 23:48:30 +02:00
  • 15f7448611 server : update xxd usage for older versions compatibility Jhen 2023-08-18 06:16:43 +08:00
  • 0992a7b8b1 README: fix LLAMA_CUDA_MMV_Y documentation (#2647) Johannes Gäßler 2023-08-17 23:57:59 +02:00
  • e830035c1c server : use xxd in public/ for simplify func name Jhen 2023-08-18 05:45:08 +08:00
  • 1664826c5a README: fix LLAMA_CUDA_MMV_Y documentation JohannesGaessler 2023-08-17 23:37:54 +02:00
  • d9e6890a51 test-tokenizer-0.cpp : fix warning klosax 2023-08-17 23:34:21 +02:00
  • 147a99bd3a gguf.py : reverse GGUF_MAGIC klosax 2023-08-17 23:24:04 +02:00
  • c20ae49b59 ggml.h : reverse GGUF_MAGIC klosax 2023-08-17 23:23:17 +02:00
  • 3b4368471a server : better default prompt Georgi Gerganov 2023-08-17 23:27:42 +03:00
  • df87dd74a5 formatting slaren 2023-08-17 22:11:40 +02:00
  • 6ddeefad9b [Zig] Fixing Zig build and improvements (#2554) Henri Vasserman 2023-08-17 23:11:18 +03:00
  • 3c1b7217a9 convert-llama-7b-pth-to-gguf.py : fixes klosax 2023-08-17 21:44:34 +02:00
  • 9e2d4dd48e convert-llama-hf-to-gguf.py : fixes klosax 2023-08-17 21:43:48 +02:00
  • 640ddc4259 gguf.py : gptneox mapping klosax 2023-08-17 21:43:10 +02:00
  • b668cd3296 convert-gptneox-hf-to-gguf.py : fixes klosax 2023-08-17 21:42:26 +02:00
  • fc3a523211 gguf.py : write tensors in a single pass (#2644) M. Yusuf Sarıgöz 2023-08-17 21:57:39 +03:00
  • 6a9e6375b5 gguf.py : indentation gguf-write-single-pass Georgi Gerganov 2023-08-17 21:53:15 +03:00
  • 307e09cd85 Merge branch 'gguf' into gguf-write-single-pass Georgi Gerganov 2023-08-17 21:51:15 +03:00
  • e426b3cfc8 gguf.py : fix vertical alignment Georgi Gerganov 2023-08-17 21:50:01 +03:00
  • 5484737d58 llama : fix tensor name grepping during quantization Georgi Gerganov 2023-08-17 21:40:51 +03:00
  • 57eaadb853 llama : throw error if gguf fails to init from file Georgi Gerganov 2023-08-17 21:31:52 +03:00
  • b3cc182990 llama.cpp : typo klosax 2023-08-17 20:27:50 +02:00
  • acaa98234a convert.py : fix HF tensor permuting / unpacking Georgi Gerganov 2023-08-17 21:06:45 +03:00
  • 78e1e57862 quantize-stats.cpp : .bin --> .gguf klosax 2023-08-17 19:18:24 +02:00
  • fb11dd3f92 common.h : .bin --> .gguf klosax 2023-08-17 19:16:35 +02:00
  • e72c8c2124 ggml : fix bug in gguf_set_kv Georgi Gerganov 2023-08-17 20:13:12 +03:00
  • 4dbce7d009 gguf : rm file_type key and method M. Yusuf Sarıgöz 2023-08-17 20:02:38 +03:00
  • 1d93d04ce2 gguf : refactor pth to gguf conversion script M. Yusuf Sarıgöz 2023-08-17 19:58:27 +03:00
  • 899f9a5350 llama : fix lambda capture Georgi Gerganov 2023-08-17 19:49:21 +03:00
  • 93f285bdf1 gptneox : move as a WIP example Georgi Gerganov 2023-08-17 19:38:48 +03:00
  • f71704177f gguf : rename h5 to hf (for HuggingFace) M. Yusuf Sarıgöz 2023-08-17 19:49:15 +03:00
  • 81a2c2a6f4 llama : fix llama_model_loader memory leak Georgi Gerganov 2023-08-17 19:49:02 +03:00
  • 9f02694c91 gguf : refactor gptneox conversion script M. Yusuf Sarıgöz 2023-08-17 19:45:06 +03:00
  • dd9e2fc988 ci : update ".bin" to ".gguf" extension Georgi Gerganov 2023-08-17 19:32:14 +03:00
  • c3b739374e editorconfig : ignore models folder Georgi Gerganov 2023-08-17 19:17:25 +03:00
  • 22c61c5b45 gguf : style fixes in simple conversion script M. Yusuf Sarıgöz 2023-08-17 19:05:43 +03:00
  • 6d66ef96eb Merge branch 'master' into gguf Georgi Gerganov 2023-08-17 19:04:59 +03:00
  • 11bf4366c2 llama : sync with recent PRs on master Georgi Gerganov 2023-08-17 19:03:15 +03:00
  • 2f8fc92d86 gguf : fix conflicts M. Yusuf Sarıgöz 2023-08-17 18:51:14 +03:00
  • 8ace03ad3d convert.py : better always have n_head_kv and default it to n_head Georgi Gerganov 2023-08-17 18:47:06 +03:00
  • b6c81e28cd improve formatting slaren 2023-08-17 17:28:55 +02:00
  • d646c4efce convert.py : n_head_kv optional and .gguf file extension klosax 2023-08-17 17:20:36 +02:00
  • 36b0c5b398 fix for incorrect missing backends displayed Concedo 2023-08-17 22:45:49 +08:00
  • 5b8485b6ae Regen index.html.cpp, suggested from code review staviq 2023-08-17 16:42:45 +02:00
  • 4a18c88143 Merge branch 'ggerganov:master' into master staviq 2023-08-17 14:34:02 +00:00
  • bd815c9c86 Apply suggestions from code review staviq 2023-08-17 14:29:36 +00:00
  • dd016cc246 Revert "ci : disable CI temporary to not waste energy" Georgi Gerganov 2023-08-17 17:23:16 +03:00
  • 2ddd9681d6 convert.py : update to support GGUF output Georgi Gerganov 2023-08-17 17:22:43 +03:00
  • e0429d38e4 convert-new.py : output gguf (#2635) Georgi Gerganov 2023-08-17 17:19:52 +03:00
  • 663d952abb llama : style fixes Georgi Gerganov 2023-08-17 17:19:31 +03:00
  • 3839704062 convert-new.py : minor fixes Georgi Gerganov 2023-08-17 17:16:26 +03:00
  • 5d044403d3 Merge branch 'gguf' into gguf-convert Georgi Gerganov 2023-08-17 17:04:49 +03:00
  • 39362f3485 gguf.py : pick some of the refactoring from #2644 Georgi Gerganov 2023-08-17 17:02:01 +03:00
  • 5f97a48fc1 gguf : single pass for writing tensors + refactoring writer M. Yusuf Sarıgöz 2023-08-17 16:57:50 +03:00
  • 673ae1a17e convert-new.py : convert script now works Georgi Gerganov 2023-08-17 16:52:25 +03:00
  • dce07c3121 gguf : single pass for writing tensors + refactoring writer M. Yusuf Sarıgöz 2023-08-17 16:48:49 +03:00
  • 8dae7ce684 Add --cfg-negative-prompt-file option for examples (#2591) master-8dae7ce Kerfuffle 2023-08-17 07:29:44 -06:00
  • d6fd53afd6 llama.cpp : use ggml_elements() klosax 2023-08-17 15:24:35 +02:00
  • 5a0a2c5685 llama.cpp : print actual model size klosax 2023-08-17 15:18:16 +02:00
  • af4960a5a5 server : attempt use valid xxd command on linux Jhen 2023-08-17 20:30:48 +08:00
  • 7eaa315631 convert-new.py : add map for skipping tensor serialization Georgi Gerganov 2023-08-17 15:40:39 +03:00
  • f31e9230ad gguf : single pass for writing tensors + refactoring writer M. Yusuf Sarıgöz 2023-08-17 15:19:30 +03:00
  • 580e02e11e server : always regenerate asset hpp before compile Jhen 2023-08-17 19:40:27 +08:00
  • 86bc9d2750 convert-new.py : tensor name mapping Georgi Gerganov 2023-08-17 13:15:17 +03:00
  • 6c26109743 Minor formatting change. KerfuffleV2 2023-08-17 04:48:27 -06:00
  • e970845383 tests : add new ggml-vocab-llama.gguf Georgi Gerganov 2023-08-17 12:38:34 +03:00
  • 7b6ae89041 llama : fix tokenizer to use llama_char_to_byte Georgi Gerganov 2023-08-17 12:27:26 +03:00
  • 0ba5d488e5 convert-new.py : vocab-only option should work now Georgi Gerganov 2023-08-17 12:00:13 +03:00
  • f9db574bbf convert-new.py : minor fixes Georgi Gerganov 2023-08-16 23:11:21 +03:00
  • a73ccf1aa3 llama : replace (permute + reshape + view_1d) with (view_3d) (#2538) master-a73ccf1 Georgi Gerganov 2023-08-17 10:47:09 +03:00
  • ccfe9080cd llama : remove commented code Georgi Gerganov 2023-08-17 10:45:21 +03:00
  • 7cf54e1f74 tests : adds simple llama grammar tests (#2618) master-7cf54e1 drbh 2023-08-17 03:41:01 -04:00
  • a872a2b28e ggml-alloc : fix discrepency between measure&eval (#2639) master-a872a2b Shouzheng Liu 2023-08-17 03:35:53 -04:00
  • 42f8fe1927 examples/gguf : no need to keep q option for quantization any more M. Yusuf Sarıgöz 2023-08-17 08:56:42 +03:00
  • d864596e0a Merge branch 'gguf' of https://github.com/ggerganov/llama.cpp into gguf goerch 2023-08-17 04:55:26 +02:00
  • 3bade857c7 cleanup grammar at end of test drbh 2023-08-16 21:48:23 -04:00
  • 1108394acd ggml-alloc: fix discrepency between measure&eval lshzh-ww 2023-08-16 21:24:58 -04:00
  • 9c3866099b cleanup slaren 2023-08-17 03:22:19 +02:00
  • 94218e8ade markdown: add build id slaren 2023-08-17 03:12:03 +02:00
  • 569dc6f3d0 markdown: also show values that differ from the default slaren 2023-08-17 03:06:07 +02:00
  • 9e05cc1d69 avoid dangling pointers in candidate cleanup drbh 2023-08-16 20:58:37 -04:00
  • cac70312e3 add basic cpu and gpu info (linx/cuda only) slaren 2023-08-17 02:50:04 +02:00
  • d49dc3d628 0 terminate code_points drbh 2023-08-16 20:30:26 -04:00
  • 67362d9db0 add sql output slaren 2023-08-17 02:19:52 +02:00
  • 314a6b5422 fix json formatting slaren 2023-08-17 00:12:40 +02:00
  • 89a70f78e7 llama.cpp : fix MEM_REQ_SCRATCH0 reusing the value of n_ctx of the first call slaren 2023-08-16 22:40:53 +02:00
  • 714fec06ee use ggml_add_cast in finetuning xaedes 2023-08-16 23:53:12 +02:00
  • 9198b24e4e add ggml_add_cast API function xaedes 2023-08-16 23:50:46 +02:00
  • 54113caf0d convert autosave invocation to useEffect staviq 2023-08-16 23:45:59 +02:00
  • 57af6dd320 sync accepted #2409 fix from upstream staviq 2023-08-16 23:19:55 +02:00
  • c88c2a992a probably lld is not required Henri Vasserman 2023-08-16 23:17:52 +03:00
  • 0919a0f73d cmake : install ggml-meta.metal if LLAMA_METAL (#2449) master-0919a0f Kolen Cheung 2023-08-16 21:09:49 +01:00
  • ed53db86c3 metal : print error of load pipeline state (#2564) Jhen-Jie Hong 2023-08-17 04:09:03 +08:00
  • f80e245d7b add lora finetune support on quantized base model tensors xaedes 2023-08-16 22:06:20 +02:00
  • fc8ef549e5 metal : enable ggml-alloc (#2627) master-fc8ef54 Shouzheng Liu 2023-08-16 16:08:28 -04:00
  • 13a746c6b9 Merge branch 'master' into metal-memory-alloc Georgi Gerganov 2023-08-16 23:08:03 +03:00