Commit graph

  • bf83bff674
    metal : matrix-matrix multiplication kernel (#2615) master-bf83bff Shouzheng Liu 2023-08-16 16:07:04 -04:00
  • 83a4ad7986
    remove trailing whitespace xaedes 2023-08-16 22:05:41 +02:00
  • 83cb9ed4f5
    implement ggml_compute_forward_out_prod_q_f32 xaedes 2023-08-16 22:00:37 +02:00
  • 79ad888768
    remove unused call to non-existent llama_get_layer_from_model xaedes 2023-08-16 21:56:36 +02:00
  • 1151653b15
    replace llama API functions for getting model tensors with a single function that gets a model tensor by name xaedes 2023-08-16 21:36:40 +02:00
  • c40ec5c403
    llama : add hparams.ctx_train + no longer print ftype Georgi Gerganov 2023-08-16 22:05:23 +03:00
  • 8be49fdf9e
    convert-new.py : add gguf key-value pairs Georgi Gerganov 2023-08-16 21:52:06 +03:00
  • bbbc0ce717
    makefile rewrite Henri Vasserman 2023-08-16 21:28:54 +03:00
  • 250cf83847
    convert-new.py : output gguf (WIP) Georgi Gerganov 2023-08-16 20:44:51 +03:00
  • 5765f90f58
    better checks for non-optimized builds slaren 2023-08-16 19:24:54 +02:00
  • 5ec18934ad
    convert-new.py : pick #2427 for HF 70B support Georgi Gerganov 2023-08-16 20:16:15 +03:00
  • c8ee87f141
    gguf.py : merge all files in gguf.py Georgi Gerganov 2023-08-16 19:55:49 +03:00
  • 88b5769487
    gguf : deduplicate (#2629) Georgi Gerganov 2023-08-16 19:25:29 +03:00
  • 795ec7070c
    examples : dedup simple Georgi Gerganov 2023-08-16 19:22:58 +03:00
  • c290f3eee6
    ggml : assert when using ggml_mul with non-F32 src1 Georgi Gerganov 2023-08-16 19:19:46 +03:00
  • 3de6a9aed2
    reenable LLAMA_CUDA_FORCE_DMMV Henri Vasserman 2023-08-16 18:35:16 +03:00
  • 68e79cc134
    Merge 'origin/master' into hipblas Henri Vasserman 2023-08-16 18:25:14 +03:00
  • f3e90f27de
    convert-llama-h5-to-gguf.py : support alt ctx param name klosax 2023-08-16 17:10:29 +02:00
  • 39a2d15461
    avoid stack overflow resulting from big ggml_cgraph xaedes 2023-08-16 16:42:25 +02:00
  • 0ab2507ce5
    fix names of lora tensors xaedes 2023-08-16 16:41:20 +02:00
  • 6412e97427
    llama : restore the original load/save session implementation Georgi Gerganov 2023-08-16 17:35:37 +03:00
  • 620275361d
    add debug prints for training memory improvements xaedes 2023-08-16 16:23:21 +02:00
  • be7e564b11
    bug fixes to make finetune compile xaedes 2023-08-16 16:21:43 +02:00
  • 50b1e66200
    remove const model and layer arguments in API functions for accessing model tensors xaedes 2023-08-16 16:21:02 +02:00
  • 3e3396e2e5
    remove n_prompt and n_gen from the matrix, use each value separately instead slaren 2023-08-16 15:45:39 +02:00
  • 19e9beabb3
    print warning if NDEBUG is not defined slaren 2023-08-16 15:36:56 +02:00
  • 28ee0c8583
    first draft for LORA finetune training xaedes 2023-08-16 15:31:04 +02:00
  • c0a372fd3d
    add API functions to access remaining model parameters: xaedes 2023-08-16 15:30:31 +02:00
  • 5b94b14d5d
    llama : fix strncpy warning + note token_to_str does not write null Georgi Gerganov 2023-08-16 15:28:09 +03:00
  • a49931300a
    llama.cpp : fix line feed and compiler warning klosax 2023-08-16 14:43:48 +02:00
  • dd6eaa32e4
    ggml : fix warnings about unused results Georgi Gerganov 2023-08-16 15:04:13 +03:00
  • 1891c928a4
    dedup : CPU + Metal is working Georgi Gerganov 2023-08-16 14:56:51 +03:00
  • d72a23e2f1
    gguf : better type names Georgi Gerganov 2023-08-16 14:37:07 +03:00
  • 758ff1bbb5
    llama : refactor model loading code (#2620) Georgi Gerganov 2023-08-16 14:34:03 +03:00
  • 6823899f2d
    llama : switch print order of meta data Georgi Gerganov 2023-08-16 14:32:59 +03:00
  • e524750a6c
    llama : improve printing + log meta data Georgi Gerganov 2023-08-16 14:24:04 +03:00
  • f634b292c9
    llama : throw error on missing KV pairs in model meta data Georgi Gerganov 2023-08-16 13:44:35 +03:00
  • c1fe0aba72
    llama : fix Windows build + fix norm_rms_eps key Georgi Gerganov 2023-08-16 13:09:43 +03:00
  • ea5615a03a
    convert-llama-h5-to-gguf.py : clarify the reverse permute klosax 2023-08-16 11:23:15 +02:00
  • 31fb56e1d3
    llama : fix shape prints Georgi Gerganov 2023-08-16 11:38:17 +03:00
  • 5339b859ec
    llama : refactor llama_model_loader (WIP) Georgi Gerganov 2023-08-16 00:02:25 +03:00
  • 075d079a72
    Merge branch 'master' into concedo_experimental Concedo 2023-08-16 10:43:06 +08:00
  • 444e781f09
    style-fix Shouzheng Liu 2023-08-15 22:24:24 -04:00
  • f9bbc6f281
    add missing include slaren 2023-08-16 03:56:52 +02:00
  • 9b9905f9b8
    metal: enable ggml-alloc lshzh-ww 2023-08-15 21:35:38 -04:00
  • f2cf01ddd2
    improve markdown formatting slaren 2023-08-16 02:39:15 +02:00
  • 52b94f42c8
    add Bessel's correction to stdev calculation slaren 2023-08-16 00:25:59 +02:00
  • 6ab6971242
    add missing include slaren 2023-08-15 23:02:07 +02:00
  • 6597d61ad7
    fix msvc build slaren 2023-08-15 22:55:05 +02:00
  • 7ec6158eec
    add to examples CMakeLists.txt slaren 2023-08-15 22:50:38 +02:00
  • cfc7017b5a
    llama : add benchmark example slaren 2023-08-15 20:53:14 +02:00
  • 23248d7d32
    llama : minor simplifications Georgi Gerganov 2023-08-15 22:41:55 +03:00
  • f477fb069b
    llama : reorder definitions in .cpp to match .h Georgi Gerganov 2023-08-15 22:29:56 +03:00
  • afd135a64c
    llama : merge gguf-util.h in llama.cpp Georgi Gerganov 2023-08-15 22:09:56 +03:00
  • a02b809a2e
    llama : move hparams and vocab from gguf_file_loader to llama_model_loader Georgi Gerganov 2023-08-15 21:09:27 +03:00
  • 4a1741aa2d
    gptneox-main.cpp : add tensor data layout klosax 2023-08-15 19:56:19 +02:00
  • 2ae0e985b3
    convert-llama-7b-pth-to-gguf.py : add tensor data layout klosax 2023-08-15 19:55:13 +02:00
  • 66756c82af
    convert-llama-h5-to-gguf.py : add tensor data layout klosax 2023-08-15 19:54:33 +02:00
  • 6c3f824697
    llama : simplify gguf_file_loader Georgi Gerganov 2023-08-15 20:53:53 +03:00
  • b6056c3db8
    gguf.py : add tensor data layout klosax 2023-08-15 19:53:44 +02:00
  • 2906d5492d
    gguf : remove obsolete gguf_get_arr_xxx API Georgi Gerganov 2023-08-15 20:46:18 +03:00
  • 1751bd4693
    gguf : remove obsolete write methods Georgi Gerganov 2023-08-15 20:41:53 +03:00
  • f7a6aa9911
    gguf : streaming support when writing files Georgi Gerganov 2023-08-15 19:57:37 +03:00
  • a527eccb43
    metal: fix bugs for GQA and perplexity test. lshzh-ww 2023-08-15 11:31:13 -04:00
  • 4ef5e792e3
    llama : replace gguf_file_saver with new gguf write API Georgi Gerganov 2023-08-15 16:30:07 +03:00
  • 03297c1a7c
    simplify code and allow using the directory containing the file as a valid value, as intended in the first place Marc 2023-08-15 16:35:06 +02:00
  • 7e88677af4
    Add support for q4_1, q5_0, q5_1 and q8_0 0cc4m 2023-08-15 15:38:57 +02:00
  • 35177d735d
    gguf : minor Georgi Gerganov 2023-08-15 16:05:23 +03:00
  • c9b2f7f1bf
    gguf : fixes + simplify example + add ggml_nbytes_pad() Georgi Gerganov 2023-08-15 16:01:38 +03:00
  • 9eb1ef8653
    move and remove code xaedes 2023-08-15 14:03:02 +02:00
  • 5e059ace25
    add stub example for finetuning, based on train-text-from-scratch xaedes 2023-08-15 13:54:28 +02:00
  • 316b0707f4
    add API functions to access llama model tensors xaedes 2023-08-06 17:28:22 +02:00
  • 4463965401
    gguf : fix header write Georgi Gerganov 2023-08-15 14:39:27 +03:00
  • f6ecd15f83
    gguf : initial write API ready + example Georgi Gerganov 2023-08-15 14:35:00 +03:00
  • 85ebfb8e5d
    gguf : write to file API (not tested) Georgi Gerganov 2023-08-15 14:26:28 +03:00
  • 5cb9d9a87f
    gguf : initial write API (not tested yet) Georgi Gerganov 2023-08-15 13:39:10 +03:00
  • 2d87c9c796
    llama : refactor tensor names (#2622) M. Yusuf Sarıgöz 2023-08-15 13:29:30 +03:00
  • 29743cb83b
    gguf : define tensor names as constants M. Yusuf Sarıgöz 2023-08-15 12:54:11 +03:00
  • 693bd398c5
    gguf: update tensor names searched in quantization M. Yusuf Sarıgöz 2023-08-15 12:37:10 +03:00
  • da424b6699
    llama : gguf_file_saver write I32 Georgi Gerganov 2023-08-15 11:31:42 +03:00
  • 9574f41818
    llama : no need to pass full file loader to the file saver Georgi Gerganov 2023-08-15 11:22:37 +03:00
  • 5c85332e99
    llama : simplify write_header() Georgi Gerganov 2023-08-15 11:11:22 +03:00
  • 6e29ed52fb
    llama : fix method names Georgi Gerganov 2023-08-15 11:10:26 +03:00
  • c9c0b758d4
    llama : simplify gguf_file_saver Georgi Gerganov 2023-08-15 11:09:26 +03:00
  • 66ce19aecb
    llama : fix quantization using gguf tool Georgi Gerganov 2023-08-15 10:55:42 +03:00
  • a82e3a4d92
    llama : style formatting + remove helper methods Georgi Gerganov 2023-08-15 08:51:07 +03:00
  • b5ffb2849d
    scripts : add helper script to get wikitext Georgi Gerganov 2023-08-15 10:04:58 +03:00
  • c545d85f83
    Merge branch 'gguf' of https://github.com/ggerganov/llama.cpp into gguf goerch 2023-08-15 08:24:56 +02:00
  • 99e0e90718
    Improved tokenizer test goerch 2023-08-15 08:23:35 +02:00
  • 469d70be45
    add support for precompiled binaries, used as a fallback Concedo 2023-08-15 13:49:05 +08:00
  • d2049bf03f
    fix lint and add Makefile drbh 2023-08-15 00:07:24 -04:00
  • 995ddb963d
    adds simple llama grammar tests drbh 2023-08-14 23:55:45 -04:00
  • bfa455de43
    metal: fix performance degradation from gqa lshzh-ww 2023-08-14 23:10:27 -04:00
  • 5f6de2a2bb
    metal: matrix-matrix multiplication kernel lshzh-ww 2023-08-14 21:11:19 -04:00
  • 2dd5d2c92c
    convert-llama-h5-to-gguf.py : add 70b gqa support klosax 2023-08-15 00:43:10 +02:00
  • 006e74a493
    Merge branch 'master' into server-probs jhen 2023-08-15 06:14:57 +08:00
  • 3ebb00935f
    server : add missing /json-schema-to-grammar.mjs (#2616) master-3ebb009 Jhen-Jie Hong 2023-08-15 06:14:14 +08:00
  • ca4758290c
    gguf-llama.cpp : fix n_head_kv klosax 2023-08-14 23:18:41 +02:00
  • 6a316fc1ab
    server : add missing /json-schema-to-grammar.mjs jhen 2023-08-15 04:05:34 +08:00
  • ab2cbd03ca
    convert-llama-7b-pth-to-gguf.py : add token types klosax 2023-08-14 22:10:50 +02:00