Commit graph

  • 6b3a7b9f4f Update convert-llama-h5-to-gguf.py klosax 2023-07-31 03:02:00 +02:00
  • 4f5b6224be Update convert-gptneox-h5-to-gguf.py klosax 2023-07-31 03:00:20 +02:00
  • 5073b0f5d8 CUDA: fewer memory bank conflicts for mul_mat_q JohannesGaessler 2023-07-30 13:17:29 +02:00
  • 74fb31bd35 move asserts slaren 2023-07-30 19:50:57 +02:00
  • dc6e677c40 Reduce overhead of mul_f32 calls by using a single command buffer 0cc4m 2023-07-30 19:10:15 +02:00
  • 485bbe1a78 fix Metal backend broken from the allocator changes slaren 2023-07-30 18:26:22 +02:00
  • 2a0914673c Update convert-gptneox-h5-to-gguf.py klosax 2023-07-30 17:31:11 +02:00
  • 068a8e0fbe Update convert-llama-h5-to-gguf.py klosax 2023-07-30 17:29:56 +02:00
  • a37d31f29b use the appropriate format specifier for size_t, which is %zu mendax0110 2023-07-30 17:06:10 +02:00
  • 30c4ea47e6 add gptneox gguf example klosax 2023-07-30 16:59:26 +02:00
  • 20bd792736 make auto const mendax0110 2023-07-30 16:59:17 +02:00
  • da8fe7ac02 Merge branch 'ggerganov:master' into master m3ndax 2023-07-30 16:58:17 +02:00
  • 5ea5d19d6a SSE emoji fix Concedo 2023-07-30 22:31:20 +08:00
  • a9a2647536 fixed whitespace maddes8cht 2023-07-30 16:30:53 +02:00
  • 2fabc176ce Update convert-llama-h5-to-gguf.py klosax 2023-07-30 16:28:08 +02:00
  • 582c825738 Use single command buffer for matrix vector multiplication ops 0cc4m 2023-07-30 16:25:58 +02:00
  • a113689571 ggml : add graph tensor allocator (#2411) master-a113689 slaren 2023-07-30 15:58:01 +02:00
  • 570aa7ceeb rename ggml_allocator to ggml_allocr slaren 2023-07-29 15:01:43 +02:00
  • 9df732dae4 introduce validate_params, use it in gpt_params_parse. maddes8cht 2023-07-30 15:28:23 +02:00
  • f175b05872 Makefile : add gptneox gguf example klosax 2023-07-30 15:08:37 +02:00
  • e9192b0135 add gptneox gguf example klosax 2023-07-30 15:05:37 +02:00
  • 4ed98bf1ab Update convert-llama-h5-to-gguf.py klosax 2023-07-30 15:01:47 +02:00
  • b19c11750b ggml.c : add gguf_get_arr_n klosax 2023-07-30 14:58:50 +02:00
  • b4676ee447 ggml.h : increase GGML_MAX_NAME to 64 klosax 2023-07-30 14:51:37 +02:00
  • ccd81a751b gguf.py : add layer norm eps and merges klosax 2023-07-30 14:48:14 +02:00
  • 0790c121aa constants.py : add layer norm eps klosax 2023-07-30 14:46:36 +02:00
  • 82d0695f0f Merge commit '9baf9ef304' into concedo_experimental Concedo 2023-07-30 18:18:23 +08:00
  • 90a37d63d5 up ver, added warning for max context Concedo 2023-07-30 18:07:14 +08:00
  • c8af65760f Hide unavailable backends & Add tooltip over backend count (#352) YellowRoseCx 2023-07-30 04:50:55 -05:00
  • 45456fa6ca switch noavx2 to not use openblas, as it has incompatible instructions Concedo 2023-07-30 16:47:33 +08:00
  • 23825abee1 fix wrong key Concedo 2023-07-30 14:30:46 +08:00
  • 87c34e4dd4 gguf : update convert-llama-h5-to-gguf.py M. Yusuf Sarıgöz 2023-07-30 01:09:22 +03:00
  • 32e037ffbe gguf : fix set is not subscriptable M. Yusuf Sarıgöz 2023-07-30 01:01:13 +03:00
  • 11f3ca06b8 CUDA: Quantized matrix matrix multiplication (#2160) master-11f3ca0 Johannes Gäßler 2023-07-29 23:04:44 +02:00
  • 9baf9ef304 CUDA: faster multi GPU synchronization (#2448) master-9baf9ef Johannes Gäßler 2023-07-29 23:04:10 +02:00
  • 06c3e4a1a7 Update convert-llama-h5-to-gguf.py klosax 2023-07-29 21:38:01 +02:00
  • d641b80660 CUDA: faster multi GPU synchronization JohannesGaessler 2023-07-29 20:53:30 +02:00
  • 9577821487 gguf.py : support any type klosax 2023-07-29 21:29:07 +02:00
  • 0b206788dc add static to test-grad0.c internal functions netrunnereve 2023-07-29 15:12:05 -04:00
  • 2c22e3bcdb ggml.c : get arr str and f32 klosax 2023-07-29 20:37:47 +02:00
  • 49580fe816 c++11 cannot use designated initializers netrunnereve 2023-07-29 14:36:33 -04:00
  • 34469b9ea7 ggml.h : get array str and f32 klosax 2023-07-29 20:36:06 +02:00
  • dc9b9f3272 fix hellaswag print format, cast away warning in test-double-float netrunnereve 2023-07-29 13:55:53 -04:00
  • 0bb22bb4df Fix multi GPU out-of-bounds JohannesGaessler 2023-07-29 19:31:30 +02:00
  • 0f5e57f01d gguf : handle already encoded string M. Yusuf Sarıgöz 2023-07-29 19:56:06 +03:00
  • 0b5f989122 Fix CMakeLists.txt JohannesGaessler 2023-07-29 17:45:13 +02:00
  • 4336231a32 add hipBLAS to README Henri Vasserman 2023-07-29 18:35:56 +03:00
  • 8ad7cd49fb Update convert-llama-h5-to-gguf.py klosax 2023-07-29 16:47:00 +02:00
  • c0dfd5a5e0 Fix CMakeLists.txt JohannesGaessler 2023-07-29 16:04:19 +02:00
  • 592594f110 Merge branch 'ggerganov:master' into develop Stephen Nichols 2023-07-29 08:17:32 -05:00
  • f8e3fc6c74 rocblas init stuff Henri Vasserman 2023-07-29 14:16:46 +03:00
  • 0317c41d98 gguf : upd gguf conversion script M. Yusuf Sarıgöz 2023-07-29 13:31:07 +03:00
  • cc3dd7f042 gguf : write tokenizer data M. Yusuf Sarıgöz 2023-07-29 13:30:22 +03:00
  • 8a76dd8a85 gguf : write tensors one by one M. Yusuf Sarıgöz 2023-07-29 13:17:28 +03:00
  • d2ade639f4 Merge 'origin/master' into hipblas Henri Vasserman 2023-07-29 12:59:48 +03:00
  • c861e234f4 gguf : write tensors one by one M. Yusuf Sarıgöz 2023-07-29 12:49:01 +03:00
  • cde3760e52 Merge branch 'master' into concedo_experimental Concedo 2023-07-29 17:47:00 +08:00
  • 0c219fb5b5 gguf : fix writing gguf arrays M. Yusuf Sarıgöz 2023-07-29 12:42:54 +03:00
  • aa4b2c9375 Updated README, CMakeLists JohannesGaessler 2023-07-29 11:40:56 +02:00
  • 93f7f7aef7 gguf : write tensors one by one and code reuse M. Yusuf Sarıgöz 2023-07-29 12:34:35 +03:00
  • 9589d52079 added help link Concedo 2023-07-29 17:33:15 +08:00
  • aa99562d70 Merge branch 'gguf' of https://github.com//ggerganov/llama.cpp into gguf M. Yusuf Sarıgöz 2023-07-29 12:26:11 +03:00
  • ea5f9ad2ca gguf : fix writing gguf arrays M. Yusuf Sarıgöz 2023-07-29 12:25:43 +03:00
  • 999431c4b6 quick and dirty conversion example klosax 2023-07-29 11:20:05 +02:00
  • d54f53ca51 gguf : add tokenization constants M. Yusuf Sarıgöz 2023-07-29 12:04:45 +03:00
  • a4e9c92292 Merge branch 'ggerganov:master' into master m3ndax 2023-07-29 10:15:57 +02:00
  • 06f423a8e1 gguf : write sample tensors to read M. Yusuf Sarıgöz 2023-07-29 10:26:26 +03:00
  • 08dc8fd884 gguf : do not hardcode tensor names to read M. Yusuf Sarıgöz 2023-07-29 10:24:46 +03:00
  • 656c1ab302 DMMV_F16 -> F16 JohannesGaessler 2023-07-29 08:28:14 +02:00
  • 495c898171 Update Makefile JohannesGaessler 2023-07-29 08:22:25 +02:00
  • 038ed63195 Updated Makefile JohannesGaessler 2023-07-29 08:03:30 +02:00
  • 3c09e11c97 GGML_CUDA_MMQ_Y JohannesGaessler 2023-07-29 07:31:38 +02:00
  • e4b42e5b15 fixed gui bugs Concedo 2023-07-29 11:15:57 +08:00
  • 22cb368dd9 remove trailing whitespace xaedes 2023-07-28 23:55:30 +02:00
  • 9475cdb7a3 Merge branch 'gguf-write-tokenization' into gguf M. Yusuf Sarıgöz 2023-07-29 00:36:35 +03:00
  • 1495735aac gguf : fix writing tensors M. Yusuf Sarıgöz 2023-07-29 00:26:22 +03:00
  • c1a5e116a4 llama training : fix ggml_rms_norm_back calls to pass configurable eps xaedes 2023-07-28 23:10:55 +02:00
  • ecdc16163e ggml : update ggml_rms_norm_back with configurable eps xaedes 2023-07-28 23:09:56 +02:00
  • 87035b96f7 remove out-commented vectorized code of opt_adam xaedes 2023-07-03 18:56:05 +02:00
  • 0f6a8ab519 tighten abs error bounds for sqrt in test-grad0 xaedes 2023-07-03 18:48:57 +02:00
  • 47055c929f tighten abs error bounds for flash_attn in test-grad0 xaedes 2023-07-03 18:45:54 +02:00
  • dbbc263313 add conditional compilation of using F16 exp in flash attention xaedes 2023-07-03 18:45:18 +02:00
  • 1065c3b7b9 tighten abs error bounds for cross_entropy_loss in test-grad0 xaedes 2023-07-03 18:35:11 +02:00
  • 24a4b099f3 change sampling parameters for prediction after training to defaults of common.h xaedes 2023-07-03 18:24:57 +02:00
  • 17a0898d50 fix increase of model.train_samples and model.train_tokens xaedes 2023-07-03 17:58:09 +02:00
  • 58024d3e5f rename training parameter cos-decay-alpha to cos-decay-min and clarify that adam-min-alpha also applies to warmup xaedes 2023-07-03 17:57:08 +02:00
  • e6ff0728e0 add minimum number of tensor dimensions to apply weight decay (default 2) xaedes 2023-07-02 23:01:38 +02:00
  • d7aa4d9576 use optimization callback in training xaedes 2023-07-02 22:18:50 +02:00
  • bfc3119139 add optimization callback to ggml_opt_resume_g xaedes 2023-07-02 22:15:08 +02:00
  • e843d6e71c measure and print total training time xaedes 2023-07-02 21:38:52 +02:00
  • ff759d957c remove unused function argument from get_example_targets_batch xaedes 2023-07-02 21:38:03 +02:00
  • ce937bc431 replace memcpy with reshape operation so that the graph is not cut at the input xaedes 2023-07-02 21:36:56 +02:00
  • c6a18e15c1 add more training parameters: xaedes 2023-07-02 21:33:47 +02:00
  • d0fbb7d328 llama : fix rope usage in train-text-from-scratch after ChatGLM change xaedes 2023-07-28 23:05:02 +02:00
  • fc379a2de3 disable gradient checkpointing debug output xaedes 2023-07-02 21:12:25 +02:00
  • 3744a9be74 improve gradient checkpointing xaedes 2023-07-02 21:11:11 +02:00
  • 51dc77092f change cross_entropy_loss to output average over all rows xaedes 2023-07-02 21:05:12 +02:00
  • 87febeec91 improve finite differences of test-grad0 by using double instead of float xaedes 2023-07-02 20:59:36 +02:00
  • 864e7e3aa1 fix test-grad0 for soft_max xaedes 2023-07-02 20:58:52 +02:00
  • 2d1e6e0675 fix test-grad0 for cross_entropy_loss xaedes 2023-07-02 20:57:58 +02:00