Commit graph

  • 58b367c2d7
    cuBLAS: refactor and optimize f16 mat mul performance (#1259) master-58b367c slaren 2023-05-01 18:11:07 +02:00
  • ea3a0ad6b6
    llama : update stubs for systems without mmap and mlock (#1266) master-ea3a0ad xloem 2023-05-01 08:58:51 -04:00
  • a79756b210 cuBLAS: update block_q5_1 Slaren 2023-05-01 14:51:03 +02:00
  • 1c4dc1e498
    update quantization types in switch-case of add_at and add1 xaedes 2023-05-01 14:30:29 +02:00
  • 72bcfb50c8
    successfully test backward pass of repeat xaedes 2023-05-01 01:11:41 +02:00
  • 8b5b2f089e
    fix backward pass for repeat xaedes 2023-05-01 01:11:12 +02:00
  • ba62c79bd5
    add missing GGML_OP_SUM_ROWS xaedes 2023-05-01 14:29:52 +02:00
  • c4539ede53
    add operation ggml_sum_rows xaedes 2023-05-01 01:10:30 +02:00
  • 2277053839
    add todos for llama backward pass xaedes 2023-04-30 21:42:52 +02:00
  • 2ecc690980
    successfully test backward pass of rms_norm xaedes 2023-04-30 21:39:03 +02:00
  • 84a4b39917
    fix backward pass for rms_norm xaedes 2023-04-30 21:34:21 +02:00
  • b18b72da00
    successfully test backward pass of view_1d, view_2d and view_3d xaedes 2023-04-30 17:22:25 +02:00
  • 84436383eb
    fix view backward pass xaedes 2023-04-30 17:21:21 +02:00
  • f0302fa71b
    successfully test get_rows backward xaedes 2023-04-28 20:32:00 +02:00
  • 96e773bbde
    fix get rows backward pass xaedes 2023-04-28 20:31:36 +02:00
  • 7281f60572
    move dup call into the actual add_at functions xaedes 2023-04-28 20:30:42 +02:00
  • 3dbd649cf9
    fix diag_mask to work with non-inplace input xaedes 2023-04-28 20:03:56 +02:00
  • b9920e5c3e
    test-grad0 : fix test for div xaedes 2023-04-28 20:00:25 +02:00
  • 19f51592b5
    successfully test diag_mask_inf and diag_mask_zero backward xaedes 2023-04-28 18:43:58 +02:00
  • d42531fa56
    fix comments xaedes 2023-04-28 18:22:40 +02:00
  • 1997152f7f
    test-grad0.c add TODO for view_2d and view_3d xaedes 2023-04-28 18:16:55 +02:00
  • c601df973c
    successfully test transpose backward and permute for all permutations xaedes 2023-04-28 18:14:37 +02:00
  • 3d21f2646e
    implement ggml_cont backward pass xaedes 2023-04-28 18:12:25 +02:00
  • 02d3fd0894
    fix sub, mul and div functions to work correctly with transposed tensors xaedes 2023-04-28 18:11:26 +02:00
  • b0555fce95
    some minor test-grad0 fixes xaedes 2023-04-28 17:47:53 +02:00
  • a7a837047c
    successfully test permute backward xaedes 2023-04-28 17:47:23 +02:00
  • 86b44a02e4
    test-grad0.c : add print_elements to help with debugging xaedes 2023-04-28 17:46:55 +02:00
  • 339b2adf48
    fix ggml_forward_add1 functions to work correctly with transposed tensors xaedes 2023-04-28 17:43:50 +02:00
  • b9416d71f8
    fix ggml_forward_add functions to work correctly with transposed tensors xaedes 2023-04-28 17:42:24 +02:00
  • 410a47a79e
    minor code format improvement xaedes 2023-04-27 17:00:40 +02:00
  • 124fdca973
    successfully test view backward xaedes 2023-04-28 18:36:07 +02:00
  • cecd6c7665
    bug fix for add_at forward xaedes 2023-04-27 16:58:22 +02:00
  • 83fa6b3bcb
    fix ggml_compute_forward_dup_same_cont for when nelements < nthreads xaedes 2023-05-01 14:42:44 +02:00
  • 2bdc09646d
    ggml : fix ggml_used_mem() (#1264) master-2bdc096 Kerfuffle 2023-05-01 05:56:07 -06:00
  • 70269cae37
    llama : fix session load / save (#1263) master-70269ca Georgi Gerganov 2023-05-01 14:54:59 +03:00
  • 3e023c59b8 update stubs for systems without mmap and mlock John Doe 2023-05-01 07:45:54 -04:00
  • 4cd0a480bf fix build Slaren 2023-05-01 00:12:46 +02:00
  • a9ad140c17 cuBLAS: use multiple streams, choose smartly between mul_mat_q and mul_mat_f16 Slaren 2023-04-30 23:46:19 +02:00
  • cf93fdcfda cuBLAS: refactor, convert fp16 to fp32 on device Slaren 2023-04-30 20:38:12 +02:00
  • b925f1f1b0
    cuBLAS: fall back to pageable memory if pinned alloc fails (#1233) master-b925f1f slaren 2023-05-01 13:32:22 +02:00
  • acae41c7ac ggml_used_mem can segfault if called before any objects are created. KerfuffleV2 2023-05-01 04:21:33 -06:00
  • c0335b51f9
    llama : fix session load / save Georgi Gerganov 2023-05-01 11:08:09 +03:00
  • 90b19bd6ee
    llama : let context be const when accessing const data (#1261) master-90b19bd Alex Klinkhamer 2023-05-01 00:24:20 -07:00
  • 4d38795563 add UI for token unbanning Concedo 2023-05-01 12:10:21 +08:00
  • 3de34ee492 Merge branch 'master' into concedo_experimental Concedo 2023-05-01 12:03:46 +08:00
  • 560dacedbd update readme Concedo 2023-05-01 11:41:25 +08:00
  • 52892512cd Short hash, less fancy Makefile, and don't modify build-info.h if it wouldn't change it Danny Daemonic 2023-04-30 18:39:33 -07:00
  • efe84ca371
    llama : let context be const when accessing const data grencez 2023-04-30 19:50:17 -07:00
  • ddb3e88376 4 space indenting for cmake, attempt to clean up my mess in Makefile Danny Daemonic 2023-04-30 00:48:23 -07:00
  • 3e7ec6e1e2 Broke out build-info.cmake, added find_package fallback, and added build into to all examples, added dependencies to Makefile Danny Daemonic 2023-04-29 15:20:21 -07:00
  • d2b6d2ce39 Fix conditional dependency on missing target Danny Daemonic 2023-04-29 12:13:57 -07:00
  • eac5d689dc Redo "CMAKE_CURRENT_SOURCE_DIR" and clearer build messages Danny Daemonic 2023-04-29 09:57:22 -07:00
  • db0b8357b6 "build (hash)" and "CMAKE_SOURCE_DIR" changes Danny Daemonic 2023-04-29 09:09:51 -07:00
  • fbcfc8446c macOS fix Danny Daemonic 2023-04-29 06:28:39 -07:00
  • 8272f4aed8 Add git-based build information for better issue tracking Danny Daemonic 2023-04-29 05:55:22 -07:00
  • c1a8893de3
    de-duplicate ggml_forward_dup code taking care of contiguous tensors of same type. xaedes 2023-04-27 16:55:22 +02:00
  • 38675e537c
    add shape annotations for llama xaedes 2023-04-27 16:39:41 +02:00
  • 93106504fd
    align shape annotations xaedes 2023-04-27 00:21:31 +02:00
  • fea42be47a
    successfully test soft_max backward xaedes 2023-04-27 00:16:18 +02:00
  • 1a80e9a0fa
    correctly implement softmax backward pass using new operation ggml_diag xaedes 2023-04-27 00:13:43 +02:00
  • 54ab300cc4
    add test-opt.c xaedes 2023-04-26 21:35:20 +02:00
  • ecf949b10f
    successfully test reshape backward xaedes 2023-04-26 20:34:33 +02:00
  • c483a7dac5
    bug fix for reshape backward pass xaedes 2023-04-26 20:34:08 +02:00
  • b2bd8222da
    successfully test cpy backward xaedes 2023-04-26 20:14:52 +02:00
  • 0ea8201c86
    bug fix for cpy backward pass xaedes 2023-04-26 20:14:33 +02:00
  • 7571147242
    successfully test rope backward xaedes 2023-04-26 00:46:49 +02:00
  • b583136cfa
    improve performance of sqr backward pass xaedes 2023-04-26 00:46:20 +02:00
  • bfe507213c
    improve performance of sum backward pass xaedes 2023-04-26 00:43:02 +02:00
  • 0197bcb0ff
    successfully test scale backward xaedes 2023-04-25 22:26:26 +02:00
  • a367eb9eda
    bug fix for scale backward pass xaedes 2023-04-25 22:25:53 +02:00
  • 671e5922e2
    successfully test silu backward xaedes 2023-04-25 22:06:05 +02:00
  • 6fb08b4554
    bug fixes for silu_back xaedes 2023-04-25 22:05:45 +02:00
  • 9d6fc28f18
    disable graph dot export as it floods console xaedes 2023-04-25 22:05:22 +02:00
  • 9345f4c3a5
    test both gradients of mul_mat xaedes 2023-04-24 22:37:09 +02:00
  • 20e3c1d2b4
    use GGML_PRINT_DEBUG for debug messages which will otherwise flood the console xaedes 2023-04-24 21:21:50 +02:00
  • 0da26753fd
    add test-grad0.c xaedes 2023-04-25 21:32:05 +02:00
  • 4e1f81d32f
    implement backward pass for ggml_get_rows and for new operation ggml_get_rows_back xaedes 2023-04-24 22:49:34 +02:00
  • 488decfdc5
    implement backward pass of ggml_rope and ggml_rope_back xaedes 2023-04-24 19:06:16 +02:00
  • 36d8a051d4
    remove already resolved TODO xaedes 2023-04-24 05:54:51 +02:00
  • b908007471
    norm & rms_norm can not be threaded: xaedes 2023-04-24 04:13:33 +02:00
  • b164343529
    implement 5 of 6 missing backward pass operations used by llama xaedes 2023-05-01 02:20:14 +02:00
  • 73ac18d856
    implement 8 of 14 missing backward pass operations used by llama xaedes 2023-05-01 02:39:54 +02:00
  • c9fdebc02c Hotfix prompt caching introduced in #1169, fixes #1257 Ivan Stepanov 2023-05-01 03:34:34 +03:00
  • 635327c355 Bump version Ivan Stepanov 2023-04-30 23:33:18 +03:00
  • dd88594585 Save prompt after initial prompt eval (fixes #1257) Ivan Stepanov 2023-04-30 23:16:41 +03:00
  • 7ff0dcd320
    ggml : fix UB (int << 31) master-7ff0dcd Georgi Gerganov 2023-04-30 22:28:51 +03:00
  • 6f79699286
    build: add armv{6,7,8} support to cmake (#1251) master-6f79699 Pavol Rusnak 2023-04-30 20:48:38 +02:00
  • a5d30b1f53
    common : better default number of threads (#934) master-a5d30b1 jon-chuang 2023-04-30 14:41:35 -04:00
  • 76a884920a
    ggml : add CLBlast q5_0, q5_1, q8_0 dequant kernels (#1225) master-76a8849 0cc4m 2023-04-30 20:34:52 +02:00
  • 6bc4400e67
    ggml : add Q5 WASM SIMD + GGML_FTYPE master-6bc4400 Georgi Gerganov 2023-04-30 19:07:00 +03:00
  • 25201233ca fixed unbantokens not following EOS Concedo 2023-05-01 00:02:45 +08:00
  • 294a5d00b1 Merge remote-tracking branch 'occam/clblast-further-dequant-kernels' into concedo_experimental Concedo 2023-04-30 23:56:24 +08:00
  • 3b5df18dbb temp fix for compilation issues on OSX (M1) Concedo 2023-04-30 23:48:46 +08:00
  • c73def129a
    Merge 'origin/master' into hipblas Henri Vasserman 2023-04-30 18:40:42 +03:00
  • 979010cdba minor jon-chuang 2023-04-30 21:02:55 +08:00
  • e112522aa9 Merge branch 'master' of https://github.com/ggerganov/llama.cpp into jon/tall-and-skinny-matmul jon-chuang 2023-04-30 20:57:32 +08:00
  • 470cc4c5d1 minor jon-chuang 2023-04-30 20:56:46 +08:00
  • f0d70f147d
    Various fixes to mat_mul benchmark (#1253) master-f0d70f1 Stephan Walter 2023-04-30 12:32:37 +00:00
  • 363f72de85 Rename benchmark-q4_0-matmult.cpp -> benchmark-matmult.cpp Stephan Walter 2023-04-30 13:55:42 +02:00
  • f5a5cc9e6a
    build: add armv{6,7,8} support to cmake Pavol Rusnak 2023-04-30 11:13:03 +02:00