Commit graph

  • 4e49714d20 Merge remote-tracking branch 'origin/master' into sl/dl-backend slaren 2024-11-14 16:21:00 +01:00
  • 08318dc178 ggml: separate musa into its own section in the Makefile Xiaodong Ye 2024-11-14 22:45:06 +08:00
  • 4fc8409a71 Update llama.cpp FirstTimeEZ 2024-11-15 03:36:03 +13:00
  • 3eff3c311e Call syclcompat::dp4a inside dpct::dp4a romain.biessy 2024-11-14 14:13:05 +00:00
  • d205ee9273 llama_model_n_params FirstTimeEZ 2024-11-15 03:11:35 +13:00
  • 0a7fecc17c llama_model_n_params FirstTimeEZ 2024-11-15 03:11:14 +13:00
  • 78a526e39c Merge 02c75452c1 into 4a8ccb37ad rhjdvsgsgks 2024-11-14 14:05:15 +00:00
  • 8e2e630405 fix mem leakage based on leaks tool (still WIP) 李为 2024-11-14 22:04:01 +08:00
  • 1c16516004 Reword doc romain.biessy 2024-11-14 12:23:20 +00:00
  • 970f1515c9 total size of tensors is size_t FirstTimeEZ 2024-11-15 01:23:17 +13:00
  • aeb803200b Merge branch 'ggerganov:master' into patch-2 FirstTimeEZ 2024-11-15 01:00:25 +13:00
  • 4a8ccb37ad CUDA: no -sm row for very small matrices (#10185) b4079 Johannes Gäßler 2024-11-14 13:00:15 +01:00
  • 229cd05f0d removes the implicit cast FirstTimeEZ 2024-11-15 00:58:53 +13:00
  • 443f5cb5bd Merge branch 'ggerganov:master' into patch-2 FirstTimeEZ 2024-11-15 00:48:38 +13:00
  • 76355779d0 save number of parameters and the size in llama_model FirstTimeEZ 2024-11-14 22:51:47 +13:00
  • 2a82891a85 speculative : fix out-of-bounds access (#10289) b4078 Georgi Gerganov 2024-11-14 11:44:15 +02:00
  • c8a15ce1d0 Merge 5e6dad9322 into 33bdee667e Georgi Gerganov 2024-11-14 09:42:04 +00:00
  • 5e6dad9322 speculative : experimenting with Qwen2.5 gg/speculative-experiments Georgi Gerganov 2024-11-14 11:31:31 +02:00
  • 33bdee667e speculative : fix out-of-bounds access gg/speculative-fix-oob Georgi Gerganov 2024-11-14 11:23:45 +02:00
  • 0a0f91df61 save number of parameters and the size in llama_model FirstTimeEZ 2024-11-14 20:40:14 +13:00
  • aad0167bc3 audio embedding free() (but still memory leakage detected) 李为 2024-11-14 14:50:49 +08:00
  • eb03ae33b8 Merge a0bd8f0343 into af148c9386 Michael Podvitskiy 2024-11-14 14:04:27 +08:00
  • f263838447 Merge 256707309f into af148c9386 LIU Xiao 2024-11-14 14:03:00 +08:00
  • 8373e94db5 Merge 1e8646b3e8 into af148c9386 qlylangyu 2024-11-14 14:02:06 +08:00
  • 73d5939c67 Merge b979fc97ba into af148c9386 Jared Van Bortel 2024-11-14 14:02:03 +08:00
  • a6016508ec Merge fa5b31a5ca into af148c9386 Yifan Gu 2024-11-14 14:01:42 +08:00
  • c834937068 Merge afc4a7de65 into af148c9386 Diego Devesa 2024-11-14 14:01:11 +08:00
  • af148c9386 vulkan: Optimize binary ops (#10270) b4077 Jeff Bolz 2024-11-13 23:22:55 -06:00
  • 9d08ae32c3 Merge branch 'ggerganov:master' into server-chat-templates MaggotHATE 2024-11-14 09:21:50 +05:00
  • b9845b4f63 Merge pull request #24 from NexaAI/weili/master-release Zack Li 2024-11-13 17:16:39 -08:00
  • 74d660ab19 Update ggml/CMakeLists.txt Diego Devesa 2024-11-14 02:07:00 +01:00
  • fc25544867 [memory leakage] fixed a leakage by projector free 李为 2024-11-14 08:32:55 +08:00
  • dc2313564d Merge remote-tracking branch 'origin/master' into sl/dl-backend slaren 2024-11-14 00:57:44 +01:00
  • fc66c4bf6d only use AMX on x86 slaren 2024-11-14 00:33:16 +01:00
  • 0b71c6c38f vulkan: Optimize binary ops Jeff Bolz 2024-11-11 22:58:15 -06:00
  • 66798e42fb vulkan: Use macros to make the mat mul pipeline creation more concise (#10259) b4076 Jeff Bolz 2024-11-13 14:59:47 -06:00
  • e503ad101d fix sanitizers build slaren 2024-11-13 21:31:57 +01:00
  • 796f05be62 fix editorconfig slaren 2024-11-13 20:34:11 +01:00
  • bc4f6cb64d update cuda & musa dockerfiles slaren 2024-11-13 20:33:53 +01:00
  • e0b321b89b add missing libraries to ggml cmake install slaren 2024-11-13 20:33:38 +01:00
  • e541f7ffbe Update cmake/llama-config.cmake.in Diego Devesa 2024-11-13 19:41:33 +01:00
  • 125c23575b Merge remote-tracking branch 'origin/master' into sl/dl-backend slaren 2024-11-13 19:09:30 +01:00
  • 4a80bb95e7 ggml: build musa backend library (cmake) (#10280) R0CKSTAR 2024-11-14 02:07:24 +08:00
  • fb4a0ec083 llama : propagate the results of graph_compute (#9525) b4075 Michael Podvitskiy 2024-11-13 20:00:35 +02:00
  • 9ef5d08927 llama : add comments about KV cache state after error Georgi Gerganov 2024-11-13 19:59:20 +02:00
  • ee76375b64 Update CI Windows oneAPI version to 2025.0 romain.biessy 2024-11-13 17:33:05 +00:00
  • b665ffd4b8 Update news section romain.biessy 2024-11-13 17:28:24 +00:00
  • 5ea926dad7 sync : ggml Georgi Gerganov 2024-11-13 18:11:54 +02:00
  • 235a268f96 Update test-tokenizer-random.py Robert 2024-11-13 07:49:38 -08:00
  • 42eb364db5 metal : fix build and swift package (#10279) Georgi Gerganov 2024-11-13 15:57:24 +02:00
  • 051a445048 ggml: build musa backend library (cmake) Xiaodong Ye 2024-11-13 21:22:24 +08:00
  • 551edceaf6 metal : fix build and swift package Georgi Gerganov 2024-11-13 14:51:58 +02:00
  • 1ee9eea094 docs : update bindings list (#10261) b4073 Small Grass Forest 2024-11-13 19:17:10 +08:00
  • ff7fb670d0 server : add missing docs (#10269) Alexey Parfenov 2024-11-13 11:16:30 +00:00
  • 0e712a5acb server : fix incorrect res in validate_model_chat_template (#10272) b4071 Jhen-Jie Hong 2024-11-13 19:15:23 +08:00
  • a0ec17b32e metadata: Detailed Dataset Authorship Metadata (#8875) Brian 2024-11-13 21:10:38 +11:00
  • 2e82ffa4af sycl : Fixes to broken builds and test-backend-ops (#10257) b4069 Alberto Cabrera Pérez 2024-11-13 09:40:57 +00:00
  • a1ee42d521 Merge branch 'ggerganov:master' into server-chat-templates MaggotHATE 2024-11-13 12:15:29 +05:00
  • 80dd7ff22f vulkan: Optimize contiguous copies (#10254) b4068 Jeff Bolz 2024-11-13 00:58:57 -06:00
  • 5edd022d6a Update test-tokenizer-random.py Robert 2024-11-12 22:28:52 -08:00
  • 3275e29360 Update test-tokenizer-random.py Robert 2024-11-12 22:28:30 -08:00
  • 18489671bf Update test-tokenizer-random.py Robert 2024-11-12 22:27:41 -08:00
  • 1574884483 Update test-tokenizer-random.py Robert 2024-11-12 22:26:56 -08:00
  • 82a4012c2a Update test-tokenizer-random.py Robert 2024-11-12 22:24:35 -08:00
  • db26ba5b5c Update test-tokenizer-random.py Robert 2024-11-12 22:24:03 -08:00
  • 60fd27b68d Update test-tokenizer-random.py Robert 2024-11-12 22:16:34 -08:00
  • 9bb2f9b63d Merge branch 'ggerganov:master' into test-tokenizer-0-rewrite Robert 2024-11-12 22:02:44 -08:00
  • 0b069a4710 server : fix chat res Jhen-Jie Hong 2024-11-13 10:51:00 +08:00
  • 71c2c7fb8b server : fix validate_model_chat_template Jhen-Jie Hong 2024-11-13 10:31:58 +08:00
  • dd49f08852 fixes slaren 2024-11-13 01:01:40 +01:00
  • c54b67c028 Merge branch 'ggerganov:master' into avx_opt Eve 2024-11-12 23:48:18 +00:00
  • a847973656 16 bit add for q4_0 only Eve 2024-11-12 18:47:23 -05:00
  • d5dd7ed7ee metal install fix slaren 2024-11-12 23:34:50 +01:00
  • 307ef9a588 update Makefile slaren 2024-11-12 23:07:50 +01:00
  • dddf3771c2 use reference quantization fns in AMX until moved to CPU backend slaren 2024-11-12 22:06:00 +01:00
  • 5cfaecd34c remove remaining GGML_EXTRA_* and GGML_CDEF_* uses slaren 2024-11-12 21:39:00 +01:00
  • c8da7d0f70 add hip slaren 2024-11-12 20:52:46 +01:00
  • 710822f32e add amx, cann, sycl slaren 2024-11-12 19:40:25 +01:00
  • 646e91a642 add vulkan and kompute slaren 2024-11-12 18:44:18 +01:00
  • 1d6bd2d953 server : add missing docs ZXED 2024-11-12 20:32:31 +03:00
  • 90cb61d692 sycl: Use syclcompat::dp4a romain.biessy 2024-11-07 10:41:14 +00:00
  • 8c1b186cb5 metal : minor Q4_0 optimization gg/metal-q4_0-opt Georgi Gerganov 2024-11-12 15:30:51 +02:00
  • efdd713023 more build fixes slaren 2024-11-12 13:56:28 +01:00
  • a2112f095c Removed unused function MaggotHATE 2024-11-12 15:21:59 +05:00
  • 86ed72d20c ggml : add ggml-metal-impl.h Georgi Gerganov 2024-11-10 18:29:09 +02:00
  • 63bab93c48 metal : add TODOs for rest of ops Georgi Gerganov 2024-11-10 17:56:12 +02:00
  • 964206a780 metal : GGML_OP_NORM Georgi Gerganov 2024-11-10 17:17:18 +02:00
  • e9ecd5d4de metal : GGML_OP_RMS_NORM Georgi Gerganov 2024-11-10 15:31:43 +02:00
  • 647a7044f5 metal : GGML_OP_CPY Georgi Gerganov 2024-11-10 13:55:26 +02:00
  • f46f710ca6 metal : GGML_OP_REPEAT Georgi Gerganov 2024-11-10 13:21:59 +02:00
  • 3250c98bf6 metal : GGML_OP_ADD, GGML_OP_SUB, GGML_OP_MUL, GGML_OP_DIV Georgi Gerganov 2024-11-10 13:16:54 +02:00
  • 9058c51d9d metal : GGML_OP_CONCAT Georgi Gerganov 2024-11-10 13:03:25 +02:00
  • bb821e4854 cont : int safety + register optimizations Georgi Gerganov 2024-11-10 11:05:10 +02:00
  • c5cf1d74f0 cont : mul mm id Georgi Gerganov 2024-11-10 10:32:15 +02:00
  • 15a7105967 cont : thread counters style Georgi Gerganov 2024-11-10 09:57:41 +02:00
  • cacc4c225f cont : shmem style Georgi Gerganov 2024-11-10 09:45:06 +02:00
  • a1a201c1a9 cont : use char ptr Georgi Gerganov 2024-11-10 09:26:53 +02:00
  • c81640a5fc cont : args is first argument Georgi Gerganov 2024-11-10 08:47:30 +02:00
  • b65e4c1e10 cont : pass by reference Georgi Gerganov 2024-11-10 08:10:22 +02:00
  • c59a13d93f cont : mul mat vec Georgi Gerganov 2024-11-09 22:56:39 +02:00