Commit graph

  • 5589921ef8 readme : minor (#5204) Romain Neutron 2024-01-30 10:16:38 +01:00
  • 3d46754298 Merge branch 'master' into sycl_win_build Neo Zhang Jianyu 2024-01-30 17:16:26 +08:00
  • 49f44b5c55 readme : update hot topics Georgi Gerganov 2024-01-30 11:14:44 +02:00
  • 6685cc41c2 server : improve README (#5209) Wu Jian Ping 2024-01-30 17:11:46 +08:00
  • 0ca32f712a fix grammer issue Zhang 2024-01-30 16:48:39 +08:00
  • ead00f69c7 enhance README for explames/server Wu Jian Ping 2024-01-30 16:48:18 +08:00
  • cb9c35aa45 fix no new line issue, add -j Zhang 2024-01-30 16:44:14 +08:00
  • ed62b08be7 rm no new line Zhang 2024-01-30 16:34:18 +08:00
  • 61379ddabf restore as base Zhang 2024-01-30 15:55:16 +08:00
  • aa6698a80a Speed up computing sign bits in AVX2 iq2_xs dot product Peter Reid 2024-01-29 15:22:52 -05:00
  • 07fb462ac8 restore other CI part Zhang 2024-01-30 15:14:09 +08:00
  • 05da43b910 fix win build Zhang 2024-01-30 14:37:09 +08:00
  • 379f89fbbe fix bug in pool2d kernel zhangjidong 2024-01-30 11:06:37 +08:00
  • 68fd9f46bb fix win build Zhang 2024-01-30 10:50:59 +08:00
  • 3a9480e4af fix win build Zhang 2024-01-30 10:33:10 +08:00
  • 1556d4ca17 fix pool2d_kernel zhangjidong 2024-01-30 10:28:18 +08:00
  • 41a34cb3de code clean zhangjidong 2024-01-30 09:53:07 +08:00
  • 10c743d2ff build vulkan as object Eve 2024-01-30 01:24:40 +00:00
  • 824ba83a96 Makefile to generate .a library for static linking Ali Nehzat 2024-01-30 11:01:56 +11:00
  • 76ddf85515 fix install cmd Zhang 2024-01-30 07:54:47 +08:00
  • a4c777250e Fix code formatting in README.md Romain Neutron 2024-01-30 00:02:03 +01:00
  • ceebbb5b21 ggml alloc: Fix for null dereference on alloc failure (#5200) b2008 Paul Tsochantaris 2024-01-29 22:19:29 +00:00
  • 9581b8cf85 Fixed the fix of the fix Paul Tsochantaris 2024-01-29 22:18:35 +00:00
  • 118d2d7443 Freeing the allocated buffers rather than the pointer in ggml-alloc.c Paul Tsochantaris 2024-01-29 22:13:34 +00:00
  • 6daa69ee81 kompute : fix fallback to CPU (#5201) b2007 Jared Van Bortel 2024-01-29 17:11:27 -05:00
  • 039210ff3d kompute : fix fallback to CPU Jared Van Bortel 2024-01-29 17:03:04 -05:00
  • 4dc07c1c97 Merge branch 'master' into null-dereference-on-alloc-failure Paul Tsochantaris 2024-01-29 21:00:34 +00:00
  • e9dcc37498 Fix for a null pointer dereference if a metal GGML buffer fails to be allocated Paul Tsochantaris 2024-01-29 20:59:43 +00:00
  • d685360c91 Remove outdated comment 0cc4m 2024-01-29 21:51:22 +01:00
  • fbf1ddec69 Nomic Vulkan backend (#4456) b2006 Jared Van Bortel 2024-01-29 15:50:50 -05:00
  • 8182ed1de6 Remove unnecessary warning message 0cc4m 2024-01-29 21:48:57 +01:00
  • f185d860e9 Also fix UMA handling for prealloc buffers 0cc4m 2024-01-29 21:22:16 +01:00
  • 299821140a fix incorrect memcpy Jared Van Bortel 2024-01-29 15:20:45 -05:00
  • 6154282347 remove c++17, file_is_empty divinity76 2024-01-29 20:44:25 +01:00
  • 1f98dff7a9 fix trailing whitespace Jared Van Bortel 2024-01-29 14:16:56 -05:00
  • 48db724bc7 minor fixup Jared Van Bortel 2024-01-29 14:15:18 -05:00
  • b932cd7428 vulkan : correctly fix use-after-free in ggml_vk_current_device Jared Van Bortel 2023-11-30 16:50:20 -05:00
  • 54fb5c6b6c Fix UMA handling 0cc4m 2024-01-29 19:57:19 +01:00
  • 7980178a17 Merge branch 'gg/flash-attn' of https://github.com/ggerganov/llama.cpp into flash-attn-cuda FSSRepo 2024-01-29 13:17:39 -05:00
  • a1d5a12bc5 fix compiler error FSSRepo 2024-01-29 13:15:33 -05:00
  • 7e11fe0880 kompute : remove llama_load_model_from_file_internal Jared Van Bortel 2024-01-29 12:52:54 -05:00
  • 5fcb9c1c5a metal : faster inner loop for C == 32 Georgi Gerganov 2024-01-29 19:46:22 +02:00
  • dc08e512cc kompute : fix merge issues Jared Van Bortel 2024-01-29 12:41:02 -05:00
  • da1dc66659 Merge branch 'master' of https://github.com/ggerganov/llama.cpp into ceb/nomic-vulkan Jared Van Bortel 2024-01-29 12:22:42 -05:00
  • be7c0559d3 kompute : better device management Jared Van Bortel 2024-01-29 12:07:35 -05:00
  • c6c1132e5e tests : more Georgi Gerganov 2024-01-29 18:22:28 +02:00
  • fe2160eec9 iq3_xxs: failing tests Iwan Kawrakow 2024-01-29 18:21:04 +02:00
  • abeaf0d90e metal : disable buffer allocation logs Georgi Gerganov 2024-01-29 18:12:24 +02:00
  • 8f29f046a4 fix install cmd Zhang 2024-01-29 22:52:55 +08:00
  • 2aed77eb06 fix typo "RLIMIT_MLOCK" (#5175) b2005 divinity76 2024-01-29 15:45:41 +01:00
  • 4794821a31 tests : add ATTN tests Georgi Gerganov 2024-01-29 16:44:55 +02:00
  • 7be0c36f94 fix install cmd Zhang 2024-01-29 22:38:31 +08:00
  • 62623434af iq3_xxs: hopefully fix ROCm Iwan Kawrakow 2024-01-29 16:36:19 +02:00
  • 6efbc690dc iq3_xxs: fix failing quantization test Iwan Kawrakow 2024-01-29 16:33:03 +02:00
  • aef97b56ba fix install cmd Zhang 2024-01-29 22:26:42 +08:00
  • 9c6b64624b fix install cmd Zhang 2024-01-29 22:11:40 +08:00
  • c82d18e863 server : embeddings compatibility for OpenAI (#5190) b2004 Wu Jian Ping 2024-01-29 21:48:10 +08:00
  • 14fef85e2d py : fix except (#5194) Georgi Gerganov 2024-01-29 15:35:54 +02:00
  • 9ef9316f73 py : fix except (#5189) Georgi Gerganov 2024-01-29 14:26:56 +02:00
  • 1e8c42097f fix ci Zhang 2024-01-29 20:17:11 +08:00
  • b51e79422c Add documents about Vulkan calvinweb 2024-01-29 20:09:36 +08:00
  • 21fbdd7130 fix install issue Zhang 2024-01-29 19:45:44 +08:00
  • 94613299b4 embeddings embeddingsfor OpenAI Wu Jian Ping 2024-01-29 17:36:59 +08:00
  • 455d17de4a correct install oneMKL Zhang 2024-01-29 17:36:59 +08:00
  • e76627bcce py : improve BPE tokenizer support (#5189) b2002 Sang-Kil Park 2024-01-29 18:24:19 +09:00
  • fd02bddc94 add for win build CI Zhang 2024-01-29 17:23:08 +08:00
  • df4a9c99c2 Support for all cases that have/haven't ["model"]["vocab"]. Sang-Kil Park 2024-01-29 18:20:28 +09:00
  • 7e4e7488ae iq3_xxs: add some quant mix Iwan Kawrakow 2024-01-29 11:19:59 +02:00
  • d05845a348 add windows build in CI Zhang 2024-01-29 17:17:32 +08:00
  • 0fcdefc346 formatting divinity76 2024-01-29 10:06:12 +01:00
  • 838f8ea131 support SYCL backend windows build Zhang 2024-01-29 16:39:48 +08:00
  • fbe7dfa53c ggml : add max buffer sizes to opencl and metal backends (#5181) b2001 slaren 2024-01-29 09:05:13 +01:00
  • 172ac82629 cmake : fix Vulkan build (#5182) b2000 Eve 2024-01-29 08:04:47 +00:00
  • 56701f336d Update tests/test-llama-grammar.cpp Michael Klimenko 2024-01-29 08:56:18 +01:00
  • 4e53e000d7 Revert httplib.h changes due to being external Michael Klimenko 2024-01-29 08:55:28 +01:00
  • 2dcf37c10b Apply suggestions from code review Michael Klimenko 2024-01-29 08:54:01 +01:00
  • f9d22dab25 iq2xs: small AVX2 imrovement Iwan Kawrakow 2024-01-29 08:52:08 +02:00
  • 76f7befaa1 iq2xs: faster AVX2 dot product Iwan Kawrakow 2024-01-29 07:21:30 +02:00
  • 1a82788028 ADD POOL2D test case in test-backend-ops.cpp zhangjidong 2024-01-29 11:21:03 +08:00
  • 440eb07095 fix vulkan cmake Eve 2024-01-29 02:42:24 +00:00
  • ba5592c653 CUDA POOL2D zhangjidong 2024-01-29 10:31:41 +08:00
  • b45620d3df add max buffer sizes to opencl and metal backends slaren 2024-01-29 03:16:40 +01:00
  • 1db22d7032 metal : support Q > 8 Georgi Gerganov 2024-01-28 23:08:31 +02:00
  • 80859445af Add basic UMA memory handling 0cc4m 2024-01-28 21:53:35 +01:00
  • 134c81c78d metal : minor Georgi Gerganov 2024-01-28 22:23:40 +02:00
  • 0ad44baf33 Merge branch 'master' into gg/flash-attn Georgi Gerganov 2024-01-28 21:53:51 +02:00
  • d2f650cb5b metal : free metal objects (#5161) b1999 Paul Tsochantaris 2024-01-28 19:50:16 +00:00
  • 6c348978c7 allow empty --prompt-cache file divinity76 2024-01-28 20:13:23 +01:00
  • c41239df0e add Vulkan support to Nix flake Martin Schwaighofer 2024-01-28 12:59:43 +01:00
  • 4eafb8334e Whitespace fix Paul Tsochantaris 2024-01-28 18:19:37 +00:00
  • 35dec26cc2 sync : ggml b1998 Georgi Gerganov 2024-01-28 19:48:05 +02:00
  • d460510c72 ggml : minor type fix (int64_t -> size_t) Georgi Gerganov 2024-01-28 18:44:58 +02:00
  • e75d00797e Add fixes to newer changes Michael Klimenko 2024-01-28 18:34:27 +01:00
  • 024e566389 Merge remote-tracking branch 'fork/master' into cpp_fixes Michael Klimenko 2024-01-28 18:21:21 +01:00
  • 9bfafcca23 Merge 10fbb1f33d into 2307523d32 0cc4m 2024-01-28 17:04:01 +00:00
  • 2307523d32 ggml : add Vulkan backend (#2059) b1996 0cc4m 2024-01-28 18:03:59 +01:00
  • 43a6d1d2dc fix typo "RLIMIT_MLOCK" divinity76 2024-01-28 18:00:27 +01:00
  • 10fbb1f33d llama : fix trailing whitespace Georgi Gerganov 2024-01-28 18:58:35 +02:00
  • e1349fb4b0 Restore faulty merge p.2 Michael Klimenko 2024-01-28 17:44:08 +01:00
  • 92f8f64332 Restore faulty merge Michael Klimenko 2024-01-28 17:34:58 +01:00