llama.cpp/ggml/src
Robert Ormandi 86a1934978
metal : Extend how Llama.cpp locates metal resources (#10676)
* metal : Extend how Llama.cpp locates metal resources (#10675)

  * It also searches for the resource file in the directory where the
    current binary is located.
  * It resolves symbolic links.

Rationale:

When we plug this dependency into a Bazel build and run it in the
context of Bazel (e.g. for testing):

  * the execution directory is often very different from where the files
    are located, and we have no direct control over this (Bazel sandboxing),
  * the Bazel sandbox often uses symbolic links to make files available.

With this patch, the resource file can be added to the target, and we
can build and run tests in the context of Bazel.
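
Below is a minimal, hypothetical sketch of the lookup strategy described
above: finding a resource next to the running binary while resolving
symbolic links. It is not the actual ggml-metal.m change; the function
name find_resource_near_binary and the exact lookup order are
illustrative assumptions only.

    // Hedged sketch only -- NOT the actual ggml-metal.m implementation.
    // It illustrates one way to look for a resource (e.g. the compiled
    // Metal library) in the directory of the running binary, resolving
    // symbolic links such as the ones the Bazel sandbox creates.
    #define _GNU_SOURCE // for dladdr() on glibc; not needed on macOS
    #include <dlfcn.h>
    #include <libgen.h>
    #include <limits.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    static char * find_resource_near_binary(const char * resource_name) {
        // find the path of the binary (or shared object) containing this function
        Dl_info info;
        if (dladdr((void *) &find_resource_near_binary, &info) == 0 || info.dli_fname == NULL) {
            return NULL;
        }

        // resolve any symbolic links in that path
        char resolved[PATH_MAX];
        if (realpath(info.dli_fname, resolved) == NULL) {
            return NULL;
        }

        // dirname() may modify its argument, so work on a copy
        char dir[PATH_MAX];
        snprintf(dir, sizeof(dir), "%s", resolved);

        // build "<directory of the binary>/<resource_name>"
        char path[PATH_MAX];
        snprintf(path, sizeof(path), "%s/%s", dirname(dir), resource_name);

        // return the path only if the resource actually exists there
        FILE * f = fopen(path, "rb");
        if (f == NULL) {
            return NULL;
        }
        fclose(f);

        return strdup(path);
    }

A caller would try this location in addition to the existing lookup
paths and fall back to them when it returns NULL.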

* Update ggml/src/ggml-metal/ggml-metal.m

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* Update ggml/src/ggml-metal/ggml-metal.m

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2024-12-07 09:55:01 +02:00
ggml-blas ggml : add support for dynamic loading of backends (#10469) 2024-11-25 15:13:39 +01:00
ggml-cann CANN: RoPE operator optimization (#10563) 2024-11-29 14:46:55 +08:00
ggml-cpu ggml: add GGML_SET Metal kernel + i32 CPU kernel (ggml/1037) 2024-12-05 13:27:33 +02:00
ggml-cuda CUDA: remove unnecessary warp reduce in FA (ggml/1032) 2024-12-03 20:04:49 +02:00
ggml-hip ggml : add support for dynamic loading of backends (#10469) 2024-11-25 15:13:39 +01:00
ggml-kompute kompute : improve backend to pass test_backend_ops (#10542) 2024-11-28 12:51:38 +01:00
ggml-metal metal : Extend how Llama.cpp locates metal resources (#10676) 2024-12-07 09:55:01 +02:00
ggml-musa mtgpu: Add MUSA_DOCKER_ARCH in Dockerfiles && update cmake and make (#10516) 2024-11-26 17:00:41 +01:00
ggml-rpc ggml : add support for dynamic loading of backends (#10469) 2024-11-25 15:13:39 +01:00
ggml-sycl SYCL : Move to compile time oneMKL interface backend selection for NVIDIA backend (#10584) 2024-12-04 09:29:20 +08:00
ggml-vulkan vulkan: Add VK_NV_cooperative_matrix2 support for mul_mat and flash attention (#10206) 2024-12-05 20:15:05 +01:00
CMakeLists.txt ggml : add predefined list of CPU backend variants to build (#10626) 2024-12-04 14:45:40 +01:00
ggml-aarch64.c ggml : optimize Q4_0 into Q4_0_X_Y repack (#10324) 2024-11-16 01:53:37 +01:00
ggml-aarch64.h ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
ggml-alloc.c ggml: new optimization interface (ggml/988) 2024-11-17 08:30:29 +02:00
ggml-backend-impl.h ggml : automatic selection of best CPU backend (#10606) 2024-12-01 16:12:41 +01:00
ggml-backend-reg.cpp ggml : add predefined list of CPU backend variants to build (#10626) 2024-12-04 14:45:40 +01:00
ggml-backend.cpp ggml : move AMX to the CPU backend (#10570) 2024-11-29 21:54:58 +01:00
ggml-common.h ggml-cpu: support IQ4_NL_4_4 by runtime repack (#10541) 2024-11-28 13:52:03 +01:00
ggml-impl.h Avoid using __fp16 on ARM with old nvcc (#10616) 2024-12-04 01:41:37 +01:00
ggml-opt.cpp ggml-opt: fix data corruption (ggml/1022) 2024-11-21 09:22:02 +02:00
ggml-quants.c ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
ggml-quants.h ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
ggml-threading.cpp ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
ggml-threading.h ggml : build backends as libraries (#10256) 2024-11-14 18:04:35 +01:00
ggml.c ggml : add GGML_PAD_REFLECT_1D operation (ggml/1034) 2024-12-05 13:27:31 +02:00