llama.cpp

leo-pony c18610b4ee CANN: Support Ascend310P to accelerate F32 and F16 Model (#10216)	2024-11-22 14:07:20 +08:00
* CANN Support Ascend310P to accelerate F32 and F16 Model
* Add compile option soc type macro ASCEND_310P to ggml-cann lib
* Remove unused code
* Remove the ascend soc_type hard code compile option in CMakelist.txt
ggml-amx	ggml : adapt AMX to tensor->grad removal (#0)	2024-11-17 08:30:29 +02:00
ggml-blas	cuda : fix CUDA_FLAGS not being applied (#10403)	2024-11-19 14:29:38 +01:00
ggml-cann	CANN: Support Ascend310P to accelerate F32 and F16 Model (#10216)	2024-11-22 14:07:20 +08:00
ggml-cpu	add cmake rvv support (#10411)	2024-11-19 21:10:31 +01:00
ggml-cuda	cuda : optimize argmax (#10441)	2024-11-21 18:18:50 +01:00
ggml-hip	CUDA: remove DMMV, consolidate F16 mult mat vec (#10318)	2024-11-17 09:09:55 +01:00
ggml-kompute	ggml : build backends as libraries (#10256)	2024-11-14 18:04:35 +01:00
ggml-metal	metal : fox offset integer overflows in im2col (ggml/1015)	2024-11-19 20:03:21 +02:00
ggml-musa	CUDA: remove DMMV, consolidate F16 mult mat vec (#10318)	2024-11-17 09:09:55 +01:00
ggml-rpc	ggml : build backends as libraries (#10256)	2024-11-14 18:04:35 +01:00
ggml-sycl	sycl : Add option to set the SYCL architecture for all targets (#10266)	2024-11-19 08:02:23 +00:00
ggml-vulkan	vulkan: predicate max operation in soft_max shaders/soft_max (#10437)	2024-11-20 20:47:36 +01:00
CMakeLists.txt	Add required ggml-base and backend libs to cmake pkg (#10407)	2024-11-19 17:10:30 +01:00
ggml-aarch64.c	ggml : optimize Q4_0 into Q4_0_X_Y repack (#10324)	2024-11-16 01:53:37 +01:00
ggml-aarch64.h	ggml : build backends as libraries (#10256)	2024-11-14 18:04:35 +01:00
ggml-alloc.c	ggml: new optimization interface (ggml/988)	2024-11-17 08:30:29 +02:00
ggml-backend-impl.h	llama : refactor model loader with backend registry (#10026)	2024-10-30 02:01:23 +01:00
ggml-backend-reg.cpp	ggml : build backends as libraries (#10256)	2024-11-14 18:04:35 +01:00
ggml-backend.cpp	ggml/sched : do not skip views in pre-assignments	2024-11-21 09:22:05 +02:00
ggml-common.h	ggml-quants : ternary packing for TriLMs and BitNet b1.58 (#8151)	2024-09-05 21:48:47 -04:00
ggml-impl.h	ggml-opt: fix data corruption (ggml/1022)	2024-11-21 09:22:02 +02:00
ggml-opt.cpp	ggml-opt: fix data corruption (ggml/1022)	2024-11-21 09:22:02 +02:00
ggml-quants.c	ggml : build backends as libraries (#10256)	2024-11-14 18:04:35 +01:00
ggml-quants.h	ggml : build backends as libraries (#10256)	2024-11-14 18:04:35 +01:00
ggml-threading.cpp	ggml : build backends as libraries (#10256)	2024-11-14 18:04:35 +01:00
ggml-threading.h	ggml : build backends as libraries (#10256)	2024-11-14 18:04:35 +01:00
ggml.c	cuda : optimize argmax (#10441)	2024-11-21 18:18:50 +01:00