llama.cpp

leo-pony c18610b4ee	2024-11-22 14:07:20 +08:00
CANN: Support Ascend310P to accelerate F32 and F16 Model (#10216)
* CANN Support Ascend310P to accelerate F32 and F16 Model
* Add compile option soc type macro ASCEND_310P to ggml-cann lib
* Remove unused code
* Remove the ascend soc_type hard code compile option in CMakelist.txt
ascendc_kernels.h	cann: support q4_0 model (#8822)	2024-08-05 12:22:30 +08:00
CMakeLists.txt	CANN: Support Ascend310P to accelerate F32 and F16 Model (#10216)	2024-11-22 14:07:20 +08:00
dup.cpp	CANN: Support Ascend310P to accelerate F32 and F16 Model (#10216)	2024-11-22 14:07:20 +08:00
get_row_f16.cpp	CANN: Support Ascend310P to accelerate F32 and F16 Model (#10216)	2024-11-22 14:07:20 +08:00
get_row_f32.cpp	CANN: Support Ascend310P to accelerate F32 and F16 Model (#10216)	2024-11-22 14:07:20 +08:00
get_row_q4_0.cpp	CANN: Support Ascend310P to accelerate F32 and F16 Model (#10216)	2024-11-22 14:07:20 +08:00
get_row_q8_0.cpp	[CANN] Add Ascend NPU backend (#6035)	2024-07-17 14:23:50 +03:00
quantize_f16_q8_0.cpp	[CANN] Add Ascend NPU backend (#6035)	2024-07-17 14:23:50 +03:00
quantize_f32_q8_0.cpp	[CANN] Add Ascend NPU backend (#6035)	2024-07-17 14:23:50 +03:00
quantize_float_to_q4_0.cpp	cann: fix buffer_num and runtime speed slowly error (#8865)	2024-08-05 21:10:37 +08:00