llama.cpp

Author	SHA1	Message	Date
Pierrick HYMBERT	672d98f6f0	server: tests: CORS and api key checks scenario	2024-02-21 01:51:33 +01:00
Pierrick HYMBERT	6dcbcfe6ba	server: tests: simplify completion scenario	2024-02-21 00:43:50 +01:00
Pierrick HYMBERT	19664b9f01	server: tests: detokenize endpoint issue reference added	2024-02-21 00:17:38 +01:00
Pierrick HYMBERT	1065f6d41b	server: tests: add tokenize/detokenize scenario	2024-02-21 00:13:53 +01:00
Pierrick HYMBERT	e6d482088d	server: tests: add embeddings scenario	2024-02-21 00:02:30 +01:00
Pierrick HYMBERT	1ecda0d13e	server: tests: disable issue 3969 scenario	2024-02-20 23:35:44 +01:00
Pierrick HYMBERT	b0b6d83c76	server: tests: add infinite loop scenario	2024-02-20 23:17:00 +01:00
Pierrick HYMBERT	68574c6f98	server: tests: add infinite loop scenario	2024-02-20 23:11:59 +01:00
Pierrick HYMBERT	6b9dc4f291	server: tests: add infinite loop	2024-02-20 23:05:27 +01:00
Pierrick HYMBERT	0772884b06	server: tests: add a constant seed in completion request	2024-02-20 22:55:29 +01:00
Pierrick HYMBERT	b9f8390d28	server: tests: check for infinite loops	2024-02-20 22:49:36 +01:00
Pierrick HYMBERT	367b59a15c	server: tests: check for infinite loops	2024-02-20 22:45:30 +01:00
Pierrick HYMBERT	c355f76427	server: tests: slots endpoint checks	2024-02-20 22:32:11 +01:00
Pierrick HYMBERT	11adf1d864	server: tests: add OAI multi user scenario	2024-02-20 22:00:09 +01:00
Pierrick HYMBERT	9b7ea97979	server: tests: add OAI stream test, fix file end of line, fast fail behave	2024-02-20 21:34:35 +01:00
Pierrick HYMBERT	56583bee41	server: tests: refactor steps and vocabulary	2024-02-20 20:52:24 +01:00
Pierrick HYMBERT	6c95ec6587	server: tests: change model to: @karpathy's tinyllamas	2024-02-20 20:50:14 +01:00
Pierrick HYMBERT	8bb586bf06	server: tests: add health check and concurrent request example	2024-02-20 19:05:21 +01:00
Pierrick HYMBERT	1680599b01	server: tests: build only the server	2024-02-20 19:05:21 +01:00
Pierrick HYMBERT	fe9866a52d	server: tests: use ngxson llama_xs_q4.bin	2024-02-20 19:05:21 +01:00
Pierrick HYMBERT	30aa323fb9	server: tests: fix ci workflow	2024-02-20 19:05:21 +01:00
Pierrick HYMBERT	4e5245e6b8	server: tests: fix ci workflow	2024-02-20 19:05:21 +01:00
Pierrick HYMBERT	6497755de5	server: tests: fix ci workflow	2024-02-20 19:05:21 +01:00
Pierrick HYMBERT	9b63d7057a	server: tests: reduce number of files, all in one tests shell script	2024-02-20 19:05:21 +01:00
Pierrick HYMBERT	157bcf2286	server: init functional test	2024-02-20 19:05:21 +01:00
Daniel Bevenius	4ed8e4fbef	llava : add explicit instructions for llava-1.6 (#5611) This commit contains a suggestion for the README.md in the llava example. The suggestion adds explicit instructions for how to convert a llava-1.6 model and run it using llava-cli. The motivation for this is that having explicit instructions similar to the 1.5 instructions will make it easier for users to try this out. Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>	2024-02-20 19:30:27 +02:00
Xuan Son Nguyen	9c405c9f9a	Server: use llama_chat_apply_template (#5593) * server: use llama_chat_apply_template * server: remove trailing space * server: fix format_chat * server: fix help message Co-authored-by: Georgi Gerganov <ggerganov@gmail.com> * server: fix formatted_chat --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2024-02-20 15:58:27 +01:00
Dane Madsen	5207b3fbc5	readme : update UI list (#5605) * Add maid to ui list * Specify licence	2024-02-20 12:00:23 +02:00
Haoxiang Fei	8dbbd75754	metal : add build system support for embedded metal library (#5604) * add build support for embedded metal library * Update Makefile --------- Co-authored-by: Haoxiang Fei <feihaoxiang@idea.edu.cn> Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2024-02-20 11:58:36 +02:00
Pierrick Hymbert	c0a8c6db37	server : health endpoint configurable failure on no slot (#5594)	2024-02-20 09:48:19 +02:00
AidanBeltonS	b9111bd209	Update ggml_sycl_op_mul_mat_vec_q (#5502) * Update ggml_sycl_op_mul_mat_vec_q * Apply suggestions from code review Co-authored-by: Abhilash Majumder <30946547+abhilash1910@users.noreply.github.com> * revert suggestion on macro * fix bug * Add quant type GGML_TYPE_IQ1_S to unsupported * fix format --------- Co-authored-by: Abhilash Majumder <30946547+abhilash1910@users.noreply.github.com>	2024-02-20 12:31:25 +05:30
Mathijs de Bruin	633782b8d9	nix: now that we can do so, allow MacOS to build Vulkan binaries Author: Philip Taron <philip.taron@gmail.com> Date: Tue Feb 13 20:28:02 2024 +0000	2024-02-19 14:49:49 -08:00
0cc4m	22f83f0c38	Enable Vulkan MacOS CI	2024-02-19 14:49:49 -08:00
0cc4m	bb9dcd560a	Refactor validation and enumeration platform checks into functions to clean up ggml_vk_instance_init()	2024-02-19 14:49:49 -08:00
0cc4m	f50db6ae0b	Add check for VK_KHR_portability_enumeration for MoltenVK support	2024-02-19 14:49:49 -08:00
Mathijs de Bruin	d8c054517d	Add preprocessor checks for Apple devices. Based on work by @rbourgeat in https://github.com/ggerganov/llama.cpp/pull/5322/files	2024-02-19 14:49:49 -08:00
Mathijs de Bruin	42f664a382	Resolve ErrorIncompatibleDriver with Vulkan on MacOS. Refs: - https://chat.openai.com/share/7020ce72-65fc-45ec-b7be-9d9d798a5f3f - https://github.com/SaschaWillems/Vulkan/issues/954 - https://github.com/haasn/libplacebo/issues/128 - https://github.com/KhronosGroup/Vulkan-Samples/issues/476	2024-02-19 14:49:49 -08:00
Mathijs de Bruin	5dde540897	Allow for Vulkan build with Accelerate. Closes #5304	2024-02-19 14:49:49 -08:00
slaren	40c3a6c1e1	cuda : ignore peer access already enabled errors (#5597) * cuda : ignore peer access already enabled errors * fix hip	2024-02-19 23:40:26 +01:00
Jared Van Bortel	f24ed14ee0	make : pass CPPFLAGS directly to nvcc, not via -Xcompiler (#5598)	2024-02-19 15:54:12 -05:00
nopperl	9d679f0fcc	examples : support minItems/maxItems in JSON grammar converter (#5039) * support minLength and maxLength in JSON schema grammar converter * Update examples/json-schema-to-grammar.py --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2024-02-19 16:14:07 +02:00
Georgi Gerganov	1387cf60f7	llava : remove extra cont (#5587)	2024-02-19 15:23:17 +02:00
slaren	6fd413791a	llava : replace ggml_cpy with ggml_cont	2024-02-19 15:09:43 +02:00
Georgi Gerganov	337c9cbd52	sync : ggml ggml-ci	2024-02-19 15:09:43 +02:00
Georgi Gerganov	a3145bdc30	ggml-alloc : apply ggml/731	2024-02-19 15:09:43 +02:00
Didzis Gosko	890559ab28	metal : option to embed MSL source into compiled binary (whisper/1842) * ggml : embed Metal library source (ggml-metal.metal) into binary enable by setting WHISPER_EMBED_METAL_LIBRARY * rename the build option * rename the preprocessor directive * generate Metal library embedding assembly on-fly during build process	2024-02-19 15:09:43 +02:00
Georgi Gerganov	d0e3ce51f4	ci : enable -Werror for CUDA builds (#5579) * cmake : pass -Werror through -Xcompiler ggml-ci * make, cmake : enable CUDA errors on warnings ggml-ci	2024-02-19 14:45:41 +02:00
Georgi Gerganov	68a6b98b3c	make : fix CUDA build (#5580)	2024-02-19 13:41:51 +02:00
valiray	70d45af0ef	readme : fix typo in README-sycl.md (#5353)	2024-02-19 12:37:10 +02:00
Abhilash Majumder	13e2c771aa	cmake : remove obsolete sycl compile flags (#5581) * rm unwanted sycl compile options * fix bug * fix bug * format fix	2024-02-19 11:15:18 +02:00

2243 commits