Commit graph

2399 commits

Author SHA1 Message Date
teddybear082
7d120f2794
Add context size parameter to google colab notebook (#489)
- Add configurable context size to the parameters, alongside models and layers, for ease of use.

- This can already be done with a simple edit by experienced LLM users, but new users may not know this is a parameter they should set.

Co-authored-by: LostRuins <39025047+LostRuins@users.noreply.github.com>
2023-10-24 17:13:01 +08:00
Concedo
7744aa6a9c updated colab 2023-10-24 15:37:47 +08:00
Concedo
5f1f8a5a89 adjust 2023-10-22 21:53:54 +08:00
Concedo
ccf8334651 remove script (+8 squashed commits)
Squashed commits:

[bde2e3da] should be working

[1cde82c0] update

[bb6c8676] wip

[66b698d1] wip colab

[9953466a] wip colab

[ae0bedea] json fix

[0aac144f] wip on optimized colab

[ec9f8e96] prepare colab binaries notebook
2023-10-22 21:38:38 +08:00
Concedo
fafe999ff9 update lite and colab (+1 squashed commit)
Squashed commit:

[06b6ca6d] updated lite and colab
2023-10-22 14:03:18 +08:00
Concedo
cff75061fe fixed some old models failing due to tokenizer changes, updated lite (+1 squashed commit)
Squashed commit:

[9dee81ec] fixed some old models failing due to tokenizer changes, updated lite tooltip (+3 squashed commits)

Squashed commits:

[5ab95a79] fixes

[a561d5e2] fixed some old models failing due to tokenizer changes

[95e65daf] lite updates
2023-10-22 11:04:59 +08:00
Concedo
dd1d61ea6b colab is fixed (+1 squashed commit)
Squashed commit:

[0b2a51f3] fix colab (+1 squashed commit)

Squashed commit:

[a6b832d0] fix colab (+1 squashed commit)

Squashed commit:

[8f88f210] updated colab (+1 squashed commit)

Squashed commit:

[75552e0d] try new colab
2023-10-21 10:08:32 +08:00
Concedo
6119a2b5b2 revert lite change 2023-10-20 22:13:56 +08:00
Concedo
6fa681b692 fixed a race condition with SSE streaming 2023-10-20 22:01:09 +08:00
Concedo
5f5d5f1d86 quick fix 2023-10-20 19:43:56 +08:00
Concedo
012c53367d minor lite fixes 2023-10-20 18:41:17 +08:00
Concedo
d3c7b7cc71 colab fix 2023-10-20 16:34:45 +08:00
Concedo
d5016fdc8f fixed a lite bug 2023-10-20 16:03:06 +08:00
Concedo
ee93213218 updated lite 2023-10-20 15:44:52 +08:00
Concedo
cd3bb3ede2 update colab link 2023-10-20 13:49:34 +08:00
Concedo
8947142c46 updated lite and colab 2023-10-20 11:35:44 +08:00
Concedo
8d31550d48 fix groupchat 2023-10-19 23:40:15 +08:00
Concedo
957e245285 Merge branch 'master' into concedo_experimental
# Conflicts:
#	Makefile
#	README.md
2023-10-19 23:32:52 +08:00
kalomaze
ddce116ec9
Fix for Top K disabling (#480)
* Update gpttype_adapter.cpp

* use n_vocab instead of 32000 when top-k is off
2023-10-19 23:20:44 +08:00
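
A minimal sketch of the idea behind the fix above, assuming a llama.cpp-style sampler (the helper name is illustrative, not the actual gpttype_adapter.cpp code): when top-k is disabled, the sampler should consider the model's full vocabulary rather than a hardcoded 32000 entries.

```cpp
#include <algorithm>

// top_k <= 0 conventionally means "top-k disabled": fall back to the
// model's true vocabulary size instead of assuming the LLaMA-1 default
// of 32000, which breaks models with larger vocabularies.
static int effective_top_k(int top_k, int n_vocab) {
    return (top_k <= 0) ? n_vocab : std::min(top_k, n_vocab);
}
```
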
Concedo
8c6001de2a updated lite 2023-10-19 23:18:14 +08:00
Concedo
fd770bb105 patch 2023-10-19 23:04:26 +08:00
Concedo
4382e51719 updated lite and default horde ctx amount 2023-10-19 22:49:59 +08:00
M. Yusuf Sarıgöz
60abea9798
llava : avoid segfault in case of non-existent mmproj file (#3674) 2023-10-19 16:59:11 +03:00
Georgi Gerganov
004797f6ac
readme : update hot topics 2023-10-18 21:44:43 +03:00
Georgi Gerganov
4e82b2ea3f
speculative : bug fixes 2023-10-18 18:49:40 +03:00
Georgi Gerganov
0e89203b51
speculative : add tree-based sampling example (#3624)
* sampling : one sequence per sampling context

ggml-ci

* speculative : add tree-based sampling support

ggml-ci

* speculative : reuse the n_parallel CLI param

* speculative : refactor sampling

* examples : fix build after sampling refactoring

ggml-ci

* batched : fix n_seq_id

* sampling : fix malloc

ggml-ci

* swift : fix build

ggml-ci

* swift : try to fix build

ggml-ci

* prompts : add assistant.txt

* common : add llama_batch_add() and llama_batch_clear() helpers (see the usage sketch after this entry)

* speculative : minor refactor

ggml-ci

* minor : comments + rename

ggml-ci

* speculative : fix off-by-one for n_drafted

* speculative : fix the n_drafted fix + p constants
2023-10-18 16:21:57 +03:00
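
A hedged usage sketch of the llama_batch_add() and llama_batch_clear() helpers added in the change above, based on llama.cpp's common.h as of this commit (check the current headers before relying on these signatures):

```cpp
#include <vector>
#include "common.h"
#include "llama.h"

// Build a single-sequence batch from a tokenized prompt, requesting
// logits only for the last token.
static void fill_batch(llama_batch & batch, const std::vector<llama_token> & prompt) {
    llama_batch_clear(batch);  // resets batch.n_tokens to 0
    for (size_t i = 0; i < prompt.size(); ++i) {
        // token id, position, sequence ids, whether to compute logits
        llama_batch_add(batch, prompt[i], (llama_pos) i, { 0 }, false);
    }
    batch.logits[batch.n_tokens - 1] = true;
}
```
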
Jhen-Jie Hong
c67fe68e41
metal : implement q5_0 and q5_1 kernels (#3648)
* metal : implement dequantize_q5_0

* metal : block_q_n_dot_y for block_q5_0 (broken)

* metal : revert unnecessary change

* metal : implement dequantize_q5_1

* metal : block_q_n_dot_y for q5_1 (broken)

* metal : fix block_q_n_dot_y

* minor : spaces / formatting

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-10-18 15:21:48 +03:00
shibe2
1117d06607
opencl : fix element-wise multiplication (#3656) 2023-10-18 15:09:22 +03:00
Concedo
c1ca1de2ac fixed support for old falcon models 2023-10-18 17:20:44 +08:00
Concedo
700951dbd4 Merge branch 'master' into concedo_experimental
# Conflicts:
#	README.md
2023-10-18 16:33:09 +08:00
Concedo
53b7cdf8a3 Merge branch 'concedo' into concedo_experimental 2023-10-18 13:51:13 +08:00
slaren
cb33f43a2a
fix embeddings when using CUDA (#3657) 2023-10-17 22:24:50 +02:00
Georgi Gerganov
e1675d133c
llama : avoid fprintf in favor of LLAMA_LOG (#3538) 2023-10-17 22:34:26 +03:00
BarfingLemurs
8402566a7c
readme : update hot-topics & models, detail windows release in usage (#3615)
* Update README.md

* Update README.md

* Update README.md

* move "Running on Windows" section below "Prepare data and run"

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-10-17 21:13:21 +03:00
LostRuins
6e34d31c44
Update README.md (#479) 2023-10-18 01:24:14 +08:00
shibe2
40e5ce054f CLBlast: Fix temporary buffer size for f16 conversion (wsize)
Fix buffer overflow.
Reduce the size to fit just one 2D slice.
Assert sufficient size.
2023-10-17 21:02:30 +04:00
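
An illustrative sketch of the sizing logic described above (names hypothetical; not the literal ggml-opencl.cpp change): the temporary buffer for the f32-to-f16 conversion only needs to hold one 2D slice of the tensor, since slices are converted and uploaded one at a time, and the allocation is asserted sufficient at the point of use.

```cpp
#include "ggml.h"

// Size of the scratch buffer needed to convert one 2D slice to fp16.
static size_t f16_conv_wsize(const struct ggml_tensor * t) {
    return (size_t) t->ne[0] * (size_t) t->ne[1] * sizeof(ggml_fp16_t);
}

// At the call site, assert the allocated scratch buffer is big enough:
//   GGML_ASSERT(wsize >= f16_conv_wsize(src1));
```
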
slaren
a5e8c1d8c7
train-text-from-scratch : fix assert failure in ggml-alloc (#3618) 2023-10-17 20:00:58 +03:00
Georgi Gerganov
e74c705e15
editorconfig : remove trailing spaces 2023-10-17 19:52:53 +03:00
coezbek
3ad1e3f1a1
server : documentation of JSON return value of /completion endpoint (#3632)
* Added documentation of JSON return value of /completion endpoint

* Update examples/server/README.md

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-10-17 19:51:02 +03:00
Georgi Gerganov
1142013da4
save-load-state : fix example + add ci test (#3655)
* save-load-state : fix example (close #3606)

* ci : add test for save-load-state example

ggml-ci
2023-10-17 19:12:46 +03:00
ldwang
5fe268a4d9
readme : add Aquila2 links (#3610)
Signed-off-by: ldwang <ftgreat@gmail.com>
Co-authored-by: ldwang <ftgreat@gmail.com>
2023-10-17 18:52:33 +03:00
staviq
1a159553f9
tokenizer : special token handling (#3538)
* Rewrite special token handling from #1931

* shorten param name, add special-token verification by type

* use offsets instead of copy by substr

* formatting, remove copying iterator on delete

* llama : normalize code-style

* swift fix

* print prefix/suffix if verbose; main: split prefix/input/suffix

* don't add space when using special tokens

* minor : comment + spacing

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-10-17 18:11:01 +03:00
Concedo
6f8fe88f10 fix for lite (+5 squashed commits)
Squashed commits:

[f9ce9855] catch more exceptions

[8cdaf149] tweaked horde worker timeouts, updated lite

[619ebef4] fixed abort no response if failed

[a54a66a2] fixed time overflow

[9affdc3e] updated lite
2023-10-17 23:04:32 +08:00
Georgi Gerganov
281ef73c25
k-quants : fix quantization ranges (#3646) 2023-10-17 09:19:28 +03:00
Georgi Gerganov
940efa95fe
llava : fix tokenization to not add bos between image embeddings and user prompt (#3645)
* llava : fix tokenization to not add bos after system prompt

* set seed

---------

Co-authored-by: M. Yusuf Sarıgöz <yusufsarigoz@gmail.com>
2023-10-16 23:58:00 +03:00
Concedo
ee0681f0d9 converted some asserts into non-terminating ones since they are overzealous 2023-10-15 16:12:20 +08:00
Concedo
5cfabaee25 Merge branch 'master' into concedo_experimental
# Conflicts:
#	CMakeLists.txt
#	Makefile
#	README.md
#	docs/BLIS.md
2023-10-15 15:50:20 +08:00
cebtenzzre
11bff29045
MPT : support GQA for replit-code-v1.5 (#3627) 2023-10-15 09:32:06 +03:00
M. Yusuf Sarıgöz
11dc1091f6
Honor -ngl option for Cuda offloading in llava (#3621) 2023-10-14 04:52:44 -06:00
Daniel Bevenius
2a4bcbacea
llama : remove n_threads from llama_decode_internal (#3614)
This commit removes `n_threads` from the `llama_decode_internal`
function's doc comment, as the parameter no longer exists.

It looks like this parameter was removed in
commit 16bc66d947 ("llama.cpp : split
llama_context_params into model and context params").

Signed-off-by: Daniel Bevenius <daniel.bevenius@gmail.com>
2023-10-13 13:33:16 +03:00