Commit graph

614 commits

Author SHA1 Message Date
Georgi Gerganov
78af3e92c9
ggml : fix compiler warnings + cosmetic changes 2023-05-08 18:37:17 +03:00
xaedes
0d72207ac3
c++ in baby-llama example
use c++ includes instead of c includes
use std::min, std::max instead of MIN, MAX macros
2023-05-08 16:56:55 +02:00
xaedes
dea9c9359a
c++ in baby-llama example
use c++ includes instead of c includes
use std::min, std::max instead of MIN, MAX macros
2023-05-08 16:40:31 +02:00
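The two commits above swap C-style MIN/MAX macros for the C++ standard library. A minimal sketch (hypothetical values, not taken from the example itself) of why `std::max` is preferred — a function-like macro evaluates its arguments twice:

```cpp
#include <algorithm>  // std::min, std::max

// Classic C-style macro: both (a) and (b) can be evaluated twice,
// so side effects such as i++ fire more than once.
#define MAX_MACRO(a, b) ((a) > (b) ? (a) : (b))

int macro_result() {
    int vals[] = {5, 2};
    int i = 0;
    // Expands to ((vals[i++]) > (0) ? (vals[i++]) : (0)):
    // the condition consumes vals[0], the true branch re-reads vals[1].
    return MAX_MACRO(vals[i++], 0);  // yields 2, not the intended 5
}

int std_result() {
    int vals[] = {5, 2};
    int i = 0;
    return std::max(vals[i++], 0);   // arguments evaluated exactly once: 5
}
```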
xaedes
1ecbece752
disable slow tests grad0 and opt to avoid exceeding timeouts 2023-05-08 02:29:36 +02:00
xaedes
f5301061b6
remove busy loop that was used as a sleep for slower sine wave generation 2023-05-08 01:12:37 +02:00

xaedes
4997bc5819
reduce number of test-grad0 iterations
avoid exceeding timeout of automated tests
2023-05-08 00:57:41 +02:00
xaedes
2936dd60a4
remove trailing whitespace 2023-05-08 00:04:54 +02:00
xaedes
7c8768f819
add missing include for strcmp, etc 2023-05-07 23:43:43 +02:00
xaedes
660836f0ff
fix call to ggml_set_name 2023-05-07 23:39:57 +02:00
xaedes
9dd8e405fb
rename print functions in baby-llama example 2023-05-07 22:43:23 +02:00
xaedes
47ad186628
revert disabling of threading for rms_norm and norm 2023-05-07 21:56:10 +02:00
xaedes
5d9fed7e7f
remove shape annotations in llama_eval_internal 2023-05-07 21:45:29 +02:00
xaedes
d20ba6f6e6
update static assert of GGML_OP_COUNT 2023-05-07 21:42:42 +02:00
xaedes
e643fa1619
smaller default values for baby llama model parameters 2023-05-07 21:38:00 +02:00
xaedes
ee565f34e3
Merge branch 'master' into train-example
# Conflicts:
#	ggml.c
#	llama.cpp
2023-05-07 21:24:12 +02:00
xaedes
4764842120
change name of GGML_OP_ADD_AT to GGML_OP_ACC 2023-05-07 21:14:57 +02:00
xaedes
e0de09d77e
shorten code using a variable 2023-05-07 19:48:38 +02:00
xaedes
49d6daa11e
vastly improve training results
instead of logit targets 0 and 1 use -1 and +1.
2023-05-07 19:46:05 +02:00
xaedes
93201abdb7
add trainable lora-only model with all big matrices C split into A,B with A*B=C
this is not a lora-finetune; instead, the whole model is changed to have only low-rank "lora" matrices.

training this instead of the normal model resulted in much worse results, though...
2023-05-07 19:44:51 +02:00
Henri Vasserman
e1295513a4
CI: add Windows CLBlast and OpenBLAS builds (#1277)
* Add OpenCL and CLBlast support

* Add OpenBLAS support

* Remove testing from matrix

* change build name to 'clblast'
2023-05-07 13:20:09 +02:00
swittk
1b0fd45465
ggml : Allow usage of CLBlast alongside Accelerate.framework (#1336)
Minor edit in ggml.c; the original code would prevent OpenCL from loading completely if GGML_USE_ACCELERATE was defined.
Minor speedup in prompt eval time.
2023-05-06 23:03:23 -04:00
xaedes
e91b83b899
add GGML_ASSERT to catch ggml_rope and back value errors 2023-05-07 01:47:14 +02:00
xaedes
561fbe0d1b
replace inplace operations for training with copying operations to allow gradient propagation 2023-05-07 01:33:42 +02:00
xaedes
956511b248
fix kv_self gradients for training
use ggml_set instead of ggml_cpy to set the kv_self cache while properly propagating gradients
2023-05-07 01:32:46 +02:00
xaedes
47561de7d8
add ggml_set(ctx, a, b) to set b in view of a and return modified a
necessary to set values into kv_self cache and properly propagate the gradients
2023-05-07 01:30:34 +02:00
xaedes
48bcc4dcf9
fix backward pass for add_at and change arguments to have same order as in view 2023-05-07 01:27:11 +02:00
xaedes
226521a4f1
optimize loss over multiple samples
this increases the computation graph; a parallel batched forward pass is needed for more efficiency.
2023-05-07 01:23:51 +02:00
xaedes
7a5dec24f8
add square_error_loss and cross_entropy_loss functions 2023-05-07 01:21:26 +02:00
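A hedged sketch of the two losses named in the commit, written as plain C++ over `std::vector` rather than ggml tensor ops (the actual functions build graph nodes):

```cpp
#include <cmath>
#include <vector>

// Sum of squared differences between prediction and target.
float square_error_loss(const std::vector<float> & pred,
                        const std::vector<float> & target) {
    float sum = 0.0f;
    for (size_t i = 0; i < pred.size(); ++i) {
        float d = pred[i] - target[i];
        sum += d * d;
    }
    return sum;
}

// Cross entropy over a probability distribution: -sum_i target[i] * log(probs[i]).
// Assumes `probs` is already normalized (e.g. by a softmax).
float cross_entropy_loss(const std::vector<float> & probs,
                         const std::vector<float> & target) {
    const float eps = 1e-9f;  // guard against log(0)
    float sum = 0.0f;
    for (size_t i = 0; i < probs.size(); ++i) {
        sum -= target[i] * std::log(probs[i] + eps);
    }
    return sum;
}
```

This also motivates the surrounding commits: cross entropy needs a log operation (ggml_log) and a row-sum with a backward pass (ggml_sum_rows) to be expressible as a differentiable graph.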
xaedes
73fd66e9e5
fix training get_example_targets
predict the next token, not the current token!
2023-05-07 01:18:17 +02:00
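The fix above is the standard language-model target shift: the target at position i is token i+1, not token i. A sketch with a hypothetical helper (not the actual `get_example_targets` code), where n tokens yield n-1 (input, target) pairs:

```cpp
#include <utility>
#include <vector>

// Given a sample of tokens t[0..n-1], train to predict the NEXT token:
// inputs are t[0..n-2], targets are t[1..n-1].
std::pair<std::vector<int>, std::vector<int>>
make_next_token_targets(const std::vector<int> & tokens) {
    std::vector<int> inputs(tokens.begin(), tokens.end() - 1);
    std::vector<int> targets(tokens.begin() + 1, tokens.end());
    return {inputs, targets};
}
```

Training on (t[i] -> t[i]) instead would teach the model the identity mapping, which is why the original get_example_targets produced a broken objective.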
Jed Fox
3924088512
Remove default arguments from sampling functions (#1343) 2023-05-06 17:01:47 -04:00
xaedes
80223d98fd
add test for ggml_sum_rows gradients 2023-05-06 18:01:32 +02:00
xaedes
e6186d98a5
implement ggml_repeat support for rank > 2 tensors 2023-05-06 18:01:17 +02:00
xaedes
7a15a8370c
implement backward pass for ggml_sum_rows, necessary for cross entropy loss 2023-05-06 17:37:51 +02:00
xaedes
5724628d31
add test for ggml_log gradients 2023-05-06 17:36:21 +02:00
xaedes
65d9f7349d
add ggml_log operation necessary for cross entropy loss 2023-05-06 17:35:40 +02:00
xaedes
8cf04fec9d
fix soft_max backward pass for input->ne[1] != 1 2023-05-06 17:32:13 +02:00
xaedes
b4c273f7a3
add ggml_reshape_1d, ggml_reshape_4d and ggml_view_4d 2023-05-06 17:29:41 +02:00
xaedes
f1d51d144b
train on multiple examples, generate & print tokens with trained model afterwards
ctx0 for evaluation and optimization is renewed for each sample
2023-05-06 14:16:40 +02:00
xaedes
83ee1cd741
fix bug when using ggml_opt to optimize params in one context and use a renewable context for eval and opt
when not keeping gradients of model parameters, they are overwritten by tensors created by opt, which may become invalid after the opt context is renewed.
so we need to keep the original gradients and make duplicates for opt
2023-05-06 13:05:29 +02:00
DaniAndTheWeb
173d0e6419
makefile: automatic Arch Linux detection (#1332)
This commit is a port of a detection method used in koboldcpp's Makefile in order to automatically set the -lcblas option on Arch Linux
2023-05-05 23:57:14 +02:00
Erik Scholz
a3b85b28da
ci : add cublas to windows release (#1271) 2023-05-05 22:56:09 +02:00
Pavol Rusnak
921dcee00a
readme: add missing info (#1324) 2023-05-05 16:43:36 +02:00
Ionoclast Laboratories
2d13786e91
Fix for OpenCL / CLBlast builds on macOS. (#1329) 2023-05-05 14:18:21 +02:00
Benjamin Lecaillon
a90e96b266
Convert.py @staticmethod (#1327)
* Line 698 has a #staticmethod and should not,

otherwise it throws an error at unpickle.load() as not callable

* Update convert.py

---------

Co-authored-by: Ivan Stepanov <ivanstepanovftw@gmail.com>
2023-05-05 03:17:07 +03:00
slaren
94c5652fc0
quantize: make output filename optional, default to ggml-model-<ftype>.bin (#1301) 2023-05-05 00:58:56 +02:00
Ivan Stepanov
34d9f22f44
Wrap exceptions in std::exception to give verbose output on exception. (#1316) 2023-05-04 18:56:27 +02:00
Ivan Stepanov
d3e8093e9b
convert: support DT_BF16 tensors (#1309)
Co-authored-by: Pavol Rusnak <pavol@rusnak.io>
2023-05-04 18:54:37 +02:00
44670
360cfe5bec
readme : add OpenBuddy link (#1321) 2023-05-04 19:33:31 +03:00
44670
2edbdb0f99
main : add --in-suffix option (#1318)
* adding --in-suffix option

* print input suffix before generation
2023-05-04 18:41:12 +03:00
Ron Jailall
20fbf2a2a0
ggml : change immintrin.h to intrin.h for compatibility (#1307)
* change immintrin.h to intrin.h for compatibility

Building on Windows 11 ARM throws an error on this line. It seems that using intrin.h covers both x86 and ARM

* conditional def of intrin.h

* fix typo in ggml.c
2023-05-04 18:05:59 +03:00
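The conditional include the commit describes can be sketched as follows (the exact guards in ggml.c may differ): MSVC's <intrin.h> covers both x86 and ARM, while GCC/Clang on x86 provide <immintrin.h>:

```cpp
// Pick the intrinsics header per compiler/architecture instead of
// unconditionally including <immintrin.h>, which fails on Windows-on-ARM.
#if defined(_MSC_VER)
  #include <intrin.h>
#elif defined(__x86_64__) || defined(__i386__)
  #include <immintrin.h>
#endif

// trivial probe so this translation unit compiles on any target
int includes_compiled() { return 1; }
```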