Commit graph

603 commits

Author SHA1 Message Date
xaedes
5d9fed7e7f
remove shape annotations in llama_eval_internal 2023-05-07 21:45:29 +02:00
xaedes
d20ba6f6e6
update static assert of GGML_OP_COUNT 2023-05-07 21:42:42 +02:00
xaedes
e643fa1619
smaller default values for baby llama model parameters 2023-05-07 21:38:00 +02:00
xaedes
ee565f34e3
Merge branch 'master' into train-example
# Conflicts:
#	ggml.c
#	llama.cpp
2023-05-07 21:24:12 +02:00
xaedes
4764842120
change name of GGML_OP_ADD_AT to GGML_OP_ACC 2023-05-07 21:14:57 +02:00
xaedes
e0de09d77e
shorten code using a variable 2023-05-07 19:48:38 +02:00
xaedes
49d6daa11e
vastly improve training results
instead of logit targets 0 and 1, use -1 and +1.
2023-05-07 19:46:05 +02:00
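A minimal sketch of the target change described above, with assumed names (the actual example code differs in detail): the correct token's logit target becomes +1 and all other entries become -1, instead of 1 and 0.

```c
#include "ggml.h"

// sketch (assumed names): fill one row of logit targets so the correct
// token gets +1 and every other vocabulary entry gets -1 (previously 1/0)
static void set_logit_targets(struct ggml_tensor * targets, int n_vocab, int token_id) {
    float * row = (float *) targets->data;
    for (int i = 0; i < n_vocab; ++i) {
        row[i] = -1.0f;        // was 0.0f before this change
    }
    row[token_id] = +1.0f;     // was 1.0f
}
```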
xaedes
93201abdb7
add trainable lora-only model with all big matrices C split into A,B with A*B=C
this is not a lora finetune; rather, the whole model is changed to contain only low-rank "lora" matrices.

training this instead of the normal model gave much worse results, though...
2023-05-07 19:44:51 +02:00
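Independent of ggml specifics, the idea is classic low-rank factorization: each big weight matrix C (n x m) is replaced by trainable factors A (n x r) and B (r x m) with r much smaller than n and m, so only n*r + r*m values are stored and trained instead of n*m. A self-contained illustration (names hypothetical):

```c
#include <stddef.h>

// illustrative only: reconstruct C = A*B from a rank-r factorization;
// with r << min(n, m) the factors hold far fewer parameters than C itself
void matmul_lowrank(const float * A, const float * B, float * C,
                    size_t n, size_t r, size_t m) {
    for (size_t i = 0; i < n; ++i) {
        for (size_t j = 0; j < m; ++j) {
            float acc = 0.0f;
            for (size_t k = 0; k < r; ++k) {
                acc += A[i*r + k] * B[k*m + j]; // C[i][j] = sum_k A[i][k]*B[k][j]
            }
            C[i*m + j] = acc;
        }
    }
}
```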
Henri Vasserman
e1295513a4
CI: add Windows CLBlast and OpenBLAS builds (#1277)
* Add OpenCL and CLBlast support

* Add OpenBLAS support

* Remove testing from matrix

* change build name to 'clblast'
2023-05-07 13:20:09 +02:00
swittk
1b0fd45465
ggml : Allow usage of CLBlast alongside Accelerate.framework (#1336)
Minor edit in ggml.c which originally would prevent OpenCL from loading completely if GGML_USE_ACCELERATE was defined.
Minor speedup in prompt eval time.
2023-05-06 23:03:23 -04:00
xaedes
e91b83b899
add GGML_ASSERT to catch ggml_rope and back value errors 2023-05-07 01:47:14 +02:00
xaedes
561fbe0d1b
replace inplace operations for training with copying operations to allow gradient propagation 2023-05-07 01:33:42 +02:00
xaedes
956511b248
fix kv_self gradients for training
use ggml_set instead of ggml_cpy to set the kv_self cache while properly propagating gradients
2023-05-07 01:32:46 +02:00
xaedes
47561de7d8
add ggml_set(ctx, a, b) to set b in a view of a and return the modified a
necessary to set values in the kv_self cache and properly propagate the gradients
2023-05-07 01:30:34 +02:00
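A hedged sketch of how the new op might be used for the kv cache, per the two commits above; the helper, its names, and the use of the 1-d variant ggml_set_1d are assumptions, not the committed code:

```c
#include "ggml.h"

// sketch (assumed names): store the values computed this step into the
// cache via ggml_set_1d rather than ggml_cpy into a view. ggml_set records
// a regular op in the graph and returns the modified tensor, so gradients
// can propagate to both 'cache' and 'cur'.
static struct ggml_tensor * store_in_cache(struct ggml_context * ctx,
                                           struct ggml_tensor * cache, // e.g. kv_self.k
                                           struct ggml_tensor * cur,
                                           size_t offset_bytes) {
    return ggml_set_1d(ctx, cache, cur, offset_bytes);
}
```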
xaedes
48bcc4dcf9
fix backward pass for add_at and change arguments to have same order as in view 2023-05-07 01:27:11 +02:00
xaedes
226521a4f1
optimize loss over multiple samples
this increases the size of the computation graph; a parallel batched forward pass is needed for more efficiency.
2023-05-07 01:23:51 +02:00
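A sketch of the approach described: per-sample losses are summed into one scalar so a single ggml_opt call optimizes over all samples; forward_and_loss is a hypothetical stand-in for the example's forward pass plus loss.

```c
#include "ggml.h"

// hypothetical helper: build the forward pass for one sample and
// return its scalar loss tensor
struct ggml_tensor * forward_and_loss(struct ggml_context * ctx, int sample_id);

// sketch: each sample appends another forward pass to the same graph,
// which is why the graph grows and a batched forward would be cheaper
struct ggml_tensor * total_loss(struct ggml_context * ctx, int n_samples) {
    struct ggml_tensor * loss = NULL;
    for (int i = 0; i < n_samples; ++i) {
        struct ggml_tensor * l = forward_and_loss(ctx, i);
        loss = (loss == NULL) ? l : ggml_add(ctx, loss, l);
    }
    return loss;
}
```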
xaedes
7a5dec24f8
add square_error_loss and cross_entropy_loss functions 2023-05-07 01:21:26 +02:00
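Plausible shapes of the two functions named here, built only from ggml ops that exist on this branch (ggml_log and ggml_sum_rows are introduced in commits further down this log); a hedged reconstruction, not necessarily the committed code:

```c
#include "ggml.h"

// sum((a - b)^2) over all elements
struct ggml_tensor * square_error_loss(struct ggml_context * ctx,
                                       struct ggml_tensor * a,
                                       struct ggml_tensor * b) {
    return ggml_sum(ctx, ggml_sqr(ctx, ggml_sub(ctx, a, b)));
}

// -sum(softmax(b) * log(softmax(a) + eps)); eps keeps log() away from zero
struct ggml_tensor * cross_entropy_loss(struct ggml_context * ctx,
                                        struct ggml_tensor * a,   // predicted logits
                                        struct ggml_tensor * b) { // target logits
    const float eps = 1e-3f;
    return ggml_sum(ctx,
               ggml_neg(ctx,
                   ggml_sum_rows(ctx,
                       ggml_mul(ctx,
                           ggml_soft_max(ctx, b),
                           ggml_log(ctx,
                               ggml_add1(ctx,
                                   ggml_soft_max(ctx, a),
                                   ggml_new_f32(ctx, eps)))))));
}
```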
xaedes
73fd66e9e5
fix training get_example_targets
predict the next token, not the current token!
2023-05-07 01:18:17 +02:00
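The fix in miniature, simplified to token-id targets and assumed names: the target for input position i is the token at i + 1.

```c
// sketch (assumed names): fill inputs and targets for next-token prediction;
// the bug was using data[begin + i] (the current token) as the target
void get_example_targets_sketch(const int * data, int begin, int n_tokens,
                                int * tokens_input, int * targets) {
    for (int i = 0; i < n_tokens; ++i) {
        tokens_input[i] = data[begin + i];
        targets[i]      = data[begin + i + 1]; // the next token, not the current one
    }
}
```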
Jed Fox
3924088512
Remove default arguments from sampling functions (#1343) 2023-05-06 17:01:47 -04:00
xaedes
80223d98fd
add test for ggml_sum_rows gradients 2023-05-06 18:01:32 +02:00
xaedes
e6186d98a5
implement ggml_repeat support for rank > 2 tensors 2023-05-06 18:01:17 +02:00
xaedes
7a15a8370c
implement backward pass for ggml_sum_rows, necessary for cross entropy loss 2023-05-06 17:37:51 +02:00
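Why this pairs with the rank > 2 ggml_repeat commit just above: the gradient of a row-sum broadcasts the incoming gradient back across each summed row, which is exactly what ggml_repeat expresses. A hedged sketch of the backward step:

```c
#include "ggml.h"

// sketch: backward of y = sum_rows(x); dL/dx is dL/dy broadcast back over
// each row, accumulated into the existing gradient of x
struct ggml_tensor * sum_rows_backward(struct ggml_context * ctx,
                                       struct ggml_tensor * grad_y,   // dL/dy, ne[0] == 1
                                       struct ggml_tensor * x_grad) { // accumulator shaped like x
    return ggml_add(ctx, x_grad, ggml_repeat(ctx, grad_y, x_grad));
}
```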
xaedes
5724628d31
add test for ggml_log gradients 2023-05-06 17:36:21 +02:00
xaedes
65d9f7349d
add ggml_log operation necessary for cross entropy loss 2023-05-06 17:35:40 +02:00
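For reference, the calculus behind the new op: y = log(x) has dy/dx = 1/x, so the backward step divides the incoming gradient by the input. A hedged sketch in ggml terms:

```c
#include "ggml.h"

// sketch: backward of y = log(x) is dL/dx = dL/dy / x,
// accumulated into the existing gradient of x
struct ggml_tensor * log_backward(struct ggml_context * ctx,
                                  struct ggml_tensor * grad_y, // dL/dy
                                  struct ggml_tensor * x,
                                  struct ggml_tensor * x_grad) {
    return ggml_add(ctx, x_grad, ggml_div(ctx, grad_y, x));
}
```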
xaedes
8cf04fec9d
fix soft_max backward pass for input->ne[1] != 1 2023-05-06 17:32:13 +02:00
xaedes
b4c273f7a3
add ggml_reshape_1d, ggml_reshape_4d and ggml_view_4d 2023-05-06 17:29:41 +02:00
xaedes
f1d51d144b
train on multiple examples, generate & print tokens with trained model afterwards
ctx0 for evaluation and optimization is renewed for each sample
2023-05-06 14:16:40 +02:00
xaedes
83ee1cd741
fix bug when using ggml_opt to optimize params in one context while using a renewable context for eval and opt
when gradients of model parameters are not kept, they are overwritten by tensors created by opt, which may become invalid after the opt context is renewed.
so we need to keep the original gradients and make dups for opt
2023-05-06 13:05:29 +02:00
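A sketch of the workaround described: keep the parameters' original gradient tensors alive and hand ggml_opt duplicates allocated in the short-lived context, so renewing that context cannot invalidate them (names and structure assumed):

```c
#include "ggml.h"

// sketch (assumed names): save the original gradients, then swap in dups
// allocated in the opt context before calling ggml_opt
void dup_grads_for_opt(struct ggml_context * opt_ctx,
                       struct ggml_tensor ** params,
                       struct ggml_tensor ** saved_grads, int n_params) {
    for (int i = 0; i < n_params; ++i) {
        saved_grads[i]  = params[i]->grad;                           // keep original
        params[i]->grad = ggml_dup_tensor(opt_ctx, params[i]->grad); // dup for opt
    }
}
```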
DaniAndTheWeb
173d0e6419
makefile: automatic Arch Linux detection (#1332)
This commit is a port of a detection method used in koboldcpp's Makefile in order to automatically set the -lcblas option on Arch Linux
2023-05-05 23:57:14 +02:00
Erik Scholz
a3b85b28da
ci : add cublas to windows release (#1271) 2023-05-05 22:56:09 +02:00
Pavol Rusnak
921dcee00a
readme: add missing info (#1324) 2023-05-05 16:43:36 +02:00
Ionoclast Laboratories
2d13786e91
Fix for OpenCL / CLBlast builds on macOS. (#1329) 2023-05-05 14:18:21 +02:00
Benjamin Lecaillon
a90e96b266
Convert.py @staticmethod (#1327)
* Line 698 has one @staticmethod too many; it should be removed,

otherwise unpickle.load() throws an error because the object is not callable

* Update convert.py

---------

Co-authored-by: Ivan Stepanov <ivanstepanovftw@gmail.com>
2023-05-05 03:17:07 +03:00
slaren
94c5652fc0
quantize: make output filename optional, default to ggml-model-<ftype>.bin (#1301) 2023-05-05 00:58:56 +02:00
Ivan Stepanov
34d9f22f44
Wrap exceptions in std::exception to produce verbose output on exception. (#1316) 2023-05-04 18:56:27 +02:00
Ivan Stepanov
d3e8093e9b
convert: support DT_BF16 tensors (#1309)
Co-authored-by: Pavol Rusnak <pavol@rusnak.io>
2023-05-04 18:54:37 +02:00
44670
360cfe5bec
readme : add OpenBuddy link (#1321) 2023-05-04 19:33:31 +03:00
44670
2edbdb0f99
main : add --in-suffix option (#1318)
* adding --in-suffix option

* print input suffix before generation
2023-05-04 18:41:12 +03:00
Ron Jailall
20fbf2a2a0
ggml : change immintrin.h to intrin.h for compatibility (#1307)
* change immintrin.h to intrin.h for compatibility

Building on Windows 11 ARM throws an error on this line. It seems that using intrin.h covers both x86 and arm

* conditional def of intrin.h

* fix typo in ggml.c
2023-05-04 18:05:59 +03:00
DannyDaemonic
db1080876a
Only escape prompts when used with -e (#1311) 2023-05-04 05:08:25 -07:00
DannyDaemonic
c65a7fbfa9
Update main's README.md with new features (#1296) 2023-05-04 03:02:59 -07:00
Tomas
f647ce040f
fix #1224 reverse prompt and multi line (#1297)
* fix reverse prompt and multi line

* Code Formatting

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-05-04 03:02:30 -07:00
Georgi Gerganov
799fdc1b5d
ggml : vectorize Q8_0 quantization
https://github.com/ggerganov/ggml/pull/127#issuecomment-1533648531
2023-05-03 23:24:20 +03:00
khimaros
6daa09d879
examples : read chat prompts from a template file (#1196) 2023-05-03 20:58:11 +03:00
Georgi Gerganov
bca9ad938a
minor : fix whitespaces (#1302) 2023-05-03 20:09:42 +03:00
Georgi Gerganov
e2a937ca6a
minor : fix trailing whitespaces 2023-05-03 18:43:23 +03:00
KASR
b0c71c7b6d
scripts : platform independent script to verify sha256 checksums (#1203)
* python script to verify the checksum of the llama models

Added Python script for verifying SHA256 checksums of files in a directory, which can run on multiple platforms. Improved the formatting of the output results for better readability.

* Update README.md

update to the readme for improved readability and to explain the usage of the python checksum verification script

* update the verification script

I've extended the script based on suggestions by @prusnak

The script now checks the available RAM; if there is enough to check the file at once, it will do so. If not, the file is read in chunks.

* minor improvement

small change so that the available RAM is checked and not the total RAM

* remove the part of the code that reads the file at once if enough ram is available

Based on suggestions from @prusnak, I removed the part of the code that checks whether the user has enough RAM to read the entire model at once. The file is now always read in chunks.

* Update verify-checksum-models.py

quick fix to pass the git check
2023-05-03 18:31:28 +03:00
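The chunked-read idea the script settled on, sketched here in C rather than Python; sha256_update is a hypothetical stand-in for any streaming SHA-256 implementation:

```c
#include <stdio.h>

void sha256_update(const unsigned char * data, size_t len); // hypothetical streaming hash

// sketch: hash a file in fixed-size chunks so memory use stays constant,
// regardless of available RAM or model size
int hash_file_chunked(const char * path) {
    enum { CHUNK = 1 << 20 }; // 1 MiB per read
    static unsigned char buf[CHUNK];
    FILE * f = fopen(path, "rb");
    if (!f) return -1;
    size_t n;
    while ((n = fread(buf, 1, CHUNK, f)) > 0) {
        sha256_update(buf, n);
    }
    fclose(f);
    return 0;
}
```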
CRD716
a8a2efdc81
examples : various prompt and example fixes (#1298)
* fix dan.txt

* miku prompt improvements

* use common characters
2023-05-03 18:26:47 +03:00
Evan Jones
e216aa0463
llama : only copy used KV cache in get / set state (#1272)
* llama : only copy used KV cache in get / set state

* switch to ggml for copying k, v

* avoid designated initializers
2023-05-02 22:26:13 -04:00
DannyDaemonic
2485d7a4d3
Process escape sequences given in prompts (#1173) 2023-05-02 18:46:20 -07:00