xaedes
6cbf55a64b
add finetune to Makefile
2023-09-01 16:02:45 +02:00
xaedes
5bba329e58
finetune: automatically allocate all memory; changes to command line options
...
remove '--n_examples N' parameter, as it no longer makes sense to call the optimization process multiple times in a loop.
add '--only_write_lora' command line option: skips tokenization and training to only write a llama.cpp compatible LoRA adapter.
remove memory buffer related command line options.
improve iteration console output.
2023-09-01 15:58:52 +02:00
xaedes
7e01d11a28
add ggml-alloc API function 'ggml_allocr_max_size' to get max size of alloc
...
GGML_API size_t ggml_allocr_max_size(struct ggml_allocr * alloc);
2023-09-01 15:42:40 +02:00
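A minimal sketch of how the new function might be used together with a measure allocator to size a compute buffer up front (assumes the ggml-alloc API of that time; build_graph is a hypothetical helper that builds the training graph while allocating its tensors through `alloc`):

#include "ggml.h"
#include "ggml-alloc.h"

// measure how much memory the compute graph needs before allocating the real buffer
static size_t measure_compute_size(struct ggml_context * ctx) {
    struct ggml_allocr * alloc = ggml_allocr_new_measure(32); // measure-only allocator, 32-byte alignment
    struct ggml_cgraph * gf    = build_graph(ctx, alloc);     // hypothetical graph builder
    ggml_allocr_alloc_graph(alloc, gf);                       // simulate all allocations for the graph
    size_t max_size = ggml_allocr_max_size(alloc);            // maximum memory required at any point
    ggml_allocr_free(alloc);
    return max_size;
}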
xaedes
d554a70f11
initialize opt ggml context if none was provided
2023-09-01 15:41:57 +02:00
xaedes
4914f855c7
add tensor checkpoints only when gradient checkpointing is enabled
2023-08-31 16:46:21 +02:00
xaedes
e0da1684db
remove finetune option to disable allocator
...
the allocator should always be used.
making sure it is always used makes it easier to implement automatic computation of memory requirements.
2023-08-31 16:45:47 +02:00
xaedes
4fd51c4616
fix warnings
2023-08-30 17:12:23 +02:00
xaedes
0c57f9f0b3
fix warnings
2023-08-30 16:55:49 +02:00
xaedes
4e986ac4bc
update README.md
2023-08-30 16:29:09 +02:00
xaedes
b26bd4c34c
add option to save train-text-from-scratch output every N iterations
2023-08-30 16:26:05 +02:00
xaedes
f3590ad8d9
remove trailing whitespace
2023-08-30 16:01:08 +02:00
xaedes
fc456edda6
train-text-from-scratch can train (full finetune) gguf models
...
just pass the gguf model via `--checkpoint-in FN`.
after this, to continue training, pass the generated checkpoint instead of the original gguf model.
tested with smaller models; bigger models may exceed available memory.
use (LoRA) finetune for those.
2023-08-30 15:57:17 +02:00
xaedes
e6b7158123
replace custom data getters and setters by ggml functions
2023-08-30 15:21:27 +02:00
xaedes
d487e0531f
move gradient checkpointing code into ggml, new API function:
...
// build gradient checkpointing backward graph gb for gf using provided checkpoints
// gb_tmp will contain original backward graph with rewritten backward process nodes,
// but without the second forward pass nodes.
GGML_API void ggml_build_backward_gradient_checkpointing(
        struct ggml_context   * ctx,
        struct ggml_cgraph    * gf,
        struct ggml_cgraph    * gb,
        struct ggml_cgraph    * gb_tmp,
        struct ggml_tensor  * * checkpoints,
        int                     n_checkpoints);
2023-08-30 15:21:27 +02:00
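A rough usage sketch, assuming gf is an already-built forward graph, gb and gb_tmp are graphs created in the same context (e.g. via ggml_new_graph), and the checkpoint tensors are chosen by the caller (the tensor names below are illustrative):

struct ggml_cgraph * gb     = ggml_new_graph(ctx);  // receives the checkpointed backward graph
struct ggml_cgraph * gb_tmp = ggml_new_graph(ctx);  // scratch: original backward graph with rewritten nodes

// checkpoints typically include the inputs, one activation per layer and the output (illustrative selection)
struct ggml_tensor * checkpoints[] = { input_tokens, layer_out_0, layer_out_1, loss };
int n_checkpoints = (int) (sizeof(checkpoints) / sizeof(checkpoints[0]));

ggml_build_backward_gradient_checkpointing(ctx, gf, gb, gb_tmp, checkpoints, n_checkpoints);
// evaluating gb recomputes activations between checkpoints instead of keeping them all,
// trading extra forward work for a lower peak memory footprint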
xaedes
2392b6725b
use tensor->view_src instead of ggml_is_view and get_view_source
2023-08-30 14:46:12 +02:00
xaedes
b1709f2d25
Merge branch 'master' into finetune-lora
2023-08-30 13:28:29 +02:00
slaren
06abf8eeba
ggml : add view_src and view_offs to ggml_tensor for views ( #2874 )
...
* ggml : add view_src and view_offs
* update ggml-alloc to use view_src
* update ggml_diag_mask to work correctly with automatic inplace
* exclude other ops that set an inplace flag from automatic inplace
2023-08-29 23:24:42 +02:00
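For reference, a minimal sketch of what the two new fields replace (the helper names below are illustrative; they stand in for the removed ggml_is_view and get_view_source):

#include <stdbool.h>
#include "ggml.h"

static bool tensor_is_view(const struct ggml_tensor * t) {
    return t->view_src != NULL;                    // a view always records its source tensor
}

static struct ggml_tensor * tensor_view_source(struct ggml_tensor * t) {
    return t->view_src != NULL ? t->view_src : t;  // data owner; t->view_offs is the byte offset into it
}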
slaren
c03a243abf
remove outdated references to -eps and -gqa from README ( #2881 )
2023-08-29 23:17:34 +02:00
xaedes
bf70e27cd6
fix check_gradient
...
ggml_build_backward_expand was previously replaced by ggml_build_backward, but the assignment of the forward graph to the backward graph was missing
2023-08-29 23:08:30 +02:00
Kawrakow
fa3582f509
Tell users attempting to run perplexity with too few tokens to use more ( #2882 )
...
Closes #2858
Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
2023-08-29 23:55:45 +03:00
Kawrakow
e37e69dcc3
10X faster BPE tokenizer ( #2876 )
...
* 10X faster BPE tokenizer
* Remove comment that no longer applies
---------
Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
2023-08-29 23:55:03 +03:00
xaedes
5854f51188
fix error message in ggml_allocr_alloc to display actual max_avail
2023-08-29 22:49:01 +02:00
xaedes
281245a48f
Merge branch 'master' into finetune-lora
2023-08-29 21:47:28 +02:00
xaedes
8a96d4c2aa
add missing argument 'int i0' to ggml_get_i32_nd & ggml_set_i32_nd header declarations
2023-08-29 21:24:37 +02:00
xaedes
dd4e4bca09
remove unused 'inplace' argument from ggml_compute_backward function
...
inplace operations to add gradients are no longer created by ggml_compute_backward.
the allocator is now used to automatically make operations inplace.
2023-08-29 21:21:10 +02:00
xaedes
a76e66ac8d
fix ggml_acc_or_set to return tensor of correct shape
2023-08-29 21:02:10 +02:00
xaedes
b1aa26f718
add sanity check to ggml_compute_backward, asserting the correct shape of gradients
2023-08-29 21:01:17 +02:00
xaedes
5fcfa7e49e
increase test-grad0 context mem size to accommodate the bigger cgraph
2023-08-29 21:00:19 +02:00
xaedes
82c5247a20
add ggml API functions ggml_unravel_index, ggml_get_i32_nd and its analogs for set and for f32
...
ggml_get_i32_1d, ggml_set_i32_1d, ggml_get_f32_1d, ggml_set_f32_1d now support non-contiguous tensors.
for a non-contiguous tensor, the 1d index is unraveled into a multi index using ggml_unravel_index and passed to the '_nd' function equivalent.
this fixes a bug in test-grad0 that occurred because ggml_build_backward no longer builds purely contiguous tensors.
2023-08-29 20:59:31 +02:00
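Roughly what the 1d accessors now do for a non-contiguous tensor, per the description above (a sketch; the exact integer types are an assumption):

#include "ggml.h"

// read element `i` of a possibly non-contiguous f32 tensor
static float get_f32_any_layout(const struct ggml_tensor * t, int i) {
    if (ggml_is_contiguous(t)) {
        return ggml_get_f32_1d(t, i);                 // contiguous: plain 1d indexing works
    }
    int64_t i0, i1, i2, i3;
    ggml_unravel_index(t, i, &i0, &i1, &i2, &i3);     // unravel the flat index into a multi index
    return ggml_get_f32_nd(t, (int) i0, (int) i1, (int) i2, (int) i3);
}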
xaedes
5f0a4e971f
avoid stack overflow of large cgraphs in test-grad0
2023-08-29 19:59:41 +02:00
xaedes
794bb7ea42
implement ggml_compute_forward_repeat_f16
2023-08-29 19:59:14 +02:00
xaedes
e28cf7e9ce
update README.md
2023-08-29 19:38:23 +02:00
xaedes
a6165dafcd
remove trailing whitespace
2023-08-29 19:30:42 +02:00
xaedes
5813ac832f
omit tokenization when training is disabled, only save llama lora adapter
...
training can be disabled by passing '-n 0' to finetune
2023-08-29 19:21:45 +02:00
xaedes
ebff3a14c3
remove code to print data checksums which was used to verify correctness of new gguf code
2023-08-29 18:31:20 +02:00
xaedes
1425968ead
remove old checkpoint save & load code
2023-08-29 18:30:16 +02:00
xaedes
6134ad4de7
add python script to convert old finetune checkpoint files to gguf
2023-08-29 18:24:06 +02:00
xaedes
0564f4ed1f
add load & save lora finetune checkpoints via gguf
2023-08-29 18:20:39 +02:00
maddes8cht
53885d7256
py : fix "usage" messages ( #2873 )
...
convert-to-gguf python scripts
2023-08-29 16:51:02 +03:00
jameswu2014
bcce96ba4d
convert.py : fix baichuan7B support ( #2870 )
...
* [Fix]: convert.py support baichuan7B
* convert.py : fix trailing whitespaces
---------
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-08-29 12:48:41 +03:00
Jhen-Jie Hong
74e0caeb82
readme : add react-native binding ( #2869 )
2023-08-29 12:30:10 +03:00
Cebtenzzre
d4b5e16c32
make : fix clang tests build, add missing examples ( #2859 )
...
* make : do not pass headers to the compiler
This fixes building tests with clang.
* make : add missing examples
* make : fix build-info.h dependencies
2023-08-29 11:42:41 +03:00
Georgi Gerganov
3a007648f2
metal : add option to disable debug logs ( close #2764 )
2023-08-29 11:33:46 +03:00
Georgi Gerganov
611363ac79
scripts : add pipefail
2023-08-29 10:50:30 +03:00
Marcus Dunn
95b6e5212f
added struct to llama_dump_timing_info_yaml's llama_context ( #2857 )
...
fixes C compat.
2023-08-29 09:33:27 +03:00
xaedes
ecb1b20c85
add gguf constants and load/save functions from train-text-from-scratch
2023-08-29 01:40:02 +02:00
xaedes
e030f7b2c5
add LLM_KV_TRAINING_TYPE to train-text-from-scratch checkpoints
...
so that they can be differentiated from lora finetune checkpoints
2023-08-29 01:27:28 +02:00
xaedes
ca97583f0b
remove vocab related code as it is unnecessary
2023-08-29 01:19:45 +02:00
xaedes
a3b45298f1
remove unused code
2023-08-29 01:12:51 +02:00
xaedes
1faee64db9
handle rms_norm and rope parameters the same as in train-text-from-scratch
2023-08-29 01:09:35 +02:00