Commit graph

1479 commits

Author SHA1 Message Date
xaedes
934ad8d35d
move some params from lora hparams into model hparams and load model params from gguf
this equalizes the model definition in finetune and train-text-from-scratch and removes the need for additional llama API functions to get model parameters
2023-09-17 16:51:15 +02:00
xaedes
b0ee563748
assert correct base model tensor shapes 2023-09-17 16:43:12 +02:00
xaedes
5ed309810e
align code 2023-09-17 16:41:25 +02:00
xaedes
1dbd6bc3d5
remove n_rot hparam, as it must always equal hparams.n_embd_head() 2023-09-17 16:40:40 +02:00
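
For illustration, a minimal sketch of the relation this commit relies on; the struct and helpers below are hypothetical stand-ins, not the repository's actual hparams definitions.

    #include <cstdint>

    // Hypothetical sketch, not the repository's struct: the per-head embedding
    // width is derived from the total width and the head count, so a separate
    // n_rot hyperparameter is redundant when rotary embeddings span a full head.
    struct hparams_sketch {
        uint32_t n_embd = 4096; // total embedding width
        uint32_t n_head = 32;   // number of attention heads

        uint32_t n_embd_head() const { return n_embd / n_head; } // = 128
        uint32_t n_rot()       const { return n_embd_head(); }   // always equal by definition
    };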
xaedes
56a03faf5f
deduplicate code into function 2023-09-17 16:37:21 +02:00
xaedes
d1bb6fb349
add train option "--sample-random-offsets"
Use samples beginning at random offsets.
The offset is only applied to the first sample in each batch context window.
Together with "--fill-with-next-samples" this may help when training for endless text generation.

For example, given a dataset containing the samples "abcd", "ABCD", "0123":
With a context size of 8 and the options "--fill-with-next-samples", "--no-separate-with-eos", "--no-separate-with-bos",
the context windows of batches could only be filled with "abcdABCD", "ABCDabcd", "0123abcd", etc.

With "--sample-random-offsets" it can also be filled with "23abcdAB", "bcd0123A", etc.
2023-09-17 14:37:41 +02:00
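
As a rough illustration of the behaviour described in this commit, here is a hypothetical sketch (the helper and names are invented, not the repository's code) of how a random offset into the first sample changes what ends up in a context window.

    #include <random>
    #include <string>
    #include <vector>

    // Sketch: concatenate samples to fill a window of n_ctx tokens (characters
    // here, for simplicity; assumes a non-empty sample list). With random
    // offsets enabled, only the first sample of the window may start mid-sample;
    // subsequent samples start at offset 0.
    static std::string fill_context(const std::vector<std::string> & samples,
                                    size_t n_ctx, bool random_offsets,
                                    std::mt19937 & rng) {
        std::string ctx;
        size_t i   = rng() % samples.size();
        size_t off = (random_offsets && !samples[i].empty()) ? rng() % samples[i].size() : 0;
        while (ctx.size() < n_ctx) {
            ctx += samples[i].substr(off);
            off = 0;
            i   = (i + 1) % samples.size();
        }
        return ctx.substr(0, n_ctx); // e.g. "23abcdAB" instead of "0123abcd"
    }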
xaedes
bf2ad65836
fix frand to return value in interval [0,1) 2023-09-17 14:28:58 +02:00
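
For context, a minimal sketch of one way to produce a uniform float strictly below 1 (an assumed implementation, not necessarily the fix in this commit): naive scaling such as (float)rand()/RAND_MAX can yield exactly 1.0f, which breaks callers that index arrays with floor(frand()*n).

    #include <cstdint>
    #include <random>

    // Sketch: draw 24 random bits and scale by 2^-24. Every result is exactly
    // representable as a float and strictly less than 1.
    static float frand_sketch(std::mt19937 & rng) {
        const uint32_t r = rng() >> 8;       // keep the top 24 bits
        return r * (1.0f / 16777216.0f);     // 16777216 = 2^24
    }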
xaedes
151bfe9ee1
assert that sample_count > 0, avoiding division by zero 2023-09-17 13:07:17 +02:00
xaedes
ddf5ac257a
use new/delete for train_state instead of malloc/free
using malloc may result in seg faults when trying to assign string fields
2023-09-17 12:48:17 +02:00
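
The reasoning in this commit can be illustrated with a small standalone sketch (the struct below is a hypothetical stand-in for train_state): malloc returns raw memory, so non-trivial members such as std::string are never constructed, and assigning to them is undefined behaviour.

    #include <cstdlib>
    #include <string>

    struct state_sketch {
        int         iter = 0;
        std::string checkpoint_path;   // non-trivial member, needs its constructor
    };

    int main() {
        // Undefined behaviour: no constructor ran for bad->checkpoint_path,
        // so assigning to it reads garbage internal pointers (often a segfault).
        // state_sketch * bad = (state_sketch *) malloc(sizeof(state_sketch));
        // bad->checkpoint_path = "out.gguf";

        state_sketch * good = new state_sketch(); // constructors run
        good->checkpoint_path = "out.gguf";       // safe
        delete good;                              // destructor releases the string
        return 0;
    }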
xaedes
8721785c52
fix compile warnings 2023-09-16 22:28:23 +02:00
xaedes
83061fbdbe
fix compile warnings 2023-09-16 22:19:46 +02:00
xaedes
dd3e7634f0
remove terminating '\0' from tokenization
(llama_tokenize is now passed the string length instead of relying on terminating '\0')
2023-09-16 21:31:50 +02:00
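
Roughly, the caller-side change looks like the sketch below; the wrapper and the commented-out call shape are assumptions based on the API of that period, and the authoritative declaration is in llama.h.

    #include <cstdint>
    #include <string>
    #include <vector>

    typedef int32_t llama_token;  // placeholder for the typedef in llama.h

    // Hypothetical wrapper: instead of appending a terminating '\0' and letting
    // the tokenizer scan for it, the caller passes the text length explicitly,
    // so input containing embedded NUL bytes is tokenized correctly.
    static std::vector<llama_token> tokenize_sketch(const std::string & text) {
        std::vector<llama_token> tokens(text.size() + 1);
        // Approximate call shape from that era (check llama.h for the real one):
        // int n = llama_tokenize(ctx, text.data(), (int) text.size(),
        //                        tokens.data(), (int) tokens.size(), /*add_bos*/ true);
        int n = 0; // placeholder result
        tokens.resize(n > 0 ? (size_t) n : 0);
        return tokens;
    }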
xaedes
9db2664dd1
fix saving and loading of training type 2023-09-16 21:21:04 +02:00
xaedes
1d09965179
use die("msg") instead of GGML_ASSERT(!"msg") or throw std::runtime_error("msg") 2023-09-16 21:13:03 +02:00
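
A plausible shape for such a die() helper, sketched here for illustration (the repository's actual implementation may differ):

    #include <cstdio>
    #include <cstdlib>

    // Print an error message and terminate: a single unrecoverable-error path
    // instead of GGML_ASSERT(!"msg") tricks or thrown exceptions.
    [[noreturn]] static void die(const char * msg) {
        fprintf(stderr, "error: %s\n", msg);
        exit(1);
    }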
xaedes
1d33ec5b1c
fix condition in load_train_state_gguf 2023-09-16 21:13:02 +02:00
xaedes
9139fec7ff
fix code formatting of long function declarations 2023-09-16 20:38:23 +02:00
xaedes
8d82d4c8e6
remove static from process_escape since we need it exposed in header 2023-09-16 20:37:56 +02:00
xaedes
7930caf24c
fix usage of llama_tokenize 2023-09-16 20:36:43 +02:00
xaedes
d3e06d3e73
Merge branch 'master' into finetune-lora
# Conflicts:
#	Makefile
#	examples/baby-llama/baby-llama.cpp
#	examples/train-text-from-scratch/train-text-from-scratch.cpp
#	llama.cpp
2023-09-16 20:31:58 +02:00
xaedes
571dc94da9
increase train_samples by used_samples instead of number of batches
one batch can contain more than one sample when the option "fill_with_next_samples" is used
2023-09-16 20:23:05 +02:00
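
The counting change amounts to the following sketch (variable names hypothetical): with sample packing, one batch can consume several samples, so progress is accumulated per sample consumed rather than per batch.

    #include <cstddef>
    #include <cstdio>

    int main() {
        size_t train_samples = 0;
        const size_t n_batches = 10;
        for (size_t b = 0; b < n_batches; ++b) {
            const size_t used_samples = 3;  // e.g. three samples packed into this batch
            // before: train_samples += 1;  // undercounts when batches are packed
            train_samples += used_samples;  // count what was actually consumed
        }
        printf("%zu\n", train_samples);     // 30, not 10
        return 0;
    }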
xaedes
48d3509190
save and load head_count_kv in lora checkpoints 2023-09-16 20:20:23 +02:00
IsaacDynamo
b541b4f0b1
Enable BUILD_SHARED_LIBS=ON on all Windows builds (#3215) 2023-09-16 19:35:25 +02:00
xaedes
7aa9ea7f20
fix consume_common_train_arg 2023-09-16 19:08:51 +02:00
xaedes
bef1e97875
move common opt_callback into common/train 2023-09-16 18:54:57 +02:00
xaedes
e9758ae1d2
move common train params into common/train 2023-09-16 18:45:59 +02:00
xaedes
ee27333b16
move train data saving code into callback to unify code of opt_callback
train_params are still different in finetune and train-text-from-scratch, so they can't yet be moved to train.h|cpp
2023-09-16 17:50:16 +02:00
xaedes
a8c8907c62
move train state into struct train_state 2023-09-16 17:30:38 +02:00
Vlad
5dbc2b3213
Enable build with CUDA 11.0 (make) (#3132)
* CUDA 11.0 fixes

* Cleaner CUDA/host flags separation

Also renamed GGML_ASSUME to GGML_CUDA_ASSUME
2023-09-16 16:55:43 +02:00
xaedes
9f4b1bf88d
move common train functions into common/train.[h|cpp] 2023-09-16 16:17:13 +02:00
xaedes
00b656f6db
remove lbfgs related train parameters 2023-09-16 15:59:46 +02:00
goerch
b08e75baea
Fixing the last deviations from sentencepiece indicated by test-tokenizer-1 (#3170)
* Fix for #2721

* Reenable tokenizer test for LLaMa

* Add `console.cpp` dependency

* Fix dependency to `common`

* Fixing wrong fix.

* Make console usage platform specific

Work on compiler warnings.

* Adapting makefile

* Remove trailing whitespace

* Adapting the other parts of the makefile

* Fix typo.

* Fixing the last deviations from sentencepiece indicated by test-tokenizer-1

* Simplify logic

* Add missing change...

* Fix ugly compiler warning

* llama_tokenize should accept strings containing NUL now

* Adding huichen's test case
2023-09-16 13:41:33 +02:00
xaedes
ab56b63b27
update train-text-from-scratch with tokenization, sample selection and shuffling from finetune 2023-09-15 23:45:54 +02:00
xaedes
cc60b3f639
remove commented-out old code 2023-09-15 23:45:05 +02:00
xaedes
4f2ce91b9e
add static keywords 2023-09-15 23:44:53 +02:00
Cebtenzzre
e6616cf0db
examples : add compiler version and target to build info (#2998) 2023-09-15 16:59:49 -04:00
Cebtenzzre
3aefaab9e5
check C++ code with -Wmissing-declarations (#3184) 2023-09-15 15:38:27 -04:00
Cebtenzzre
69eb67e282
fix build numbers by setting fetch-depth=0 (#3197) 2023-09-15 15:18:15 -04:00
Meng Zhang
4fe09dfe66
llama : add support for StarCoder model architectures (#3187)
* add placeholder of starcoder in gguf / llama.cpp

* support convert starcoder weights to gguf

* convert MQA to MHA

* fix ffn_down name

* add LLM_ARCH_STARCODER to llama.cpp

* set head_count_kv = 1

* load starcoder weight

* add max_position_embeddings

* set n_positions to max_position_embeddings

* properly load all starcoder params

* fix head count kv

* fix comments

* fix vram calculation for starcoder

* store mqa directly

* add input embeddings handling

* add TBD

* working in cpu, metal buggy

* cleanup useless code

* metal : fix out-of-bounds access in soft_max kernels

* llama : make starcoder graph build more consistent with others

* refactor: cleanup comments a bit

* add other starcoder models: 3B, 7B, 15B

* support-mqa-directly

* fix: remove max_position_embeddings, use n_train_ctx

* Update llama.cpp

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* Update llama.cpp

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* Apply suggestions from code review

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* fix: switch to space from tab

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-09-15 22:02:13 +03:00
Cebtenzzre
80291a1d02
common : do not use GNU zero-length __VA_ARGS__ extension (#3195) 2023-09-15 21:02:01 +03:00
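
For context, a sketch of the portability issue (illustrative macros, not the repository's): the "##__VA_ARGS__" comma-swallowing trick is a GNU extension, and a common portable workaround is to fold the format string into the variadic arguments so the list is never empty.

    #include <cstdio>

    // GNU extension (non-portable): the ## swallows the comma when no extra
    // arguments are given.
    //   #define LOG_GNU(fmt, ...) fprintf(stderr, fmt, ##__VA_ARGS__)
    //
    // Portable alternative: the format string is part of __VA_ARGS__, so the
    // variadic list is never empty.
    #define LOG(...) fprintf(stderr, __VA_ARGS__)

    int main() {
        LOG("no extra args\n");
        LOG("value = %d\n", 42);
        return 0;
    }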
Georgi Gerganov
c6f1491da0
metal : fix bug in soft_max kernels (out-of-bounds access) (#3194) 2023-09-15 20:17:24 +03:00
Cebtenzzre
e3d87a6c36
convert : make ftype optional in simple scripts (#3185) 2023-09-15 12:29:02 -04:00
Georgi Gerganov
8c00b7a6ff
sync : ggml (Metal F32 support + reduce ggml-alloc size) (#3192)
* sync : ggml (Metal F32 support + reduce ggml-alloc size)

ggml-ci

* llama-bench : fix ggml_cpu_has_metal() duplicate function

ggml-ci
2023-09-15 19:06:03 +03:00
Engininja2
7e50d34be6
cmake : fix building shared libs for clang (rocm) on windows (#3176) 2023-09-15 15:24:30 +03:00
Evgeny Kurnevsky
235f7c193b
flake : use pkg-config instead of pkgconfig (#3188)
pkgconfig is an alias, it got removed from nixpkgs:
295a5e1e2b/pkgs/top-level/aliases.nix (L1408)
2023-09-15 11:10:22 +03:00
Georgi Gerganov
a51b687657
metal : relax conditions on fast matrix multiplication kernel (#3168)
* metal : relax conditions on fast matrix multiplication kernel

* metal : revert the concurrency change because it was wrong

* llama : remove experimental stuff
2023-09-15 11:09:24 +03:00
Andrei
76164fe2e6
cmake : fix llama.h location when built outside of root directory (#3179) 2023-09-15 11:07:40 +03:00
Ali Tariq
c2ab6fe661
ci : Cloud-V for RISC-V builds (#3160)
* Added Cloud-V File

* Replaced Makefile with original one

---------

Co-authored-by: moiz.hussain <moiz.hussain@10xengineers.ai>
2023-09-15 11:06:56 +03:00
Roland
2d770505a8
llama : remove mtest (#3177)
* Remove mtest

* remove from common/common.h and examples/main/main.cpp
2023-09-15 10:28:45 +03:00
Cebtenzzre
98311c4277
llama : make quantize example up to 2.7x faster (#3115) 2023-09-14 21:09:53 -04:00
xaedes
76804fab1d
exclude some more known zero values from computations in flash_attn_f32 & flash_attn_back_f32 2023-09-14 22:19:39 +02:00