CRD716
a8a2efdc81
examples : various prompt and example fixes ( #1298 )
...
* fix dan.txt
* miku prompt improvements
* use common characters
2023-05-03 18:26:47 +03:00
Evan Jones
e216aa0463
llama : only copy used KV cache in get / set state ( #1272 )
...
* llama : only copy used KV cache in get / set state
* switch to ggml for copying k, v
* avoid designated initializers
2023-05-02 22:26:13 -04:00
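Note: the commit above makes get/set state serialize only the KV cache cells actually in use rather than the whole allocated buffer. A minimal sketch of the idea, with hypothetical names and layout (not the actual llama.cpp internals):

```c
#include <string.h>
#include <stddef.h>

// Sketch: copy only the first n_past cells of a K cache instead of the
// whole n_ctx-sized buffer. Names and layout are illustrative only.
void copy_used_kv(char * dst, const char * k_cache,
                  size_t n_past, size_t n_ctx, size_t bytes_per_cell) {
    (void) n_ctx; // full capacity; only the used prefix is serialized
    memcpy(dst, k_cache, n_past * bytes_per_cell);
}
```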
DannyDaemonic
2485d7a4d3
Process escape sequences given in prompts ( #1173 )
2023-05-02 18:46:20 -07:00
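Note: #1173 lets users write sequences such as \n and \t literally in a command-line prompt and have them converted to real control characters. A hedged sketch of in-place unescaping (not the exact function from the PR):

```c
// Rewrite two-character sequences like '\' 'n' into real control
// characters, in place. Only a few escapes are handled in this sketch.
void process_escapes(char * s) {
    char * w = s; // write cursor trails the read cursor
    for (const char * r = s; *r; ++r) {
        if (r[0] == '\\' && r[1]) {
            switch (r[1]) {
                case 'n':  *w++ = '\n'; ++r; continue;
                case 't':  *w++ = '\t'; ++r; continue;
                case '\\': *w++ = '\\'; ++r; continue;
            }
        }
        *w++ = *r;
    }
    *w = '\0';
}
```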
DannyDaemonic
13b0c68ed7
Handle signals properly on Windows ( #1123 )
2023-05-02 18:01:57 -07:00
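Note: on Windows, console Ctrl-C is not delivered through POSIX signal(); the usual approach (plausibly what #1123 does, though that is an assumption) is SetConsoleCtrlHandler:

```c
#ifdef _WIN32
#include <windows.h>

// Assumption: the handler just flags the interrupt for the main loop.
static volatile int g_interrupted = 0;

static BOOL WINAPI console_ctrl_handler(DWORD ctrl_type) {
    if (ctrl_type == CTRL_C_EVENT) {
        g_interrupted = 1;
        return TRUE; // handled; do not terminate the process immediately
    }
    return FALSE;
}

void install_ctrl_c_handler(void) {
    SetConsoleCtrlHandler(console_ctrl_handler, TRUE);
}
#endif
```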
DannyDaemonic
55bc5f0900
Call sh on build-info.sh ( #1294 )
2023-05-02 17:52:35 -07:00
kuvaus
9daff419f6
fix build-info.h for git submodules ( #1289 )
...
* make git build info work with submodules
---------
Co-authored-by: Green Sky <green@g-s.xyz>
2023-05-03 02:43:43 +02:00
slaren
bf4b22ffe4
fix missing parameters in llama_init_from_gpt_params ( #1293 )
2023-05-03 01:36:45 +02:00
Ron Evans
67c77799e0
examples : add llama_init_from_gpt_params() common function ( #1290 )
...
Signed-off-by: deadprogram <ron@hybridgroup.com>
2023-05-02 23:39:51 +03:00
Georgi Gerganov
0e6cbff1b7
llama : fix compile warnings
2023-05-02 23:09:08 +03:00
Georgi Gerganov
5d5817ca60
ggml : fix 32-bit ARM
2023-05-02 22:14:50 +03:00
Ron Evans
8c9be35ff9
examples : improve vertical alignment of a few variables ( #1286 )
...
Signed-off-by: deadprogram <ron@hybridgroup.com>
2023-05-02 20:53:52 +03:00
Marvin Gießing
cc0bb7235c
ggml : fix ppc64le build error and make cmake detect Power processors ( #1284 )
...
* Fix ppc64le build issue
* Added support to detect ppc64* processors
2023-05-02 19:42:16 +03:00
Robert Brisita
2bb992f034
llama : allow 0 as a seed number. ( #1275 )
2023-05-02 19:23:44 +03:00
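Note: before #1275, a seed of 0 was treated as "unset" and replaced with a time-based value; the fix reserves only negative values for that. A hedged sketch of the pattern (exact llama.cpp logic may differ):

```c
#include <time.h>
#include <stdlib.h>

// Sketch: only a negative seed requests randomization, so 0 stays a
// valid, reproducible seed.
void apply_seed(int seed) {
    if (seed < 0) {
        seed = (int) time(NULL);
    }
    srand((unsigned) seed);
}
```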
Ron Evans
e2cd506999
main : switch input_noecho to input_echo to remove negation ( #979 )
...
Signed-off-by: deadprogram <ron@hybridgroup.com>
2023-05-02 19:13:26 +03:00
slaren
2d099e5193
ggml: add names to tensors ( #1268 )
...
* ggml: add names to tensors
* minor improvements to dot file formatting
2023-05-02 16:03:00 +02:00
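Note: #1268 introduces per-tensor names, which make the exported dot graphs readable. Usage is along these lines (ggml_set_name is the API the PR adds; the surrounding setup is standard ggml, sizes illustrative):

```c
#include "ggml.h"

int main(void) {
    struct ggml_init_params params = {
        /*.mem_size   =*/ 16*1024*1024,
        /*.mem_buffer =*/ NULL,
        /*.no_alloc   =*/ 0,
    };
    struct ggml_context * ctx = ggml_init(params);

    // name the tensor so it can be identified in debug output and dot files
    struct ggml_tensor * embd = ggml_new_tensor_1d(ctx, GGML_TYPE_I32, 32);
    ggml_set_name(embd, "embd");

    ggml_free(ctx);
    return 0;
}
```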
xaedes
bc1c13bb66
train with two examples, creating new tensors each time.
2023-05-01 22:22:00 +02:00
xaedes
5f23052eb2
switching from training with adam to lbfgs produces much better results in the baby-llama example
2023-05-01 21:01:17 +02:00
xaedes
29a0f8b940
fix softmax in baby-llama example
2023-05-01 20:02:48 +02:00
xaedes
8fde656d24
add baby-llama example training a very small llama model from scratch to output a sinusoidal wave.
...
had to increase maximum number of optimization parameters to train from scratch.
2023-05-01 19:30:46 +02:00
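Note: the baby-llama example trains toward a sinusoidal target signal. A hedged sketch of generating such targets (illustrative; the example builds its own targets):

```c
#include <math.h>

// Fill `target` with one period of a sine wave over n samples.
void make_sine_targets(float * target, int n) {
    const float two_pi = 6.2831853f;
    for (int i = 0; i < n; ++i) {
        target[i] = sinf(two_pi * (float) i / (float) n);
    }
}
```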
DannyDaemonic
f4cef87edf
Add git-based build information for better issue tracking ( #1232 )
...
* Add git-based build information for better issue tracking
* macOS fix
* "build (hash)" and "CMAKE_SOURCE_DIR" changes
* Redo "CMAKE_CURRENT_SOURCE_DIR" and clearer build messages
* Fix conditional dependency on missing target
* Broke out build-info.cmake, added find_package fallback, and added build info to all examples, added dependencies to Makefile
* 4 space indenting for cmake, attempt to clean up my mess in Makefile
* Short hash, less fancy Makefile, and don't modify build-info.h if it wouldn't change it
2023-05-01 18:23:47 +02:00
slaren
58b367c2d7
cuBLAS: refactor and optimize f16 mat mul performance ( #1259 )
...
* cuBLAS: refactor, convert fp16 to fp32 on device
* cuBLAS: use multiple streams, choose smartly between mul_mat_q and mul_mat_f16
* fix build
* cuBLAS: update block_q5_1
2023-05-01 18:11:07 +02:00
xloem
ea3a0ad6b6
llama : update stubs for systems without mmap and mlock ( #1266 )
...
Co-authored-by: John Doe <john.doe@example.com>
2023-05-01 15:58:51 +03:00
xaedes
1c4dc1e498
update quantization types in switch-case of add_at and add1
2023-05-01 14:43:50 +02:00
xaedes
72bcfb50c8
successfully test backward pass of repeat
2023-05-01 14:43:50 +02:00
xaedes
8b5b2f089e
fix backward pass for repeat
...
requires ggml_sum_rows
2023-05-01 14:43:50 +02:00
xaedes
ba62c79bd5
add missing GGML_OP_SUM_ROWS
2023-05-01 14:43:50 +02:00
xaedes
c4539ede53
add operation ggml_sum_rows
...
ggml_sum_rows(shape[a,b,c,d]) -> shape[1,b,c,d]
2023-05-01 14:43:49 +02:00
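Note: the new op reduces along the first (row) dimension, so each row of length a collapses to a single element. The reference semantics on a contiguous float tensor look like this (plain C, illustrative; the ggml kernel also handles strides and threading):

```c
// out[0, i1, i2, i3] = sum over i0 of in[i0, i1, i2, i3],
// with ggml shape order [a, b, c, d] where ne[0] = a is the row length.
void sum_rows_ref(float * out, const float * in, int a, int b, int c, int d) {
    for (int i3 = 0; i3 < d; ++i3)
    for (int i2 = 0; i2 < c; ++i2)
    for (int i1 = 0; i1 < b; ++i1) {
        float sum = 0.0f;
        const float * row = in + ((i3*c + i2)*b + i1)*a;
        for (int i0 = 0; i0 < a; ++i0) {
            sum += row[i0];
        }
        out[(i3*c + i2)*b + i1] = sum;
    }
}
```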
xaedes
2277053839
add todos for llama backward pass
...
- implementation for ADD1 backward pass should probably use sum instead of mean (but this backward pass is not required)
- repeat is not yet tested, and it looks like it only works for single-element src0 inputs.
2023-05-01 14:43:49 +02:00
xaedes
2ecc690980
successfully test backward pass of rms_norm
...
some tests may fail when gradients are large.
could not find a configuration of absolute and relative error bounds that passes all tests while still checking the results tightly enough.
when looking at the values, the "failed" tests actually look ok. for example:
rms_norm: ndims=2, i=0, k=2, x0=0.000153, xm=0.000053, xp=0.000253, f0=0.278594, f1=0.086213, g0=961.905457, g1=966.064941, eps=0.000100, error_abs=4.159485, error_rel=0.004324
they fail only because of the test logic in check_gradients.
2023-05-01 14:43:49 +02:00
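Note: the numbers in that log line come from a central-difference gradient check: each parameter is nudged by ±eps and the analytic gradient is compared against (f(x+eps) − f(x−eps)) / (2·eps). A minimal sketch of that test logic (the actual check_gradients is more involved):

```c
#include <math.h>
#include <stdbool.h>

// Central-difference gradient check for one scalar parameter.
// f_minus/f_plus are the loss at x-eps and x+eps; grad_analytic comes
// from the backward pass.
bool check_gradient_1(float f_minus, float f_plus, float grad_analytic,
                      float eps, float max_abs_err, float max_rel_err) {
    const float grad_numeric = (f_plus - f_minus) / (2.0f * eps);
    const float abs_err = fabsf(grad_analytic - grad_numeric);
    const float rel_err = abs_err / (fabsf(grad_numeric) + 1e-8f);
    // Large gradients (like g0 ~ 962 in the log above) can exceed a fixed
    // absolute bound while the relative error stays tiny (~0.4%).
    return abs_err <= max_abs_err || rel_err <= max_rel_err;
}
```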
xaedes
84a4b39917
fix backward pass for rms_norm
...
I would have used formulas from other frameworks, but they differed, so I could not decide which was correct.
Instead, it was derived here in a comment, using manual forward-backward automatic differentiation of rms_norm and simplification.
2023-05-01 14:43:49 +02:00
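Note: for reference, one consistent derivation (my own restatement, not necessarily the exact form in the commit's comment): with $s = \sqrt{\varepsilon + \tfrac{1}{n}\sum_j x_j^2}$ and $y_i = x_i / s$,

$$
\frac{\partial s}{\partial x_k} = \frac{x_k}{n\,s}, \qquad
\frac{\partial y_i}{\partial x_k} = \frac{\delta_{ik}}{s} - \frac{x_i x_k}{n\,s^3},
$$

so the backward pass is

$$
\frac{\partial L}{\partial x_k}
= \frac{1}{s}\,\frac{\partial L}{\partial y_k}
- \frac{x_k}{n\,s^3} \sum_i x_i\,\frac{\partial L}{\partial y_i}.
$$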
xaedes
b18b72da00
successfully test backward pass of view_1d, view_2d and view_3d
2023-05-01 14:43:49 +02:00
xaedes
84436383eb
fix view backward pass
...
add nb parameters to add_at like in view.
together with offset they define how to view dst and src0 during the add_at operation.
2023-05-01 14:43:49 +02:00
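Note: the key point is that a ggml view is just a data pointer plus a byte offset and byte strides (nb), so the backward add_at must walk the source with the same offset/strides to hit exactly the viewed elements. A hedged 2-D sketch with illustrative names:

```c
#include <stddef.h>

// Accumulate the gradient of a 2-D view back into its source tensor.
// offset is in bytes, nb0/nb1 are byte strides, as in ggml views.
void add_at_2d(char * src0_data, size_t offset, size_t nb0, size_t nb1,
               const float * grad, int ne0, int ne1) {
    for (int i1 = 0; i1 < ne1; ++i1) {
        for (int i0 = 0; i0 < ne0; ++i0) {
            float * dst = (float *)(src0_data + offset + i1*nb1 + i0*nb0);
            *dst += grad[i1*ne0 + i0];
        }
    }
}
```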
xaedes
f0302fa71b
successfully test get_rows backward
2023-05-01 14:43:49 +02:00
xaedes
96e773bbde
fix get rows backward pass
2023-05-01 14:43:48 +02:00
xaedes
7281f60572
move dup call into the actual add_at functions
2023-05-01 14:43:48 +02:00
xaedes
3dbd649cf9
fix diag_mask to work with non-inplace input
2023-05-01 14:43:48 +02:00
xaedes
b9920e5c3e
test-grad0 : fix test for div
...
nargs and ndims were swapped, corrupting the stack
2023-05-01 14:43:48 +02:00
xaedes
19f51592b5
successfully test diag_mask_inf and diag_mask_zero backward
2023-05-01 14:43:48 +02:00
xaedes
d42531fa56
fix comments
2023-05-01 14:43:48 +02:00
xaedes
1997152f7f
test-grad0.c add TODO for view_2d and view_3d
...
add_at (required for view backward pass) is a bit tricky for n_dims > 1.
2023-05-01 14:43:48 +02:00
xaedes
c601df973c
successfully test transpose backward and permute for all permutations
...
also test sub, mul and div up to max n_dims
2023-05-01 14:43:47 +02:00
xaedes
3d21f2646e
implement ggml_cont backward pass
2023-05-01 14:43:47 +02:00
xaedes
02d3fd0894
fix sub, mul and div functions to work correctly with transposed tensors
...
uses the same logic as in add
2023-05-01 14:43:47 +02:00
xaedes
b0555fce95
some minor test-grad0 fixes
2023-05-01 14:43:47 +02:00
xaedes
a7a837047c
successfully test permute backward
2023-05-01 14:43:47 +02:00
xaedes
86b44a02e4
test-grad0.c : add print_elements to help with debugging
2023-05-01 14:43:47 +02:00
xaedes
339b2adf48
fix ggml_forward_add1 functions to work correctly with transposed tensors
...
uses the same logic as in ggml_compute_forward_add1_q_f32, but makes it consistent across all ggml_compute_forward_add1_... functions.
this also slightly changes the mem access pattern of the different threads to work as in ggml_compute_forward_add1_q_f32.
2023-05-01 14:43:47 +02:00
xaedes
b9416d71f8
fix ggml_forward_add functions to work correctly with transposed tensors
...
uses the same logic as in ggml_compute_forward_add_q_f32, but makes it consistent across all ggml_compute_forward_add_... functions.
this also slightly changes the mem access pattern of the different threads to work as in ggml_compute_forward_add_q_f32.
2023-05-01 14:43:46 +02:00
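Note: "work correctly with transposed tensors" means indexing through the per-dimension byte strides (nb) instead of assuming contiguous rows; a transposed tensor shares the same data but has swapped strides. A hedged 2-D sketch of the pattern (illustrative, not the ggml kernel):

```c
#include <stddef.h>

// Element-wise add that respects byte strides, so it also works when an
// operand is a transposed (non-contiguous) view.
void add_strided_2d(float * dst, const char * src0, const char * src1,
                    int ne0, int ne1,
                    size_t nb00, size_t nb01,   // src0 byte strides
                    size_t nb10, size_t nb11) { // src1 byte strides
    for (int i1 = 0; i1 < ne1; ++i1) {
        for (int i0 = 0; i0 < ne0; ++i0) {
            const float a = *(const float *)(src0 + i1*nb01 + i0*nb00);
            const float b = *(const float *)(src1 + i1*nb11 + i0*nb10);
            dst[i1*ne0 + i0] = a + b;
        }
    }
}
```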
xaedes
410a47a79e
minor code format improvement
2023-05-01 14:43:46 +02:00
xaedes
124fdca973
successfully test view backward
2023-05-01 14:43:46 +02:00