Commit graph

1360 commits

Author SHA1 Message Date
Georgi Gerganov
c90d135eb4
examples : fix underscore in beam-search + .gitignore (close #2900) 2023-08-30 12:53:24 +03:00
M. Yusuf Sarıgöz
0d1c706181
gguf : add workflow for Pypi publishing (#2896)
* gguf : add workflow for Pypi publishing

* gguf : add workflow for Pypi publishing

* fix trailing whitespace
2023-08-30 12:47:40 +03:00
alonfaraj
9509294420
make : add test and update CI (#2897)
* build ci: run make test

* makefile:
- add all
- add test

* enable tests/test-tokenizer-0-llama

* fix path to model

* remove gcc-8 from macos build test

* Update Makefile

* Update Makefile
2023-08-30 12:42:51 +03:00
Gilad S
35092fb547
docs : add node-llama-cpp to README.md (#2885) 2023-08-30 11:40:12 +03:00
Kerfuffle
dc07dc492e
convert : various script cleanups/fixes + merges and special token handling (#2842)
* convert: Fix permute calls and method/func definitions

* Cleanups for gguf-py

* Minor types cleanups.

* Initial implementation of handling merges and special tokens

* convert: Handle special tokens and merges in vocab only mode

convert: Vocab only mode no longer requires loading model tensors

* gguf: Refactor tensor name mapping

* convert: Fix type hint for special_token_types in SpecialVocab

* Use common special vocab handling in various conversion scripts

* First pass at implementing suggested changes

* Second pass

* gguf: SpecialVocab: Fix issue with special token content not in a dict

gguf: SpecialVocab: Allow skipping handling of merges

* convert-falcon-hf-to-gguf: Support --vocab-only option, bail out if no tokenizer.json

* convert-gptneox-hf-to-gguf and convert: Only handle merges for BPE tokenizer

* gguf: SpecialVocab: Actually set load_merges in object

* Uniform args parsing and vocab only mode for convert examples

* convert.py: Set gpt2 as tokenizer model when using BPE

* Squish last type warning in gguf.py - yay!
2023-08-30 11:25:50 +03:00
chaihahaha
ad9ddcff6e
llm.vim : stop generation at multiple linebreaks, bind to <F2> (#2879) 2023-08-30 09:50:55 +03:00
staviq
8341a25957
main : log file (#2748)
* initial, base LOG macro

* add *.log to .gitignore

* added basic log file handler

* reverted log auto endline to better mimic printf

* remove atomics and add dynamic log target

* log_enable/disable, LOG_TEE, basic usage doc

* update .gitignore

* mv include to common, params, help msg

* log tostring helpers, token vectors pretty prints

* main: replaced fprintf/LOG_TEE, some trace logging

* LOG_DISABLE_LOGS compile flag, wrapped f in macros

* fix LOG_TEELN and configchecker

* stub LOG_DUMP_CMDLINE for WIN32 for now

* fix msvc

* cleanup main.cpp:273

* fix stray whitespace after master sync

* log : fix compile warnings

- do not use C++20 stuff
- use PRIu64 to print uint64_t
- avoid string copies by using const ref
- fix ", ##__VA_ARGS__" warnings
- compare strings with == and !=

* log : do not append to existing log + disable file line func by default

* log : try to fix Windows build

* main : wip logs

* main : add trace log

* review: macro f lowercase, str append to sstream

* review: simplify ifs and str comparisons

* fix MSVC, formatting, FMT/VAL placeholders

* review: if/else cleanup

* review: if/else cleanup (2)

* replace _ prefix with _impl suffix

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-08-30 09:29:32 +03:00
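Two of the portability points in the log above — the ", ##__VA_ARGS__" warnings and printing uint64_t with PRIu64 — recur in any C/C++ logging macro. A minimal sketch (illustrative only; `log_target` and `format_u64` are invented names, not the actual common/log.h API):

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

// Runtime-selectable log target, as in the "dynamic log target" step above.
static FILE *log_target = NULL; // e.g. stderr, or a file opened at startup

// Minimal variadic LOG macro. The "##" before __VA_ARGS__ is a GNU
// extension that swallows the preceding comma when no arguments are
// passed -- the source of the ", ##__VA_ARGS__" warnings mentioned
// above; C++20's __VA_OPT__(,) is the standard replacement.
#define LOG(fmt, ...) fprintf(log_target ? log_target : stderr, fmt, ##__VA_ARGS__)

// uint64_t must be printed via PRIu64 from <inttypes.h>, because its
// underlying type (unsigned long vs unsigned long long) varies by platform.
int format_u64(char *buf, size_t n, uint64_t v) {
    return snprintf(buf, n, "%" PRIu64, v);
}
```

Writing through a single macro makes it easy to later add LOG_TEE-style duplication to a second stream or to compile logging out entirely behind a flag such as LOG_DISABLE_LOGS.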
Cebtenzzre
849408957c
tests : add a C compliance test (#2848)
* tests : add a C compliance test

* make : build C compliance test by default

* make : fix clean and make sure C test fails on clang

* make : move -Werror=implicit-int to CFLAGS
2023-08-30 09:20:26 +03:00
slaren
06abf8eeba
ggml : add view_src and view_offs to ggml_tensor for views (#2874)
* ggml : add view_src and view_offs

* update ggml-alloc to use view_src

* update ggml_diag_mask to work correctly with automatic inplace

* exclude other ops that set an inplace flag from automatic inplace
2023-08-29 23:24:42 +02:00
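The idea behind tracking views can be sketched generically (illustrative only — these are not ggml's actual structs): a view records the tensor whose buffer it aliases plus a byte offset, so an allocator can distinguish views from owning tensors and avoid unsafe automatic inplace rewrites.

```c
#include <stddef.h>

// Illustrative tensor with explicit view bookkeeping. A tensor that
// owns its buffer has view_src == NULL; a view stores its source
// tensor and a byte offset into the source's data.
typedef struct tensor {
    struct tensor *view_src;  // NULL if this tensor owns its buffer
    size_t         view_offs; // byte offset into the source's data
    char          *data;      // only meaningful for owning tensors here
} tensor;

// Resolve a (possibly chained) view to an absolute data pointer by
// walking to the owning tensor and summing the offsets along the way.
char *tensor_data(tensor *t) {
    size_t offs = 0;
    while (t->view_src) {
        offs += t->view_offs;
        t = t->view_src;
    }
    return t->data + offs;
}
```

With this bookkeeping, an allocator can tell that a view and its source share memory even though they are distinct tensor objects.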
slaren
c03a243abf
remove outdated references to -eps and -gqa from README (#2881) 2023-08-29 23:17:34 +02:00
xaedes
bf70e27cd6
fix check_gradient
ggml_build_backward_expand was previously replaced by ggml_build_backward, but the assignment of the forward graph to the backward graph was missing
2023-08-29 23:08:30 +02:00
Kawrakow
fa3582f509
Tell users attempting to run perplexity with too few tokens to use more (#2882)
Closes #2858

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
2023-08-29 23:55:45 +03:00
Kawrakow
e37e69dcc3
10X faster BPE tokenizer (#2876)
* 10X faster BPE tokenizer

* Remove comment that no longer applies

---------

Co-authored-by: Iwan Kawrakow <iwan.kawrakow@gmail.com>
2023-08-29 23:55:03 +03:00
xaedes
5854f51188
fix error message in ggml_allocr_alloc to display actual max_avail 2023-08-29 22:49:01 +02:00
xaedes
281245a48f
Merge branch 'master' into finetune-lora 2023-08-29 21:47:28 +02:00
xaedes
8a96d4c2aa
add missing argument 'int i0' to ggml_get_i32_nd & ggml_set_i32_nd header declarations 2023-08-29 21:24:37 +02:00
xaedes
dd4e4bca09
remove unused 'inplace' argument from ggml_compute_backward function
inplace operations to add gradients are no longer created by ggml_compute_backward
the allocator is used to automatically make operations inplace
2023-08-29 21:21:10 +02:00
xaedes
a76e66ac8d
fix ggml_acc_or_set to return tensor of correct shape 2023-08-29 21:02:10 +02:00
xaedes
b1aa26f718
add sanity check to ggml_compute_backward, asserting the correct shape of gradients 2023-08-29 21:01:17 +02:00
xaedes
5fcfa7e49e
increase test-grad0 context mem size to accommodate bigger cgraphs 2023-08-29 21:00:19 +02:00
xaedes
82c5247a20
add ggml API functions ggml_unravel_index, ggml_get_i32_nd and its analogs for set and for f32
ggml_get_i32_1d, ggml_set_i32_1d, ggml_get_f32_1d and ggml_set_f32_1d now support non-contiguous tensors.
For a non-contiguous tensor, the 1d index is unraveled into a multi-index using ggml_unravel_index and passed to the '_nd' function equivalent.

This fixes a bug in test-grad0 caused by ggml_build_backward no longer building purely contiguous tensors.
2023-08-29 20:59:31 +02:00
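The unraveling described above can be illustrated with a standalone sketch (not the ggml implementation; the function name and signature here are invented for illustration): given a flat 1-d index and the element counts per dimension, recover the per-dimension multi-index.

```c
#include <stdint.h>

// Unravel a flat 1-d index into a multi-index for a 4-d tensor,
// dimension 0 varying fastest (matching ggml's ne[] ordering).
// 'ne' holds the number of elements in each dimension.
void unravel_index(int64_t i, const int64_t ne[4], int64_t out[4]) {
    for (int d = 0; d < 4; ++d) {
        out[d] = i % ne[d]; // coordinate in dimension d
        i /= ne[d];         // carry the remainder to the next dimension
    }
}
```

With the multi-index in hand, an '_nd' accessor can apply per-dimension byte strides rather than assuming contiguous element spacing, which is what makes non-contiguous tensors addressable.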
xaedes
5f0a4e971f
avoid stack overflow of large cgraphs in test-grad0 2023-08-29 19:59:41 +02:00
xaedes
794bb7ea42
implement ggml_compute_forward_repeat_f16 2023-08-29 19:59:14 +02:00
xaedes
e28cf7e9ce
update README.md 2023-08-29 19:38:23 +02:00
xaedes
a6165dafcd
remove trailing whitespace 2023-08-29 19:30:42 +02:00
xaedes
5813ac832f
omit tokenization when training is disabled, only save llama lora adapter
training can be disabled by passing '-n 0' to finetune
2023-08-29 19:21:45 +02:00
xaedes
ebff3a14c3
remove code to print data checksums which was used to verify correctness of new gguf code 2023-08-29 18:31:20 +02:00
xaedes
1425968ead
remove old checkpoint save & load code 2023-08-29 18:30:16 +02:00
xaedes
6134ad4de7
add python script to convert old finetune checkpoint files to gguf 2023-08-29 18:24:06 +02:00
xaedes
0564f4ed1f
add load & save lora finetune checkpoints via gguf 2023-08-29 18:20:39 +02:00
maddes8cht
53885d7256
py : fix "usage" messages (#2873)
convert-to-gguf python scripts
2023-08-29 16:51:02 +03:00
jameswu2014
bcce96ba4d
convert.py : fix baichuan7B support (#2870)
* [Fix]: convert.py : support baichuan7B

* convert.py : fix trailing whitespaces

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
2023-08-29 12:48:41 +03:00
Jhen-Jie Hong
74e0caeb82
readme : add react-native binding (#2869) 2023-08-29 12:30:10 +03:00
Cebtenzzre
d4b5e16c32
make : fix clang tests build, add missing examples (#2859)
* make : do not pass headers to the compiler

This fixes building tests with clang.

* make : add missing examples

* make : fix build-info.h dependencies
2023-08-29 11:42:41 +03:00
Georgi Gerganov
3a007648f2
metal : add option to disable debug logs (close #2764) 2023-08-29 11:33:46 +03:00
Georgi Gerganov
611363ac79
scripts : add pipefail 2023-08-29 10:50:30 +03:00
Marcus Dunn
95b6e5212f
added struct to llama_dump_timing_info_yaml's llama_context (#2857)
fixes C compat.
2023-08-29 09:33:27 +03:00
xaedes
ecb1b20c85
add gguf constants and load/save functions from train-text-from-scratch 2023-08-29 01:40:02 +02:00
xaedes
e030f7b2c5
add LLM_KV_TRAINING_TYPE to train-text-from-scratch checkpoints
so that they can be differentiated from lora finetune checkpoints
2023-08-29 01:27:28 +02:00
xaedes
ca97583f0b
remove vocab related code as it is unnecessary 2023-08-29 01:19:45 +02:00
xaedes
a3b45298f1
remove unused code 2023-08-29 01:12:51 +02:00
xaedes
1faee64db9
handle rms_norm and rope parameters the same as in train-text-from-scratch 2023-08-29 01:09:35 +02:00
xaedes
007280c82f
make default value of float member a float literal 2023-08-29 01:04:57 +02:00
xaedes
49af7fbe12
add comment explaining why finetune checkpoints are allocated in one block 2023-08-29 00:57:39 +02:00
xaedes
9a28bce29a
reduce large memory overhead in train-text-from-scratch
All gradients had to be pinned so that graph_reset works correctly.
This is no longer necessary with the changes to ggml_compute_backward introduced in this PR.
2023-08-29 00:56:44 +02:00
xaedes
271c0300de
remove prediction-related code to reduce duplication with main
use main instead
2023-08-29 00:50:59 +02:00
xaedes
5ce92aed37
finetune bug fixes to compile with merged in code from master 2023-08-29 00:41:19 +02:00
xaedes
daedc6f419
replace llama_n_mult by llama_n_ff 2023-08-29 00:40:53 +02:00
xaedes
aa8016e95d
bug fix: replace GGML_TYPE_SIZE[t] by ggml_type_size(t) 2023-08-29 00:40:30 +02:00
xaedes
aecc3b3890
fix dump_non_result_info_yaml to output multiple lora adapters 2023-08-29 00:39:59 +02:00