Georgi Gerganov
f477fb069b
llama : reorder definitions in .cpp to match .h
2023-08-15 22:29:56 +03:00
Georgi Gerganov
afd135a64c
llama : merge gguf-util.h in llama.cpp
2023-08-15 22:09:56 +03:00
Georgi Gerganov
a02b809a2e
llama : move hparams and vocab from gguf_file_loader to llama_model_loader
2023-08-15 21:09:27 +03:00
Georgi Gerganov
6c3f824697
llama : simplify gguf_file_loader
2023-08-15 20:53:53 +03:00
Georgi Gerganov
2906d5492d
gguf : remove obsolete gguf_get_arr_xxx API
2023-08-15 20:46:18 +03:00
Georgi Gerganov
1751bd4693
gguf : remove obsolete write methods
2023-08-15 20:41:53 +03:00
Georgi Gerganov
f7a6aa9911
gguf : streaming support when writing files
2023-08-15 19:57:37 +03:00
Georgi Gerganov
4ef5e792e3
llama : replace gguf_file_saver with new gguf write API
2023-08-15 18:29:42 +03:00
Georgi Gerganov
35177d735d
gguf : minor
2023-08-15 16:05:23 +03:00
Georgi Gerganov
c9b2f7f1bf
gguf : fixes + simplify example + add ggml_nbytes_pad()
2023-08-15 16:01:38 +03:00
Georgi Gerganov
4463965401
gguf : fix header write
2023-08-15 14:39:27 +03:00
Georgi Gerganov
f6ecd15f83
gguf : initial write API ready + example
2023-08-15 14:35:00 +03:00
Georgi Gerganov
85ebfb8e5d
gguf : write to file API (not tested)
2023-08-15 14:26:28 +03:00
Georgi Gerganov
5cb9d9a87f
gguf : initial write API (not tested yet)
2023-08-15 13:40:07 +03:00
M. Yusuf Sarıgöz
2d87c9c796
llama : refactor tensor names ( #2622 )
...
* gguf: update tensor names searched in quantization
* gguf : define tensor names as constants
2023-08-15 13:29:30 +03:00
Georgi Gerganov
da424b6699
llama : gguf_file_saver write I32
2023-08-15 11:31:42 +03:00
Georgi Gerganov
9574f41818
llama : no need to pass full file loader to the file saver
...
just gguf_ctx
2023-08-15 11:28:02 +03:00
Georgi Gerganov
5c85332e99
llama : simplify write_header()
2023-08-15 11:28:02 +03:00
Georgi Gerganov
6e29ed52fb
llama : fix method names
2023-08-15 11:28:02 +03:00
Georgi Gerganov
c9c0b758d4
llama : simplify gguf_file_saver
2023-08-15 11:28:02 +03:00
Georgi Gerganov
66ce19aecb
llama : fix quantization using gguf tool
2023-08-15 11:28:02 +03:00
Georgi Gerganov
a82e3a4d92
llama : style formatting + remove helper methods
2023-08-15 11:28:02 +03:00
klosax
2dd5d2c92c
convert-llama-h5-to-gguf.py : add 70b gqa support
2023-08-15 00:43:10 +02:00
klosax
ca4758290c
gguf-llama.cpp : fix n_head_kv
2023-08-14 23:18:41 +02:00
klosax
ab2cbd03ca
convert-llama-7b-pth-to-gguf.py : add token types
2023-08-14 22:10:50 +02:00
klosax
cedb4870c6
gguf.py : add token types
2023-08-14 22:08:40 +02:00
klosax
5d518d421f
constants.py : add token types
2023-08-14 22:07:53 +02:00
klosax
7ec125b1dc
convert-llama-h5-to-gguf.py : add token types
2023-08-14 22:06:33 +02:00
Georgi Gerganov
6c63550f63
llama : update tokenizer style
2023-08-14 22:11:57 +03:00
Georgi Gerganov
7494c78428
llama : sync gguf-llama with llama ( #2613 )
...
* llama : sync gguf-llama with llama
* tests : fix build + warnings (test-tokenizer-1 still fails)
* tests : fix wstring_convert
* convert : fix layer names
* llama : sync gguf-llama.cpp
* convert : update HF converter to new tokenizer voodoo magics
2023-08-14 21:33:33 +03:00
goerch
afc4ca2889
convert : update convert-new.py with tokenizer fixes ( #2614 )
...
* Merge tokenizer fixes into the gguf branch.
* Add test vocabularies
* Adapt convert-new.py (and fix a clang-cl compiler error on windows)
2023-08-14 20:20:04 +03:00
goerch
ec1b100720
llama : tokenizer fixes ( #2549 )
...
* Merge tokenizer fixes into the gguf branch.
* Add test vocabularies
2023-08-14 19:30:28 +03:00
Georgi Gerganov
8af3a99ff1
Merge branch 'master' into gguf
2023-08-14 16:39:18 +03:00
Georgi Gerganov
6f14854880
gitignore : add gptneox-main
2023-08-14 16:39:02 +03:00
Jhen-Jie Hong
d783f7982e
metal : return null instead of exit(1) ( #2573 )
2023-08-14 16:37:39 +03:00
Cheng Shao
d75561df20
server : add --numa support ( #2524 )
2023-08-14 16:36:42 +03:00
Kamil Tomšík
348acf188c
llama : add missing enum keyword in function signatures ( #2610 )
2023-08-14 16:35:16 +03:00
Georgi Gerganov
f00780b2ee
llama : sync gguf-llama.cpp with latest llama.cpp ( #2608 )
...
* llama : sync gguf-llama.cpp with latest llama.cpp
* minor : indentation + assert
* llama : refactor gguf_buffer and gguf_ctx_buffer
* llama : minor
2023-08-14 16:28:44 +03:00
klosax
6f64b6c0f8
Create convert-llama-7b-pth-to-gguf.py
2023-08-14 13:51:09 +02:00
Georgi Gerganov
62490f1380
gguf : use UNIX line ending
2023-08-14 13:04:35 +03:00
Georgi Gerganov
0c19ae70d5
simple : minor style changes
2023-08-14 12:58:12 +03:00
klosax
5c5a95ba2d
gguf.py : don't add empty strings
2023-08-14 11:22:06 +02:00
klosax
a7d226f871
convert-llama-h5-to-gguf.py : fixes
2023-08-14 11:14:24 +02:00
klosax
d753dfbcc8
gptneox-main.cpp : tensor name map changes
2023-08-14 10:59:18 +02:00
klosax
806a15749d
Delete gguf_tensor_map.py
2023-08-14 10:57:19 +02:00
klosax
51939d7d1b
Create gguf_namemap.py : tensor name map changes
2023-08-14 10:56:59 +02:00
klosax
5d22a9db13
convert-gptneox-h5-to-gguf.py : tensor name map changes
2023-08-14 10:55:44 +02:00
Johannes Gäßler
1cd06fa25e
CUDA: launch_bounds, small q4_K, q5_K mmq refactor ( #2596 )
2023-08-14 10:41:22 +02:00
Jhen-Jie Hong
2feb8934eb
server : fix default grammar by using empty string in the UI ( #2604 )
2023-08-14 16:20:17 +08:00
Jhen-Jie Hong
5517d6e692
server : implement json-schema-to-grammar.mjs & add grammar param in the UI ( #2588 )
...
* server : implement json-schema-to-grammar.mjs by following the Python impl
* server : add grammar support in chat.mjs
* server : implement grammar param in the UI
* server : generate .hpp
* server : remove trailing whitespaces
* server : generate .hpp
* server : fix sort of prop pairs
* server : optimize regex & iteration
2023-08-14 15:16:54 +08:00