llama.cpp

Author	SHA1	Message	Date
M. Yusuf Sarıgöz	56ccf97b4a	handle default n_predict	2023-10-12 14:34:53 +03:00
M. Yusuf Sarıgöz	dc913ea3c4	minor	2023-10-12 10:48:11 +03:00
M. Yusuf Sarıgöz	1403d87cca	Merge master and fix conflicts	2023-10-12 00:00:57 +03:00
M. Yusuf Sarıgöz	2bc1710e2b	command line: use gpt_params_parse()	2023-10-11 23:17:50 +03:00
Michael Coppola	a8bdd65525	server : add parameter -tb N, --threads-batch N (#3584) Co-authored-by: Michael Coppola <info@michaeljcoppola.com>	2023-10-11 22:42:22 +03:00
Kerfuffle	70c29da118	common : fix mirostat state when using multiple sequences (#3543) * Fix mirostat state when using multiple sequences * Fix mirostat by completely refactoring sampling! * Try to fix zig build. * Export function to fetch/create default sampler states Code formatting cleanups and add some comments Silence a warning about id not being used when logging is disabled * Apply some renaming suggestions. Fix comments that were out of sync with the pull. * Use more consistant naming convention for sampling contexts	2023-10-11 22:35:46 +03:00
Georgi Gerganov	8c70a5ff25	batched : add bench tool (#3545) * batched : add bench tool * batched : minor fix table * batched-bench : add readme + n_kv_max is now configurable * batched-bench : init warm-up batch * batched-bench : pass custom set of PP, TG and PL * batched-bench : add mmq CLI arg	2023-10-11 21:25:33 +03:00
M. Yusuf Sarıgöz	f0f78345f2	Use temperature = 0.1 by default	2023-10-11 15:03:01 +03:00
Zane Shannon	24ba3d829e	examples : add batched.swift + improve CI for swift (#3562)	2023-10-11 06:14:05 -05:00
M. Yusuf Sarıgöz	0409ae00b6	are you happy editorconfig?	2023-10-11 08:21:29 +03:00
M. Yusuf Sarıgöz	ab2158796f	Check if apples are compared to apples	2023-10-11 08:15:51 +03:00
M. Yusuf Sarıgöz	f1564bb2eb	Merge branch 'master' into llava	2023-10-11 06:59:37 +03:00
M. Yusuf Sarıgöz	587bde8e0c	Maybe seed is unlucky?	2023-10-11 06:40:52 +03:00
Galunid	9f6ede19f3	Add MPT model to supported models in README.md (#3574)	2023-10-10 19:02:49 -04:00
goerch	233fc1c69f	Minor improvements in GPT2 tokenizer (#3567) * Fixing minor bugs in bpe_gpt2_preprocess * Don't add bos token in test	2023-10-10 18:59:52 +02:00
Xingchen Song(宋星辰)	c5b49360d0	readme : add bloom (#3570)	2023-10-10 19:28:50 +03:00
Xingchen Song(宋星辰)	02d2875def	llm : add bloom models (#3553) * feat: Support bloom models * fix(bloom): fix model size --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-10-10 17:48:21 +03:00
Jhen-Jie Hong	0aa6595ae0	swift : improvements and fixes (#3564) * swift : use macOS 12 as minimum requirement * swift : add missing ggml-backend.c source * swift : add -O3 -DNDEBUG unsafe flags	2023-10-10 14:31:13 +03:00
M. Yusuf Sarıgöz	d640aae755	add support for 13b model variant	2023-10-10 13:02:24 +03:00
Jan Ploski	f5f9121de1	llm : add MPT support (#3417) * CUDA: added support for ggml_clamp (see also: https://github.com/ggerganov/ggml/issues/545) * mpt : added an implementation based (mostly) on falcon integration, modified with deltas from ggml/examples/mpt * mpt : protect against "clip_qkv": null in mpt-7b * mpt : quick fix to avoid "Strange model" warning when quantizing MPT models * mpt : addendum to changeset:84e30e8 - leave parameter clamp_kqv out from metadata rather than use 0.0 to indicate "no clamping" (more compliant with the current GGUF spec?) * mpt : standardized all tensor names to follow GGUF spec * mpt : addendum to changeset:1be89c40 - use "req" parameter of GGUF_GET_KEY macro instead of duplicate code * mpt : fixed comment s/gptneox/mpt/ * mpt : remove tabs, trailing whitespace * mpt : removed ne01 + n_past == ne00 assertion from alibi (cuda/f32) and rope_shift from build_mpt * mpt : updated convert-mpt-hf-to-gguf.py to reflect changes made to convert-gptneox-hf-to-gguf.py in pr:3252 * comment out n_past instead of marking it unused * mpt : removed hardcoded +178 from convert script in favor of utilizing hparams["vocab_size"] * mpt : remove unused tokenizer_json in convert script * ggml : remove obsolete n_past assert in ggml_alibi * llama : print clam_kqv and max_alibi_bias hparams --------- Co-authored-by: Cebtenzzre <cebtenzzre@gmail.com> Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-10-10 10:50:23 +03:00
vvhg1	11ea5c7d96	infill. : fix tokenization (#3508) * infill tokens correction * serverinfill tokens correction * removing any leading whitespace from infill suffix and removing leeading space token from suffix when params.escape * removing any leading whitespace from infill suffix and removing leeading space token from suffix when params.escape * only rm when params.escape, rm space if possible which is added back or rm added space token * only rm when params.escape, rm space if possible which is added back or rm added space token * Revert "only rm when params.escape, rm space if possible which is added back or rm added space token" This reverts commit `63ba0b621f`. * fix interactive prompt escaping and fix server infill leading space handling * rm unnecessary bool check	2023-10-10 10:31:21 +03:00
M. Yusuf Sarıgöz	96171de5ef	add llava target to Makefile	2023-10-10 01:50:02 +03:00
M. Yusuf Sarıgöz	5009ae90ef	Handle cases where image file does not exist	2023-10-10 01:49:35 +03:00
M. Yusuf Sarıgöz	ae01c859e5	gitignore /llava	2023-10-10 01:13:12 +03:00
M. Yusuf Sarıgöz	d75a0315f0	are you happy editorconfig?	2023-10-09 23:56:07 +03:00
M. Yusuf Sarıgöz	325d240061	introduce pad-to-square mode for non-square images	2023-10-09 23:53:29 +03:00
M. Yusuf Sarıgöz	4759bfd64c	fix: rm designated initializers	2023-10-09 15:54:55 +03:00
slaren	95bd60a0a6	ggml-alloc : fix assert in debug builds (#3555)	2023-10-09 15:44:58 +03:00
M. Yusuf Sarıgöz	d78e816365	rm unused import	2023-10-09 14:44:35 +03:00
Georgi Gerganov	fcca0a7004	refact : fix convert script + zero out KV cache to avoid nans (#3523) * refact : fix convert script + zero out KV cache to avoid nans * ggml : silu(-inf) should never happen * metal : assert various kernel requirements	2023-10-09 14:32:17 +03:00
Georgi Gerganov	dcc09d2596	metal : do not use mul_mm kernels when ne00 < 64 (#3542)	2023-10-09 14:28:27 +03:00
M. Yusuf Sarıgöz	8278a7364a	rm unused batch image preprocessing	2023-10-09 14:22:18 +03:00
M. Yusuf Sarıgöz	9b0ec4d2cc	Are you happy editorconfig?	2023-10-09 13:42:04 +03:00
M. Yusuf Sarıgöz	54495c9474	Some cleanup	2023-10-09 13:38:48 +03:00
M. Yusuf Sarıgöz	8af7e2103c	Update readme	2023-10-09 11:10:09 +03:00
M. Yusuf Sarıgöz	444dbce888	Add readme	2023-10-09 09:47:56 +03:00
Georgi Gerganov	db3abcc114	sync : ggml (ggml-backend) (#3548) * sync : ggml (ggml-backend) ggml-ci * zig : add ggml-backend to the build	2023-10-08 20:19:14 +03:00
Matheus C. França	eee42c670e	ci : add Zig CI/CD and fix build (#2996) * zig CI/CD and fix build Signed-off-by: Matheus Catarino França <matheus-catarino@hotmail.com> * fix build_compiler * ci : remove trailing whitespace --------- Signed-off-by: Matheus Catarino França <matheus-catarino@hotmail.com> Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-10-08 16:59:20 +03:00
M. Yusuf Sarıgöz	2a04d0b5a1	Merge branch 'master' into llava	2023-10-08 15:40:39 +03:00
M. Yusuf Sarıgöz	95da79e740	fix: trailing whitespace	2023-10-08 15:38:47 +03:00
M. Yusuf Sarıgöz	204d08be3d	fix: new line at EoF	2023-10-08 15:24:13 +03:00
M. Yusuf Sarıgöz	0c2bd79781	fix: crlf -> lf	2023-10-08 15:20:39 +03:00
M. Yusuf Sarıgöz	94eeac358a	Use ggml_allocr + rm unnecessary code	2023-10-08 14:58:47 +03:00
Ryder Wishart	8e6716a102	api_like_OAI.py : compat with Microsoft Guidance (#2746) Check for None in addition to empty string check in all request params Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	2023-10-08 13:55:58 +03:00
arcrank	9c38d181d4	api_like_OAI.py : simplify function (#2796) Simplify function	2023-10-08 13:52:57 +03:00
Johannes Rudolph	a1202a31ed	k-quants : fix comments about block sizing (#3499)	2023-10-08 13:21:19 +03:00
Georgi Gerganov	94e502dfb7	ci : enable on obj-c changes + fix metal build (#3540)	2023-10-08 11:24:50 +03:00
Luo Tian	7d8b24932f	zig : fix build by introducing train.cpp (#3539)	2023-10-08 11:24:01 +03:00
Georgi Gerganov	b0ec5218c3	metal : support MTLGPUFamily < Apple7, formatting, style (#3524) * metal : improve decoding speed for batches of 2-16 * metal : rename kernels mul_mat_ to mul_mv_ * metal : indentations * minor * metal : print more GPU info + disable mul_mm for MTLGPUFamiliy < Apple7	2023-10-08 10:01:53 +03:00
Kerfuffle	63d3b06a43	llama : fix missing break in Persimmon arch case statements (#3535)	2023-10-08 08:22:17 +03:00

1400 commits